C++ vs Java Performance

By Eric Galyon

In 1995, Sun Microsystems released Java, a multipurpose, object-oriented programming language, and since that time the application development community has heard many rumors about the new language. Java has been hyped in many places, including computer magazines, broadcast news, and the Internet. Like many new technologies, Java has been described as the solution to all problems: Java will make web development easy, Java will become an operating system and challenge Microsoft's dominance, and Java will allow applications to run on any platform. Sorting out these claims requires a focus on what Java really is: a modern, object-oriented programming language that is highly portable. As with any programming language, it differs from its competitors in ways that can be construed as advantages and disadvantages. Comparing Java with a popular and fairly similar programming language such as C++ helps developers decide whether to embrace the new language.



Java's Advantages

The Java programming language has many advantages over other languages. It is object-oriented, which allows programmers to design reusable components easily. It closely resembles C, which makes the language easy to learn for anyone who has experience programming in C. Java has built-in garbage collection, similar to LISP or Prolog: "Java frees memory automatically, by performing automatic garbage collection, so you never need worry about memory leaks, nor must you waste time looking for one. Thus, you are more productive, and less likely to be driven crazy via tedious, mind-numbing debugging." (On to Java, 3). In addition, Java includes built-in data structures and algorithms for creating graphical user interfaces and communicating with other computers over a network.

Another advantage Java has over other languages is its portability. When a Java program is compiled, it is not compiled into native machine code; instead, it is compiled into byte code that can be interpreted by a Java Virtual Machine. Once a Virtual Machine has been written for a specific computer architecture, that computer can execute any Java program that has been compiled into byte code. This portability becomes evident in web-based applications. According to PC Magazine, "Java is an interpreted, machine-independent language, so that any PC system with a Java-based browser can execute programs from any Java Web site." (PC Magazine, Dec. 19, 1995, 116). Web developers can therefore write applications in Java, compile them once, and run them on any machine that has a Java-compatible web browser. Although Java's portability gives it a clear advantage over other languages, this feature also creates one of Java's biggest disadvantages.



Java's Primary Disadvantage

Although Java's ability to produce portable, architecturally neutral code is desirable, the method used to create this code is inefficient. As mentioned above, once Java code is compiled into byte code, an interpreter called a Java Virtual Machine, designed specifically for a given computer architecture, runs the program. Why is this a problem? "Java, being an interpreted system, is currently an order of magnitude slower than C." (Just Java, 302). Natively compiled code is a series of instructions that correspond directly to a microprocessor's instruction set. An interpreter, by contrast, must first translate each byte code instruction into the equivalent microprocessor instructions. This translation takes some amount of time and, however small that time is, it makes interpretation inherently slower than performing the same operation directly in machine code.
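The cost of interpretation described above can be illustrated with a toy example. The following is a minimal sketch, assuming a made-up three-instruction stack machine; the opcode names, the `Instr` layout, and the `interpret` and `native` functions are hypothetical and are not real Java byte codes or part of any Java Virtual Machine. It shows only that an interpreter pays for a fetch and a dispatch on every instruction, while natively compiled code performs the arithmetic directly.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical opcodes for a toy stack machine (NOT real JVM byte codes).
enum class Op : std::uint8_t { Push, Add, Mul, Halt };

struct Instr {
    Op op;
    int operand;  // used only by Push
};

// Interprets a well-formed program and returns the top of the stack at Halt.
// Each instruction costs a fetch, a switch dispatch, and stack traffic
// on top of the arithmetic itself.
int interpret(const std::vector<Instr>& program) {
    std::vector<int> stack;
    for (std::size_t pc = 0; pc < program.size(); ++pc) {
        const Instr& in = program[pc];
        switch (in.op) {
            case Op::Push:
                stack.push_back(in.operand);
                break;
            case Op::Add: {
                int b = stack.back(); stack.pop_back();
                int a = stack.back(); stack.pop_back();
                stack.push_back(a + b);
                break;
            }
            case Op::Mul: {
                int b = stack.back(); stack.pop_back();
                int a = stack.back(); stack.pop_back();
                stack.push_back(a * b);
                break;
            }
            case Op::Halt:
                return stack.empty() ? 0 : stack.back();
        }
    }
    return stack.empty() ? 0 : stack.back();
}

// The same computation as native code: no dispatch loop at all.
int native(int a, int b, int c) { return (a + b) * c; }
```

For instance, the program `Push 2, Push 3, Add, Push 4, Mul, Halt` produces the same value as `native(2, 3, 4)`, but the interpreter executes six dispatch steps to get there.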

How important is this speed difference? According to PC Magazine, it appears very significant: "Compared with native code, Java VMs are excruciatingly slow. ... Java still cannot compete with natively compiled C++ code." (PC Magazine, April 7, 1998, 104). Even with all of Java's benefits, Java will not be widely accepted if it cannot perform adequately. C++ has been widely adopted by developers, and they will not be willing to change languages if the applications they develop with Java do not measure up to their own and their clients' standards. However, if the speed difference is negligible, developers may be willing to learn and program in Java because of the significant advantages the language offers. Before developers can make this decision, they need an accurate picture of the speed tradeoffs between the two languages. This picture can only be created by testing both languages, gathering timing data, and generating statistics that describe the difference.



Testing the Speed Difference

The goal of my project was to begin answering some of the questions asked above: How much faster is C++ than Java? Does the speed difference really matter? Are there specific areas where Java's performance is so much slower that it should not be used? Answering these questions involved writing code that performed identical operations in both C++ and Java and measuring the amount of time it took to execute the programs. Because these programs are as close as possible to being identical, they can be used to measure the performance difference.

For testing purposes, six different operations were selected. Four of the programs test calculation intensive activities and the other two test input/output and memory intensive activities. All six tests were programmed in both languages and implemented as identically as possible.

All four of the calculation intensive programs perform matrix operations. Two different operations, matrix addition and matrix multiplication, are performed, and each operation is performed on floating point and integer numbers. All of the programs store their values in an array, instead of a linked list, to keep as much of the running time as possible focused on the actual calculations. The programs are fairly simple. Each accepts two command line arguments that specify the size of the matrix to test, automatically generates the test matrices, performs the calculation, and then exits.

The two input/output and memory intensive programs create a singly linked list (SLL) of employee information. The programs each store three pieces of information about a distinct employee: their name as a string, department as a string, and salary as a floating point number. The method each program uses to retrieve the information, however, is different. One of the programs is designed to accept all of the data from standard input until 'done' is entered in the name field of an employee. After all of the data has been entered, the program prints all of the data to standard output. The other program is designed to open a specific data file, read the information from it, and print the information it retrieved to a specific output file. Both programs test how C++ and Java handle memory allocation because they create the SLL of employees but they differ in testing the type of input being performed.

After the programs were designed, data had to be created to test them. The data used, like the programs themselves, had to be identical between the two languages to make the tests valid. For the matrices, Unix scripts were created that ran and timed the four different calculations with increasingly larger matrices. The larger the matrix becomes, the more time the operation is expected to take because there are more calculations to be done. For the SLLs, Unix scripts were created that either echoed employee information to standard input or created an employee data file to be used by the programs. Each consecutive test entered more employees into the database. The specifics of the test data used and what each program does are discussed, along with the results, later in the paper.

Data is useless until it is used to make inferences. Inferences are also useless if the data does not represent an accurate sample of the population. In order to check if the data gathered from testing the programs was an accurate sample, a test of the data's standardized residual was performed. The standardized residual is a statistical method that determines how closely a data set corresponds to a normal distribution (also known as a bell curve). Data that corresponds to a normal distribution can be considered to be a representative sample and, therefore, inferences made from the data can be considered to be fairly certain. The standardized residual graphs for all of the programs indicated that they were representative samples meaning we can use the data to draw conclusions about the performance of C++ versus Java and be fairly confident in them.
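The article does not say exactly how the standardized residuals were computed or plotted. As a rough illustration only, the sketch below standardizes a data set by subtracting the mean and dividing by the sample standard deviation, then reports what fraction of the standardized values fall within a given limit. For normally distributed data, roughly 68% fall within 1 and roughly 95% within 2. The function names are hypothetical, and this is one common standardization, not necessarily the procedure used for this study.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Arithmetic mean; assumes xs is non-empty.
double mean(const std::vector<double>& xs) {
    double sum = 0.0;
    for (double x : xs) sum += x;
    return sum / static_cast<double>(xs.size());
}

// Sample standard deviation (divides by n - 1); assumes at least two values.
double sampleStdDev(const std::vector<double>& xs) {
    const double m = mean(xs);
    double ss = 0.0;
    for (double x : xs) ss += (x - m) * (x - m);
    return std::sqrt(ss / static_cast<double>(xs.size() - 1));
}

// Standardized value of each observation: (x - mean) / standard deviation.
std::vector<double> standardize(const std::vector<double>& xs) {
    const double m = mean(xs);
    const double sd = sampleStdDev(xs);
    std::vector<double> zs;
    zs.reserve(xs.size());
    for (double x : xs) zs.push_back((x - m) / sd);
    return zs;
}

// Fraction of standardized values whose magnitude is at most `limit`.
double fractionWithin(const std::vector<double>& zs, double limit) {
    std::size_t count = 0;
    for (double z : zs) {
        if (std::fabs(z) <= limit) ++count;
    }
    return static_cast<double>(count) / static_cast<double>(zs.size());
}
```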

The following six sections discuss the specifics of each program, how it was tested, and the results of the test. The sections are followed by a summarizing section that draws conclusions based upon the gathered data.



Matrix Addition of Floating Point Values

The Code

The MatrixAddFloat Java and C++ programs add two square matrices together and store the result in a third matrix. To do so, each program runs a doubly nested loop over the rows and columns, adds the corresponding entries of the two matrices, and stores the result in the matching position of the third matrix. The programs accept two values from standard input that determine the size of the matrices to add. The two matrices to be added are duplicates of one another and are created by storing the reciprocal of a value that is incremented across each column and then continues to be incremented across the next row down. For example, a 4 by 4 square test matrix would look like this:

[Figure: example 4 by 4 test matrix]

Creating the matrix in this way reduces the amount of overhead generated by input because only the matrix dimensions need to be known. Before the add is performed, the upper left and lower right corners of both matrices are printed. After the add is performed, the upper left and lower right corners of the result matrix are printed to help verify that the operation was performed correctly.

The Data

A total of one hundred tests were performed, five times each, on both of the programs. The tests used identical matrix dimensions and values to enable direct comparison of their performance. Originally, the tests started at 1 by 1 and ranged to 100 by 100. These values did not produce very interesting results because the matrices were too small and the additions were too fast. To compensate, the matrix size was increased. The square matrix sizes used to generate the tests started at 5 by 5, incremented by 5 in both width and height each time, and ranged to 500 by 500. The number of adds performed for a matrix depends on its size and is equal to the matrix's height times its width. For instance, for the largest matrix, the program performed 250,000 additions. Each of these matrices was tested five times to ensure a representative sample. After completing all of the tests, the data was reduced to one hundred distinct points by calculating the mean of the five individual trials.
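The tests themselves were driven and timed by Unix scripts, which are not shown here. As a loose in-process analogue, the sketch below generates the one hundred matrix dimensions described above and averages five trials of an arbitrary piece of work. It measures only wall-clock time inside the process using `std::chrono::steady_clock`, so it does not capture program startup or separate system and user time. The names `meanSeconds` and `additionSizes` are hypothetical.

```cpp
#include <chrono>
#include <functional>
#include <vector>

// Runs `work` `trials` times and returns the mean wall-clock time in seconds.
double meanSeconds(const std::function<void()>& work, int trials = 5) {
    using Clock = std::chrono::steady_clock;
    double total = 0.0;
    for (int i = 0; i < trials; ++i) {
        const auto start = Clock::now();
        work();
        const std::chrono::duration<double> elapsed = Clock::now() - start;
        total += elapsed.count();
    }
    return total / trials;
}

// Matrix dimensions used for the addition tests: 5, 10, ..., 500.
std::vector<int> additionSizes() {
    std::vector<int> sizes;
    for (int n = 5; n <= 500; n += 5) sizes.push_back(n);
    return sizes;
}
```

The list of sizes has one hundred entries, and the largest, 500, corresponds to 500 × 500 = 250,000 additions.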

The Results

The following table summarizes the descriptive statistics of this test. The table is followed by explanations of what the statistics imply.

                        Real Time                   System Time                 User Time
Statistic               C++           Java          C++           Java          C++           Java
Mean                    0.3028 sec    2.31912 sec   0.0608 sec    0.20748 sec   0.20614 sec   1.68478 sec
Median                  0.263 sec     1.982 sec     0.062 sec     0.206 sec     0.166 sec     1.351 sec
Standard Deviation      0.15080819    1.14421709    0.00624095    0.01604695    0.14846862    1.10905808
Number of Times C++
is Faster than Java     7.697794121                 3.456724796                 8.474991232

A plot of the standardized residual of the number of times C++ is faster than Java confirms that this data set closely corresponds to a normal distribution. This means that the data collected is a good sample of how C++ performs compared to Java and that inferences based on this data are valid.

C++ is approximately 7.7 times faster than Java at floating point matrix addition.

As expected, the amount of time needed to complete the program grew faster than linearly with the matrix dimension, because the number of elements, and therefore the number of additions, grows with the square of the dimension. This is shown by a plot of the amount of time needed to complete each consecutive test.



Matrix Addition of Integer Values

The Code

The MatrixAddInt Java and C++ programs are very similar to the MatrixAddFloat programs, except that the test matrices are created as integer values instead of floating point values. Each program accepts two arguments and generates identical test matrices of the size specified. The procedure that creates the test matrices increments a value across each column and then continues incrementing across the next row down. For example, a 4 by 4 square test matrix would look like this:

[Figure: example 4 by 4 test matrix]

As with the MatrixAddFloat program, before the add is performed, the upper left and lower right corners of both matrices are printed. After the add is performed, the upper left and lower right corners of the result matrix are printed to help verify that the operation was performed correctly.

The Data

Testing was performed exactly the same way as with the MatrixAddFloat programs. One hundred tests were performed five times each. Each consecutive test increased the size of the test matrices by 5 in both width and height. The resulting program times were averaged for the final test time.

The Results

The following table summarizes the descriptive statistics of this test. The table is followed by explanations of what the statistics imply.

                        Real Time                   System Time                 User Time
Statistic               C++           Java          C++           Java          C++           Java
Mean                    0.29296 sec   2.14324 sec   0.06122 sec   0.19842 sec   0.1703 sec    1.53664 sec
Median                  0.292 sec     1.909 sec     0.06 sec      0.198 sec     0.141 sec     1.253 sec
Standard Deviation      0.11157657    1.0634884     0.00691007    0.01505356    0.11655263    1.02084694
Number of Times C++
is Faster than Java     7.210231942                 3.278431978                 9.230543707

As with the MatrixAddFloat programs, a plot of the standardized residual indicated that these values correlate with a normal distribution.

C++ is approximately 7.2 times faster than Java at integer matrix addition.

As expected, the amount of time needed to complete the program grew faster than linearly with the matrix dimension, because the number of additions grows with the square of the dimension. This is shown by a plot of the amount of time needed to complete each consecutive test.



Matrix Product of Floating Point Values

The Code

The MatrixProductFloat Java and C++ programs compute the product of two square matrices and store the result in a third matrix. The tests performed on these programs were very similar to the tests performed on the matrix addition programs. Identical square test matrices, ranging from 1 by 1 to 100 by 100, were created and their product was calculated. The matrices were created by supplying the program with the matrix width and height from standard input. To reduce the overhead associated with data input, the program then created the test matrices automatically by filling each position with the reciprocal of a number that is incremented from left to right and then down each consecutive row. For example, a 4 by 4 test matrix generated by the program would look like this:

[Figure: example 4 by 4 test matrix]

The program then calculated the product of the two test matrices and stored the result in the third. Before the product is calculated, the upper left and lower right corners of both matrices are printed. After the product is calculated, the upper left and lower right corners of the result matrix are printed to help verify that the operation was performed correctly.

The Data

One hundred tests were performed five times each, for a total of five hundred data values. As mentioned before, the data used to test the programs ranged from a 1 by 1 square matrix to a 100 by 100 square matrix. Once all of the data was collected, the mean of the five trials for each matrix size was calculated to reduce the final data set to one hundred data points. Unlike matrix addition, ranging the data from a 1 by 1 matrix to a 100 by 100 matrix produced interesting results. For addition, the range had to stretch to 500 by 500 because the operation was relatively simple. Calculating the product of two matrices is a much more intensive operation. Each value in a particular row of the first matrix must be multiplied by the corresponding value in a particular column of the second, and these products are then added together and stored in the intersecting row/column position of the result matrix. This requires many more individual calculations than simply adding each element in the two matrices, which allowed the test matrices to be smaller while still producing data that showed the expected rapid growth in running time.

The Results

The following table summarizes the descriptive statistics of this test. The table is followed by explanations of what the statistics imply.

                        Real Time                   System Time                 User Time
Statistic               C++           Java          C++           Java          C++           Java
Mean                    0.35542 sec   2.57126 sec   0.0584 sec    0.2068 sec    0.2473 sec    1.96646 sec
Median                  0.264 sec     1.793 sec     0.058 sec     0.206 sec     0.142 sec     1.204 sec
Standard Deviation      0.23778019    1.78648024    0.00674499    0.016367      0.23523904    1.76564865
Number of Times C++
is Faster than Java     7.178870764                 3.589740834                 8.711618771

A plot of the standardized residual of the number of times C++ is faster than Java confirms that this data set closely corresponds to a normal distribution. This means that the data collected is a good sample of how C++ performs compared to Java and that inferences based on this data are valid.

C++ is approximately 7.2 times faster than Java at calculating the product of a floating point matrix.

As expected, the amount of time needed to complete the program grew much faster than linearly with the matrix dimension, because the number of multiplications grows with the cube of the dimension. This is shown by a plot of the amount of time needed to complete each consecutive test.



Matrix Product of Integer Values

The Code

The MatrixProductInteger Java and C++ programs are exactly like the MatrixProductFloat programs, except that all arithmetic uses integer values instead of floating point values. The test matrices generated are identical square matrices in which a value is incremented across a row and continues to be incremented across the next row and down the rest of the matrix. For example, a 4 by 4 test matrix would look like this:

[Figure: example 4 by 4 test matrix]

After constructing the test matrices, the program takes their product and stores the result in a third matrix. Before the calculation, the corners of the two test matrices are displayed; after the calculation, the corners of the result are displayed.

The Data

The test data used for this program was identical to that used for MatrixProductFloat. Five trials of one hundred specific matrix dimensions were run. The matrix dimensions ranged from 1 by 1 to 100 by 100. The five distinct trials were then averaged to reduce the data from five hundred to one hundred data points. Since calculating the product of two matrices requires many individual calculations, the range did not need to extend to 500 by 500 as it did for the addition programs. For further discussion of this, consult "The Data" sub-section of the Matrix Product of Floating Point Values section.

The Results

The following table summarizes the descriptive statistics of this test. The table is followed by explanations of what the statistics imply.

                        Real Time                   System Time                 User Time
Statistic               C++           Java          C++           Java          C++           Java
Mean                    0.34358 sec   2.53458 sec   0.06294 sec   0.20328 sec   0.24842 sec   1.91482 sec
Median                  0.244 sec     1.81 sec      0.062 sec     0.202 sec     0.148 sec     1.155 sec
Standard Deviation      0.23653839    1.73149577    0.00694832    0.01680673    0.23687056    1.72494968
Number of Times C++
is Faster than Java     7.407805962                 3.27072355                  3.27072355

A plot of the standardized residual of the number of times C++ is faster than Java confirms that this data set closely corresponds to a normal distribution. This means that the data collected is a good sample of how C++ performs compared to Java and that inferences based on this data are valid.

C++ is approximately 7.4 times faster than Java at calculating the product of an integer matrix.

As expected, the amount of time needed to complete the program grew much faster than linearly with the matrix dimension, because the number of multiplications grows with the cube of the dimension. This is shown by a plot of the amount of time needed to complete each consecutive test.



Singly Linked List, Data From Standard Input

The Code

The SLL,StdIn Java and C++ programs create an employee database by storing each employee's record in a node of a linked list. The program reads an employee's name, department, and salary amount from standard input and stores the record in the database. After storing the record, it asks for another employee until 'done' is entered as the employee's name. The program then prints the entire list of entered employees to standard output and exits.

The Data

A total of one hundred tests were performed, five times each, on both of the programs. The test data was generated by repeating the same employee record over and over, starting at ten total times and ranging to 1,000 total times in increments of ten. Each data point was tested five times and averaged to produce a final data set of one hundred points.

The Results

The following table summarizes the descriptive statistics of this test. The table is followed by explanations of what the statistics imply.

                        Real Time                   System Time                 User Time
Statistic               C++           Java          C++           Java          C++           Java
Mean                    0.23162 sec   1.98756 sec   0.08002 sec   0.26702 sec   0.11488 sec   1.33016 sec
Median                  0.229 sec     1.95 sec      0.079 sec     0.274 sec     0.116 sec     1.309 sec
Standard Deviation      0.09357488    0.67448725    0.03909119    0.05971616    0.05520307    0.61809886
Number of Times C++
is Faster than Java     8.922348075                 4.223604692                 11.79025711

A plot of the standardized residual of the number of times C++ is faster than Java confirms that this data set closely corresponds to a normal distribution. This means that the data collected is a good sample of how C++ performs compared to Java and that inferences based on this data are valid.

C++ is approximately 8.9 times faster than Java at filling and printing an SLL with data from standard input. Java is considerably slower than C++ at reading data from standard input, especially when compared with the performance difference in the arithmetic tests discussed above.

As expected, the amount of time needed to complete the program increased linearly due to the linear increase in number of employees being entered. This is shown by a plot of the amount of time needed to complete each consecutive test.



Singly Linked List, Data From File Input

The Code

The SLL,FileIn Java and C++ programs are identical to the SLL,StdIn programs, except that they read the data from a file and write it to a file instead of using standard input and output. The program reads employee information from an input file and, once finished, writes all of the data it read to a different output file.

The Data

The data set for these programs was identical to the data used for the SLL,StdIn programs. The only difference is that the data was stored in a file instead of entered as standard input. One hundred tests were executed, five times each, and then averaged to produce a total of 100 individual data points.

The Results

The following table summarizes the descriptive statistics of this test. The table is followed by explanations of what the statistics imply.

                        Real Time                   System Time                 User Time
Statistic               C++           Java          C++           Java          C++           Java
Mean                    0.18844 sec   2.74996 sec   0.05086 sec   0.3192 sec    0.06392 sec   0.87548 sec
Median                  0.186 sec     2.665 sec     0.05 sec      0.308 sec     0.064 sec     0.859 sec
Standard Deviation      0.06694824    0.91947156    0.02132055    0.08588976    0.02529969    0.35454288
Number of Times C++
is Faster than Java     14.87286053                 6.900287009                 13.79013001

A plot of the standardized residual of the number of times C++ is faster than Java confirms that this data set closely corresponds to a normal distribution. This means that the data collected is a good sample of how C++ performs compared to Java and that inferences based on this data are valid.

C++ is approximately 14.9 times faster than Java at filling and printing an SLL with data from file input. C++ outperforms Java by a wider margin on this test than on any other. Most of the other timings indicated that C++ was seven to eight times as fast but, in this case, C++ is nearly fifteen times faster. Clearly, C++ handles file I/O much more efficiently than Java does.

As expected, the amount of time needed to complete the program increased linearly due to the linear increase in number of employees being entered. This is shown by a plot of the amount of time needed to complete each consecutive test.



Summary of the Data

What does all this data mean? For starters, it means Java is definitely slower than C++. In no test did Java's real execution time come within a factor of five of C++'s. A plot of the number of times real execution of C++ was faster than Java clearly shows two important trends. First, most of the data indicates that C++ is seven to eight times faster than Java. All of this information comes from the matrix calculations. Based on this data, it is safe to assume that C++ is approximately seven to eight times faster than Java at arithmetic. The other interesting trend comes from the line that shows the number of times faster C++ is than Java at reading data into a program from a file. The line sits much higher on the vertical axis because C++ is far faster than Java at this type of operation. The data indicates that any program that performs a large amount of file I/O, such as a database, would run much slower if it were implemented in Java rather than C++.



What Can Be Done About the Speed Problem

Java may be slower, but there is hope. Several possibilities exist for decreasing the performance gap between C++ and Java.

First, programmers need to consider how important the speed difference between C++ and Java is to their application. Modern computers, many running at more than 200 MHz, make the speed difference tolerable. Java may be slower, but when the running times differ by only a few milliseconds, the difference is hard to notice. Applications that do not require large amounts of computation and don't need to run in real time may benefit from the advantages Java provides without suffering from its side effects. Computers will only become faster over time, and so will Java.

Another speed boost Java is receiving comes from the Just-In-Time (JIT) compiler. "... performance improvements can come from a 'Just-In-Time' compiler that does early binding by loading classes in anticipation, rather than on demand at runtime in an applet. Early symbol binding can be done in a separate thread, with otherwise unused CPU cycles put to use." (Just Java, 302). The JIT and other Virtual Machine improvements will reduce the noticeable speed difference between C++ and Java.

As a last resort, Java could receive a huge performance boost from changing the way the code is compiled. Instead of compiling into architecturally neutral byte code as it is now, Java could be compiled into native machine language in the same way C++ is. "[Java] is not inherently interpreted, and it may just as well be implemented by compiling its instruction set to that of a real CPU, as for a conventional programming language." (The Java Virtual Machine Specification, Introduction.doc.html). A well designed compiler could close nearly all of the performance gap between C++ and Java. The problem, of course, is that by doing so the code loses the advantage of being portable. But if speed is the primary concern, this approach would provide a solution.



Conclusion

Java has many advantages, but it also has a large disadvantage. Java performs many times slower than C++, especially when it comes to I/O operations. Although this performance difference cannot be ignored, it may not be overly important. Modern computers are very fast, and the speed difference may often translate to waiting 8 milliseconds for Java as opposed to 1 millisecond for C++. Java may be much slower but, from the user's perspective, there is relatively little difference. The advantages of Java outweigh its disadvantages in most cases. The ease of development, reusability, and portability make Java well worth the time to learn.

