C code always runs way faster than Java, right? Wrong!
We all know the prejudice: Java, being interpreted, is slow, while C, being compiled and optimized, runs very fast. As you might know, the picture is quite different.
TL;DR: Java can be faster in situations where the JIT can perform inlining because all methods/functions are visible to it, whereas the C compiler (unless link-time optimization is used) cannot optimize across compilation units (think of libraries etc.).
A C compiler takes the C code as input, compiles and optimizes it, and generates machine code for a specific CPU or architecture. The result is an executable that can be run directly on the given machine without further steps. Java, on the other hand, has an intermediate step: bytecode. The Java compiler takes Java code as input and generates bytecode, which is basically machine code for an abstract machine. For each (popular) CPU architecture there is a Java Virtual Machine, which simulates this abstract machine and executes (interprets) the generated bytecode. And this is as slow as it sounds. On the other hand, bytecode is quite portable, as the same output will run on all platforms, hence the slogan “Write once, run anywhere”.
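To make the idea of "machine code for an abstract machine" a bit more concrete, here is a minimal sketch of an interpreter for a tiny stack machine. The class name ToyInterpreter and the four opcodes are made up for illustration; they are not real JVM bytecodes, and a real JVM interpreter is far more involved.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy stack machine. The opcodes are hypothetical and NOT real JVM bytecodes. */
public class ToyInterpreter {
    public static final int PUSH = 0; // push the next program value onto the stack
    public static final int ADD = 1;  // pop two values, push their sum
    public static final int MUL = 2;  // pop two values, push their product
    public static final int HALT = 3; // stop and return the top of the stack

    public static int run(int[] program) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0; // program counter: index of the next instruction
        while (true) {
            int opcode = program[pc++];
            // Every single instruction goes through this decode-and-dispatch
            // step, which is the overhead a pure interpreter pays.
            switch (opcode) {
                case PUSH:
                    stack.push(program[pc++]);
                    break;
                case ADD:
                    stack.push(stack.pop() + stack.pop());
                    break;
                case MUL:
                    stack.push(stack.pop() * stack.pop());
                    break;
                case HALT:
                    return stack.pop();
                default:
                    throw new IllegalStateException("unknown opcode " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        // Encodes (2 + 3) * 4
        int[] program = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT};
        System.out.println(run(program)); // prints 20
    }
}
```

The same int[] program would run unchanged wherever this interpreter runs, which is the portability argument in miniature.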
With only the approach described above, it would rather be “write once, wait everywhere”, as the interpreter would be quite slow. So what a modern JVM does is just-in-time (JIT) compilation: the JVM internally translates the bytecode into machine code for the CPU at hand. As this process is quite complex, the HotSpot JVM (the one most commonly used) only does this for code fragments which are executed often enough (hence the name HotSpot).
Besides being faster at startup (the interpreter starts right away, the JIT compiler kicks in as needed), this has another benefit: the HotSpot JIT already knows which parts of the code are called frequently and which are not, so it can use that information while optimizing its output. This is where our example comes into play.
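If you want to watch this happen, a small program with one obviously hot method is enough. The following is only a sketch: the class HotLoop and its methods are hypothetical and not part of the example discussed in this post.

```java
/** Hypothetical demo class with one small method that is called very often. */
public class HotLoop {
    // Tiny method; after enough calls it becomes a candidate for JIT compilation.
    public static int square(int x) {
        return x * x;
    }

    // Calls square() n times and accumulates the results in a long.
    public static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += square(i % 1000);
        }
        return sum;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long result = sumOfSquares(200_000_000);
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(result + " in " + millis + " ms");
    }
}
```

After compiling with javac HotLoop.java, running java -Xint HotLoop forces HotSpot to interpret everything, while plain java HotLoop lets the JIT kick in. Adding -XX:+PrintCompilation logs methods as they get compiled. The timings will depend on your machine and JVM version.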
Before having a look at my tiny, totally made-up example, let me note that Java has a lot of features, like dynamic dispatch (calling a method on an interface), which also come with runtime overhead. So Java code is probably easier to write but will still generally be slower than C code. However, when it comes to pure number crunching, as in my example below, there are interesting things to discover.
So without further ado, here is the example C code:
test.c:
    int main(int argc, char** argv) {
    ...
        for (int l = 0; l < 1000; l++) {
    ...
test1.c:
What the main function actually computes isn’t important at all. The point is that it calls two functions (test and compute) very often and that those functions are in another compilation unit (test1.c). Now let’s compile and run the program:
So this takes about 6.6 seconds to perform the computation. Now let’s have a look at the Java program:
Test.java:
    private static int test(int i) {
    ...
    private static int compute(int i) {
    ...
    private static int exec() {
        int sum = 0; for (int l = 0; l < 1000; l++) {
            int i = 0; while (i < 2000000) {
    ...
    public static void main(String[] args) {
    ...
Now let’s compile and execute this:
Taking 3.4 seconds, Java is quite a bit faster for this simple task (and this even includes the slow startup of the JVM). The question is: why? The answer, of course, is that the JIT can perform code optimizations that the C compiler can’t. In our case it is function inlining. As we defined our two tiny functions in their own compilation unit, the compiler cannot inline them when compiling test.c. The JIT, on the other hand, has all methods at hand and can perform aggressive inlining, hence the compiled code is way faster.
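For readers who have not seen inlining spelled out, here is a rough sketch of what it means. The method names test and compute are taken from the example, but their bodies here are assumptions made purely for illustration, as are the class name InliningSketch and the two loop methods. The second loop shows by hand roughly what an inlining compiler may turn the first one into.

```java
/** Sketch only: the bodies of test() and compute() are hypothetical. */
public class InliningSketch {
    // Assumed stand-ins for the two tiny functions from the example.
    private static int test(int i) {
        return i + 1;
    }

    private static int compute(int i) {
        return i * 3;
    }

    // The loop as written in source: two method calls per iteration.
    public static int withCalls(int n) {
        int sum = 0;
        int i = 0;
        while (i < n) {
            sum += compute(i);
            i = test(i);
        }
        return sum;
    }

    // The same loop with both method bodies pasted in by hand.
    // No call overhead is left, and the optimizer sees the whole loop body.
    public static int inlinedByHand(int n) {
        int sum = 0;
        int i = 0;
        while (i < n) {
            sum += i * 3;
            i = i + 1;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(withCalls(1000) == inlinedByHand(1000)); // prints true
    }
}
```

Both methods compute the same value. The difference is only in how much work each iteration costs when the calls are not inlined.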
So is that a totally exotic and made-up example which never occurs in real life? Yes and no. Of course it is an extreme case, but think about all the libraries you include in your code. Their functions typically cannot be considered for inlining in C, whereas in Java it does not matter where the bytecode comes from. As it is all present in the running JVM, the JIT can optimize to its heart’s content. Of course there is a dirty trick in C to lower this pain: macros. This is, in my eyes, one of the major reasons why so many libraries in C still use macros instead of proper functions, with all the problems and headaches that come with them.
Now before the flamewars start: both of these languages have their strengths and weaknesses, and both have their place in the world of software engineering. This post was only written to open your eyes to the magic and wonders that a modern JVM makes happen each and every day.