For some reason people read this as a rebuttal of Drew Crawford’s article. It is not. It is merely a response: I accept almost everything he said but have a slightly different interpretation of some of the points.
Over the weekend quite a few people wrote to me about a well-researched article written by Drew Crawford.
But first let’s start with who I am (Shai Almog) and what I did. I wrote a lot of Java VM code when consulting for Sun Microsystems, on mobile devices that had a fraction of the RAM/CPU available on today’s devices. Today I’m the co-founder of Codename One, where I regularly write low-level Android, iOS, RIM, Windows Phone etc. code to allow our platform to work everywhere seamlessly. So I have pretty decent qualifications to discuss devices, their performance issues etc.
Let’s start with the bottom line of my opinion:
I think Drew didn’t cover the slowest and biggest problem in web technologies: the DOM.
His claims regarding GC/JIT are inaccurate.
People refer to performance in many ways, but generally most of us think of performance in terms of UI sluggishness, which we call “perceived performance”.
Perceived performance is pretty hard to measure, but it’s pretty easy to see why it sucks on web UIs: the DOM.
To make matters worse, many small things such as complex repeat patterns, translucency layers etc. make optimizing/benchmarking such UIs really difficult.
Why Java Is Fast & Objective-C Is Slow
The first thing people need to understand about Objective-C: it isn’t C.
C is fast, pretty much as fast as could be when done right.
Objective-C doesn’t use methods like Java/C++/C#; it uses messages like Smalltalk. This effectively means it always performs dynamic binding, and invoking a message is REALLY slow in Objective-C: at least two times slower than a method call in statically compiled Java.
A JIT can (and does) produce faster method invocations than a static compiler since it can perform dynamic binding and even virtual method inlining, e.g. removing setter/getter overhead! Mobile JITs are usually simpler than desktop JITs but they can still do a lot of these optimizations.
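To make the getter/setter point concrete, here is a minimal sketch of the kind of code where inlining pays off. `Point`, `GetterInliningSketch` and `sumX` are made-up names for illustration, and whether a particular VM actually inlines the call is an assumption that depends on the JIT in question.

```java
// Illustrative only: Point and sumX are hypothetical names.
final class Point {
    private int x;

    Point(int x) { this.x = x; }

    // Trivial getter: a typical inlining candidate.
    int getX() { return x; }

    void setX(int x) { this.x = x; }
}

class GetterInliningSketch {
    // Hot loop: every iteration calls getX(). A JIT that observes a single
    // implementation of getX() at run time may replace the call with a
    // direct field read. Whether it does so depends on the VM.
    static long sumX(Point[] points) {
        long total = 0;
        for (Point p : points) {
            total += p.getX();
        }
        return total;
    }

    public static void main(String[] args) {
        Point[] points = new Point[1000];
        for (int i = 0; i < points.length; i++) {
            points[i] = new Point(i);
        }
        System.out.println(sumX(points)); // prints 499500
    }
}
```

The source code keeps the encapsulation of the getter, while the generated machine code may end up no different from reading the field directly.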
We used to do some pretty amazing things with Java VMs on devices that had less than 1 MB of RAM in some cases; you can check out the rather old blog from Mark Lam about some of the things Sun used to do here.
But iPhone is faster than Android?
iPhone has better perceived performance. Apps seem to launch instantly since they have hardcoded splash images (impractical for Android, which has too many flavors and screen sizes). The animations in iOS are amazingly smooth (although Android with Project Butter is pretty much there too), but these aren’t written in Objective-C… All the heavy-lifting animations you see in iOS are performed on the GPU using Core Animation; Objective-C is only a thin API on top of that.
Getting back to the point though
CPU cache utilization is one of the most important advantages of native code when it comes to raw performance. On the desktop the cache is already huge, but on mobile it’s small and every cache miss costs precious CPU cycles. Even elaborate benchmarks usually sit comfortably within a CPU cache, but a large/complex application that makes use of external modules is problematic. But I digress….
the work Mozilla did with ASM.js
Are GCs Expensive?
Yes, they have a cost; no, it’s not a big deal.
ARC is an Apple “workaround” for their awful GC.
Writing a GC is painful for a language like Objective-C, which inherits the “problematic” structure of C pointers (pointer arithmetic and memory manipulation) and adds plenty of complexities of its own. I’m not saying a GC is trivial in a managed language like Java, but it is far easier by comparison.
The problem with GC is its unpredictable nature. A GC might suddenly “decide” it needs to stop the world and literally trash your framerate; this is problematic for games and smooth UIs. However, there is a very simple solution: don’t allocate when you need fast performance. This is good practice regardless of whether you are using a GC, since allocation and deallocation of memory are slow operations (in fact game programmers NEVER allocate during game level execution).
This isn’t really hard: you just make sure that while you are performing an animation or running a game level you don’t make any allocations. The GC is then unlikely to kick in and your performance will be predictable and fast. ARC, on the other hand, doesn’t allow you to do that, since ARC instantly deallocates an object you finished working with (just to clarify: reference counting is used, and “instantly” means when the ref count reaches 0). While it’s faster than a full GC cycle or manual reference counting, it’s still pretty slow. So yes, you can hand-code faster memory management in C and get better performance that way; however, for very complex applications (and UI is pretty complex, not to mention the management of native peers) you will end up with crashes. To avoid those crashes you add checks and safeties, which bring back the very performance penalties you were trying to avoid.
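The “don’t allocate while animating” rule can be sketched with a preallocated object pool. This is only one way to do it; `Particle` and `ParticlePool` are hypothetical names invented for the example, not part of any framework.

```java
// Illustrative only: Particle and ParticlePool are hypothetical names.
final class Particle {
    float x, y, vx, vy;
    boolean alive;
}

final class ParticlePool {
    private final Particle[] particles;

    // All allocation happens here, before the animation starts.
    ParticlePool(int capacity) {
        particles = new Particle[capacity];
        for (int i = 0; i < capacity; i++) {
            particles[i] = new Particle();
        }
    }

    // Reuses a dead particle instead of calling `new`.
    // Returns null when every particle is already in use.
    Particle spawn(float x, float y, float vx, float vy) {
        for (Particle p : particles) {
            if (!p.alive) {
                p.x = x; p.y = y; p.vx = vx; p.vy = vy;
                p.alive = true;
                return p;
            }
        }
        return null;
    }

    // Per-frame update: arithmetic on existing objects, no allocation.
    void update(float dt, float maxY) {
        for (Particle p : particles) {
            if (!p.alive) continue;
            p.x += p.vx * dt;
            p.y += p.vy * dt;
            if (p.y > maxY) {
                p.alive = false; // "freed" by marking it reusable
            }
        }
    }

    int liveCount() {
        int count = 0;
        for (Particle p : particles) {
            if (p.alive) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        ParticlePool pool = new ParticlePool(64);
        pool.spawn(0f, 0f, 1f, 10f);
        pool.update(1f, 5f); // the particle moves past maxY and is recycled
        System.out.println(pool.liveCount()); // prints 0
    }
}
```

Because nothing is allocated inside `spawn` or `update`, the frame loop creates no garbage, so a collector has little reason to run in the middle of the animation.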
Furthermore, good JITs can detect various patterns, such as allocations that are tied together, and unify the memory allocation/deallocation. They can also move objects into the stack frame rather than the heap when they detect specific allocation usage. Unfortunately, while some of these allocation patterns were discussed by teams when I was at Sun, I don’t know if they were actually implemented (mostly because Sun focused on server GCs/JITs, which have very different requirements).
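As a rough illustration of an allocation pattern that a JIT could move off the heap, consider a temporary object that never leaves the method that creates it. `Vec2` and `distanceSquared` are hypothetical names, and the sketch assumes a VM that performs this kind of analysis; on a VM that doesn’t, the code still works but allocates normally.

```java
// Illustrative only: Vec2 and EscapeSketch are hypothetical names.
final class Vec2 {
    final double x, y;

    Vec2(double x, double y) {
        this.x = x;
        this.y = y;
    }

    Vec2 minus(Vec2 other) {
        return new Vec2(x - other.x, y - other.y);
    }
}

class EscapeSketch {
    // The temporary Vec2 objects never leave this method: they are not
    // stored in a field, returned, or handed to code that keeps them.
    // A JIT that detects this may keep the two doubles in registers or
    // on the stack and skip the heap allocation (an assumption about
    // the VM, not a guarantee).
    static double distanceSquared(double ax, double ay, double bx, double by) {
        Vec2 d = new Vec2(ax, ay).minus(new Vec2(bx, by));
        return d.x * d.x + d.y * d.y;
    }

    public static void main(String[] args) {
        System.out.println(distanceSquared(3, 4, 0, 0)); // prints 25.0
    }
}
```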
The article also mentions desktop GCs being optimized for larger heap spaces and a study from 2005 that “proves it”. This is true for desktop GCs but isn’t true for mobile GCs, e.g. Monty (Sun’s VM) had the ability to GC the actual compiled machine code. So effectively, if your app was JITed and took too much space in RAM for an execution path you no longer use much, Monty could just collect that memory (the desktop JIT, to my knowledge, was never this aggressive).
A proper GC optimized for mobile devices and smaller heap overhead will be slower than some of the better desktop GCs, but it can actually reduce memory usage compared to native code (by removing unused code paths). Just so we can talk scale, our code performed really well on a 2 MB 240×320 Nokia device and on weaker devices than that. It ran smoothly, animations and everything, including GC.