For some reason people read this as a rebuttal of Drew Crawford's article; it is not. It is merely a response: I accept almost everything he said but have a slightly different interpretation of some of the points.
Over the weekend quite a few people wrote to me about a well-researched article written by Drew Crawford
on of the RAM/CPU available for today's devices. Today I'm the co-founder of Codename One, where I regularly write low-level Android, iOS, RIM, Windows Phone etc. code to allow our platform to work everywhere seamlessly. So I have pretty decent qualifications to discuss devices, their performance issues etc.
Let's start with the bottom line of my opinion:
- I think Drew didn't cover the slowest and biggest problem in web technologies: the DOM.
- His claims regarding GC/JIT are inaccurate.
To make matters worse, many small things such as complex repeat patterns, translucency layers etc. make optimizing/benchmarking such UIs really difficult.
Why Java Is Fast & Objective-C Is Slow
The rest of the article talks a lot about native code and how fast it is. Unfortunately, it ignores some basic facts that are pretty important while repeating some things that aren't quite accurate. The first thing people need to understand about Objective-C: it isn't C. C is fast, pretty much as fast as could be when done right. Objective-C doesn't use methods like Java/C++/C#; it uses messages like Smalltalk. This effectively means it always performs dynamic binding, and invoking a message is REALLY slow in Objective-C: at least two times slower than statically compiled Java. A JIT can (and does) produce faster method invocations than a static compiler since it can perform dynamic binding and even virtual method inlining, e.g. removing setter/getter overhead! Mobile JITs are usually simpler than desktop JITs but they can still do a lot of these optimizations. We used to do some pretty amazing things with Java VMs on devices that had less than 1MB of RAM in some cases; you can check out the rather old blog from Mark Lam about some of the things Sun used to do here.
But iPhone is faster than Android?
CPU cache utilization is one of the most important advantages of native code when it comes to raw performance. On the desktop the cache is already huge, but on mobile it's small and every cache miss costs precious CPU cycles. Even elaborate benchmarks usually sit comfortably within
Yes, they have a cost; no, it's not a big deal.
ARC is an Apple "workaround" for their awful GC. Writing a GC is painful for a language like Objective-C, which inherits the "problematic" structure of C pointers (pointer arithmetic and memory manipulation) and adds plenty of complexities of its own. I'm not saying a GC is trivial in a managed language like Java, but it is a walk in the park by comparison.
The problem with GC is its unpredictable nature. A GC might suddenly "decide" it needs to stop the world and literally trash your framerate; this is problematic for games and smooth UIs. However, there is a very simple solution: don't allocate when you need fast performance. This is good practice regardless of whether you are using a GC, since allocation/deallocation of memory are slow operations (in fact, game programmers NEVER allocate during game level execution). This isn't really hard: you just make sure that while you are performing an animation or running within a game level you don't make any allocations. The GC is then unlikely to kick in and your performance will be predictable and fast.
ARC, on the other hand, doesn't allow you to do that, since ARC instantly deallocates an object you finished working with (just to clarify: reference counting is used, and "instantly" means when the ref count reaches 0). While it's faster than a full GC cycle or manual reference counting, it's still pretty slow. So yes, you can hand-code faster memory management in C and get better performance that way; however, for very complex applications (and UI is pretty complex, not to mention the management of native peers) you will end up with crashes. To avoid the crashes you add checks and safeties, which eat into the very performance gains you were trying to achieve.
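The "don't allocate while animating" advice can be sketched in Java as a pool that is filled once, before the animation starts, and then only reused. This is a minimal illustration, not code from any real engine: `ParticlePool`, `Particle`, and every method name are hypothetical.

```java
// Hypothetical sketch: ParticlePool and Particle are illustrative names only.
final class ParticlePool {
    /** Small mutable object that a naive animation would allocate every frame. */
    static final class Particle {
        float x, y, dx, dy;
        boolean active;
    }

    private final Particle[] particles;

    /** All allocation happens here, up front, before the animation runs. */
    ParticlePool(int capacity) {
        particles = new Particle[capacity];
        for (int i = 0; i < capacity; i++) {
            particles[i] = new Particle();
        }
    }

    /** Reuses an inactive particle; returns null when the pool is exhausted. */
    Particle spawn(float x, float y, float dx, float dy) {
        for (Particle p : particles) {
            if (!p.active) {
                p.x = x;
                p.y = y;
                p.dx = dx;
                p.dy = dy;
                p.active = true;
                return p;
            }
        }
        return null; // never allocate on this path; the caller decides what to do
    }

    /** Per-frame update: no 'new' anywhere, so it creates no garbage. */
    void update() {
        for (Particle p : particles) {
            if (p.active) {
                p.x += p.dx;
                p.y += p.dy;
                if (p.x < 0 || p.y < 0) {
                    p.active = false; // hand the instance back to the pool
                }
            }
        }
    }

    /** Number of particles currently in use. */
    int activeCount() {
        int count = 0;
        for (Particle p : particles) {
            if (p.active) {
                count++;
            }
        }
        return count;
    }
}
```

The trade-off is a fixed upper bound on live objects and some memory held even when idle, in exchange for a frame loop that gives the collector nothing new to clean up.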
Furthermore, good JITs can detect various patterns, such as allocations that are tied together, and unify the memory allocation/deallocation. They can also move allocations into the stack frame rather than the heap when they detect specific allocation usage. Unfortunately, while some of these allocation patterns were discussed by teams when I was at Sun, I don't know if they were actually implemented (mostly because Sun focused on server GCs/JITs, which have very different requirements).
The article also mentions desktop GCs being optimized for larger heap spaces and a study from 2005 that "proves it". This is true for desktop GCs but isn't true for mobile GCs, e.g. Monty (Sun's VM) had the ability to GC the actual compiled machine code. So effectively, if your app was JITed and took too much space in RAM for an execution path you no longer use much, Monty could just collect that memory (the desktop JIT, to my knowledge, was never this aggressive).
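To make the earlier stack-allocation point concrete, here is a small Java sketch of the kind of code where a JIT could, in principle, avoid heap allocation altogether. The names `EscapeSketch`, `Vec2`, and both methods are hypothetical, and whether any particular mobile VM performs this optimization is an assumption, not something this post confirms.

```java
// Hypothetical sketch: EscapeSketch and Vec2 are illustrative names only.
final class EscapeSketch {
    /** Tiny immutable value type, typical of short-lived temporaries. */
    static final class Vec2 {
        final float x, y;

        Vec2(float x, float y) {
            this.x = x;
            this.y = y;
        }

        Vec2 plus(Vec2 other) {
            return new Vec2(x + other.x, y + other.y);
        }

        float lengthSquared() {
            return x * x + y * y;
        }
    }

    /**
     * Creates three temporary Vec2 objects. None of them is stored in a
     * field, returned, or handed to other code, so they never "escape"
     * this method. A JIT that proves this may keep their fields in
     * registers or on the stack instead of allocating on the heap.
     */
    static float sumLengthSquared(float ax, float ay, float bx, float by) {
        Vec2 sum = new Vec2(ax, ay).plus(new Vec2(bx, by));
        return sum.lengthSquared();
    }

    /** Hand-written version of what such an optimization would boil down to. */
    static float sumLengthSquaredNoAlloc(float ax, float ay, float bx, float by) {
        float x = ax + bx;
        float y = ay + by;
        return x * x + y * y;
    }
}
```

Both methods compute the same value; the difference is only whether the programmer or the JIT is responsible for getting rid of the temporaries.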
A proper GC optimized for mobile devices and smaller heap overhead will be slower than some of the better desktop GCs, but it can actually reduce memory usage compared to native code (by removing unused code paths). Just so we can talk scales, our code performed really well on a 2MB 240x320 Nokia device and on weaker devices than that. It ran smoothly, animations and everything, including GC.