Why Mobile Web Is Slow?

Clarification update: For some reason people read this as a rebuttal of Drew Crawford's article. It is not. It is merely a response: I accept almost everything he said but have a slightly different interpretation of some of the points.

Over the weekend quite a few people wrote to me about a well-researched article written by Drew Crawford, in which he gives some insights into why the mobile web/JavaScript is slow and will not (for the foreseeable future) compete with native code. My opinion of the article is mixed. It is well written and very well researched, and I agree with a few of the points, but I think some of his conclusions are wrong and parts of his reasoning are inaccurate.

But first let's start with who I am (Shai Almog) and what I did. I wrote a lot of Java VM code when consulting for Sun Microsystems, on mobile devices that had a fraction of the RAM/CPU available on today's devices. Today I'm the co-founder of Codename One, where I regularly write low-level Android, iOS, RIM, Windows Phone etc. code to allow our platform to work everywhere seamlessly. So I have pretty decent qualifications to discuss devices, their performance issues etc.

Let's start with the bottom line of my opinion:
  1. I think Drew didn't cover the slowest and biggest problem in web technologies: the DOM.
  2. His claims regarding GC/JIT are inaccurate.

So why is JavaScript slow?
People refer to performance in many ways, but generally most of us think of performance in terms of UI sluggishness. In fact JavaScript can't technically perform slowly since it is for most intents and purposes single threaded (ignoring the joke that is web workers), so long-running JavaScript code that takes 50 seconds just won't happen (you will get constant browser warnings). It's all just UI stalls, or what we call "perceived performance".

Perceived performance is pretty hard to measure, but it's pretty easy to see why it sucks on web UIs: the DOM.

To understand this you need to understand how the DOM works: every element within the page is a box whose size/flow can be determined via content manipulation and style manipulation. Normally this could be very efficient since the browser could potentially optimize the living hell out of rendering this data. However, JavaScript allows us to change the DOM on the fly and actually requires that to create most UIs. The problem is that "reflow" is a really difficult concept. When you have a small amount of data or a simple layout, the browsers' amazing rendering engines can do wonders. However, when dependencies become complex and the JavaScript changes a node at a "problematic" point, it might trigger deep reflow calculations that can appear very slow. This gets worse because the logic sits so deep in the browser that you can end up with a performance penalty that's browser specific and really hard to track.

To make matters worse, many small things such as complex repeat patterns, translucency layers etc. make optimizing/benchmarking such UIs really difficult.

Why Java Is Fast & Objective-C Is Slow

The rest of the article talks a lot about native code and how fast it is. Unfortunately it ignores some basic facts that are pretty important while repeating some things that aren't quite accurate.

The first thing people need to understand about Objective-C: it isn't C.
C is fast, pretty much as fast as could be when done right.

Objective-C doesn't use methods like Java/C++/C#, it uses messages like Smalltalk. This effectively means it always performs dynamic binding and invoking a message is REALLY slow in Objective-C. At least two times slower than statically compiled Java.
A JIT can (and does) produce faster method invocations than a static compiler since it can perform dynamic binding and even virtual method inlining e.g. removing a setter/getter overhead! Mobile JITs are usually simpler than desktop JITs but they can still do a lot of these optimizations.
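To make the getter/setter point concrete, here is a minimal sketch in Java. The names Point, getX and sumX are made up for illustration. getX() is an ordinary virtual method, but as long as no loaded subclass overrides it, a JIT is free to bind the call directly and inline it into a plain field read. Whether a given mobile VM actually does so is an assumption that depends on the VM.

```java
// Hypothetical example: Point and InliningSketch are illustrative names only.
class Point {
    private int x;

    Point(int x) {
        this.x = x;
    }

    // Non-final method in a non-final class, so in Java terms it is virtual.
    // A JIT that sees no overriding subclass loaded can treat the call as
    // monomorphic and inline it, leaving just a field read.
    int getX() {
        return x;
    }

    void setX(int x) {
        this.x = x;
    }
}

final class InliningSketch {
    // Hot loop that calls the getter once per element. After inlining,
    // the loop body contains no method call at all.
    static long sumX(Point[] points) {
        long total = 0;
        for (Point p : points) {
            total += p.getX();
        }
        return total;
    }
}
```

A static compiler that cannot see the whole class hierarchy has to keep the virtual dispatch, which is the difference the paragraph above is getting at.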

We used to do some pretty amazing things with Java VMs on devices that had less than 1MB of RAM in some cases. You can check out the rather old blog from Mark Lam about some of the things Sun used to do here.

But iPhone is faster than Android?
Is it?
iPhone has better perceived performance. Apps seem to launch instantly since they have hardcoded splash images (impractical for Android, which has too many flavors and screen sizes). The animations in iOS are amazingly smooth (although Android with Project Butter is pretty much there too), but these aren't written in Objective-C... All the heavy-lifting animations you see in iOS are performed on the GPU using Core Animation; Objective-C is only a thin API on top of that.

Getting back to the point though: he is 100% right about JavaScript not being a good language to optimize. It doesn't handle typing strictly, which makes JITs far too complex. The verification process in Java is HUGELY important: once it has run, the JIT can make a lot of assumptions and be very simple. Hence it can be smaller, which means better utilization of the CPU cache. This is hugely important since bytecode is smaller than machine code in the case of Java. CPU cache utilization is one of the most important advantages of native code when it comes to raw performance. On the desktop the cache is already huge, but on mobile it's small and every cache miss costs precious CPU cycles. Even elaborate benchmarks usually sit comfortably within a CPU cache, but a large/complex application that makes use of external modules is problematic. But I digress....

Proving that JavaScript's lack of strictness is problematic is really easy. All we need to do is look at the work Mozilla did with asm.js, which brings JavaScript performance to a completely different place. Remove abilities from JavaScript and make it strict: it becomes fast.

Are GCs Expensive?
Yes, they have a cost. No, it's not a big deal.

ARC is an Apple "workaround" for their awful GC.
Writing a GC is painful for a language like Objective-C, which inherits the "problematic" structure of C pointers (pointer arithmetic and memory manipulation) and adds to it plenty of complexities of its own. I'm not saying a GC is trivial in a managed language like Java, but it is a walk in the park by comparison.

The problem with GC is in its unpredictable nature. A GC might suddenly "decide" it needs to stop the world and literally trash your framerate, which is problematic for games and smooth UIs. However, there is a very simple solution: don't allocate when you need fast performance. This is good practice regardless of whether you are using a GC, since allocation/deallocation of memory are slow operations (in fact game programmers NEVER allocate during game level execution).

This isn't really hard: you just make sure that while you are performing an animation or within a game level you don't make any allocations. The GC is unlikely to kick in and your performance will be predictable and fast. ARC on the other hand doesn't allow you to do that, since ARC instantly deallocates an object you finished working with (just to clarify: reference counting is used, and "instantly" means when the ref count reaches 0). While it's faster than a full GC cycle or manual reference counting, it's still pretty slow. So yes, you can hand code faster memory management code in C and get better performance that way. However, for very complex applications (and UI is pretty complex, not to mention the management of native peers) you will end up with crashes. To avoid those crashes you add checks and safeties, which bring back the very performance penalties you were trying to avoid.
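The "don't allocate while animating" advice can be sketched in Java roughly like this. Particle and ParticlePool are hypothetical names invented for the example, not part of any framework. Everything is allocated once up front, and the per-frame update only mutates existing objects, so it gives the GC no new garbage to react to.

```java
// Hypothetical example: a fixed-size pool that is filled once, before the
// animation starts, and reused for its whole duration.
final class Particle {
    float x, y, vx, vy;
    boolean alive;
}

final class ParticlePool {
    private final Particle[] particles;

    // All allocation happens here, outside the performance-critical phase.
    ParticlePool(int capacity) {
        particles = new Particle[capacity];
        for (int i = 0; i < capacity; i++) {
            particles[i] = new Particle();
        }
    }

    // Reuses the first dead particle. Returns null when the pool is
    // exhausted instead of allocating a new object mid-animation.
    Particle spawn(float x, float y, float vx, float vy) {
        for (Particle p : particles) {
            if (!p.alive) {
                p.x = x;
                p.y = y;
                p.vx = vx;
                p.vy = vy;
                p.alive = true;
                return p;
            }
        }
        return null;
    }

    // Per-frame update: no "new" anywhere on this path. Particles that
    // move past maxY are marked dead so spawn() can recycle them.
    // Returns the number of particles still alive.
    int update(float dt, float maxY) {
        int live = 0;
        for (Particle p : particles) {
            if (!p.alive) {
                continue;
            }
            p.x += p.vx * dt;
            p.y += p.vy * dt;
            if (p.y > maxY) {
                p.alive = false;
            } else {
                live++;
            }
        }
        return live;
    }
}
```

The trade-off is a hard cap on live objects, which is usually acceptable for an animation or a game level where the worst case is known in advance.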

Furthermore, good JITs can detect various patterns, such as allocations that are tied together, and unify the memory allocation/deallocation. They can also allocate objects in the stack frame rather than on the heap when they detect specific allocation usage. Unfortunately, while some of these allocation patterns were discussed by teams when I was at Sun, I don't know if they were actually implemented (mostly because Sun focused on server GCs/JITs, which have very different requirements).
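As a rough illustration of the kind of allocation pattern meant here, consider the Java sketch below. Vec2 and distanceSquared are hypothetical names. The temporary objects never leave the method, so a JIT that analyses how they are used could keep their fields in the stack frame or in registers and skip the heap entirely. Whether any particular VM performs this optimization is an assumption, not something this code guarantees.

```java
// Hypothetical example: small immutable value-like class.
final class Vec2 {
    final float x, y;

    Vec2(float x, float y) {
        this.x = x;
        this.y = y;
    }

    Vec2 minus(Vec2 other) {
        return new Vec2(x - other.x, y - other.y);
    }

    float lengthSquared() {
        return x * x + y * y;
    }
}

final class EscapeSketch {
    // Three Vec2 objects are created here and none of them is stored in a
    // field, passed to unknown code, or returned. Only the float result
    // leaves the method, which is what makes the allocations candidates
    // for being moved off the heap.
    static float distanceSquared(float ax, float ay, float bx, float by) {
        Vec2 a = new Vec2(ax, ay);
        Vec2 b = new Vec2(bx, by);
        return a.minus(b).lengthSquared();
    }
}
```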

The article also mentions desktop GCs being optimized for larger heap spaces and a study from 2005 that "proves it". This is true for desktop GCs but isn't true for mobile GCs, e.g. Monty (Sun's VM) had the ability to GC the actual compiled machine code. So effectively if your app was JITed and took too much space in RAM for an execution path you no longer use much, Monty could just collect that memory (the desktop JIT to my knowledge was never this aggressive). A proper GC optimized for mobile devices and smaller heap overhead will be slower than some of the better desktop GCs, but it can actually reduce memory usage compared to native code (by removing unused code paths). Just to give a sense of scale, our code performed really well on a 2MB 240x320 Nokia device and on weaker devices than that. It ran smoothly, animations and everything, including GC.

Posted by Shai Almog

Shai is the co-founder of Codename One. He's been a professional programmer for more than 25 years. During that time he has worked with dozens of companies and extensively with Sun Microsystems.