
Micro-benchmarking 2nd / 3rd gen iPhones

I follow the excellent weekly posts by Mike Ash, and entered a brief discussion in the comments about toll-free bridging.  In particular, we discussed the difference between calling a method via Objective-C (objc_msgSend) and its equivalent CoreFoundation C call.  Mike suggested adding it to his original suite of tests, which led to the following results.


iPhone 3G

ARM1176 ~412MHz / 2.4ns per cycle

Name                                 | Iterations | Total time (sec) | Time per (ns)
IMP-cached message send              | 100000000  | 3.9              | 38.6
C++ virtual method call              | 100000000  | 5.0              | 49.9
Floating-point division              | 10000000   | 0.8              | 81.3
Float division with int conversion   | 10000000   | 0.8              | 81.4
16 byte memcpy                       | 10000000   | 1.4              | 136.0
Objective-C message send             | 100000000  | 14.9             | 148.6
Integer division                     | 100000000  | 16.2             | 162.2
CF CFArrayGetValueAtIndex            | 10000000   | 2.0              | 201.7
Objective-C objectAtIndex:           | 10000000   | 4.2              | 418.3
NSInvocation message send            | 100000     | 0.2              | 1833.2
16 byte malloc/free                  | 10000000   | 27.3             | 2729.8
NSObject alloc/init/release          | 100000     | 1.4              | 14179.1
NSAutoreleasePool alloc/init/release | 100000     | 1.9              | 18956.7
16MB malloc/free                     | 1000       | 0.0              | 47811.3
Zero-second delayed perform          | 1000       | 0.8              | 803419.3
pthread create/join                  | 100        | 0.1              | 1085830.0
1MB memcpy                           | 100        | 1.0              | 9902796.7

iPhone 3GS (ARMv7 binary)

ARM Cortex A8 ~600MHz / 1.66 ns per cycle

Name                                 | Iterations | Total time (sec) | Time per (ns)
IMP-cached message send              | 100000000  | 1.2              | 11.8
C++ virtual method call              | 100000000  | 4.3              | 42.9
Objective-C message send             | 100000000  | 5.9              | 59.2
CF CFArrayGetValueAtIndex            | 10000000   | 1.0              | 97.9
Integer division                     | 100000000  | 9.8              | 98.4
16 byte memcpy                       | 10000000   | 1.1              | 109.3
Floating-point division              | 10000000   | 1.2              | 118.5
Objective-C objectAtIndex:           | 10000000   | 1.3              | 129.0
Float division with int conversion   | 10000000   | 1.4              | 142.6
16 byte malloc/free                  | 10000000   | 7.5              | 748.6
NSInvocation message send            | 100000     | 0.1              | 806.0
NSObject alloc/init/release          | 100000     | 0.5              | 4793.1
NSAutoreleasePool alloc/init/release | 100000     | 0.5              | 4953.1
16MB malloc/free                     | 1000       | 0.0              | 17969.2
Zero-second delayed perform          | 1000       | 0.2              | 211840.4
pthread create/join                  | 100        | 0.0              | 214742.5
1MB memcpy                           | 100        | 0.3              | 3162774.6

Note that I did reduce the iterations from the original tests, so whilst the total times are significantly less, the iteration times are still a reflection of overall performance.  Compared to Mike's results, these show that the IMP method is indeed faster as expected, but this was only after I changed to a release build.  I also compiled these with Thumb disabled.
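
For reference, the "Time per (ns)" column is simply the total time divided by the iteration count. The following is a minimal C sketch of that style of timing loop, not the actual harness from Mike's suite: the function names are made up for illustration, and it uses the standard clock() call, which reports processor time rather than wall-clock time.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Convert a total run time into the "Time per (ns)" column. */
static double ns_per_iteration(double total_seconds, uint64_t iterations)
{
    assert(iterations > 0);
    return total_seconds * 1e9 / (double)iterations;
}

/* Hypothetical test body: one integer division per iteration.
   The volatile qualifiers stop the compiler from hoisting or
   deleting the divide in a release build. */
static double time_integer_division(uint64_t iterations)
{
    volatile int numerator = 1000000007;
    volatile int divisor = 3;
    volatile int sink = 0;

    clock_t start = clock();
    for (uint64_t i = 0; i < iterations; i++) {
        sink = numerator / divisor;
    }
    clock_t end = clock();

    (void)sink;
    return (double)(end - start) / (double)CLOCKS_PER_SEC;
}
```

Passing the result of time_integer_division(100000000) to ns_per_iteration gives a figure of the same kind as the "Integer division" rows, although the loop overhead is included and the numbers depend heavily on compiler flags and hardware.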

Observations

  • The IMP-cached message send is significantly faster on the newer Cortex CPU.  I have read of improvements in the branch prediction logic, which is particularly important due to the greater penalty of a misprediction in the longer A8 pipeline.  The code for executing the call is
      blx r8
    r8 contains the target address of the function, and remains so for the duration of the test.

  • For the 3GS, the Objective-C message send is very close to the C++ virtual method call.  I ran this test several times, and the behaviour didn't change.  The virtual method call is an indirect load of the pc register
      ldr pc, [r3]
    Without being able to access the PMC registers, I can't be sure of mispredictions; however, I know that 9 instructions are executed every iteration in the C++ test.  At one instruction per cycle, that suggests around 15ns / iteration, but we're at 42.9.  Adding 13 cycles every iteration (21.58ns) for a misprediction would get us to 37ns / iteration, which is much closer.  Stepping into objc_msgSend shows that it finds the cached method on the first pass, totaling 28 instructions per iteration.  Given there are significantly more instructions for the Objective-C call, we're probably seeing the benefits of the dual-issue architecture.

  • Memory performance of the 3GS is significantly higher.  I've done some other micro-benchmarks, showing 2nd gen around 200MB/s and 3rd gen around 800MB/s.  With some very well placed cache-preloads, I've actually pushed the ARM1176 to almost 300MB/s.

  • Fetching an array element via the CoreFoundation API (CFArrayGetValueAtIndex) is about 2x faster than objectAtIndex: on the older device (201.7ns vs 418.3ns); however, the gap is less significant on the newer hardware (97.9ns vs 129.0ns).  We've seen significant improvements to the objc_msgSend performance on the 3GS, which undoubtedly is making up much of the gap.

  • Floating point performance for scalar operations is slower on the newer device (118.5ns vs 81.3ns per division), despite the higher clock speed.
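
The cycle arithmetic in the C++ virtual call observation can be written out as a small C sketch. The helper names are made up for illustration, and the model (one instruction per cycle plus a fixed 13-cycle misprediction penalty at 600MHz) is the same rough assumption used in the text, not a measurement.

```c
#include <assert.h>

/* Time taken by a number of cycles at a given clock rate, in ns. */
static double cycles_to_ns(double cycles, double clock_mhz)
{
    assert(clock_mhz > 0.0);
    return cycles * 1000.0 / clock_mhz;
}

/* Estimated time per iteration, assuming one instruction per cycle
   plus a fixed misprediction penalty (both simplifying assumptions). */
static double estimate_iteration_ns(double instructions,
                                    double mispredict_cycles,
                                    double clock_mhz)
{
    return cycles_to_ns(instructions + mispredict_cycles, clock_mhz);
}
```

With these assumptions, estimate_iteration_ns(9, 0, 600) comes to about 15ns and estimate_iteration_ns(9, 13, 600) to roughly 37ns, which leaves around 6ns of the measured 42.9ns unexplained by this simple model.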

Source code for this test is available here.

C64 for iPhone: Rejected.

After a deafening silence, I can finally talk about my little side project.  I posted my initial efforts about a year ago and after the excitement of TUAW publishing a story, I spent some more time adding a keyboard and improving the performance with dreams of an App Store release.  That is, until reality sank in.  There is an incredible amount of work to turn a concept like this into a polished, user-friendly and legal product, ready for sale.  I attempted to find who owned the Commodore 64 brand, but constantly hit dead ends.  I finally took a break from C64 and played around with new projects, like the SID player.  It has made progress, but I'll leave that for another post.

Not much happened with the emulator for some time; however, everything changed when I received an email from Brian Lyscarz, a Danish entrepreneur who is now a resident of Sweden.  It turns out he is just as passionate as I when it comes to retro, and had personally funded some initial development of a C64 emulator for the iPhone.  Fortunately (for me) this didn't go too far and Brian found me because of the initial press.  Aside from an initial phone call, we have communicated entirely via email, post and Google chat, to achieve what follows.

We concluded (having never met) that the next obvious step was to form a company dedicated to retro gaming, and Manomio LLC was born.

As the months have gone by, we've really settled into what has become a great partnership.  Essentially, I got it working and Brian had the aesthetic eye and the skill with Photoshop to make it pretty:



The next hurdle was licensing.  Fortunately, Brian knew the right people, which led to Manomio securing an official license for the brand from Commodore Gaming and Kiloo Apc.

The final hurdle was Apple and the SDK agreement, section 3.2.2.  We contacted Apple Developer Relations in the United Kingdom and explained our approach.  In principle, we don't allow you to download arbitrary content: we'll secure the licenses and release game packs officially via the App Store.  We agree it's not ideal, but we had to start somewhere.  If Apple loosens the reins, so will we!  They were very excited by what we had built and assured us we'd be okay given we weren't directly competing with the App Store and had locked down the emulator to only installing official titles.  They also mentioned there were already other 'emulators' available, like the SID player and SC68, which actually can download content freely via the net.  This leads us to today: we've been rejected on those very grounds.  We're going to resubmit with no access to BASIC, so it simply plays games, in the hope it will be perceived as just a pack of games.

Hopefully we can be noisy across the digital communication channels and perhaps Apple will change their mind.  We have our first review in too, so go check it out!

Enough talk, here it is in action on a 3G (not my new 3GS):

Enable -Wformat for better compile time help

Let the compiler do all the hard work, and be sure to enable the following warning:

Picture 1.png

It does more than just validate printf/scanf formatting calls, which is helpful in itself. It also validates that a sentinel is present in calls to variadic functions that require one. A sentinel is typically NULL or nil as the last parameter. A common place you would benefit from this is the arrayWithObjects: method of NSArray, which requires a nil as the last value.

NSArray *items = [NSArray arrayWithObjects:@"one", @"two", nil];

If the nil is absent, you'll receive the following:

Picture 3.png
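
The same check can be applied to your own variadic C functions. The sketch below assumes GCC or Clang, both of which support the sentinel function attribute; GCC documents the missing-sentinel warning as being enabled by -Wformat. The count_strings function is a made-up example, not part of any framework.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical helper: counts its string arguments up to the NULL sentinel.
   The attribute asks the compiler to warn when a call does not end in NULL. */
static int count_strings(const char *first, ...) __attribute__((sentinel));

static int count_strings(const char *first, ...)
{
    int count = 0;
    va_list args;

    va_start(args, first);
    for (const char *s = first; s != NULL; s = va_arg(args, const char *)) {
        count++;
    }
    va_end(args);
    return count;
}
```

A call such as count_strings("one", "two", NULL) returns 2, while count_strings("one", "two") should produce a missing-sentinel warning at compile time instead of reading past the end of the argument list at run time.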

Small Zip / Unzip library for iPhone

I have looked at ZipArchive, which is an Objective-C compression framework; however, I ended up using Lite Zip / Unzip from CodeProject, as it is just two .c files. Include one for compressing ZIP files and the other for decompressing them.

Xcode Tip: Generate comments in your assembler output

To make it easier to find the assembly generated when you 'Show Assembly Code', embed comments using:

asm("# your comment")
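
As a rough illustration, the markers can bracket the region you want to find in the listing. This sketch assumes GCC or Clang and uses the __asm__ spelling, which also compiles in strict ISO C modes; sum_to is a made-up function, and the optimizer is free to move or fold code around the markers, so treat them as approximate signposts.

```c
/* Hypothetical function to locate in the assembly listing. */
static int sum_to(int n)
{
    int total = 0;

    __asm__ volatile("# BEGIN sum_to loop");
    for (int i = 1; i <= n; i++) {
        total += i;
    }
    __asm__ volatile("# END sum_to loop");

    return total;
}
```

Searching the generated assembly for "BEGIN sum_to loop" then takes you straight to the code of interest.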

How to debug handleOpenURL

I've seen a number of questions on the Apple iPhone developer forums asking how to debug the UIApplication handleOpenURL message. I finally had a complex scenario that I needed to debug, and came up with the following solution. Note that this example has been tested using the simulator only. Firstly, you'll need to grab the two 'DebugSupport' files from my Google Code repository here. Include them in your project and modify your handleOpenURL method as follows:
-(BOOL)application:(UIApplication *)application handleOpenURL:(NSURL *)url {
 [DebugSupport waitForDebugger];
 ...
 // place breakpoint after the above line
}
The call to [DebugSupport waitForDebugger] shows a UIAlertView, which will wait until you confirm by clicking the OK button. You'll notice that the prompt tells you the process ID. Don't click OK yet. Return to Xcode and from the Run menu, choose Attach To Process | Process ID... Enter the PID given to you by the alert box and Xcode will attach and enable all the breakpoints. Obviously, remove this from production code. Enjoy.
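
The underlying idea is simply to show the process ID and block until the user confirms. The following is a command-line analogue in C for POSIX systems, offered as a sketch of the technique only; it is not the DebugSupport implementation, and wait_for_debugger is a made-up name.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Prints this process's ID to `out`, then blocks until a line arrives
   on `in` (or the stream ends). Returns the PID that was shown. */
static pid_t wait_for_debugger(FILE *in, FILE *out)
{
    pid_t pid = getpid();
    int c;

    fprintf(out, "Attach the debugger to process %ld, then press Enter.\n",
            (long)pid);
    fflush(out);

    /* Discard anything typed before Enter. */
    while ((c = fgetc(in)) != EOF && c != '\n') {
    }
    return pid;
}
```

Calling wait_for_debugger(stdin, stderr) at the top of the code path you want to inspect gives you time to attach by PID before pressing Enter.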

iPhone 3.0 Wish List

I took this from a comment I made on TUAW

Here's what I'd like to see, assuming we'll also get an iPhone 3.0 software update.

Hardware
  • The entire screen should be touch enabled, so the border could be used for interactivity.
  • ARM Cortex A9 (multi-core version would be a dream),
  • PowerVR SGX and
  • minimum of 256MB RAM (up from 128MB)


Software, given the above specs
  • Background applications
    • I'm wondering if this is coming rather than Apple's original 'push' solution, since we haven't seen it yet...

  • Support for cross-application communication / shared data
    • At the very minimum, this should be permitted for apps released by the same developer

  • Major innovations to Springboard, since it is becoming unwieldy on iPhones with many apps installed
    • 'spotlight' style launching would be good
    • It could be arranged in a grid pattern so you could flip left, right, up or down or diagonally. A faint arrow could show you where centre is. Pressing home would take you back to the centre screen.
    • For up to 9 screens you would be no more than 2 flips from any other screen.
    • A gesture (like pinch) could be used to zoom out the Springboard, like Spaces, to see 4 pages, or further out for 6 and then 9 pages. Tap on a page to zoom back in.
    • You should be able to name spaces, so you could see text on each space. e.g. 'Games', 'Financial', etc...
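
The "no more than 2 flips" figure for 9 screens follows from allowing diagonal flips on a 3 x 3 grid: the distance between two screens is the larger of the row and column differences. A small C sketch, with made-up function names, checks this by brute force.

```c
#include <assert.h>
#include <stdlib.h>

/* Flips needed between two screens when one flip can move up, down, left,
   right or diagonally: the larger of the row and column distances. */
static int flips_between(int row_a, int col_a, int row_b, int col_b)
{
    int d_row = abs(row_a - row_b);
    int d_col = abs(col_a - col_b);
    return d_row > d_col ? d_row : d_col;
}

/* Worst-case number of flips between any two screens in a rows x cols grid. */
static int max_flips(int rows, int cols)
{
    int worst = 0;

    assert(rows > 0 && cols > 0);
    for (int a = 0; a < rows * cols; a++) {
        for (int b = 0; b < rows * cols; b++) {
            int flips = flips_between(a / cols, a % cols, b / cols, b % cols);
            if (flips > worst) {
                worst = flips;
            }
        }
    }
    return worst;
}
```

For the 3 x 3 layout, max_flips(3, 3) is 2; without diagonal flips the worst case (corner to opposite corner) would be 4.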

Utility class for loading a UIImage without caching

I've checked in a utility class for UIImage, which adds a new category to load images without caching.  You use it as follows:

UIImage *image = [UIImage imageFromResource:@"my_image.png"];

You should only do this if there is a specific reason you do not want caching.