The iPhone architecture, although generally outside the scope of application development, is a fairly simple design in terms of operating system theory. A decent understanding of it is necessary for a detailed overview of the iPhone’s functionality, and this article may help satisfy the curiosity of those interested in how it works (i.e. geeks) who lack an available document describing it.
First off, the following (hastily-made) diagram provides a graphical overview of the entire process of how hardware interaction is provided to and from the iPhone software:
This layered (and color-coded, for the non-colorblind) diagram will serve as a graphical overview of the iPhone’s entire input/output process. Computer engineers should probably start at the bottom, and computer scientists should start at the top.
(You may notice that we have both a hardware layer AND a processor layer, and the reason is this: the hardware layer refers to the actual chips of the device’s peripherals, e.g. the tilt sensors or display, whereas the processor layer deals more with the ARM instruction set at the assembly level. That should clear up any resulting confusion; for more of an overview of the diagram, see the layers section below.)
Example 1: Hardware to Application: Gyros
Now, the easiest way to break this down is via example, so we’ll use the gyros (the thingies that detect when you turn the iPhone on its side, which are, strictly speaking, accelerometers rather than true gyroscopes), starting from the hardware:
- Physical gyros detect a movement change, raise a bit (or modify a register) to indicate the change
- Firmware detects the change, notes that the value indicates that the state of the iPhone is now different (as opposed to an erroneous value), and sends a processor interrupt through the system bus for attention.
- The iPhone’s ARM processor receives the interrupt, and calls an in-memory interrupt service routine (ISR) that the iPhone operating system set up during driver initialization.
- The ISR, a function (subroutine) located within the code for the iPhone OS, acknowledges the interrupt and processes it via the corresponding driver. The OS itself then notifies the currently active application (if any) in the form of a Unix-style signal.
- This signal is handled according to the specifications of the C/Objective-C runtimes, since the iPhone compiler and linker construct the application specifically to let the runtime handle the signal rather than handle it directly. The Objective-C runtime forwards the signal as a framework-specific message to the application, first checking whether the application has been designed to handle said message.
- If the message can be processed via a method (subroutine) within the application, the message is received and processed. In the case of our gyro example, the application will adjust its interface to display in widescreen (landscape) format. Depending on the message type, the framework may also check whether the application indicates an error in processing the message, and may perform additional actions in that case.
This example shows an overview of the process by which a hardware action is processed by software. The same process is used for touchscreen, sound, and camera input. The buttons on the top and bottom of the iPhone are more OS-specific, however: the menu button merely sends a termination signal to the application so it can perform any closing tasks (such as file writing), whereas the topmost button triggers a power management/scheduler routine that suspends the current application and turns off the touchscreen to conserve battery life.
Example 2: Application to Hardware: Display
Now that we’ve provided an example of how the iPhone application receives hardware-caused messages, let’s look at how the application requests an action of the hardware. For this example, we’ll use graphical display:
- The application, now loaded into memory and starting its execution, wishes to display an image for a splash screen. We’ll assume that the process of loading the image from storage is complete, and we now simply have to display the image on the touchscreen hardware for the user to see.
- The application makes an API (framework) call, something along the lines of -(void)setBackgroundImage, passing a reference to the image within the call (which is, at the lowest level, the starting memory address or storage reference of where the image resides, the type of the image, and its length).
- The API/framework receives the function call, and does the dirty work of translating the higher-level image processing interface into a collection of calls to the Objective-C runtime, which will then make the appropriate calls to the C library.
- The (dynamically-linked) C library, having received a series of function calls from the Objective-C runtime and corresponding frameworks and APIs, will further refine the functions into assembly-level system calls (via a software interrupt) which the iPhone OS kernel can process.
- The iPhone OS kernel, having received the system calls in assembly format from the C library, will then call upon the appropriate drivers (the touchscreen display drivers, in this case) to interact with the hardware. The drivers will, at the lowest level, set the registers of the touchscreen display’s chips so that the right colors appear at the appropriate coordinates, producing the image the application requested.
- Now we’ve reached the hardware level, where the physical display screen is changed so as to display the image for the user’s eyes to see.
Whew! Still with us? You may be wondering why there are three different API/framework/library calls needed just to tell the kernel to start mucking with the hardware. This, my friend, is one of the joys of working with an underlying Unix operating system: providing a high-level, developer-friendly interface to an almost 40 year-old operating system architecture (recall that XNU is based upon the Mach kernel, and uses the olde Unix system call convention for kernel requests).
The frameworks/API layer is the set of functions the developer wants to see in order to make his/her application interact with the device and the underlying iPhone OS. The framework, which is part of the overall Objective-C runtime, is really just a series of upper-level function calls (and C extensions, in the case of the Obj-C runtime) providing a more developer-friendly interface to the grimy, obfuscated (and undocumented) C library.
The C library lies underneath the Objective-C runtime, and in fact makes up much of the Obj-C runtime, providing true object-oriented dynamic-typing extensions to the C language (unlike C++, which just looks object-oriented). The Objective-C runtime dynamically links against the C library to “translate” messages requiring lower-level intervention into UNIX system calls, which the iPhone OS can process as needed.
This process can be seen using GDB on an iPhone application, mostly in the default “thread 2” of a given application, with the lower-level calls showing up as dyld_* (dynamic linkage) calls or even processor interrupts in the assembly language itself.
Some immediately learn this and think, “Gee, why don’t developers just use the C library or system calls directly?”, and the answer is this: it’s both unnecessarily hard to do, and largely undocumented. Why would you want to take twice as long to develop an app using the C library when the much easier-to-use Objective-C API and frameworks are staring you in the face? And Apple has its own private functions within the C library that it doesn’t document, which you probably don’t want to use for app development anyway (the compiler/linker generally keeps you from doing so).
The diagram we showed above is meant to provide an overview of the iPhone OS architecture, but just to clear up what each of the layers cover, here is a description of each layer:
- Application: This is the currently-running iPhone application, purchased through the app store (unless jailbroken firmware is used). This application was compiled to native code by the Apple-distributed iPhone compiler, and linked with the Objective-C runtime and C library by the linker. This application also runs entirely within the userspace environment set up by the iPhone OS.
- Frameworks/API: Cocoa Touch, upper-level OpenGL calls, it’s all in here. These API calls are simply headers Apple distributes with the iPhone SDK, with some dynamic linking occurring at runtime. They reside on top of the Objective-C runtime, as many of them are written in Objective-C.
- Objective-C runtime: This layer is comprised of both the Objective-C dynamically-linked runtime libraries and the underlying C libraries. The C library sets up the environment for the Objective-C runtime so thoroughly that I simply included them both within the same layer (although I probably should have called it “runtime libraries” instead).
- iPhone OS: This is the kernel, drivers, and services that comprise the iPhone Operating System. This is sometimes called iPhone OS, iPhone OS X, or just OS X, but it all refers to the same deal: it sits between the userspace and the hardware.
- Processor: Not so much the physical ARM chip (that’s contained within the hardware layer), but instead referring to the ARM instruction set and the interrupt vector table as set up by the iPhone OS during boot and driver initialization.
- Firmware: Although we sometimes refer to the entire OS as “firmware” (especially with respect to jailbreaking), this layer instead refers to the chip-specific code that is either contained in memory on or around the peripheral itself, or within the driver for said peripheral (examples: touchscreen, accelerometer).
- Hardware: Pretty obvious, but refers to the physical chips soldered to the iPhone’s circuitry. If you can feel/see it, it’s under this layer. The actual processor falls under this layer, but the instruction set and in-memory descriptor tables are contained within the “processor” layer.
As a final detail, I want to say something about the iPhone OS scheduler and how it works closely with the iPhone’s power management. I mentioned earlier how the button atop the iPhone/iPod suspends the currently running app in order to enter a low-power state, and this demonstrates both the special power management features built in to the iPhone to conserve battery life, as well as the scheduler’s ability to halt a currently-running program and resume it later.
The scheduler itself is capable of running a full iPhone process (app) in the background, but up until the 3.0 beta this has been restricted to Apple applications such as the Mail application. The scheduler/kernel apparently keeps track of which process ID is currently the “active one” in order to send signals (received as messages via the libraries) to the right process, so that a tilt ends up at Safari rather than at the background-running Mail app, for instance.
Jailbroken iPhone/iPod apps are currently able to run in the background on the current OS version, so there is either an undocumented system call for doing so, or else the jailbreakers are modifying the iPhone scheduler to enable the functionality. I’d lean more towards the undocumented feature, however, since it seems more logical, and jailbreaking gives access to the C runtimes for disassembly and inspection of these undocumented routines.
That said, I’m glad Apple finally allows official apps to run in the background, and allowing this functionality will not only benefit more developers but also leave one less reason (amongst plenty) to jailbreak the iPhone/iPod.
The iPhone/iPod operating system has a very simple architecture, although things get a little murky when it comes to how signals are passed to and from the different libraries and the kernel. This is expected, however, when you consider the Unix-style system call convention we’ve been using since the stone ages (and which has been copied by other systems). A good book for a more detailed overview of this is The UNIX-Haters Handbook, available online.
With that, I hope everyone is “enlightened” as far as how the iPhone works.
Anthony is the Silicon News editor-in-chief. Many dedicated readers know him from his prior blog The Coffee Desk before its sale in early 2010, which was featured everywhere from Yahoo! News to Slashdot.org and countless other news outlets, pulling in millions of unique visitors a month. He has ample experience with software, hardware, and networking, having been employed by numerous organizations ranging from U.S. government agencies and research and development firms to Google. Though his approach is usually technical and dry, he is notorious for his subtle and witty observational humor.