Branch Prediction in Android applications
In this work we study the hardware branch prediction accuracy of an ARM processor used in mobile devices (specifically the Cortex-A8) running Android.
We simulate the processor using the full-system, cycle-accurate simulation mode of gem5 and run the Moby application benchmark suite to study its micro-architectural characteristics.
We account for context switches made while running these applications and separate the branches into user mode and kernel mode branches.
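The user/kernel split can be sketched as a simple tally over a branch trace. The record layout below (a CPSR value plus a mispredicted flag) is a hypothetical format chosen for illustration, not gem5's actual output, and lumping every privileged ARM mode together as "kernel" is a simplifying assumption. The only hardware fact relied on is that ARM encodes User mode as 0b10000 in the low five CPSR bits.

```python
from collections import defaultdict

ARM_USER_MODE = 0b10000  # CPSR M[4:0] encoding of User mode on ARMv7


def split_by_mode(trace):
    """Tally branches and mispredictions per privilege class.

    `trace` is an iterable of (cpsr, mispredicted) pairs. This layout is
    an assumption made for the sketch, not a gem5 trace format.
    """
    stats = defaultdict(lambda: {"branches": 0, "mispredicted": 0})
    for cpsr, mispredicted in trace:
        # Any mode other than User is counted as kernel (simplification).
        mode = "user" if (cpsr & 0x1F) == ARM_USER_MODE else "kernel"
        stats[mode]["branches"] += 1
        stats[mode]["mispredicted"] += int(mispredicted)
    return dict(stats)


def misprediction_rate(entry):
    """Mispredictions divided by branches; 0.0 for an empty class."""
    if entry["branches"] == 0:
        return 0.0
    return entry["mispredicted"] / entry["branches"]
```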
We also study the number of instructions squashed from the pipeline due to incorrect branch predictions and speculative execution along the wrong path.
The simulated processor uses a 2-level tournament predictor with a 2048-entry local predictor table, an 8192-entry global predictor table, and a 2048-entry branch target buffer (BTB).
OBSERVATIONS: The number of context switches in Android applications is small (between 0.54 and 0.78 million), and they are spaced far apart in the execution timeline of the applications. This leads us to the second result. Because kernel-mode execution is infrequent and widely spaced, the overall branch prediction accuracy is high (between 88% and 93%), but the mispredictions are split disproportionately between user and kernel mode. User mode suffers little from incorrect predictions (1.53% to 6.83%), while kernel mode suffers from both a high misprediction rate and a wide range of rates (48.46% to 78.46%). The instructions squashed from the pipeline after executing along a mispredicted path represent wasted cycles and power, and since power is a first-class design concern for mobile processors, this is worrisome.
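To see how a high overall accuracy can coexist with such a poor kernel-mode rate, the overall misprediction rate can be treated as a weighted average of the two per-mode rates. The sketch below assumes the percentages above are per-mode misprediction rates (mispredictions divided by branches executed in that mode). The input values are illustrative picks from inside the reported ranges, not measurements for any single benchmark, and the function names are invented for this example.

```python
def kernel_branch_share(overall_rate, user_rate, kernel_rate):
    """Fraction f of branches executed in kernel mode, solved from
    overall = (1 - f) * user + f * kernel."""
    if kernel_rate == user_rate:
        raise ValueError("rates must differ to solve for the share")
    return (overall_rate - user_rate) / (kernel_rate - user_rate)


def kernel_misprediction_share(overall_rate, user_rate, kernel_rate):
    """Fraction of all mispredictions that occur in kernel mode."""
    f = kernel_branch_share(overall_rate, user_rate, kernel_rate)
    return f * kernel_rate / overall_rate


# Illustrative values only (assumed, not taken from a specific run).
rates = dict(overall_rate=0.10, user_rate=0.04, kernel_rate=0.64)
print(f"kernel share of branches:       {kernel_branch_share(**rates):.1%}")
print(f"kernel share of mispredictions: {kernel_misprediction_share(**rates):.1%}")
```

With these assumed inputs, roughly a tenth of the branches would account for well over half of the mispredictions, which is the sense in which the split is disproportionate.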
Modern mobile operating systems are much more capable and allow multitasking between applications as well as background application refreshes. This results in an increased number of context switches between user and kernel mode. However, the misprediction rate decreases only linearly as the kernel-mode burst length and the number of context switches increase, which is insufficient for these modern operating systems.
We hope that future hardware designers take these findings into account and use OS-aware branch prediction techniques such as those proposed in "Understanding and improving operating system effects in control flow prediction" by Sivasubramaniam, Li et al. (ASPLOS 2002), available here.
The complete report with results and simulation procedure is available here.
- Computer Architecture
- Branch Prediction
- Moby Benchmark
- Course Project