Code Execution Times: IA-32/IA-64 Instruction Set Architecture
The purpose of this document is to give software developers precise methods for measuring the clock cycles required to execute specific C code in a Linux* environment running on a generic Intel architecture processor. These methods are useful for CPU benchmarking, code optimization, and OS tuning. In all of these cases, the developer wants to know exactly how many clock cycles elapse while the code executes.
At the time of this writing, the best description of how to benchmark code execution can be found in "Using the RDTSC Instruction for Performance Monitoring." Unfortunately, many problems were encountered while using this method. This paper describes those problems and proposes two separate solutions.
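As a rough illustration of the kind of measurement discussed here, the following user-space sketch reads the time-stamp counter before and after a function call and keeps the smallest difference over several runs. It is not one of the two solutions proposed in the paper. It assumes GCC or Clang on an x86-64 Linux system with RDTSCP support, and the names tsc_begin, tsc_end, measure_min_ticks, and sample_work are hypothetical helpers introduced only for this example. The fences are one common way to limit instruction reordering around the counter reads; other serialization choices exist.

```c
#include <stdint.h>
#include <x86intrin.h>  /* __rdtsc, __rdtscp, _mm_lfence (GCC/Clang, x86-64) */

/* Hypothetical helper: read the TSC at the start of the measured region.
 * The fences keep earlier and later instructions from drifting across
 * the counter read. */
static inline uint64_t tsc_begin(void)
{
    _mm_lfence();
    uint64_t t = __rdtsc();
    _mm_lfence();
    return t;
}

/* Hypothetical helper: read the TSC at the end of the measured region.
 * RDTSCP waits for preceding instructions to execute before reading;
 * the trailing fence keeps later instructions from starting early. */
static inline uint64_t tsc_end(void)
{
    unsigned int aux;  /* receives IA32_TSC_AUX; unused here */
    uint64_t t = __rdtscp(&aux);
    _mm_lfence();
    return t;
}

/* Hypothetical helper: call fn() `runs` times and return the smallest
 * observed tick count. Taking the minimum filters out samples inflated
 * by interrupts or preemption, which user-space code cannot disable. */
uint64_t measure_min_ticks(void (*fn)(void), int runs)
{
    uint64_t best = UINT64_MAX;
    for (int i = 0; i < runs; i++) {
        uint64_t start = tsc_begin();
        fn();
        uint64_t stop = tsc_end();
        if (stop - start < best)
            best = stop - start;
    }
    return best;
}

/* Example workload (an assumption for illustration only). */
void sample_work(void)
{
    volatile unsigned sum = 0;
    for (unsigned i = 0; i < 1000; i++)
        sum += i;
}
```

Calling measure_min_ticks(sample_work, 1000) returns the lowest tick count seen across 1000 runs. Note that the returned value includes the overhead of the counter reads themselves, and that on many processors the time-stamp counter advances at a fixed reference rate, so the value approximates core clock cycles only when frequency scaling and turbo mode are disabled.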
All the results shown in this paper were obtained by running tests on a platform whose BIOS was configured to remove every factor that could cause nondeterminism. All power optimization, Intel® Hyper-Threading Technology, frequency scaling, and turbo mode features were turned off.
The OS used was openSUSE* 11.2 (linux-2.6.31.5-0.1).
Read the full Code Execution Times: IA-32/IA-64 Instruction Set Architecture White Paper.