School of Electrical Engineering and Computer Science

Virtualized Software Development and the Multicore Revolution

Full system simulation, or virtualization, provides a binary-compatible instance of the target hardware, but one that operates completely within a virtualized environment running on standard laptop or desktop PCs. Virtual platforms break a project's dependence on physical hardware for effective software design, implementation and testing, fostering truly concurrent hardware and software development in ways that change the nature of the overall product development process. With the multicore revolution currently taking place, virtual platforms are expected to become increasingly important for software developers, as software complexity grows with the concurrency and timing issues that multicore systems introduce.

Since the early 1980s, embedded software developers have been taking advantage of the regular increases in processor performance. Every year or two, new processors shipped that increased system performance without disturbing the software structure. A design using a 500 MHz processor one year could rely on the appearance of a 1 GHz processor the next year. This performance increase was often factored into application and system design: software designers planned and implemented application sets that were too heavy for the currently available processors, counting on a new, faster processor to come out and solve their performance problems before the product got to market. Such a strategy was, and is, necessary to stay competitive. This comfortable state was driven by the steady progress of the semiconductor industry, which kept packing more transistors into smaller packages at higher clock frequencies. Processor designers used this continuous increase in resources to create processors that ran single programs faster and faster. The increase in speed was achieved by raising the clock frequency, by architectural innovations such as more efficient processing pipelines and improved branch prediction, and by adding resources such as on-chip caches and memory controllers. The net result was a steady increase in performance for single-threaded programs, achieved by ever more complex processor designs.

However, in 2004 it became clear that the progress in single-processor performance was starting to slow considerably. Partly because of transistor leakage, supply voltage could no longer be lowered enough from one process generation to the next to allow the frequency to be raised without pushing power consumption to levels at which the resulting heat cannot be dissipated. Clock frequencies will thus increase only slowly in the future. Instead, the semiconductor industry is turning to parallelism to increase performance.

Using multiple processor cores on the same chip, the performance per chip can increase dramatically even if the performance per core is improving only slowly. However, this comes with a catch: the increase in "performance" is now measured by aggregate throughput rather than by processing speed on a single thread. The peak performance numbers assume that an application can keep all the processor cores busy. This is straightforward for large server-side applications such as web servers and databases, which handle hundreds of concurrent requests and are naturally parallel, so they easily take advantage of multiple processors. The same is true neither for desktop applications nor for most embedded applications. In practice, applications that cannot take advantage of the parallel machine often run slower than on a single-core chip.
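
The gap between peak throughput and what a partly serial application actually sees can be made concrete with a small calculation. The sketch below uses Amdahl's law, which the text above does not name but which is the standard model for this effect. The function names and the 0.8 per-core speed are illustrative assumptions, not figures from any particular chip.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup over one core when only part of the work can run in parallel.

    Amdahl's law: speedup = 1 / ((1 - p) + p / n), where p is the fraction
    of the work that can be parallelized and n is the number of cores.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def relative_performance(parallel_fraction: float, cores: int,
                         per_core_speed: float) -> float:
    """Performance relative to a reference single-core chip of speed 1.0.

    per_core_speed is a hypothetical scaling factor: a value below 1.0
    models a multicore chip whose individual cores are slower than the
    reference single-core chip.
    """
    return per_core_speed * amdahl_speedup(parallel_fraction, cores)


if __name__ == "__main__":
    # Assumed scenario: four cores, each running at 80% of the speed
    # of the single-core chip being replaced.
    for p in (0.0, 0.5, 0.95):
        perf = relative_performance(p, cores=4, per_core_speed=0.8)
        print(f"parallel fraction {p:.2f}: {perf:.2f}x the single-core chip")
```

Under these assumptions, a fully serial application (parallel fraction 0) ends up slower than on the single-core chip, while an application that is almost entirely parallel gets most of the benefit of the extra cores.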

The reliance on parallelism in the hardware to improve overall performance creates a problem for software developers. Applications that have traditionally used single processors will now have to be parallelized over multiple processors in order to take advantage of the multiple cores and so continue to increase their performance. This is a huge change in the embedded software landscape, and one that will put great pressure on software designers to wring good performance out of the new architectures.

Every high-performance processor family and instruction set is moving to multicore designs. Freescale has announced the PowerPC 8641D, with two G4-class cores on one chip. IBM has the PowerPC 970MP, with two G5 cores. Intel is rolling out multicore chips across its product line, and AMD has dual-core server and desktop processors. PMC-Sierra is selling dual-core RM9200 MIPS64 processors. Cavium has announced up to 16-way parallel MIPS64-based processors. ARM has begun selling its ARM11 MPCore, with four cores sold as a single package. The multicore revolution is here!

Copyright © Page maintainer: Alexander Baltatzis <alba@nada.kth.se>
Updated 2007-11-28