Improving the Performance of CPU Architectures by Reducing the Operating System Overhead (Extended Version)

Open access


Predictable CPU architectures must execute hard real-time tasks in isolation so that their execution remains timing-analyzable. The main problem for real-time operating systems is excessive jitter, introduced mainly by task switching. This jitter can compromise deadline requirements and, consequently, the predictability of hard real-time tasks. New requirements also arise for real-time operating systems used in mixed-criticality systems, where hard real-time applications require timing predictability. This article discusses several solutions for improving the performance of CPU architectures and overcoming the drawbacks of operating system overhead. It focuses on an innovative CPU implementation named nMPRA-MT, designed for small real-time applications. The implementation replicates and remaps the program counter, general-purpose registers and pipeline registers, enabling multiple threads to share a single pipeline. To increase predictability, the proposed architecture partially removes hazard situations at the expense of higher execution latency per instruction.
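The replication and remapping idea mentioned in the abstract can be illustrated with a small software model. The sketch below is only a toy under stated assumptions, not the nMPRA-MT design itself: the names `HardwareContext`, `ReplicatedCore` and `switch_to`, the register-file size, and the `words_copied` counter are all hypothetical. It shows that when every thread owns its own copy of the program counter and registers, a thread switch reduces to changing a selector, with no state saved or restored.

```python
from dataclasses import dataclass, field

NUM_REGS = 32  # assumed register-file size, not taken from the paper


@dataclass
class HardwareContext:
    """One replicated copy of the per-thread state (hypothetical model)."""
    pc: int = 0
    regs: list = field(default_factory=lambda: [0] * NUM_REGS)


class ReplicatedCore:
    """Toy core that keeps one HardwareContext per thread."""

    def __init__(self, num_threads: int):
        self.contexts = [HardwareContext() for _ in range(num_threads)]
        self.active = 0        # index of the thread that owns the pipeline
        self.words_copied = 0  # state words moved by switches (stays 0 here)

    def switch_to(self, thread_id: int) -> None:
        # Remapping: only the selector changes; nothing is saved or restored.
        if not 0 <= thread_id < len(self.contexts):
            raise ValueError("unknown thread id")
        self.active = thread_id

    def write_reg(self, index: int, value: int) -> None:
        self.contexts[self.active].regs[index] = value

    def read_reg(self, index: int) -> int:
        return self.contexts[self.active].regs[index]


core = ReplicatedCore(num_threads=4)
core.write_reg(1, 111)  # thread 0 writes its own r1
core.switch_to(2)
core.write_reg(1, 222)  # thread 2 writes its own r1
core.switch_to(0)       # back to thread 0, whose r1 is untouched
```

A software context switch would instead copy every register to memory and back, which is one source of the switching jitter the abstract refers to; in this model the cost of a switch is independent of the amount of replicated state.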

Electrical, Control and Communication Engineering

The Journal of Riga Technical University
