## Compiler-Directed Energy Optimization on Power-Gated VLIW Architecture

Yung-Cheng Ma and Tzu-Hsuan Liao

Department of Computer Science and Information Engineering, Chang-Gung University, Kwei-Shan, Taoyuan, Taiwan

The coming dark-silicon age requires novel architecture and compiler technologies for energy optimization. With deep-submicron semiconductor manufacturing processes, leakage power becomes the dominant part of a processor's power dissipation. The growth of leakage current also ends Dennard scaling, under which power density stays constant as transistors shrink [1]. Power-gating technologies were proposed to control leakage power dissipation [2]. In the dark-silicon age, a large number of current switches are expected to be deployed on a semiconductor chip. We study processor architecture and compiler design for this setting.

Our approach, named parallelism scaling, adapts processor power by dynamically changing the instruction-level parallelism. The concept is shown in Figure 1. Realizing parallelism scaling relies on architecture redesign and compiler optimization support:

- (1) A novel architecture, named the PGRF-VLIW architecture, featuring distributed and power-gated register files, was proposed to enable power scaling [3];
- (2) An instruction scheduling algorithm, named deadline-constrained clustered scheduling (DCCS), was proposed to utilize local data transfer in the PGRF-VLIW architecture for power saving [3]; and
- (3) A profiling-guided control flow graph partitioning algorithm, which we are currently working on, serves as the global code optimization that drives parallelism scaling.
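The intuition behind parallelism scaling can be illustrated with a toy energy model. The sketch below is not the DCCS algorithm or the authors' evaluation model: the names `Region`, `cycles`, `energy`, and `best_width`, as well as all cost constants, are hypothetical and chosen only for illustration. It assumes that each active issue cluster (with its power-gated register file) leaks a fixed amount per cycle, that a code region cannot issue more operations per cycle than its available instruction-level parallelism, and that the region must finish within a cycle deadline.

```python
from dataclasses import dataclass
from math import ceil


@dataclass(frozen=True)
class Region:
    """A code region in the toy model (hypothetical, not from the paper)."""
    ops: int       # number of operations to execute
    ilp: int       # maximum operations per cycle allowed by dependences
    deadline: int  # cycle budget for the region


def cycles(region: Region, width: int) -> int:
    """Cycles needed when `width` clusters are powered on."""
    return ceil(region.ops / min(width, region.ilp))


def energy(region: Region, width: int, e_op: float = 1.0,
           leak_cluster: float = 0.5, leak_base: float = 1.0) -> float:
    """Dynamic energy per operation plus leakage per cycle.

    Leakage has an always-on part (`leak_base`) and a part that grows
    with the number of powered-on clusters (`leak_cluster * width`).
    All constants are illustrative assumptions.
    """
    c = cycles(region, width)
    return region.ops * e_op + c * (leak_base + leak_cluster * width)


def best_width(region: Region, max_width: int = 4):
    """Lowest-energy width that meets the deadline, or None if none does."""
    feasible = [w for w in range(1, max_width + 1)
                if cycles(region, w) <= region.deadline]
    if not feasible:
        return None
    return min(feasible, key=lambda w: energy(region, w))


if __name__ == "__main__":
    low_ilp = Region(ops=120, ilp=2, deadline=80)
    high_ilp = Region(ops=120, ilp=4, deadline=200)
    print(best_width(low_ilp), best_width(high_ilp))
```

Under these assumptions, a region with little available parallelism is cheapest on a narrow configuration, because extra clusters only add leakage, while a region with ample parallelism is cheapest on the full width, because finishing sooner reduces the always-on leakage. Choosing the width per region is the kind of decision a global, profile-guided optimization would have to make.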

Our evaluation shows energy savings of 30% to 60% with the global code optimization.



Figure 1. Idea of parallelism scaling.

## **References:**

- [1] Hadi Esmaeilzadeh et al., "Dark silicon and the end of multicore scaling," *Proceedings of the International Symposium on Computer Architecture* (ISCA), 2011.
- [2] Youngsoo Shin, Jun Seomun, Kyu-Myung Choi, and Takayasu Sakurai, "Power gating: Circuits, design methodologies, and best practice for standard-cell VLSI designs," *ACM Transactions on Design Automation of Electronic Systems*, Vol. 15, No. 4, Article 28, October 2010.
- [3] Zhibin Liang, Wei Zhang, and Yung-Cheng Ma, "Deadline-Constrained Clustered Scheduling for VLIW Architectures using Power-Gated Register Files," *ACM Transactions on Architecture and Code Optimization*, Vol. 11, No. 2, Article 20, July 2014.