Left unchecked, the fundamental drive to increase peak performance using tens of thousands of power-hungry components will lead to intolerable operating costs and failure rates. High-performance, power-aware distributed computing reduces the power and energy consumption of distributed applications and systems without sacrificing performance. Generally, we use the dynamic voltage scaling (DVS) technology now available in high-performance microprocessors to reduce power consumption during parallel application runs when peak CPU performance is not necessary because of load imbalance, communication delays, and similar causes. We propose distributed, performance-directed DVS scheduling strategies for use in scalable power-aware HPC clusters. By varying scheduling granularity, we can obtain significant energy savings without increasing execution time (36% for FT from the NAS Parallel Benchmarks). We created a software framework to implement and evaluate our various techniques, and we show that performance-directed scheduling consistently saves more energy (nearly 25% for several codes) than comparable approaches, with less impact on execution time (< 5%). Additionally, we illustrate the use of energy-delay products to automatically select distributed DVS schedules that meet users' needs.
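As a rough illustration of schedule selection by energy-delay product, the sketch below picks the candidate that minimizes E × D^w, where E is energy, D is execution time, and w is a user-chosen weight on delay. It is a minimal sketch, not the framework described above: the names `Schedule`, `weighted_ed_product`, and `select_schedule`, the weighted form of the metric, and all numeric values are assumptions made up for the example and are not measurements from this work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schedule:
    """A candidate DVS schedule with normalized measurements (hypothetical)."""
    name: str
    energy: float  # energy relative to running at full speed (1.0 = no savings)
    delay: float   # execution time relative to running at full speed


def weighted_ed_product(schedule: Schedule, delay_weight: float = 1.0) -> float:
    """Return E * D**w; a larger w penalizes slowdown more heavily."""
    return schedule.energy * schedule.delay ** delay_weight


def select_schedule(candidates: list[Schedule], delay_weight: float = 1.0) -> Schedule:
    """Pick the candidate with the smallest weighted energy-delay product."""
    if not candidates:
        raise ValueError("at least one candidate schedule is required")
    return min(candidates, key=lambda s: weighted_ed_product(s, delay_weight))


# Illustrative values only; they are not results from the paper.
candidates = [
    Schedule("full_speed", energy=1.00, delay=1.00),
    Schedule("moderate_dvs", energy=0.75, delay=1.04),
    Schedule("aggressive_dvs", energy=0.62, delay=1.35),
]

if __name__ == "__main__":
    for w in (0, 1, 8):
        best = select_schedule(candidates, delay_weight=w)
        print(f"delay_weight={w}: {best.name}")
```

With these made-up numbers, a weight of 0 considers energy alone and favors the most aggressive schedule, a weight of 1 favors the moderate schedule, and a large weight such as 8 falls back to full speed. This shows how a single tunable metric can encode how much slowdown a user will accept in exchange for energy savings.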