Software-based CPU power consumption using PowerApi
September 03, 2020 · Maxime Thoonsen · 6 min read
The ICT sector's carbon footprint now represents around 4% of global CO2 emissions. To reduce it, one of our goals at Theodo is to design sustainable web applications with a minimal carbon footprint. To know how well we are doing, we need to be able to evaluate the power consumption of our applications, and more specifically the power consumption of the VMs we use in the cloud. Unfortunately, no cloud provider currently provides such data, so we had to look for a solution.
Our research led us to PowerApi, an open-source project developed by the Spirals research group from University of Lille 1 and Inria in France. It lets people estimate the CPU power consumption induced by an application without any external device such as a wattmeter. As we can't plug wattmeters into VMs in the cloud to measure their consumption, this looks like a good option.
The magic formula
According to the documentation, PowerApi uses a sensor to collect raw data from the server's hardware. It then applies a “formula” and voilà: the power consumption of the monitored software is stored in a database.
After reading this, I asked myself two questions: what is this “raw data”, and how did they build this magic formula? Fortunately, their research papers are freely accessible and explain everything. So let’s dig a bit deeper.
A Power Model is used to approximate the power consumption
The formula is what the academic field calls a Power Model: a model that approximates the power consumption of a piece of hardware based on different factors.
I knew that each CPU has a TDP listed on its specification page: “Thermal Design Power (TDP) represents the average power, in watts, the processor dissipates when operating at Base Frequency with all cores active under an Intel-defined, high-complexity workload”
So when I pictured the formula, I thought it was going to be quite simple and something like:
Consumption = Idle power consumption + %CPU load * (TDP - Idle power consumption)
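The naive formula above can be sketched in a few lines of Python. This is only an illustration of my first guess, not PowerApi's model; the function name and the wattage values are hypothetical.

```python
def naive_power_watts(cpu_load: float, idle_watts: float, tdp_watts: float) -> float:
    """Linear interpolation between idle power and TDP, driven by CPU load (0.0 to 1.0)."""
    if not 0.0 <= cpu_load <= 1.0:
        raise ValueError("cpu_load must be between 0.0 and 1.0")
    return idle_watts + cpu_load * (tdp_watts - idle_watts)


# Hypothetical values: 30 W idle, 130 W TDP, 50% load.
print(naive_power_watts(0.5, idle_watts=30.0, tdp_watts=130.0))  # 80.0
```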
Their research paper shows that it’s more complicated: “the CPU load does not accurately reflect the diversity of the CPU activities. In particular, to faithfully capture the power model of a CPU, the types of tasks that are executed by the CPU have to be clearly identified.”
Power consumption is related to the CPU activity which is not exactly the CPU load
The types of tasks a CPU executes are also known as events. Each processor type has hundreds of different events, each with a potentially different power consumption. To get an accurate idea of what the CPU is doing, and thus of the related power consumption, the PowerApi team decided to rely on those events and count them.
To do that, they rely on specific counters of the CPU, the HPCs (hardware performance counters): “We therefore decide to base our power models on hardware performance counters (HPCs) to collect raw, yet accurate, metrics reflecting the types of operations that are truly executed by the CPU.”
Learning phase: creating the Power Model from the regression analysis of real power consumption
The goal of this phase is to approximate the power consumed per HPC event during a workload. This is made possible by measuring the real power consumption of the server and counting HPC events at the same time, both in real time.
Step 1: select an unbiased set of workloads
To ensure that the data is not biased toward one kind of CPU activity, the team used an academic set of varied workloads called PARSEC.
“To explore the diversity of activities of a CPU, we consider a set of representative benchmark applications covering the features provided by a CPU. In particular, to promote the reproducibility of our results, we favor freely available and widely used benchmark suites, such as PARSEC.”
Step 2: get the correlation between real power consumption and HPC events
The team measured the real power consumption using a Bluetooth power meter, the Power Spy 2.
“Real time” in the context of this experiment means that every 250 ms:
- The Power Spy 2 sends the power consumption of the last 250 ms to a collector
- The number of HPC events of the last 250 ms is sent to a collector
By doing so we obtain a lot of data linking power consumption and HPC events.
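To make the data collection more concrete, here is a minimal sketch of how the two streams could be paired by 250 ms window. The `Sample` type, the `join_samples` function and the timestamps are hypothetical and only illustrate the idea; this is not how PowerApi's collector is implemented.

```python
from dataclasses import dataclass

SAMPLE_PERIOD_MS = 250  # sampling period described in the text


@dataclass
class Sample:
    window: int          # index of the 250 ms window
    power_watts: float   # reading from the power meter
    hpc_counts: dict     # event name -> count during the window


def join_samples(power_readings, hpc_readings):
    """Pair power and HPC readings that fall into the same 250 ms window.

    Both inputs are lists of (timestamp_ms, value) tuples.
    Windows that lack either reading are dropped.
    """
    power_by_window = {ts // SAMPLE_PERIOD_MS: watts for ts, watts in power_readings}
    samples = []
    for ts, counts in hpc_readings:
        window = ts // SAMPLE_PERIOD_MS
        if window in power_by_window:
            samples.append(Sample(window, power_by_window[window], counts))
    return samples


# Hypothetical readings: the last HPC reading has no matching power reading.
power = [(0, 40.0), (250, 55.0), (500, 60.0)]
hpc = [(10, {"e1": 100}), (260, {"e1": 900}), (760, {"e1": 50})]
for sample in join_samples(power, hpc):
    print(sample)
```

Each resulting sample is one observation (power, event counts) that the next steps can learn from.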
Step 3: select the HPC events that are the most correlated to power consumption
The two final steps are mathematical ones. Step 3 is about removing the HPC events that are only weakly correlated with power consumption, in order to simplify step 4.
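As an illustration of this filtering step, the sketch below keeps only the events whose counts correlate strongly with the measured power. It assumes the Pearson correlation coefficient and an arbitrary threshold of 0.5; the exact metric and threshold used by the research team may differ, and the event names and values are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no information
    return cov / sqrt(var_x * var_y)


def select_events(event_counts, power, threshold=0.5):
    """Keep the events whose |correlation| with power reaches the threshold."""
    return [name for name, counts in event_counts.items()
            if abs(pearson(counts, power)) >= threshold]


# Hypothetical measurements over four 250 ms windows.
power = [40.0, 50.0, 60.0, 70.0]
events = {
    "e1": [100, 200, 300, 400],  # grows with power
    "e2": [5, 1, 5, 1],          # weakly related to power
    "e3": [7, 7, 7, 7],          # constant, so uninformative
}
print(select_events(events, power))  # ['e1']
```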
Step 4: create the Power Model using ridge regression
At this step, the team uses a more advanced regression method, ridge regression, to get the 2 or 3 HPC events that will be able to predict the power consumption of the software on the machine. This is what we call the Power Model.
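For readers who have never met ridge regression, here is a minimal pure-Python sketch. It fits power ≈ intercept + Σ w_j · e_j with an L2 penalty `lam` on the weights, by solving the regularised normal equations. It assumes the candidate events have already been chosen, leaves the intercept unpenalised, and uses synthetic data; the team's actual tooling, penalty value and data are not described here, so treat every name and number as hypothetical.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ridge_fit(rows, power, lam=1.0):
    """Return (intercept, weights) minimising squared error + lam * ||weights||^2."""
    n, k = len(rows), len(rows[0])
    mean_x = [sum(row[j] for row in rows) / n for j in range(k)]
    mean_y = sum(power) / n
    # Centre the data so that the intercept is not penalised.
    xc = [[row[j] - mean_x[j] for j in range(k)] for row in rows]
    yc = [p - mean_y for p in power]
    # Regularised normal equations: (Xc^T Xc + lam * I) w = Xc^T yc
    gram = [[sum(r[i] * r[j] for r in xc) + (lam if i == j else 0.0)
             for j in range(k)] for i in range(k)]
    rhs = [sum(r[i] * y for r, y in zip(xc, yc)) for i in range(k)]
    weights = solve(gram, rhs)
    intercept = mean_y - sum(w * m for w, m in zip(weights, mean_x))
    return intercept, weights


# Synthetic data generated from power = 30 + 2 * e1 + 0.5 * e2.
rows = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 8]]
power = [33.0, 34.5, 38.5, 39.5, 44.0]
intercept, weights = ridge_fit(rows, power, lam=0.1)
print(intercept, weights)
```

In this sketch the intercept plays the role of the idle power and the weights are the per-event contributions. Increasing `lam` shrinks the weights, which helps when event counts are strongly correlated with each other.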
The final Power Model includes memory, disks, motherboard and fans
As the Power Spy 2 measures the power consumption of the whole server, all the other hardware components (memory, disks, motherboard, fans, etc.) are integrated into the power model. This works well because the CPU is the component that consumes the most energy.
The final power model is P = Pidle + Pactive, where Pidle corresponds to the power consumption of the CPU when no program is running and Pactive to the additional power consumption from a workload.
For example, one power model for the Intel Xeon W3520 is
where e1 and e2 are two HPC events.
To get the real-time power consumption of this server, PowerApi then just has to count those events.
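Once the model is learnt, estimating power is just an evaluation of P = Pidle + Pactive. The sketch below assumes that Pactive is a weighted sum of event counts, which is what a linear regression produces. The coefficients, counts and idle power are hypothetical placeholders, not the values of the Xeon W3520 model.

```python
def estimate_power_watts(idle_watts, coefficients, event_counts):
    """P = P_idle + P_active, with P_active a weighted sum of HPC event counts."""
    active = sum(weight * event_counts.get(name, 0)
                 for name, weight in coefficients.items())
    return idle_watts + active


# Hypothetical model: counts are in millions of events per sampling window,
# coefficients are in watts per million events.
coefficients = {"e1": 0.5, "e2": 0.25}
counts = {"e1": 40, "e2": 8}
print(estimate_power_watts(90.0, coefficients, counts))  # 112.0
```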
Real life application
As Cyrielle has explained in this blog post, we were able to measure the power consumption on our local machine. But we couldn't use it to calculate power consumption on AWS servers, which was our original goal. Indeed, the research team can't run the learning phase on AWS servers (yet?) as they don't have access to them, so PowerApi can't get the data needed for its Power Model.
Thanks to this project and everything we read while trying it out, we learnt a lot:
- The CPU is the component in a server that consumes the most (~100 Watts). The memory consumes around 10 Watts for 32GB and an SSD consumes around 2 Watts.
- CPUs, when idle, still consume around 1/3 of their peak consumption.
- The next generation of Intel's CPUs may see a 91% reduction in power consumption.
- ARM processors have a very low energy consumption compared to x86 processors (~2 Watts vs ~100 Watts).
- AWS provides ARM-based EC2 instances, if you are looking for an efficient way to reduce the carbon impact of your web application.
The next idea that looks promising for getting a rough estimate of the power consumption of our applications is the "Cloud Jewels" approach by Etsy, adapted for AWS.