Low Power Techniques in multicore systems based in ARM architecture (Part III)

OK, so I promised to show you the tests I ran on the Pandaboard to try out the CPU Hotplug technique, right? Here they are. I have talked about the Pandaboard development board before, and if you hadn’t heard of it, now is a good time. The Pandaboard ships with an OMAP4430 (OMAP4) chip: a dual-core ARM Cortex-A9 processor at 1 GHz, 1 GB of low-power RAM, plus Bluetooth 2.1, Ethernet, wireless and HDMI. As you can see, that is great connectivity and a great spec for such a small device.

Pandaboard

For the tests I set up two scenarios: one playing a video file from the SD card, and the more interesting one running an Apache web server on the Pandaboard under several JMeter loads to see how it performs. The OS was Ubuntu 11.04 with kernel 2.6.38-1208-omap4. To measure power consumption I had to use a multimeter (with a USB connection, what a great idea!!), because at the time it seemed impossible to read voltage or power values directly from the board.

Test Environment with PandaBoard and Digital Multimeter
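
If the multimeter software can export its readings, figures such as average and maximum power can be pulled out of the log with a few lines of shell. The sketch below assumes a plain text log with one power sample in watts per line. That format and the name power_stats are assumptions for illustration, not what the multimeter tool actually produces.

```shell
#!/bin/sh
# Summarise power samples read from stdin (one value in watts per line).
# The one-value-per-line format is an assumption; adapt the awk field
# to whatever your multimeter software really exports.
power_stats() {
    awk '
        NF == 0 { next }                      # skip blank lines
        { sum += $1; n++ }                    # accumulate for the mean
        n == 1 || $1 > max { max = $1 }       # track the peak sample
        END {
            if (n == 0) { print "no samples"; exit 1 }
            printf "mean=%.2f max=%.2f n=%d\n", sum / n, max, n
        }'
}
```

It would be used as `power_stats < samples.txt`, where samples.txt is a hypothetical export file.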

Here you can see the results of playing the same video file with 2 cores and with only 1 core, after disabling the second core with CPU Hotplug. Power consumption is lower with 1 core, but not by enough to prefer 1 core over 2, especially for video playback.

Power consumption results for video playing test with 1 and 2 cores

Maximum Power consumption during video playing

The next tests are more interesting. They were run with the Apache web server and a JMeter load test bench. The next images show the power consumption of the Pandaboard using 1 or 2 cores with different thread counts: 5, 10, 25 and 50. These are the threads, or connections, opened against the Apache web server. The application used for the tests, GLPI (www.glpi.org), also needed a MySQL server on the Pandaboard, but since its configuration was the same for all the tests, the part that matters is how Apache behaves under load.

JMeter test with 5 threads / connections using 1 and 2 cores

JMeter test with 10 threads / connections using 1 and 2 cores

JMeter test with 25 threads / connections using 1 and 2 cores

JMeter test with 50 threads / connections using 1 and 2 cores

As the graphs show, with only 1 core the Pandaboard’s power consumption was close to half of what it was with 2 cores, although with a higher number of threads the delay of the HTTP requests was also higher. After this test I was left with a question: is it better to do the job with less power in more time, or faster with more power?

While thinking about this and about how CPU Hotplug behaves, I realized I could use it dynamically: instead of processing all the jobs with either 1 or 2 cores, the second core of the ARM Cortex-A9 processor could be enabled or disabled according to the system load. With this “dynamic” core management, the second core is enabled when the load is too high for a single core, and even though power consumption rises, the results could be interesting. The following tests show this behaviour: the yellow line represents Dynamic Core Management, compared with the same system load on 1 and 2 cores.
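
As a rough idea of what load-driven core management can look like, here is a minimal shell sketch. It is not the script behind the charts: the function names, the thresholds (0.90 and 0.40), the 5-second polling interval and the choice of the 1-minute load average from /proc/loadavg are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch of a load-driven hotplug policy for a dual-core board.
# Thresholds, interval and names are illustrative assumptions.

CPU1_ONLINE=/sys/devices/system/cpu/cpu1/online

# decide_cpu1 LOAD CURRENT UP DOWN
# Prints the desired state of cpu1 (1 = online, 0 = offline).
# Two thresholds give hysteresis, so the core does not flap on and off
# when the load hovers around a single limit.
decide_cpu1() {
    awk -v load="$1" -v cur="$2" -v up="$3" -v down="$4" 'BEGIN {
        if (load + 0 > up + 0)        print 1
        else if (load + 0 < down + 0) print 0
        else                          print cur
    }'
}

# run_policy [UP] [DOWN] [INTERVAL_SECONDS]
# Polls the 1-minute load average and toggles cpu1. Needs root.
run_policy() {
    up=${1:-0.90}
    down=${2:-0.40}
    interval=${3:-5}
    while :; do
        read -r load rest < /proc/loadavg
        cur=$(cat "$CPU1_ONLINE")
        want=$(decide_cpu1 "$load" "$cur" "$up" "$down")
        if [ "$want" != "$cur" ]; then
            echo "$want" > "$CPU1_ONLINE"
        fi
        sleep "$interval"
    done
}
```

Keeping the decision in its own function makes it easy to try different thresholds without touching the board. Since every transition has a cost, a wider gap between the two thresholds means fewer transitions.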

JMeter test with 5 threads / connections using 1 and 2 cores and Dynamic Core Management

JMeter test with 10 threads / connections using 1 and 2 cores and Dynamic Core Management

JMeter test with 25 threads / connections using 1 and 2 cores and Dynamic Core Management

JMeter test with 50 threads / connections using 1 and 2 cores and Dynamic Core Management

As you can see, under a medium system load the power consumption with our Dynamic Core Management script follows the 1-core run at lower loads and the 2-core run once the system load gets higher. The benefit could come at medium or more varied system loads, where power consumption can be lower than with 2 cores while performance is better than with only 1 core. For maximum power consumption, you can see how it goes in the four scenarios tested. Interestingly, in the test with the highest number of threads, our script enabling the second core draws more power than the same case with 2 cores.

Maximum Power consumption for 5 threads

Maximum Power consumption for 10 threads

Maximum Power consumption for 25 threads

Maximum Power consumption for 50 threads

This last result, where the script that enables the second core draws more power than simply running on 2 cores, made me curious, so I decided to measure only the periods when CPU Hotplug enables (CPU up) and disables (CPU down) the second core. These tests confirmed what the studies had shown: CPU Hotplug is not free in terms of power consumption. Here you can see how power consumption behaves while the second core is brought up and taken down.

Maximum Power activating the second core with CPU Hotplug

Maximum Power disabling the second core with CPU Hotplug
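
Whether that overhead is worth paying is a question of energy rather than instantaneous power, since energy is power multiplied by time (E = P × t). The sketch below works through that arithmetic. The function names are made up, and the break-even formula assumes a fixed energy cost per down + up cycle and a constant power difference between the 2-core and 1-core states, which is a simplification.

```shell
#!/bin/sh
# energy_joules POWER_W SECONDS
# Energy in joules for a constant power draw: E = P * t.
energy_joules() {
    awk -v p="$1" -v t="$2" 'BEGIN { printf "%.2f\n", p * t }'
}

# break_even_seconds P_TWO_W P_ONE_W E_TRANSITION_J
# Minimum time cpu1 must stay offline so that the saving
# (P_two - P_one) * t repays the energy of one down + up cycle.
# Prints "never" when one core does not draw less than two.
break_even_seconds() {
    awk -v p2="$1" -v p1="$2" -v e="$3" 'BEGIN {
        if (p2 + 0 <= p1 + 0) { print "never"; exit }
        printf "%.2f\n", e / (p2 - p1)
    }'
}
```

With made-up numbers, `energy_joules 2.5 10` gives 25.00 J for a slow 1-core run and `energy_joules 4 6` gives 24.00 J for a faster 2-core run, so the run that draws more power can still use slightly less energy. These figures are purely illustrative and are not measurements from the Pandaboard.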

After all these tests and the research on CPU Hotplug, my conclusion is that it could be an interesting technique for embedded devices where idle time and battery saving really matter (are you thinking of mobile phones? Yes, they meet those conditions).

For other scenarios, though, this technique is not the best choice, as we have seen. There are several studies on a “low power scheduler” for ARM multicore systems and on similar techniques that avoid the power overhead CPU Hotplug incurs when enabling and disabling the second core. Examples are sched_mc and CPU Set, developed by the Linaro Power Management Working Group.

The most interesting thing is not only the growth of ARM-based systems, and not just in embedded devices, but also how the performance of ARM processors keeps increasing while still taking power management into account, which is essential for embedded devices.

Low Power Techniques in multicore systems based in ARM architecture (Part II)

Hello my friends! 🙂

In the previous post I briefly introduced the idea of low power techniques in multicore systems based on the ARM architecture. But what are those techniques? We can divide them into two groups: techniques from general Linux power management (already working on current embedded devices), and techniques developed and implemented for recent multicore embedded devices.

The most relevant Linux power management techniques are:

  • Suspend and Resume
  • Runtime Power Management
  • CPU Idle
  • Dynamic Frequency and Voltage Scaling
  • Power Management QoS

Most of them work really well on classical embedded devices with a single processor or core, but what happens to those techniques on multicore embedded devices like the Pandaboard and its OMAP4 chip?

The OMAP4 chip is built around two ARM Cortex-A9 cores, so it can balance the use of both CPUs to achieve the best ratio of performance to power saving.

As you can imagine, power management with two cores in an embedded device is very different from the usual power management on laptops or PCs: here your product usually has one big constraint, battery life.

Some months ago there was a lot of development activity around different power management techniques for multicore ARM systems. Most of them were developed and supported by the Linaro Power Management Working Group, which is doing a great job of simplifying one of the most common problems across all kinds of embedded devices.

The most interesting power management techniques developed for multicore systems are:

  • CPUIdle
  • CPUFreq
  • CPUHotPlug
  • Thermal
  • CPUSet
  • MultiScheduling

I based my studies on CPU Hotplug, a cool technique that lets you switch off one of the processor’s cores with a simple echo command from the terminal. Writing 0 takes the core offline and writing 1 brings it back online (root privileges are required):

$ sudo sh -c 'echo 0 > /sys/devices/system/cpu/cpu1/online'
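
For repeated experiments it can help to wrap the sysfs write in a couple of small functions. The sketch below is only an illustration: set_cpu, cpu_state and the CPU_SYSFS override are hypothetical names. The override exists so the functions can be tried against a scratch directory instead of the real /sys.

```shell
#!/bin/sh
# CPU_SYSFS can be overridden for dry runs; the default is the real sysfs path.
CPU_SYSFS=${CPU_SYSFS:-/sys/devices/system/cpu}

# set_cpu N STATE
# Writes 0 (offline) or 1 (online) to cpuN's online file.
# Needs root when pointed at the real sysfs.
set_cpu() {
    case "$2" in
        0|1) ;;
        *) echo "state must be 0 or 1" >&2; return 2 ;;
    esac
    f="$CPU_SYSFS/cpu$1/online"
    if [ ! -e "$f" ]; then
        echo "cpu$1 has no online file (not hotpluggable?)" >&2
        return 1
    fi
    echo "$2" > "$f"
}

# cpu_state N
# Prints 1 if cpuN is online, 0 if it is offline.
cpu_state() {
    cat "$CPU_SYSFS/cpu$1/online"
}
```

On the board itself, `set_cpu 1 0` run as root takes the second core offline and `set_cpu 1 1` brings it back. `cat /sys/devices/system/cpu/online` lists the CPUs that are currently online.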

The main advantages of this technique are the lower consumption obtained by using only one of the cores, and easier conditions for entering the processor’s low power modes. Of course there are some performance costs, but we are looking for good low power environments and techniques, so we will look at the performance effects later on.

In the next post we’ll see the testing environment and several tests done on the Pandaboard using CPU Hotplug, so stay tuned 😉

MB