Friday, April 19, 2013

F19 Power Management Test Day Early Report

Thanks to all who attended the event (in person or online). About 40 visitors came to our "test room" in the Brno office during the day, and it was really great to meet everyone there. Demand exceeded what the room could handle :) and we are sorry for the limited number of seats. Most of the visitors helped us with the testing and went through the prepared test cases. As a thank-you, they got a small gift from us.

In the attached picture you can see testers in action :). Notice the graph on the TV in the background: everybody who connected their machine to our measurement equipment could see its power consumption in real time on that TV.

As proof that Fedora is ARM friendly, we had a Cubieboard there, running Fedora 19 of course :) Later in the day we used it for the real-time power consumption graphing, replacing the dedicated PC and saving a lot of energy :).

A few visitors were interested in Fedora 19 in general. That wasn't a problem, because we had spare laptops there with a Fedora 19 pre-release installed, so they could freely try it and ask questions. We also ran the test day online and provided guidance on the #fedora-test-day Freenode IRC channel. We did our best, but sometimes it was really hard to handle it all, so we are very sorry if our IRC responses were slower than usual.

Many of the testers have already submitted their results. Such feedback is very valuable to us. If you missed the event, you can still participate online: just follow the instructions on the test day wiki http://fedoraproject.org/wiki/Test_Day:2013-04-17_Power_Management. And of course don't forget to submit your results :). We plan to release detailed stats later (they will be calculated from the data submitted until 2013-04-28).

Tuesday, April 16, 2013

F19 Power Management Test Day

The Fedora 19 power management test day starts this Wednesday (2013-04-17). The event focuses mainly on laptops, but desktop machines can be tested too. Three of the prepared test cases are suitable for secondary architectures, so if you are running a Fedora 19 pre-release on, for example, your ARM box, please join us and share your numbers. There are also special test cases targeting aggressive power savings on Intel, Nvidia and ATI / AMD (Radeon) graphics cards.

This is also an on-site event, running alongside the Red Hat Open House. If you are near Brno, feel free to join us in person. You can bring your hardware and test it there. A calibrated digital power meter (Chroma 66202) will be available, so you will be able to measure the power consumption of your hardware. Live USB/CD media will be available, so you can take part in the test day without affecting your production machine.

A new web application has been prepared for submitting your results, but if you prefer the wiki, you can still submit them there as usual. Everybody is welcome, and your attendance will help us make Fedora better. Going through all the test cases takes less than an hour, but you don't need to finish them all: just pick the test cases you are interested in, as partial results are also valuable to us. Visit the PM test day page at http://fedoraproject.org/wiki/Test_Day:2013-04-17_Power_Management and follow the instructions there, or join us in person.

Friday, January 25, 2013

F17/F18 simple power consumption comparison on Lenovo T520


Fedora 18 is out and I had a spare Lenovo ThinkPad T520, so I ran a simple check of how it compares with Fedora 17 in terms of power consumption. The check was very simple: only 4 tests were run on default installations of Fedora 17 and 18:

  1. Active idle test - crond was disabled and the machine was left idle for 30 min to stabilize; then the energy consumption was measured for 3 x 20 min. The average power consumption was calculated from these 3 results.
  2. Archive (tar.bz2) unpack - the test archive was unpacked 3 times on the ext4 root partition. The VM cache was dropped between runs. The energy consumption was measured and the average power consumption was calculated.
  3. Kernel rebuild - the kernel-3.7.4-104 SRPM was rebuilt in mock (one time only). The energy consumption was measured and the average power consumption was calculated. The mock setup and the installation of build dependencies were not counted in the measurement.
  4. PowerTOP statistics (number of wakeups) - PowerTOP was run for 3 x 20 seconds and the average number of wakeups was calculated.
The measurements were taken with a Chroma 66202 wattmeter, which complies with the ENERGY STAR measurement requirements. The energy drawn from the AC outlet was measured. This means that losses in the cabling and the AC/DC adapter are included in the results, so if you run on battery the real power consumption will be slightly lower. The test machine was a Lenovo ThinkPad T520 with an Intel Core i7-2640M CPU @ 2.80GHz and 8GB RAM. The LCD backlight and wireless were turned off during the test.
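
The step from measured energy to average power is a single division. A minimal sketch, assuming the wattmeter reports accumulated energy in Wh for a window of known length; the function name avg_power_w and the sample reading are made up for illustration and are not values from the test:

```shell
# avg_power_w ENERGY_WH DURATION_S
# Average power in watts from energy in watt-hours and duration in seconds.
avg_power_w() {
    awk -v e="$1" -v t="$2" 'BEGIN { printf "%.3f\n", e * 3600 / t }'
}

# Hypothetical reading: 3.4 Wh accumulated over one 20-minute window.
avg_power_w 3.4 1200
```

For the active idle test, the same calculation would be repeated for each of the three 20-minute windows and the results averaged.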

Table 1: Results
Test                      Fedora 17   Fedora 18
Active idle Pavg [W]      10.221      8.748
Archive unpack Pavg [W]   31.346      31.341
Kernel rebuild Pavg [W]   34.907      34.588
Wakeups count [1]         39.3        56.7

Table 1 shows that the power consumption under load is nearly the same (the differences are probably within the measurement error). In active idle, the power consumption of Fedora 18 was lower (8.748 W versus 10.221 W). Unfortunately I haven't had time to investigate the reason, but I am going to run more tests on different HW to verify that it wasn't an anomaly. The wakeup counts are comparable. Different PowerTOP versions were used for the measurement, which may be the source of the difference. The good news is that no power consumption regression was observed in Fedora 18 (on the T520).
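
To put the Table 1 differences into proportion, the relative change from Fedora 17 to Fedora 18 can be computed per test. This is only a sketch using awk; the helper name percent_change is made up here, and the inputs are the Pavg values from Table 1:

```shell
# percent_change OLD NEW
# Relative change from OLD to NEW in percent (negative means NEW is lower).
percent_change() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", (b - a) / a * 100 }'
}

percent_change 10.221 8.748    # active idle
percent_change 31.346 31.341   # archive unpack
percent_change 34.907 34.588   # kernel rebuild
```

With these inputs the idle figure drops by roughly 14 %, while both load tests differ by less than 1 %.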

Tuesday, September 25, 2012

Tip: control your external monitor from your desktop with ddccontrol

Recently I packaged ddccontrol for Fedora. It is a nifty tool that lets you change the settings of your external monitor (brightness, backlight, RGB, ...) from your desktop or the CLI. It uses the DDC/CI protocol, so this functionality needs to be supported by both your monitor and your graphics card. Personally, I tested it with Intel video cards (i915) and recent Nvidia video cards (with the binary driver) and it worked correctly. It should also work on others. There is a list of supported HW in the upstream documentation, but the list isn't exhaustive, so give it a try even if your HW is not listed there. If it doesn't work, don't forget to try all the other video connectors (on both the monitor and the video card), because this functionality is sometimes supported only on a subset of the installed connectors. You will also need the i2c_dev kernel module, because DDC/CI runs over I2C.
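
Before probing, it can help to check whether i2c_dev is actually loaded. A rough sketch follows; the helper module_loaded is made up for illustration, a built-in i2c_dev would not show up in /proc/modules, and the /dev/i2c-1 bus in the commented example is only an assumption (use whatever the probe reports on your machine):

```shell
# Succeed if module $1 appears in a /proc/modules-style listing
# ($2, defaulting to /proc/modules). Built-in modules are not listed there.
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}" 2>/dev/null
}

if module_loaded i2c_dev; then
    echo "i2c_dev is loaded"
else
    echo "i2c_dev is not loaded; as root, run: modprobe i2c_dev"
fi

# Afterwards (as root), probe for DDC/CI-capable monitors:
#   ddccontrol -p
# and, assuming the probe reported the monitor on /dev/i2c-1 (hypothetical),
# read the brightness control (0x10) and set it to 50:
#   ddccontrol -r 0x10 -w 50 dev:/dev/i2c-1
```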

For a quick start there is a GTK GUI application called gddccontrol (package ddccontrol-gtk). It lets you create, manage and switch profiles, so you can have different contrast/brightness settings for day/night or gaming/work. For all controls to work correctly your monitor needs to be in the ddccontrol database; otherwise only the generic VESA controls will be available. In that case you can create a profile for your monitor yourself, or at least send a report to upstream (ddccontrol-users AT lists.sourceforge.net) containing the output of the following command:

LANG= LC_ALL= ddccontrol -p -c -d

For more details and instructions on how to use the CLI tool, see the upstream documentation.

Saturday, December 3, 2011

Tip: power management settings for Radeon

Starting with kernel 2.6.35, the Radeon driver supports power management (PM). The default behavior is performance oriented, so if you want to save some power you will need to tune the defaults. You can choose between two PM methods: dynpm and profile. The dynpm method dynamically changes the GPU clocks according to the GPU load. With this method you get enough performance when needed and power savings when idle. However, it can cause flickering during reclocking and it doesn't support multi-head setups. To activate this method use:
# echo dynpm > /sys/class/drm/card0/device/power_method
With the profile method you can select from several profiles: default, auto, low, mid and high. The default profile is the default setting. It uses the default clocks and doesn't change power states. The low, mid and high profiles change the GPU clocks accordingly. The auto profile switches automatically between high and mid depending on whether the system is running on AC or battery. It also switches to low when the monitors are in DPMS off. Not all cards support all profiles. For example, to activate the auto profile, use:
# echo profile > /sys/class/drm/card0/device/power_method
# echo auto > /sys/class/drm/card0/device/power_profile
More details can be found on http://www.x.org/wiki/RadeonFeature.
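
The two echo variants above can be wrapped into a small helper that rejects unknown values before touching sysfs. This is only a sketch: the function name set_radeon_pm is made up, the list of accepted values is taken from the text above, and the device directory is passed as an argument because card0 may not be the Radeon card on every machine:

```shell
# set_radeon_pm DEVICE_DIR METHOD [PROFILE]
# METHOD is dynpm or profile; PROFILE is required for the profile method.
set_radeon_pm() {
    dev=$1
    method=$2
    profile=${3:-}
    case $method in
        dynpm) ;;
        profile)
            case $profile in
                default|auto|low|mid|high) ;;
                *) echo "unknown profile: $profile" >&2; return 1 ;;
            esac ;;
        *) echo "unknown method: $method" >&2; return 1 ;;
    esac
    # Refuse to continue if the driver does not expose the PM files.
    [ -w "$dev/power_method" ] || { echo "no writable power_method in $dev" >&2; return 1; }
    echo "$method" > "$dev/power_method" || return 1
    if [ "$method" = profile ]; then
        echo "$profile" > "$dev/power_profile" || return 1
    fi
}

# Example (as root):
#   set_radeon_pm /sys/class/drm/card0/device profile auto
```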

We also added experimental support for this to tuned. The power savings are activated in the desktop and laptop profiles. The code is currently in tuned git.

Friday, November 25, 2011

Fedora 16: Firefox power consumption comparison



Fedora 16 has been released, so I decided to test it with my Firefox test script to find out its power consumption in typical real-world "web browsing" scenarios.

Test description

I used the same iMacros script as in the previous Firefox power consumption test. I only had to make some syntax changes, because the iMacros API has changed a little since the last test.


Hardware

The tests were run on an HP ProLiant DL360 G6 with default BIOS settings and no tuning. For the measurements I used a Chroma 66202, an ENERGY STAR / IEC 62301 compliant power meter. The total energy consumed on the AC side was monitored. Another machine served as the data logger / power meter controller, so as not to influence the machine under test.

Software

The latest kernel and SW builds available on the day of the test were used for Fedora 16.

Table 1: Software used
System      Kernel / SW build   Firefox   Flash
Fedora 15   2.6.38.8-35         5.0       Beta 2 11.0.d1.98
Fedora 16   3.1.1-2             7.0.1     11.1.102.55

Results


For comparison with Fedora 15 I've used the results measured in the previous test.

The result tables list the average power consumption (Pavg), the average energy (Eavg) and the sample standard deviation (marked stdev) of both values.

Active idle

For this test, Firefox ran with about:blank open for 30 minutes.

Table 2: Idle state
System      Pavg [W]   Pavg stdev   Eavg [Wh]   Eavg stdev
Fedora 15   55.9613    0.0381       27.9744     0.019
Fedora 16   56.1119    0.0500       28.0497     0.025

There is no big difference between Fedora 16 and Fedora 15 in the idle state. Fedora 15 needs a little less energy, but this is probably an artifact caused by several peaks during the Fedora 16 measurement.


iMacros script with HTML5 Youtube videos

For this test, the iMacros script was used as described above and YouTube was configured to use HTML5 video playback.


Table 3: iMacros with HTML5 Youtube
System      Pavg [W]   Pavg stdev   Eavg [Wh]   Eavg stdev
Fedora 15   65.6824    0.1331       32.8338     0.0665
Fedora 16   65.4778    0.1599       32.7316     0.0799

Fedora 16 needs a little less energy when Flash is disabled and HTML5 is used for YouTube video playback.


iMacros script with Flash Youtube videos

This test is the same as the previous one, but the Firefox Flash plugin was activated and YouTube was configured to use Flash for video playback. This also activates Flash adverts on other pages.
 

Table 4: iMacros with Flash Youtube
System      Pavg [W]   Pavg stdev   Eavg [Wh]   Eavg stdev
Fedora 15   68.9575    0.2617       34.4711     0.1308
Fedora 16   71.2404    0.0837       35.6122     0.0418

The big surprise is that the new version of Flash or Firefox in Fedora 16 caused a significant increase in power consumption. According to Table 4, Fedora 16 with Flash needed about 2.3 W more than Fedora 15 (71.2404 W versus 68.9575 W).

Flash power consumption


This table shows the difference between Table 4 and Table 3, i.e. the power and energy saved by not using Flash.

Table 5: Flash power consumption
System      Pavg [W]   Eavg [Wh]
Fedora 15   3.2751     1.6373
Fedora 16   5.7626     2.8806

We can see the Flash power consumption increase here.
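
The Table 5 values are plain row-by-row subtractions of Table 3 from Table 4. As a sanity check they can be recomputed with a short awk helper; the name flash_overhead is made up for this sketch, and the inputs are the Pavg and Eavg figures from the two tables:

```shell
# flash_overhead WITH_FLASH WITHOUT_FLASH
# Difference between a Table 4 value and the matching Table 3 value.
flash_overhead() {
    awk -v f="$1" -v h="$2" 'BEGIN { printf "%.4f\n", f - h }'
}

flash_overhead 68.9575 65.6824   # Fedora 15, Pavg [W]
flash_overhead 34.4711 32.8338   # Fedora 15, Eavg [Wh]
flash_overhead 71.2404 65.4778   # Fedora 16, Pavg [W]
flash_overhead 35.6122 32.7316   # Fedora 16, Eavg [Wh]
```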

Conclusion

As the previous test showed, browsing the web with Flash enabled consumes more energy. This is especially true in Fedora 16 with the Flash/Firefox versions we tested. There is no big difference between Fedora 15 and Fedora 16 when browsing the web without Flash.

Tuesday, October 18, 2011

Deeper C states and increased latency



Today it is common to describe a processor's power consumption and thermal management state by Cx states, where x is a number from 0 to n (n depends on the CPU type). In the C0 state the CPU is running. In C-states higher than 0 the CPU is stopped (sleeping). Higher C-states mean more power savings but also a longer delay when returning to C0 (higher latency). The ACPI specification describes C0 - C3, but most recent CPUs support more C-states. In several BIOSes, the higher C-states are mapped through C3. The mapping can be done dynamically according to the operating conditions (e.g. on some laptops C6 is mapped when running on battery and C4 when running on AC). An overview of the best-known C-states can be found in Table 1 (an incomplete compilation from [1-4]).

C-state  Name                    Description
C0       Operating state         CPU is fully turned on and executing instructions. It can be in one of the P-states (P0 - Pn), which define the operating voltage and frequency.
C1       Halt                    CPU main internal clocks are stopped. Bus interface unit and APIC are kept running at full speed.
C1E      Enhanced Halt           CPU main internal clocks are stopped and the CPU voltage is reduced. Bus interface unit and APIC are kept running. The frequency can also be reduced.
C2       Stop Clock              CPU internal and external clocks are stopped via hardware.
C3       Deep Sleep              CPU internal and external clocks are stopped, L1/L2 cache can be flushed.
C4       Deeper Sleep            CPU voltage is reduced.
C5       Enhanced Deeper Sleep   CPU voltage is reduced even more and the memory cache is turned off.
C6       Deep Power Down         Core states are saved into memory with low power consumption. It can reduce the CPU internal voltage to any value, including 0 V.
C7       Deeper Power Down *1    Same as C6 + flush of L3 cache.

As mentioned earlier, higher C-states mean not only lower power consumption but also higher latency. That's why several BIOSes use a different mapping for AC and battery. A simple experiment will demonstrate these claims.

Comparison of latency for AC / battery mode

First we let the Lenovo T500 idle on AC. After a while we got the following powertop output:
Cn                Avg residency       P-states (frequencies)
C0 (cpu running)        ( 0,8%)       Turbo Mode     2,7%
polling           0,0ms ( 0,0%)         2,81 Ghz     0,0%
C1 mwait          0,0ms ( 0,0%)         2,14 Ghz     0,0%
C2 mwait          0,3ms ( 0,3%)         1,60 Ghz     0,2%
C4 mwait          6,6ms (98,9%)          800 Mhz    97,2%
As you can see, the CPU is in C4 most of the time. We can check the latency reported by the kernel with the following command:
# cat /sys/devices/system/cpu/cpu0/cpuidle/state3/latency
57
This means 57 microseconds (C4 is currently mapped to state3, the last state in the cpuidle subdirectory). Then we pinged the T500 over the LAN from another machine and got 401 (31) us. This is the average of 10 runs; the sample standard deviation is given in parentheses.
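
To see which cpuidle state currently carries which C-state and exit latency, all state directories can be listed at once. A minimal sketch, assuming the usual cpuidle sysfs layout with a name and a latency file in each stateN directory; the function name list_cpuidle_latency is made up for illustration:

```shell
# Print "<state dir> <state name> <exit latency in us>" for each cpuidle state.
# $1 = cpuidle directory (defaults to cpu0's).
list_cpuidle_latency() {
    base=${1:-/sys/devices/system/cpu/cpu0/cpuidle}
    for st in "$base"/state*; do
        # Skip if the glob matched nothing or the files are unreadable.
        [ -r "$st/name" ] && [ -r "$st/latency" ] || continue
        printf '%s %s %s\n' "${st##*/}" "$(cat "$st/name")" "$(cat "$st/latency")"
    done
}

list_cpuidle_latency
```

On a machine without cpuidle support the function simply prints nothing.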

Next we performed the same experiment with the T500 running on battery. We got the following powertop output:
Cn                Avg residency       P-states (frequencies)
C0 (cpu running)        ( 0.1%)       Turbo Mode     0.9%
polling           0.0ms ( 0.0%)         2.81 Ghz     0.0%
C1 mwait          0.1ms ( 0.0%)         2.14 Ghz     0.0%
C2 mwait          0.7ms ( 0.1%)         1.60 Ghz     0.0%
C6 mwait         58.9ms (99.8%)          800 Mhz    99.1%

Wakeups-from-idle per second : 18.8     interval: 15.0s
Power usage (ACPI estimate): 12.5W (5.3 hours)
As you can see, the CPU is now in C6 most of the time, so the CPU power consumption can be reduced to near zero. For the latency:
# cat /sys/devices/system/cpu/cpu0/cpuidle/state3/latency
162
This means 162 microseconds (C6 is currently mapped to state3). The ping result: 493 (33) us. That is an increase of roughly 90 us.
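
The "average (sample standard deviation)" figures quoted for the ping runs can be produced with a few lines of awk. This is a sketch rather than the script used for the measurements; it assumes one round-trip time in microseconds per input line, the name mean_stdev is made up, and the three sample values are invented:

```shell
# Read one number per line and print "mean (sample stdev)", both rounded.
mean_stdev() {
    awk '{ n++; s += $1; q += $1 * $1 }
         END {
             if (n < 2) exit 1          # sample stdev needs at least 2 values
             m = s / n
             printf "%.0f (%.0f)\n", m, sqrt((q - n * m * m) / (n - 1))
         }'
}

# Invented sample of three round-trip times in microseconds.
printf '%s\n' 390 410 400 | mean_stdev

# With real data, ping prints times in ms, so something along these lines
# could feed the helper (the host name t500 is hypothetical):
#   ping -c 10 t500 | sed -n 's/.*time=\([0-9.]*\) ms/\1/p' \
#       | awk '{ print $1 * 1000 }' | mean_stdev
```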

The intel_idle driver

The latency problem can be even worse if the intel_idle driver [5] is used. This driver has been included since kernel version 2.6.35. It is a native hardware driver for the latest Intel CPUs and supersedes acpi_idle on supported processors (currently Intel Atom, Intel Core i3/i5/i7 and the associated Intel Xeons). intel_idle knows more than ACPI does, and it can bypass the firmware / BIOS settings so that the processor enters deeper power-saving states more aggressively. By default the intel_idle driver is built into the Fedora kernel and is activated automatically on boot (on supported CPUs). This results in higher power savings by default, but also in higher latency.

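If the increased latency is unacceptable, the deepest allowed C-state can be limited with the kernel command line parameter intel_idle.max_cstate, e.g. boot with intel_idle.max_cstate=1 to go no deeper than C1. Note that intel_idle.max_cstate=0 does not mean "C0 only": it disables the intel_idle driver, and the kernel falls back to acpi_idle.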
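
After a reboot it is worth checking which value the running kernel was actually booted with. A small sketch that pulls a single parameter out of the kernel command line; the helper name cmdline_param is made up for illustration:

```shell
# cmdline_param NAME [FILE]
# Print the value of NAME=value from a kernel command line
# (FILE defaults to /proc/cmdline). Prints nothing if NAME is not set.
cmdline_param() {
    # Escape dots so they match literally in the sed pattern.
    name=$(printf '%s' "$1" | sed 's/\./\\./g')
    tr ' ' '\n' < "${2:-/proc/cmdline}" | sed -n "s/^$name=//p" | tail -n 1
}

cmdline_param intel_idle.max_cstate

# The cpuidle driver in use can be read from sysfs:
#   cat /sys/devices/system/cpu/cpuidle/current_driver
```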

The PM QoS kernel interface

For finer runtime control, the PM QoS interface [6] can be used. Through this interface every process can register its latency requirement, and the cpuidle driver will not enter a deeper C-state if that would violate the lowest registered request. The request is written as a four-byte signed integer to /dev/cpu_dma_latency and stays valid for as long as the file descriptor is held open. E.g. to request a latency no higher than 100 us, the following commands can be used:
# exec 3>/dev/cpu_dma_latency
# echo -ne '\0144\000\000\000' >&3
For the echo command, 100 (decimal) was translated into 0144 (octal); the remaining three bytes are zero because the integer is written in little-endian order. Let's repeat our experiment:
Cn                Avg residency       P-states (frequencies)
C0 (cpu running)        ( 0.2%)       Turbo Mode     0.1%
polling           0.0ms ( 0.0%)         2.81 Ghz     0.0%
C1 mwait          0.1ms ( 0.2%)         2.14 Ghz     1.3%
C2 mwait         39.7ms (99.6%)         1.60 Ghz     0.0%
C6 mwait          0.0ms ( 0.0%)          800 Mhz    97.3%

Wakeups-from-idle per second : 50.5     interval: 20.0s
Power usage (ACPI estimate): 14.9W (4.3 hours)
As you can see, the CPU no longer transitions to C6. As a side effect, the power consumption increased by about 2.5 W (according to the ACPI estimate), and the ping result is better: 301 (30) us.
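
Translating a latency value into the four escaped bytes by hand is error-prone, so it can be scripted. The sketch below is an illustration only: it assumes a little-endian machine (as the x86 laptop used here is) and a non-negative value, the name latency_escape is made up, and it emits printf-style escapes (\144), not the \0144 form that echo -e expects:

```shell
# latency_escape MICROSECONDS
# Print the value as four little-endian bytes, written as printf octal escapes.
latency_escape() {
    v=$1
    printf '\\%03o\\%03o\\%03o\\%03o' \
        $(( v & 255 )) $(( (v >> 8) & 255 )) \
        $(( (v >> 16) & 255 )) $(( (v >> 24) & 255 ))
}

# Show the escape string, then the raw bytes it expands to.
latency_escape 100; echo
printf "$(latency_escape 100)" | od -An -tu1

# As root, hold a 100 us request for as long as fd 3 stays open:
#   exec 3>/dev/cpu_dma_latency
#   printf "$(latency_escape 100)" >&3
```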

When the lower latency is not needed, we can remove the requirement from the kernel by closing the file descriptor:
# exec 3>&-
Now let's set the required latency to 0. The powertop results:
Cn                Avg residency       P-states (frequencies)
C0 (cpu running)        ( 0.2%)       Turbo Mode     0.0%
polling          26.4ms (99.8%)         2.81 Ghz     0.0%
C1 mwait          0.0ms ( 0.0%)         2.14 Ghz     0.0%
C2 mwait          0.0ms ( 0.0%)         1.60 Ghz     0.0%
C6 mwait          0.0ms ( 0.0%)          800 Mhz    99.9%

Wakeups-from-idle per second : 37.8     interval: 15.0s
Power usage (ACPI estimate): 16.9W (3.9 hours) (long term: 15.7W,/4.2h)
Note the increased power consumption, and that cpufreq wasn't affected, i.e. the CPU is still running at its lowest speed. The ping result: 195 (34) us.

User control of PM QoS

The PM QoS interface is also proxied by the upower daemon, so it is possible to control the PM QoS settings through its D-Bus interface (org.freedesktop.UPower.QoS).

Support was also recently added to the tuned latency-performance profile (currently only in upstream git [7], but it will probably be part of the future v0.2.22 release). To test it:
# tuned-adm profile latency-performance
The powertop results:
Cn                Avg residency       P-states (frequencies)
C0 (cpu running)        ( 0.4%)       Turbo Mode   100.0%
polling          74.3ms (99.6%)         2.81 Ghz     0.0%
C1 mwait          0.0ms ( 0.0%)         2.14 Ghz     0.0%
C2 mwait          0.0ms ( 0.0%)         1.60 Ghz     0.0%
C6 mwait          0.0ms ( 0.0%)          800 Mhz     0.0%

Wakeups-from-idle per second : 13.4     interval: 5.0s
Power usage (ACPI estimate): 31.4W (2.1 hours)
Note that the power consumption nearly doubles and that the CPU spends most of its time in Turbo mode. The ping result: 176 (32) us. We got the lowest latency, but this setup is far from power efficient.


*1 Deduced, no official name found in Intel specs.
[1] http://www.acpi.info/spec.htm
[2] http://www.hardwaresecrets.com/article/611
[3] http://www.intel.com/content/www/us/en/processors/xeon/xeon-e3-1200-family-vol-1-datasheet.html
[4] http://www.lesswatts.org/documentation/silicon-power-mgmnt/
[5] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=2671717265ae6e720a9ba5f13fbec3a718983b65
[6] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=blob_plain;f=Documentation/power/pm_qos_interface.txt;hb=HEAD
[7] http://git.fedorahosted.org/git/?p=tuned.git;a=commit;h=31639fdec76b294fd67c78ec332fe26bf0ad7bb9