[personal profile] mjg59
Edit to add: These patches on their own won't enable this functionality; they just give us a better set of options. Once they're merged we can look at changing the defaults so people get the benefit of this out of the box.

Haswell and Broadwell (Intel's previous and current generations of x86) both introduced a range of new power saving states that promised significant improvements in battery life. Unfortunately, the typical experience on Linux was an increase in power consumption. The reasons why are kind of complicated and distinctly unfortunate, and I'm at something of a loss as to why none of the companies who get paid to care about this kind of thing seemed to actually be caring until I got a Broadwell and looked unhappy, but here we are so let's make things better.

Recent Intel mobile parts have the Platform Controller Hub (Intel's term for the Southbridge, the chipset component responsible for most system i/o like SATA and USB) integrated onto the same package as the CPU. This makes it easier to implement aggressive power saving - the CPU package already has a bunch of hardware for turning various clock and power domains on and off, and these can be shared between the CPU, the GPU and the PCH. But that also introduces additional constraints, since if any component within a power management domain is active then the entire domain has to be enabled. We've pretty much been ignoring that.

The tldr is that Haswell and Broadwell are only able to get into deeper package power saving states if several different components are in their own power saving states. If the CPU is active, you'll stay in a higher-power state. If the GPU is active, you'll stay in a higher-power state. And if the PCH is active, you'll stay in a higher-power state. The last one is the killer here. Having a SATA link in a full-power state is sufficient to keep the PCH active, and that constrains the deepest package power savings state you can enter.
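As a rough illustration of that constraint, here is a toy model in Python. The state names PC2 and PC7 are the ones discussed further down; the all-or-nothing rule and the Package class are simplifying assumptions made for this sketch, not a description of Intel's actual state machine.

```python
from dataclasses import dataclass

# Toy model only: the state names come from the post (PC2 and PC7). The
# all-or-nothing rule is a simplification, not Intel's real state machine.
SHALLOW_STATE = "PC2"
DEEP_STATE = "PC7"


@dataclass
class Package:
    """Hypothetical view of the components sharing the package power domains."""

    cpu_idle: bool
    gpu_idle: bool
    pch_idle: bool  # False while, for example, a SATA link sits at full power

    def deepest_state(self) -> str:
        """Return the deepest package state this toy model allows."""
        if self.cpu_idle and self.gpu_idle and self.pch_idle:
            return DEEP_STATE
        return SHALLOW_STATE


if __name__ == "__main__":
    # CPU and GPU are idle, but an active SATA link keeps the PCH awake.
    print(Package(cpu_idle=True, gpu_idle=True, pch_idle=False).deepest_state())
    # Everything is idle, so the deep state becomes reachable.
    print(Package(cpu_idle=True, gpu_idle=True, pch_idle=True).deepest_state())
```

The point of the sketch is only that one busy component is enough to hold the whole package in the shallow state.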

SATA power management on Linux is in a kind of odd state. We support it, but we don't enable it by default. In fact, right now we even remove any existing SATA power management configuration that the firmware has initialised. Distributions don't enable it by default because there are horror stories about some combinations of disk and controller and power management configuration resulting in corruption and data loss and apparently nobody had time to investigate the problem.
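If you want to see what your own system is currently doing, libata exposes the link power management policy per SCSI host in sysfs. The following is a minimal sketch that reads it; the helper name read_lpm_policies is made up for this example, and hosts that don't expose the attribute (non-SATA hosts, for instance) are simply skipped.

```python
from pathlib import Path


def read_lpm_policies(root="/sys/class/scsi_host"):
    """Map each host name to its current SATA link power management policy.

    Hosts without a readable link_power_management_policy file are skipped.
    """
    base = Path(root)
    if not base.is_dir():
        return {}
    policies = {}
    for host in sorted(base.glob("host*")):
        policy_file = host / "link_power_management_policy"
        try:
            policies[host.name] = policy_file.read_text().strip()
        except OSError:
            # Attribute missing or unreadable on this host; ignore it.
            continue
    return policies


if __name__ == "__main__":
    for host, policy in read_lpm_policies().items():
        print(f"{host}: {policy}")
```

On an unpatched kernel you would typically expect to see max_performance here, which is the "supported but not enabled" situation described above.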

I did some digging and it turns out that our approach isn't entirely inconsistent with the industry. The default behaviour on Windows is pretty much the same as ours. But vendors don't tend to ship with the Windows AHCI driver, they replace it with the Intel Rapid Storage Technology driver - and it turns out that that has a default-on policy. But to make things even more awkward, the policy implemented by Intel doesn't match any of the policies that Linux provides.

In an attempt to address this, I've written some patches. The aim here is to provide two new policies. The first simply inherits whichever configuration the firmware has provided, on the assumption that the system vendor probably didn't configure their system to corrupt data out of the box[1]. The second implements the policy that Intel use in IRST. With luck we'll be able to use the firmware settings by default and switch to the IRST settings on Intel mobile devices.
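Until the defaults change, picking a policy is something userspace has to do. Here is a hedged sketch of what that selection could look like. It assumes a patched kernel that accepts the string firmware_default and that carries the Intel-style settings under medium_power; both strings are assumptions based on the patch subjects, and choose_policy and apply_policy are hypothetical helpers, not part of the patches. Writing to sysfs needs root.

```python
from pathlib import Path

# Assumed policy strings: they mirror the patch subjects and may not match
# what finally gets merged.
FIRMWARE_POLICY = "firmware_default"
INTEL_MOBILE_POLICY = "medium_power"


def choose_policy(is_intel_mobile: bool) -> str:
    """Use the Intel-style policy on Intel mobile parts, firmware settings otherwise."""
    return INTEL_MOBILE_POLICY if is_intel_mobile else FIRMWARE_POLICY


def apply_policy(policy: str, root="/sys/class/scsi_host"):
    """Write the policy to every host exposing the attribute; return the hosts changed."""
    base = Path(root)
    if not base.is_dir():
        return []
    changed = []
    for host in sorted(base.glob("host*")):
        policy_file = host / "link_power_management_policy"
        if not policy_file.exists():
            continue
        policy_file.write_text(policy)
        changed.append(host.name)
    return changed


if __name__ == "__main__":
    # Prints the choice only; call apply_policy() as root to actually set it.
    print(choose_policy(is_intel_mobile=True))
```

Given the horror stories mentioned earlier, trying this on a machine whose data you care about is at your own risk.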

This change alone drops my idle power consumption from around 8.5W to about 5W. One reason we'd pretty much ignored this in the past was that SATA power management simply wasn't that big a win. Even at its most aggressive, we'd struggle to see 0.5W of saving. But on these new parts, the SATA link state is the difference between going to PC2 and going to PC7, and the difference between those states is a large part of the CPU package being powered up.
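To put those numbers in terms of runtime, here is the back-of-the-envelope arithmetic. The 50 Wh battery capacity is an assumption picked purely for illustration, and real runtime depends on workload rather than idle draw alone.

```python
def battery_hours(capacity_wh: float, draw_w: float) -> float:
    """Idealised runtime: battery capacity divided by a constant power draw."""
    if draw_w <= 0:
        raise ValueError("power draw must be positive")
    return capacity_wh / draw_w


def saving_fraction(before_w: float, after_w: float) -> float:
    """Fraction of the original draw that the change removes."""
    return (before_w - after_w) / before_w


if __name__ == "__main__":
    capacity_wh = 50.0  # assumed capacity, not a measured value
    before = battery_hours(capacity_wh, 8.5)
    after = battery_hours(capacity_wh, 5.0)
    print(f"idle runtime: {before:.1f} h -> {after:.1f} h")
    print(f"idle power saved: {saving_fraction(8.5, 5.0):.0%}")
```

With that assumed capacity, idle runtime goes from a little under 6 hours to 10 hours, a reduction in idle draw of roughly 41%.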

But this isn't the full story. There's still work to be done on other components, especially the GPU. Keeping the link between the GPU and an internal display panel active is both a power suck and requires additional chipset components to be powered up. Embedded DisplayPort 1.3 introduced a new feature called Panel Self-Refresh that permits the GPU and the screen to negotiate dropping the link, leaving it up to the screen to maintain its contents. There are patches to enable this on Intel systems, but it's still not turned on by default. Doing so increases the amount of time spent in PC7 and brings corresponding improvements to battery life.
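A quick way to check whether PSR has been requested on a given machine is to look at the i915 module's enable_psr parameter. This is a minimal sketch under a few assumptions: the i915 driver is loaded, the parameter is exposed under /sys/module, and you have permission to read it (it may be root-only). The meaning of each value has varied between kernel versions, so the helper just returns the raw integer.

```python
from pathlib import Path
from typing import Optional


def read_enable_psr(param_root="/sys/module/i915/parameters") -> Optional[int]:
    """Return the raw i915 enable_psr value, or None if it cannot be read."""
    try:
        return int((Path(param_root) / "enable_psr").read_text().strip())
    except (OSError, ValueError):
        # i915 not loaded, parameter absent, not readable, or not an integer.
        return None


if __name__ == "__main__":
    value = read_enable_psr()
    if value is None:
        print("enable_psr not readable (is i915 loaded, and are you root?)")
    else:
        print(f"i915.enable_psr = {value}")
```

This only tells you what the driver was asked to do; whether the panel actually enters self-refresh is a separate question.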

This trend is likely to continue. As systems become more integrated we're going to have to pay more attention to the interdependencies in order to obtain the best possible power consumption, and that means that distribution vendors are going to have to spend some time figuring out what these dependencies are and what the appropriate default policy is for their users. Intel's done the work to add kernel support for most of these features, but they're not the ones shipping it to end-users. Let's figure out how to make this right out of the box.

[1] This is not necessarily a good assumption, but hey, let's see

Nice

Date: 2015-04-27 07:11 pm (UTC)
From: (Anonymous)
Nice work, as always.

Thank you

Date: 2015-04-27 08:31 pm (UTC)
From: (Anonymous)
It's awesome that you're fixing things like that, thank you.

But at the same time, it's also kinda sad that nobody is getting paid to do this.

Date: 2015-04-27 08:42 pm (UTC)
From: (Anonymous)
Would you consider posting a full list of patches here, rather than just on Twitter?

Hack the planet!

Date: 2015-04-27 08:43 pm (UTC)
From: (Anonymous)
Once again, thanks for your awesome work, and for writing about it.

Patch set posted to LKML on 18 April 2015

Date: 2015-04-27 11:41 pm (UTC)
From: (Anonymous)
  1. [PATCH 1/3] libata: Stash initial power management configuration (https://lkml.org/lkml/2015/4/18/77)
  2. [PATCH 2/3] libata: Add firmware_default LPM policy (https://lkml.org/lkml/2015/4/18/78)
  3. [PATCH 3/3] libata: Change medium_power LPM policy to match Intel recommendations (https://lkml.org/lkml/2015/4/18/78)

Thanks!

Date: 2015-04-28 04:46 am (UTC)
From: (Anonymous)
Thanks! You're a genius!

Kernel Version

Date: 2015-04-28 05:07 am (UTC)
From: (Anonymous)
Which kernel version did you patch?

Re: Patch set posted to LKML on 18 April 2015

Date: 2015-04-28 12:38 pm (UTC)
From: (Anonymous)
Last link is the same as the second.

Thank you

Date: 2015-04-28 12:45 pm (UTC)
From: (Anonymous)
We are looking at these issues for the Fedora Workstation, but due to the new kernel engineer in the Fedora team only starting last week we hadn't had a chance to even think about the kernel space yet. But Josh will likely be reaching out to you about this at some point.

Hopefully we can work with you to make this stuff top notch in Fedora.

Christian

Possible problem for some corner cases

Date: 2015-04-28 01:33 pm (UTC)
From: (Anonymous)
It's great to see someone working on reducing power usage on linux, every little bit counts :) Also, it's always nice to see your explanations of the issues you deal with.

What jumped into my mind right away was this[1-2]. Having bought my first SSD very recently and doing some googling about any possible troubles I found those bug reports.

Correct me if I'm wrong, which I may very well be, but your proposed changes will affect all systems to more aggressively try to save power. If I'm interpreting the bug reports and your changes correctly, it seems that in a few corner cases it might lead to problems.

Just wanted to raise this concern as some hardware combinations might not play well with these changes.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=71371#c25
[2] https://bugzilla.kernel.org/show_bug.cgi?id=89261#c4

Re: Thank you

Date: 2015-04-28 06:11 pm (UTC)
From: (Anonymous)
I've just spun a Fedora 21 kernel test build with the patches.

The kernel artifacts are at http://koji.fedoraproject.org/koji/taskinfo?taskID=9588298. Feel free to try it.

NOTICE: It was __not__ tested; just applied the patch against this kernel.

PSR Patches

Date: 2015-04-28 08:38 pm (UTC)
From: (Anonymous)
On your twitter you mentioned using some PSR patches from drm-intel. Do you have a list of what patches you used from that branch? Also, did you experience any flicker on your laptop as a result of enabling PSR?

powertop

Date: 2015-04-29 12:56 pm (UTC)
From: (Anonymous)
I have noticed this difference simply with the command powertop --auto-tune.
So now I launch it at boot, and go from 9 watts to 4.5! The autotune activates SATA power save, and also audio codec powersave, and even USB powersave.

I encourage users to test this command...

I suppose your work will do the SATA part automatically, so thank you!

Re: PSR Patches

Date: 2015-04-29 01:10 pm (UTC)
From: (Anonymous)
Thanks. I did that. My monitor now gives me a blank screen for half a second every so often (using 4.1-rc1 with your patches and the intel_psr merge). I'm going to run it this way for a while and see how long that keeps happening. Have you made any other changes to the graphics (sna vs uxa, etc.)? I heard you're using an XPS 13. Are you using the QHD or the FHD version? I wonder where the difference is.

Date: 2015-04-29 06:28 pm (UTC)
From: [identity profile] hugo.barrera.io
So, if I understood correctly, building the kernel with these patches isn't enough: I still need to reconfigure something before building, right?

> The first simply inherits whichever configuration the firmware has provided, on the assumption that the system vendor probably didn't configure their system to corrupt data out of the box.

This may fail with odd combinations (common on desktops, since it's not a single vendor that put things together), but will probably hold true for laptops. I assume this is quite true for MacBooks: given the small variety of hardware they use, they're probably quite finely tested.

Re: Patch set posted to LKML on 18 April 2015

Date: 2015-04-29 07:23 pm (UTC)
From: (Anonymous)
Yeah, but they're consecutively numbered so the third link is: https://lkml.org/lkml/2015/4/18/79

Re: Patch set posted to LKML on 18 April 2015

Date: 2015-04-29 08:32 pm (UTC)
From: (Anonymous)
I meant the patches *other* than those for SATA LPM; Matthew has mentioned various other fixes elsewhere that, taken together, provide even more power savings.

Re: Possible problem for some corner cases

Date: 2015-04-29 08:33 pm (UTC)
From: (Anonymous)
For the latter, what would it take to safely start using devslp more widely? A whitelist of working drives? A blacklist of broken drives?

Date: 2015-04-29 08:37 pm (UTC)
From: (Anonymous)
They're quite finely tested with OSX. Anything other than OSX should be fine to the extent that it behaves exactly like OSX.

See also other hardware and Windows.

Date: 2015-04-30 05:40 am (UTC)
From: (Anonymous)
Great work Matthew! Keen to see how this will cut the idle powerdrain on my t440p. Any chance the patch will be merged into drm-intel/drm-intel-nightly anytime soon?

S-ATA ALPM and Lenovo T440s

Date: 2015-04-30 07:22 pm (UTC)
From: (Anonymous)
Hi Matthew,

I just patched a 4.1-rc1 kernel with your patches and will test drive it on a Haswell-powered Lenovo T440s. I had a bad experience with ALPM earlier and lost my partition table several times because of it, until I realized the cause was the power setting I had tweaked, see:
https://lkml.org/lkml/2014/1/20/486

I'm now trying the medium_power and firmware_defaults settings. I will report back if I hit problems with either of them, fingers crossed :-)

Profile

Matthew Garrett

About Matthew

Power management, mobile and firmware developer on Linux. Security developer at Aurora. Ex-biologist. [personal profile] mjg59 on Twitter. Content here should not be interpreted as the opinion of my employer. Also on Mastodon.
