Matthew Garrett (mjg59) wrote on 2016-04-13 12:46 pm

Skylake's power management under Linux is dreadful and you shouldn't buy one until it's fixed

(Edit to add: this issue is restricted to the mobile SKUs. Desktop parts have very different power management behaviour)

Linux 4.5 seems to have got Intel's Skylake platform (ie, 6th-generation Core CPUs) to the point where graphics work pretty reliably, which is great progress (4.4 tended to lose all my windows every so often, especially over suspend/resume). I'm even running Wayland happily. Unfortunately one of the reasons I have a laptop is that I want to be able to do things like use it on battery, and power consumption's an important part of that. Skylake continues the trend from Haswell of moving to an SoC-type model where clock and power domains are shared between components that were previously entirely independent, and so you can't enter deep power saving states unless multiple components all have the correct power management configuration. On Haswell/Broadwell this manifested in the form of Serial ATA link power management being involved in preventing the package from going into deep power saving states - setting that up correctly resulted in a reduction in full-system power consumption of about 40%[1].

I've now got a Skylake platform with a nice shiny NVMe device, so Serial ATA policy isn't relevant (the platform doesn't even expose a SATA controller). The deepest power saving state I can get into is PC3, despite Skylake supporting PC8 - so I'm probably consuming about 40% more power than I should be. And nobody seems to know what needs to be done to fix this. I've found no public documentation on the power management dependencies on Skylake. Turning on everything in Powertop doesn't improve anything. My battery life is pretty poor and the system is pretty warm.
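One rough way to check which package C-state a machine actually reaches is to sample the package residency counters twice and see which ones advanced. The sketch below assumes rdmsr from msr-tools, a loaded msr kernel module, and root access. The MSR addresses are taken from the kernel's msr-index.h rather than from anything Skylake-specific, and the function names are made up for illustration. turbostat reports the same counters in a friendlier form.

```shell
#!/bin/sh
# Sketch only. Assumptions: rdmsr (msr-tools) is installed, the msr module is
# loaded, and the package residency MSR addresses below apply to this CPU.

# Print "NAME COUNTER" for each package C-state residency MSR on CPU 0,
# shallowest state first.
sample_pkg_cstates() {
    for entry in PC2:0x60d PC3:0x3f8 PC6:0x3f9 PC7:0x3fa PC8:0x630 PC9:0x631 PC10:0x632; do
        printf '%s %s\n' "${entry%%:*}" "$(rdmsr -p 0 -u "${entry#*:}")"
    done
}

# Read two samples back to back on stdin (each name appears twice) and print
# the deepest state whose counter advanced between them, or "none".
deepest_pkg_cstate() {
    awk '
        BEGIN { deepest = "none" }
        ($1 in first) { if ($2 + 0 > first[$1]) deepest = $1; next }
        { first[$1] = $2 + 0 }
        END { print deepest }
    '
}

# Sample, idle for N seconds (default 10), sample again, and report.
measure_deepest_pkg_cstate() {
    { sample_pkg_cstates; sleep "${1:-10}"; sample_pkg_cstates; } | deepest_pkg_cstate
}
```

Running measure_deepest_pkg_cstate 30 as root on an otherwise idle system should print a state name such as PC3. It only shows that a state was entered at least once during the interval, not how much time was spent there.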

The best thing about this is the following statement from page 64 of the 6th Generation Intel® Processor Datasheet for U-Platforms:

Caution: Long term reliability cannot be assured unless all the Low-Power Idle States are enabled.

which is pretty concerning. Without support for states deeper than PC3, Linux is running in a configuration that Intel imply may trigger premature failure. That's obviously not good. Until this situation is improved, you probably shouldn't buy any Skylake systems if you're planning on running Linux.

[1] These patches never went upstream. Someone reported that they resulted in their SSD throwing errors and I couldn't find anybody with deeper levels of SATA experience who was interested in working on the problem. Intel's AHCI drivers for Windows do the right thing, but I couldn't find anybody at Intel who could get any information from their Windows driver team.

Re: MSR

(Anonymous) 2016-04-16 07:18 am (UTC)
Yes, you're right. I have misread the bits specification.

Anyway, I was able to get it working and now my system can go to PC8 :-)

It's an Acer Aspire VN7-592g; by default the system only goes to PC2.

1. PC3 - I needed to enable C1E auto promotion (intel_idle disables it by default for some reason, so I overrode it).
2. PC8 - My system has NVidia Optimus and I don't use it, so I use bbswitch to turn it off and have the nouveau module blacklisted. But that seems to prevent PC8 :-D So the workaround is to:

modprobe nouveau
sleep 5
echo OFF > /proc/acpi/bbswitch

If the nouveau module NEVER initialized the card, it would only go to PC3.

Pavel

Re: MSR

(Anonymous) 2016-04-16 08:07 am (UTC)
Another update. The C1E doesn't seem to be the real issue.
I have discovered that it is not sufficient to enable it using wrmsr; I actually have to suspend and resume the machine :-D After the resume, the C1E bit is enabled and something else has changed (I don't know what: the output from turbostat looks the same, and lspci shows the same ALPM).
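For anyone wanting to try the wrmsr step, a minimal sketch follows. It assumes the C1E enable bit is bit 1 of MSR_IA32_POWER_CTL (0x1fc), which is the bit the kernel's intel_idle driver clears, and it assumes rdmsr and wrmsr from msr-tools. The function names are hypothetical, and as described above, writing the bit alone may not be enough without a suspend/resume cycle.

```shell
#!/bin/sh
# Sketch only. Assumptions: MSR_IA32_POWER_CTL is 0x1fc and bit 1 is the
# C1E enable bit; rdmsr/wrmsr come from msr-tools; run as root.

# Print VALUE with bit 1 (C1E enable) set, formatted as hex.
with_c1e_enabled() {
    printf '0x%x\n' "$(( $1 | 0x2 ))"
}

# Read-modify-write the register on CPU 0 so the other bits are preserved.
enable_c1e_cpu0() {
    current="0x$(rdmsr -p 0 0x1fc)"   # rdmsr prints hex without a 0x prefix
    wrmsr -p 0 0x1fc "$(with_c1e_enabled "$current")"
}
```

Doing a read-modify-write rather than writing a fixed constant keeps whatever other power control bits the firmware has set.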

Here is debug before suspend (PC2 max)

https://gist.github.com/anonymous/703ea5a4026a333bacf0811f41280d62

And here is output after suspend (PC8 max)

https://gist.github.com/anonymous/8e53bc438792ed679c907a9685ce6c7d

So the difference is the link state of 00:01.0 PCI bridge: Intel Corporation Skylake PCIe Controller (x16) (rev 07) (prog-if 00 [Normal decode])