[personal profile] mjg59
Update: Patches to fix this have been posted

There's a story going round that Lenovo have signed an agreement with Microsoft that prevents installing free operating systems. This is sensationalist, untrue and distracts from a genuine problem.

The background is straightforward. Intel platforms allow the storage to be configured in two different ways - "standard" (normal AHCI on SATA systems, normal NVMe on NVMe systems) or "RAID". "RAID" mode is typically just changing the PCI IDs so that the normal drivers won't bind, ensuring that drivers that support the software RAID mode are used. Intel have not submitted any patches to Linux to support the "RAID" mode.

In this specific case, Lenovo's firmware defaults to "RAID" mode and doesn't allow you to change that. Since Linux has no support for the hardware when configured this way, you can't install Linux (distribution installers will boot, but won't find any storage device to install the OS to).

Why would Lenovo do this? I don't know for sure, but it's potentially related to something I've written about before - recent Intel hardware needs special setup for good power management. The storage driver that Microsoft ship doesn't do that setup. The Intel-provided driver does. "RAID" mode prevents the Microsoft driver from binding and forces the user to use the Intel driver, which means they get the correct power management configuration, battery life is better and the machine doesn't melt.

(Why not offer the option to disable it? A user who does would end up with a machine that doesn't boot, and if they managed to figure that out they'd have worse power management. That increases support costs. For a consumer device, why would you want to? The number of people buying these laptops to run anything other than Windows is miniscule)

Things are somewhat obfuscated due to a statement from a Lenovo rep:This system has a Signature Edition of Windows 10 Home installed. It is locked per our agreement with Microsoft. It's unclear what this is meant to mean. Microsoft could be insisting that Signature Edition systems ship in "RAID" mode in order to ensure that users get a good power management experience. Or it could be a misunderstanding regarding UEFI Secure Boot - Microsoft do require that Secure Boot be enabled on all Windows 10 systems, but (a) the user must be able to manage the key database and (b) there are several free operating systems that support UEFI Secure Boot and have appropriate signatures. Neither interpretation indicates that there's a deliberate attempt to prevent users from installing their choice of operating system.

The real problem here is that Intel do very little to ensure that free operating systems work well on their consumer hardware - we still have no information from Intel on how to configure systems to ensure good power management, we have no support for storage devices in "RAID" mode and we have no indication that this is going to get better in future. If Intel had provided that support, this issue would never have occurred. Rather than be angry at Lenovo, let's put pressure on Intel to provide support for their hardware.

What does "RAID" actually do?

Date: 2016-09-23 01:50 am (UTC)
From: (Anonymous)
Does anyone have concrete information about what "RAID" does. For example, lspci -vvvnn output from an affected system might be nice.

--luto

Re: What does "RAID" actually do?

Date: 2016-09-23 04:41 am (UTC)
From: (Anonymous)
https://forums.lenovo.com/t5/Linux-Discussion/Yoga-900-13ISK2-BIOS-update-for-setting-RAID-mode-for-missing/m-p/3436876#M8406

Someone compared outputs with a system that one person was able to mod to go into AHCI mode.

Re: What does "RAID" actually do?

Date: 2016-09-23 04:42 am (UTC)
From: (Anonymous)
Hi, this is copy/paste from my post on the Lenovo forum; apologies in advance for my being clueless about dreamwidth post formatting...

nb:

  • by "patched" and "unpatched" I mean with/without the BIOS mod - the "patched" one being in "AHCI" mode.
  • systems were running different distros/versions of lspci, so minor formatting and naming differences
  • some "randomly" assigned stuff like interrupts and memory locations also differ.
  • this is the "diff"; I can post a full patched/unpatched lspci if you really want.


In the unpatched stock dump a 'RAID' controller appears:


00:17.0 RAID bus controller [0104]: Intel Corporation 82801 Mobile SATA Controller [RAID mode] [8086:282a] (rev 21)
Subsystem: Lenovo 82801 Mobile SATA Controller [RAID mode] [17aa:381c]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 16
Region 0: Memory at a12a8000 (32-bit, non-prefetchable) [size=32K]
Region 1: Memory at a12be000 (32-bit, non-prefetchable) [size=256]
Region 2: I/O ports at 3080 [size=8]
Region 3: I/O ports at 3088 [size=4]
Region 4: I/O ports at 3060 [size=32]
Region 5: Memory at a1200000 (32-bit, non-prefetchable) [size=512K]
Capabilities: [d0] MSI-X: Enable+ Count=10 Masked-
Vector table: BAR=0 offset=00000000
PBA: BAR=1 offset=00000000
Capabilities: [70] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [a8] SATA HBA v1.0 BAR4 Offset=00000004
Kernel driver in use: ahci
Kernel modules: ahci



that "slot" (00:17.0) doesn't appear in the patched system.



conversely, in the patched system there is a PCI bridge that doesn't appear in the unpatched system:


00:1d.0 PCI bridge [0604]: Intel Corporation Sunrise Point-LP PCI Express Root Port #9 [8086:9d18] (rev f1) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 124
Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
I/O behind bridge: 00003000-00003fff
Memory behind bridge: a1000000-a10fffff
Prefetchable memory behind bridge: 00000000fff00000-00000000000fffff
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0
ExtTag- RBE+
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 128 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
LnkCap: Port #9, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L0s <1us, L1 <16us
ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp+
LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 8GT/s, Width x4, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
Slot #8, PowerLimit 25.000W; Interlock- NoCompl+
SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
Changed: MRL- PresDet- LinkState+
RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible-
RootCap: CRSVisible-
RootSta: PME ReqID 0000, PMEStatus- PMEPending-
DevCap2: Completion Timeout: Range ABC, TimeoutDis+, LTR+, OBFF Not Supported ARIFwd+
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled ARIFwd-
LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+, EqualizationPhase1+
EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
Address: fee00278 Data: 0000
Capabilities: [90] Subsystem: Lenovo Device [17aa:3805]
Capabilities: [a0] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [100 v0] #00
Capabilities: [140 v1] Access Control Services
ACSCap: SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
Capabilities: [220 v1] #19
Kernel driver in use: pcieport
Kernel modules: shpchp


And, of finally, here's the bad boy we all want - the Samsung drive on a PCI bus that doesn't appear in the unpatched system:


03:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller [144d:a802] (rev 01) (prog-if 02 [NVM Express])
Subsystem: Samsung Electronics Co Ltd Device [144d:a801]
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 16
NUMA node: 0
Region 0: Memory at a1000000 (64-bit, non-prefetchable) [size=16K]
Region 2: I/O ports at 3000 [size=256]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] MSI: Enable- Count=1/8 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [70] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 25.000W
DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L0s <4us, L1 <64us
ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+
ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 8GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR+, OBFF Not Supported
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled
LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
Capabilities: [b0] MSI-X: Enable+ Count=9 Masked-
Vector table: BAR=0 offset=00003000
PBA: BAR=0 offset=00002000
Capabilities: [100 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
Capabilities: [148 v1] Device Serial Number 00-00-00-00-00-00-00-00
Capabilities: [158 v1] Power Budgeting <?>
Capabilities: [168 v1] #19
Capabilities: [188 v1] Latency Tolerance Reporting
Max snoop latency: 3145728ns
Max no snoop latency: 3145728ns
Capabilities: [190 v1] L1 PM Substates
L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
PortCommonModeRestoreTime=10us PortTPowerOnTime=10us
Kernel driver in use: nvme
Kernel modules: nvme


So my completey unfounded speculation at this point is that that "RAID/AHCI" BIOS switch is changing the mode of whatever chip that PCI bridge lives in for some reason... to aggregate attached NVMe devices as a RAID?? dunno.

your efforts to shed lights are welcome.

cheers,
HH

Profile

Matthew Garrett

About Matthew

Power management, mobile and firmware developer on Linux. Security developer at Google. Member of the Free Software Foundation board of directors. Ex-biologist. @mjg59 on Twitter. Content here should not be interpreted as the opinion of my employer.

Page Summary

Expand Cut Tags

No cut tags