mjg59 | What the fuck is an SBAT and why does everyone suddenly care

Short version: Secure Boot Advanced Targeting and if that's enough for you you can skip the rest you're welcome.

Long version: When UEFI Secure Boot was specified, everyone involved was, well, a touch naive. The basic security model of Secure Boot is that all the code that ends up running in a kernel-level privileged environment should be validated before execution - the firmware verifies the bootloader, the bootloader verifies the kernel, the kernel verifies any additional runtime loaded kernel code, and now we have a trusted environment to impose any other security policy we want. Obviously people might screw up, but the spec included a way to revoke any signed components that turned out not to be trustworthy: simply add the hash of the untrustworthy code to a variable, and then refuse to load anything with that hash even if it's signed with a trusted key.
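To make the hash-based revocation concrete, here is a minimal sketch. Everything in it is an illustrative assumption: the names `revoked_hashes`, `revoke` and `may_load` are hypothetical, and a plain SHA-256 over the file bytes stands in for the way real firmware hashes an image. It only shows the rule described above: a revoked hash wins even over a valid signature.

```python
import hashlib

# Hypothetical stand-in for the firmware revocation variable: a set of
# digests of binaries that must not run, even if validly signed.
revoked_hashes: set[str] = set()


def digest(binary: bytes) -> str:
    # Simplification: real firmware hashes the image in a more specific way.
    return hashlib.sha256(binary).hexdigest()


def revoke(binary: bytes) -> None:
    """Add one binary's hash to the revocation list."""
    revoked_hashes.add(digest(binary))


def may_load(binary: bytes, signed_by_trusted_key: bool) -> bool:
    """Refuse anything revoked, otherwise fall back to the signature check."""
    if digest(binary) in revoked_hashes:
        return False
    return signed_by_trusted_key


# Two made-up builds of the same vulnerable source have different hashes,
# so revoking one does nothing about the other.
build_a = b"bootloader built by distro A"
build_b = b"bootloader built by distro B"
revoke(build_a)
print(may_load(build_a, True))  # False
print(may_load(build_b, True))  # True
```

Note that each affected build needs its own entry in the list, which is the property that matters for the next paragraph.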

Unfortunately, as it turns out, scale. Every Linux distribution that works in the Secure Boot ecosystem generates their own bootloader binaries, and each of them has a different hash. If there's a vulnerability identified in the source code for said bootloader, there's a large number of different binaries that need to be revoked. And, well, the storage available to store the variable containing all these hashes is limited. There's simply not enough space to add a new set of hashes every time it turns out that grub (a bootloader initially written for a simpler time when there was no boot security and which has several separate image parsers and also a font parser and look you know where this is going) has another mechanism for a hostile actor to cause it to execute arbitrary code, so another solution was needed.

And that solution is SBAT. The general concept behind SBAT is pretty straightforward. Every important component in the boot chain declares a security generation that's incorporated into the signed binary. When a vulnerability is identified and fixed, that generation is incremented. An update can then be pushed that defines a minimum generation - boot components will look at the next item in the chain, compare its name and generation number to the ones stored in a firmware variable, and decide whether or not to execute it based on that. Instead of having to revoke a large number of individual hashes, it becomes possible to push one update that simply says "Any version of grub with a security generation below this number is considered untrustworthy".
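The generation check can be sketched in a few lines. This is a simplified illustration, not shim's actual code: it assumes both the component's declared metadata and the minimum levels are plain `name,generation` lines, and the function names `parse_levels` and `is_allowed` are hypothetical. The real format carries more fields than this.

```python
def parse_levels(text: str) -> dict[str, int]:
    """Parse 'name,generation' lines into a mapping (simplified format)."""
    levels = {}
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        # Only the first two fields matter here; any extras are ignored.
        name, generation = line.split(",")[:2]
        levels[name.strip()] = int(generation)
    return levels


def is_allowed(component_sbat: str, minimum_levels: str) -> bool:
    """Allow the component only if none of its generations is below a minimum."""
    declared = parse_levels(component_sbat)
    minimums = parse_levels(minimum_levels)
    return all(
        generation >= minimums[name]
        for name, generation in declared.items()
        if name in minimums
    )


# Made-up numbers: one update raises the minimum for every build of grub.
minimum = "grub,3"
print(is_allowed("grub,2", minimum))    # False: older security generation
print(is_allowed("grub,3", minimum))    # True: meets the minimum
print(is_allowed("kernel,1", minimum))  # True: no minimum set for this name
```

A single `grub,3` entry covers every distribution's build at once, where the hash approach needed one entry per binary.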

So why is this suddenly relevant? SBAT was developed collaboratively between the Linux community and Microsoft, and Microsoft chose to push a Windows update that told systems not to trust versions of grub with a security generation below a certain level. This was because those versions of grub had genuine security vulnerabilities that would allow an attacker to compromise the Windows secure boot chain, and we've seen real world examples of malware wanting to do that (BlackLotus did so using a vulnerability in the Windows bootloader, but a vulnerability in grub would be just as viable for this). Viewed purely from a security perspective, this was a legitimate thing to want to do.

(An aside: the "Something has gone seriously wrong" message that's associated with people having a bad time as a result of this update? That's a message from shim, not any Microsoft code. Shim pays attention to SBAT updates in order to avoid violating the security assumptions made by other bootloaders on the system, so even though it was Microsoft that pushed the SBAT update, it's the Linux bootloader that refuses to run old versions of grub as a result. This is absolutely working as intended)

The problem we've ended up in is that several Linux distributions had not shipped versions of grub with a newer security generation, and so those versions of grub are assumed to be insecure (it's worth noting that grub is signed by individual distributions, not Microsoft, so there's no externally introduced lag here). Microsoft's stated intention was that Windows Update would only apply the SBAT update to systems that were Windows-only, and any dual-boot setups would instead be left vulnerable to attack until the installed distro updated its grub and shipped an SBAT update itself. Unfortunately, as is now obvious, that didn't work as intended and at least some dual-boot setups applied the update and that distribution's Shim refused to boot that distribution's grub.

What's the summary? Microsoft (understandably) didn't want it to be possible to attack Windows by using a vulnerable version of grub that could be tricked into executing arbitrary code and then introduce a bootkit into the Windows kernel during boot. Microsoft did this by pushing a Windows Update that updated the SBAT variable to indicate that known-vulnerable versions of grub shouldn't be allowed to boot on those systems. The distribution-provided Shim first-stage bootloader read this variable, read the SBAT section from the installed copy of grub, realised these conflicted, and refused to boot grub with the "Something has gone seriously wrong" message. This update was not supposed to apply to dual-boot systems, but did anyway. Basically:

1) Microsoft applied an update to systems where that update shouldn't have been applied
2) Some Linux distros failed to update their grub code and SBAT security generation when exploitable security vulnerabilities were identified in grub

The outcome is that some people can't boot their systems. I think there's plenty of blame here. Microsoft should have done more testing to ensure that dual-boot setups could be identified accurately. But also distributions shipping signed bootloaders should make sure that they're updating those and updating the security generation to match, because otherwise they're shipping a vector that can be used to attack other operating systems and that's kind of a violation of the social contract around all of this.

It's unfortunate that the victims here are largely end users faced with a system that suddenly refuses to boot the OS they want to boot. That should never happen. I don't think asking arbitrary end users whether they want secure boot updates is likely to result in good outcomes. And while I vaguely tend towards UEFI Secure Boot not being something that benefits most end users, it's also a thing you really don't want to discover you want after the fact, so I have sympathy for it being default on. I do sympathise with Microsoft's choices here, other than the failed attempt to avoid the update on dual-boot systems.

Anyway. I was extremely involved in the implementation of this for Linux back in 2012 and wrote the first prototype of Shim (which is now a massively better bootloader maintained by a wider set of people and that I haven't touched in years), so if you want to blame an individual please do feel free to blame me. This is something that shouldn't have happened, and unless you're either Microsoft or a Linux distribution it's not your fault. I'm sorry.


From: (Anonymous)

Maybe a better way of handling this would have been to have two updates instead of one:

Update 1: Change SBAT policy to "warn". Then ask the user to press a key to continue if the security generation isn't matching the policy. This allows users to continue using their software and report this to the vendor and update etc.

Update 2: Change SBAT policy to "enforce".

And instead of having SBAT define a single minimal security generation it could have two levels, one for warning and one for enforcing.

From: (Anonymous)

The problem with all "allow anyway" options for any security settings is that most users don't care about security. They care that they have some task/job they want to get done, and they will answer "yes, bypass" to a big red "warning, answering yes will infect your computer with a virus that will steal all your money from your bank account" message in order to get on with their task/job. Security settings, for them to have any power at all to block malware, have to be default on and unable to be bypassed by the end user (because the end user *will* bypass them if they get in the way of whatever task/job they have to do right now).

From: (Anonymous)

It's their machine. Who are YOU to decide that their task/job today is less important than a theoretical security threat?

I think that at the "Update 1" (warning, "press a key to continue at your own risk") level the message should mention a future date certain (the "Update 2" date), upon which the system won't boot unless it has been updated. That makes punishing them by rendering their computer unbootable, on that date, less unreasonable than doing it with no warning.

Dave

From: (Anonymous)

P.S. -- however, that "Update 2" date needs to be way in the future, because this is a warning that won't be seen except when the system is booting, and many systems go many months without being rebooted.

Dave

From: (Anonymous)

Freedom is all well in good until your freedom effects others.

If your machine becomes part of a botnet that is used to infect or spam others because you couldn't be bothered to update, are you going to pay the millions in damages?

This is what happened with Windows XP. People had too many choices. Nobody was forced to run Windows Update or install service packs. I was inundated with customer computers for virus removal.

Like it or not, forced updates and security are to protect us from you.

From: (Anonymous)

Agree 100%. Updates & security are akin to vaccination.

From: (Anonymous)

Freedom is all well in good until your freedom effects others.

should be:

Freedom is all well AND good until your freedom effects others.

From: (Anonymous)

I just checked with another jurisdiction (Spelling Enforcement) and they'll get back to you on 'effects'.

From: (Anonymous)

You fail to explain how this is any problem at all. People should not be protected from self-harm at the expense of other people's choice, if they select for it. True for assisted suicide, true for security mechanisms. Educate but let people decide their own fate.

From: (Anonymous)

they DO currently have that freedom - by disabling secure boot.

if the user has it enabled, the OS should assume they want it to work as intended.

was there a better way to do this? absolutely, without question. but to suggest that this situation should not cause a failure to boot is not the answer here. (that was a double negative. to rephrase: this particular combination of circumstances _absolutely should_ cause a no-boot situation.)

From:

That makes this an undesirable end state, but it's fine as a temporary measure ensuring that, if there's something wrong with Microsoft's latest implementation (as apparently happened last time), they'll hear about it and have the chance to fix it before any users get locked out of their machines.

From: (Anonymous)

> and they will answer "yes, bypass" to a big red "warning,

Yes: but now it's their fault, not yours. Huge difference.


Profile

Matthew Garrett

About Matthew

Power management, mobile and firmware developer on Linux. Security developer at Aurora. Ex-biologist.

mjg59 on Twitter. Content here should not be interpreted as the opinion of my employer. Also on Mastodon.
