WebAuthn improves login security a lot by making it significantly harder for a user's credentials to be misused - a WebAuthn token will only respond to a challenge if it comes from the site the corresponding secret was issued to, and in general will only do so if the user provides proof of physical presence[1]. But giving people tokens is tedious, and also I have a new laptop which only has USB-C but does have a working fingerprint reader, and I hate the aesthetics of the Yubikey 5C Nano, so I've been thinking about what WebAuthn looks like done without extra hardware.
Let's talk about the broad set of problems first. For this to work you want to be able to generate a key in hardware (so it can't just be copied elsewhere if the machine is compromised), prove to a remote site that it was generated in hardware (so the remote site isn't confused about what security assertions you're making), and tie use of that key to the user being physically present (which may range from "I touched this object" to "I presented biometric evidence of identity"). What's important here is that a compromised OS shouldn't be able to just fake a response. For that to be possible, the chain from the proof of physical presence to the secret needs to be outside the control of the OS.
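To make the remote site's half of this concrete, here's a minimal sketch of what a relying party checks when it gets a response back - parsing the authenticator data structure WebAuthn defines (rpIdHash, flags byte, signature counter), and ignoring signature and attestation verification for brevity:

```python
import hashlib
import struct

# Flag bits defined by the WebAuthn spec for the flags byte in the
# authenticator data.
FLAG_UP = 0x01  # user present (e.g. touched the token)
FLAG_UV = 0x04  # user verified (e.g. matched a fingerprint or PIN)

def parse_authenticator_data(auth_data: bytes) -> dict:
    """Extract the fields a relying party cares about from the
    authenticator data (rpIdHash || flags || signCount || ...)."""
    rp_id_hash = auth_data[0:32]
    flags = auth_data[32]
    sign_count = struct.unpack(">I", auth_data[33:37])[0]
    return {
        "rp_id_hash": rp_id_hash,
        "user_present": bool(flags & FLAG_UP),
        "user_verified": bool(flags & FLAG_UV),
        "sign_count": sign_count,
    }

def check_assertion(auth_data: bytes, rp_id: str, require_uv: bool) -> None:
    parsed = parse_authenticator_data(auth_data)
    # rpIdHash binds the response to this site: a challenge relayed via
    # a phishing domain produces a response that fails this check.
    if parsed["rp_id_hash"] != hashlib.sha256(rp_id.encode()).digest():
        raise ValueError("response was generated for a different site")
    if not parsed["user_present"]:
        raise ValueError("no proof of physical presence")
    if require_uv and not parsed["user_verified"]:
        raise ValueError("user was not verified")
```

Note that the UP and UV flags are only worth anything to the extent that the authenticator sets them outside the OS's control - which is exactly what the rest of this post is about.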
For a physical security token like a Yubikey, this is pretty easy. The communication protocol involves the OS passing a challenge and the source of the challenge to the token. The token then waits for a physical touch, verifies that the source of the challenge corresponds to the secret it's being asked to respond to the challenge with, and provides a response. At the point where keys are being enrolled, the token can generate a signed attestation that it generated the key, and a remote site can then conclude that this key is legitimately sequestered away from the OS. This all takes place outside the control of the OS, meeting all the goals described above.
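As an illustration, here's a toy model of that flow. Deriving per-site keys from a device secret is one common way non-resident credentials are implemented, but real tokens use asymmetric keys and vendor-specific designs; everything below is simulated rather than a real token API:

```python
import hashlib
import hmac
import struct

# Simulated token internals. DEVICE_SECRET stands in for a secret burned
# into the hardware at manufacture; on a real token it never leaves the chip.
DEVICE_SECRET = hashlib.sha256(b"demo device secret").digest()

def wait_for_touch() -> None:
    # On real hardware this blocks until a capacitive touch; simulated here.
    input("touch the token (press enter): ")

def derive_credential_key(rp_id: str) -> bytes:
    # One common design for non-resident credentials: derive the per-site
    # key from the device secret and the relying party ID. A challenge
    # forwarded by a phishing site names the wrong rp_id, derives the
    # wrong key, and produces a signature that fails to verify.
    return hmac.new(DEVICE_SECRET, rp_id.encode(), hashlib.sha256).digest()

def get_assertion(rp_id: str, client_data_hash: bytes) -> bytes:
    wait_for_touch()  # proof of presence happens on-token; the OS can't skip it
    # rpIdHash || flags (UP set) || signature counter
    auth_data = (hashlib.sha256(rp_id.encode()).digest()
                 + bytes([0x01]) + struct.pack(">I", 1))
    key = derive_credential_key(rp_id)
    # Real tokens sign with an asymmetric key; an HMAC stands in here so
    # the sketch stays dependency-free.
    return auth_data + hmac.new(key, auth_data + client_data_hash,
                                hashlib.sha256).digest()
```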
How about Macs? The easiest approach here is to make use of the secure enclave and TouchID. The secure enclave is a separate piece of hardware built into either a support chip (for x86-based Macs) or directly onto the SoC (for ARM-based Macs). It's capable of generating keys, and also of producing attestations that a given key was generated on an Apple secure enclave ("Apple Anonymous Attestation", which has the interesting property of attesting that the key was generated on Apple hardware, but not which Apple hardware, avoiding a lot of privacy concerns). These keys can have an associated policy that says they're only usable if the user provides a legitimate touch on the fingerprint sensor, which means the enclave can not only assert physical presence of a user, it can assert physical presence of an authorised user. Communication between the fingerprint sensor and the secure enclave takes place over a private channel that the OS can't meaningfully interfere with, which means even a compromised OS can't fake physical presence responses (eg, the OS can't record a legitimate fingerprint press and then send it to the secure enclave again in order to mimic the user being present - the secure enclave requires that each response from the fingerprint sensor be unique). This achieves our goals.
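Apple's actual sensor-to-enclave protocol isn't public, but the general shape of that replay protection is a fresh nonce per request, authenticated with a key shared only by the sensor and the enclave. A toy model, with invented names:

```python
import hashlib
import hmac
import os

# Invented pairing key: on real hardware this would be provisioned into
# the sensor and the enclave at the factory and never touch the OS.
PAIRING_KEY = hashlib.sha256(b"factory pairing key").digest()

class FingerprintSensor:
    def scan(self, nonce: bytes) -> bytes:
        # Real hardware only emits this after an actual fingerprint match.
        return hmac.new(PAIRING_KEY, nonce + b"match", hashlib.sha256).digest()

class SecureEnclave:
    def user_is_present(self, sensor: FingerprintSensor) -> bool:
        nonce = os.urandom(16)  # fresh per request
        response = sensor.scan(nonce)
        expected = hmac.new(PAIRING_KEY, nonce + b"match",
                            hashlib.sha256).digest()
        # A recording of an earlier scan embeds an old nonce, so replaying
        # it through a compromised OS fails this comparison.
        return hmac.compare_digest(response, expected)

# Usage: SecureEnclave().user_is_present(FingerprintSensor()) -> True
```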
The PC space is more complicated. In the Mac case, communication between the biometric sensors (be that TouchID or FaceID) and the secure enclave occurs over a controlled channel where all the hardware involved knows how to talk to the other hardware. In the PC case, the typical location where we'd store secrets is the TPM, but TPMs conform to a standardised spec that has no understanding of this sort of communication, and biometric components on PCs have no way to communicate with the TPM other than via the OS. We can generate keys in the TPM, and the TPM can attest to those keys being TPM-generated, which means an attacker can't exfiltrate those secrets and mimic the user's token on another machine. But in the absence of any explicit binding between the TPM and the physical presence indicator, the association is left to code running on the CPU. If that code is in the OS, an attacker who compromises the OS can simply ask the TPM to respond to any challenge they want, skipping the biometric validation entirely.
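Here's that problem in miniature, with stand-in functions rather than real TPM or driver APIs: the presence check and the signing operation are only connected by ordinary OS code, so an attacker with OS-level code execution just calls the second half directly.

```python
import hashlib

def fingerprint_matches() -> bool:
    # Stand-in for an OS driver talking to a fingerprint reader.
    return False

def tpm_sign(challenge: bytes) -> bytes:
    # Stand-in for a TPM signing operation: the private key never leaves
    # the chip, but the chip has no idea whether a fingerprint was checked.
    return hashlib.sha256(b"tpm-resident key" + challenge).digest()

def webauthn_respond(challenge: bytes) -> bytes:
    if not fingerprint_matches():  # enforced purely in OS software...
        raise PermissionError("no user present")
    return tpm_sign(challenge)

def attacker_respond(challenge: bytes) -> bytes:
    # ...so an attacker who owns the OS skips the check and asks the
    # TPM directly.
    return tpm_sign(challenge)
```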
Windows solves this problem in an interesting way. Windows Hello Enhanced Sign-in Security doesn't add new hardware, but relies on virtualisation. The agent that handles WebAuthn responses isn't running in the OS - it's running in another VM that's entirely isolated from the OS. Hardware that supports this model has a mechanism for proving its identity to that local code (eg, fingerprint readers that support this can sign their responses with a key whose certificate chains back to Microsoft). Additionally, the secrets associated with the TPM can be held in this VM rather than in the OS, meaning that the OS can't use them directly. This gives us a flow where a browser asks for a WebAuthn response, that request is passed to the VM, the VM asks the biometric device for proof of user presence (including some sort of random value to prevent the OS just replaying an old response), receives it, and then asks the TPM to generate a response to the challenge. Compromising the OS doesn't give you the ability to forge the responses between the biometric device and the VM, and doesn't give you access to the secrets in the TPM, so again we meet all our goals.
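A sketch of that flow, with invented names standing in for Microsoft's actual protocol: the VM checks the sensor's attested identity, binds the presence proof to a fresh nonce, and holds the TPM authorisation material itself.

```python
import hashlib
import hmac
import os

# All names invented: a toy model of the flow, not Microsoft's protocol.
SENSOR_KEY = hashlib.sha256(b"sensor signing key").digest()
VM_HELD_SECRET = hashlib.sha256(b"TPM authorisation held in the VM").digest()

class AttestedSensor:
    # Real readers prove their identity with a certificate chaining back
    # to Microsoft; modelled here as a simple boolean.
    certified = True

    def prove_presence(self, nonce: bytes) -> bytes:
        # Only produced after a genuine biometric match on real hardware.
        return hmac.new(SENSOR_KEY, nonce, hashlib.sha256).digest()

class IsolatedVM:
    """Stands in for the VM, isolated from the OS, that brokers WebAuthn."""

    def handle_webauthn(self, sensor: AttestedSensor,
                        challenge: bytes) -> bytes:
        if not sensor.certified:
            raise PermissionError("unattested biometric device")
        nonce = os.urandom(16)  # fresh nonce: the OS can't replay old scans
        proof = sensor.prove_presence(nonce)
        expected = hmac.new(SENSOR_KEY, nonce, hashlib.sha256).digest()
        if not hmac.compare_digest(proof, expected):
            raise PermissionError("no verified user presence")
        # The TPM authorisation value lives in this VM, not in the OS, so
        # only this code path can get the TPM to respond to the challenge.
        return hmac.new(VM_HELD_SECRET, challenge, hashlib.sha256).digest()
```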
On Linux (and other free OSes), things are less good. Projects like tpm-fido generate keys on the TPM, but there's no secure channel between that code and whatever's providing proof of physical presence. An attacker who compromises the OS may not be able to copy the keys to their own system, but while they're on the compromised system they can respond to as many challenges as they like. That's not the same security assertion we have in the other cases.
Overall, Apple's approach is the simplest - having binding between the various hardware components involved means you can just ignore the OS entirely. Windows doesn't have the luxury of having as much control over what the hardware landscape looks like, so has to rely on virtualisation to provide a security barrier against a compromised OS. And in Linux land, we're fucked. Who do I have to pay to write a lightweight hypervisor that runs on commodity hardware and provides an environment where we can run this sort of code?
[1] As I discussed recently, there are scenarios where these assertions are less strong, but even so
Could possibly reuse something from Android AVF?
Date: 2022-12-10 01:39 pm (UTC)
Google is already working on a KVM-based protected hypervisor for Android: https://lwn.net/Articles/836693/. It should be possible to make it work on generic ARM Linux with some modifications. It's currently ARM-only but the basic design is applicable to x86 too (and it's also likely that Google will eventually port it to x86 for Chrome OS if it works out).
The first feature to make use of this on Android will be compiling DEX bytecode of signed system apps: https://source.android.com/docs/core/virtualization/usecases#isolated-compilation
no subject
Date: 2022-12-10 08:55 pm (UTC)
Are these guarantees and associated threat models documented somewhere? The "a compromised OS shouldn't be able to just fake a response" model discussed here seems stronger than my personal expectations of WebAuthn. After all, a compromised OS or browser can already break (a), so I am not sure why I would expect it to be unable to break (b).
no subject
Date: 2022-12-11 02:19 am (UTC)
This is really interesting, thanks. I didn't realize that Apple hardware had the dedicated channel.
Why does Windows Hello get involved with some Yubikey auth? (It occurs to me that I've never paid attention to which Yubikey logins interact with Hello; is it all the FIDO ones? It definitely doesn't give me a Windows Hello window for Yubikey OTP.) I expect Hello when I do fingerprint sign-in, but is Hello also acting as a virtualization layer between the Yubikey and the OS? Is that just the API for "I'm talking to hardware auth" now?
no subject
Date: 2022-12-11 02:57 am (UTC)
no subject
Date: 2022-12-11 03:49 am (UTC)
Huh, that makes sense. It makes sense to have the layer anyway. Thanks.
How meaningful are the replay protections?
Date: 2022-12-11 07:57 pm (UTC)
In practice I would expect that the only protection this provides is requiring the user to interact with the machine at least once after it was compromised.
Relevance of physical presence
Date: 2022-12-14 06:16 pm (UTC)
Mere tapping gives a false sense of security: the user still doesn't see what they sign, just that they're signing something right now. It could be the mail's signature, or it could be the enrollment of a second key. End user verification requires a trusted channel to display something to the user, and that mostly needs a full display (or a Betrusted / Precursor). For most applications that's overkill, and if it is, I think touch-once-to-sign is too (especially if it incurs complexity cost).
Impossible with Free Software
Date: 2022-12-18 01:03 am (UTC)
And even if you do find some contorted way to do this in software under a Free Software license while still preventing me from faking the attestation (probably involving some preexisting proprietary hardware like the TPM), IMHO, the moment you do this, your software is no longer Free Software, no matter the license.
See also https://www.gnu.org/philosophy/right-to-read.en.html – even the workaround mentioned there (lending the entire computer and sharing the password) would not be possible with hardware attestation enforcing biometric authentication. So this can project us into an even worse dystopia than what RMS had imagined back in 1997.
I find it really sad that Free Software developers are working on "security" concepts that are inherently incompatible with software freedom and hence threaten our freedom.
Virtualization-Based Security on Linux
Date: 2025-03-18 08:10 pm (UTC)
You might be aware of this already, but it looks like Microsoft (https://github.com/heki-linux) and Amazon (https://github.com/vianpl/linux/tree/vsm/next) are doing it on their own: there are two projects to implement the Hyper-V VSM spec that powers Windows Virtualization Based Security (VBS), which in turn enables Credential Guard and Windows Hello Enhanced Sign-In Security (ESS).
https://www.youtube.com/watch?v=QY7lg6aMaWc https://www.youtube.com/watch?v=IKUJTTZ7Vv0
I wrote a little about them here too: https://fosstodon.org/@iinuwa/114185140764840129