Since I wrote this, we've made some worthwhile progress on avoiding damaging Samsung hardware. The first is that the samsung-laptop driver appeared to be causing the firmware to attempt to write to an area of memory that was marked in the chipset, triggering a Machine Check Exception. That was what generated the pstore output that caused the problem originally. The driver now refuses to load if EFI is enabled, which avoids the problem. It's not ideal, since it's currently the only mechanism we have for certain functionality on Samsung laptops, but there you go.
The second problem was that avoiding crashing on boot didn't actually fix the problem in any fundamental way. Even with pstore disabled, it was possible for userspace to fill the nvram and trigger the same problem. Our first approach to this was to prevent any writes to nvram if the UEFI QueryVariableInfo() call reported that more than 50% of the nvram storage space would be used. That was safe, but led to another issue. The nvram storage area is typically implemented as part of the same flash chip as the firmware. Flash isn't arbitrarily accessible - changing the contents of a block typically involves rewriting the entire block. It's impractical to rewrite the entire nvram area on every write, so what actually happens is that deleting variables just results in them being marked as inactive but doesn't actually free up the space. The firmware can later perform some sort of garbage collection to free it up.
This caused us problems, since inactive space that hasn't been garbage collected yet isn't actually available, and as a result firmware implementations tend to count it as used. Say you had 64KB of nvram and wrote 32KB of variables. We'd then refuse to write any more, because doing so would leave less than 50% of the space free. So you delete 16KB of the variables you've created and try again. Unfortunately, the firmware still thinks that there's 32KB in use and Linux would still refuse.
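The 64KB example can be sketched as a toy model. This is only an illustration, assuming a store where deletion marks space inactive and the firmware reports inactive space as used; the `ToyNvram` class and `allowed_by_half_rule` helper are hypothetical names, not kernel or firmware code.

```python
KB = 1024


class ToyNvram:
    """Toy flash-backed variable store (hypothetical, not real firmware)."""

    def __init__(self, total):
        self.total = total
        self.active = {}    # name -> size of live variables
        self.inactive = 0   # bytes held by deleted but uncollected variables

    def reported_used(self):
        # Assumption: the firmware counts uncollected inactive space as used.
        return sum(self.active.values()) + self.inactive

    def write(self, name, size):
        # Overwriting marks the old copy inactive rather than freeing it.
        self.delete(name)
        self.active[name] = size

    def delete(self, name):
        # Deleting only marks the space inactive.
        self.inactive += self.active.pop(name, 0)

    def garbage_collect(self):
        self.inactive = 0


def allowed_by_half_rule(store, size):
    """First approach: refuse a write that would push reported use past 50%."""
    return store.reported_used() + size <= store.total // 2


store = ToyNvram(64 * KB)
store.write("a", 16 * KB)
store.write("b", 16 * KB)
print(allowed_by_half_rule(store, KB))  # False: 32KB is already in use
store.delete("a")
print(store.reported_used() // KB)      # 32: the deleted 16KB still counts
print(allowed_by_half_rule(store, KB))  # False: still refused
store.garbage_collect()
print(allowed_by_half_rule(store, KB))  # True once the space is reclaimed
```

In this model the delete changes nothing from the point of view of the check, which is the behaviour described above.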
If you were lucky, rebooting would trigger a garbage collection run. If you weren't, it wouldn't. Problematic. Our next approach was to try to account for the space actually actively used by the variables, rather than relying on what the firmware told us via QueryVariableInfo(). This seems simple enough - just add up the size of all the variables and subtract that from the overall size to determine how much of the "used" space is actually just old inactive variables that can be ignored. However, there are still some problems there. The first is that each variable has some additional overhead associated with it, and the size of that overhead varies depending on the system vendor. We had to make a conservative guess, which could cause problems if systems had large numbers of small variables. The second is that the only variables the kernel can see are those that are flagged as runtime-visible. There may also be a significant quantity of nvram used to store variables that are only visible in boot services code. We could work around this by adding up sizes while we're still in boot services code, but on some systems calling QueryVariableInfo() before ExitBootServices() results in later calls to GetNextVariableName() jumping to invalid addresses and crashing the kernel. Not a great approach.
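A rough sketch of that accounting, to show where both problems bite. The overhead value, the variable names and the helper functions are all made up for illustration; the only fixed assumption is that variable names are stored as NUL-terminated 16-bit characters.

```python
def active_bytes(variables, per_var_overhead):
    """Sum the space taken by (name, data_size) pairs.

    Names are counted as NUL-terminated 16-bit characters. The per-variable
    overhead is a guess, since it varies between vendors.
    """
    return sum(2 * (len(name) + 1) + size + per_var_overhead
               for name, size in variables)


def estimated_free(total, visible_vars, per_var_overhead):
    """Estimate free space from the variables the kernel can enumerate."""
    return total - active_bytes(visible_vars, per_var_overhead)


TOTAL = 64 * 1024
GUESSED_OVERHEAD = 64  # hypothetical conservative guess

# Problem 1: with many small variables, the guessed overhead dominates.
small_vars = [(f"V{i:03d}", 8) for i in range(100)]
print(active_bytes(small_vars, 0))                 # 1800 bytes of names and data
print(active_bytes(small_vars, GUESSED_OVERHEAD))  # 8200 bytes with the guess

# Problem 2: variables only visible to boot services code are never counted.
boot_only = [("BootOnly", 4096)]
seen = estimated_free(TOTAL, small_vars, GUESSED_OVERHEAD)
real = estimated_free(TOTAL, small_vars + boot_only, GUESSED_OVERHEAD)
print(seen - real)                                 # 4178 bytes overestimated
```

With 100 tiny variables, most of the estimate is the guess rather than the data, and anything the kernel can't enumerate makes the estimate too optimistic.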
Meanwhile, Samsung got back to us and let us know that their systems didn't require more than 5KB of nvram space to be available, which meant we could get rid of the 50% value and replace it with 5KB. The hope was that any system that booted with only 5KB of space available in nvram would trigger a garbage collection run. Unfortunately, it turned out that that wasn't true - some systems will only trigger garbage collection if the OS actually makes an attempt to write a variable that won't otherwise fit.
Hence this patch. The new approach is to ask the firmware how much space is available. If the size of the new variable would reduce this to less than 5KB, we attempt to create a variable bigger than the remaining space. This should cause the firmware to realise that it's out of room and either (depending on implementation) perform a garbage collection run at runtime or set a flag that will cause the system to perform garbage collection on the next reboot. We then call QueryVariableInfo() again to see whether a garbage collection run actually happened, and if so check whether we now have enough space. If so, we go ahead and write the variable. If not, we tell userspace that there's not enough space.
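The flow can be sketched against a toy firmware model. This is not the patch itself: `ToyFirmware` and `write_with_gc_nudge` are hypothetical names, and the model assumes the oversized write always fails and that its only side effect is garbage collection (immediately or at the next reboot).

```python
MIN_FREE = 5 * 1024  # the 5KB that Samsung said must stay available


class ToyFirmware:
    """Hypothetical stand-in for the firmware variable services."""

    def __init__(self, total, active, inactive, gc_at_runtime):
        self.total = total
        self.active = active        # bytes of live variables
        self.inactive = inactive    # deleted but not yet collected
        self.gc_at_runtime = gc_at_runtime
        self.gc_pending = False

    def query_remaining(self):
        # Plays the role of QueryVariableInfo(): inactive space counts as used.
        return self.total - self.active - self.inactive

    def set_variable(self, size):
        """Return True on success, False if the variable does not fit."""
        if size > self.query_remaining():
            if self.gc_at_runtime:
                self.inactive = 0       # collect now
            else:
                self.gc_pending = True  # collect on the next reboot
            return False                # simplification: the write still fails
        self.active += size
        return True

    def reboot(self):
        if self.gc_pending:
            self.inactive = 0
            self.gc_pending = False


def write_with_gc_nudge(fw, size):
    """Write a variable, nudging the firmware to collect if space is tight."""
    remaining = fw.query_remaining()
    if remaining - size < MIN_FREE:
        # Ask for more than can possibly fit so the firmware notices it is full.
        fw.set_variable(remaining + 1)
        # Ask again to see whether a collection run actually happened.
        remaining = fw.query_remaining()
        if remaining - size < MIN_FREE:
            return False  # tell userspace there is not enough space
    return fw.set_variable(size)


KB = 1024
fw = ToyFirmware(64 * KB, active=16 * KB, inactive=44 * KB, gc_at_runtime=True)
print(write_with_gc_nudge(fw, KB))  # True: collection ran at runtime

fw = ToyFirmware(64 * KB, active=16 * KB, inactive=44 * KB, gc_at_runtime=False)
print(write_with_gc_nudge(fw, KB))  # False: collection only flagged for reboot
fw.reboot()
print(write_with_gc_nudge(fw, KB))  # True after the reboot
```

The second case is the one where userspace sees a failure now but the same write can succeed after the next reboot.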
This seems to work in all the situations I've tested, and it should avoid a situation where a Samsung laptop can end up bricked. However, it's firmware, so who knows whether it's going to break things for someone else.
Yeep
Date: 2013-06-03 03:30 pm (UTC)
Re: Yeep
Date: 2013-06-03 06:45 pm (UTC)
I'm not sure if there is a solution to all this foolishness. The only thing I can think of is a blacklist/compliance check performed by the installer: "Your current firmware is not suited to run this software. Contact your vendor for an upgrade and try again". We'd be doing the manufacturer's quality control, as it were.
Ideally (at least for me), a Linux installer would be able to flash a known-good version of coreboot onto the motherboard before installing. However that would "define" an interface at a much lower level than even the old BIOS, and I shudder when thinking about the hackery required in Coreboot. At least hardware revisions come less frequently than firmware updates.
Re: Yeep
Date: 2013-06-04 11:09 am (UTC)
If the UEFI folks are going to go to all the time and trouble to define an API, they should also go to the time and trouble to define some minimum behaviours/standards that should be followed. i.e. a test suite that the OEMs can't manipulate.
The likes of Samsung should be getting publicly ridiculed much more than they are, and perhaps regulators should even get involved to declare these devices unfit for sale.
Re: Yeep
Date: 2013-06-05 04:53 pm (UTC)
They have. It's called "whatever is needed to get Windows to boot".
Were Samsung willing to issue a BIOS update?
Date: 2013-06-13 06:44 am (UTC)
Re: Were Samsung willing to issue a BIOS update?
Date: 2013-06-13 02:46 pm (UTC)
Re: Were Samsung willing to issue a BIOS update?
Date: 2013-06-13 09:22 pm (UTC)
Is there a recommended fedora or centos version to try? Kudos to everyone trying to get them to fix this problem!
Re: Were Samsung willing to issue a BIOS update?
Date: 2013-06-13 09:33 pm (UTC)
Re: Samsung P08RAN BIOS update and F19 TC3
My Setup: np510r5e-A01U8, original W8 factory load, both fast boot and secure boot OFF, using UEFI OS setting.
Once I get some sleep, I will try installing F19 TC3 DVD in an empty area I made on the HD. If you have any advice on things to watch out for, would be appreciated. Take care. Bitflip10
Re: Samsung P08RAN BIOS update and F19 TC3
Date: 2013-06-15 01:46 am (UTC)
The UEFI Bugzilla 873207 could be pretty confusing for some newer users but an F10 to switch OSs works for me.
I agree with Linus on Gnome 3.x; yuk. But I started on Slackware 2.0 way back when so I am jaded.
I will email you my contact info in case you would like some info or testing. Thank you for the F19 TC3 guidance and your efforts.
Bruce bitflip10
Re: Samsung P08RAN BIOS update and F19 TC3
Date: 2013-06-21 07:23 pm (UTC)
Samsung-module
Date: 2013-06-21 09:34 am (UTC)
Re: Samsung-module
Date: 2013-08-01 02:38 pm (UTC)
I installed F19 in UEFI mode, but it doesn't boot if I change it to CSM, I guess because there's no standard grub MBR? Are there any other pros/cons of CSM?
For now I will probably stick with UEFI, but if there's a compelling reason to change, I will. I'd really like these nice features to all work 100% though. Is there any way to force samsung-laptop to load? Or any way to check if the EFI bug has been resolved? My laptop is still under warranty so I'm not too worried about the possibility of a brick.
install ubuntu 12.04.02 in a samsung laptop
Date: 2013-09-24 06:54 pm (UTC)
I would like to install Ubuntu on a Samsung laptop, but now I'm not sure whether it's possible without the risk of breaking my laptop. I've used Ubuntu for 5 years and I don't want to use Windows, but I don't want to break my laptop...
What can I do?
The laptop is a Samsung np270e5e with UEFI and Windows 8 preinstalled.
Have you written a EFI to space?
Date: 2018-03-15 11:02 pm (UTC)
Re: Have you written a EFI to space?
Date: 2018-03-16 02:30 am (UTC)
Oh. Which Linux Live CD as of now would do it
Date: 2018-04-07 03:08 pm (UTC)
Do you have a known version of Linux you can recommend that has a live CD that you know reclaims the space on boot?
I'm trying to recover the space. I can't flash the higher UEFI version until the space is recovered. My IntensePC is a bit broken as a result. If I recover the space I might be able to get out of the current issues.
emurach comcast n
I tried the latest Ubuntu 18.04 daily build.
Date: 2018-04-13 01:41 am (UTC)
My firmware is a bit broken after a rollback and I can't re-flash forward until the space is freed up. The firmware capsule won't load, so it does not update on the automated reboot. I have tried DosFlash and ShellFlash64.efi. Using /patch /cvar does run, but that does nothing to reclaim the space; it just clears (deletes) the variables.
Could you help me out with a method to reclaim the variable space?