[personal profile] mjg59
The problem with Samsung laptops bricking themselves turned out to be down to the UEFI variable store becoming more than 50% full and Samsung's firmware being dreadful, but the trigger was us writing a crash dump to the NVRAM. That crash dump feature is still useful: I ended up using it today to help someone get a backtrace from a kernel oops during suspend, and realised that it's not been terribly well publicised, so here's how to use it.

First, make sure pstore is mounted. If you're on 3.9 or later then do:

mount -t pstore /sys/fs/pstore /sys/fs/pstore
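If you'd rather not mount it twice, a check along these lines may help. This is only a sketch: the function name pstore_mounted is made up for illustration, and it assumes the usual /proc/mounts layout where the third field is the filesystem type.

```shell
# Hypothetical helper (not part of any tool): succeeds if a pstore
# filesystem appears in the mount table.
# $1: mount table to read, defaults to /proc/mounts.
pstore_mounted() {
    # Field 3 of each /proc/mounts line is the filesystem type.
    awk '$3 == "pstore" { found = 1 } END { exit !found }' "${1:-/proc/mounts}"
}
```

With that defined, something like `pstore_mounted || mount -t pstore /sys/fs/pstore /sys/fs/pstore` only mounts when nothing is there yet.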

For earlier kernels you'll need to find somewhere else to mount it. If there's anything in there, delete it - we want to make sure there's enough space to save future dumps. Now reboot twice[1]. Next time you get a system crash that doesn't make it to the system logs, mount pstore again and (with luck) there'll be a bunch of files there. For tedious reasons these need to be assembled in reverse order (part 12 comes before part 11, and so on), but once that's done you should have a crash log. Report that, delete the files again and marvel at the benefits that technology has brought to your life.
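The reverse-order assembly can be sketched as a small shell function. The name assemble_pstore is hypothetical, and the sketch assumes the fragments are named like dmesg-<backend>-<number> with the trailing number growing with the part number; look at your own file names before trusting it.

```shell
# Hypothetical helper: print pstore dump fragments, highest-numbered first.
# Assumption: names look like dmesg-<backend>-<number>, and a larger
# trailing number means a later part.
# $1: directory holding the fragments, defaults to /sys/fs/pstore.
assemble_pstore() {
    dir=${1:-/sys/fs/pstore}
    (
        cd "$dir" || exit 1
        # Sort on the third dash-separated field, numerically, descending.
        ls | grep '^dmesg-' | sort -t- -k3,3nr | while IFS= read -r f; do
            cat "$f"
        done
    )
}
```

Running `assemble_pstore > crash.log` should then leave something you can attach to a bug report, assuming the naming matches.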

[1] UEFI implementations generally handle variable deletion by flagging the space as reclaimable rather than immediately making it available again. You need to reboot in order for the firmware to garbage collect it. Some firmware seems to require two reboot cycles to do this properly. Thanks, firmware.

Yeah, this is good stuff

Date: 2013-03-22 07:59 pm (UTC)
From: [identity profile] benanov.livejournal.com
It's sort of hard to debug a crash where you don't know what happened, so this is a good development.

We should start seeing a whole bunch of these bugs that were hard to work on before become surmountable.

