[personal profile] mjg59
Many people still install Linux from CDs. But a growing number install from USB. In an ideal world you'd be able to download one image that would let you do either, but it turns out that that's quite difficult. Shockingly enough, it's another situation where the system firmware exists to make your life difficult.

Booting a hard drive is pretty easy. The BIOS reads the first 512 bytes off the drive, copies them to RAM and executes them. That code is then responsible for either starting your bootloader or identifying the currently active partition and jumping to its boot sector, but before too long you're in a happy place where you're executing whatever you want to. Life is good. So you'd think that CDs would work in a similar way. The ISO 9660 format even leaves a whole 32KB at the start of a filesystem, which is enough space for a pretty awesome bootloader. But no. This is not how CDs work. That would be far too easy.
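To make the "first 512 bytes" a little more concrete, here's a minimal sketch of what a boot sector looks like to code that wants to find the active partition. It assumes the conventional MBR layout (four 16-byte partition entries at byte 446, a 0x55 0xAA signature at byte 510); the function names and the demo sector are made up for illustration and aren't taken from any real bootloader.

```python
import struct

SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xaa"
PARTITION_TABLE_OFFSET = 446  # four 16-byte entries sit just before the signature


def parse_mbr(sector):
    """Return the four primary partition entries from a 512-byte boot sector."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("a boot sector is exactly 512 bytes")
    if sector[510:512] != BOOT_SIGNATURE:
        raise ValueError("missing 0x55AA boot signature")
    entries = []
    for i in range(4):
        offset = PARTITION_TABLE_OFFSET + 16 * i
        # status(1) CHS start(3) type(1) CHS end(3) LBA start(4) sector count(4)
        status, ptype, lba_start, count = struct.unpack_from("<B3xB3xII", sector, offset)
        entries.append(
            {"active": status == 0x80, "type": ptype, "lba_start": lba_start, "sectors": count}
        )
    return entries


def active_partition(entries):
    """Return the first entry flagged active, or None if there isn't one."""
    for entry in entries:
        if entry["active"]:
            return entry
    return None


def make_demo_mbr():
    """Build a hypothetical boot sector with one active Linux partition."""
    sector = bytearray(SECTOR_SIZE)
    struct.pack_into("<B3xB3xII", sector, PARTITION_TABLE_OFFSET, 0x80, 0x83, 2048, 4096)
    sector[510:512] = BOOT_SIGNATURE
    return bytes(sector)


if __name__ == "__main__":
    print(active_partition(parse_mbr(make_demo_mbr())))
```

The first 446 bytes are where the actual boot code lives, which is why there's so little room to be clever in there.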

Let's imagine we're back in the 90s. People want to be able to boot off CD without needing a boot floppy to do so. And you're a PC vendor with a BIOS that's been lovingly[1] forced into a tiny piece of flash and which has to execute out of an almost as tiny piece of RAM if you want your users to be able to play any games. Letting boot code read arbitrary content off the CD would mean adding a new set of interrupt hooks, and that's going to be even more complicated because CDs have a sector size of 2K while hard drives are 512 bytes[2] and who's going to pay to implement this and for the extra flash and RAM and look surely there has to be another way?

So, of course, another way was found. The El Torito specification defines a way for shoving a reference to some linear blocks into the ISO 9660 header. The BIOS reads those blocks into memory and then redirects either the floppy or hard drive access interrupts (depending on the El Torito type) to that region. The boot code can then proceed as if it had been read off a floppy without all the trouble of actually putting a floppy in the machine, and the extra code required in the system BIOS is minimal.
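The "reference shoved into the ISO 9660 header" can be sketched roughly as follows. This assumes my reading of the El Torito layout: a boot record volume descriptor (type 0) carrying the identifier "EL TORITO SPECIFICATION", with a little-endian pointer to the boot catalog at byte 0x47. The helper names and the demo image are hypothetical, and the scan is deliberately naive.

```python
import struct

ISO_SECTOR = 2048
EL_TORITO_ID = b"EL TORITO SPECIFICATION"


def find_boot_catalog(image):
    """Return the sector number of the El Torito boot catalog, or None."""
    sector = 16  # volume descriptors start after the 32KB system area
    while (sector + 1) * ISO_SECTOR <= len(image):
        desc = image[sector * ISO_SECTOR : (sector + 1) * ISO_SECTOR]
        if desc[1:6] != b"CD001":
            return None  # not an ISO 9660 volume descriptor
        if desc[0] == 255:
            return None  # descriptor set terminator, no boot record found
        if desc[0] == 0 and desc[7:39].rstrip(b"\x00") == EL_TORITO_ID:
            (catalog_sector,) = struct.unpack_from("<I", desc, 0x47)
            return catalog_sector
        sector += 1
    return None


def make_demo_image(catalog_sector=20):
    """Build a hypothetical, mostly empty image with just enough descriptors."""
    image = bytearray(19 * ISO_SECTOR)

    def put(sector, dtype, body=b""):
        base = sector * ISO_SECTOR
        image[base : base + 7] = bytes([dtype]) + b"CD001" + b"\x01"
        image[base + 7 : base + 7 + len(body)] = body

    put(16, 1)                 # primary volume descriptor
    put(17, 0, EL_TORITO_ID)   # boot record
    struct.pack_into("<I", image, 17 * ISO_SECTOR + 0x47, catalog_sector)
    put(18, 255)               # terminator
    return bytes(image)


if __name__ == "__main__":
    print(find_boot_catalog(make_demo_image()))
```

Everything interesting then hangs off the boot catalog, which is where the firmware finds the blocks it's going to pretend are a floppy or a hard drive.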

USB sticks, however, are treated as hard drives. The BIOS won't look for El Torito images on them. Instead, it'll try to execute a boot sector. That isn't there on a CD image. Sigh.

A few years ago a piece of software called isohybrid popped up and solved this problem nicely. isohybrid is a companion to isolinux, which itself is a bootloader that fits into an El Torito image and can then load your kernel and installer from CD. isohybrid takes an ISO image, adds an x86 boot sector and partition table and does some more fiddling to turn a valid ISO image into one that can be copied directly onto a USB stick and booted. The world suddenly becomes a better place.

But that's BIOS. EFI makes this easier, right? Right?

No. EFI does not make this easier.

Despite EFI being a modern firmware for the modern world[3], EFI implementations are not required to be able to understand ISO 9660. In fact, I've never seen one that does. FAT is all the spec requires, and FAT is typically all you get. Nor will EFI just execute some arbitrary boot code from the start of the CD. So, how does EFI boot off CD?

El Torito. Obviously.

It's not quite as bad as it sounds, merely almost as bad as it sounds. While the typical way of using El Torito for a long time was to use floppy or hard drive emulation, it also supports a "No emulation" mode. It also supports setting a type flag for your media, which means you can distinguish between images intended for BIOS booting and EFI booting. But the fact remains that your CD has to include an embedded FAT partition that then contains a bootloader that's able to read ISO 9660 because your firmware is too inept to handle that itself[4].
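Here's a rough sketch of how the type flag and emulation mode show up in a boot catalog. It assumes the usual El Torito encoding as I understand it: 32-byte entries, a validation entry carrying the platform ID for the default image, section headers (0x90, or 0x91 for the last one) carrying the platform ID for later images, and 0xEF as the platform ID for EFI. The function names and the demo catalog are invented, and the sketch skips checksum validation entirely.

```python
import struct

ENTRY_SIZE = 32
PLATFORMS = {0x00: "x86 BIOS", 0x01: "PowerPC", 0x02: "Mac", 0xEF: "EFI"}
MEDIA = {0: "no emulation", 1: "1.2MB floppy", 2: "1.44MB floppy",
         3: "2.88MB floppy", 4: "hard disk"}


def list_boot_images(catalog):
    """Return (platform, emulation, load_sector) for each bootable entry."""
    if catalog[0] != 0x01 or catalog[30:32] != b"\x55\xaa":
        raise ValueError("not an El Torito validation entry")
    platform = catalog[1]
    images = []
    remaining = 1        # the validation entry is followed by one default entry
    last_header = False
    offset = ENTRY_SIZE
    while offset + ENTRY_SIZE <= len(catalog):
        entry = catalog[offset : offset + ENTRY_SIZE]
        offset += ENTRY_SIZE
        if remaining > 0:
            remaining -= 1
            if entry[0] == 0x88:  # 0x88 marks a bootable entry
                (load_sector,) = struct.unpack_from("<I", entry, 8)
                images.append((PLATFORMS.get(platform, hex(platform)),
                               MEDIA.get(entry[1] & 0x0F, "unknown"),
                               load_sector))
            continue
        if last_header or entry[0] not in (0x90, 0x91):
            break
        platform = entry[1]
        (remaining,) = struct.unpack_from("<H", entry, 2)
        last_header = entry[0] == 0x91
    return images


def make_demo_catalog():
    """Hypothetical catalog: one BIOS image, one EFI image, both no-emulation."""
    catalog = bytearray(2048)
    catalog[0:2] = bytes([0x01, 0x00])
    catalog[30:32] = b"\x55\xaa"
    struct.pack_into("<BBHBBHI", catalog, 32, 0x88, 0, 0, 0, 0, 4, 30)
    struct.pack_into("<BBH", catalog, 64, 0x91, 0xEF, 1)
    struct.pack_into("<BBHBBHI", catalog, 96, 0x88, 0, 0, 0, 0, 2880, 40)
    return bytes(catalog)


if __name__ == "__main__":
    for image in list_boot_images(make_demo_catalog()):
        print(image)
```

The EFI entry's load sector is where the embedded FAT image starts, which is all the firmware cares about.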

How about USB sticks? Thankfully, booting these on EFI doesn't require any boot sectors at all. Instead you just have to have a partition table, a FAT partition and a bootloader in a well known location in that FAT partition. The required partition is, in fact, identical to the one you need in an El Torito image. And so this is where we start introducing some extra hacks.

Like I said earlier, isohybrid fakes up an MBR and adds some boot code that points at the actual bootloader. It needs to do a little more on EFI. The first problem is that the isohybrid MBR partition has to cover the entire ISO 9660 filesystem on the USB stick so that the operating system can access it later, but the El Torito FAT image is inside that partition. A lot of MBR-based code becomes very unhappy if you try to set up a partition that's a subset of another partition. So we can't really use MBR. On to GPT.

GPT, or the GUID Partition Table, is the EFI era's replacement for MBR partitions. It has two main advantages over MBR - firstly it can cover partitions larger than 2TB without having to increase sector size, and secondly it doesn't have the primary/logical partition horror that still makes MBR more difficult than it has any right to be. The format is pretty simple - you have a header block 1 logical block into the media (so 512 bytes on a typical USB stick), and then a pointer to a list of partitions. There's then a secondary header one block from the end of the disk, which points at a backup copy of the partition list. Each header carries two CRCs, one over the header itself and one over the partition list, so corruption of either can be detected. It turns out to be a relatively straightforward modification of isohybrid to get it to look for a secondary EFI image and construct a GPT entry pointing at it. This works surprisingly well, and media prepared this way will boot EFI machines if burned to a CD or written to a USB stick.
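For the curious, reading a GPT header and checking it looks roughly like this. It's a sketch that assumes the 92-byte header layout from the UEFI spec and that the CRCs are ordinary CRC-32 (what `zlib.crc32` computes), with the header CRC calculated while its own field is zeroed. The function names and the demo disk are hypothetical and this is not isohybrid's code.

```python
import struct
import zlib

# 92-byte GPT header, little-endian: signature, revision, header size,
# header CRC, reserved, my LBA, backup LBA, first/last usable LBA,
# disk GUID, partition list LBA, entry count, entry size, partition list CRC.
GPT_HEADER = struct.Struct("<8sIIIIQQQQ16sQIII")


def parse_gpt_header(disk, sector_size=512):
    """Validate the primary GPT header at LBA 1 and return where the list lives."""
    base = sector_size
    (signature, _rev, header_size, header_crc, _reserved, _my_lba, backup_lba,
     _first, _last, _guid, entries_lba, num_entries, entry_size,
     entries_crc) = GPT_HEADER.unpack_from(disk, base)
    if signature != b"EFI PART":
        raise ValueError("no GPT signature at LBA 1")
    header = bytearray(disk[base : base + header_size])
    header[16:20] = bytes(4)  # the CRC is computed with its own field zeroed
    if zlib.crc32(bytes(header)) != header_crc:
        raise ValueError("GPT header CRC mismatch")
    start = entries_lba * sector_size
    table = disk[start : start + num_entries * entry_size]
    if zlib.crc32(table) != entries_crc:
        raise ValueError("GPT partition list CRC mismatch")
    return {"backup_lba": backup_lba, "entries_lba": entries_lba,
            "num_entries": num_entries, "entry_size": entry_size}


def make_demo_disk(entries_lba=2, sector_size=512):
    """Build a hypothetical 64-sector disk with an empty 128-entry GPT."""
    disk = bytearray(64 * sector_size)
    table = bytes(128 * 128)  # all-zero partition list
    fields = [b"EFI PART", 0x00010000, GPT_HEADER.size, 0, 0, 1, 63, 34, 62,
              bytes(16), entries_lba, 128, 128, zlib.crc32(table)]
    fields[3] = zlib.crc32(GPT_HEADER.pack(*fields))
    disk[sector_size : sector_size + GPT_HEADER.size] = GPT_HEADER.pack(*fields)
    return bytes(disk)


if __name__ == "__main__":
    print(parse_gpt_header(make_demo_disk()))
```

Note that the location of the partition list is just another field in the header, so nothing forces it to sit immediately after the header.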

There's a few quirks. Macs will show two boot icons for these CDs[6], one marked "EFI Boot" and one helpfully marked "Windows"[7], with the latter booting the BIOS El Torito image. That's a little irritating, but not insurmountable. The other issue is that older Macs won't look for boot loaders in the legacy locations. This is where things start getting horrible.

Back in the old days, Apple boot media used to have a special "blessed" folder. Attempting to boot would involve the firmware looking for such a folder and then using that to start itself up. Any folder in the filesystem could be blessed. Modern hardware doesn't use boot folders, but does use boot files. For an HFS+ filesystem, the inode of the bootloader is written to a specific offset in the filesystem superblock and the firmware simply finds that inode and executes it. And this appears to be all that older Macs support.

So, having written a small tool to bless an HFS+ partition, I tried the obvious first step of burning a CD with three El Torito images (one BIOS, one FAT, one HFS+). It failed. While Refit could see the bootloader in the HFS+ image, the firmware appeared to have no interest at all in booting off it. Yet Apple install media would boot. What was the difference?

The difference, obviously, was that these earlier Macs don't appear to support El Torito booting. The Apple install media contained an Apple partition map.

The Apple partition map (APM) is Apple's legacy partition table format. Apple mostly dropped it when they went to x86, where it's retained for two purposes. The first is for drives that need to be shared between Intel Macs and PPC ones. The second seems to be for their install DVDs. Some further playing revealed that burning a CD with an APM entry pointing at the HFS+ filesystem on the CD gave me a boot icon. Problem solved?

Not really. Remember how I earlier mentioned that ISO 9660 leaves 32KB at the start of the image, and that an isohybrid image then writes an MBR and boot sector in the first 512 bytes of that, and the GPT header starts 512 bytes into a drive? That means that it's easy to produce an ISO that has a boot sector, an MBR partition table and a GPT. None of them overlap. APM, on the other hand, has a header that's located at byte 0 of the media, overlapping with the boot sector. And it has a partition listing that's located at sector 1, overlapping with the GPT. Is all lost?

No. Merely sanity.

The first thing to remember is that the boot sector is just raw assembler. It's a byte stream that's executed by the CPU. And there's a lot of things you can tell a CPU to do that result in nothing happening. Peter Jones pointed out that the only bits of the APM header you actually need are the letters "ER", followed by the sector size as a two byte big endian integer. These disassemble to harmless instructions, so we can simply move the boot code down a little and stick these at the beginning. A PC that executes it will read straight through the bizarre (but harmless) Apple bytes and then execute the real boot code.
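Here's a tiny sketch of those four bytes for a 2K sector size. The byte values follow directly from the description above; the instruction names in the comments are my own reading of the 16-bit x86 opcode tables rather than the output of a real disassembler, and the function name is made up.

```python
import struct


def apm_header_prefix(sector_size):
    """The four bytes described above: "ER" plus a big-endian 16-bit sector size."""
    return b"ER" + struct.pack(">H", sector_size)


# How a CPU in 16-bit real mode would plausibly decode the 2048-byte case
# (an assumption based on the opcode tables, not verified with a disassembler):
#   45      inc bp
#   52      push dx
#   08 00   or [bx+si], al
prefix = apm_header_prefix(2048)
print(prefix.hex(" "))  # 45 52 08 00
```

None of those do anything the real boot code cares about, so execution just falls through to whatever comes next.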

The second thing that's important here is that we were just given the opportunity to specify the sector size. The GPT is only relevant when the image is written to a USB stick, so assumes a sector size of 512 bytes. So when the GPT starts one sector into the drive, it's actually starting 512 bytes into the drive. APM also starts one sector into the drive, but we can simply put a different sector size into the header and suddenly we're able to choose where that's going to be. 2K seems like a good choice, and so the firmware will now look for the header at byte 2048.

That's still in the middle of the GPT partition listing, though. Except we can avoid that as well. GPT lets you specify where the partition listing starts and doesn't require it to be immediately after the header. So we can offset the partition listing to, say, byte 8192 and leave a hole for the Apple partition map.
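The arithmetic is easy to check by hand, but here's a sketch that does it anyway. It assumes a 92-byte GPT header, 128 GPT entries of 128 bytes each, and an Apple partition map with three entries that each take up one block of the declared sector size. The entry counts are illustrative guesses rather than anything isohybrid mandates, and the function names are made up.

```python
from itertools import combinations

SYSTEM_AREA = 32 * 1024  # space ISO 9660 leaves free at the start of the image


def build_layout(apm_block_size=2048, apm_entries=3, gpt_entries_offset=8192,
                 gpt_entry_count=128, gpt_entry_size=128):
    """Return (name, start, end) byte ranges for each structure, end exclusive."""
    return [
        ("boot sector / MBR", 0, 512),
        ("GPT header", 512, 512 + 92),
        ("APM partition map", apm_block_size, apm_block_size * (1 + apm_entries)),
        ("GPT partition entries", gpt_entries_offset,
         gpt_entries_offset + gpt_entry_count * gpt_entry_size),
    ]


def find_overlaps(regions):
    """Return the names of every pair of regions whose byte ranges intersect."""
    return [(a[0], b[0]) for a, b in combinations(regions, 2)
            if a[1] < b[2] and b[1] < a[2]]


def fits_in_system_area(regions):
    """True if everything ends before the ISO 9660 filesystem proper begins."""
    return max(end for _, _, end in regions) <= SYSTEM_AREA


if __name__ == "__main__":
    layout = build_layout()
    for name, start, end in layout:
        print(f"{start:6d}-{end:6d}  {name}")
    print("overlaps:", find_overlaps(layout))
    print("fits in 32KB:", fits_in_system_area(layout))
```

With the defaults nothing collides and everything ends at byte 24576, comfortably inside the 32KB. Put the GPT listing back at its usual spot (byte 1024) or tell APM the sector size is 512 and the collisions described above reappear.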

And, shockingly, this works. Setting up a CD this way gives a boot icon on old Macs. On new Macs, it gives three - one for legacy boot, one for EFI boot via FAT and one for EFI boot via HFS. Less than ideal, but eh. The one remaining problem is that this doesn't work for USB sticks (the firmware sees the GPT and ignores the APM), so we also need to add a GPT entry for the HFS+ partition. Job done.

So, it is possible to produce install media that will work if burned to CD or written to a USB stick. It's even possible to produce a version that will work on Macs, as long as you're willing to put up with three partition tables and an x86 boot sector that doubles as an APM header. And patches to isohybrid to do all of this will be turning up as soon as I tidy the code to the point where it works without having to hack in offsets by hand.

[1] Insert some other adverb here if you feel like it
[2] Why yes, 15 years later BIOSes still tend to assume 512 bytes. Which is why your 4K sector disk is much harder to work with than you'd like it to be.
[3] Ever noticed how the modern world involves a great deal of suffering, misery and death? EFI fits into that world perfectly.
[4] Obviously if you want your media to be bootable via both BIOS and EFI you need to produce a CD with two El Torito images. BIOS systems should ignore the image that says it's for EFI, and EFI systems should ignore the BIOS one. Some especially creative BIOS authors[5] have decided that users shouldn't have their choices limited in such a way, and so pop up a screen that says:

1.

2.

Select CD-ROM boot type:

and wait for the user to press a key. The lack of labels after the numbers is not a typographical error on my part.
[5] Older (pre-2009, and some 2009 models) Apple hardware has this bug if a dual-El Torito CD is booted via the BIOS compatibility layer. This is especially unfortunate because said machines often fail to provide a working keyboard emulation at this stage, resulting in you being stuck forever at an impressively unhelpful screen. This isn't a Linux bug, since it's happening before we've run any of our code at all. It's not even limited to Linux. 64-bit install media for Vista SP1, Windows 7 and Server 2008 all have similar El Torito layout and all trigger the same bug on Apple hardware. Apple's aware of this, and has resolved the issue by declaring that these machines don't support 64 bit Windows.
[6] Even further investigation reveals that Apple will show you as many icons as there are El Torito images, which is a rare example of Apple giving the user the freedom to brutally butcher their extremities if they so desire
[7] "Windows" is Apple code for "Booting via BIOS compatibility". The Apple boot menu will call any filesystem with a BIOS boot sector Windows.

long live coreboot

Date: 2011-07-26 02:58 pm (UTC)
From: (Anonymous)
Is it just me or does every single one of your posts read like a veiled endorsement for Coreboot?

Re: long live coreboot

Date: 2011-07-26 04:03 pm (UTC)
From: [identity profile] patrick.georgi-clan.de
coreboot gives you no interface to work with (well, there's a table with some interesting data points about the platform that you're free to ignore).

The interesting thing with coreboot is to use it with a payload that provides the interface of choice - such as tianocore, which, as far as I can see, is a rather complete and _compliant_ UEFI implementation. That gives you EFI without all those hacks you have to find (like jumping to the fallback boot loader directory).

I'm working on that. slowly.

Re: long live coreboot

Date: 2011-07-26 04:28 pm (UTC)
From: [identity profile] patrick.georgi-clan.de
I know how some EFI implementations look from the inside. Tiano must be an improvement over them, coreboot or not.

Maybe these EFIs are less of a problem for you because they're so dysfunctional as EFI platform that they always come with a transparent CSM.

Re: long live coreboot

Date: 2011-07-26 05:17 pm (UTC)
From: (Anonymous)
Getting updated versions out to everyone would probably be easier if we're not dependent on the vendors for updates.

We'd just be able to say: "oh, that was fixed in coreboot version xxx; please flash your firmware". IOW, we'd be able to fix the problems where they really are, instead of working around them in the bootloaders or kernel.

This won't ever be possible if we have to wait (in vain?) for updates from the motherboard makers.

Re: long live coreboot

Date: 2011-07-31 07:08 pm (UTC)
From: (Anonymous)
I'd have to agree with mjg59 on this one; not only is it not practical, but vendors likely wouldn't warrant a bios obtained from someone else.

Of course if you can fix it upstream and send the changes to the vendors for verification, testing, and their seal of approval, that'd probably be different.

Re: long live coreboot

Date: 2011-07-26 04:15 pm (UTC)
From: (Anonymous)
imho the interesting thing about coreboot is, that you are able to FIX the bugs you find.
apart from that... let me introduce you to RMS... ;)

anyway, magnificent blog post, thank you very much!

Re: long live coreboot

Date: 2011-07-27 10:29 am (UTC)
From: [identity profile] pjc50.livejournal.com
The interesting thing about meeting RMS is that you're not able to fix the bugs you find.

Kudos for keeping your sanity!

Date: 2011-07-26 03:14 pm (UTC)
From: [identity profile] jvilk.myopenid.com
I'm for anything that makes booting from USB easier. Burning a CD or a DVD for an OS install is such a waste, and it's currently quite annoying to use tools to make a bootable USB stick out of an installation ISO, especially when it doesn't work sometimes! *shakes fist*

Thank you for sticking through the insanity of it all. If this works as planned, hopefully distributions will offer easy options for making a bootable USB key from livediscs, which will make it even easier for new people to try out Linux. :D

Re: Kudos for keeping your sanity!

Date: 2011-07-26 03:44 pm (UTC)
From: (Anonymous)
Last I checked, that is true in Fedora only for the live image. If you want to install from the DVD image (for instance, because you want btrfs root), you have to do more than simply dd the image onto a stick.

Re: Kudos for keeping your sanity!

Date: 2011-07-27 08:58 am (UTC)
From: (Anonymous)
Maybe DVD would work with the BIOS Gujin bootloader at sourceforge.
It doesn't do EFI but should handle MBR+GPT partitioning because it doesn't use the sectors just after the MBR.

Date: 2011-07-26 03:17 pm (UTC)
From: (Anonymous)
grml's iso images can be written to usb-stick with just dd.

so there is already an image which can do both.

http://grml.org/download/

Crazy

Date: 2011-07-26 09:54 pm (UTC)
From: (Anonymous)
I'm guessing that researching this kind of thing isn't good for one's mental health?

Further reading?

Date: 2011-07-26 10:00 pm (UTC)
From: (Anonymous)
Epic post as usual, Matthew! I really appreciate the level of detail you go into, both in terms of history of the implementation and present day problems.

Every time I read one of your posts I find myself wanting to learn more about what you've talked about. If I may ask, where did you pick up your [historical / present day] knowledge around booting / rebooting PCs? Is there any particular resource you can recommend where one could learn more? Thanks!

Date: 2011-07-26 10:42 pm (UTC)
From: (Anonymous)
How does OpenBoot (as provided on most Sun SPARC hardware) stand w.r.t. the stupidities of PC x86 BIOSes and PC (U)EFI implementations? From what I know, OpenBoot either looks for a Sun partition map or El Torito, and would accept raw/aout images from netboot.

Date: 2011-07-27 12:05 am (UTC)
From: [identity profile] rww.name
Thanks for your post! It's always a pleasure to see your handle come up in my feed reader, it lets me know I'm about to learn something :)

Date: 2011-07-27 04:19 am (UTC)
From: (Anonymous)
Hi, thank you for sharing this. It makes me hopeful, that finally there is a way to install Fedora on a Mac.

Does it mean that
https://bugzilla.redhat.com/show_bug.cgi?id=527443
is going to be fixed even though it is NOTABUG?

I know that this bug corresponds to CD boot, but according to what you are saying I should be able to install from USB with the exact same image.

Do you have such an image available so that I can give it a try?

A+++ would read again

Date: 2011-07-27 05:51 am (UTC)
From: (Anonymous)
I just want to thank you for taking the time to write this, your writing style is AWESOME, and very enjoyable to read.

I wasn't familiar with why I had to use special tools to make live-usbs, I was under the impression that it had something to do with bootstrapping and drivers for reading from the usb. But now I know-ish.

Good grief....

Date: 2011-07-27 07:08 am (UTC)
From: (Anonymous)
So this is why my USB key doesn't boot on x86 Macs and now I know why 64 bit Windows isn't supported on some 64 bit Macs.

And to think that some BIOSes try to be clever and try and detect what sort of emulation they should perform. Crazy...

ISO9660

Date: 2011-07-27 05:41 pm (UTC)
From: (Anonymous)
There is a UEFI driver to read ISO 9660 filesystems on all H67 ASUS boards (I've tested it in an EFI shell and it's working). I don't know how they implemented booting from CD, whether it's El Torito or something else (all I know is that I end up with a kernel panic during boot... but that's not their problem, as you don't need Linux to boot Windows).

rafirafi

BIOS and 4KB sectors

Date: 2011-07-30 08:17 am (UTC)
From: [identity profile] yuhong.wordpress.com
"Why yes, 15 years later BIOSes still tend to assume 512 bytes. Which is why your 4K sector disk is much harder to work with than you'd like it to be."
Not to mention that a lot of real mode code, such as DOS applications that use INT 13, probably has the sector size hardcoded to 512 bytes, as this has never changed since 1981.

Date: 2011-07-31 06:36 pm (UTC)
From: [personal profile] gerald_duck
Firstly, congratulations on posting this so soon after XKCD 927.

Secondly, I assume none of the woes go away if you start with a bootable USB mass storage image and frobble it to be bootable from CD, rather than starting with the ISO image and frobbling it to be bootable from USB mass storage?

No, thought not. )-8

inodes

Date: 2012-05-26 08:51 pm (UTC)
From: (Anonymous)
mjg59: "For an HFS+ filesystem, the inode of the bootloader is written to a specific offset in the filesystem superblock and the firmware simply finds that inode and executes it. And this appears to be all that older Macs support." Not entirely true. On HFS+ inodes are actually special files in "\0\0\0\0HFS+ Private Data" (the joy of forcing non-POSIX filesystem into POSIX world). What's written in superblock is catalog node id and converting it to actual extents involves btree lookup, so I wouldn't say it's simple.

Additional partitions in the hybrid

Date: 2013-11-05 12:21 am (UTC)
From: (Anonymous)
Thank you for such an excellent and clear post. Does this support having an additional Windows partition first, before all the other image contents? I'm trying to make a live usb that also has a Windows readable extra space partition.

Profile

Matthew Garrett

About Matthew

Power management, mobile and firmware developer on Linux. Security developer at Nebula. Ex-biologist. @mjg59 on Twitter. Content here should not be interpreted as the opinion of my employer.
