There's a story going round that Lenovo have signed an agreement with Microsoft that prevents installing free operating systems. This is sensationalist, untrue and distracts from a genuine problem.

The background is straightforward. Intel platforms allow the storage to be configured in two different ways - "standard" (normal AHCI on SATA systems, normal NVMe on NVMe systems) or "RAID". "RAID" mode is typically just changing the PCI IDs so that the normal drivers won't bind, ensuring that drivers that support the software RAID mode are used. Intel have not submitted any patches to Linux to support the "RAID" mode.

In this specific case, Lenovo's firmware defaults to "RAID" mode and doesn't allow you to change that. Since Linux has no support for the hardware when configured this way, you can't install Linux (distribution installers will boot, but won't find any storage device to install the OS to).

Why would Lenovo do this? I don't know for sure, but it's potentially related to something I've written about before - recent Intel hardware needs special setup for good power management. The storage driver that Microsoft ship doesn't do that setup. The Intel-provided driver does. "RAID" mode prevents the Microsoft driver from binding and forces the user to use the Intel driver, which means they get the correct power management configuration, battery life is better and the machine doesn't melt.

(Why not offer the option to disable it? A user who does would end up with a machine that doesn't boot, and if they managed to figure that out they'd have worse power management. That increases support costs. For a consumer device, why would you want to? The number of people buying these laptops to run anything other than Windows is miniscule)

Things are somewhat obfuscated due to a statement from a Lenovo rep:This system has a Signature Edition of Windows 10 Home installed. It is locked per our agreement with Microsoft. It's unclear what this is meant to mean. Microsoft could be insisting that Signature Edition systems ship in "RAID" mode in order to ensure that users get a good power management experience. Or it could be a misunderstanding regarding UEFI Secure Boot - Microsoft do require that Secure Boot be enabled on all Windows 10 systems, but (a) the user must be able to manage the key database and (b) there are several free operating systems that support UEFI Secure Boot and have appropriate signatures. Neither interpretation indicates that there's a deliberate attempt to prevent users from installing their choice of operating system.

The real problem here is that Intel do very little to ensure that free operating systems work well on their consumer hardware - we still have no information from Intel on how to configure systems to ensure good power management, we have no support for storage devices in "RAID" mode and we have no indication that this is going to get better in future. If Intel had provided that support, this issue would never have occurred. Rather than be angry at Lenovo, let's put pressure on Intel to provide support for their hardware.
I read this tweet a couple of weeks ago:

and it got me thinking. Security research is often derided as unnecessary stunt hacking, proving insecurity in things that are sufficiently niche or in ways that involve sufficient effort that the realistic probability of any individual being targeted is near zero. Fixing these issues is basically defending you against nation states (who (a) probably don't care, and (b) will probably just find some other way) and, uh, security researchers (who (a) probably don't care, and (b) see (a)).

Unfortunately, this may be insufficient. As basically anyone who's spent any time anywhere near the security industry will testify, many security researchers are not the nicest people. Some of them will end up as abusive partners, and they'll have both the ability and desire to keep track of their partners and ex-partners. As designers and implementers, we owe it to these people to make software as secure as we can rather than assuming that a certain level of adversary is unstoppable. "Can a state-level actor break this" may be something we can legitimately write off. "Can a security expert continue reading their ex-partner's email" shouldn't be.
There's been a bunch of coverage of this attack on Microsoft's Secure Boot implementation, a lot of which has been somewhat confused or misleading. Here's my understanding of the situation.

Windows RT devices were shipped without the ability to disable Secure Boot. Secure Boot is the root of trust for Microsoft's User Mode Code Integrity (UMCI) feature, which is what restricts Windows RT devices to running applications signed by Microsoft. This restriction is somewhat inconvenient for developers, so Microsoft added support in the bootloader to disable UMCI. If you were a member of the appropriate developer program, you could give your device's unique ID to Microsoft and receive a signed blob that disabled image validation. The bootloader would execute a (Microsoft-signed) utility that verified that the blob was appropriately signed and matched the device in question, and would then insert it into an EFI Boot Services variable[1]. On reboot, the boot loader reads the blob from that variable and integrates that policy, telling later stages to disable code integrity validation.

The problem here is that the signed blob includes the entire policy, and so any policy change requires an entirely new signed blob. The Windows 10 Anniversary Update added a new feature to the boot loader, allowing it to load supplementary policies. These must also be signed, but aren't tied to a device id - the idea is that they'll be ignored unless a device-specific policy has also been loaded. This way you can get a single device-specific signed blob that allows you to set an arbitrary policy later by using a combination of supplementary policies.

This is all fine in the Anniversary Edition. Unfortunately older versions of the boot loader will happily load a supplementary policy as if it were a full policy, ignoring the fact that it doesn't include a device ID. The loaded policy replaces the built-in policy, so in the absence of a base policy a supplementary policy as simple as "Enable this feature" will effectively remove all other restrictions.

Unfortunately for Microsoft, such a supplementary policy leaked. Installing it as a base policy on pre-Anniversary Edition boot loaders will then allow you to disable all integrity verification, including in the boot loader. Which means you can ask the boot loader to chain to any other executable, in turn allowing you to boot a compromised copy of any operating system you want (not just Windows).

This does require you to be able to install the policy, though. The PoC released includes a signed copy of SecureBootDebug.efi for ARM, which is sufficient to install the policy on ARM systems. There doesn't (yet) appear to be a public equivalent for x86, which means it's not (yet) practical for arbitrary attackers to subvert the Secure Boot process on x86. I've been doing my testing on a setup where I've manually installed the policy, which isn't practical in an automated way.

How can this be prevented? Installing the policy requires the ability to run code in the firmware environment, and by default the boot loader will only load signed images. The number of signed applications that will copy the policy to the Boot Services variable is presumably limited, so if the Windows boot loader supported blacklisting second-stage bootloaders Microsoft could simply blacklist all policy installers that permit installation of a supplementary policy as a primary policy. If that's not possible, they'll have to blacklist of the vulnerable boot loaders themselves. That would mean all pre-Anniversary Edition install media would stop working, including recovery and deployment images. That's, well, a problem. Things are much easier if the first case is true.

Thankfully, if you're not running Windows this doesn't have to be a issue. There are two commonly used Microsoft Secure Boot keys. The first is the one used to sign all third party code, including drivers in option ROMs and non-Windows operating systems. The second is used purely to sign Windows. If you delete the second from your system, Windows boot loaders (including all the vulnerable ones) will be rejected by your firmware, but non-Windows operating systems will still work fine.

From what we know so far, this isn't an absolute disaster. The ARM policy installer requires user intervention, so if the x86 one is similar it'd be difficult to use this as an automated attack vector[2]. If Microsoft are able to blacklist the policy installers without blacklisting the boot loader, it's also going to be minimally annoying. But if it's possible to install a policy without triggering any boot loader blacklists, this could end up being embarrassing.

Even outside the immediate harm, this is an interesting vulnerability. Presumably when the older boot loaders were written, Microsoft policy was that they would never sign policy files that didn't include a device ID. That policy changed when support for supplemental policies was added. without this policy change, the older boot loaders could still be considered secure. Adding new features can break old assumptions, and your design needs to take that into account.

[1] EFI variables come in two main forms - those accessible at runtime (Runtime Services variables) and those only accessible in the early boot environment (Boot Services variables). Boot Services variables can only be accessed before ExitBootServices() is called, and in Secure Boot environments all code executing before this point is (theoretically) signed. This means that Boot Services variables are nominally tamper-resistant.

[2] Shim has explicit support for allowing a physically present machine owner to disable signature validation - this is basically equivalent
My experiences with Amazon reviewing have been somewhat unusual. A review of a smart switch I wrote received enough attention that the vendor pulled the product from Amazon. At the time of writing, I'm ranked as around the 2750th best reviewer on Amazon despite having a total of 18 reviews. But the world of Amazon reviews is even stranger than that, and the past couple of weeks have given me some insight into it.

Amazon's success is fairly phenomenal. It's estimated that there's over 50 million people in the US paying $100 a year to get free shipping on Amazon purchases, and combined with Amazon's surprisingly customer friendly service there's a lot of people with a very strong preference for choosing Amazon rather than any other retailer. If you're not on Amazon, you're hurting your sales.

And if you're an established brand, this works pretty well. Some people will search for your product directly and buy it, leaving reviews. Well reviewed products appear higher up in search results, so people searching for an item type rather than a brand will still see your product appear early in the search results, in turn driving sales. Some proportion of those customers will leave reviews, which helps keep your product high up in the results. As long as your products aren't utterly dreadful, you'll probably maintain that position.

But if you're a brand nobody's ever heard of, things are more difficult. People are unlikely to search for your product directly, so you're relying on turning up in the results for more generic terms. But if you're selling a more generic kind of item (say, a Bluetooth smart bulb) then there's probably a number of other brands nobody's ever heard of selling almost identical objects. If there's no reason for anybody to choose your product then you're probably not going to get any reviews and you're not going to move up the search rankings. Even if your product is better than the competition, a small number of sales means a tiny number of reviews. By the time that number's large enough to matter, you're probably onto a new product cycle.

In summary: if nobody's ever heard of you, you need reviews but you're probably not getting any.

The old way of doing this was to send review samples to journalists, but nobody's going to run a comprehensive review of 3000 different USB cables and even if they did almost nobody would read it before making a decision on Amazon. You need Amazon reviews, but you're not getting any. The obvious solution is to send review samples to people who will leave Amazon reviews. This is where things start getting more dubious.

Amazon run a program called Vine which is intended to solve this problem. Send samples to Amazon and they'll distribute them to a subset of trusted reviewers. These reviewers write a review as normal, and Amazon tag the review with a "Vine Voice" badge which indicates to readers that the reviewer received the product for free. But participation in Vine is apparently expensive, and so there's a proliferation of sites like Snagshout or AMZ Review Trader that use a different model. There's no requirement that you be an existing trusted reviewer and the product probably isn't free. You sign up, choose a product, receive a discount code and buy it from Amazon. You then have a couple of weeks to leave a review, and if you fail to do so you'll lose access to the service. This is completely acceptable under Amazon's rules, which state "If you receive a free or discounted product in exchange for your review, you must clearly and conspicuously disclose that fact". So far, so reasonable.

In reality it's worse than that, with several opportunities to game the system. AMZ Review Trader makes it clear to sellers that they can choose reviewers based on past reviews, giving customers an incentive to leave good reviews in order to keep receiving discounted products. Some customers take full advantage of this, leaving a giant number of 5 star reviews for products they clearly haven't tested and then (presumably) reselling them. What's surprising is that this kind of cynicism works both ways. Some sellers provide two listings for the same product, the second being significantly more expensive than the first. They then offer an attractive discount for the more expensive listing in return for a review, taking it down to approximately the same price as the original item. Once the reviews are in, they can remove the first listing and drop the price of the second to the original price point.

The end result is a bunch of reviews that are nominally honest but are tied to perverse incentives. In effect, the overall star rating tells you almost nothing - you still need to actually read the reviews to gain any insight into whether the customer actually used the product. And when you do write an honest review that the seller doesn't like, they may engage in heavy handed tactics in an attempt to make the review go away.

It's hard to avoid the conclusion that Amazon's review model is broken, but it's not obvious how to fix it. When search ranking is tied to reviews, companies have a strong incentive to do whatever it takes to obtain positive reviews. What we're left with for now is having to laboriously click through a number of products to see whether their rankings come from thoughtful and detailed reviews or are just a mass of 5 star one liners.
The best known smart bulb setups (such as the Philips Hue and the Belkin Wemo) are based on Zigbee, a low-energy, low-bandwidth protocol that operates on various unlicensed radio bands. The problem with Zigbee is that basically no home routers or mobile devices have a Zigbee radio, so to communicate with them you need an additional device (usually called a hub or bridge) that can speak Zigbee and also hook up to your existing home network. Requests are sent to the hub (either directly if you're on the same network, or via some external control server if you're on a different network) and it sends appropriate Zigbee commands to the bulbs.

But requiring an additional device adds some expense. People have attempted to solve this in a couple of ways. The first is building direct network connectivity into the bulbs, in the form of adding an 802.11 controller. Go through some sort of setup process[1], the bulb joins your network and you can communicate with it happily. Unfortunately adding wifi costs more than adding Zigbee, both in terms of money and power - wifi bulbs consume noticeably more power when "off" than Zigbee ones.

There's a middle ground. There's a large number of bulbs available from Amazon advertising themselves as Bluetooth, which is true but slightly misleading. They're actually implementing Bluetooth Low Energy, which is part of the Bluetooth 4.0 spec. Implementing this requires both OS and hardware support, so older systems are unable to communicate. Android 4.3 devices tend to have all the necessary features, and modern desktop Linux is also fine as long as you have a Bluetooth 4.0 controller.

Bluetooth is intended as a low power communications protocol. Bluetooth Low Energy (or BLE) is even lower than that, running in a similar power range to Zigbee. Most semi-modern phones can speak it, so it seems like a pretty good choice. Obviously you lose the ability to access the device remotely, but given the track record on this sort of thing that's arguably a benefit. There's a couple of other downsides - the range is worse than Zigbee (but probably still acceptable for any reasonably sized house or apartment), and only one device can be connected to a given BLE server at any one time. That means that if you have the control app open while you're near a bulb, nobody else can control that bulb until you disconnect.

The quality of the bulbs varies a great deal. Some of them are pure RGB bulbs and incapable of producing a convincing white at a reasonable intensity[2]. Some have additional white LEDs but don't support running them at the same time as the colour LEDs, so you have the choice between colour or a fixed (and usually more intense) white. Some allow running the white LEDs at the same time as the RGB ones, which means you can vary the colour temperature of the "white" output.

But while the quality of the bulbs varies, the quality of the apps doesn't really. They're typically all dreadful, competing on features like changing bulb colour in time to music rather than on providing a pleasant user experience. And the whole "Only one person can control the lights at a time" thing doesn't really work so well if you actually live with anyone else. I was dissatisfied.

I'd met Mike Ryan at Kiwicon a couple of years back after watching him demonstrate hacking a BLE skateboard. He offered a couple of good hints for reverse engineering these devices, the first being that Android already does almost everything you need. Hidden in the developer settings is an option marked "Enable Bluetooth HCI snoop log". Turn that on and all Bluetooth traffic (including BLE) is dumped into /sdcard/btsnoop_hci.log. Turn that on, start the app, make some changes, retrieve the file and check it out using Wireshark. Easy.

Conveniently, BLE is very straightforward when it comes to network protocol. The only thing you have is GATT, the Generic Attribute Protocol. Using this you can read and write multiple characteristics. Each packet is limited to a maximum of 20 bytes. Most implementations use a single characteristic for light control, so it's then just a matter of staring at the dumped packets until something jumps out at you. A pretty typical implementation is something like:

0x56,r,g,b,0x00,0xf0,0x00,0xaa

where r, g and b are each just a single byte representing the corresponding red, green or blue intensity. 0x56 presumably indicates a "Set the light to these values" command, 0xaa indicates end of command and 0xf0 indicates that it's a request to set the colour LEDs. Sending 0x0f instead results in the previous byte (0x00 in this example) being interpreted as the intensity of the white LEDs. Unfortunately the bulb I tested that speaks this protocol didn't allow you to drive the white LEDs at the same time as anything else - setting the selection byte to 0xff didn't result in both sets of intensities being interpreted at once. Boo.

You can test this out fairly easily using the gatttool app. Run hcitool lescan to look for the device (remember that it won't show up if anything else is connected to it at the time), then do gatttool -b deviceid -I to get an interactive shell. Type connect to initiate a connection, and once connected send commands by doing char-write-cmd handle value using the handle obtained from your hci dump.

I did this successfully for various bulbs, but annoyingly hit a problem with one from Tikteck. The leading byte of each packet was clearly a counter, but the rest of the packet appeared to be garbage. For reasons best known to themselves, they've implemented application-level encryption on top of BLE. This was a shame, because they were easily the best of the bulbs I'd used - the white LEDs work in conjunction with the colour ones once you're sufficiently close to white, giving you good intensity and letting you modify the colour temperature. That gave me incentive, but figuring out the protocol took quite some time. Earlier this week, I finally cracked it. I've put a Python implementation on Github. The idea is to tie it into Ulfire running on a central machine with a Bluetooth controller, making it possible for me to control the lights from multiple different apps simultaneously and also integrating with my Echo.

I'd write something about the encryption, but I honestly don't know. Large parts of this make no sense to me whatsoever. I haven't even had any gin in the past two weeks. If anybody can explain how anything that's being done there makes any sense at all[3] that would be appreciated.

[1] typically via the bulb pretending to be an access point, but also these days through a terrifying hack involving spewing UDP multicast packets of varying lengths in order to broadcast the password to associated but unauthenticated devices and good god the future is terrifying

[2] For a given power input, blue LEDs produce more light than other colours. To get white with RGB LEDs you either need to have more red and green LEDs than blue ones (which costs more), or you need to reduce the intensity of the blue ones (which means your headline intensity is lower). Neither is appealing, so most of these bulbs will just give you a blue "white" if you ask for full red, green and blue

[3] Especially the bit where we calculate something from the username and password and then encrypt that using some random numbers as the key, then send 50% of the random numbers and 50% of the encrypted output to the device, because I can't even
I bought some awful WiFi lightbulbs a few months ago. The short version: they introduced terrible vulnerabilities on your network, they violated the GPL and they were also just bad at being lightbulbs. Since then I've bought some other Internet of Things devices, and since people seem to have a bizarre level of fascination with figuring out just what kind of fractal of poor design choices these things frequently embody, I thought I'd oblige.

Today we're going to be talking about the KanKun SP3, a plug that's been around for a while. The idea here is pretty simple - there's lots of devices that you'd like to be able to turn on and off in a programmatic way, and rather than rewiring them the simplest thing to do is just to insert a control device in between the wall and the device andn ow you can turn your foot bath on and off from your phone. Most vendors go further and also allow you to program timers and even provide some sort of remote tunneling protocol so you can turn off your lights from the comfort of somebody else's home.

The KanKun has all of these features and a bunch more, although when I say "features" I kind of mean the opposite. I plugged mine in and followed the install instructions. As is pretty typical, this took the form of the plug bringing up its own Wifi access point, the app on the phone connecting to it and sending configuration data, and the plug then using that data to join your network. Except it didn't work. I connected to the plug's network, gave it my SSID and password and waited. Nothing happened. No useful diagnostic data. Eventually I plugged my phone into my laptop and ran adb logcat, and the Android debug logs told me that the app was trying to modify a network that it hadn't created. Apparently this isn't permitted as of Android 6, but the app was handling this denial by just trying again. I deleted the network from the system settings, restarted the app, and this time the app created the network record and could modify it. It still didn't work, but that's because it let me give it a 5GHz network and it only has a 2.4GHz radio, so one reset later and I finally had it online.

The first thing I normally do to one of these things is run nmap with the -O argument, which gives you an indication of what OS it's running. I didn't really need to in this case, because if I just telnetted to port 22 I got a dropbear ssh banner. Googling turned up the root password ("p9z34c") and I was logged into a lightly hacked (and fairly obsolete) OpenWRT environment.

It turns out that here's a whole community of people playing with these plugs, and it's common for people to install CGI scripts on them so they can turn them on and off via an API. At first this sounds somewhat confusing, because if the phone app can control the plug then there clearly is some kind of API, right? Well ha yeah ok that's a great question and oh good lord do things start getting bad quickly at this point.

I'd grabbed the apk for the app and a copy of jadx, an incredibly useful piece of code that's surprisingly good at turning compiled Android apps into something resembling Java source. I dug through that for a while before figuring out that before packets were being sent, they were being handed off to some sort of encryption code. I couldn't find that in the app, but there was a native ARM library shipped with it. Running strings on that showed functions with names matching the calls in the Java code, so that made sense. There were also references to AES, which explained why when I ran tcpdump I only saw bizarre garbage packets.

But what was surprising was that most of these packets were substantially similar. There were a load that were identical other than a 16-byte chunk in the middle. That plus the fact that every payload length was a multiple of 16 bytes strongly indicated that AES was being used in ECB mode. In ECB mode each plaintext is split up into 16-byte chunks and encrypted with the same key. The same plaintext will always result in the same encrypted output. This implied that the packets were substantially similar and that the encryption key was static.

Some more digging showed that someone had figured out the encryption key last year, and that someone else had written some tools to control the plug without needing to modify it. The protocol is basically ascii and consists mostly of the MAC address of the target device, a password and a command. This is then encrypted and sent to the device's IP address. The device then sends a challenge packet containing a random number. The app has to decrypt this, obtain the random number, create a response, encrypt that and send it before the command takes effect. This avoids the most obvious weakness around using ECB - since the same plaintext always encrypts to the same ciphertext, you could just watch encrypted packets go past and replay them to get the same effect, even if you didn't have the encryption key. Using a random number in a challenge forces you to prove that you actually have the key.

At least, it would do if the numbers were actually random. It turns out that the plug is just calling rand(). Further, it turns out that it never calls srand(). This means that the plug will always generate the same sequence of challenges after a reboot, which means you can still carry out replay attacks if you can reboot the plug. Strong work.

But there was still the question of how the remote control works, since the code on github only worked locally. tcpdumping the traffic from the server and trying to decrypt it in the same way as local packets worked fine, and showed that the only difference was that the packet started "wan" rather than "lan". The server decrypts the packet, looks at the MAC address, re-encrypts it and sends it over the tunnel to the plug that registered with that address.

That's not really a great deal of authentication. The protocol permits a password, but the app doesn't insist on it - some quick playing suggests that about 90% of these devices still use the default password. And the devices are all based on the same wifi module, so the MAC addresses are all in the same range. The process of sending status check packets to the server with every MAC address wouldn't take that long and would tell you how many of these devices are out there. If they're using the default password, that's enough to have full control over them.

There's some other failings. The github repo mentioned earlier includes a script that allows arbitrary command execution - the wifi configuration information is passed to the system() command, so leaving a semicolon in the middle of it will result in your own commands being executed. Thankfully this doesn't seem to be true of the daemon that's listening for the remote control packets, which seems to restrict its use of system() to data entirely under its control. But even if you change the default root password, anyone on your local network can get root on the plug. So that's a thing. It also downloads firmware updates over http and doesn't appear to check signatures on them, so there's the potential for MITM attacks on the plug itself. The remote control server is on AWS unless your timezone is GMT+8, in which case it's in China. Sorry, Western Australia.

It's running Linux and includes Busybox and dnsmasq, so plenty of GPLed code. I emailed the manufacturer asking for a copy and got told that they wouldn't give it to me, which is unsurprising but still disappointing.

The use of AES is still somewhat confusing, given the relatively small amount of security it provides. One thing I've wondered is whether it's not actually intended to provide security at all. The remote servers need to accept connections from anywhere and funnel decent amounts of traffic around from phones to switches. If that weren't restricted in any way, competitors would be able to use existing servers rather than setting up their own. Using AES at least provides a minor obstacle that might encourage them to set up their own server.

Overall: the hardware seems fine, the software is shoddy and the security is terrible. If you have one of these, set a strong password. There's no rate-limiting on the server, so a weak password will be broken pretty quickly. It's also infringing my copyright, so I'd recommend against it on that point alone.
Inspiring change is difficult. Fighting the status quo typically means being able to communicate so effectively that powerful opponents can't win merely by outspending you. People need to read your work or hear you speak and leave with enough conviction that they in turn can convince others. You need charisma. You need to be smart. And you need to be able to tailor your message depending on the audience, even down to telling an individual exactly what they need to hear to take your side. Not many people have all these qualities, but those who do are powerful and you want them on your side.

But the skills that allow you to convince people that they shouldn't listen to a politician's arguments are the same skills that allow you to convince people that they shouldn't listen to someone you abused. The ability that allows you to argue that someone should change their mind about whether a given behaviour is of social benefit is the same ability that allows you to argue that someone should change their mind about whether they should sleep with you. The visibility that gives you the power to force people to take you seriously is the same visibility that makes people afraid to publicly criticise you.

We need these people, but we also need to be aware that their talents can be used to hurt as well as to help. We need to hold them to higher standards of scrutiny. We need to listen to stories about their behaviour, even if we don't want to believe them. And when there are reasons to believe those stories, we need to act on them. That means people need to feel safe in coming forward with their experiences, which means that nobody should have the power to damage them in reprisal. If you're not careful, allowing charismatic individuals to become the public face of your organisation gives them that power.

There's no reason to believe that someone is bad merely because they're charismatic, but this kind of role allows a charismatic abuser both a great deal of cover and a great deal of opportunity. Sometimes people are just too good to be true. Pretending otherwise doesn't benefit anybody but the abusers.
Github recently introduced the option to squash commits on merge, and even before then several projects requested that contributors squash their commits after review but before merge. This is a terrible idea that makes it more difficult for people to contribute to projects.

I'm spending today working on reworking some code to integrate with a new feature that was just integrated into Kubernetes. The PR in question was absolutely fine, but just before it was merged the entire commit history was squashed down to a single commit at the request of the reviewer. This single commit contains type declarations, the functionality itself, the integration of that functionality into the scheduler, the client code and a large pile of autogenerated code.

I've got some familiarity with Kubernetes, but even then this commit is difficult for me to read. It doesn't tell a story. I can't see its growth. Looking at a single hunk of this diff doesn't tell me whether it's infrastructural or part of the integration. Given time I can (and have) figured it out, but it's an unnecessary waste of effort that could have gone towards something else. For someone who's less used to working on large projects, it'd be even worse. I'm paid to deal with this. For someone who isn't, the probability that they'll give up and do something else entirely is even greater.

I don't want to pick on Kubernetes here - the fact that this Github feature exists makes it clear that a lot of people feel that this kind of merge is a good idea. And there are certainly cases where squashing commits makes sense. Commits that add broken code and which are immediately followed by a series of "Make this work" commits also impair readability and distract from the narrative that your RCS history should present, and Github present this feature as a way to get rid of them. But that ends up being a false dichotomy. A history that looks like "Commit", "Revert Commit", "Revert Revert Commit", "Fix broken revert", "Revert fix broken revert" is a bad history, as is a history that looks like "Add 20,000 line feature A", "Add 20,000 line feature B".

When you're crafting commits for merge, think about your commit history as a textbook. Start with the building blocks of your feature and make them one commit. Build your functionality on top of them in another. Tie that functionality into the core project and make another commit. Add client support. Add docs. Include your tests. Allow someone to follow the growth of your feature over time, with each commit being a chapter of that story. And never, ever, put autogenerated code in the same commit as an actual functional change.

People can't contribute to your project unless they can understand your code. Writing clear, well commented code is a big part of that. But so is showing the evolution of your features in an understandable way. Make sure your RCS history shows that, otherwise people will go and find another project that doesn't make them feel frustrated.

(Edit to add: Sarah Sharp wrote on the same topic a couple of years ago)
Moxie, the lead developer of the Signal secure communication application, recently blogged on the tradeoffs between providing a supportable federated service and providing a compelling application that gains significant adoption. There's a set of perfectly reasonable arguments around that that I don't want to rehash - regardless of feelings on the benefits of federation in general, there's certainly an increase in engineering cost in providing a stable intra-server protocol that still allows for addition of new features, and the person leading a project gets to make the decision about whether that's a valid tradeoff.

One voiced complaint about Signal on Android is the fact that it depends on the Google Play Services. These are a collection of proprietary functions for integrating with Google-provided services, and Signal depends on them to provide a good out of band notification protocol to allow Signal to be notified when new messages arrive, even if the phone is otherwise in a power saving state. At the time this decision was made, there were no terribly good alternatives for Android. Even now, nobody's really demonstrated a free implementation that supports several million clients and has no negative impact on battery life, so if your aim is to write a secure messaging client that will be adopted by as many people is possible, keeping this dependency is entirely rational.

On the other hand, there are users for whom the decision not to install a Google root of trust on their phone is also entirely rational. I have no especially good reason to believe that Google will ever want to do something inappropriate with my phone or data, but it's certainly possible that they'll be compelled to do so against their will. The set of people who will ever actually face this problem is probably small, but it's probably also the set of people who benefit most from Signal in the first place.

(Even ignoring the dependency on Play Services, people may not find the official client sufficient - it's very difficult to write a single piece of software that satisfies all users, whether that be down to accessibility requirements, OS support or whatever. Slack may be great, but there's still people who choose to use Hipchat)

This shouldn't be a problem. Signal is free software and anybody is free to modify it in any way they want to fit their needs, and as long as they don't break the protocol code in the process it'll carry on working with the existing Signal servers and allow communication with people who run the official client. Unfortunately, Moxie has indicated that he is not happy with forked versions of Signal using the official servers. Since Signal doesn't support federation, that means that users of forked versions will be unable to communicate with users of the official client.

This is awkward. Signal is deservedly popular. It provides strong security without being significantly more complicated than a traditional SMS client. In my social circle there's massively more users of Signal than any other security app. If I transition to a fork of Signal, I'm no longer able to securely communicate with them unless they also install the fork. If the aim is to make secure communication ubiquitous, that's kind of a problem.

Right now the choices I have for communicating with people I know are either convenient and secure but require non-free code (Signal), convenient and free but insecure (SMS) or secure and free but horribly inconvenient (gpg). Is there really no way for us to work as a community to develop something that's all three?
Ubuntu 16.04 was released today, with one of the highlights being the new Snap package format. Snaps are intended to make it easier to distribute applications for Ubuntu - they include their dependencies rather than relying on the archive, they can be updated on a schedule that's separate from the distribution itself and they're confined by a strong security policy that makes it impossible for an app to steal your data.

At least, that's what Canonical assert. It's true in a sense - if you're using Snap packages on Mir (ie, Ubuntu mobile) then there's a genuine improvement in security. But if you're using X11 (ie, Ubuntu desktop) it's horribly, awfully misleading. Any Snap package you install is completely capable of copying all your private data to wherever it wants with very little difficulty.

The problem here is the X11 windowing system. X has no real concept of different levels of application trust. Any application can register to receive keystrokes from any other application. Any application can inject fake key events into the input stream. An application that is otherwise confined by strong security policies can simply type into another window. An application that has no access to any of your private data can wait until your session is idle, open an unconfined terminal and then use curl to send your data to a remote site. As long as Ubuntu desktop still uses X11, the Snap format provides you with very little meaningful security. Mir and Wayland both fix this, which is why Wayland is a prerequisite for the sandboxed xdg-app design.

I've produced a quick proof of concept of this. Grab XEvilTeddy from git, install Snapcraft (it's in 16.04), snapcraft snap, sudo snap install xevilteddy*.snap, /snap/bin/xevilteddy.xteddy . An adorable teddy bear! How cute. Now open Firefox and start typing, then check back in your terminal window. Oh no! All my secrets. Open another terminal window and give it focus. Oh no! An injected command that could instead have been a curl session that uploaded your private SSH keys to somewhere that's not going to respect your privacy.

The Snap format provides a lot of underlying technology that is a great step towards being able to protect systems against untrustworthy third-party applications, and once Ubuntu shifts to using Mir by default it'll be much better than the status quo. But right now the protections it provides are easily circumvented, and it's disingenuous to claim that it currently gives desktop users any real security.
Around a year ago I wrote some patches in an attempt to improve power management on Haswell and Broadwell systems by configuring Serial ATA power management appropriately. I got a couple of reports of them triggering SATA errors for some users, couldn't reproduce them myself and so didn't have a lot of confidence in them. Time passed.

I've been working on power management stuff again this week, so it seemed like a good opportunity to revisit these. I've made a few changes and pushed a couple of trees - one against master and one against 4.5.

First, these probably only have relevance to users of mobile Intel parts in the U or S range (/proc/cpuinfo will tell you - you're looking for a four-digit number that starts with 4 (Haswell), 5 (Broadwell) or 6 (Skylake) and ends with U or S), and won't do anything unless you have SATA drives (including PCI-based SATA). To test them, first disable anything like TLP that might alter your SATA link power management policy. Then check powertop - you should only be getting to PC3 at best. Build a kernel with these patches and boot it. /sys/class/scsi_host/*/link_power_management_policy should read "firmware". Check powertop and see whether you're getting into deeper PC states. Now run your system for a while and check the kernel log for any SATA errors that you didn't see before.

Let me know if you see SATA errors and are willing to help debug this, and leave a comment if you don't see any improvement in PC states.
The first time I was paid to do software development came as something of a surprise to me. I was working as a sysadmin in a computational physics research group when a friend asked me if I'd be willing to talk to her PhD supervisor. I had nothing better to do, so said yes. And that was how I started the evening having dinner with David MacKay, and ended the evening better fed, a little drunker and having agreed in principle to be paid to write free software.

I'd been hired to work on Dasher, an information-efficient text entry system. It had been developed by one of David's students as a practical demonstration of arithmetic encoding after David had realised that presenting a visualisation of an effective compression algorithm allowed you to compose text without having to enter as much information into the system. At first this was merely a neat toy, but it soon became clear that the benefits of Dasher had a great deal of overlap with good accessibility software. It required much less precision of input, it made it easy to correct mistakes (you merely had to reverse direction in order to start zooming back out of the text you had entered) and it worked with a variety of input technologies from mice to eye tracking to breathing. My job was to take this codebase and turn it into a project that would be interesting to external developers.

In the year I worked with David, we turned Dasher from a research project into a well-integrated component of Gnome, improved its support for Windows, accepted code from an external contributor who ported it to OS X (using an OpenGL canvas!) and wrote ports for a range of handheld devices. We added code that allowed Dasher to directly control the UI of other applications, making it possible for people to drive word processors without having to leave Dasher. We taught Dasher to speak. We strove to avoid the mistakes present in so many other pieces of accessibility software, such as configuration that could only be managed by an (expensive!) external consultant. And we visited Dasher users and learned how they used it and what more they needed, then went back home and did what we could to provide that.

Working on Dasher was an incredible opportunity. I was involved in the development of exciting code. I spoke on it at multiple conferences. I became part of the Gnome community. I visited the USA for the first time. I entered people's homes and taught them how to use Dasher and experienced their joy as they realised that they could now communicate up to an order of magnitude more quickly. I wrote software that had a meaningful impact on the lives of other people.

Working with David was certainly not easy. Our weekly design meetings were, charitably, intense. He had an astonishing number of ideas, and my job was to figure out how to implement them while (a) not making the application overly complicated and (b) convincing David that it still did everything he wanted. One memorable meeting involved me gradually arguing him down from wanting five new checkboxes to agreeing that there were only two combinations that actually made sense (and hence a single checkbox) - and then admitting that this was broadly equivalent to an existing UI element, so we could just change the behaviour of that slightly without adding anything. I took the opportunity to delete an additional menu item in the process.

I was already aware of the importance of free software in terms of developers, but working with David made it clear to me how important it was to users as well. A community formed around Dasher, helping us improve it and allowing us to develop support for new use cases that made the difference between someone being able to type at two words per minute and being able to manage twenty. David saw that this collaborative development would be vital to creating something bigger than his original ideas, and it succeeded in ways he couldn't have hoped for.

I spent a year in the group and then went back to biology. David went on to channel his strong feelings about social responsibility into issues such as sustainable energy, writing a freely available book on the topic. He served as chief adviser to the UK Department of Energy and Climate Change for five years. And earlier this year he was awarded a knighthood for his services to scientific outreach.

David died yesterday. It's unlikely that I'll ever come close to what he accomplished, but he provided me with much of the inspiration to try to do so anyway. The world is already a less fascinating place without him.
(Edit to add: this issue is restricted to the mobile SKUs. Desktop parts have very different power management behaviour)

Linux 4.5 seems to have got Intel's Skylake platform (ie, 6th-generation Core CPUs) to the point where graphics work pretty reliably, which is great progress (4.4 tended to lose all my windows every so often, especially over suspend/resume). I'm even running Wayland happily. Unfortunately one of the reasons I have a laptop is that I want to be able to do things like use it on battery, and power consumption's an important part of that. Skylake continues the trend from Haswell of moving to an SoC-type model where clock and power domains are shared between components that were previously entirely independent, and so you can't enter deep power saving states unless multiple components all have the correct power management configuration. On Haswell/Broadwell this manifested in the form of Serial ATA link power management being involved in preventing the package from going into deep power saving states - setting that up correctly resulted in a reduction in full-system power consumption of about 40%[1].

I've now got a Skylake platform with a nice shiny NVMe device, so Serial ATA policy isn't relevant (the platform doesn't even expose a SATA controller). The deepest power saving state I can get into is PC3, despite Skylake supporting PC8 - so I'm probably consuming about 40% more power than I should be. And nobody seems to know what needs to be done to fix this. I've found no public documentation on the power management dependencies on Skylake. Turning on everything in Powertop doesn't improve anything. My battery life is pretty poor and the system is pretty warm.

The best thing about this is the following statement from page 64 of the 6th Generation Intel ® Processor Datasheet for U-Platforms:

Caution: Long term reliability cannot be assured unless all the Low-Power Idle States are enabled.

which is pretty concerning. Without support for states deeper than PC3, Linux is running in a configuration that Intel imply may trigger premature failure. That's obviously not good. Until this situation is improved, you probably shouldn't buy any Skylake systems if you're planning on running Linux.

[1] These patches never went upstream. Someone reported that they resulted in their SSD throwing errors and I couldn't find anybody with deeper levels of SATA experience who was interested in working on the problem. Intel's AHCI drivers for Windows do the right thing, but I couldn't find anybody at Intel who could get any information from their Windows driver team.
I've been working on TPMTOTP a little this weekend. I merged a pull request that adds command-line argument handling, which includes the ability to choose the set of PCRs you want to seal to without rebuilding the tools, and also lets you print the base32 encoding of the secret rather than the qr code so you can import it into a wider range of devices. More importantly it also adds support for setting the expected PCR values on the command line rather than reading them out of the TPM, so you can now re-seal the secret against new values before rebooting.

I also wrote some new code myself. TPMTOTP is designed to be usable in the initramfs, allowing you to validate system state before typing in your passphrase. Unfortunately the initramfs itself is one of the things that's measured. So, you end up with something of a chicken and egg problem - TPMTOTP needs access to the secret, and the obvious thing to do is to put the secret in the initramfs. But the secret is sealed against the hash of the initramfs, and so you can't generate the secret until after the initramfs. Modify the initramfs to insert the secret and you change the hash, so the secret is no longer released. Boo.

On EFI systems you can handle this by sticking the secret in an EFI variable (there's some special-casing in the code to deal with the additional metadata on the front of things you read out of efivarfs). But that's not terribly useful if you're not on an EFI system. Thankfully, there's a way around this. TPMs have a small quantity of nvram built into them, so we can stick the secret there. If you pass the -n argument to sealdata, that'll happen. The unseal apps will attempt to pull the secret out of nvram before falling back to looking for a file, so things should just magically work.

I think it's pretty feature complete now, other than TPM2 support? That's on my list.
There's a piece of software called XScreenSaver. It attempts to fill two somewhat disparate roles:
  • Provide a functioning screen lock on systems using the X11 windowing system, a job made incredibly difficult due to a variety of design misfeatures in said windowing system[1]
  • Provide cute graphical output while the screen is locked
XScreenSaver does an excellent job of the second of these[2] and is pretty good at the first, which is to say that it only suffers from a disastrous security flaw once very few years and as such is certainly not appreciably worse than any other piece of software.

Debian ships an operating system that prides itself on stability. The Debian definition of stability is a very specific one - rather than referring to how often the software crashes or misbehaves, it refers to how often the software changes behaviour. Debian is very reluctant to upgrade software that is part of a stable release, to the extent that developers will attempt to backport individual security fixes to the version they shipped rather than upgrading to a release that contains all those security fixes but also adds a new feature. The argument here is that the new release may also introduce new bugs, and Debian's users desire stability (in the "things don't change" sense) more than new features. Backporting security fixes keeps them safe without compromising the reason they're running Debian in the first place.

This all makes plenty of sense at a theoretical level, but reality is sometimes less convenient. The first problem is that security bugs are typically also, well, bugs. They may make your software crash or misbehave in annoying but apparently harmless ways. And when you fix that bug you've also fixed a security bug, but the ability to determine whether a bug is a security bug or not is one that involves deep magic and a fanatical devotion to the cause so given the choice between maybe asking for a CVE and dealing with embargoes and all that crap when perhaps you've actually only fixed a bug that makes the letter "E" appear in places it shouldn't and not one that allows the complete destruction of your intergalactic invasion fleet means people will tend to err on the side of "Eh fuckit" and go drinking instead. So new versions of software will often fix security vulnerabilities without there being any indication that they do so[3], and running old versions probably means you have a bunch of security issues that nobody will ever do anything about.

But that's broadly a technical problem and one we can apply various metrics to, and if somebody wanted to spend enough time performing careful analysis of software we could have actual numbers to figure out whether the better security approach is to upgrade or to backport fixes. Conversations become boring once we introduce too many numbers, so let's ignore that problem and go onto the second, which is far more handwavy and social and so significantly more interesting.

The second problem is that upstream developers remain associated with the software shipped by Debian. Even though Debian includes a tool for reporting bugs against packages included in Debian, some users will ignore that and go straight to the upstream developers. Those upstream developers then have to spend at least 15 or so seconds telling the user that the bug they're seeing has been fixed for some time, and then figure out how to explain that no sorry they can't make Debian include a fixed version because that's not how things work. Worst case, the stable release of Debian ends up including a bug that makes software just basically not work at all and everybody who uses it assumes that the upstream author is brutally incompetent, and they end up quitting the software industry and I don't know running a nightclub or something.

From the Debian side of things, the straightforward solution is to make it more obvious that users should file bugs with Debian and not bother the upstream authors. This doesn't solve the problem of damaged reputation, and nor does it entirely solve the problem of users contacting upstream developers. If a bug is filed with Debian and doesn't get fixed in a timely manner, it's hardly surprising that users will end up going upstream. The Debian bugs list for XScreenSaver does not make terribly attractive reading.

So, coming back to the title for this entry. The most obvious failure of the commons is where a basically malicious actor consumes while giving nothing back, but if an actor with good intentions ends up consuming more than they contribute that may still be a problem. An upstream author releases a piece of software under a free license. Debian distributes this to users. Debian's policies result in the upstream author having to do more work. What does the upstream author get out of this exchange? In an ideal world, plenty. The author's software is made available to more people. A larger set of developers is willing to work on making improvements to the software. In a less ideal world, rather less. The author has to deal with bug mail about already fixed bugs. The author's reputation may be harmed by user exposure to said fixed bugs. The author may get less in the way of useful bug fixes or features because people are running old versions rather than fixing new ones. If the balance tips towards the latter, the author's decision to release their software under a free license has made their life more difficult.

Most discussions about Debian's policies entirely ignore the latter scenario, focusing more on the fact that the author chose to release their software under a free license to begin with. If the author is unwilling to handle the consequences of that, goes the argument, why did they do it in the first place? The unfortunate logical conclusion to that argument is that the author realises that they made a huge mistake and never does so again, and woo uh oops.

The irony here is that one of Debian's foundational documents, the Debian Free Software Guidelines, makes allowances for this. Section 4 allows for distribution of software in Debian even if the author insists that modified versions[4] are renamed. This allows for an author to make a choice - allow themselves to be associated with the Debian version of their work and increase (a) their userbase and (b) their support load, or try to distinguish what Debian ship from their identity. But that document was ratified in 1997 and people haven't really spent much time since then thinking about why it says what it does, and so this tradeoff is rarely considered.

Free software doesn't benefit from distributions antagonising their upstreams, even if said upstream is a cranky nightclub owner. Debian's users are Debian's highest priority, but those users are going to suffer if developers decide that not using free licenses improves their quality of life. Kneejerk reactions around specific instances aren't helpful, but now is probably a good time to start thinking about what value Debian bring to its upstream authors and how that can be increased. Failing to do so doesn't serve users, Debian itself or the free software community as a whole.

[1] The X server has no fundamental concept of a screen lock. This is implemented by an application asking that the X server send all keyboard and mouse input to it rather than to any other application, and then that application creating a window that fills the screen. Due to some hilarious design decisions, opening a pop-up menu in an application prevents any other application from being able to grab input and so it is impossible for the screensaver to activate if you open a menu and then walk away from your computer. This is merely the most obvious problem - there are others that are more subtle and more infuriating. The only fix in this case is to nuke the site from orbit.

[2] There's screenshots here. My favourites are the one that emulate the electrical characteristics of an old CRT in order to present a more realistic depiction of the output of an Apple 2 and the one that includes a complete 6502 emulator.

[3] And obviously new versions of software will often also introduce new security vulnerabilities without there being any indication that they do so, because who would ever put that in their changelog. But the less ethically challenged members of the security community are more likely to be looking at new versions of software than ones released three years ago, so you're probably still tending towards winning overall

[4] There's a perfectly reasonable argument that all packages distributed by Debian are modified in some way
Trusted Platform Modules are fairly unintelligent devices. They can do some crypto, but they don't have any ability to directly monitor the state of the system they're attached to. This is worked around by having each stage of the boot process "measure" state into registers (Platform Configuration Registers, or PCRs) in the TPM by taking the SHA1 of the next boot component and performing an extend operation. Extend works like this:

New PCR value = SHA1(current value||new hash)

ie, the TPM takes the current contents of the PCR (a 20-byte register), concatenates the new SHA1 to the end of that in order to obtain a 40-byte value, takes the SHA1 of this 40-byte value to obtain a 20-byte hash and sets the PCR value to this. This has a couple of interesting properties:
  • You can't directly modify the contents of the PCR. In order to obtain a specific value, you need to perform the same set of writes in the same order. If you replace the trusted bootloader with an untrusted one that runs arbitrary code, you can't rewrite the PCR to cover up that fact
  • The PCR value is predictable and can be reconstructed by replaying the same series of operations
But how do we know what those operations were? We control the bootloader and the kernel and we know what extend operations they performed, so that much is easy. But the firmware itself will have performed some number of operations (the firmware itself is measured, as is the firmware configuration, and certain aspects of the boot process that aren't in our control may also be measured) and we may not be able to reconstruct those from scratch.

Thankfully we have more than just the final PCR data. The firmware provides an interface to log each extend operation, and you can read the event log in /sys/kernel/security/tpm0/binary_bios_measurements. You can pull information out of that log and use it to reconstruct the writes the firmware made. Merge those with the writes you performed and you should be able to reconstruct the final TPM state. Hurrah!

The problem is that a lot of what you want to measure into the TPM may vary between machines or change in response to configuration changes or system updates. If you measure every module that grub loads, and if grub changes the order that it loads modules in, you also need to update your calculations of the end result. Thankfully there's a way around this - rather than making policy decisions based on the final TPM value, just use the final TPM value to ensure that the log is valid. If you extract each hash value from the log and simulate an extend operation, you should end up with the same value as is present in the TPM. If so, you know that the log is valid. At that point you can examine individual log entries without having to care about the order that they occurred in, which makes writing your policy significantly easier.

But there's another source of fragility. Imagine that you're measuring every command executed by grub (as is the case in the CoreOS grub). You want to ensure that no inappropriate commands have been run (such as ones that would allow you to modify the loaded kernel after it's been measured), but you also want to permit certain variations - for instance, you might have a primary root filesystem and a fallback root filesystem, and you're ok with either being passed as a kernel argument. One approach would be to write two lines of policy, but there's an even more flexible approach. If the bootloader logs the entire command into the event log, when replaying the log we can verify that the event description hashes to the value that was passed to the TPM. If it does, rather than testing against an explicit hash value, we can examine the string itself. If the event description matches a regular expression provided by the policy then we're good.

This approach makes it possible to write TPM policies that are resistant to changes in ordering and permit fine-grained definition of acceptable values, and which can cleanly separate out local policy, generated policy values and values that are provided by the firmware. The split between machine-specific policy and OS policy allows for the static machine-specific policy to be merged with OS-provided policy, making remote attestation viable even over automated system upgrades.

We've integrated an implementation of this kind of policy into the TPM support code we'd like to integrate into Kubernetes, and CoreOS will soon be generating known-good hashes at image build time. The combination of these means that people using Distributed Trusted Computing under Tectonic will be able to validate the state of their systems with nothing more than a minimal machine-specific policy description.

The support code for all of this should also start making it into other distributions in the near future (the grub code is already in Fedora 24), so with luck we can define a cross-distribution policy format and make it straightforward to handle this in a consistent way even in hetrogenous operating system environments. Remote attestation is a powerful tool for ensuring that your systems are in a valid state, but the difficulty of policy management has been a significant factor in making it difficult for people to deploy in their data centres. Making it easier for people to shield themselves against low-level boot attacks is a big step forward in improving the security of distributed workloads and makes bare-metal hosting a much more viable proposition.
I'm in London for Kubecon right now, and the hotel I'm staying at has decided that light switches are unfashionable and replaced them with a series of Android tablets.
A tablet displaying the text UK_bathroom isn't responding. Do you want to close it?
One was embedded in the wall, but the two next to the bed had convenient looking ethernet cables plugged into the wall. So.

I managed to borrow a couple of USB ethernet adapters, set up a transparent bridge (brctl addbr br0; brctl addif br0 enp0s20f0u1; brctl addif br0 enp0s20f0u2; ifconfig br0 up) and then stuck my laptop between the tablet and the wall. tcpdump -i br0 showed traffic, and wireshark revealed that it was Modbus over TCP. Modbus is a pretty trivial protocol, and notably has no authentication whatsoever. tcpdump showed that traffic was being sent to 172.16.207.14, and pymodbus let me start controlling my lights, turning the TV on and off and even making my curtains open and close. What fun!

And then I noticed something. My room number is 714. The IP address I was communicating with was 172.16.207.14. They wouldn't, would they?

I mean yes obviously they would.

It's basically as bad as it could be - once I'd figured out the gateway, I could access the control systems on every floor and query other rooms to figure out whether the lights were on or not, which strongly implies that I could control them as well. Jesus Molina talked about doing this kind of thing a couple of years ago, so it's not some kind of one-off - instead, hotels are happily deploying systems with no meaningful security, and the outcome of sending a constant stream of "Set room lights to full" and "Open curtain" commands at 3AM seems fairly predictable.

We're doomed.

(edited: this previously claimed I could only access systems on my own floor, but it turns out that each floor is a separate broadcast domain and I just needed to set a gateway to access the others)

(further edit: I'm deliberately not naming the hotel. They were receptive to my feedback and promised to do something about the issue.)
I maintain an application for bridging various non-Hue lighting systems to something that looks enough like a Hue that an Amazon Echo will still control them. One thing I hadn't really worked on was colour support, so I picked up some cheap bulbs and a bridge. The kit is badged as an iSuper iRainbow001, and it's terrible.

Things seemed promising enough at first, although the bulbs were alarmingly heavy (there's a significant chunk of heatsink built into them, which seems to get a lot warmer than I'd expect from something that claims a 7W power consumption). The app was a bit clunky, but eh - I wasn't planning on using it for long. I pressed the button on the bridge, launched the app and could control the bulbs. The first thing I noticed was that they had a separate "white" and "colour" mode. White mode was pretty bright, but colour mode massively less so - presumably the white LEDs are entirely independent of the RGB ones, and much higher intensity. Still, potentially useful as mood lighting.

Anyway. Next step was to start playing with the protocol, which meant finding the device on my network. I checked anything that had picked up a DHCP lease recently and nmapped them. The OS detection reported Linux, which wasn't hugely surprising - there was no GPL notice or source code included with the box, but I'm way past the point of shock at that. It also reported that there was a telnet daemon running. I connected and got a login prompt. And then I typed admin as the username and admin as the password and got a root prompt. So, there's that. The copy of Busybox included even came with tftp, so it was easy to get copies of tcpdump and strace on there to see what was up.

So. Next step. Protocol sniffing. I wanted to know how discovery worked, so reset the device to factory and watched what happens. The app on my phone sent out a discovery packet on UDP port 18602 which looked like this:

INLAN:CLIP:23.21.212.48:CLPT:22345:MAC:02:00:00:00:00:00

The CLIP and CLPT fields refer to the cloud server that allows for management when you're not on the local network. The mac field contains an utterly fake address. If you send out a discovery packet and your mac hasn't been registered with the device, you get a denial back. If your mac has been (and by "mac" here I mean "utterly fake mac that's the same for all devices"), you get back a response including the device serial number and IP address. And if you just leave out the mac field entirely, you get back a response no matter whether your address is registered or not. So, that's a start. Once you've registered one copy of the app with the device, anything can communicate with it by just using the same fake mac in the discovery packets.

Onwards. The command format turns out to be simple. They start ##, are followed by two ascii digits encoding a command, four ascii digits containing a bulb id, two ascii digits containing the number of following bytes and then the command data (in ascii). An example is:

##05010002ff

which decodes as command 5 (set white intensity) on bulb 1 with two bytes of data following, each of which is an f. This translates as "Set bulb 1 to full white intensity". I worked out the rest pretty quickly - command 03 sets the RGB colour of the bulb, 0A asks the bridge to search for new bulbs, 0B tells you which bulbs are available and their capabilities and 0E gives you the MAC addresses of the bulbs(‽). 0C crashes the server process, and 06 spews a bunch of garbage back at you and potentially crashes the bulb in a hilarious way that involves it flickering at about 15Hz. It turns out that 06 is actually the "Rename bulb" command, and if you send it less data than it's expecting something goes hilariously wrong in string parsing and everything is awful.

Ok. Easy enough, but not hugely exciting. What about the remote protocol? This turns out to involve sending a login packet and then a wrapped command packet. The login has some length data, a header of "MQIsdp", a long bunch of ascii-encoded hex, a username and a password.

The username is w13733 and the password is gbigbi01. These are hardcoded in the app. The ascii-encoded hex can be replaced with 0s and the login succeeds anyway.

Ok. What about the wrapping on the command? The login never indicated which device we wanted to talk to, so presumably there's some sort of protection going on here oh wait. The command packet is a length, the serial number of the bridge and then a normal command packet. As long as you know the serial number of the device (which it tells you in response to a discovery packet, even if you're unauthenticated), you can use the cloud service to send arbitrary commands to the device (including the one that crashes the service). One of which involves the device then doing some kind of string parsing that doesn't appear to work that well. What could possibly go wrong?

Ok, so that seemed to be the entirety of the network protocol. Anything else to do? Some poking around on the bridge revealed (a) that it had an active wireless device and (b) a running DHCP server. They wouldn't, would they?

Yes. Yes, they would.

The devices all have a hardcoded SSID of "iRainbow", although they don't broadcast it. There's no security - anybody can associate. It'll then hand out an IP address. It's running telnetd on that interface as well. You can bounce through there to the owner's internal network.

So, in summary: it's a device that infringes my copyright, gives you root access in response to trivial credentials, has access control that depends entirely on nobody ever looking at the packets, is sufficiently poorly implemented that you can crash both it and the bulbs, has a cloud access protocol that has no security whatsoever and also acts as an easy mechanism for people to circumvent your network security. This may be the single worst device I've ever bought.

I called the manufacturer and they insisted that the device was designed in 2012, was no longer manufactured or supported, that they had no source code to give me and there would be no security fixes. The vendor wants me to pay for shipping back to them and reserves the right to deduct the cost of the original "free" shipping from the refund. Everything is awful, which is why I just ordered four more random bulbs to play with over the weekend.
The US Government is attempting to force Apple to build a signed image that can be flashed onto an iPhone used by one of the San Bernardino shooters. To their credit, Apple have pushed back against this - there's an explanation of why doing so would be dangerous here. But what's noteworthy is that Apple are arguing that they shouldn't do this, not that they can't do this - Apple (and many other phone manufacturers) have designed their phones such that they can replace the firmware with anything they want.

In order to prevent unauthorised firmware being installed on a device, Apple (and most other vendors) verify that any firmware updates are signed with a trusted key. The FBI don't have access to Apple's firmware signing keys, and as a result they're unable to simply replace the software themselves. That's why they're asking Apple to build a new firmware image, sign it with their private key and provide it to the FBI.

But what do we mean by "unauthorised firmware"? In this case, it's "not authorised by Apple" - Apple can sign whatever they want, and your iPhone will happily accept that update. As owner of the device, there's no way for you to reconfigure it such that it will accept your updates. And, perhaps worse, there's no way to reconfigure it such that it will reject Apple's.

I've previously written about how it's possible to reconfigure a subset of Android devices so that they trust your images and nobody else's. Any attempt to update the phone using the Google-provided image will fail - instead, they must be re-signed using the keys that were installed in the device. No matter what legal mechanisms were used against them, Google would be unable to produce a signed firmware image that could be installed on the device without your consent. The mechanism I proposed is complicated and annoying, but this could be integrated into the standard vendor update process such that you simply type a password to unlock a key for re-signing.

Why's this important? Sure, in this case the government is attempting to obtain the contents of a phone that belonged to an actual terrorist. But not all cases governments bring will be as legitimate, and not all manufacturers are Apple. Governments will request that manufacturers build new firmware that allows them to monitor the behaviour of activists. They'll attempt to obtain signing keys and use them directly to build backdoors that let them obtain messages sent to journalists. They'll be able to reflash phones to plant evidence to discredit opposition politicians.

We can't rely on Apple to fight every case - if it becomes politically or financially expedient for them to do so, they may well change their policy. And we can't rely on the US government only seeking to obtain this kind of backdoor in clear-cut cases - there's a risk that these techniques will be used against innocent people. The only way for Apple (and all other phone manufacturers) to protect users is to allow users to remove Apple's validation keys and substitute their own. If Apple genuinely value user privacy over Apple's control of a device, it shouldn't be a difficult decision to make.
I had no access to the internet for most of my childhood. Nobody in my area knew anything about programming. I learned a great deal from a number of CDs that included free software source code and archives of project mailing lists. When I got to university, I learned even more by being able to develop a Debian-based OS for use in our computer facilities. That gave me the experience and knowledge that I needed to become involved in Debian development, which in turn gave me the background required to be able to help Ubuntu become the first free software operating system to work out of the box on modern laptops. From there, I've been able to build my career around developing free software.

Ubuntu can be translated as "I am who I am because of who we all are". I am who I am because people made the choice to release their software under licenses that permitted examination, modification and redistribution. I am who I am because I was able to participate in communities that took advantages of those freedoms to produce new and better software. I am who I am because when my priorities differed from those of existing communities, it was still possible for me to benefit from their work and for them to benefit from mine.

Free software doesn't mean that the software is entirely free of restrictions. While a core aspect is the right to distribute modified versions of code, it has never been fundamental to free software that you be able to do so while still claiming that the code is the original version. Various approaches have been taken to make it possible for users to distinguish modified versions, ranging from simply including license terms that require modified versions be marked as such, to licenses that require that you change the name of the package if you modify it. However, what's probably the most effective approach has been to apply trademark law to the problem. Mozilla's trademark policy is an example of this - if you modify the code in ways that aren't approved by Mozilla, you aren't entitled to use the trademarks.

A requirement that you avoid use of trademarks in an infringing way is reasonable. Mozilla products include support for building with branding disabled, which makes it very straightforward for a user to build a modified version of Firefox that can be redistributed without any trademark issues. Red Hat have a similar policy for Fedora and RHEL[1] - you simply replace the packages that contain the branding and you're done.

Canonical's IP policy around Ubuntu is fundamentally different. While Mozilla make it clear that you simply no longer have a right to use the trademarks under trademark law, Canonical appear to require that you remove all trademarks entirely even if using them wouldn't be a violation of trademark law. While Mozilla restrict the redistribution of modified binaries that include their trademarks, Canonical insist that you rebuild everything even if the package doesn't contain any trademarks. And while Mozilla give you a single build option that creates binaries that conform with their trademark requirements, Canonical will refuse to tell you what you have to do.

When asked about this at SCALE earlier this year, Mark Shuttleworth claimed that Ubuntu's policy was consistent with that of other projects. This is inaccurate. Nobody else requires that you rebuild every package before you can redistribute it in a modified distribution - such a restriction is a violation of freedom 2 of the Free Software Definition, and as a result the binary distributions of Ubuntu are not free software. Nobody else refuses to discuss whether you're required to remove non-infringing trademarks in order to be able to redistribute. Nobody else responds to offers to make it easier for users to produce non-infringing derivatives with a flat refusal.

Mark claims that I'm only raising this issue because I work for a competitor and wish to harm Canonical. Nothing could be further from the truth. I began discussing this before working for my current employers - my previous employers had no meaningful market overlap with Canonical at all. The reason I care is because I care about free software. I care about people being able to derive new and interesting things from existing code. I care about a small team of people being able to take Ubuntu and make something better in the same way that Ubuntu did with Debian. I care about ensuring that users receive the freedom to do this without having to jump through a significant number of hoops in the process. Ubuntu has been a spectacularly successful vehicle for getting free software into the hands of users. Mark's generosity in funding this experiment has undoubtedly made the world a better place. Canonical employs a large number of talented developers writing high quality software, many of whom I'm fortunate enough to be able to call friends. And Canonical are squandering that by restricting the rights of their users and alienating the free software community.

I want others to be who they are because of my work and the work of all the others like me. Anything that makes that more difficult saddens me, and so I do what I can to fix it. I criticise Canonical's policies in the hope that we, as a community, can convince Canonical to agree that this kind of artificial barrier to modification hurts us more than it helps them. In many ways, Canonical remain one of our best hopes for broadening the reach of free software, and this is why it's unfortunate that they do so in a way that makes it more difficult for people to have the same experiences that I did.

[1] While it's easy to turn a trademark infringing version of RHEL into a non-infringing one, Red Hat don't provide publicly available binary packages for RHEL. If you get hold of them somehow you're entitled to redistribute them freely, but Red Hat's subscriber agreement indicates that if you do this as a Red Hat customer you will lose access to further binary updates - a provision that I find utterly repugnant. Its inclusion reduces my respect for Red Hat and my enthusiasm for working with them, and given the official Red Hat support for CentOS it appears to make no sense whatsoever. Red Hat should drop it.