Megafail

Let’s take a break from Wii U hacking to take a quick look at Mega’s security.

In case you’ve been living under a rock the past few days, Kim Dotcom (of Megaupload infamy) has launched his new cloud storage site, Mega. Mega has an impressive sales pitch, promising secure cloud storage where only the user has the key to decrypt his or her files, and the encryption and decryption happen securely in the browser.

Today we aren’t going to take a look at their encryption or their key generation, which have already been the subject of several articles. Instead, we’re going to look at the security of the Mega website itself. As Mega themselves admit, if you use their web interface (and not a third-party client), the security of the entire ordeal depends on whether you trust them. After all, anyone with the ability to modify the site could just replace the JavaScript code with a version that sends them (or anyone else) your password or master key. There’s no way around having to trust Mega for this, but you also have to trust that Mega’s site is delivered securely to you.

The standard solution to this problem is to use a strong form of SSL. However, Mega chose an interesting approach to SSL. Instead of serving the entire site from a single secure server or group of servers using strong SSL, they came up with a clever scheme to allow them to serve most of their site insecurely. Mega’s main index.html is hosted on a secure server using SSL with 2048-bit RSA. However, everything else is loaded dynamically by JavaScript code in index.html and hash-checked. This additional content comes from a CDN that uses weaker 1024-bit SSL with MD5 authentication. The CDN servers are third-party servers, and thus potentially easy for an attacker to compromise. Without further protection, you would have to trust the entire CDN and also trust that nobody has broken 1024-bit SSL yet (which is known to be weak by modern standards). In order to solve this problem, Mega hashes all of the additional content, and stores the hashes in index.html. This creates a chain of trust, or as they put it, “secure boot for websites”. Clever.
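To make the chain-of-trust idea concrete, here is a minimal sketch of a hash-checked loader for Node.js. It is not Mega’s code: the names buildManifest and verifyResource are hypothetical, and SHA-256 from Node’s crypto module is used purely as a stand-in hash. The trusted page carries a table of expected hashes, and anything fetched from the untrusted CDN is rejected unless its hash matches.

```javascript
const crypto = require('crypto');

// Stand-in hash (an assumption for this sketch): SHA-256 via Node's crypto module.
function sha256Hex(content) {
  return crypto.createHash('sha256').update(content, 'utf8').digest('hex');
}

// Build the table of expected hashes that would be embedded in the
// trusted index.html (hypothetical helper).
function buildManifest(resources) {
  const manifest = {};
  for (const [name, content] of Object.entries(resources)) {
    manifest[name] = sha256Hex(content);
  }
  return manifest;
}

// Accept content from the untrusted CDN only if its hash matches the
// value stored in the trusted manifest (hypothetical helper).
function verifyResource(manifest, name, content) {
  if (!Object.prototype.hasOwnProperty.call(manifest, name)) {
    throw new Error('unknown resource: ' + name);
  }
  if (sha256Hex(content) !== manifest[name]) {
    throw new Error('hash mismatch for ' + name);
  }
  return content; // only now would it be safe to evaluate
}
```

The whole scheme stands or falls with the hash function: if an attacker can produce different content with the same hash, verifyResource happily accepts it.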

There’s nothing inherently wrong with this idea. However, security designs are only as secure as their implementation. Let’s look at Mega’s “web secure boot” implementation.

At the time of this writing, their code stores a hash for each file (a few hours ago they stored only a combined hash, which was problematic since it was derived from a concatenation of all inputs, which might allow an attacker to exploit the system by changing the boundary between files). Each resource is fetched using AJAX, hashed, and then loaded. The hashing is performed using the following function:

function h(s)
{
	var a = [0,0,0,0];
	var aes = new sjcl.cipher.aes([111111,222222,333333,444444]);
	s += Array(16).join('X');

	for (var i = s.length & -16; i--; )
	{
		a[(i>>2)&3] ^= s.charCodeAt(i)<<((7-i&3)<<3);
		if (!(i&15)) a = aes.encrypt(a);
	}
	return a;
}
(Indentation corrected for sanity)

This is a straightforward implementation of CBC-MAC authentication using the AES-128 block cipher and a fixed key (111111,222222,333333,444444, expressed as four 32-bit integer quarters), taking the input backwards. At this point, readers with a modicum of experience with the requirements for a secure hash function are invited to analyze the construction of the above function and see if they can spot the issue, while readers less versed in security are invited to read the Wikipedia article on CBC-MAC and spot the dire warning. (The 7 also looks like a typo for 3, but it is harmless: in JavaScript, subtraction binds tighter than &, so 7-i&3 parses as (7-i)&3, which is always between 0 and 3 and equals 3-(i&3). The shift count never actually overflows, so it doesn’t break in practice.)

In short, CBC-MAC is a Message Authentication Code, not a strong hash function. While MACs can be built out of hash functions (e.g. HMAC), and hash functions can be built out of block ciphers like AES (e.g. using the Davies–Meyer construction), not all MACs are also hash functions. CBC-MAC in particular is completely unsuitable for use as a hash function, because it only allows two parties with knowledge of a particular secret key to authenticate messages to each other. Anyone with knowledge of that key can forge the messages in a way that keeps the MAC (“hash value”) the same. All you have to do is run the forged message through CBC-MAC as usual, then use the AES decryption operation on the original hash value to find the last intermediate state. XORing this state with the CBC-MAC for the forged message yields a new block of data which, when appended to the forged message, will cause it to have the original hash value. Because the input is taken backwards, you can either modify the first block of the file, or just run the hash function backwards until you reach the block that you want to modify. You can make a forged file pass the hash check as long as you can modify an arbitrary aligned 16-byte block in it.
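The forgery can be sketched in a few lines of Node.js. This is my own byte-oriented port, not Mega’s code and not the demo’s source: it assumes sjcl’s AES is standard AES-128 operating on big-endian 32-bit words, it works on raw bytes rather than JavaScript characters, and the helper names (aesBlock, pad, absorb, forge, toWords) are made up for illustration. Since the input is hashed backwards, the fix-up block goes at the start of the file.

```javascript
const crypto = require('crypto');

// The fixed key from h(), written out as four big-endian 32-bit words
// (assumption: this matches how sjcl interprets the array).
const KEY = Buffer.alloc(16);
[111111, 222222, 333333, 444444].forEach((w, i) => KEY.writeUInt32BE(w, i * 4));

// Encrypt or decrypt a single 16-byte block with AES-128 (ECB, no padding).
function aesBlock(block, decrypt) {
  const c = decrypt
    ? crypto.createDecipheriv('aes-128-ecb', KEY, null)
    : crypto.createCipheriv('aes-128-ecb', KEY, null);
  c.setAutoPadding(false);
  return Buffer.concat([c.update(block), c.final()]);
}

// Pad with 'X' up to the next 16-byte boundary, which is the net effect
// of h()'s "append 15 Xs, then round the length down".
function pad(data) {
  const n = Math.ceil(data.length / 16) * 16;
  return Buffer.concat([data, Buffer.alloc(n - data.length, 'X')]);
}

// Backwards CBC-MAC over block-aligned data: last block first.
function absorb(data) {
  let state = Buffer.alloc(16);
  for (let off = data.length - 16; off >= 0; off -= 16) {
    const x = Buffer.alloc(16);
    for (let j = 0; j < 16; j++) x[j] = state[j] ^ data[off + j];
    state = aesBlock(x, false);
  }
  return state;
}

// Byte-oriented equivalent of h().
function h(data) {
  return absorb(pad(data));
}

// Show a 16-byte hash as four signed 32-bit integers, like index.html does.
function toWords(hash) {
  return [0, 4, 8, 12].map((o) => hash.readInt32BE(o));
}

// Build a file consisting of a 16-byte fix-up block followed by `evil`
// that hashes to `target`.
function forge(evil, target) {
  const tail = pad(evil);
  const state = absorb(tail);             // state just before the first block
  const wanted = aesBlock(target, true);  // input that encrypts to the target
  const fixup = Buffer.alloc(16);
  for (let j = 0; j < 16; j++) fixup[j] = wanted[j] ^ state[j];
  return Buffer.concat([fixup, tail]);
}

const original = Buffer.from('function legit() { return 42; }\n');
const forged = forge(Buffer.from('alert("pwned");'), h(original));
console.log(toWords(h(original)), toWords(h(forged)));
```

The two printed hash values are identical by construction. The fix-up block is 16 essentially random bytes, so in a real JavaScript file it would have to land somewhere the parser tolerates garbage; the sketch does not attempt that.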

Try it out with this proof of concept web demo (and feel free to view its source code, now commented). First, visit Mega, view the source for index.html, and download any of the resource files listed there. For example, you can try this JavaScript file. At the time of this writing, the hash value of that file as listed in index.html was as follows:

[2142146975,1426300354,-1192167238,529939563]

Then select it here to compute its hash value (note: this demo requires a modern browser that implements the HTML5 File API):


Now, type your forged content into the following text box and click the button to download your forged file:


You can then select the forged file in the file picker above again, to verify that it still has the same hash. If you were hosting one of Mega’s CDN nodes (or you were a government official in the CDN host’s jurisdiction), you could now take over Mega and steal users’ encryption keys. While Mega’s sales pitch is impressive, and their ideas are interesting, the implementation suffers from fatal flaws. This casts serious doubt on their entire operation and the competence of those behind it.

Kim Dotcom pulled a Nintendo Wii. Mega has decent design ideas, but it has been poorly implemented by people clearly unfamiliar with basic cryptography. Using CBC-MAC as a hash function is worse than using strncmp to compare binary SHA-1 values (which was the hole behind the venerable Wii fakesigning exploit). The people who botched the Wii’s hash check also wrote a “secure” kernel riddled with exploits. Do you trust web developers who can’t tell a MAC from a strong hash to write a secure cloud storage site? You decide.

Update: a few people have asked what the correct approach would’ve been here. The straightforward choice would’ve been to use SHA-1. Even MD5 would’ve worked, though (the current attacks against MD5 wouldn’t work in this scenario, as there is no practical second preimage attack for MD5). SHA-256 would’ve been the right choice for the more paranoid. If the goal was, as I suspect, to use the same AES core for everything, then a Davies–Meyer construction around AES (which is about the same amount of code as the CBC-MAC is) would’ve worked too, although they should also fix the padding: the “XXX” thing isn’t secure, as you can append any number of Xs up to the end of the last block of the file without changing the hash. Something like “YX…X” wouldn’t have this problem, although the right way to do it is to use a proper Merkle-Damgård construction. SHA-1 is faster, though, because when using a block cipher as a hash you have to rerun the key schedule for every block.
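For the curious, here is roughly what a Davies–Meyer hash over AES-128 with Merkle-Damgård padding could look like in Node.js. Treat it as an illustration of the construction only, not a vetted design: the all-zero initial value, the exact padding layout (a 0x80 byte, zeros, then a 64-bit big-endian bit length) and the names mdPad and daviesMeyer are my assumptions. Each message block is used as the AES key, which is why the key schedule has to be rerun for every block.

```javascript
const crypto = require('crypto');

// Merkle-Damgård strengthening (assumed layout): 0x80, zero bytes, then
// the message length in bits as a 64-bit big-endian integer, so that the
// total is a multiple of the 16-byte block size.
function mdPad(data) {
  const padLen = (16 - ((data.length + 1 + 8) % 16)) % 16;
  const lenBuf = Buffer.alloc(8);
  lenBuf.writeBigUInt64BE(BigInt(data.length) * 8n);
  return Buffer.concat([data, Buffer.from([0x80]), Buffer.alloc(padLen), lenBuf]);
}

// Davies-Meyer compression: H_i = E_{m_i}(H_{i-1}) XOR H_{i-1},
// where the message block m_i is the AES key.
function daviesMeyer(data) {
  const padded = mdPad(data);
  let state = Buffer.alloc(16); // assumed all-zero initial value
  for (let off = 0; off < padded.length; off += 16) {
    const key = padded.subarray(off, off + 16);
    const c = crypto.createCipheriv('aes-128-ecb', key, null);
    c.setAutoPadding(false);
    const enc = Buffer.concat([c.update(state), c.final()]);
    const next = Buffer.alloc(16);
    for (let j = 0; j < 16; j++) next[j] = enc[j] ^ state[j];
    state = next;
  }
  return state;
}
```

Because the length is part of the padding, “abc” and “abcX” no longer share a padded form. Keep in mind that a 128-bit output is small for a general-purpose hash; here the only job is to resist second preimages against known files.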

I suspect there is also a much subtler bug involving Unicode, because the files are hashed characterwise (not bytewise), but the character code of Unicode characters can exceed 255. Under these circumstances the XOR will do the wrong thing and mix together adjacent characters. I suspect that given some careful work you could create an exploit against the Mega resources that would work even if the construction were changed to Davies–Meyer, by tweaking adjacent pairs of characters such that they perform the same XOR transform on the intermediate hash value after overlapping. The easy fix for this is to just bail out with a failure if any characters in the files are outside of the ASCII range, or to run the hash over the UTF-8 encoded data.
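The overlap is easy to see in isolation. The sketch below packs four characters into one 32-bit word using the same shift expression as h(); packWord and isAsciiOnly are hypothetical helper names, and this only demonstrates the mixing step, not a full exploit. A character with a code above 255 spills into the byte of its left-hand neighbour, so two different strings produce the same word before AES is ever involved.

```javascript
// Pack the first four characters of s into a 32-bit word exactly as the
// inner XOR in h() does (same shift expression, typo included).
function packWord(s) {
  let w = 0;
  for (let i = 0; i < 4; i++) {
    w ^= s.charCodeAt(i) << ((7 - i & 3) << 3);
  }
  return w;
}

// The simple guard suggested in the text: refuse anything outside ASCII.
function isAsciiOnly(s) {
  return /^[\x00-\x7f]*$/.test(s);
}

// 'A' is 0x41 and '\u0141' is 0x141: the extra 0x01 lands on top of the
// 'A' and turns it into 0x40 ('@'), with 0x41 left in the second byte.
console.log(packWord('A\u0141zz') === packWord('@Azz'));
console.log(isAsciiOnly('@Azz'), isAsciiOnly('A\u0141zz'));
```

With the guard in place the second string would be rejected outright instead of being hashed as if it were the first.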

Update (2013-01-24): Mega has now switched to using SHA-256. They get points for fixing it quickly, but I wonder what other subtle or not-so-subtle security problems remain.