2014-04-27

plaidCTF 2014 - wheeeee (crypto375)

By w3nk4w
Filed under plaidCTF2014 crypto python slide attack

For PlaidCTF2014, Eindbazen and fail0verflow joined forces as 0xffa, the Final Fail Alliance. Don't miss out on other write-ups at Eindbazen's site!

wheeeee
Crypto (375 pts)
----------------
Although it seems like The Plague's messaging service is secure, 
there are bound to be bugs in any 20th century crypto system. 
We've recovered a version of the block cipher The Plague implemented. 
Use their online encryptor tool, at 54.82.75.29:8193, to break the 
cipher and figure out Plague's secret plans. NOTE: When the service 
sends you a hex-encoded string, respond with a hex-encoded string.

The Oracle

To start off, we took a look at the “online encryptor tool”. Upon connecting to this “encryptor”, the encryptor replies:

We would first like some proof of work.
Send us a string starting with tXSPPWeGMrTzLMiX, of length 21, such 
that its sha1 sum ends in ffffff

With every new connection, the “starting with” string changes. Therefore, we needed to implement a realtime response to this rate-limiting challenge. Finding an input string such that the SHA1 hash ends in ffffff simply requires testing a number of random input strings prefixed with the challenge. On average, we expect to find a matching hash with 2²⁴ trials. We implemented this proof-of-work quickly with the following:

import sha, socket, re, itertools

def pow(base):
    letters = [' ', '!', '"', '#', '$', '%', '&', '\'', '(', ')',
        '*', '+', ',', '-', '.', '/', '0', '1', '2', '3', '4', '5',
        '6', '7', '8', '9', ':', ';', '<', '=', '>', '?', '@', 'A',
        'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
        'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y',
        'Z', '[', '\\', ']', '^', '_', '`', 'a', 'b', 'c', 'd', 'e',
        'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q',
        'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', '{', '|', '}',
        '~']

    for c in itertools.combinations_with_replacement(letters, 5):
        a = "".join(c)
        if sha.new(base+a).digest().endswith("\xff\xff\xff"):
            return base+a

After we submited the proof-of-work to the encryptor, the encryptor replied:

WHHEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
6a78375d974f47f84012f16acb17c69d012722c75ce62fd0406c8ca90494c68c84d637ca6195dd9e159c12f16a8a9ebbb11f7297eaf677cb038c1d72fb6159c2cc1239e0d08f86958d2821
Send your encryption string:

Right away, we noticed the large hex-encoded message following the “WHEEEE…” line. Could this be a message we had to decode to get the flag, or that would provide further information that would allow us to progress with the challenge? Guessing that this was probably an encrypted message or flag, we called this the “flag ciphertext”.

If we sent a hex-encoded bytestring to the “encryptor” followed by a newline, it would reply with a hex-encoded bytestring of equivalent length (for bytestrings with multiples-of-three length for reasons that will become obvious later). The contents, however, appeared random. Almost as if they were encrypted (which we expected, given the name of the tool!).

In cryptographic terminology, a service that allows an attacker to perform a cryptographic operation is called an “oracle”. So in proper terminology, this “encryptor” is actually an “encryption oracle”. With an “encryption oracle”, an attacker provides plaintext to the oracle and the oracle replies with encrypted ciphertext.

While querying the encryption oracle repeatedly over time, we noticed that the contents of the “flag ciphertext” changed at semirandom intervals. At the time of the change, the mapping of plaintext to ciphertext by the encryption oracle also changed. Given this, we suspected that the encryptor was changing keys for both the oracle and for the flag ciphertext.

The Block Cipher

After we examined the oracle, we turned our attention to the cipher itself. The cipher is a simple block cipher operating on a block size of 24 bits with a key size also of 24 bits. The block cipher is implemented by the encrypt_block function shown below.

def encrypt_block(plaintext, key):
    txt = plaintext
    l, r = (txt >> M) & ((1 << M) - 1), txt & ((1 << M) - 1)
    for x in xrange(numrounds):
    if x % 2 == 0:
        l1 = r
        r1 = l ^ F(r, key[0])
        l, r = l1, r1
    else:
        l1 = l
        r1 = l ^ F2(r, key[1])
        l, r = l1, r1
    return l << M | r

As with most block ciphers, the cipher is structured as repeating “rounds”. Each round combines a portion of the cipher state with the key and transforms the cipher state. This cipher is unusual in that it uses an extremely high round count of 2²⁴.

The cipher used by the plague has two separate round types: even and odd rounds. Both rounds use a nonlinear function F or F2 to mix key material into the cipher state. The F and F2 functions simply linearly combine a portion of the key with the cipher state, and pass the result through a nonlinear SBOX (f or f2) generated pseudorandomly. We show the structure of the cipher round pairs below.

{: .cipher }

As we expected to need to decrypt the “flag ciphertext” provided by the oracle, one of the first things we did was to invert the encrypt_block routine so as to make a decrypt_block routine. decrypt_block needs an inverted f2 SBOX, which can be trivially calculated. We show decrypt_block below. Note that the F function (and therefore the f sbox) is used in the forward direction (the same as encryption) due to the unusual structure of the cipher.

def invert(x):
    res = {}
    for k,v in x.iteritems():
        res[v] = k
    return res

f2_i = invert(f2)

def decrypt_block(ciphertext, key):
    txt = ciphertext
    l, r = (txt >> M) & ((1 << M) - 1), txt & ((1 << M) - 1)
    for x in xrange(numrounds - 1, -1, -2):

    r1 = l
    r1c = F(r1, key[0])
    r1d = f2_i[l^r] ^ key[1]
    l1 = r1c ^ r1d 
    l, r = l1, r1
    
    return l << M | r

Potential Attacks

Given the tiny keyspace of this cipher, a simple “brute-force” attack would seem to be the obvious attack. By testing all possible keys against a known plaintext/ciphertext pair (which we get from the oracle), we can determine which key was used to encipher the pair and therefore determine the key.

Brute force can even be used to decrypt the message with only the flag ciphertext, as long as the decrypted plaintext can be distinguished from incorrectly decrypted data (which is semirandom). This is a property that holds for all english-language plaintexts, and for some plaidCTF flags.

However, the plaidCTF organizers deter this simple brute force attack by using an extremely high (2²⁴) round count. We implemented an optimized brute-force attack (just in case), and we expected the brute-force to complete in roughly 20h, well before the end of the competition.

We can also use a “guess-and-check” attack where we submit suspected fragments of plaintext to the oracle, and see if they show up in the flag ciphertext. While this attack is viable (and indeed, other teams solved it via this method), this attack is tedious and time-consuming.

We can do better.

Slide Attacks

Fortunately, this cipher is trivially vulnerable to a slide attack. Slide attacks take advantage of weak key structures in iterated-round block ciphers (such as this one). The complexity of performing the slide attack is completely independent of the round count. We recommend reading up on the slide attack at the link above to understand what follows.

To perform a slide attack, the structure of the cipher needs to satisfy two properties:

The key-dependent parts of the cipher can be decomposed into repeated iterations of a keyed function R_k()
We can solve for ‘k’ (or for a set of potential k’s 𝕂), given (a, R_k(a) ).

Luckily, this cipher has both properties. Since we need a repeated identical R_k(), we can define R_k() as a pair of even/odd rounds. Therefore, the cipher is then made up of 2²³ applications of R_k to the plaintext (or the application of the inverse of R_k to the ciphertext).

We can solve for k given the inputs and outputs of R. The python snippet below solves for a set of potential k values given the input and output of R.

def known_plaintext(p0, c0):
    p_l, p_r = (p0 >> M) & ((1 << M) - 1), p0 & ((1 << M) - 1)
    c_l, c_r = (c0 >> M) & ((1 << M) - 1), c0 & ((1 << M) - 1)

    assert c_l == p_r

    s = set()
    for k1 in xrange(0,2**(K/2)):

    half_r = f2_i[p_r ^ c_r] ^ k1
    k0 = f_i[half_r^p_l] ^ p_r
    s.add((k0, k1)) 

    return s

Note that due to the structure of the cipher (see the diagram above), the 24 bits of the key only affect 12 bits of the cipher state during one application of R_k. This means that we solve for 2¹² potential keys (4096), given one pair of (a, R_k(a)).

These properties can be used to implement a slide attack if we can find a “slid pair”. A “slid pair” is one in which p₁ = R_k(p₀), and c₁ = R_k(c₀). (That is, if the cipher internal state is equivalent when the “R” function repetitions are offset by one). We use the notation p_x and c_x to refer to ciphertext and plaintext blocks, respectively.

To check if two blocks form a “slid pair”, we attempt to compute k, or set of possible k values (which we will call 𝕂) for (p₀,c₀) and (p₁,c₁). If an intersection 𝕂₀ ∩ 𝕂₁ exists, then (p₀,c₀),(p₁,c₁) may form a slid pair. We can verify that (p₀,c₀),(p₁,c₁) form a slid pair by checking if p0 encrypts into c0 with candidate key k.

If we determine that the candidate pair is in fact slid, we then have recovered the key for the trial block, and therefore for the oracle (and then we should be able to decrypt the message).

Obtaining Slid Pairs

With the basic slide attack, we would expect to find a slid pair with, on average, 2^N/2 random pairs, for a block size of 2^N. This relationship is due to the birthday bound.

Given an N (keysize) of 24 bits, we would expect to find a slid pair with around 2¹² queries. Unfortunately, this attack isn’t good enough for the provided oracle. With the POW script described above, we find a proof-of-work on average in 3-5 seconds, and under oracle loading conditions during the competition, the server was encrypting rougly 2 blocks per second. While we did not attempt to precisely measure oracle key-rotation times, key rotations were occuring frequently enough to make collecting 4k pairs impossible (during trials we got 600-1000 queries, which only have a 1-3% chance of containing a slid pair).

Fortunately, the weak structure of the cipher allows us to improve our birthday attack. Note that for one iteration of l’,r’ = R_k(l,r), l’ = r. (If this doesn’t make sense, see the diagram above). That is, we can predict one half of the cipher state after one iteration, regardless of of the cipher key. This weakness reduces the attack complexity to order 2^(N/4).

We can then partition our birthday attack into two classes of blocks between which we will attempt to find a slid pair. We fix (alternately) the l and r inputs to a constant values, which means that we have reduced the space in which we are trying to find a collision to 12 bits. Note that this makes our attack a Chosen-Plaintext, as compared to Known-Plaintext.

For 50% probability, we will then need to collect 77 each of the pre- and post-slide plaintext/ciphertext blocks. Note that this is 144 blocks in total, or twice what would be expected from the basic birthday bound. As the collision occurs between two different classes, this doubles the number of blocks we need.

In practice, we collected 256 blocks in total (128 from each class), which provided an 87% chance of success per collection (and if it didn’t succeed the first time, we’d simply collect more blocks).

Putting it all together was done with the following snippet (the hardcoded encstr corresponds to the hex-encoded flag ciphertext present when we gathered the blocks):

done = 0
trial = 0
for p0,c0 in blocks_0:
    for p1, c1 in blocks_1:
    trials += 1
    key = test_slid(p0, c0, p1, c1)
    if s:
        print key
        print "x found at trial %d" % trials
        done=1
        break
    if done:
    break
    
    encstr = "5a8a50ae67d8068e4e50793c60fa909ce4a5866e7d2f5c563e0633ddf66663fb56883327c7228a9e6eba50793c0881795ef6b2b355f4ae251619ff3e5cb5865fcb51d051624f9b6daeb469"

    encb = encstr.decode("hex")
    print decrypt(encb, FOUND_KEY)

yielding to the decrypted string:

Gotta love it when you can SLIDE. The flage is id_almost_rather_be_sledding

Giving us the flag of id_almost_rather_be_sledding