2014-04-23

plaidCTF 2014 - g++ (re200)

By aDR4eA
Filed under plaidCTF2014 Crypto templates preprocessor horror

For PlaidCTF2014, Eindbazen and fail0verflow joined forces as 0xffa, the Final Fail Alliance. Don't miss out on other write-ups at Eindbazen's site!

Although it seems like The Plague's projects are  open source, it's not
quite so simple to figure out what the source code  does. We believe
this project is supposed to print out secret information, but the KEY
variable in the Makefile has been lost. Find the key, build the
project, get us the information.

Oh noes, the key is gone!

Let’s first take a look at the makefile:

KEY:=guess_a_key
.PHONY: all key.h

all: solveme.cpp key.h
  g++ -std=c++0x solveme.cpp -ftemplate-depth-1000000 -o solveme

key.h:
  echo '${KEY}' | perl -p -e 's/(.)/sprintf("K(%d,%d)\n",$$i++,ord($$1))/ge' > key.h

Uh-huh. Let’s first compile this thing, maybe the supplied key is already good?

$ time make
echo 'guess_a_key' | perl -p -e
's/(.)/sprintf("K(%d,%d)\n",$i++,ord($1))/ge' > key.h
g++ -std=c++0x solveme.cpp -ftemplate-depth-1000000 -o solveme

real    0m18.328s
user    0m20.660s
sys     0m0.736s

$ cat key.h
K(0,103)
K(1,117)
K(2,101)
K(3,115)
K(4,115)
K(5,95)
K(6,97)
K(7,95)
K(8,107)
K(9,101)
K(10,121)

$ ./solveme
Wrong

(Yeah this is a particularly slow machine.)

The magic here is that the validation happens at compile-time using templates. Let’s take a look:

template <int i>
struct key {
  S r = 0;
};

#define K(i,v) \
template <> \
struct key<i> { \
  S r = v; \
};

#include "key.h"

int main() {
  if (!vv<0>::r) {
    int i;
    char skey[] = {key<0>::r, key<1>::r, key<2>::r, key<3>::r, key<4>::r, key<5>::r, key<6>::r, key<7>::r, key<8>::r, key<9>::r, key<10>::r, key<11>::r, key<12>::r, key<13>::r, key<14>::r, key<15>::r};
    for (i = 0; i < 1<<(1<<(1<<1)); i++)
      std::cout << skey[i];
    std::cout << "\n";
  }
  else {
    std::cout << "Wrong\n";
  }
  return 0;
}

We can see that key<i>::r is the i’th byte of the key that was provisioned using the makefile. All other key bytes are 0 due to the unspecialized class key.

vv<0>::r apparently is false only if we entered a good key.

template <int n>
struct gg {
  S r = R((u<n,n|2>::r)) - r<(n>>2),((n)&3)>::rr;
};

template <int n>
struct vvv {
  S r = gg<n>::r|gg<n+1>::r|gg<n+2>::r|gg<n+3>::r;
};

template <int n>
struct vv {
  S r = vvv<0>::r|vvv<4>::r|vvv<8>::r|vvv<12>::r;
};

So - vv validates the full key, vvv validates a fourth of the key, and gg validates one sixteenth of the key. We can already see that in order for gg<i>::r == 0 (which must be true for i == [0..15], so apparently the first 16 characters of the key are validated - and printed), R((u<n,n|2>::r)) must be equal to r<(n>>2),((n)&3)>::rr.

Let’s see what R(x) is:

#define R(a) (m<(a),(QQ),(QQ)>::r)
#define QQ ((1<<(1<<3))|1)
template <int a, int b, int c>
struct m {
  S r = (a-((a/b)*b));
};

QQ is 0x101, and m<a,b>::r is a % b. This leaves u<>::r and r<>::rr.

If we follow struct key, we quickly find that only r<>::rr depends on the key, and u<>::r is static. So let’s precalc u % 0x101 for all values of i:

int s[] = {
  (m<((u<0,0|2>::r)),0x101,0x101>::r),
  (m<((u<1,1|2>::r)),0x101,0x101>::r),
  (m<((u<2,2|2>::r)),0x101,0x101>::r),
  (m<((u<3,3|2>::r)),0x101,0x101>::r),
  (m<((u<4,4|2>::r)),0x101,0x101>::r),
  (m<((u<5,5|2>::r)),0x101,0x101>::r),
  (m<((u<6,6|2>::r)),0x101,0x101>::r),
  (m<((u<7,7|2>::r)),0x101,0x101>::r),
  (m<((u<8,8|2>::r)),0x101,0x101>::r),
  (m<((u<9,9|2>::r)),0x101,0x101>::r),
  (m<((u<10,10|2>::r)),0x101,0x101>::r),
  (m<((u<11,11|2>::r)),0x101,0x101>::r),
  (m<((u<12,12|2>::r)),0x101,0x101>::r),
  (m<((u<13,13|2>::r)),0x101,0x101>::r),
  (m<((u<14,14|2>::r)),0x101,0x101>::r),
  (m<((u<15,15|2>::r)),0x101,0x101>::r)
};

r then depends on the key, and is defined as

template <int a, int b>
struct r {
  S rr = d<P(a,b)>::r;
};

which cpp expands to this:

template <int a, int b>
struct r {
  static const int rr =
    d<
      g<a*4>::r,
      g<a*4+1>::r,
      g<a*4+2>::r,
      g<a*4+3>::r,
      key<b>::r,
      key<b+4>::r,
      key<b+8>::r,
      key<b+12>::r
    >::r;
};

g<>:::r again is static and doesn’t depend on the key, so let’s also precalculate this:

int gs[] = {g<0>::r, g<1>::r, g<2>::r, g<3>::r, g<4>::r, g<5>::r, g<6>::r,
    g<7>::r, g<8>::r, g<9>::r, g<10>::r, g<11>::r, g<12>::r, g<13>::r, g<14>::r,
    g<15>::r};

We also see that no more than 4 bytes of the key are dependent, so we could easily brute force 4 groups of 2²⁸ possibilities (assuming an ASCII key).

But to brute force, we need to convert the templates to regular code, otherwise it would take 20s per attempt. So let’s dive into the d<>::r template (expanded and fail-indented):

template <int s, int t, int u, int v, int w, int x, int y, int z>
struct d {
  static const int r =
    (m<((
      n<(
        n<(
          n<z,v,z,0>::s,
          n<x,t,x,0>::s,
          n<z,v,z,0>::s,
          n<x,t,x,0>::s)>::q),
      (n<
        (n<y,u,y,0>::s),
        (n<w,s,w,0>::s),
        (n<y,u,y,0>::s),
        (n<w,s,w,0>::s)>::q),
      (n<
        (n<z,v,z,0>::s),
        (n<x,t,x,0>::s),
        (n<z,v,z,0>::s),
        (n<x,t,x,0>::s)>::q),
      (n<(n<y,u,y,0>::s),
        (n<w,s,w,0>::s),
        (n<y,u,y,0>::s),
        (n<w,s,w,0>::s)>::q)
      >::q))
    ,0x101,0x101>::r);
};

This is not so useful yet. What’s n<>?

template <int a, int b, int c, int d>
struct n {
  static const int q = n<a-1,(b % 0x101) - 1,c+1,((a+c)>>1) + (d % 0x101)>::q;
  static const int r = n<a-1,(b % 0x101) - 1,c+1,((a+c)>>1) + (d % 0x101)>::r;
  static const int s = n<a-1,(b % 0x101) - 1,c+1,((a+c)>>1) + (d % 0x101)>::s;
};

template <int a, int c, int d>
struct n<a,0,c,d> {
  static const int r = a;
  static const int q = c;
  static const int s = d;
};

Now n<a,b,c,d> is specialized for b==0. Also, n<a,b,c,d> depends on n again, but if instantiated with a positive integer b, it’s guaranteed to reach zero at some point, at which the recursion will terminate. We can see that d<> uses n<>::r, n<>::q and n<>::s, which is the a, c or d when the recursion finally ends due to b == 0. For each recursion, b is decremented once, so for each step, the following calculation is done:

a := a - 1
c := c + 1
d := d % 0x101 + (a + c) / 2

This allows us to derive:

n<a,b,c,d>::r == a-b
n<a,b,c,d>::q == c+b.

Now n<a,b,c,d>::s is very complicated math (like CRC) - since a is decremented when c is incremented, (a+c)/2 is static, and added to d in each recursion step, so the result is

n<a,b,c,d>::s == d + b * (a+c) / 2.

n<a,b,c,d>::s however is always called with a == c and d == 0 so it collapses to a simply a * b % 0x101. This allows us to rewrite d<s,t,u,v,w,x,y,z>::r as

int d(int s, int t, int u, int v, int w, int x, int y, int z)
{
  return (w * s + y * u + x * t + v * z) % 0x101;
}

If you remember, s, t, u, v was static, and didn’t depend on the key, whereas w, x, y, z depends on the key. If I had a PhD I could apply advanced concept such as solving linear equations, but we can also just brute force using this horrible code:

for (int ko = 0; ko < 4; ++ko)
  for (int ka = 32; ka < 128; ++ka)
    for (int kb = 32; kb < 128; ++kb)
    {
      for (int kc = 32; kc < 128; ++kc)
      {
        for (int kd = 32; kd < 128; ++kd)
        {
          int o = 1;
          for (int t = 0; t < 4; ++t) {

            int n = t << 2;

            int go = n & 0xC;

            if (d(gs[go + 0], gs[go + 1], gs[go + 2], gs[go + 3],
              ka, kb, kc, kd) != s[n + ko]) {
              o = 0;
            }
          }
          if (o)
          {
            printf("%d %d %d %d [ko %d]\n", ka, kb, kc, kd, ko);
          }
        }
      }
    }

Let’s compare the runtime of this to compiling the original file:

$ time ./s2
67 109 95 95 [ko 0]
43 48 108 67 [ko 1]
43 114 121 45 [ko 2]
95 101 107 45 [ko 3]

real    0m9.060s
user    0m10.668s
sys     0m0.012s

So we’ve been quicker in testing ~2²⁸ combinations than g++ was in testing a single one. I call that a win.

We still have to re-assemble the flag, and we haven’t used python yet:

>>> key = [chr(int(x)) for x in "67 109 95 95 43 48 108 67 43 114 121 45 95 101 107 45".split()]
>>> ''.join([key[x * 4 % 16 + (x/4)] for x in range(16)] )
'C++_m0re_lyk_C--'