plaidCTF 2014 - g++ (re200)
Although it seems like The Plague's projects are open source, it's not
quite so simple to figure out what the source code does. We believe
this project is supposed to print out secret information, but the KEY
variable in the Makefile has been lost. Find the key, build the
project, get us the information.
Oh noes, the key is gone!
Let’s first take a look at the makefile:
KEY:=guess_a_key
.PHONY: all key.h
all: solveme.cpp key.h
g++ -std=c++0x solveme.cpp -ftemplate-depth-1000000 -o solveme
key.h:
echo '${KEY}' | perl -p -e 's/(.)/sprintf("K(%d,%d)\n",$$i++,ord($$1))/ge' > key.h
Uh-huh. Let’s first compile this thing, maybe the supplied key is already good?
$ time make
echo 'guess_a_key' | perl -p -e
's/(.)/sprintf("K(%d,%d)\n",$i++,ord($1))/ge' > key.h
g++ -std=c++0x solveme.cpp -ftemplate-depth-1000000 -o solveme
real 0m18.328s
user 0m20.660s
sys 0m0.736s
$ cat key.h
K(0,103)
K(1,117)
K(2,101)
K(3,115)
K(4,115)
K(5,95)
K(6,97)
K(7,95)
K(8,107)
K(9,101)
K(10,121)
$ ./solveme
Wrong
(Yeah this is a particularly slow machine.)
The magic here is that the validation happens at compile-time using templates. Let’s take a look:
template <int i>
struct key {
S r = 0;
};
#define K(i,v) \
template <> \
struct key<i> { \
S r = v; \
};
#include "key.h"
int main() {
if (!vv<0>::r) {
int i;
char skey[] = {key<0>::r, key<1>::r, key<2>::r, key<3>::r, key<4>::r, key<5>::r, key<6>::r, key<7>::r, key<8>::r, key<9>::r, key<10>::r, key<11>::r, key<12>::r, key<13>::r, key<14>::r, key<15>::r};
for (i = 0; i < 1<<(1<<(1<<1)); i++)
std::cout << skey[i];
std::cout << "\n";
}
else {
std::cout << "Wrong\n";
}
return 0;
}
We can see that key<i>::r
is the i’th byte of the key that was provisioned using the makefile. All other key bytes are 0 due to the unspecialized class key
.
vv<0>::r
apparently is false only if we entered a good key.
template <int n>
struct gg {
S r = R((u<n,n|2>::r)) - r<(n>>2),((n)&3)>::rr;
};
template <int n>
struct vvv {
S r = gg<n>::r|gg<n+1>::r|gg<n+2>::r|gg<n+3>::r;
};
template <int n>
struct vv {
S r = vvv<0>::r|vvv<4>::r|vvv<8>::r|vvv<12>::r;
};
So - vv
validates the full key, vvv
validates a fourth of the key, and gg
validates
one sixteenth of the key. We can already see that in order for gg<i>::r == 0
(which must be true for
i == [0..15]
, so apparently the first 16 characters of the key are
validated - and printed), R((u<n,n|2>::r))
must be equal to
r<(n>>2),((n)&3)>::rr
.
Let’s see what R(x) is:
#define R(a) (m<(a),(QQ),(QQ)>::r)
#define QQ ((1<<(1<<3))|1)
template <int a, int b, int c>
struct m {
S r = (a-((a/b)*b));
};
QQ
is 0x101
, and m<a,b>::r
is a % b
. This leaves u<>::r
and
r<>::rr
.
If we follow struct key
, we quickly find that only r<>::rr
depends on
the key, and u<>::r
is static. So let’s precalc u % 0x101
for all
values of i:
int s[] = {
(m<((u<0,0|2>::r)),0x101,0x101>::r),
(m<((u<1,1|2>::r)),0x101,0x101>::r),
(m<((u<2,2|2>::r)),0x101,0x101>::r),
(m<((u<3,3|2>::r)),0x101,0x101>::r),
(m<((u<4,4|2>::r)),0x101,0x101>::r),
(m<((u<5,5|2>::r)),0x101,0x101>::r),
(m<((u<6,6|2>::r)),0x101,0x101>::r),
(m<((u<7,7|2>::r)),0x101,0x101>::r),
(m<((u<8,8|2>::r)),0x101,0x101>::r),
(m<((u<9,9|2>::r)),0x101,0x101>::r),
(m<((u<10,10|2>::r)),0x101,0x101>::r),
(m<((u<11,11|2>::r)),0x101,0x101>::r),
(m<((u<12,12|2>::r)),0x101,0x101>::r),
(m<((u<13,13|2>::r)),0x101,0x101>::r),
(m<((u<14,14|2>::r)),0x101,0x101>::r),
(m<((u<15,15|2>::r)),0x101,0x101>::r)
};
r
then depends on the key, and is defined as
template <int a, int b>
struct r {
S rr = d<P(a,b)>::r;
};
which cpp
expands to this:
template <int a, int b>
struct r {
static const int rr =
d<
g<a*4>::r,
g<a*4+1>::r,
g<a*4+2>::r,
g<a*4+3>::r,
key<b>::r,
key<b+4>::r,
key<b+8>::r,
key<b+12>::r
>::r;
};
g<>:::r
again is static and doesn’t depend on the key, so let’s also
precalculate this:
int gs[] = {g<0>::r, g<1>::r, g<2>::r, g<3>::r, g<4>::r, g<5>::r, g<6>::r,
g<7>::r, g<8>::r, g<9>::r, g<10>::r, g<11>::r, g<12>::r, g<13>::r, g<14>::r,
g<15>::r};
We also see that no more than 4 bytes of the key are dependent, so we could easily brute force 4 groups of 228 possibilities (assuming an ASCII key).
But to brute force, we need to convert the templates to regular code,
otherwise it would take 20s per attempt. So let’s dive into the d<>::r
template (expanded and fail-indented):
template <int s, int t, int u, int v, int w, int x, int y, int z>
struct d {
static const int r =
(m<((
n<(
n<(
n<z,v,z,0>::s,
n<x,t,x,0>::s,
n<z,v,z,0>::s,
n<x,t,x,0>::s)>::q),
(n<
(n<y,u,y,0>::s),
(n<w,s,w,0>::s),
(n<y,u,y,0>::s),
(n<w,s,w,0>::s)>::q),
(n<
(n<z,v,z,0>::s),
(n<x,t,x,0>::s),
(n<z,v,z,0>::s),
(n<x,t,x,0>::s)>::q),
(n<(n<y,u,y,0>::s),
(n<w,s,w,0>::s),
(n<y,u,y,0>::s),
(n<w,s,w,0>::s)>::q)
>::q))
,0x101,0x101>::r);
};
This is not so useful yet. What’s n<>
?
template <int a, int b, int c, int d>
struct n {
static const int q = n<a-1,(b % 0x101) - 1,c+1,((a+c)>>1) + (d % 0x101)>::q;
static const int r = n<a-1,(b % 0x101) - 1,c+1,((a+c)>>1) + (d % 0x101)>::r;
static const int s = n<a-1,(b % 0x101) - 1,c+1,((a+c)>>1) + (d % 0x101)>::s;
};
template <int a, int c, int d>
struct n<a,0,c,d> {
static const int r = a;
static const int q = c;
static const int s = d;
};
Now n<a,b,c,d>
is specialized for b==0
. Also, n<a,b,c,d>
depends on n
again, but if instantiated with a positive
integer b
, it’s guaranteed to reach zero at some point, at which the
recursion will terminate. We can see that d<>
uses n<>::r
, n<>::q
and n<>::s
, which is the a
, c
or d
when the recursion finally ends
due to b == 0
. For each recursion, b
is decremented once, so for each
step, the following calculation is done:
- a := a - 1
- c := c + 1
- d := d % 0x101 + (a + c) / 2
This allows us to derive:
n<a,b,c,d>::r == a-b
n<a,b,c,d>::q == c+b
.
Now n<a,b,c,d>::s
is very complicated math (like CRC) - since a
is
decremented when c
is incremented, (a+c)/2
is static, and added to d
in each
recursion step, so the result is
n<a,b,c,d>::s == d + b * (a+c) / 2
.
n<a,b,c,d>::s
however is always called with a == c
and d == 0
so it
collapses to a simply a * b % 0x101
. This allows us to rewrite d<s,t,u,v,w,x,y,z>::r
as
int d(int s, int t, int u, int v, int w, int x, int y, int z)
{
return (w * s + y * u + x * t + v * z) % 0x101;
}
If you remember, s
, t
, u
, v
was static, and didn’t depend on the
key, whereas w
, x
, y
, z
depends on the key. If I had a PhD I could
apply advanced concept such as solving linear equations, but we can also
just brute force using this horrible code:
for (int ko = 0; ko < 4; ++ko)
for (int ka = 32; ka < 128; ++ka)
for (int kb = 32; kb < 128; ++kb)
{
for (int kc = 32; kc < 128; ++kc)
{
for (int kd = 32; kd < 128; ++kd)
{
int o = 1;
for (int t = 0; t < 4; ++t) {
int n = t << 2;
int go = n & 0xC;
if (d(gs[go + 0], gs[go + 1], gs[go + 2], gs[go + 3],
ka, kb, kc, kd) != s[n + ko]) {
o = 0;
}
}
if (o)
{
printf("%d %d %d %d [ko %d]\n", ka, kb, kc, kd, ko);
}
}
}
}
Let’s compare the runtime of this to compiling the original file:
$ time ./s2
67 109 95 95 [ko 0]
43 48 108 67 [ko 1]
43 114 121 45 [ko 2]
95 101 107 45 [ko 3]
real 0m9.060s
user 0m10.668s
sys 0m0.012s
So we’ve been quicker in testing ~228 combinations than g++ was in testing a single one. I call that a win.
We still have to re-assemble the flag, and we haven’t used python yet:
>>> key = [chr(int(x)) for x in "67 109 95 95 43 48 108 67 43 114 121 45 95 101 107 45".split()]
>>> ''.join([key[x * 4 % 16 + (x/4)] for x in range(16)] )
'C++_m0re_lyk_C--'