HubCap: pwning the Chromecast pt. 2

The chain

In the last post, I explained the bug that we used to get a foothold into the system, but we’re far from achieving what we want. We left off being able to overwrite anything before a particular buffer, but because of caching behavior, that didn’t get us all that far. Additionally, we don’t know exactly where we are in memory.

I mentioned in the previous post that there’s a debug port on the Chromecast that prints out messages from the bootloader. We’re going to abuse this port to figure out where we are in memory. How, you ask? Well, whenever we plug in a USB device and something goes wrong with it, we get a helpful debug message telling us what went wrong, plus some related details, along the lines of:

unable to select configuration descriptor 0.

The 0 was filled in by printf, so the string in memory is really “… descriptor %d.”. What would happen if we overwrote this string with %x %x %x %x ... %x? printf pulls one argument per conversion specifier without checking that the caller actually passed it, so it will run out of real arguments, start reading garbage off the stack and hopefully get us something interesting. The only problem with this approach is that I can only really overwrite 255 bytes of memory at a time, and - as you may remember from part 1 - we don’t know where we are in memory. We can make educated guesses, but it’s largely a crapshoot. I don’t want to sit here and replug my Teensy hundreds of times just to see if I can hit that string.

Let’s look back on the hub allocation code to see if it can help us here.

static int usb_hub_index;
if (usb_hub_index < USB_MAX_HUB)
    return &hub_dev[usb_hub_index++];

We see that usb_hub_index is incremented every time a hub is allocated. If I just plugged in lots and lots of USB hubs, it would walk memory for me! I created a horrible abomination of a (virtual) USB device, somewhat in the following configuration:

[Figure: hubs]

As you can see, the hubs are effectively plugged in a circle, plus one extra device that overwrites the dev_index variable (the one we overwrote in the first article) to get to the -1th entry. To the Chromecast, this looks like a hub, plugged into a hub, plugged into a hub, etc. ad infinitum. However, after about 3000 USB devices the Chromecast runs out of stack space and dies (the hub enumeration function recurses into itself). Two thirds of those devices were hubs, and each hub overwrites 16 bytes, so one run can overwrite about 2000 × 16 = 32,000 bytes, or roughly 32k of memory.
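To make the walk concrete, here is a minimal sketch of one run. It assumes a signed index that starts out negative (so the first write lands before the array) and a 16-byte slot per hub. The names `simulate_run`, `struct walk` and `HUB_ENTRY_SIZE` are mine for illustration, not the bootloader's, and any base address you feed it is a placeholder.

```c
#include <assert.h>
#include <stdint.h>

#define HUB_ENTRY_SIZE 16u  /* bytes written per hub, as described above */

struct walk {
    uint32_t first;  /* address of the first slot written */
    uint32_t bytes;  /* total bytes covered in one run */
};

/* Model one run: every hub takes the next slot, so the write position
 * advances by HUB_ENTRY_SIZE per hub. A negative start_index (an
 * assumption of this sketch) begins before the array. */
static struct walk simulate_run(uint32_t array_base, int32_t start_index,
                                uint32_t devices)
{
    struct walk w;
    uint32_t hubs;

    assert(devices < UINT32_MAX / 2u);  /* keep the arithmetic in range */
    hubs = devices * 2u / 3u;           /* roughly 2/3 of devices were hubs */

    /* Unsigned wrap-around gives base - 16 * |index| for negative indices. */
    w.first = array_base + (uint32_t)start_index * HUB_ENTRY_SIZE;
    w.bytes = hubs * HUB_ENTRY_SIZE;
    return w;
}
```

With 3000 devices, `simulate_run(base, -2000, 3000)` reports 32,000 bytes starting 32,000 bytes below `base`, which is where the "about 32k" figure comes from. Moving the window then means changing the starting index and plugging everything in again.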

After about a day of “Move pointer, recompile, plug, wait an hour.” I finally hit a string! My serial console filled with garbage, and I now knew the relative distance between me and this string. Wait… Relative distance? That doesn’t help us at all! This is where our little string goes on a journey of self-discovery. I had written out a large amount of %x %x %x into memory, and a few of those printed out as 25782025 (which is the hex representation of the bytes “%x %”). Somewhere along the line, my buffer had ended up on the stack. Translating that into C:

printf("%x %x %x %x %x %x %x", '%x %', 'x %x', ...);
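If you want to check that hex value yourself, a small helper along these lines will do. `chunk_to_hex` is a made-up name for this post. It dumps the bytes in memory order; how a real `%x` renders a 32-bit word also depends on the CPU's byte order, which this sketch does not model.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the first four bytes of chunk as 8 hex digits, in memory order.
 * out must have room for 8 digits plus the terminating NUL. */
static void chunk_to_hex(const char *chunk, char out[9])
{
    assert(chunk != NULL && strlen(chunk) >= 4);
    for (int i = 0; i < 4; i++)
        snprintf(out + 2 * i, 3, "%02x", (unsigned)(unsigned char)chunk[i]);
}
```

Feeding it the four characters “%x %” gives 25782025, so seeing that value in the output is a strong hint that printf is reading its own format string as arguments.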

I can abuse this by having some data in my buffer be pointers, and some be %s:

printf("%s\n%s\n%s\n", ptr1, ptr2, ptr3);

I control both the string and the arguments, so all I have to do is scan the pointers across memory until I print out my OWN buffer (that is, the text %s\n%s\n%s\n itself). That will tell me where the string is, and because I know the relative distance between me and the string, that will tell me where I am in memory. Five minutes later, I had two absolute pointers sitting in front of me.
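The scan itself is easy to model offline. The sketch below runs against a simulated memory image rather than the real device; `find_marker`, `resolve_buffer`, `NOT_FOUND` and every address you pass in are illustrative assumptions. On hardware, the "comparison" was me reading the serial console.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NOT_FOUND UINT32_MAX

/* Step a candidate pointer across a simulated memory image and return the
 * absolute address at which "%s" would print back our own marker. */
static uint32_t find_marker(const uint8_t *mem, size_t len, uint32_t base,
                            const char *marker, uint32_t step)
{
    size_t mlen = strlen(marker);

    assert(step != 0);
    for (size_t off = 0; off + mlen <= len; off += step) {
        if (memcmp(mem + off, marker, mlen) == 0)
            return base + (uint32_t)off;
    }
    return NOT_FOUND;
}

/* One absolute address plus the known relative distance gives the other.
 * buffer_minus_string may be negative; unsigned wrap-around handles that. */
static uint32_t resolve_buffer(uint32_t string_addr, int32_t buffer_minus_string)
{
    return string_addr + (uint32_t)buffer_minus_string;
}
```

Once `find_marker` returns a hit, `resolve_buffer` turns the relative distance measured earlier into an absolute address for the buffer.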

I tried the same trick with a few different strings, and got some more pointers and even a register dump in the form of a stack frame:

r0 : FF
r1 : 0
r2 : FF
r3 : 0
r4 : 211C58     ; usb_device pointer.
r5 : 0
r6 : 12
r7 : 4FEB8C     ; pointer onto the stack, ~0x204 away from the LR on the stack.
r8 : 3
r9 : 103
r10: 211710     ; usb_device parent (?)
r11: 1AC075     ; usb_hub pointer? next hub pointer?
lr : 1A1820     ; somewhere in usb_new_device

This showed me a couple of things, but most notably it got me a return address in the link register. Excitement! To recap, we now know where we are in memory, we know where a piece of .text that will be executed is, and we can overwrite it. We win… Right? Still no. We’re currently running with caches on, and this system is cache-incoherent. That means that the data cache and instruction cache can be out of sync, and they most assuredly are at this point. The hub descriptor hasn’t even left the d-cache, and so cannot have been picked up by the i-cache. The end result is that while one part of the CPU sees the hub descriptor at memory X, another does not.
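If the split between the two caches feels abstract, this toy model may help. It is a deliberately crude sketch: one word per cache entry, a write-back d-cache, and names (`toy_cpu`, `data_write`, `insn_fetch`, `dcache_clean`, `icache_invalidate`) that I made up. Real caches work on lines and have far more state; the only thing this shows is the ordering problem.

```c
#include <assert.h>
#include <stdint.h>

#define WORDS 8

/* Toy CPU: a write-back d-cache and an i-cache in front of the same RAM. */
struct toy_cpu {
    uint32_t ram[WORDS];
    uint32_t dcache[WORDS];
    uint8_t  d_dirty[WORDS];
    uint32_t icache[WORDS];
    uint8_t  i_valid[WORDS];
};

/* A data store only updates the d-cache; RAM is left untouched. */
static void data_write(struct toy_cpu *c, unsigned a, uint32_t v)
{
    assert(a < WORDS);
    c->dcache[a] = v;
    c->d_dirty[a] = 1;
}

/* An instruction fetch fills from RAM on a miss and never looks at the d-cache. */
static uint32_t insn_fetch(struct toy_cpu *c, unsigned a)
{
    assert(a < WORDS);
    if (!c->i_valid[a]) {
        c->icache[a] = c->ram[a];
        c->i_valid[a] = 1;
    }
    return c->icache[a];
}

/* Write every dirty d-cache entry back to RAM. */
static void dcache_clean(struct toy_cpu *c)
{
    for (unsigned a = 0; a < WORDS; a++) {
        if (c->d_dirty[a]) {
            c->ram[a] = c->dcache[a];
            c->d_dirty[a] = 0;
        }
    }
}

/* Drop everything the i-cache has, forcing refetches from RAM. */
static void icache_invalidate(struct toy_cpu *c)
{
    for (unsigned a = 0; a < WORDS; a++)
        c->i_valid[a] = 0;
}
```

In this model, a word written with `data_write` is invisible to `insn_fetch` until `dcache_clean` has run. If the address was already fetched as code, it also needs `icache_invalidate`; an address that has never been executed from only needs the clean.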

Hulk smash!

This is where I cheated a little and took a peek at the older firmware dump given to me by the GTVHacker guys. The return pointer I gleaned from that stack frame pointed somewhere into usb_new_device, a function adjacent to hub_allocate:

ROM:006A105C hub_allocate                            ; CODE XREF: usb_hub_configure+Cp
ROM:006A105C                 LDR     R2, =base_ptr
ROM:006A1060                 STMFD   SP!, {R3,LR}
ROM:006A1064                 LDR     R3, [R2,#(hub_index - 0x6E2FD8)]
ROM:006A1068                 CMP     R3, #0xF
ROM:006A106C                 BGT     loc_6A1084
ROM:006A1070                 LDR     R0, =hub_array
ROM:006A1074                 ADD     R1, R3, #1
ROM:006A1078                 STR     R1, [R2,#(hub_index - 0x6E2FD8)]
ROM:006A107C                 ADD     R0, R0, R3,LSL#4
ROM:006A1080                 LDMFD   SP!, {R3,PC}
ROM:006A1084 ; ---------------------------------------------------------------------------
ROM:006A1084
ROM:006A1084 loc_6A1084                              ; CODE XREF: hub_allocate+10j
ROM:006A1084                 MOV     R1, #0x10
ROM:006A1088                 LDR     R0, =aErrorUsb_max_h ; "ERROR: USB_MAX_HUB (%d) reached\n"
ROM:006A108C                 BL      sub_681734
ROM:006A1090                 MOV     R0, #0
ROM:006A1094                 LDMFD   SP!, {R3,PC}
ROM:006A1094 ; End of function hub_allocate
ROM:006A1094
ROM:006A1094 ; ---------------------------------------------------------------------------
ROM:006A1098 off_6A1098      DCD base_ptr            ; DATA XREF: hub_allocater
ROM:006A109C off_6A109C      DCD hub_array           ; DATA XREF: hub_allocate+14r
ROM:006A10A0 off_6A10A0      DCD aErrorUsb_max_h     ; DATA XREF: hub_allocate+2Cr
ROM:006A10A0                                         ; "ERROR: USB_MAX_HUB (%d) reached\n"

A couple of interesting things here. The function uses a so-called ‘literal pool’, which is a fancy term for “Stuck a couple DWORDs at the end”. The code loads its absolute addresses from this pool, and the pool is data: the processor accesses these DWORDs as regular data, not instructions, so our overwrite is visible to the loads that read them! The hub_array pointer is especially interesting, as it dictates where the next hub descriptor gets written. If we can change that pointer, we can write anywhere in memory! So the plan is now:

1.  Overwrite the dev_index using our Interface Descriptor overflow.
2.  Overwrite the hub_index using the -1 dev_index.
3.  Overwrite the literal pool with our hub descriptor to point the hub array somewhere else.
4.  ???
5.  Profit
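Step 3 is the interesting one, so here is a rough C model of what the disassembly does. The structure layout and the names `literal_pool` and `hub_allocate_model` are my own shorthand, and the 16-entry limit and 16-byte stride are read off the listing (`CMP R3, #0xF`, `LSL#4`); treat it as a sketch of the idea, not a decompilation.

```c
#include <assert.h>
#include <stdint.h>

#define USB_MAX_HUB    16   /* limit printed by the error path in the listing */
#define HUB_ENTRY_SIZE 16   /* stride implied by "R3,LSL#4" */

/* The data words the function loads its absolute addresses from.
 * Hypothetical layout, loosely following the DCD entries in the listing. */
struct literal_pool {
    uint32_t base_ptr;
    uint32_t hub_array;
};

/* Address handed out for the next hub, or 0 once the limit is reached.
 * The array base comes from the pool, so whoever controls the pool
 * controls where the next hub descriptor is written. */
static uint32_t hub_allocate_model(const struct literal_pool *pool,
                                   int32_t *hub_index)
{
    int32_t i;

    assert(pool != 0 && hub_index != 0);
    if (*hub_index >= USB_MAX_HUB)
        return 0;
    i = (*hub_index)++;
    return pool->hub_array + (uint32_t)i * HUB_ENTRY_SIZE;
}
```

Swap the `hub_array` word for any other address and the very next allocation returns a pointer there, without a single instruction having been modified. That is what sidesteps the i-cache problem for the write itself.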

We gleaned a stack pointer earlier from our little printf foray. What would happen if we pointed the hub_array to the stack and overwrote a return pointer? We… No. We still have to deal with our cache coherency problem, so we still can’t just jump into our code. We can, however, do some ROP into the d-cache flushing function and THEN jump into our code. With our trusty old dump of the bootloader, we spy the d-cache flushing routine at offset 0x1C0 from the start of the ROM. Fingers crossed that not much changed between v1.0 and v1.5 in the cache routines and that it still lives there. We flush the d-cache, and we jump into the stack where our shellcode is waiting. This time, we actually do win.

At this point, we’ve got arbitrary code execution and we can do pretty much whatever we want. My shellcode fixes the stack pointer, signals success by turning on an LED, waits for a button press, patches the bootloader to allow any image and rolls back the USB state machine. This is enough to run our custom kernel.

Conclusion

We went from source code access to fully owning a restricted embedded device. Along the way we successfully overcame everything from cache incoherency to not having an exact binary image. This hack is one of the cooler ones I’ve done in a long time, simply because it was such a restricted environment. One of the questions that’s come up a lot is “Why?”. Outside of the obvious challenge of cracking a device, I’d like to draw some attention to the guys over at Team Eureka, who made the post-exploitation environment possible. They’re adding a ton of useful features using this exploit.

If that’s not your thing, a small set of people have been trying to port XBMC to the Chromecast. The main hurdle right now is the VPP framebuffer driver: we’re able to get a 1080p YUV framebuffer with a HW accelerated spinning cube (courtesy of the etnaviv driver), but it has to do RGB -> YUV conversion. We’d love for someone more versed in the Marvell video pipeline to help us set it up properly to allow for an RGB plane. But even without that, it’s turning out to be a nice $35 Linux box running Arch Linux without any issues. And who doesn’t want that?