Word Up! Microsoft Word OneTableDocumentStream Underflow
Today, Microsoft released MS16-148 to patch CVE-2016-7290, which addresses an integer underflow issue that I found. The underflow later triggers an out-of-bounds read during a copy operation, which could result in a stack-based buffer overflow outside of the protected mode winword.exe process when processing a specially crafted binary document file.
tl;dr: Whilst all that sounds dramatic, in reality the proof of concept (poc) only triggered an out-of-bounds read condition with the potential for information disclosure. In this blog post I will detail the vulnerability further.
The vulnerability affects Microsoft Word 2007 Service Pack 3, Microsoft Office 2010 Service Pack 2 (32-bit editions), Microsoft Office 2010 Service Pack 2 (64-bit editions) and Microsoft Office Compatibility Pack Service Pack 3. More details can be found in the SRC-2016-0042 advisory. All analysis was performed on Microsoft Office 2010 Professional WinWord.exe v14.0.4734.1000, the latest patch at the time.
First, let’s take a look at the differential of the sample and the poc files using our favorite binary editor 010.
Comparing the poc.doc with the sample.doc
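The same comparison can be scripted. This is a minimal sketch, not the tooling used for the analysis: the helper name byte_diffs is hypothetical, and the stand-in byte strings are made up apart from the 0x68 to 0xfa change described in this post.

```python
from typing import List, Tuple


def byte_diffs(a: bytes, b: bytes) -> List[Tuple[int, int, int]]:
    """Return (offset, byte_in_a, byte_in_b) for every position where a and b differ."""
    if len(a) != len(b):
        # A positional diff is only meaningful when nothing was inserted or removed.
        raise ValueError("inputs differ in length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Stand-in data: the offset and the surrounding bytes are invented for illustration.
sample = bytes([0x00, 0x11, 0x68, 0x22])
poc = bytes([0x00, 0x11, 0xFA, 0x22])

for offset, old, new in byte_diffs(sample, poc):
    print(f"offset 0x{offset:x}: 0x{old:02x} -> 0x{new:02x}")
```

To run it against real files, read each one with open(path, "rb").read() and pass the two byte strings in.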
What you may notice is that there is only a single-byte modification to the file. Using Offviz, we can take a look and see which chunk contains the modification.
Analyzing the structure of the poc.doc
The byte modification is within the data field of the OneTableDocumentStream chunk. The sample contains the byte value 0x68, whereas the poc uses 0xfa to trigger the underflow.
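For readers unfamiliar with the bug class, here is a generic sketch of how a 32-bit unsigned subtraction wraps when the subtrahend grows. It is an illustration only: this post does not show the arithmetic Word actually performs on the field, and HYPOTHETICAL_LIMIT and sub_u32 are made-up names and values.

```python
MASK32 = 0xFFFFFFFF


def sub_u32(a: int, b: int) -> int:
    """Subtract b from a the way a 32-bit unsigned register would, wrapping on underflow."""
    return (a - b) & MASK32


# Made up for illustration; NOT a value taken from Word or from the file format.
HYPOTHETICAL_LIMIT = 0x80

for field in (0x68, 0xFA):
    remaining = sub_u32(HYPOTHETICAL_LIMIT, field)
    print(f"field=0x{field:02x} remaining=0x{remaining:08x}")
```

With the smaller field value the result stays small; with the larger one the subtraction goes below zero and wraps to a very large unsigned number, which is the general shape of an integer underflow.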
0x0 Triggering the vulnerability
First, I enable page heap and usermode stack traces for debugging purposes:
Then running the poc.doc file results in the following access violation outside of protected mode:
Doesn’t look so pretty without symbols, does it?
0x1 Investigating Accessed Memory
The first thing I do is start checking out the memory that was accessed at the time of corruption.
We can already see that this is an out-of-bounds read on a heap buffer that is 0x19 bytes in size, trying to copy an additional 204 bytes into @edi, which is a stack-based address. One might ask, what size is the stack variable?
As it turns out, that stack variable is passed in from 6 frames away in the call stack and seems to be dynamically calculated from a number of other variables and offsets. That is incredibly hard to track without symbols.
0x2 Writing Memory
If we can continue reading from @esi, then it's safe to assume that we can continue writing. I know that is a huge assumption, but with the ability to OLE spray the heap or gain precision over the heap using EPS, it is likely that we can control the data at that offset. But what can we overwrite? Let’s take a look at the destination stack address:
Using @corelanc0d3r’s excellent mona plugin, we can dump the destination stack address using the remainder of the size for the copy and can see that we have a pointer to .text (wwlib!GetAllocCounters+0x128118). If I had to guess at this point, I would say that we are not supposed to overwrite this value.
Therefore, we are likely overflowing a stack buffer (not by much). If we wanted to hit a return address, it wouldn’t happen until +0x1e8 of the destination address. Which, in case you were curious, is located here:
We don't see it in the call stack, because it's a fair few frames up in the stack:
The next question is, how are we going to simulate continuing the execution?
@bannedit wrote another excellent plugin called counterfeit that we can use to alloc a chunk (using VirtualAlloc) in windbg and fill it with marked data. We can then go ahead and replace @esi with this value and continue the copy operation.
Now, we set @esi to 0x14130000:
We can see we overwrote the data pointer that points to a function with a potentially controlled value from @esi. Since @esi contains marked data, we know which offset in @esi was used to overwrite the pointer.
Looking at the call stack again, we are interested in the caller of memmove().
Using the Hex-Rays decompiler, we can see this function is simply a wrapper around memmove() and is called a lot within wwlib. I also renamed sub_316d9b16 to memmove_wrapper_1 for brevity.
If the size is larger than MAX_INT, an int overflow exception is raised. In addition, there are no sanity checks on the size value to validate that it is smaller than the destination buffer.
For exploitation purposes, we typically want to know how the memmove() is reached or called…
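The behaviour described above can be modelled in a few lines. This is a rough Python model under stated assumptions, not the decompiled code: the flat bytearray standing in for process memory, the exception type, the buffer layout, and the assumption that MAX_INT is the 32-bit signed maximum are all mine.

```python
MAX_INT = 0x7FFFFFFF  # assumption: the 32-bit signed maximum


class IntOverflowError(Exception):
    """Stand-in for the int overflow exception the wrapper raises."""


def memmove_wrapper_1(memory: bytearray, dst: int, src: int, size: int) -> None:
    """Model of the wrapper: one check against MAX_INT, none against the destination size."""
    if size > MAX_INT:
        raise IntOverflowError("size exceeds MAX_INT")
    # Housekeeping for the model only: stay inside the simulated address space.
    if max(dst, src) + size > len(memory):
        raise ValueError("copy leaves the simulated memory")
    # Nothing here knows how large the destination buffer is.
    memory[dst:dst + size] = memory[src:src + size]


# Hypothetical layout: source data at 0..31, an 8-byte destination buffer at 32..39,
# and a 4-byte pointer stored directly after it at 40..43.
memory = bytearray(64)
memory[0:32] = b"A" * 32
memory[40:44] = (0xDEADBEEF).to_bytes(4, "little")

memmove_wrapper_1(memory, dst=32, src=0, size=12)  # 12 bytes into an 8-byte buffer
print(memory[40:44])
```

The copy passes the only check the wrapper makes, yet the 4 bytes after the destination buffer are replaced with source data.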
To determine this, we set a breakpoint bp wwlib!DllGetClassObject+0x4554 ".printf \"calling memmove(%x, %x, %x);\\n\", poi(@esp), poi(@esp+4), poi(@esp+8); gc" and re-run the poc.
There are a number of calls using a source buffer which starts with 0x2266efXX, and the destination seems to be consistent as well at 0x002711YY. This suggests an erroneous loop that is calling memmove() multiple times.
The way I like to determine this is to analyze the stack at each call to determine if it is unique. Executing the ‘k’ command in windbg is not going to cut it, as we are already slowing down execution a lot with the above breakpoint. I choose to use a quick little windbg plugin that mashes the return addresses together:
Now, we will add that to our breakpoint, take out the new line and add a space on the end, finally re-running it:
It is safe to assume now that calls to memmove() with stack hash 0x1847ab6993 are within a loop!
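The stack hash idea can be sketched outside the debugger as well. This is not the windbg plugin used above: combining the return addresses by summing them is my assumption, the function names are hypothetical, and every address in the example is made up.

```python
from collections import Counter
from typing import Dict, Iterable, Sequence


def stack_hash(return_addresses: Iterable[int]) -> int:
    """Mash the return addresses of one call stack into a single number by summing them."""
    return sum(return_addresses)


def hashes_seen_repeatedly(stacks: Sequence[Sequence[int]], min_hits: int = 2) -> Dict[int, int]:
    """Map each stack hash seen at least min_hits times to its hit count."""
    counts = Counter(stack_hash(stack) for stack in stacks)
    return {h: n for h, n in counts.items() if n >= min_hits}


# Hypothetical return addresses; none of these come from the real call stack.
loop_stack = [0x316D9B20, 0x316F3300, 0x31700010]
other_stack = [0x316D9B20, 0x31800000]
stacks = [loop_stack, loop_stack, other_stack, loop_stack]

for h, n in hashes_seen_repeatedly(stacks).items():
    print(f"stack hash 0x{h:x} hit {n} times")
```

A sum is cheap to compute at a breakpoint but ignores frame order and can collide, so a repeated hash is a strong hint of a loop rather than proof.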
Since the poc does not overflow a return address or anything that is later accessed and used during a write or copy operation, it can be concluded that this vulnerability, as triggered by the poc, has very little impact.
Microsoft patched this vulnerability as a “Microsoft Office Information Disclosure Vulnerability” which makes sense in the context that it was presented here. However, since we can overwrite a pointer to .text on the stack due to the overflow, it demonstrates that this vulnerability has the potential for much more of an impact had it been triggered using an alternate code path.
Within sub_316f3232, there are 525 calls to memmove_wrapper_1(), which indicates that it is highly likely that several code paths exist to reach this vulnerability.
In addition, none of the other functions in the call stack use the stack guard mitigation (/GS), which means that if the return address were overwritten, there would be no stack cookie check in place to mitigate against it.
Many complex vulnerabilities that are hard to find still exist within the Office codebase. Often it is even harder to determine the root cause and develop exploits for them, and I think that if Microsoft had released the symbols, I would have had a much better chance at the latter on several occasions.