1. Design The goal of Address Space Layout Randomization is to introduce randomness into addresses used by a given task. This will make a class of exploit techniques fail with a quantifiable probability and also allow their detection since failed attempts will most likely crash the attacked task. To help understand the ideas behind ASLR, let's look at an example task and its address space: we made a copy of /bin/cat into /tmp/cat then disabled all PaX features on it and executed "/tmp/cat /proc/self/maps". The [x] marks are not part of the original output, we use them to refer to the various lines in the explanation (note that the VMMIRROR document contains more examples with various PaX features active). [1] 08048000-0804a000 R+Xp 00000000 00:0b 812 /tmp/cat [2] 0804a000-0804b000 RW+p 00002000 00:0b 812 /tmp/cat [3] 40000000-40015000 R+Xp 00000000 03:07 110818 /lib/ld-2.2.5.so [4] 40015000-40016000 RW+p 00014000 03:07 110818 /lib/ld-2.2.5.so [5] 4001e000-40143000 R+Xp 00000000 03:07 106687 /lib/libc-2.2.5.so [6] 40143000-40149000 RW+p 00125000 03:07 106687 /lib/libc-2.2.5.so [7] 40149000-4014d000 RW+p 00000000 00:00 0 [8] bfffe000-c0000000 RWXp fffff000 00:00 0 As we can see, /tmp/cat is a dynamically linked ELF executable, its address space contains several file mappings. [1] and [2] correspond to the loadable ELF segments of /tmp/cat containing code and data (both initialized and uninitialized), respectively. [3] and [4] represent the dynamic linker whereas [5], [6] and [7] are the segments of the C runtime library ([7] holds its uninitialized data that is big enough to not fit into the last page of [6]). [8] is the stack which grows downwards. There are other mappings as well that this simple example does not show us: the brk() managed heap that would directly follow [2] and various anonymous and file mappings that the task can create via mmap() and would be placed between [7] and [8] (unless an explicit mapping address outside this region was requested using the MAP_FIXED flag). For our purposes all these possible mappings can be split into three groups: - [1], [2] and the brk() managed heap following them, - [3]-[7] and all the other mappings created by mmap(), - [8], the stack. The mappings in the first and last groups are established during execve() and do not move (only their size can change) whereas the mappings in the second group may come and go during the lifetime of the task. Since the base addresses used to map each group are not related to each other, we can apply different amount of randomization to each. This also has the benefit that whenever a given attack technique needs advance knowledge of addresses from more than group, the attacker will likely have to guess or brute force all entropies at once which further reduces the chances of success. Let's analyze now the (side) effects of ASLR. For our purposes the most important effect is on the class of exploit techniques that need advance knowledge of certain addresses in the attacked task, such as the address of the current stack pointer or libraries. If there is no way to exploit a given bug to divulge information about the attacked task's randomized address space layout then there is only one way left to exploit the bug: guessing or brute forcing the randomization. Guessing occurs when the randomization applied to a task changes in every attacked task in an unpredictable manner. This means that the attacker cannot learn anything of future randomizations and has the same chance of succeeding in each attack attempt. Brute forcing occurs when the attacker can learn something about future randomizations and build that knowledge into his attack. In practice brute forcing can be applied to bugs that are in network daemons that fork() on each connection since fork() preserves the randomized layout, as opposed to execve() which replaces it with a new one. This distinction between the attack methods becomes meaningless if the system has monitoring and reaction mechanisms for program crashes because the reaction can then be triggered at such low levels that the two attack methods will have practically the same (im)probability to succeed. To quantify the above statements about probability of success, let's first introduce a few variables: Rs: number of bits randomized in the stack area, Rm: number of bits randomized in the mmap() area, Rx: number of bits randomized in the main executable area, Ls: least significant randomized bit position in the stack area, Lm: least significant randomized bit position in the mmap() area, Lx: least significant randomized bit position in the main executable area, As: number of bits of stack randomness attacked in one attempt, Am: number of bits of mmap() randomness attacked in one attempt, Ax: number of bits of main executable randomness attacked in one attempt. For example, for i386 we have Rs = 24, Rm = 16, Rx = 16, Ls = 4, Lm = 12 and Lx = 12 (e.g., the stack addresses have 24 bits of randomness in bit positions 4-27 leaving the least and most significant four bits unaffected by randomization). The number of attacked bits represents the fact that in a given situation more than one bit at a time can be attacked (obviously A