How Rust Prevents Out of Bound Reads/Writes
21 Jan 2017Introduction
Rust bills itself as a memory safe language, due in part to the way the compiler forces you to write code. With the exception of using the unsafe
keyword, entire bug classes are demolished: use after frees, buffer overflows, format string bugs, etc. You can read more about how Rust defines “safe” in the documentation.
In my spare time, I like to play CTF which often involves reverse engineering binaries to find bugs. As a small exercise, I took an approach similar to CTF to take a closer look at how Rust programs defend against out of bound reads/writes. All my experiments are on 64-bit linux.
C
Let’s start by proposing a simple C program:
#include <stdio.h>
void use_arr(char* buf, int index) {
printf("Byte at [%d] = %c\n", index, buf[index]);
}
int main(int argc, char** argv) {
int i;
char test[] = { 0x41, 0x24, 0x24, 0x3a, 0x20, 0x43, 0x30, 0x30, 0x31 };
for (i=0; i < 30; i++) {
use_arr(test, i);
}
}
This lil program is a bit contrived but not unrealistic. We have an array of chars
containing 9 characters, but the for loop reads 30. From reading the source we know this is bad, but the compiler has no way of knowing that we would end up reading out of bounds. In a real vulnerability in a real program, the source of the out of bounds read would typically be user input being used to determine how far into the buffer we should read. In many cases the same code paths could also lead to an out of bounds write, possibly leading to code execution. Running the program, we get this output:
Byte at [0] = A
Byte at [1] = $
Byte at [2] = $
Byte at [3] = :
Byte at [4] =
Byte at [5] = C
Byte at [6] = 0
Byte at [7] = 0
Byte at [8] = 1
Byte at [9] = ^D
Byte at [10] = @
Byte at [11] = ^@
Byte at [12] = ^@
Byte at [13] = ^@
Byte at [14] = ^@
Byte at [15] = ^@
Byte at [16] = <F0>
Byte at [17] = <CD>
Byte at [18] = <95>
Byte at [19] = f
Byte at [20] = <FF>
Byte at [21] = ^?
Byte at [22] = ^@
Byte at [23] = ^@
Byte at [24] = ^@
Byte at [25] = G
Byte at [26] = ^Q
Byte at [27] = <D4>
Byte at [28] = F
Byte at [29] = <BA>
Let’s look at the (unoptimized) code generated for use_arr
to see what’s happening:
0000000000400564 <use_arr>:
400564: 55 push rbp
400565: 48 89 e5 mov rbp,rsp
400568: 48 83 ec 10 sub rsp,0x10
40056c: 48 89 7d f8 mov QWORD PTR [rbp-0x8],rdi
400570: 89 75 f4 mov DWORD PTR [rbp-0xc],esi
400573: 8b 45 f4 mov eax,DWORD PTR [rbp-0xc]
400576: 48 98 cdqe
400578: 48 03 45 f8 add rax,QWORD PTR [rbp-0x8]
40057c: 0f b6 00 movzx eax,BYTE PTR [rax]
40057f: 0f be d0 movsx edx,al
400582: b8 0c 07 40 00 mov eax,0x40070c
400587: 8b 4d f4 mov ecx,DWORD PTR [rbp-0xc]
40058a: 89 ce mov esi,ecx
40058c: 48 89 c7 mov rdi,rax
40058f: b8 00 00 00 00 mov eax,0x0
400594: e8 c7 fe ff ff call 400460 <printf@plt>
400599: c9 leave
40059a: c3 ret
At 0x40056c and 0x400570 we see rdi and esi being loaded into the stack. This is the calling convention on x86-64; we are loading the two arguments (char* buf
and int index
, respectively) into the stack. After some shuffling and integer promotion, the base of the array is added to the offset (0x400578), the byte from that memory location is read (0x40057c), and then that byte is printed (0x400594). This is pretty much what we would expect: reading/writing out of bounds is a result of naively adding offsets. There’s not much else that we can do since there’s no way of knowing if we’re within range.
In this example, we just end up leaking useless data from the stack. In a real world application however, a bug like this could end up leaking critical information from the process’ memory: passwords, encryption keys, etc.
Rust Slices
Let’s translate the C to its equivalent in Rust:
fn use_slice(slicerino: &[u8], offset: usize) -> () {
println!("Byte at [{}] = 0x{:x}", offset, slicerino[offset]);
}
fn main() -> () {
let test = [0x41, 0x24, 0x24, 0x3a, 0x20, 0x43, 0x30, 0x30, 0x31];
for i in 0..30 {
use_slice(&test, i);
}
}
Running it, we quickly see Rust in action:
Byte at [0] = 0x41
Byte at [1] = 0x24
Byte at [2] = 0x24
Byte at [3] = 0x3a
Byte at [4] = 0x20
Byte at [5] = 0x43
Byte at [6] = 0x30
Byte at [7] = 0x30
Byte at [8] = 0x31
thread 'main' panicked at 'index out of bounds: the len is 9 but the index is 9', slicer.rs:3
note: Run with `RUST_BACKTRACE=1` for a backtrace.
Let’s take a look at the code generated:
0000000000007900 <_ZN6slicer4main17h687ba6627330621bE>:
7900: 48 81 ec 88 00 00 00 sub rsp,0x88
7907: 48 8d 7c 24 30 lea rdi,[rsp+0x30]
790c: 48 8d 74 24 20 lea rsi,[rsp+0x20]
7911: c6 44 24 7f 41 mov BYTE PTR [rsp+0x7f],0x41
7916: c6 84 24 80 00 00 00 mov BYTE PTR [rsp+0x80],0x24
791e: c6 84 24 81 00 00 00 mov BYTE PTR [rsp+0x81],0x24
7926: c6 84 24 82 00 00 00 mov BYTE PTR [rsp+0x82],0x3a
792e: c6 84 24 83 00 00 00 mov BYTE PTR [rsp+0x83],0x20
7936: c6 84 24 84 00 00 00 mov BYTE PTR [rsp+0x84],0x43
793e: c6 84 24 85 00 00 00 mov BYTE PTR [rsp+0x85],0x30
7946: c6 84 24 86 00 00 00 mov BYTE PTR [rsp+0x86],0x30
794e: c6 84 24 87 00 00 00 mov BYTE PTR [rsp+0x87],0x31
...
79e2: b8 09 00 00 00 mov eax,0x9
79e7: 89 c6 mov esi,eax
79e9: 48 8d 4c 24 7f lea rcx,[rsp+0x7f]
79ee: 48 8b 54 24 48 mov rdx,QWORD PTR [rsp+0x48]
79f3: 48 89 cf mov rdi,rcx
79f6: e8 95 fd ff ff call 7790 <_ZN6slicer9use_slice17h4185f678b7a09df8E>
79fb: eb ac jmp 79a9 <_ZN6slicer4main17h687ba6627330621bE+0xa9>
0000000000007790 <_ZN6slicer9use_slice17h4185f678b7a09df8E>:
7790: 48 81 ec e8 00 00 00 sub rsp,0xe8
7797: 48 89 74 24 50 mov QWORD PTR [rsp+0x50],rsi
779c: 48 89 7c 24 48 mov QWORD PTR [rsp+0x48],rdi
77a1: 48 89 54 24 40 mov QWORD PTR [rsp+0x40],rdx
77a6: 48 8b 44 24 40 mov rax,QWORD PTR [rsp+0x40]
77ab: 48 89 84 24 d8 00 00 mov QWORD PTR [rsp+0xd8],rax
77b2: 00
77b3: 48 8b 35 76 40 25 00 mov rsi,QWORD PTR [rip+0x254076] # 25b830 <_ZN6slicer9use_slice15__STATIC
77ba: 48 8b 15 77 40 25 00 mov rdx,QWORD PTR [rip+0x254077] # 25b838 <_ZN6slicer9use_slice15__STATIC
77c1: 48 8b 8c 24 d8 00 00 mov rcx,QWORD PTR [rsp+0xd8]
77c8: 00
77c9: 48 8b 7c 24 50 mov rdi,QWORD PTR [rsp+0x50]
77ce: 48 39 f9 cmp rcx,rdi
77d1: 41 0f 92 c0 setb r8b
77d5: 41 f6 c0 01 test r8b,0x1
77d9: 48 89 74 24 38 mov QWORD PTR [rsp+0x38],rsi
77de: 48 89 54 24 30 mov QWORD PTR [rsp+0x30],rdx
77e3: 48 89 4c 24 28 mov QWORD PTR [rsp+0x28],rcx
77e8: 75 05 jne 77ef <_ZN6slicer9use_slice17h4185f678b7a09df8E+0x5f>
77ea: e9 f1 00 00 00 jmp 78e0 <_ZN6slicer9use_slice17h4185f678b7a09df8E+0x150>
...
77ef: 48 8d 7c 24 68 lea rdi,[rsp+0x68]
77f4: 48 8b 15 d5 56 25 00 mov rdx,QWORD PTR [rip+0x2556d5] # 25ced0 <_DYNAMIC+0x338>
77fb: 48 8d 84 24 d8 00 00 lea rax,[rsp+0xd8]
7802: 00
7803: 48 8b 4c 24 48 mov rcx,QWORD PTR [rsp+0x48]
7808: 48 8b 74 24 28 mov rsi,QWORD PTR [rsp+0x28]
780d: 48 01 f1 add rcx,rsi
7810: 48 89 44 24 78 mov QWORD PTR [rsp+0x78],rax
7815: 48 89 8c 24 80 00 00 mov QWORD PTR [rsp+0x80],rcx
781c: 00
781d: 48 8b 74 24 78 mov rsi,QWORD PTR [rsp+0x78]
7822: 48 8b 84 24 80 00 00 mov rax,QWORD PTR [rsp+0x80]
7829: 00
782a: 48 89 44 24 20 mov QWORD PTR [rsp+0x20],rax
782f: e8 ac fc ff ff call 74e0 <_ZN4core3fmt10ArgumentV13new17ha06e67a360d64741E>
...
78e0: 48 8d 3d 81 3f 25 00 lea rdi,[rip+0x253f81] # 25b868 <panic_bounds_check_loc.4>
78e7: 48 8b 74 24 28 mov rsi,QWORD PTR [rsp+0x28]
78ec: 48 8b 54 24 50 mov rdx,QWORD PTR [rsp+0x50]
78f1: e8 4a f3 03 00 call 46c40 <_ZN4core9panicking18panic_bounds_check17h155238769b791565E>
Oof, that’s a lot more assembly to read now. One thing to immediately notice is that the addresses are much shorter. The Rust compiler by default generates a Position Independent Executable which means this program can be placed anywhere at runtime. This makes the program safer by opting into ASLR. Additionally, the names are mangled. Rust, C++, and others do this to help out the linker that would otherwise have trouble with same named functions when it comes to polymorphism, function overloading, and so on.
Starting at 0x79e2, we begin setting up the call to use_slice
. We follow the same x86-64 calling convention as always and start loading up the registers;
rdi
=[rsp+0x7f]
rsi
=0x9
rdx
=QWORD PTR [rsp+0x48]
But wait! That’s three arguments, and use_slice is clearly defined to take just two. Where’s this spooky third parameter coming from?!? It turns out that Rust sneaks in an extra parameter to the function and uses it to hold the size of the slice we’re passing in. The other two parameters are what we would expect; rdi
holds the address of the buffer (created on the stack at 0x7911 to 0x794e) and rdx
holds the current index of the slice, pulled from the parent function’s stack frame.
Following the use_slice function down to 0x77ce, we end up at a compare; we’re comparing rcx
(which now holds the input index) and rdi
(which now holds the slice size). With the cmp
and following setb r8b
instruction, we end up setting r8b
to 1 if the input index is below the slice size. That value is later used to branch at 0x77e8 and 0x77ea.
Following the failure branch to 0x78e0, we can see the binary calls panic_bounds_check, which is what’s used to print that error message and exit. The success branch takes us to 0x77ef, where we load rcx
with the address of the input buffer (0x7803) and rsi
with the index we want to read at. We then perform the addition (0x780d) - exactlylike we did in the C program - and use its value to start crafting the format string to print (0x782f).
So from this example, we can see that while Rust and C do the same thing to read the byte, Rust sneaks in an extra parameter and some logic to make sure what you’re doing is acceptable.
Dynamically Sized Slices
Wait, what about slices whose sizes aren’t known at compile time, like with Vectors? The previous example had the length hardcoded into the logic (0x79e2) but that clearly doesn’t work for something like a Vector that can hold any number of elements at any given time. The answer is somewhat obvious: the program just grabs the size and uses that as the parameter.
0000000000007fd0 <_ZN67_$LT$collections..vec..Vec$LT$T$GT$$u20$as$u20$core..ops..Deref$GT$5deref17h9b56f5d54a9107ccE>:
7fd0: 48 83 ec 28 sub rsp,0x28
7fd4: 48 89 7c 24 20 mov QWORD PTR [rsp+0x20],rdi
7fd9: eb 00 jmp 7fdb <_ZN67_$LT$collections..vec..Vec$LT$T$GT$$u20$as$u20$core..ops..Dere
7fdb: 48 8b 7c 24 20 mov rdi,QWORD PTR [rsp+0x20]
7fe0: e8 7b f7 ff ff call 7760 <_ZN40_$LT$alloc..raw_vec..RawVec$LT$T$GT$$GT$3ptr17h8f31e2209b06ee3
7fe5: 48 89 44 24 18 mov QWORD PTR [rsp+0x18],rax
7fea: 48 8b 7c 24 18 mov rdi,QWORD PTR [rsp+0x18]
7fef: e8 7c fa ff ff call 7a70 <_ZN4core3ptr31_$LT$impl$u20$$BP$mut$u20$T$GT$7is_null17h0e69dab9481
7ff4: 88 44 24 17 mov BYTE PTR [rsp+0x17],al
7ff8: 48 8b 44 24 20 mov rax,QWORD PTR [rsp+0x20]
7ffd: 48 8b 70 10 mov rsi,QWORD PTR [rax+0x10]
8001: 48 8b 7c 24 18 mov rdi,QWORD PTR [rsp+0x18]
8006: e8 85 fb ff ff call 7b90 <_ZN4core5slice14from_raw_parts17h1aece967a1ab5fe2E>
800b: 48 89 44 24 08 mov QWORD PTR [rsp+0x8],rax
8010: 48 89 14 24 mov QWORD PTR [rsp],rdx
8014: 48 8b 44 24 08 mov rax,QWORD PTR [rsp+0x8]
8019: 48 8b 14 24 mov rdx,QWORD PTR [rsp]
801d: 48 83 c4 28 add rsp,0x28
There are a few things to note here. At 0x7ffd, we’re grabbing the Vector’s current size. In Rust, a Vector is a struct of 3 elements: a pointer to the data, the capacity, and the size. Since my platform is 64bit, reading [rax+0x10]
ends up requesting the 3rd element (size). That size and the pointer to the data are then passed to slice::from_raw_parts (0x8006). The function then returns the data needed to pass to a function that accepts slices: rax
has the data and rdx
has ths size.
Conclusion
For a language that is mostly safety it’s no surprise that any of this is taking place behind the scenes. Even though it’s easy to have an idea how these things are implemented, it’s still a fun exercise to take apart the code that’s actually executing on your machine and see things for yourself.
A question I saw discussed often (not any more, cause I’m late to Rust) is whether or not Rust would have stopped heartbleed. The tl;dr on heartbleed was that you could send it a packet containing a size and arbitrary data, and the server would respond with the data at the size you specified. If you sent it a size bigger than the length of the data, you would end up getting extra data back - often containing sensitive information. Based on the experiments above, we know that a binary generated by the Rust compiler would not let you do that.