Introduction

Rust bills itself as a memory safe language, due in part to the way the compiler forces you to write code. With the exception of using the unsafe keyword, entire bug classes are demolished: use after frees, buffer overflows, format string bugs, etc. You can read more about how Rust defines “safe” in the documentation.

In my spare time, I like to play CTF which often involves reverse engineering binaries to find bugs. As a small exercise, I took an approach similar to CTF to take a closer look at how Rust programs defend against out of bound reads/writes. All my experiments are on 64-bit linux.

C

Let’s start by proposing a simple C program:

#include <stdio.h>

void use_arr(char* buf, int index) {
    printf("Byte at [%d] = %c\n", index, buf[index]);
}

int main(int argc, char** argv) {
    int i;
    char test[] = { 0x41, 0x24, 0x24, 0x3a, 0x20, 0x43, 0x30, 0x30, 0x31 };
    for (i=0; i < 30; i++) {
        use_arr(test, i);
    }
}

This lil program is a bit contrived but not unrealistic. We have an array of chars containing 9 characters, but the for loop reads 30. From reading the source we know this is bad, but the compiler has no way of knowing that we would end up reading out of bounds. In a real vulnerability in a real program, the source of the out of bounds read would typically be user input being used to determine how far into the buffer we should read. In many cases the same code paths could also lead to an out of bounds write, possibly leading to code execution. Running the program, we get this output:

Byte at [0] = A
Byte at [1] = $
Byte at [2] = $
Byte at [3] = :
Byte at [4] =
Byte at [5] = C
Byte at [6] = 0
Byte at [7] = 0
Byte at [8] = 1
Byte at [9] = ^D
Byte at [10] = @
Byte at [11] = ^@
Byte at [12] = ^@
Byte at [13] = ^@
Byte at [14] = ^@
Byte at [15] = ^@
Byte at [16] = <F0>
Byte at [17] = <CD>
Byte at [18] = <95>
Byte at [19] = f
Byte at [20] = <FF>
Byte at [21] = ^?
Byte at [22] = ^@
Byte at [23] = ^@
Byte at [24] = ^@
Byte at [25] = G
Byte at [26] = ^Q
Byte at [27] = <D4>
Byte at [28] = F
Byte at [29] = <BA>

Let’s look at the (unoptimized) code generated for use_arr to see what’s happening:

0000000000400564 <use_arr>:
  400564:       55                      push   rbp
  400565:       48 89 e5                mov    rbp,rsp
  400568:       48 83 ec 10             sub    rsp,0x10
  40056c:       48 89 7d f8             mov    QWORD PTR [rbp-0x8],rdi
  400570:       89 75 f4                mov    DWORD PTR [rbp-0xc],esi
  400573:       8b 45 f4                mov    eax,DWORD PTR [rbp-0xc]
  400576:       48 98                   cdqe
  400578:       48 03 45 f8             add    rax,QWORD PTR [rbp-0x8]
  40057c:       0f b6 00                movzx  eax,BYTE PTR [rax]
  40057f:       0f be d0                movsx  edx,al
  400582:       b8 0c 07 40 00          mov    eax,0x40070c
  400587:       8b 4d f4                mov    ecx,DWORD PTR [rbp-0xc]
  40058a:       89 ce                   mov    esi,ecx
  40058c:       48 89 c7                mov    rdi,rax
  40058f:       b8 00 00 00 00          mov    eax,0x0
  400594:       e8 c7 fe ff ff          call   400460 <printf@plt>
  400599:       c9                      leave
  40059a:       c3                      ret

At 0x40056c and 0x400570 we see rdi and esi being loaded into the stack. This is the calling convention on x86-64; we are loading the two arguments (char* buf and int index, respectively) into the stack. After some shuffling and integer promotion, the base of the array is added to the offset (0x400578), the byte from that memory location is read (0x40057c), and then that byte is printed (0x400594). This is pretty much what we would expect: reading/writing out of bounds is a result of naively adding offsets. There’s not much else that we can do since there’s no way of knowing if we’re within range.

In this example, we just end up leaking useless data from the stack. In a real world application however, a bug like this could end up leaking critical information from the process’ memory: passwords, encryption keys, etc.

Rust Slices

Let’s translate the C to its equivalent in Rust:

fn use_slice(slicerino: &[u8], offset: usize) -> () {
    println!("Byte at [{}] = 0x{:x}", offset, slicerino[offset]);
}

fn main() -> () {
    let test = [0x41, 0x24, 0x24, 0x3a, 0x20, 0x43, 0x30, 0x30, 0x31];
    for i in 0..30 {
        use_slice(&test, i);
    }
}

Running it, we quickly see Rust in action:

Byte at [0] = 0x41
Byte at [1] = 0x24
Byte at [2] = 0x24
Byte at [3] = 0x3a
Byte at [4] = 0x20
Byte at [5] = 0x43
Byte at [6] = 0x30
Byte at [7] = 0x30
Byte at [8] = 0x31
thread 'main' panicked at 'index out of bounds: the len is 9 but the index is 9', slicer.rs:3
note: Run with `RUST_BACKTRACE=1` for a backtrace.

Let’s take a look at the code generated:

0000000000007900 <_ZN6slicer4main17h687ba6627330621bE>:
    7900:       48 81 ec 88 00 00 00    sub    rsp,0x88
    7907:       48 8d 7c 24 30          lea    rdi,[rsp+0x30]
    790c:       48 8d 74 24 20          lea    rsi,[rsp+0x20]
    7911:       c6 44 24 7f 41          mov    BYTE PTR [rsp+0x7f],0x41
    7916:       c6 84 24 80 00 00 00    mov    BYTE PTR [rsp+0x80],0x24
    791e:       c6 84 24 81 00 00 00    mov    BYTE PTR [rsp+0x81],0x24
    7926:       c6 84 24 82 00 00 00    mov    BYTE PTR [rsp+0x82],0x3a
    792e:       c6 84 24 83 00 00 00    mov    BYTE PTR [rsp+0x83],0x20
    7936:       c6 84 24 84 00 00 00    mov    BYTE PTR [rsp+0x84],0x43
    793e:       c6 84 24 85 00 00 00    mov    BYTE PTR [rsp+0x85],0x30
    7946:       c6 84 24 86 00 00 00    mov    BYTE PTR [rsp+0x86],0x30
    794e:       c6 84 24 87 00 00 00    mov    BYTE PTR [rsp+0x87],0x31
...
    79e2:       b8 09 00 00 00          mov    eax,0x9
    79e7:       89 c6                   mov    esi,eax
    79e9:       48 8d 4c 24 7f          lea    rcx,[rsp+0x7f]
    79ee:       48 8b 54 24 48          mov    rdx,QWORD PTR [rsp+0x48]
    79f3:       48 89 cf                mov    rdi,rcx
    79f6:       e8 95 fd ff ff          call   7790 <_ZN6slicer9use_slice17h4185f678b7a09df8E>
    79fb:       eb ac                   jmp    79a9 <_ZN6slicer4main17h687ba6627330621bE+0xa9>

0000000000007790 <_ZN6slicer9use_slice17h4185f678b7a09df8E>:
    7790:       48 81 ec e8 00 00 00    sub    rsp,0xe8
    7797:       48 89 74 24 50          mov    QWORD PTR [rsp+0x50],rsi
    779c:       48 89 7c 24 48          mov    QWORD PTR [rsp+0x48],rdi
    77a1:       48 89 54 24 40          mov    QWORD PTR [rsp+0x40],rdx
    77a6:       48 8b 44 24 40          mov    rax,QWORD PTR [rsp+0x40]
    77ab:       48 89 84 24 d8 00 00    mov    QWORD PTR [rsp+0xd8],rax
    77b2:       00
    77b3:       48 8b 35 76 40 25 00    mov    rsi,QWORD PTR [rip+0x254076]        # 25b830 <_ZN6slicer9use_slice15__STATIC
    77ba:       48 8b 15 77 40 25 00    mov    rdx,QWORD PTR [rip+0x254077]        # 25b838 <_ZN6slicer9use_slice15__STATIC
    77c1:       48 8b 8c 24 d8 00 00    mov    rcx,QWORD PTR [rsp+0xd8]
    77c8:       00
    77c9:       48 8b 7c 24 50          mov    rdi,QWORD PTR [rsp+0x50]
    77ce:       48 39 f9                cmp    rcx,rdi
    77d1:       41 0f 92 c0             setb   r8b
    77d5:       41 f6 c0 01             test   r8b,0x1
    77d9:       48 89 74 24 38          mov    QWORD PTR [rsp+0x38],rsi
    77de:       48 89 54 24 30          mov    QWORD PTR [rsp+0x30],rdx
    77e3:       48 89 4c 24 28          mov    QWORD PTR [rsp+0x28],rcx
    77e8:       75 05                   jne    77ef <_ZN6slicer9use_slice17h4185f678b7a09df8E+0x5f>
    77ea:       e9 f1 00 00 00          jmp    78e0 <_ZN6slicer9use_slice17h4185f678b7a09df8E+0x150>
...
    77ef:       48 8d 7c 24 68          lea    rdi,[rsp+0x68]
    77f4:       48 8b 15 d5 56 25 00    mov    rdx,QWORD PTR [rip+0x2556d5]        # 25ced0 <_DYNAMIC+0x338>
    77fb:       48 8d 84 24 d8 00 00    lea    rax,[rsp+0xd8]
    7802:       00
    7803:       48 8b 4c 24 48          mov    rcx,QWORD PTR [rsp+0x48]
    7808:       48 8b 74 24 28          mov    rsi,QWORD PTR [rsp+0x28]
    780d:       48 01 f1                add    rcx,rsi
    7810:       48 89 44 24 78          mov    QWORD PTR [rsp+0x78],rax
    7815:       48 89 8c 24 80 00 00    mov    QWORD PTR [rsp+0x80],rcx
    781c:       00
    781d:       48 8b 74 24 78          mov    rsi,QWORD PTR [rsp+0x78]
    7822:       48 8b 84 24 80 00 00    mov    rax,QWORD PTR [rsp+0x80]
    7829:       00
    782a:       48 89 44 24 20          mov    QWORD PTR [rsp+0x20],rax
    782f:       e8 ac fc ff ff          call   74e0 <_ZN4core3fmt10ArgumentV13new17ha06e67a360d64741E>
...
    78e0:       48 8d 3d 81 3f 25 00    lea    rdi,[rip+0x253f81]        # 25b868 <panic_bounds_check_loc.4>
    78e7:       48 8b 74 24 28          mov    rsi,QWORD PTR [rsp+0x28]
    78ec:       48 8b 54 24 50          mov    rdx,QWORD PTR [rsp+0x50]
    78f1:       e8 4a f3 03 00          call   46c40 <_ZN4core9panicking18panic_bounds_check17h155238769b791565E>

Oof, that’s a lot more assembly to read now. One thing to immediately notice is that the addresses are much shorter. The Rust compiler by default generates a Position Independent Executable which means this program can be placed anywhere at runtime. This makes the program safer by opting into ASLR. Additionally, the names are mangled. Rust, C++, and others do this to help out the linker that would otherwise have trouble with same named functions when it comes to polymorphism, function overloading, and so on.

Starting at 0x79e2, we begin setting up the call to use_slice. We follow the same x86-64 calling convention as always and start loading up the registers;

  • rdi = [rsp+0x7f]
  • rsi = 0x9
  • rdx = QWORD PTR [rsp+0x48]

But wait! That’s three arguments, and use_slice is clearly defined to take just two. Where’s this spooky third parameter coming from?!? It turns out that Rust sneaks in an extra parameter to the function and uses it to hold the size of the slice we’re passing in. The other two parameters are what we would expect; rdi holds the address of the buffer (created on the stack at 0x7911 to 0x794e) and rdx holds the current index of the slice, pulled from the parent function’s stack frame.

Following the use_slice function down to 0x77ce, we end up at a compare; we’re comparing rcx (which now holds the input index) and rdi (which now holds the slice size). With the cmp and following setb r8b instruction, we end up setting r8b to 1 if the input index is below the slice size. That value is later used to branch at 0x77e8 and 0x77ea.

Following the failure branch to 0x78e0, we can see the binary calls panic_bounds_check, which is what’s used to print that error message and exit. The success branch takes us to 0x77ef, where we load rcx with the address of the input buffer (0x7803) and rsi with the index we want to read at. We then perform the addition (0x780d) - exactlylike we did in the C program - and use its value to start crafting the format string to print (0x782f).

So from this example, we can see that while Rust and C do the same thing to read the byte, Rust sneaks in an extra parameter and some logic to make sure what you’re doing is acceptable.

Dynamically Sized Slices

Wait, what about slices whose sizes aren’t known at compile time, like with Vectors? The previous example had the length hardcoded into the logic (0x79e2) but that clearly doesn’t work for something like a Vector that can hold any number of elements at any given time. The answer is somewhat obvious: the program just grabs the size and uses that as the parameter.

0000000000007fd0 <_ZN67_$LT$collections..vec..Vec$LT$T$GT$$u20$as$u20$core..ops..Deref$GT$5deref17h9b56f5d54a9107ccE>:
    7fd0:       48 83 ec 28             sub    rsp,0x28
    7fd4:       48 89 7c 24 20          mov    QWORD PTR [rsp+0x20],rdi
    7fd9:       eb 00                   jmp    7fdb <_ZN67_$LT$collections..vec..Vec$LT$T$GT$$u20$as$u20$core..ops..Dere
    7fdb:       48 8b 7c 24 20          mov    rdi,QWORD PTR [rsp+0x20]
    7fe0:       e8 7b f7 ff ff          call   7760 <_ZN40_$LT$alloc..raw_vec..RawVec$LT$T$GT$$GT$3ptr17h8f31e2209b06ee3
    7fe5:       48 89 44 24 18          mov    QWORD PTR [rsp+0x18],rax
    7fea:       48 8b 7c 24 18          mov    rdi,QWORD PTR [rsp+0x18]
    7fef:       e8 7c fa ff ff          call   7a70 <_ZN4core3ptr31_$LT$impl$u20$$BP$mut$u20$T$GT$7is_null17h0e69dab9481
    7ff4:       88 44 24 17             mov    BYTE PTR [rsp+0x17],al
    7ff8:       48 8b 44 24 20          mov    rax,QWORD PTR [rsp+0x20]
    7ffd:       48 8b 70 10             mov    rsi,QWORD PTR [rax+0x10]
    8001:       48 8b 7c 24 18          mov    rdi,QWORD PTR [rsp+0x18]
    8006:       e8 85 fb ff ff          call   7b90 <_ZN4core5slice14from_raw_parts17h1aece967a1ab5fe2E>
    800b:       48 89 44 24 08          mov    QWORD PTR [rsp+0x8],rax
    8010:       48 89 14 24             mov    QWORD PTR [rsp],rdx
    8014:       48 8b 44 24 08          mov    rax,QWORD PTR [rsp+0x8]
    8019:       48 8b 14 24             mov    rdx,QWORD PTR [rsp]
    801d:       48 83 c4 28             add    rsp,0x28

There are a few things to note here. At 0x7ffd, we’re grabbing the Vector’s current size. In Rust, a Vector is a struct of 3 elements: a pointer to the data, the capacity, and the size. Since my platform is 64bit, reading [rax+0x10] ends up requesting the 3rd element (size). That size and the pointer to the data are then passed to slice::from_raw_parts (0x8006). The function then returns the data needed to pass to a function that accepts slices: rax has the data and rdx has ths size.

Conclusion

For a language that is mostly safety it’s no surprise that any of this is taking place behind the scenes. Even though it’s easy to have an idea how these things are implemented, it’s still a fun exercise to take apart the code that’s actually executing on your machine and see things for yourself.

A question I saw discussed often (not any more, cause I’m late to Rust) is whether or not Rust would have stopped heartbleed. The tl;dr on heartbleed was that you could send it a packet containing a size and arbitrary data, and the server would respond with the data at the size you specified. If you sent it a size bigger than the length of the data, you would end up getting extra data back - often containing sensitive information. Based on the experiments above, we know that a binary generated by the Rust compiler would not let you do that.