ROPV – ROP gadget finder tool for RISC-V binaries

As my first post, I would like to talk about my final degree project. The idea behind the project was to build a program like ropshell that searches for ROP gadgets, but focused only on binaries for the RISC-V architecture. In this post I’ll explain the motivation behind this work, give some background on Return Oriented Programming, and finally discuss some relevant parts of the implementation along with the future improvements I would like to make.

Motivation

Over the last year, I became curious about what a buffer overflow is and how you can take advantage of it. This curiosity motivated me to take a course that explores different techniques, such as ROP, ret2libc and shellcode injection, that you can use whenever a buffer overflow vulnerability exists. At the end of the course, I was fascinated. How did anyone come up with those ways of taking advantage of a buffer overflow?

When I started to think about my project, I searched the internet for tools that help with ROP attacks. I found tools like ropper, ROPgadget and a module inside pwntools, but none of them supported binaries for the RISC-V architecture. At this point I was motivated enough to build a tool focused on RISC-V binaries, both because the architecture is so young and because none of the existing tools worked with it.

What is ROP?

ROP (Return Oriented Programming) is a computer security exploit technique where an attacker gains control of the call stack to hijack program flow by executing chosen machine instruction sequences that are already present in the machine’s memory, called “gadgets”. Each gadget typically ends in a return instruction and is located in a subroutine within the existing program and/or shared library code. Chained together, these gadgets allow an attacker to perform arbitrary operations. Now let’s do an example:

Imagine that we have a binary, where the following instructions are present:

  • At address 0x08050000, li a0, 0x0
  • At address 0x08050004, ecall
  • At address 0x08050008, lw a7, 4(sp)
  • At address 0x0805000C, lw ra, 0(sp)
  • At address 0x08050010, ret

What we will try to do is write the value 0x5D (the exit syscall number) into the a7 register, so that the program finishes its execution in a natural way. The following figure shows the state of the stack before we call the vulnerable function.

Figure 1: Example of chaining ROP gadgets

I have represented each gadget in a different color (the purple parts represent padding) to keep the example clear. The steps we are going to follow to achieve this objective are the following:

  1. The value 0x08050008 overwrites the saved return address, so when the vulnerable function returns, it is loaded from the stack into ra and the program jumps to that address. This corresponds to the first gadget.
    • lw a7, 4(sp): This instruction will load the value 0x5D into the a7 register.
    • lw ra, 0(sp): This instruction will load the address of the next gadget into the ra register.
    • ret: Finally, this instruction makes the program continue its execution from the address that the previous instruction loaded into the ra register.
  2. At this point the first gadget has finished, so the program continues with the second gadget, located at address 0x08050000.
    • li a0, 0x0: This instruction will load the value 0x0 into the a0 register, which holds the exit code.
    • ecall: This instruction requests the OS to perform the exit system call. It works because the syscall number was previously placed in the a7 register and the exit code is in the a0 register.
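
The chain described above can be sketched as a small payload builder. This is only an illustration, and it rests on assumptions the example does not state: a 32-bit little-endian target, a stack pointer that sits right after the saved return address once the vulnerable function returns, and a padding length (PADDING_LEN) that is purely hypothetical and depends on the target binary.

```python
import struct

# Addresses and values taken from the example above.
GADGET_LOAD_A7 = 0x08050008  # lw a7, 4(sp); lw ra, 0(sp); ret
GADGET_EXIT = 0x08050000     # li a0, 0x0; ecall
SYS_EXIT = 0x5D              # exit syscall number, destined for a7

# Hypothetical distance, in bytes, from the start of the vulnerable
# buffer to the saved return address. It depends on the target.
PADDING_LEN = 16


def p32(value: int) -> bytes:
    """Pack a 32-bit value in little-endian byte order."""
    return struct.pack("<I", value)


def build_payload(padding_len: int = PADDING_LEN) -> bytes:
    payload = b"A" * padding_len    # filler up to the saved return address
    payload += p32(GADGET_LOAD_A7)  # overwrites the saved return address
    payload += p32(GADGET_EXIT)     # read by lw ra, 0(sp) in the first gadget
    payload += p32(SYS_EXIT)        # read by lw a7, 4(sp) in the first gadget
    return payload


print(build_payload().hex())
```

Note that the first gadget never moves sp, which is fine here because the second gadget does not read anything from the stack.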

As you can see, this technique is very powerful. We can take advantage of the instructions available in the binary to perform almost any action. The only drawback I can see from the RISC-V perspective is that a larger number of instructions is required to create a gadget. In contrast, on x86, since instructions are encoded with a variable number of bytes, we can extract instructions from almost anywhere, for example by jumping into the middle of a “legit” instruction.

Another aspect to take into account is that on RISC-V the ret instruction doesn’t pop the return address into the program counter; instead, it returns the execution flow by jumping to the address held in the ra register. That means our gadgets will need an instruction like lw ra, x(y).
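
To make this requirement concrete, here is a minimal sketch of how one could scan raw code bytes for a ret that is preceded by a lw ra, imm(sp). It is not how ROPV itself works. It assumes 32-bit, 4-byte-aligned RV32I instructions only (compressed instructions are ignored), and the function name and the depth parameter are made up for the illustration.

```python
import struct

RET = 0x00008067            # jalr zero, 0(ra), the expansion of ret
LW_RA_SP_MASK = 0x000FFFFF  # keep everything except the 12-bit immediate
LW_RA_SP_BITS = 0x00012083  # lw ra, imm(sp) with the immediate zeroed


def find_gadget_ends(code: bytes, base: int, depth: int = 4) -> list:
    """Return the addresses of every `ret` that has a `lw ra, imm(sp)`
    among the `depth` instructions right before it."""
    count = len(code) // 4
    words = struct.unpack(f"<{count}I", code[: count * 4])
    hits = []
    for i, word in enumerate(words):
        if word != RET:
            continue
        window = words[max(0, i - depth): i]
        if any((w & LW_RA_SP_MASK) == LW_RA_SP_BITS for w in window):
            hits.append(base + 4 * i)
    return hits


# The five instructions of the earlier example, encoded as RV32I words:
# li a0, 0x0 / ecall / lw a7, 4(sp) / lw ra, 0(sp) / ret
EXAMPLE = struct.pack(
    "<5I", 0x00000513, 0x00000073, 0x00412883, 0x00012083, RET
)
print([hex(addr) for addr in find_gadget_ends(EXAMPLE, 0x08050000)])
```

A real finder would also have to handle the compressed instruction set and walk backwards from each ret to list the full gadget, which this sketch leaves out.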

From a defensive perspective, some techniques can be used to prevent an attacker from using ROP. The main countermeasure is called the stack canary. It works by placing a random value (the canary), chosen for each execution, before the return address, so that if the return address is overwritten, the canary is overwritten too and the program stops its execution by calling abort. You may have noticed that this technique is not specific to ROP attacks; it is aimed at preventing the attacker from taking advantage of buffer overflow vulnerabilities.
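
The idea can be modelled in a few lines. This is a toy model, not how a compiler implements it: the frame layout [buffer][canary][return address], the sizes and the function name are all assumptions made for the illustration, and a RuntimeError stands in for the call to abort.

```python
import secrets

BUF_SIZE = 8
CANARY_SIZE = 4

# Chosen once per run, like the canary of a real process.
CANARY = secrets.token_bytes(CANARY_SIZE)


def call_vulnerable(user_input: bytes,
                    return_address: bytes = b"\x10\x00\x05\x08") -> bytes:
    """Copy user_input into a modelled stack frame without a bounds
    check on the buffer, then verify the canary before 'returning'."""
    frame = bytearray(BUF_SIZE) + CANARY + return_address
    # Input longer than BUF_SIZE spills into the canary first and only
    # then into the return address.
    frame[: len(user_input)] = user_input[: len(frame)]
    if bytes(frame[BUF_SIZE: BUF_SIZE + CANARY_SIZE]) != CANARY:
        raise RuntimeError("stack smashing detected")  # stands in for abort
    return bytes(frame[BUF_SIZE + CANARY_SIZE:])


print(call_vulnerable(b"hello").hex())  # fits in the buffer, check passes
try:
    call_vulnerable(b"A" * 16)          # overflows into the canary
except RuntimeError as err:
    print(err)
```

Because the canary sits between the buffer and the return address, a linear overflow cannot reach the return address without changing the canary first.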

ASLR (Address Space Layout Randomization) randomly arranges the address space positions of key data areas of a process, including the base of the executable and the positions of the stack, heap and libraries. This makes it more difficult to predict the address where a function (like printf) will be placed. To use a ROP attack, we first need to defeat ASLR.
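
One common way to defeat ASLR is to leak the run-time address of a known function and subtract its offset inside the library, since ASLR moves the library as a whole and the offsets between its symbols stay fixed. The sketch below only shows that arithmetic. Every number in it is hypothetical, and obtaining the leak is a separate problem that is not covered here.

```python
# Hypothetical offsets of two symbols inside a shared library.
PRINTF_OFFSET = 0x4F2A0
SYSTEM_OFFSET = 0x3D1C0
PAGE_SIZE = 0x1000


def rebase(leaked_printf: int) -> dict:
    """Recover the randomized library base from a leaked printf address
    and derive the run-time address of another symbol from it."""
    base = leaked_printf - PRINTF_OFFSET
    # Mappings are page-aligned, so a misaligned base means the leak or
    # the offset is wrong.
    if base % PAGE_SIZE != 0:
        raise ValueError("leak does not match the expected offset")
    return {"base": base, "system": base + SYSTEM_OFFSET}


# Hypothetical leaked value, for illustration only.
addresses = rebase(0x3FF7A4F2A0)
print({name: hex(value) for name, value in addresses.items()})
```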

The last technique I’m discussing is CFI (Control Flow Integrity). Like the stack canary, this technique is not specific to ROP attacks; instead, it is aimed at preventing techniques that redirect the control flow of a program. CFI has a variety of implementations, like Microsoft’s Control Flow Guard, Clang’s CFI or Intel’s Control-flow Enforcement Technology.

Future work

As you may have noticed, this tool has a strong dependency on the RISC-V toolchain. The work I’m doing right now is integrating the capstone engine as a disassembler, so I can remove the RISC-V toolchain from the project and make the program more portable. At this moment, I’m still experimenting with the engine. All the work is being done in the capstone branch of the repository.
