ARM assembly-language programming for the Raspberry Pi

11. Using the stack to store local data

"Local data" in this context means data that is required for the duration of a specific function, but which is too large to fit into registers, or not of an appropriate type. We have already seen how to use the print_str function to print a constant string, stored in a data segment. This example demonstrates how we could build a string programmatically, by storing its characters on the stack, and then calling print_str on the address in the stack. The string could be quite long in practice -- almost certainly too long to store in registers alone. Even if the data could be stored in registers, many external software components, including the kernel, sometimes expect data to be passed in memory.

I'm assuming that you've read and understood the operation of the stack pointer, which I explained a couple of articles back. If you skipped that section, I would recommend that you flick back to it, because the following will make no sense if you don't understand how the stack pointer works, and how the stack is organized in memory.

The example

This example has a function unimaginatively called test that builds a two-character message (plus a terminating null) on the stack and then prints it. I'm only including the specific function here -- the rest of the code is as I've shown in previous sections.

TEST_LOCAL = 8              // How much data to reserve on the stack - 8 bytes
test:
    push   {r0, r1, fp, lr} // Store the registers we will overwrite
    sub    sp, $TEST_LOCAL  // Move the stack _down_ to allow for our data

    mov    %fp, %sp         // %fp will reference the start of our 8-byte area

    mov    %r1, %fp         // Use %r1 to count the position we are writing
    mov    %r0, $79         // Load 'O' (char 79) into %r0
    strb   %r0,[%r1]        // Write the 'O' to memory in the stack
    add    %r1, $1          // And increment the offset by one byte
    mov    %r0, $75         // Load 'K' (char 75) into %r0
    strb   %r0,[%r1]        // Write the 'K' to memory in the stack
    add    %r1, $1          // And increment the offset again
    mov    %r0, $0          // Load the terminating null character
    strb   %r0,[%r1]        // And write it out

    mov    %r0, %fp         // print_str needs an address in %r0, so copy fp
    bl     print_str        // Print the string


    add    sp, $TEST_LOCAL  // Move the stack pointer over our data area
    pop    {r0, r1, fp, lr} // and restore the registers
    bx     lr               // Return to the caller

How it works

The stack management technique this function uses is commonplace -- not just in assembly programming, but in the code generated by compilers. Because it's so common, ARM code conventionally sets aside a specific register, the frame pointer (fp, which the GNU assembler treats as another name for r11), for keeping track of local data on the stack. This particular example has no need to use the frame pointer, but I do, just to show how it's usually done.

First we define a symbol (an assembly-time constant, not a variable in memory) indicating the amount of data to reserve on the stack.

TEST_LOCAL = 8

However much data we expect to store, this data size should be a multiple of four or eight bytes, as I explained in the earlier section on alignment.
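
Rather than doing the rounding by hand, one option is to let the assembler do it. This is only a sketch, assuming the GNU assembler's expression syntax and the same comment style as the listing above; TEST_NEEDED is a hypothetical name, not something used elsewhere in these articles.

```asm
TEST_NEEDED = 3                       // Hypothetical: bytes actually used ('O', 'K', null)
TEST_LOCAL  = (TEST_NEEDED + 7) & ~7  // Round up to the next multiple of 8
```

With TEST_NEEDED set to 3, the expression (3 + 7) & ~7 evaluates to 8, which is the value the example uses.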

Then we push the registers that the function will change. In this case the function will change the frame pointer fp in addition to the usual registers, so we need to preserve its value on the stack.

    push   {r0, r1, fp, lr}

Then we expand the stack to make room for the local data. Because the stack is full descending, we must expand it by reducing the value of the stack pointer. The amount we reduce it by is the size of the local data block. We use the sub (subtract) instruction for this.

    sub    sp, $TEST_LOCAL

There is now a "free" area of memory starting at the current value of sp, and extending for eight bytes. After that come the original contents of the stack, which we really, really don't want to tamper with.
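
To make the layout concrete, here is a sketch of the stack as it stands just after the sub instruction, worked out from the push and the 8-byte reservation in the listing above. Offsets are in bytes from sp; push stores the lowest-numbered register at the lowest address.

```asm
// Offset from sp   Contents
//   sp + 20        saved lr
//   sp + 16        saved fp
//   sp + 12        saved r1
//   sp + 8         saved r0
//   sp + 0 .. 7    the 8-byte local data area (TEST_LOCAL bytes)
```

The saved registers sit directly above the local area, so a write that runs past the eighth byte lands on them.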

Now we set the frame pointer to the same address as the stack pointer, so we can use it as a base for later operations on the local data area.

    mov    %fp, %sp

At the end of the function we reverse the set-up we did on entry, moving the stack pointer back up to the original top-of-stack, and then popping off the registers we saved.

    add    sp, $TEST_LOCAL  
    pop    {r0, r1, fp, lr}

The main body of the function just uses strb operations to write data into memory. We discussed ldrb earlier, and you won't be surprised to find that strb (store byte) is equally inefficient. However, working byte-by-byte is easy to follow.
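
As an aside, the separate add instructions are not strictly needed. The following sketch, which assumes the same register usage as the listing above, uses ARM's post-indexed addressing mode to store a byte and advance the pointer in a single instruction. It is an alternative way to write the same three bytes, not the method used in the example.

```asm
    mov    %r1, %fp          // %r1 points at the next byte to write
    mov    %r0, $79          // 'O'
    strb   %r0, [%r1], $1    // Store the byte, then advance %r1 by one
    mov    %r0, $75          // 'K'
    strb   %r0, [%r1], $1    // Store the byte, then advance %r1 by one
    mov    %r0, $0           // Terminating null
    strb   %r0, [%r1]        // Last byte, so no need to advance
```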

Why use the frame pointer?

In my example I set the frame pointer to the start of the local data block, but the stack pointer was already there. Why not just use the stack pointer to reference the location of the data block?

We can certainly do that, and sometimes it is appropriate. However, we can't move the stack pointer around to step through the data, whereas we can freely move the frame pointer. The stack pointer has to stay put because we still need the stack for its main purpose -- preserving data across function calls. If we move the stack pointer up into our data block and then call another function, the called function will overwrite our data with its own stack contents, with catastrophic results.
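
A related benefit is that the frame pointer stays fixed when the stack pointer legitimately moves. The fragment below is a sketch of that idea rather than part of the example: r4 and r5 are arbitrary registers chosen for illustration, and two are pushed so the stack stays 8-byte aligned.

```asm
    mov    %fp, %sp          // %fp marks the start of the local area
    push   {r4, r5}          // sp moves down 8 bytes; %fp does not move
    mov    %r0, $75          // 'K'
    strb   %r0, [%fp, $1]    // Still the second byte of the local area
    pop    {r4, r5}          // sp is back where it started
```

Had the store been written relative to sp, its offset would have had to change from 1 to 9 between the push and the pop.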

Buffer overruns

It's worth thinking about what happens if we write data that extends beyond the 8-byte block we allocated. The simple answer is that it corrupts the stack and, if we're lucky, crashes the program.

A crash doesn't sound very lucky, but it's better than a potential alternative. A common method of intruder attack is to see if a program can be made to accept more data than is reserved on the stack. If it can, then the intruder can overwrite the stack with its own data. With a bit of care and experimentation, the intruder might be able to write over the return address -- the value of the link register on the stack. Assuming that the function manages to get to the end without causing a crash, the jump back to the calling program can be subverted by a jump to some address in the intruder's code. After that, anything can happen.

In the example above, we wrote a three-byte sequence of characters into an eight-byte memory area. There's little likelihood of the characters overrunning the buffer. However, when stack-based data is derived by computation, and particularly if the computation relies on external inputs, we have to be much more careful.

Assembly programs are not inherently more susceptible to buffer-overrun attacks than programs written in other languages -- most use the stack for temporary data storage, whether the programmer knows it or not. The difference is that coding all the checks that are needed to keep the data within bounds is particularly tedious in assembly language, and thus more likely not to be done.
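
To give a feel for what such a check involves, here is a minimal sketch of a bounds-checked copy into the local area. It assumes that %fp points at the TEST_LOCAL-byte area as in the example, and that %r2 holds the address of a null-terminated source string. Both %r2 and the symbol TEST_MAX are hypothetical; the test function as written neither uses nor saves r2.

```asm
TEST_MAX = TEST_LOCAL - 1    // Hypothetical: most characters we will store

    mov    %r1, $0           // %r1 is the offset into the local area
copy_loop:
    cmp    %r1, $TEST_MAX    // Is there still room for another character?
    bhs    copy_done         // No: stop, leaving space for the null
    ldrb   %r0, [%r2, %r1]   // Fetch the next source byte
    cmp    %r0, $0           // End of the source string?
    beq    copy_done
    strb   %r0, [%fp, %r1]   // Store it in the local area
    add    %r1, $1
    b      copy_loop
copy_done:
    mov    %r0, $0
    strb   %r0, [%fp, %r1]   // Terminate; %r1 is at most TEST_MAX here
```

The source string is silently truncated if it is too long. That may or may not be acceptable, but it keeps every write inside the reserved block.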

Summary

Temporary data used by a function can be stored on the stack.
Because the stack grows downward in memory, the stack pointer must be reduced to assign space.
ARM provides a frame pointer register to act as a base for stack data.
Carelessly allowing data to overrun the area allocated on the stack is a rich source of security hazards.