ARM assembly-language programming for the Raspberry Pi
11. Using the stack to store local data
"Local data" in this context means data that is required for the
duration of a specific function, but which is too large
to fit into registers, or not of an appropriate type. We have already
seen how to use the print_str function to print a constant
string, stored in a data segment. This example demonstrates how we
could build a string programmatically, by storing its characters
on the stack, and then calling print_str on the
address in the stack. The string could be quite long in practice --
almost certainly too long to store in registers alone.
Even if the data could be stored in registers, many external
software components, including the kernel, expect some
data to be passed in memory.
I'm assuming that you've read and understood the operation of the stack pointer, which I explained a couple of articles back. If you skipped that section, I would recommend that you flick back to it, because the following will make no sense if you don't understand how the stack pointer works, and how the stack is organized in memory.
The example
This example has a function unimaginatively called test
that generates a two-character message in the stack and then prints
it. I'm only including the specific function here -- the rest of the
code is as I've shown in previous sections.
TEST_LOCAL = 8 // How much data to reserve on the stack - 8 bytes

test:
    push {r0, r1, fp, lr}   // Store the registers we will overwrite
    sub sp, $TEST_LOCAL     // Move the stack _down_ to allow for our data
    mov %fp, %sp            // %fp will reference the start of our 8-byte area
    mov %r1, %fp            // Use %r1 to count the position we are writing
    mov %r0, $79            // Store 'O' (char 79)
    strb %r0,[%r1]          // Set the 'O' to memory in the stack
    add %r1, $1             // And increment the offset by one byte
    mov %r0, $75            // Store 'K' (char 75)
    strb %r0,[%r1]          // set the 'K' to memory in the stack
    add %r1, $1             // And increment the offset again
    mov %r0, $0             // Store the terminating null character
    strb %r0,[%r1]          // And write it out
    mov %r0, %fp            // print_str needs an address in %r0, so copy fp
    bl print_str            // Print the string
    add sp, $TEST_LOCAL     // Move the stack pointer over our data area
    pop {r0, r1, fp, lr}    // and restore the registers
    bx lr
How it works
The stack management technique this function uses is commonplace -- not just in assembly programming, but in the code generated by compilers. Because it's so common, ARM programs conventionally set aside a specific register, the frame pointer (r11, which the assembler also accepts as fp), for keeping track of local data on the stack. This particular example has no need to use the frame pointer, but I do, just to show how it's usually done.
First we define a constant indicating the amount of data to reserve on the stack.
TEST_LOCAL = 8
However much data we expect to store, this data size should be a multiple of four or eight bytes, as I explained in the earlier section on alignment.
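If the size of the data is likely to change, the rounding can be left to the assembler. This is only a sketch, assuming the GNU assembler's expression syntax; MSG_BYTES is a hypothetical name, not something defined in the earlier sections.

```asm
MSG_BYTES  = 3                      // 'O', 'K' and the terminating null (hypothetical name)
TEST_LOCAL = (MSG_BYTES + 7) & ~7   // Round up to the next multiple of 8, giving 8 here
```

Adding 7 and then clearing the bottom three bits leaves a value that is already a multiple of eight unchanged, and rounds anything else up.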
Then we push the registers that the function will change. In this case
the function will change the frame pointer fp in addition to the
usual registers, so we need to preserve its value on the stack.
push {r0, r1, fp, lr}
Then we expand the stack to make room for the local data. Because the
stack is full descending, we must expand it by reducing
the value of the stack pointer. The amount we reduce it by is the size of
the local data block. We use the sub (subtract) instruction for this.
sub sp, $TEST_LOCAL
There is now a "free" area of memory starting at the current value of
sp, and extending for eight bytes. After that come the original
contents of the stack, which we really, really don't want to tamper with.
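It may help to picture the layout at this point. The sketch below assumes the push shown earlier, which stores the lowest-numbered register at the lowest address; the offsets are relative to sp after the subtraction.

```asm
// Higher addresses
//   sp + 24   whatever the caller had on the stack
//   sp + 20   saved lr (the return address)
//   sp + 16   saved fp
//   sp + 12   saved r1
//   sp + 8    saved r0
//   sp + 0    start of the 8-byte local data area (sp + 0 to sp + 7)
// Lower addresses
```

Anything written at offset 8 or beyond lands on the saved registers rather than in the local data area.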
Now we set the frame pointer to the same address as the stack pointer, so we can use it as a base for later operations on the local data area.
mov %fp, %sp
At the end of the function we reverse the set-up we did on entry, moving the stack pointer back up to the original top-of-stack, and then popping off the registers we saved.
add sp, $TEST_LOCAL
pop {r0, r1, fp, lr}
The main body of the function just uses strb operations to
write data into memory. We discussed ldrb earlier, and
you won't be surprised to find that strb (store byte) is
equally inefficient. However, working
byte-by-byte is easy to follow.
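For comparison, here is a sketch of how the same three bytes could be written with a single word-sized store. It assumes the CPU is running little-endian, as it does under the usual Raspberry Pi Linux systems, and that print_str behaves as in the earlier sections; test_word is a hypothetical name.

```asm
TEST_LOCAL = 8

test_word:
    push {r0, r1, fp, lr}   // Same register list as before; four words keep sp 8-byte aligned
    sub sp, $TEST_LOCAL     // Reserve the local data area
    mov %fp, %sp            // %fp references the start of the area
    ldr %r0, =0x00004B4F    // In memory order: 0x4F 'O', 0x4B 'K', 0, 0
    str %r0, [%fp]          // One aligned word store writes 'O', 'K' and the null
    mov %r0, %fp            // print_str needs an address in %r0
    bl print_str
    add sp, $TEST_LOCAL     // Release the local data area
    pop {r0, r1, fp, lr}
    bx lr
```

This is shorter, but the byte order is buried in a constant, which is harder to read than three separate stores.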
Why use the frame pointer?
In my example I set the frame pointer to the start of the local data block, but the stack pointer was already there. Why not just use the stack pointer to reference the location of the data block?
We can certainly do that, and sometimes it is appropriate to. However, we can't move the stack pointer, while we can freely move the frame pointer. We can't move the stack pointer because we need to use the stack for its main purpose -- preserving data across function calls. If we move the stack pointer and then call another function, the called function will overwrite our stack with its own, with catastrophic results.
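Whichever register serves as the base, the local data can be addressed as fixed offsets from it, rather than by stepping a separate register along. Here is a sketch of the same function written that way, assuming print_str as in the earlier sections; test_offsets is a hypothetical name.

```asm
TEST_LOCAL = 8

test_offsets:
    push {r0, r1, fp, lr}   // Same register list as before; four words keep sp 8-byte aligned
    sub sp, $TEST_LOCAL     // Reserve the local data area
    mov %fp, %sp            // %fp is the base of the area from here on
    mov %r0, $79            // 'O' (char 79)
    strb %r0, [%fp]         // Offset 0 from the frame pointer
    mov %r0, $75            // 'K' (char 75)
    strb %r0, [%fp, $1]     // Offset 1
    mov %r0, $0             // Terminating null
    strb %r0, [%fp, $2]     // Offset 2
    mov %r0, %fp            // print_str needs an address in %r0
    bl print_str
    add sp, $TEST_LOCAL     // Release the local data area
    pop {r0, r1, fp, lr}
    bx lr
```

In this version r1 is no longer modified; it stays in the push list only so that the stack remains aligned on an eight-byte boundary when print_str is called.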
Buffer overruns
It's worth thinking about what happens if we write data that extends beyond the 8-byte block we allocated. The simple answer is that it corrupts the stack and, if we're lucky, crashes the program.
A crash doesn't sound very lucky, but it's better than a potential alternative. A common method of intruder attack is to see if a program can be made to accept more data than is reserved on the stack. If it can, then the intruder can overwrite the stack with its own data. With a bit of care and experimentation, the intruder might be able to write over the return address -- the value of the link register on the stack. Assuming that the function manages to get to the end without causing a crash, the jump back to the calling program can be subverted by a jump to some address in the intruder's code. After that, anything can happen.
In the example above, we wrote a three-byte sequence of characters into an eight-byte memory area. There's little likelihood of the characters overrunning the buffer. However, when stack-based data is derived by computation, and particularly if the computation relies on external inputs, we have to be much more careful.
Assembly programs are not inherently more susceptible to buffer-overrun attacks than programs written in other languages -- most use the stack for temporary data storage, whether the programmer knows it or not. The difference is that coding all the checks that are needed to keep the data within bounds is particularly tedious in assembly language, and thus more likely not to be done.
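To give an idea of what such a check involves, here is a sketch of a function that fills the local area with a caller-supplied number of '*' characters, clamping the count so that it cannot overrun the buffer. The function name, labels and constants are all hypothetical, and print_str is assumed to behave as in the earlier sections.

```asm
FILL_LOCAL = 8                  // Bytes reserved on the stack
FILL_MAX   = FILL_LOCAL - 1     // Most characters we can store, leaving room for the null

fill_and_print:                 // On entry, %r0 = number of '*' characters requested
    push {r0, r1, fp, lr}
    sub sp, $FILL_LOCAL
    mov %fp, %sp
    cmp %r0, $FILL_MAX          // Unsigned comparison, so a negative count looks huge
    bls fill_count_ok           // Count fits, so carry on
    mov %r0, $FILL_MAX          // Count too big, so clamp it to the space we have
fill_count_ok:
    mov %r1, $0
    strb %r1, [%fp, %r0]        // Terminating null goes just after the last character
    mov %r1, $42                // '*' (char 42)
fill_loop:
    cmp %r0, $0
    beq fill_done               // Nothing left to write
    sub %r0, $1
    strb %r1, [%fp, %r0]        // Fill backwards, from offset count-1 down to 0
    b fill_loop
fill_done:
    mov %r0, %fp                // print_str needs an address in %r0
    bl print_str
    add sp, $FILL_LOCAL
    pop {r0, r1, fp, lr}
    bx lr
```

The check itself is only three instructions, but every write to the stack that depends on outside input needs one like it.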
Summary
Temporary data used by a function can be stored on the stack.
Because the stack grows downward in memory, the stack pointer must be reduced to assign space.
ARM convention provides a frame pointer register, fp, to act as a base for stack data.
Carelessly allowing data to overrun the area allocated on the stack is a rich source of security hazards.