ARM assembly-language programming for the Raspberry Pi
10. Nesting functions
In the earlier example introducing function calls I mentioned in passing that care had to be taken when function calls are nested. It's time to explain why, and what specifically needs to be done. Nesting of function calls, often to great depth, is a ubiquitous feature of modern programming, in any language. It's important that we have a way to implement it reliably.
The example
This example combines all the code we've used until now, and extends it to print a number of different messages. The main body of the program calls a function called print_str to print a text string. print_str needs to know the length of the string, so it can tell the sys_write syscall how much data to output. print_str calls strlen to get the length. So we have two levels of function call: _start calls print_str, which calls strlen.
// This example demonstrates nested function calls. The function
// print_str calls the function strlen to work out the length of
// the string it was passed.

        .section .rodata
msg1:
        .ascii "String 1\n\0"
        .align 2
msg2:
        .ascii "String 2\n\0"

        .text
        .align 2

SYS_EXIT = 1
SYS_WRITE = 4
STDOUT = 1

        .global _start

/* =========================== exit ========================================*/
// Exit the program.
// On entry, r0 should hold the exit code
exit:
        mov     %r7, $SYS_EXIT
        swi     $0

/* =========================== strlen ======================================*/
// Calculate the length of a null-terminated string
// On entry, r0 is the address of the string
// On exit, r0 is the length of the string, not including
//   the terminating null
strlen:
        push    {r4-r5,lr}      // Save the values of %r4 and %r5, and the LR
        mov     %r4, $0         // Use %r4 as the character count; initially 0
strlen_0:
        ldrb    %r5,[r0]        // Read into %r5 the byte at the address in %r0
        cmp     %r5, #0         // Compare to zero, the end-of-string terminator
        beq     strlen_1        // If it's equal to zero, jump out of loop
        add     %r0, $1         // If not zero, add one to the address we are looking at...
        add     %r4, $1         // ...and to the character count
        b       strlen_0        // Then do the loop again
strlen_1:
        mov     %r0, %r4        // Transfer the character count to %r0 for return
        pop     {r4-r5,lr}      // Restore the temporary registers and LR
        bx      lr

/* ======================== print_str =======================================*/
// Prints to stdout the text whose address is in the r0 register. The
//   text should be null-terminated
print_str:
        push    {r2-r7, lr}
        mov     %r5, %r0        // Save string address in %r5
        bl      strlen          // Get the length in %r0
        mov     %r2, %r0        // Transfer length to %r2
        mov     %r7, $SYS_WRITE
        mov     %r1, %r5        // Address is in r5
        mov     %r0, $STDOUT
        swi     $0
        pop     {r2-r7, lr}
        bx      lr

/* =========================== start ========================================*/
_start:
        // Print msg1
        ldr     %r0, =msg1
        bl      print_str

        // Print msg2
        ldr     %r0, =msg2
        bl      print_str

        // Now exit
        mov     %r0, $0
        b       exit
What's new about this example?
Only one new idea is introduced here: pushing the link register. On entry to the function we push the registers that the function changes, which include the link register. On exit, we restore those registers.
push {r2-r7, lr}
pop {r2-r7, lr}
Why does the function change the value of lr? It's because this register is an implicit part of the call process. When we call the function using bl, this sets the return address into the link register. However, the next, nested function call sets its own return address, overwriting the caller's value. If the value of lr is not preserved, this will cause an infinite loop: the final bx lr in print_str would jump back to the instruction after its own bl strlen, rather than returning to _start.
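The effect can be seen in a much smaller program. The following is a minimal sketch, not part of the example above: the function names outer and inner are hypothetical, and it assumes the same GNU assembler syntax and Linux syscall conventions used in this article. outer calls inner twice, so it saves lr on entry and restores it before returning; inner makes no calls and leaves lr alone.

```asm
// Minimal sketch of nested calls (hypothetical functions: outer, inner).
        .text
        .align 2
        .global _start

SYS_EXIT = 1

// inner: a leaf function. On entry r0 is a number; on exit r0 is that
// number plus one. It makes no calls, so lr is never overwritten.
inner:
        add     %r0, %r0, $1
        bx      lr              // lr still holds the address set by the caller's bl

// outer: calls inner twice. Each bl overwrites lr, so the return
// address into _start has to be saved first.
outer:
        push    {lr}            // Save the return address into _start
        bl      inner           // lr now points at the next instruction
        bl      inner           // ...and is overwritten again here
        pop     {lr}            // Restore the return address into _start
        bx      lr

_start:
        mov     %r0, $40
        bl      outer           // On return, r0 = 40 + 1 + 1
        mov     %r7, $SYS_EXIT
        swi     $0              // Exit, using r0 as the exit code
```

Assembled and linked with as and ld in the usual way, this program should exit with status 42 (check with echo $?). If the push and pop were removed from outer, lr would still hold the address following the second bl, so outer could never return to _start.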
Does the strlen function need to preserve the value of lr? In fact, it doesn't, because it doesn't call any functions itself. However, not preserving the link register in a function of any complexity can be very risky in the long term. Suppose we change the implementation of the function so that it does, in fact, make a function call? This will create an error that could potentially be very difficult to troubleshoot.
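As a sketch of what omitting the save looks like, here is a variant of strlen that saves only its two scratch registers and leaves lr untouched. The name strlen_leaf and the test string are assumptions made for illustration; the loop itself is the same as in the example above. It is only safe for as long as the function contains no bl instruction.

```asm
// Sketch: a leaf version of strlen that does not save lr.
// strlen_leaf and msg are hypothetical names used for illustration.
        .section .rodata
msg:
        .ascii "Hello\0"

        .text
        .align 2
        .global _start

SYS_EXIT = 1

// On entry, r0 is the address of a null-terminated string
// On exit, r0 is the length of the string, not including the null
strlen_leaf:
        push    {r4-r5}         // Save the scratch registers only; lr is not pushed
        mov     %r4, $0         // Character count, initially 0
strlen_leaf_0:
        ldrb    %r5, [r0]       // Read the byte at the address in %r0
        cmp     %r5, $0         // Is it the terminating null?
        beq     strlen_leaf_1   // If so, leave the loop
        add     %r0, $1         // Otherwise move to the next byte...
        add     %r4, $1         // ...and count this one
        b       strlen_leaf_0
strlen_leaf_1:
        mov     %r0, %r4        // Return the count in %r0
        pop     {r4-r5}         // Restore the scratch registers
        bx      lr              // lr is unchanged since the caller's bl

_start:
        ldr     %r0, =msg
        bl      strlen_leaf     // r0 = length of msg
        mov     %r7, $SYS_EXIT
        swi     $0              // Exit, using the length as the exit code
```

This program should exit with status 5, the length of "Hello". The return costs no memory read for lr, but adding a bl anywhere inside strlen_leaf without also saving lr would break it.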
Many CPUs have a specific function call instruction that automatically pushes the return address onto the stack. The ARM CPU gives the programmer the choice of whether to preserve return addresses or not. It's certainly faster not to, because the return does not involve a memory read; but a choice to work this way needs careful consideration.
Summary
Nested function calls are a ubiquitous feature of modern programming.
The ARM CPU allows the programmer to control the mechanics of function calling, to improve efficiency.
Care has to be taken when using methods that will fail in nested function calls.