ARM assembly-language programming for the Raspberry Pi

2. A first program, demonstrating how to assemble and run code

In this first example I will outline the most fundamental concepts of assembler programming, and explain how to assemble and run the simplest possible example.

Overview -- what assembly language is and does

At the most fundamental level, a computer's CPU processes machine language or machine code instructions -- these terms mean essentially the same thing. Machine language is a set of numeric codes which the CPU follows in a well-defined sequence, and which tell the CPU what to do. The CPU operates at a level of fine detail: instructions specify how to pass data between registers, where and how to read and write memory, and very basic arithmetic operations. A program of any complexity has to be built up from these very fundamental operations -- usually a large number of them.

It's very difficult for humans to read numerical instructions, so we typically use an assembler to convert assembly language to machine code. Assembly language is symbolic -- words are used rather than numbers to specify operations. However, there's a more-or-less direct mapping from assembly language to machine code. While the assembler will provide facilities that simplify coding, it's still necessary to specify the CPU's behavior in very fine detail. Programmers used to working in high-level languages tend to find working at such a low level rather frustrating; but the key to programming successfully in assembly language is much the same as for any kind of procedural programming: build up a complex program from a number of simpler parts. Assembler programming allows for function calls, local variables, loop constructs, data structures, and most of the paradigms that high-level languages enjoy. You just have to implement them yourself.

A note about C. You don't need to have programmed in C to be able to program in assembly language. I started programming in assembly years before I even heard of C. However, there's no escaping the fact that the Linux kernel is largely implemented in C, and follows C conventions for representing data. These conventions are embodied in the assembler as well.

In order to follow the examples in this series of articles, you will need four things:

A Raspberry Pi 3 or 4, or some other ARM-based Linux system. The examples all work on modern ARM-based Android devices. In fact, they are binary compatible, meaning that the generated machine code will be transferable between devices. Binary compatibility is quite hard to ensure between Linux devices, but all my examples are simple enough to make it possible. Installing and running an assembler on Android is no simple matter, but it's trivially easy on a Pi.
An assembler that uses the GNU syntax. There are many assemblers, some free and open-source, some commercial. Some have fancy graphical user interfaces and built-in debugging tools. The GNU assembler is a much simpler tool -- it takes a file containing assembly language instructions, and generates binary machine code. Any assembler that uses the GNU syntax will work for these examples, however complex or rudimentary it may be. The GNU assembler is part of the binutils package in Debian-based Raspberry Pi Linux distributions -- sudo apt-get install binutils.
A text editor -- nano, vi, emacs, whatever. If you have a fancy assembler with a built-in editor, then of course you could use that. However, all my instructions are based on simple, plain text source files.
A linker. The linker has a significant role in C programming, but a much more limited one in most assembly programs. In my examples, all the linker does is convert object files into executables. Both of these kinds of file contain exactly the same machine code, but the file formats are just a little different. The GNU linker is a program called ld, and will probably be in the same package as the assembler. Again, integrated development tools automate linking and assembling, but I think it is more educational to do it manually.

To follow my examples, you need to be comfortable working with command-line tools, in a terminal. Or, alternatively, you need to be willing to adapt the examples to work with a development environment that is more to your taste. There are many such environments, but I can't comment on them or give instructions, because I don't use any myself.

The example -- a program that starts and just stops

This is the simplest example I could come up with. It's not even "Hello, world". In fact, it will take several more examples to build up to the Hello, World stage. In this example, the program will start, and stop with an exit code that you can examine on the command line. Examining the exit code will allow you to verify that the program actually did something, even if it didn't do very much.

Here is the example; I will assume it is saved in a file called 01_exit.as. The use of the suffix .as is pretty typical for assembly code. I will explain what the various instructions mean after demonstrating how to assemble and link the program.

// Set exit value to a literal number, by invoking sys_exit
.text

.global _start

_start:
    mov    %r0, $42    // the exit code goes in r0
    mov    %r7, $1     // sys_exit is syscall #1
    swi    $0          // invoke syscall

Assembling the example

All the examples in this series can be processed in the same way, since all consist of a single assembler source file.

$ as -o 01_exit.o 01_exit.as
$ ld -o 01_exit.bin 01_exit.o

The first line assembles the file 01_exit.as to the object file 01_exit.o. This is the file that contains the machine code, but it is not quite in a format suitable to be executed. The second line, using ld, converts the object file to an executable file -- its main job in this case is to tell Linux the starting address of the program. This is the symbol _start, as I shall explain.

Both as and ld take an -o argument, to set the output filename. It's conventional to use filenames ending in .o for object files. There's no convention for executables -- they usually don't have a suffix. However, I'm using the suffix .bin ("binary") in all my examples. This is just to make it easier to manage the files, and doesn't affect operation.

To run the example at the command line (and there really is no other way at this stage):

$ ./01_exit.bin 
$ echo $?
42

$? is replaced with the exit code from the previous program. Since all this example does is set the exit code to 42, the only way to tell it has worked is to look at the exit code. Later we'll examine examples that produce more immediate output.

Examining the example code

Let's look at the code, line by line.

// Set exit value to a literal number, by invoking sys_exit

This is a comment. It has no meaning to the assembler -- it is for the guidance of human readers. Careful use of comments is generally a good idea in all forms of programming, but crucial in assembly language.

The GNU assembler accepts various different flavours of comment. As well as the // form used here, single-line comments can be introduced using @ anywhere on a line, or # at the start of a line. Multi-line comments use the form /*...*/, as in C and Java.
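
As a hedged illustration of these comment styles, the sketch below repeats the exit program with one comment of each kind. It assumes the GNU assembler's ARM target, where # only starts a comment when it is the first character on a line. The exit code of 0 is an arbitrary choice for this sketch, not part of the original example.

```asm
@ A comment introduced by @ runs to the end of the line
# A line whose first character is # is also treated as a comment
/* A C-style comment
   can span several lines */
.text

.global _start

_start:
    mov    %r0, $0     @ exit code 0 (arbitrary, for illustration)
    mov    %r7, $1     // sys_exit is syscall #1
    swi    $0          /* invoke syscall */
```

Whichever style you choose, it is probably best to pick one and use it consistently within a file.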

.text

.text is the type name of a section. You'll also see the term segment used in some documents. The section type denotes the type of content that follows: .text is conventional for program code. Sections of different types are loaded into memory in different ways by the operating system, and are subject to different run-time restrictions. I will explain this point in more detail, and illustrate different section types, in later examples.

.global _start

.global indicates that a particular symbol, in this case _start, is to be made available to the linker. Unless told otherwise, the GNU linker assumes that execution should begin at the address with the name _start. In this case, _start is the only symbol used.

An assembly language program starts from the defined starting address, and continues either until told to stop, or execution runs off the end of the code and the program crashes. The CPU won't stop executing instructions just because the program has run out of things for it to do -- you need to stop it explicitly.

_start:

This is a label, and its role is exactly what its name suggests -- it labels an address with a name. In this case, the label _start is assigned to the address zero, measured from the start of the section -- it must be zero, because there has so far been no actual data or instructions in the source code.

Labels have many functions in assembly code. This one exists simply to indicate the start of the program. However, they can be used to name function entry points, or specific variables, or targets of a jump ('goto'). Later examples will illustrate all these different uses of labels.

I should point out that one of the roles of the linker is to fix up labeled addresses. That is, the linker will reallocate labels to match the way the program is to be loaded into memory. This simple program won't start at address zero, even within its own address space -- in practice some preamble will be loaded into memory ahead of the program code. Unless you're writing a compiler, these subtleties aren't usually too important.
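
To give a flavour of a label used as a jump target, here is a minimal sketch; it is not one of the numbered examples in this series. It assumes the b (branch) instruction, which has not been introduced at this point, and a label name, finish, invented purely for illustration.

```asm
// Sketch: a second label used as the target of a jump
.text

.global _start

_start:
    mov    %r0, $42    // exit code we expect to see
    b      finish      // jump straight to the label 'finish'
    mov    %r0, $1     // never executed, because the branch skips it

finish:                // hypothetical label name, for illustration only
    mov    %r7, $1     // sys_exit is syscall #1
    swi    $0          // invoke syscall
```

Assembled and linked in the same way as 01_exit.as, this should exit with code 42, because the branch skips the instruction that would have set the exit code to 1.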

The next three instructions -- in fact, the only three instructions in this program -- comprise a syscall. A syscall is a way for the program to invoke services in the operating system kernel. All the examples in this series will use operating system services for input and output, but there are other reasons to invoke the system. In this case, we will do so to terminate the program.

All syscalls follow the same basic pattern -- we load the necessary data, including the syscall number, into one or more CPU registers, and then use a specific instruction to execute the call. So far as I know, there is no definitive reference for ARM Linux syscalls, apart from the Linux kernel source. However, many people maintain readable lists, and a web search for "arm 32 syscall table" should produce some useful hits. It's important to realize that, although all Linux systems of a particular version provide much the same syscalls, the syscall numbers and register conventions differ from one CPU architecture to another. Don't be misled into trying to use information for x86/AMD64 systems.

    mov    %r0, $42

mov ("move") is one of the most fundamental ARM instructions. It copies data into a CPU register, either from another register or from a literal number. The format is mov [destination], [source]. In the present example, the operation is immediate. That is, the data to be transferred is actually in the instruction itself -- that's the number 42. The GNU assembler requires an immediate operand to be introduced with a specific symbol -- $, as here, or #. Unless you say otherwise, the value is treated as a decimal number. You can use hexadecimal numbers by prefixing them with 0x -- as in many other programming languages -- and there are variants for other number bases. I've tried to stick exclusively with decimal numbers in these examples.
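
The two forms of mov can be sketched together as follows. This is an illustrative variation on the example, not part of the original program: the use of r1 as a temporary register is an arbitrary choice, and the value is written in hexadecimal only to show the 0x prefix.

```asm
// Sketch: immediate and register-to-register forms of mov
.text

.global _start

_start:
    mov    %r1, $0x2a  // immediate form: 0x2a is 42 in decimal
    mov    %r0, %r1    // register form: copy r1 into r0
    mov    %r7, $1     // sys_exit is syscall #1
    swi    $0          // invoke syscall
```

This should behave exactly like the original example and exit with code 42; the value simply takes a detour through r1 on its way to r0.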

r0 is a CPU register. The GNU assembler accepts a register name prefixed with %, as here, but so far as I know the prefix is optional, and a great deal of ARM code simply writes r0.

It's an interesting quirk of the ARM instruction set that an immediate mov can only specify a limited range of values. The rules about what values are allowed are complicated, and I won't discuss them until a later example. All you need to know at this stage is that the assembler will report an error if you try to use a value it cannot encode.

How do we know to use the register r0 here? From the kernel source, or a web search. All ARM syscalls use the same set of registers: if the call takes one argument it is in r0. For two arguments use r0 and r1, and so on. I'll have more to say about the registers and their particular functions later.

    mov    %r7, $1

The 'exit' syscall to terminate the program is syscall number 1. The syscall number goes into register r7, because lower-numbered registers are used for arguments to the call.

    swi    $0

"swi" is a software interrupt. An interrupt is any event, software or hardware, that causes the CPU to break its current flow of execution, and invoke a specific handler in the kernel. Both ARM and x86 use interrupts to invoke the kernel. This is a good choice, because the interrupt handler runs at a higher privilege level than "ordinary" code, and has direct access to the system's hardware. Privilege level control is one of the fundamental ways that the kernel controls who can do what. Interrupts usually have numbers, where the number controls which kernel handler gets invoked. In this case, we need interrupt zero. Notice that the assembler requires the $ (or #) prefix, because this is an immediate operand -- the number is included in the instruction.

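In current ARM documentation the swi instruction goes by the name svc ("supervisor call"), and the GNU assembler accepts either name for the same instruction. A minimal sketch of the exit program written with svc, assuming nothing else changes:

```asm
// Sketch: the same exit syscall, using the newer mnemonic svc
.text

.global _start

_start:
    mov    %r0, $42    // exit code
    mov    %r7, $1     // sys_exit is syscall #1
    svc    $0          // same instruction as swi $0, under its newer name
```

You are likely to meet both spellings in other people's code; they should assemble to the same machine code.
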
For the record, the AMD64 architecture uses a different method for invoking the kernel -- the instruction set has a specific instruction for this, simply written 'syscall' in assembly language.

Summary

This section introduced the functions of the assembler and linker.
Assembly-language programs can use comments to improve readability.
A program must export at least one global symbol, _start.
The mov instruction is used to copy data into a register, from another register or from an immediate value.
I explained how syscalls work.