ARM assembly-language programming for the Raspberry Pi

6. Using the sys_write syscall to output text

And so we arrive, at last, at "Hello, World". This example demonstrates how to use sys_write to write to the console, and introduces some other new assembly-language features. I will show the example in its entirety, but some of the code is the same as in the previous example.

Example

Here is the code. It just outputs "Hello, World" to the console.

// Outputs a simple message using sys_write

.text

SYS_EXIT = 1
SYS_WRITE = 4
STDOUT = 1

.global _start

// Exit the program.
//   On entry, r0 should hold the exit code
exit:
    mov    %r7, $SYS_EXIT
    swi    $0

_start:
    // Use the sys_write syscall to output a string
    mov    %r7, $SYS_WRITE
    mov    %r0, $STDOUT
    ldr    %r1, =msg // Store the address of the message in r1
    mov    %r2, $13  // Store the length of the message in r2
    swi    $0

    // Now exit
    mov    %r0, $0
    b      exit
msg:
    .ascii "Hello, World\n"

Defining data

The text message "Hello, World" is a piece of data larger than a single number. We've already seen how an integer number can be loaded directly into a register using an immediate instruction like mov %r0, $32. However, we can't load a whole string of text into a 32-bit register. We can, and will, load the address of the string into a register, but to do that we have to define the string, and know its address.

The assembler provides a straightforward way to introduce data of various types into the object file. My example uses this method for a text string:

msg:
    .ascii "Hello, World\n"

As in most other programming languages, \n is a code that means 'new line'. Although it is written as two symbols in the source -- '\' and 'n' -- it only occupies one byte in memory.

msg is just a label. When the program is assembled, references to the label msg will be replaced with its address. The assembler supports many other data types -- .byte, .word, etc.
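As a rough sketch of some of those other directives, the fragment below defines a byte, a NUL-terminated string and a word, and lets the assembler work out the message length rather than hard-coding 13. The names MSG_LEN, flag, name and count are made up for illustration; they are not part of the example above.

```asm
msg:
    .ascii "Hello, World\n"
MSG_LEN = . - msg      // '.' is the current address, so this is 13

flag:
    .byte  1           // a single byte
name:
    .asciz "Pi"        // like .ascii, but adds a terminating zero byte
    .balign 4          // pad to a 4-byte boundary before storing a word
count:
    .word  1000        // a 32-bit word
```

With MSG_LEN defined this way, the length could be loaded with mov %r2, $MSG_LEN in place of mov %r2, $13, so the count stays correct if the text of the message changes.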

The sys_write syscall

The sys_write syscall (number 4 in ARM Linux) is a little more complicated than sys_exit. It takes three arguments:

r0 -- the file descriptor. This is an integer that identifies the file or device to write to. "standard out" will always be file 1 on Linux terminals or consoles. Standard error is file 2.
r1 -- the address in memory of the data to write.
r2 -- the number of bytes to write.

As with all ARM Linux syscalls, the syscall number (4) goes into r7.
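To illustrate how the three arguments and the syscall number fit together, here is a sketch that sends the same message to standard error instead of standard out. STDERR is an assumed name, defined in the same way as STDOUT; everything else follows the example above.

```asm
STDERR = 2

    mov    %r7, $SYS_WRITE // syscall number goes in r7
    mov    %r0, $STDERR    // file descriptor 2: standard error
    ldr    %r1, =msg       // address of the data to write
    mov    %r2, $13        // number of bytes to write
    swi    $0
```

When the call returns, Linux leaves its result in r0: the number of bytes written, or a negative error code.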

The ldr instruction

ldr stands for "load register". In this example, ldr is used in a way that is conceptually exactly the same as the immediate mode of mov: to transfer a number into a register. This instruction:

    ldr    %r1, =msg

transfers to register r1 the numerical address labeled by msg:.

ldr also has an indirect mode, like this:

    ldr    %r1, [%r4]

In this mode, the value of the register r4 is treated as an address in memory, and r1 is loaded with the data in memory at that address. It is the square brackets that indicate the indirect mode of operation.

We don't need to use the indirect form of ldr in this example, but will need it later.
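For a feel of how the two forms work together, here is a small sketch that first puts an address into a register and then loads through it. The label value and the number 1234 are invented for the illustration.

```asm
    ldr    %r4, =value     // r4 = the address labeled by value
    ldr    %r1, [%r4]      // r1 = 1234, the word stored at that address

    // ... rest of the program ...

    .balign 4              // words should start on a 4-byte boundary
value:
    .word  1234
```

The first ldr only fetches an address; it is the second, with the square brackets, that actually reads the stored data.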

ldr is not what it seems

If the use of ldr in this example is conceptually the same as mov, then why not just use mov as we did previously? Answering this question requires delving into the internal operation of the assembler, but it's necessary to do this, in order to write efficient code.

So why could we not, in the present example, instead of ldr use this?:

    mov    %r1, =msg

After all, I've already said that the immediate modes of mov and ldr are conceptually equivalent. The reason for not using mov is that the immediate operand to mov is of limited size. I already touched on this back in example 1, and hinted at it again in example 3. The immediate operand to mov occupies only 12 bits of the instruction, but the register can store a 32-bit number. Nor is it the case that we can encode any 12-bit number: the 12 bits are split into an 8-bit value and a 4-bit rotation, so the operand must be an 8-bit value rotated right by an even number of bit positions. The assembler will stop with an error if you try to use an immediate number that can't be encoded using the CPU's rules.
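A few concrete cases may make the rule easier to picture. In the classic 32-bit ARM encoding, a mov immediate is an 8-bit value rotated right by an even number of bit positions; the constants below were chosen purely for illustration.

```asm
    mov    %r0, $255          // 0xFF: fits in 8 bits, no rotation needed
    mov    %r0, $1024         // 0x400: the value 1 rotated right by 22 bits
    mov    %r0, $0xFF000000   // 0xFF rotated right by 8 bits
    ldr    %r0, =0x12345678   // too many significant bits for mov, so use ldr
```

Writing the last line as mov %r0, $0x12345678 would be rejected by the assembler, because no 8-bit value and rotation can produce that number.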

In practice, the address of the message labeled msg: might fit into a mov -- it's just about possible, because the program is so small. However, it's unwise to rely on this in a real program.

On the other hand, the ldr operation can encode any 32-bit number at all. If you're wondering how we can encode a 32-bit number into an instruction which is only 32 bits in total, the answer is, of course: we can't. It's impossible.

The fact is that ldr's immediate mode is an illusion. ldr has no immediate mode -- only an indirect mode, where data is loaded from an address in memory. An instruction like this:

    ldr    %r1, =42

is actually a pseudo-instruction. The assembler converts this instruction into something like:

    ldr    %r1, [foo]
foo: 
    .word 42

That is, the assembler simulates an immediate operand by storing the operand's value in memory, and generating an indirect access to the stored value. That's how ldr can load a 32-bit value using a 32-bit instruction code.
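The same arrangement can be written out by hand, which may help to show that nothing magical is going on. In this sketch the label msg_addr is an assumed name; the assembler would normally choose the storage location itself.

```asm
    ldr    %r1, msg_addr   // load the word stored at msg_addr, relative to the program counter

    // ... rest of the program ...

    .balign 4              // keep the stored word on a 4-byte boundary
msg_addr:
    .word  msg             // the 32-bit address of msg, stored as ordinary data
```

After the ldr, r1 holds the address of the message, just as it would after ldr %r1, =msg.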

The downside, and the reason we prefer to use mov if we can, is that executing a pseudo-immediate ldr will take much longer than the truly immediate mov. As well as the CPU having to read and decode the instruction code itself from memory, which is all that mov requires, using ldr requires some additional arithmetic and then a further read from memory. Using mov, where possible, is faster, as well as using less storage. To be fair, we won't notice the difference of less than a microsecond in a trivial program like this, but those microseconds add up when there are millions of them.

In short: mov is an immediate instruction -- all the data it needs is in the instruction itself. ldr is an indirect instruction that reads from memory, but the assembler simulates an immediate mode for ldr because mov has a range limitation.

Where is the data?

You may have noticed that the data that forms the "Hello, World" message is just tacked on the end of the program code. This is a reasonable thing to do, but a little unconventional -- usually the program's constant data will be placed in a separate memory segment. I'll illustrate this in the next example.

One disadvantage of using the .text (program code) section for data is that if you try to disassemble the code, or use an interactive debugger, the tools won't be able to tell the difference between genuine program code and your data. This won't do any harm, but it will make the tools confusing to use.

Summary

The sys_write syscall outputs data to a file or device.
The ldr instruction reads data from memory into a register.
ldr can be used to overcome the range limitation in the mov instruction, but mov -- where it can be used -- is faster and uses less storage.