Kevin Boone

ARM assembly-language programming for the Raspberry Pi

7. Using sections and alignment

This example is a slight modification of the previous one, to include the use of a section definition. Using such a definition raises questions about alignment, which is a subject that needs a bit of care in assembly programming.

The example

// Outputs a simple message using sys_write
  
.section .rodata
msg:
    .ascii "Hello, World\n"

.text
.align 2

SYS_EXIT = 1
SYS_WRITE = 4
STDOUT = 1

.global _start

// Exit the program.
//   On entry, r0 should hold the exit code
exit:
    mov    %r7, $SYS_EXIT
    swi    $0

_start:
    // Use the sys_write syscall to output a string
    mov    %r7, $SYS_WRITE
    mov    %r0, $STDOUT
    ldr    %r1, =msg // Store the address of the message in r1
    mov    %r2, $13  // Store the length of the message in r2
    swi    $0

    // Now exit
    mov    %r0, $0
    b      exit

Sections

The first thing to note about the example is the new section definition:

.section .rodata

This indicates that the content that follows, up until the next section directive (.section, or a shorthand such as .text), should be placed in a section called .rodata, or read-only data. This makes for generally neater organization than just dumping named data items into the program code, and allows the operating system to treat this data in a more secure way. For example, Linux will not allow the program to write to any data in a read-only section -- trying to do so will result in a segmentation fault (the SIGSEGV signal).
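To make the difference concrete, here is a minimal sketch that places one item in .rodata and another in the writable .data section, and then tries to store to both. The labels greeting and counter are made up for this illustration, and the exact behaviour assumes a standard Linux toolchain and linker layout.

```asm
// Read-only data: the program may read this, but not modify it
.section .rodata
greeting:
    .ascii "Hello\n"

// Writable data: the program may read and modify this
.section .data
.align 2
counter:
    .word 0

.text
.align 2
.global _start
_start:
    mov    %r0, $1
    ldr    %r1, =counter
    str    %r0, [%r1]     // Allowed: counter is in a writable section

    ldr    %r1, =greeting
    strb   %r0, [%r1]     // Not allowed: greeting is in read-only memory

    // Only reached if the store above is removed
    mov    %r0, $0
    mov    %r7, $1        // SYS_EXIT
    swi    $0
```

Run as written, the process should be killed by the kernel at the second store; with that store removed, it should exit normally.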

A section is simply an indivisible unit of machine code or data. The linker will aggregate the sections of a particular type, and lay them out in the executable file in a way that the kernel requires. For our purposes, three section types are important.

Each of these sections is allocated a particular region of the running process's memory map.

There's no limit to the number of sections that an assembly language program can define. In fact, it's sometimes useful to define a section for each function. This is because the linker is able to work out which sections have no links to them, so they can be removed from the executable file completely, saving space and making the program slightly quicker to load.
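As a rough sketch of the one-section-per-function idea, the fragment below gives each routine its own section. The .text.name naming pattern is only a common convention, and the routine called unused is invented for this illustration. The "ax" flags mark each section as allocatable and executable, and %progbits says that it contains real content.

```asm
.section .text.exit, "ax", %progbits
.align 2
exit:
    mov    %r7, $1        // SYS_EXIT
    swi    $0

.section .text.unused, "ax", %progbits
.align 2
unused:                   // Nothing in the program refers to this label
    mov    %r0, $42
    bx     %lr

.section .text._start, "ax", %progbits
.align 2
.global _start
_start:
    mov    %r0, $0
    b      exit           // This reference keeps .text.exit alive
```

The GNU linker only discards unreferenced sections when asked to, using its --gc-sections option. With that option, .text.unused should be dropped from the executable, because nothing reachable from _start refers to it.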

Alignment

The next statement in the program is:

.align 2

Alignment is the process of padding out the contents of the machine code so that particular elements start on particularly favourable addresses. This may be done for efficiency, or because some specification demands it.

In the 32-bit ARM architecture, it's generally most efficient if everything starts on a 4-byte boundary. That is, each new program instruction or new item of data should have an address that is a multiple of four bytes. With a 32-bit CPU this makes sense -- memory will be read in 4-byte chunks. If a specific piece of data lies half in one chunk and half in the next, then it will require a certain amount of decoding to be useful. The CPU does this decoding, but it's better if it doesn't have to.

The ARM ABI additionally requires 8-byte alignment at any "public" interface; strictly, the requirement applies to the stack pointer at the point where one separately-built piece of code calls another. "Public" in this context means accessible to some other library. My examples are all self-contained, so I only align to 4-byte boundaries.

The "2" in the directive .align 2 is an exponent, not a byte count: it is the number of low-order bits of the address that must be zero, and does not mean "2 bytes". This is a little confusing, because some assemblers and compilers take a byte count instead. On ARM, this directive aligns to 4-byte (= 2 raised to the power 2) boundaries.
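If the meaning of the argument is a worry, the GNU assembler has two related directives that say explicitly which convention they use: .p2align takes a power of two, and .balign takes a byte count. The sketch below (with made-up labels) uses all three forms; on ARM each one should pad to the next 4-byte boundary.

```asm
.section .rodata
first:
    .byte  1
.align 2          // Power of two on ARM: 2^2 = 4-byte boundary
second:
    .byte  2
.p2align 2        // Always a power of two: 4-byte boundary
third:
    .byte  3
.balign 4         // Always a byte count: 4-byte boundary
fourth:
    .word  0x12345678
```

Assuming first sits at the start of the section, the four labels end up at offsets 0, 4, 8 and 12.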

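When the assembler encounters an .align directive, it simply inserts padding into the object file until it reaches a boundary. The padding is normally zeros, although in a code section the GNU assembler may use no-op instructions instead. If the location is already aligned in the required way, then the directive will have no effect.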
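The message in the example makes a convenient worked case, because it is 13 bytes long and so does not end on a 4-byte boundary. The sketch below adds a hypothetical second item, msg_len, after it; that item is not part of the original program.

```asm
.section .rodata
msg:
    .ascii "Hello, World\n"   // 13 bytes, at offsets 0 to 12
.align 2                      // 3 padding bytes, at offsets 13 to 15
msg_len:
    .word  13                 // Starts at offset 16, a multiple of four
```

The offsets assume that msg is the first item in the section. Without the .align 2, msg_len would start at offset 13, straddling two 4-byte chunks.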

There's no need to use .align at the start of the file, because this is address zero. Nor is there any need to use it between blocks of program code, since ARM instructions (as opposed to the shorter Thumb instructions) are always four bytes long. .align should be used between sections, and between data elements. We will see that sometimes alignment has to be done programmatically as well -- but that's for later.

Summary