Using Linux command-line tools for programming the SparkFun Pro Micro microcontroller (and similar)

In this article I describe command-line operations for building programs for, and deploying them to, the SparkFun Pro Micro microcontroller board. I illustrate the process using very simple examples. Although only command-line tools are required, having the standard Arduino IDE on stand-by remains useful, for reasons that I'll explain. I've written this article for experienced C/C++ developers who normally use Linux workstations, and have their own preferences for editors and build tools. Some, but not all, of this article is relevant to Arduino-like boards other than the Pro Micro.

While I have no particular interest in using the Arduino IDE, I'm quite happy to use the Arduino libraries. After all, implementing USB storage or Ethernet support from scratch is a non-trivial undertaking. It is the use of the Arduino libraries that makes building code outside the official IDE so fiddly.

Consequently, in this article I'll first explain how to compile and upload a program that does not use any standard libraries. Then, with the basic principle established, I'll explain how to include Arduino library components in an application, which will always amount to building them from source.

This is a long, and perhaps rather complicated, article. If all you want to do is run "Blink", there's more information than you need. However, my experience is that, if you're working outside of the Arduino IDE, developing anything more complicated than "Blink" becomes overwhelmingly complicated very quickly. This is where it's really necessary to understand what's going on in detail.

The (very simple) source code used in this article, and a sample Makefile to build and deploy it, are available in my GitHub repository.

A note about devices and terminology

If you're an experienced Arduino developer -- using any platform or tools -- you probably don't need to read this section.

This article is specifically about the SparkFun Pro Micro, which is a type of Arduino-compatible microcontroller unit (MCU) board. Arduino is a range of microcontroller boards based primarily on the Atmel ATMega devices. The ATMega devices are examples of AVR microcontrollers. The origin of the term "AVR" is lost in the mists of time, but it broadly describes a family of microcontrollers with separate data and program memories. There are many AVR devices other than the ATMega range, and many boards based on Atmel devices that are not Arduinos. The Arduino boards were originally designed for education, and a particular design goal was that they should be programmable without specialist equipment.

Most modern Arduino, and Arduino-like, devices are programmed using a USB port, which might also be used by the installed application after programming.

Arduino devices are not "computers" in the modern sense. They don't have an operating system, and can't be programmed at runtime. They don't easily support multitasking, and there is no hardware abstraction. What gets uploaded to an Arduino is a single program, that executes until the board is powered off or a new program is uploaded. Despite their relatively low clock speeds, microcontrollers are often better-suited for real-time and time-critical applications than embedded computers, because there are no operating system overheads to interfere with the timing.

The Pro Micro is an Arduino-compatible board, but not an official Arduino product. Of the Arduino lineup, it is probably most similar to the Leonardo, particularly in its use of USB and the inclusion of an ATMega 32u4 MCU. However, it's much smaller than a Leonardo -- not much bigger than a postage stamp. It costs about the same as a pint of beer in a London pub (not that any pubs are open these days).

Its small size and low cost have made the Pro Micro popular in applications like custom USB input devices for computers and games consoles.

Basic principles -- and why there's a problem here at all

Most people, when they start developing for Arduino devices, use the official Arduino IDE tools and the official Arduino libraries. These tools and libraries are reasonably host-agnostic, and there are plenty of people doing effective Arduino development on Microsoft Windows systems. Windows provides little in the way of general-purpose compilation and code management tools, and an all-in-one IDE and library set provides a convenient development environment for the Windows platform. This, after all, is the way most development seems to be done on Windows.

Linux users, on the other hand, have a huge range of general-purpose editors, build tools, and code management utilities available, and usually want to use them. The problem here isn't simply a matter of preferring to work at the command line -- although many of us do -- as there already is a serviceable Arduino command-line build tool. Rather, it's a matter of using methods (Makefiles, library management tools, etc) that we've all become very productive with over the years. There's very little low-level documentation about how building for Arduino actually works. Using the standard libraries without the stock IDE is particularly nasty.

The Arduino standard libraries are not supplied as binaries, but as source code. The source needs to be compiled and linked into every application that uses these libraries. Compiling the libraries is not a one-time job, because different binaries would need to be made available for every Arduino-type microcontroller (MCU) that exists. Moreover -- and this is the real sticking point -- the binaries would be different for every board -- even boards that use the same MCU. That's because different boards do things like USB, or even GPIO pin assignment, in different ways. It's nice to be able to write code like Serial.println("Entering main loop") without worrying about which MCU pins are connected to the USB controller.

So when we build an application that uses Arduino libraries, we have to compile the library components as an intrinsic part of the application, using the same compiler settings and a bunch of board-specific definitions. Moreover -- and this is where we really start to grit our teeth -- the standard library sources refer to board-specific headers that are provided by the board manufacturer.

The IDE takes care of all this complexity, of course; but without the IDE we have to tackle it head-on. I'll describe in broad terms how to do that later.

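In outline, building and deploying C/C++ code on an Arduino board amounts to the following steps, most of which should be familiar to an experienced C/C++ developer: compiling and linking the sources to an ELF binary, converting the ELF binary to Intel Hex format, putting the board into bootloader mode, and uploading the hex file using avrdude.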
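The whole sequence can be sketched as a shell function, using the same commands that are explained in detail later in this article. This is only an outline under stated assumptions: the names build_and_upload, run and DRY_RUN are illustrative and not part of any tool, and /dev/ttyACM0 is a guess at how the board will enumerate on your system.

```shell
#!/bin/sh
# Sketch only: build one source file and upload it to a Pro Micro.
# VARIANT_INCLUDE must point at the directory containing pins_arduino.h.
# Set DRY_RUN=1 to print the commands instead of running them.

run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

build_and_upload() {
  name=$1                      # e.g. blink_minimal (no file extension)
  port=${2:-/dev/ttyACM0}      # assumed device name; check your system

  # 1. Compile and link to an ELF binary
  run avr-g++ -I "$VARIANT_INCLUDE" -mmcu=atmega32u4 -Os -Wall \
      -o "$name.elf" "$name.cpp" || return 1

  # 2. Convert the ELF binary to Intel Hex
  run avr-objcopy -O ihex -R .eeprom "$name.elf" "$name.hex" || return 1

  # 3. Upload (the board must already be in bootloader mode)
  run avrdude -p atmega32u4 -c avr109 -P "$port" -b57600 \
      -D -U "flash:w:$name.hex:i"
}
```

Running the function with DRY_RUN=1 set prints the three commands without touching the board, which is a cheap way to check the paths before attempting a real upload.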

A note about USB operating modes (and how to enable bootloader mode)

In my view, the use of the board's USB port is probably the most confusing, and least well-documented, aspect of using the Pro Micro (and similar boards). The confusion arises because of the way that the same USB port is used for both the bootloader, and for application purposes, and can be switched from one mode to the other.

In bootloader mode, the board's USB port looks to the host system like a USB modem. It might look like this when the board is running user software as well, but that's up to the software developer. Depending on how the host system is set up, Linux should create a /dev/ttyACM* or /dev/ttyUSB* device when the Pro Micro is plugged in and in bootloader mode. The distinction between ttyUSB and ttyACM is not particularly important in this context -- the difference relates to matters such as which side of the link is responsible for data compression when a real modem is connected. For the record, "ACM" stands for "Abstract Control Model". The other term you'll come across when working with the Pro Micro is "Communications Device Class" (CDC). This is a general term for USB devices that take part in point-to-point communication (rather than, for example, storage).

So, in bootloader mode, you should see a /dev device which you can use to upload code using avrdude. However, as soon as the application starts to run, the USB device will be disconnected, and disappear from the host operating system. In order to be able to upload code again, one of three things must happen.

First, you can perform a hard reset to bootloader. On the Pro Micro this is done by grounding the RST pin twice within half a second or so. During testing, at least, it's probably a good idea to wire a pushbutton between the GND and RST pins for this purpose. These pins are clearly marked on the board, and are next to one another.

Second, you can provide your own bootloader support in your application. You can program the USB port to accept whatever data you want, and do whatever you like with it. I'm assuming that if you have the knowledge to do this, you won't need to be reading this article, and I won't go into this approach in any more detail.

Third -- and this is the approach that most experimenters seem to take -- you can use the built-in break-to-bootloader support in the Arduino USB libraries. If you use the Arduino libraries to provide a serial monitor on the USB port, for example, you'll automatically get break-to-bootloader support in your application. This support is provided by a library module CDC.cpp, which is used by all the USB libraries.

The Arduino break-to-bootloader support is invoked by setting the USB baud rate to 1200. When this happens, the application code is stopped, and the board moves immediately to bootloader mode. If the application itself was set up to perform its own USB operations then, from the Linux host's point of view, the USB is disconnected, then reconnected (perhaps as a different kind of device), which takes a little time.

I've found the following sequence of commands will reliably put the Pro Micro into bootloader mode, preparatory to uploading code:

$ stty -F $UPLOAD_DEV speed 1200
$ sleep 1
$ stty -F $UPLOAD_DEV speed 57600
$ sleep 0.25
$ avrdude ....

Where Pro Micro users seem to come unstuck, however, is in not realizing that this break-to-bootloader support only works for applications that broadly comply with the Arduino programming idiom, and are actually running properly. It definitely won't work if you upload something to the board that is completely broken (e.g., compiled for the wrong device). Even in the absence of such catastrophic failures, the break-to-bootloader support won't work if it isn't compiled into your program. When you use the Arduino IDE, this generally happens automatically; but when you're free of the IDE's constraints, you'll have to make your own provision -- which generally amounts to linking and initializing the relevant library components.

In short, you shouldn't rely on the software break-to-bootloader support always working -- you need a provision for a hardware reset as well.
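Whichever way the board is reset, an upload can only start once the device node has reappeared on the host. As an alternative to fixed sleep intervals, a script could poll for the node. The following is a sketch under that assumption; wait_for_path is a hypothetical helper name, and the default of five attempts is an arbitrary choice rather than a documented figure.

```shell
#!/bin/sh
# Sketch only: wait until a path (e.g. /dev/ttyACM0) exists, polling
# once a second, and give up after a number of attempts.
# Usage: wait_for_path PATH [ATTEMPTS]

wait_for_path() {
  path=$1
  attempts=${2:-5}
  while [ "$attempts" -gt 0 ]; do
    if [ -e "$path" ]; then
      return 0              # the device node is present
    fi
    sleep 1
    attempts=$((attempts - 1))
  done
  [ -e "$path" ]            # final check after the last sleep
}

# Example (not run here): request the bootloader, then wait for the port.
#   stty -F "$UPLOAD_DEV" speed 1200
#   sleep 1                 # give the old node time to disappear
#   wait_for_path "$UPLOAD_DEV" 5 && avrdude ....
```

The same helper is just as useful after a double-tap on the reset button, because the script then proceeds as soon as the bootloader's port shows up.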

A note about permissions

You won't need elevated permissions to build a program using the compiler and binary conversion tools. However, you might need some kind of elevated privileges to upload code to the board, using the USB port. In many Linux distributions, the relevant entries in /dev are writeable by the dialout group, so adding a user to that group is a reasonable way to secure the necessary access. Of course, you can also upload as root if you prefer. You might also need to use root for installing the necessary software tools, depending on how your system is set up.
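A quick way to find out whether group membership is the problem is to compare the current user's groups with the ownership of the device node. This is a minimal sketch; in_group and show_port_access are illustrative names, and both dialout and /dev/ttyACM0 are assumptions that may not match your distribution.

```shell
#!/bin/sh
# Sketch only: check group membership and show who may write to a port.

# Succeeds if the current user belongs to the named group.
in_group() {
  id -nG | tr ' ' '\n' | grep -qxF "$1"
}

# Prints the owner, group and mode of the device node, if it exists.
show_port_access() {
  port=${1:-/dev/ttyACM0}      # assumed device name
  if [ -e "$port" ]; then
    ls -l "$port"
  else
    echo "$port not present"
  fi
}

# Example (not run here):
#   in_group dialout || sudo usermod -aG dialout "$USER"
#   (log out and back in for the new group to take effect)
```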

Collecting tools and preparing the environment

To build and upload C/C++ programs, you'll need at least the following tools.

In mainstream Linux distributions, you should be able to get all the necessary software -- except the board-specific files -- in one go by running apt-get install arduino or dnf install arduino. The total installation size is about 150 MB.
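The commands invoked directly in this article are avr-g++, avr-gcc, avr-objcopy and avrdude. After installation, a short check like the following can confirm that they are on the PATH. It is only a sketch, and missing_tools is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: print the name of each listed command that is not on the PATH.

missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Example (not run here): no output means everything was found.
#   missing_tools avr-g++ avr-gcc avr-objcopy avrdude
```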

The final set-up step is to install the board-specific files. Of these files, all we really need is the C headers, but I think it's still easiest to get them using the Arduino IDE. Instructions for doing this are on the SparkFun website. Instructions for Linux are some way down this page, which is mostly aimed at Windows users, but they're still reasonably comprehensive.

If you install the board-specific headers using the IDE, then they will end up in a directory with a name like this: $HOME/.arduino15/packages/SparkFun/hardware/avr/1.1.13/variants/promicro. However, the name probably won't be exactly the same, and some searching may be required to find the "promicro" directory.
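The search can be done with find. The sketch below assumes the IDE stored its packages under $HOME/.arduino15, which may not be true of every IDE version; find_variant_dir is an illustrative name.

```shell
#!/bin/sh
# Sketch only: list directories called "promicro" under a package root.

find_variant_dir() {
  root=${1:-$HOME/.arduino15}   # assumed location of IDE-installed packages
  find "$root" -type d -name promicro 2>/dev/null
}

# Example (not run here): use the first match as the include directory.
#   VARIANT_INCLUDE=$(find_variant_dir | head -n 1)
```

If more than one version of the board package is installed, the function prints several directories, and it is up to you to pick the right one.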

In practice, the board-specific variant directory contains only one file for the Pro Micro: pins_arduino.h. The other content that gets downloaded controls compiler settings and macros. We can use the IDE to see what those settings are, but we'll have to make our own provision to use them when compiling manually.

Deploying a trivial example

In this section, I'll explain how to deploy the simplest working example I can think of -- it just switches on the on-board transmit and receive LEDs. The example uses board-specific macros to do this, so we don't need to include any of the generic Arduino headers. We do have to include the SparkFun board-specific header pins_arduino.h, however.

Note:
Because this example uses no Arduino libraries, it won't provide any software break-to-bootloader support. So you'll need to provide a reset button to enable the bootloader, to experiment with this kind of code.

Here is the code, which I've saved in a file blink_minimal.cpp. There's nothing in this source that requires C++, rather than C, but code that uses the Arduino libraries (later) will usually need to be C++.

#include <pins_arduino.h> // For LED macros

int main (int argc, char **argv)
  {
  TX_RX_LED_INIT;
  TXLED1;
  RXLED1;
  for(;;); // halt
  }
Note:
We can use the Arduino "wiring" library to control the RX LED on the Pro Micro -- and will later. However, the Pro Micro does not have the TX LED on an Arduino-compatible GPIO pin, so that's always going to need a board-specific operation.

For ease of typing, set an environment variable to indicate where the board-specific headers are:

$ VARIANT_INCLUDE=$HOME/.arduino15/packages/SparkFun/\
hardware/avr/1.1.13/variants/promicro

The directory noted above is where the Arduino IDE installed these headers. The exact location is unimportant, so long as you tell the compiler where they are.

Compile and link the program to an ELF binary, indicating the location of the board-specific headers:

$ avr-g++ -I $VARIANT_INCLUDE -mmcu=atmega32u4 -Os -Wall \
      -o blink_minimal.elf blink_minimal.cpp 

In this command, -Os means 'optimize for size'; we'll use this switch all the time, but it hardly matters in this trivial example. -mmcu identifies the specific AVR sub-type we're compiling for. The avr-g++ command compiles the C++ source, and produces an executable in ELF format -- the format generally used for Linux binaries.

Convert the ELF binary into an Intel Hex file suitable for uploading to the Pro Micro board:

$ avr-objcopy -O ihex -R .eeprom blink_minimal.elf  blink_minimal.hex

Now upload the .hex file to the board using avrdude. In order to do this, we need the board in bootloader mode. If this isn't a brand-new board, you'll need to effect this manually -- see the section "A note about USB operating modes (and how to enable bootloader mode)" above. You'll probably need to use the hardware reset button, and be aware that the Pro Micro's USB port will only stay in bootloader mode for a few seconds, so speed is of the essence.

$ avrdude -v -p atmega32u4 -c avr109 -P /dev/ttyACM0 -b57600 \
    -D -Uflash:w:blink_minimal.hex:i

If everything has gone according to plan, you should be rewarded by seeing the TX and RX LEDs turn on (along with the power LED).

Incidentally, while uploading a substantial program can be time-consuming, uploading the few hundred bytes of this trivial sample should not take more than a fraction of a second. If it seems to be taking longer than this, something is wrong.
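To see how many bytes are actually going to be uploaded, the data records in the .hex file can be counted. Each Intel Hex line starts with a colon, followed by a two-digit hexadecimal byte count, a four-digit address and a two-digit record type, with type 00 denoting data. The sketch below adds up the byte counts of the data records; hex_payload_bytes is a hypothetical helper name, and the avr-size utility from binutils reports comparable figures from the ELF file.

```shell
#!/bin/sh
# Sketch only: sum the payload sizes of the data records (type 00)
# in an Intel Hex file. Usage: hex_payload_bytes FILE

hex_payload_bytes() {
  total=0
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      :*) ;;                   # records start with a colon
      *) continue ;;
    esac
    [ "${#line}" -ge 11 ] || continue        # too short to be a record
    len=$(printf '%s' "$line" | cut -c2-3)   # byte count, in hex
    rtype=$(printf '%s' "$line" | cut -c8-9) # record type
    if [ "$rtype" = "00" ]; then
      total=$((total + 0x$len))
    fi
  done < "$1"
  echo "$total"
}

# Example (not run here):
#   hex_payload_bytes blink_minimal.hex
```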

Deploying a trivial example that uses Arduino libraries

This is where things get significantly more complicated and where, in practice, you'll need a script or a Makefile to automate the steps -- there's just too much to do, to do it manually at the prompt.

The program I'm using for demonstration purposes appears only slightly more complicated than the previous example. However, it uses the USB port to provide debugging messages, and that means we need to use the Arduino libraries (or write an awful lot of code from scratch). Here is the code, in blink.cpp.

#include <Arduino.h>
#include <HardwareSerial.h>

int RXLED = 17; // Arduino pin for the RX LED.
// Note that the TX LED does not have an Arduino pin,
//  and uses a board-specific macro.

void setup()
  {
  pinMode (RXLED, OUTPUT);

  Serial.begin (9600);
  Serial.println ("Hello, World");
  }

void loop()
  {
  Serial.println ("Tick");
  digitalWrite (RXLED, LOW);
  TXLED0;
  delay (200);
  digitalWrite (RXLED, HIGH);
  TXLED1;
  delay (2000);
  }

This code differs from the previous, trivial example in three important ways.

Note that, by including the stock USB libraries, we automatically get software break-to-bootloader support, as I described above.

The first decision to make is how we are to manage the Arduino libraries (which, as I've said, have to be compiled into the application). I can think of two general approaches.

First, we could pick out the relevant source and header files from the basic Arduino package, and copy them into the application along with the application's own sources. For simplicity we could just copy the library files into the same directory as the application sources. The problem with this approach is in code management -- our application will contain a large number of files that aren't part of the application in any proprietorial sense. When I push my code to a repository like GitHub, I don't want to push a heap of somebody else's code. There are ways around this problem, but not elegant ones.

Second, we could leave the library components where they are initially installed, and just compile them into object files in the program's working directory as part of the build. That's the approach I've adopted for this example. However, I should point out that, the more libraries you collect from various places -- and all will be in source form -- the easier it gets just to copy the components into the program's own source.

However the Arduino libraries are managed, compiling them is much the same as compiling the application sources -- except that the library sources expect specific compiler macros to be set.

Here's how to compile the standard USBCore module, which (in my version of the Arduino distribution) is provided as C++.

First, for ease of reading, let's define environment variables for the locations of the standard Arduino components, and for the board-specific headers:

$ VARIANT_INCLUDE=$HOME/.arduino15/packages/SparkFun/\
hardware/avr/1.1.13/variants/promicro
$ INCLUDE=/usr/share/arduino/hardware/arduino/avr/cores/arduino

Then we compile like this:

$ avr-g++ -Os -Wall -ffunction-sections -fdata-sections -mmcu=atmega32u4 \
    -DF_CPU=16000000 -DUSB_VID=0x1bf4 -DUSB_PID=0x9204 \
    -fno-exceptions -fno-threadsafe-statics \
    -I $VARIANT_INCLUDE -I $INCLUDE -c -o USBCore.o $INCLUDE/USBCore.cpp

Some of these compiler switches are likely to be unfamiliar.

-ffunction-sections and -fdata-sections instruct the compiler to use a different section in the object file for each function and data block. This is important for size-optimization, because the linker can remove the sections that aren't referenced. This means that C functions that are in the library sources, but never actually used, don't get included in the final build.

-fno-exceptions and -fno-threadsafe-statics prevent the inclusion of code that is unlikely to be used in a microcontroller program.

-DF_CPU defines the microcontroller clock speed in Hz. The Pro Micro has a 16MHz clock. This definition is used in many of the library sources. For example, it's used in the delay() function to calculate timing values.

-DUSB_VID and -DUSB_PID set the vendor and product IDs of the USB device when the user application is running. These values are used by the USB modules in the library. The values I'm using here are the same as those used by the Pro Micro bootloader, so the board will appear as the same device whether it's running an application or in bootloader mode. That's the right thing to do here, because the communication of textual diagnostic data isn't different in principle from the communication used by the bootloader -- both are modem-like. However, if you're implementing a custom USB device, you'd probably want to provide your own vendor/product IDs.

None of this discussion addresses the problem of which library modules to include. You can't accidentally forget one -- the application won't link. But knowing which modules implement which functions and classes isn't straightforward. There are, I think, three ways to approach this problem: experience, trial-and-error, and cheating using the IDE.

While I won't be using the IDE for actual development, it is handy for figuring out which libraries to include. All we have to do is write the most trivial possible program that uses the relevant functions and classes (it doesn't actually have to work), and see what compilation commands the IDE emits. You can, in fact, include all the library modules, and rely on the linker to remove the parts that aren't actually used. On a modern Linux system, compiling the entire Arduino library only takes a few seconds. This isn't very elegant but, in practice, a substantial Arduino program might use a majority of code in the standard library anyway.

Note:
Annoyingly, some of the library modules are written in C and some in C++. You can't use exactly the same compiler switches for both languages, because some make sense only for C and others for C++.
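One way to handle the "compile everything" approach, and the C/C++ split, is a loop that chooses the compiler and switches from the file extension. The following is a sketch rather than a tested build system: compile_core, COMMON_FLAGS and CXX_ONLY_FLAGS are illustrative names, the switches are those shown for USBCore.cpp above, and any other kinds of source file in the directory are simply ignored. CC and CXX can be overridden in the usual way.

```shell
#!/bin/sh
# Sketch only: compile every .c and .cpp file in a core directory into
# object files in the current directory. VARIANT_INCLUDE must point at
# the board-specific headers.
# Usage: compile_core /path/to/cores/arduino

COMMON_FLAGS="-Os -Wall -ffunction-sections -fdata-sections \
-mmcu=atmega32u4 -DF_CPU=16000000 -DUSB_VID=0x1bf4 -DUSB_PID=0x9204"
CXX_ONLY_FLAGS="-fno-exceptions -fno-threadsafe-statics"

compile_core() {
  srcdir=$1
  for src in "$srcdir"/*.c "$srcdir"/*.cpp; do
    [ -e "$src" ] || continue        # skip a pattern that matched nothing
    obj=$(basename "$src")
    obj=${obj%.*}.o                  # e.g. USBCore.cpp -> USBCore.o
    case $src in
      *.cpp)
        # C++ sources get the C++-only switches as well
        ${CXX:-avr-g++} $COMMON_FLAGS $CXX_ONLY_FLAGS \
          -I "$VARIANT_INCLUDE" -I "$srcdir" -c -o "$obj" "$src" || return 1
        ;;
      *.c)
        ${CC:-avr-gcc} $COMMON_FLAGS \
          -I "$VARIANT_INCLUDE" -I "$srcdir" -c -o "$obj" "$src" || return 1
        ;;
    esac
  done
}
```

The flag variables are deliberately left unquoted where they are used, so that the shell splits them into separate arguments, much as make would.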

Linking amounts to the following:

$ avr-gcc -w -Os -flto -fuse-linker-plugin \
      -Wl,--gc-sections -mmcu=atmega32u4 -o blink.elf \
      source1.o source2.o....

-flto and -fuse-linker-plugin both contribute to link-time optimization (LTO), and their use is somewhat disputed. It won't break the program to leave these out, or to experiment to see which gives the smallest binary.

-Wl,--gc-sections tells the linker to remove any sections that are not referenced by other sections. In conjunction with the compiler settings, this setting is crucial for minimizing the executable size.

The resulting ELF file can be converted into a hex file, and uploaded to the board, as described for the simpler example above.

The source code bundle contains a Makefile that automates all these steps. All being well, this example should provide for break-to-bootloader support, so you can deploy without needing to double-tap the reset button every time.

When the program is running, the Pro Micro will function as a USB modem (as it does when running the bootloader). However, the process of switching the USB port from bootloader mode to application mode takes a little time -- as much as ten seconds. So, while we should be able to see the diagnostic output of the Pro Micro program just by doing cat /dev/ttyACM0 (or whatever device is allocated), it might take some time for this to happen. This limitation is documented, and isn't considered a defect.

Closing remarks

It's pretty clear that the Arduino libraries were never designed to be integrated into a standard Linux toolchain. Creating a Makefile to build even a simple application is a whole lot more complicated than it ought to be. Once the Makefile is set up, building and deploying a program is a one-command operation, and doesn't require you to follow the code structure imposed by the IDE. That, alone, is a good reason for getting to grips with command-based building.

A number of developers have published general-purpose Makefiles for handling Arduino sketches. While these can work, I'm not convinced that it's possible to get the best out of them unless you understand the low-level operations involved in the build process.