The costs and benefits of software pulse-width modulation on the Raspberry Pi

Pulse-width modulation (PWM) is a simple method of digital-to-analog conversion that works by producing a train of pulses of variable width. These pulses are typically smoothed by a low-pass filter to produce the analog voltage. Where PWM is used for applications like controlling the brightness of a lamp or the speed of a motor, it may not even be necessary to smooth the pulses -- physics or biology will do it automatically. PWM is also used for controlling the position of the shaft of a servo motor -- a very important application in robotics and model-making.

The Raspberry Pi has no built-in analog outputs, unless we count the audio output. This audio is, in fact, generated by PWM, using the Pi's only hardware-controlled PWM outputs. These can be used by other applications, but only at the expense of losing audio capabilities. There are some tricks that can be used to extend the hardware PWM capabilities but, in practice, designers who want multiple PWM outputs -- for analog signal generation, or some other purpose -- frequently fall back on software-controlled PWM.

There are libraries for doing PWM, of course. The problem with these is that it isn't easy to figure out exactly what's going on, and to find out whether there are glaring inefficiencies that can be eliminated. This is important because software PWM has the potential to be grossly inefficient, if we're not careful.

In this article I outline how to do PWM in software in C, using simple usleep() operations to do the timing. To control multiple PWM channels, it's easiest to use a separate thread for each one.

I'll be showing snippets of code, but not a full application. The full application is available in my GitHub repository.

GPIO pin control in C -- a review

If you're familiar with GPIO access in C, none of this will be new.

Other than relying on third-party libraries, there are a number of ways of setting voltages on GPIO pins using C. Probably the simplest -- and the most portable -- is to use the traditional sysfs interface. It's rumoured that this interface will be going away, but it's not clear what it will be replaced with, and it's present in every Pi kernel released so far. The interface has remained substantially the same since the very first Pi.

All the control points are pseudo-files in the directory /sys/class/gpio. These aren't real files -- they don't correspond to any storage -- but they can be read and written as if they were.

To use a GPIO pin it must be "exported". This amounts to writing a pin number to /sys/class/gpio/export. Doing this makes a new directory, specific to the pin, available. For example, writing "18" makes available the directory /sys/class/gpio/gpio18.

For our purposes, the two important "files" in this new directory are direction and value. To make the pin an output, we write "out" to direction. Then we can write a "1" or "0" to value to set the voltage high or low, respectively.

For present purposes, that's all there is to it. There are a number of potential inefficiencies in this approach to setting GPIO pins, some of which can be avoided, and some of which can't. More on this below.
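As a rough illustration of the export and direction steps described above, here is a minimal sketch. The function names write_text() and gpio_setup_output() are made up for this example, and the gpio_root parameter (normally "/sys/class/gpio") is an assumption of mine, passed in only so the code can be tried against some other directory. Error handling is deliberately thin.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write a short string to an existing pseudo-file.
   Returns 0 on success, -1 on failure. (Hypothetical helper.) */
int write_text (const char *path, const char *text)
  {
  int fd = open (path, O_WRONLY);
  if (fd < 0) return -1;
  ssize_t len = (ssize_t) strlen (text);
  ssize_t n = write (fd, text, (size_t) len);
  close (fd);
  return n == len ? 0 : -1;
  }

/* Export a pin and make it an output. gpio_root is normally
   "/sys/class/gpio". Returns 0 on success, -1 on failure.
   (Hypothetical helper.) */
int gpio_setup_output (const char *gpio_root, int pin)
  {
  char path[256];
  char num[16];

  /* Step 1: write the pin number to <root>/export */
  snprintf (num, sizeof (num), "%d", pin);
  snprintf (path, sizeof (path), "%s/export", gpio_root);
  if (write_text (path, num) != 0) return -1;

  /* Step 2: write "out" to <root>/gpio<pin>/direction */
  snprintf (path, sizeof (path), "%s/gpio%d/direction", gpio_root, pin);
  return write_text (path, "out");
  }
```

A call like gpio_setup_output ("/sys/class/gpio", 18) would then prepare pin 18. This sketch treats any failure to write to export as fatal; a fuller version would probably want to cope with a pin that has already been exported, and might need to retry if the new files take a moment to become writable.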

Basic PWM

The basic PWM loop looks like this.

  while (!stop)
    {
    if (on_usec != 0)
      pwm_set_pin (my_pin, 1); 
    usleep (on_usec);
    if (!stop)
      {
      if (off_usec != 0)
        pwm_set_pin (my_pin, 0); 
      usleep (off_usec);
      }
    }

In this sample, on_usec is the length of time the output should be high, in microseconds, and off_usec is the length of time it should be low, also in microseconds. The function pwm_set_pin() does the actual pin setting -- I'll have more to say about that later.

In the example, I have specific tests for whether the "on" time or the "off" time is zero. This is because we probably want to be able to set the PWM output to completely on, or completely off. Or maybe not -- this depends on the application. In any case, if we do want to be able to set fully on and fully off, we must not toggle the GPIO state at all in those cases. The reason is that there is a minimum length of time for which the pin can be held in a state -- this is a limitation of the kernel and I/O -- so even a nominally zero-length pulse would appear on the output as a brief glitch. To get full-off or full-on output, we have to avoid changing the state completely.
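The values of on_usec and off_usec usually come from a frequency and a duty cycle. Here is one way the arithmetic might be done; the function name pwm_timings() and its parameters are my own invention for this sketch, and integer division means the period is rounded down to a whole number of microseconds.

```c
/* Convert a PWM frequency (Hz) and a duty cycle (percent, 0-100)
   into the on and off times, in microseconds, used by the loop.
   Returns 0 on success, -1 if the arguments are out of range.
   (Hypothetical helper.) */
int pwm_timings (unsigned freq_hz, unsigned duty_percent,
                 unsigned *on_usec, unsigned *off_usec)
  {
  if (freq_hz == 0 || freq_hz > 1000000u || duty_percent > 100u)
    return -1;

  unsigned period_usec = 1000000u / freq_hz;      /* one full cycle */
  *on_usec = period_usec * duty_percent / 100u;   /* high part */
  *off_usec = period_usec - *on_usec;             /* low part */
  return 0;
  }
```

For example, 50Hz at 25% duty gives a 20000-microsecond period, so on_usec is 5000 and off_usec is 15000. A duty of 0 or 100 produces a zero for one of the two times, which is exactly the case the tests in the loop are there to handle.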

Using threads

It's much easier to write the rest of the application if we can just set the PWM to some specific timings, and leave the PWM loop to get on with it. The easiest way to do this is to start a new thread to do the PWM operations.

In essence we need something like this:

#include <pthread.h>

void *loop_start (void *arg)
  {
  (void) arg; // Not used in this minimal version
  // PWM loop goes here
  return NULL;
  }

// Start the PWM loop in a thread
pthread_t pt;
pthread_create (&pt, NULL, loop_start, NULL);
pthread_detach (pt); 
// Rest of the application...

In practice, if we want to be able to generate multiple, different PWM signals, we'll (probably) need to create multiple threads. And then we'll need some data structures to hold the thread-specific details. My sample application shows how I do this but, of course, it's by no means the only way.
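To give a flavour of what those thread-specific data structures might look like, here is a sketch with one structure and one thread per channel. This is not the code from my sample application: the structure pwm_channel, its fields, and the start and stop functions are all hypothetical, pwm_set_pin() is an empty stand-in, and I've used C11 atomics and pthread_join() rather than a detached thread so that the channel can be stopped cleanly.

```c
#define _DEFAULT_SOURCE  /* for usleep() under a strict -std=c11 */
#include <pthread.h>
#include <stdatomic.h>
#include <unistd.h>

/* Hypothetical per-channel state; one instance per PWM output. */
struct pwm_channel
  {
  int pin;
  atomic_uint on_usec;   /* may be changed while the thread runs */
  atomic_uint off_usec;
  atomic_bool stop;      /* set to ask the thread to finish */
  atomic_ulong cycles;   /* completed cycles, for illustration only */
  pthread_t thread;
  };

/* Empty stand-in; the real function writes to the pin's value file. */
static void pwm_set_pin (int pin, int value)
  {
  (void) pin; (void) value;
  }

/* Thread body: a simplified PWM loop driven by the channel's data. */
void *pwm_channel_loop (void *arg)
  {
  struct pwm_channel *ch = arg;
  while (!atomic_load (&ch->stop))
    {
    unsigned on = atomic_load (&ch->on_usec);
    unsigned off = atomic_load (&ch->off_usec);
    if (on != 0) pwm_set_pin (ch->pin, 1);
    usleep (on);
    if (off != 0) pwm_set_pin (ch->pin, 0);
    usleep (off);
    atomic_fetch_add (&ch->cycles, 1);
    }
  return NULL;
  }

/* Start the channel's thread; returns 0 on success. */
int pwm_channel_start (struct pwm_channel *ch)
  {
  return pthread_create (&ch->thread, NULL, pwm_channel_loop, ch);
  }

/* Ask the thread to stop, and wait until it has. */
void pwm_channel_stop (struct pwm_channel *ch)
  {
  atomic_store (&ch->stop, 1);
  pthread_join (ch->thread, NULL);
  }
```

The rest of the application would declare something like struct pwm_channel ch = { .pin = 18, .on_usec = 5000, .off_usec = 15000 }, call pwm_channel_start (&ch), and later change the timings just by storing new values in the structure.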

Avoiding inefficiencies (where possible)

If your PWM cycle times are of the order of seconds, or even tenths of a second, inefficiency isn't a huge concern. However, with cycle times of milliseconds, things get more critical. It is the PWM loop that requires most of the attention, because this will potentially be executed thousands of times per second.

In my example, it's only the pwm_set_pin function that allows any scope for optimization. Here's an inefficient way to implement it.

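#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

void pwm_set_pin (int pin, int value) 
  {
  // Build the path to the pin's value file, on every call
  char path[PATH_MAX];
  snprintf (path, sizeof (path), "/sys/class/gpio/gpio%d/value", pin);
  // Format the value as text, on every call
  char num[10];
  snprintf (num, sizeof (num), "%d", value); 
  // Open, write, and close the file, on every call
  int f = open (path, O_WRONLY);
  write (f, num, strlen (num));
  close (f);
  }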

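Examining all the things that are wrong with this code should give some ideas about a more efficient implementation. In particular, every single call builds the path string from scratch, formats the value as text, measures the length of that text, and opens and closes the file -- even though the pin doesn't change, and the value can only ever be "0" or "1".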

With these things in mind, we can write a much more efficient implementation.

void pwm_set_pin (int pin, int value) 
  {
  static char one = '1';
  static char zero = '0';
  int fd = get_stored_fd_for_pin (pin);
  write (fd, value ? &one : &zero, 1); 
  }

Of course, what this code doesn't show is all the set-up and tear-down code that makes it possible. But that's not important, so far as efficiency is concerned, because that code is executed only infrequently.
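For completeness, here is one way the set-up and tear-down might look. Only the name get_stored_fd_for_pin() comes from the code above; the table, the MAX_PINS limit, pwm_open_pin(), pwm_close_pin() and the gpio_root parameter are assumptions made for this sketch, which also assumes the pin has already been exported and set as an output.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define MAX_PINS 64  /* Assumed upper limit on pin numbers */

/* Holds fd + 1 for each open pin, so that the zero-initialised
   table means "nothing open yet". */
static int stored_fd_plus_one[MAX_PINS];

/* Set-up: open the pin's value file once, and remember the descriptor.
   gpio_root is normally "/sys/class/gpio". Returns 0 or -1. */
int pwm_open_pin (const char *gpio_root, int pin)
  {
  if (pin < 0 || pin >= MAX_PINS) return -1;
  char path[256];
  snprintf (path, sizeof (path), "%s/gpio%d/value", gpio_root, pin);
  int fd = open (path, O_WRONLY);
  if (fd < 0) return -1;
  stored_fd_plus_one[pin] = fd + 1;
  return 0;
  }

/* The lookup used in the PWM loop: just an array access.
   Returns -1 if the pin has not been opened. */
int get_stored_fd_for_pin (int pin)
  {
  if (pin < 0 || pin >= MAX_PINS) return -1;
  return stored_fd_plus_one[pin] - 1;
  }

/* Tear-down: close the descriptor and forget it. */
void pwm_close_pin (int pin)
  {
  int fd = get_stored_fd_for_pin (pin);
  if (fd >= 0)
    {
    close (fd);
    stored_fd_plus_one[pin] = 0;
    }
  }
```

The point of the design is that all the string handling and the open() call happen once, before the loop starts, leaving only a table lookup and a one-byte write() on the hot path.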

What are the remaining inefficiencies? There's the function-call overhead for pwm_set_pin(). This can perhaps be removed by making the function inline (although the compiler might do this anyway). Perhaps a few CPU cycles can be shaved off by tightening up some of the calculations although, again, the compiler might do this itself.

The inefficiencies we can't remove are those associated with the kernel syscalls for usleep() and write(). Although usleep() does not use any significant CPU during the sleep time, setting up the sleep takes work. The write() call has some real processing to do and, because it also involves a change of execution context to kernel mode, there is some unavoidable overhead there as well. There's really no practical way to avoid the CPU cycles associated with the kernel operations. These are small inefficiencies, to be sure -- but they become significant when they're repeated thousands of times every second.
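If you want a feel for how much the sleep overhead amounts to on a particular system, something like the following sketch could be used. The function name average_sleep_overshoot_usec() is hypothetical; it simply times a number of usleep() calls against the monotonic clock and reports how much longer each one took, on average, than was asked for. The figure includes the (small) cost of the loop itself, and assumes iterations is greater than zero.

```c
#define _DEFAULT_SOURCE  /* for usleep() under a strict -std=c11 */
#include <time.h>
#include <unistd.h>

/* Return the average number of microseconds by which each usleep()
   call overshoots the requested time, over 'iterations' calls.
   (Hypothetical helper, for measurement only.) */
double average_sleep_overshoot_usec (unsigned requested_usec, int iterations)
  {
  struct timespec start, end;

  clock_gettime (CLOCK_MONOTONIC, &start);
  for (int i = 0; i < iterations; i++)
    usleep (requested_usec);
  clock_gettime (CLOCK_MONOTONIC, &end);

  /* Total elapsed time, converted to microseconds */
  double elapsed_usec = (end.tv_sec - start.tv_sec) * 1e6
                      + (end.tv_nsec - start.tv_nsec) / 1e3;

  return elapsed_usec / iterations - requested_usec;
  }
```

The result will vary with the hardware, the kernel, and whatever else the system is doing, so it's best treated as a rough guide to how short a pulse can usefully be requested.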

Practical performance

Unsurprisingly, the CPU load generated by the PWM process depends on the number of PWM cycles per second. The table below shows some test results from my sample program, on a Pi 3B+.

PWM cycles per second    CPU usage with one pin controlled
50                       Too small to measure
500                      2-3%
5000                     12-15%

What does this mean in practice? If you wanted to use PWM to control the colour of a multi-colour LED, you'd need three separate channels of PWM -- one for each of the individual colours. With a PWM frequency of 50Hz, most people wouldn't see any flicker. In fact, you'd probably get away with 20Hz. The total CPU usage created by software PWM in such a scenario is less than about 2%.

Most small servo motors require a PWM control signal at 50Hz or thereabouts. Again, software PWM would probably be satisfactory here, even if controlling several servos at the same time.
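As an illustration of the servo case, here is a sketch that turns a shaft angle into on and off times. It assumes a common hobby-servo convention -- a 20000-microsecond (50Hz) cycle, with a pulse of 1000 microseconds for 0 degrees and 2000 microseconds for 180 degrees -- but servos differ, so the constants should be checked against the data sheet. The function name servo_timings() is hypothetical.

```c
/* Assumed servo convention; check the data sheet for a real device. */
#define SERVO_PERIOD_USEC    20000u  /* 50Hz cycle */
#define SERVO_MIN_PULSE_USEC 1000u   /* pulse width at 0 degrees */
#define SERVO_MAX_PULSE_USEC 2000u   /* pulse width at 180 degrees */

/* Convert an angle in degrees (0-180; larger values are clamped)
   into on and off times in microseconds. (Hypothetical helper.) */
void servo_timings (unsigned angle_deg, unsigned *on_usec, unsigned *off_usec)
  {
  if (angle_deg > 180u) angle_deg = 180u;

  /* Scale the angle linearly between the two pulse widths */
  *on_usec = SERVO_MIN_PULSE_USEC
    + (SERVO_MAX_PULSE_USEC - SERVO_MIN_PULSE_USEC) * angle_deg / 180u;
  *off_usec = SERVO_PERIOD_USEC - *on_usec;
  }
```

With these constants, an angle of 90 degrees gives an "on" time of 1500 microseconds and an "off" time of 18500 microseconds.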

You could also use software PWM for generating, for example, a voltage to control the gain of an amplifier (with suitable smoothing of the PWM pulses). A 50Hz PWM frequency would probably be adequate here -- in fact, 10Hz would probably be adequate, but it would be harder to smooth nicely. You could almost certainly use software PWM -- with current amplification -- for controlling the speed of a DC motor; in fact, this is a very common application. You don't need to smooth the PWM pulses, because the inertia of the moving parts will do that for you. However, you'll probably need to fiddle with the PWM frequency to find one that is effective. If the frequency is too low, the motor will "growl". If it's too high, it will whine. There's an optimal frequency for every motor, and finding it probably needs some experimentation.

Generating any kind of analog waveform with a maximum frequency of more than about 500Hz is probably impractical. It can be done, with careful choice of low-pass filter, but the PWM operations will take a significant fraction of the CPU resources.
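The arithmetic behind the smoothing can be sketched as follows. I'm assuming the simplest possible filter here, a single resistor and capacitor, and a GPIO "high" level of 3.3V; the function names are made up for the example. The smoothed voltage is just the duty cycle multiplied by the "high" voltage, and the filter's cutoff frequency needs to sit well below the PWM frequency (to suppress the pulses) but above the highest frequency in the signal being generated.

```c
/* Average (smoothed) output voltage for a given duty cycle.
   duty is a fraction from 0.0 to 1.0; v_high is the "on" voltage,
   which is 3.3 for a Pi GPIO pin. (Hypothetical helper.) */
double pwm_average_voltage (double duty, double v_high)
  {
  return duty * v_high;
  }

/* Cutoff frequency, in Hz, of a single-stage RC low-pass filter:
   1 / (2 * pi * R * C). (Hypothetical helper.) */
double rc_cutoff_hz (double r_ohms, double c_farads)
  {
  const double pi = 3.14159265358979323846;
  return 1.0 / (2.0 * pi * r_ohms * c_farads);
  }
```

For instance, a 25% duty cycle averages out at 0.825V, and a 10k resistor with a 10uF capacitor gives a cutoff of about 1.6Hz. That squeeze between the PWM frequency and the signal frequency is why fast waveforms push the PWM rate, and with it the CPU load, so high.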

Alternatives to software PWM

Given the Pi's lack of built-in PWM facilities, it's worth asking what alternatives there are to doing PWM in software. To some extent the choice depends on whether you're doing PWM specifically to generate pulses of a particular width, or whether you want to smooth the pulses to create an analog voltage.

For about £5, you can buy an I2C PWM controller with 16 channels, with PWM frequency up to about 2 kHz. These devices need only a two-wire connection to the Pi's I2C bus, and usually offer 8-bit or 12-bit resolution in the control of the pulse width. They aren't usually suitable for analog signal generation -- they're usually aimed at the model-maker who wants to control banks of servos.

Alternatively, for about £3 you can get an I2C digital-to-analog converter chip. These devices usually offer 12-bit resolution. They may operate internally using PWM but, most likely, they will use a simple resistor ladder. This allows the device to generate analog voltages as fast as the I2C bus can supply data. Of course, if you don't mind tying up a bunch of GPIO pins, you can implement a resistor ladder yourself, and this will be capable of running at very high speeds.

In fact, given the low cost and high specification of some of the alternatives, it's actually quite hard to justify using software PWM at all.