Getting back into C programming for CP/M

For reasons I've discussed elsewhere, I've recently become interested in using, and programming, CP/M again, after an interval of 40 years. I've even bought a real, Z80-based, CP/M machine to experiment with. There's a small, but growing, market for these machines among retrocomputing enthusiasts.

I've implemented a number of new utilities for CP/M in C -- see, for example, KCalc-CPM, cpmbox, and cpmlife.

cpmlife was implemented using a modern Z80 cross-compiler, but I feel that somehow this is cheating. If I'm going to develop for CP/M, I really ought to use CP/M tools. I might not do all the development or testing on CP/M -- because it's rather time-consuming -- but I like to know that it would be possible to maintain my code entirely under CP/M.

This article is about developing in C for CP/M, using a 40-year-old C compiler, and how this differs from modern C development. The compiler I'm using is the 1982 release of Manx Software Systems' "Aztec C". The compiler is freely, and legally, available from the Aztec Museum. A lot of CP/M software falls into the broad category of "abandonware" -- software that is notionally still protected by intellectual property law, but whose owners have no interest in it, or cannot even be identified. In the case of Aztec, however, the owners of the intellectual property, whose claim is not in dispute, have stated that they are happy for it to be distributed and used.

About Aztec

The Aztec C compiler would originally have been distributed on floppy disks, and is very small by modern standards. The compiler, assembler, and linker are each about 20kB in size. The C library and the math library are a little larger.

The compiler outputs assembly code, which has to be assembled separately. Modern C compilers can typically generate assembly code too, but this is usually an internal operation, not visible to the user. The Aztec C compiler for CP/M actually generates 8080, not Z80, assembly instructions, so its output will run on both CPUs -- the Z80's instruction set is a superset of the 8080's. This does mean, however, that the more sophisticated features of the Z80 instruction set don't get used. There appears to be a Z80-specific compiler in later Aztec releases, but I have never been able to get it to work.

After the compiler has produced an assembly language ".asm" file, the assembler converts this to a binary object file. Object files play exactly the same role here as they do in modern C development, but they are not in any recognizably modern format. The linker then combines the object files with the standard library to create an executable.

So the sequence of operations for compiling hello.c to an executable is:

A> cc hello.c
A> as hello.asm
A> ln hello.o c.lib
Note:
CP/M is broadly case-insensitive. The text 'hello.c' will be presented to the compiler as 'HELLO.C' regardless of the original case. There's no obvious way to create a lower-case filename on CP/M.

Unless told otherwise, the linker will produce a binary with the same name as the first of its arguments; in this case, hello.com.

The Aztec compiler pre-dates ANSI C, and follows the archaic Kernighan & Ritchie syntax. The most obvious difference from modern practice is in function declarations:

int my_function (a, b)
int a; char *b;
  {
  ... body of function ...
  }

Modern C compilers will accept this syntax, which can be useful if you want to check part of a CP/M C program using modern tools -- more on this later.

Variables must be strictly declared at the start of a block, which means that each opening brace "{" is typically followed by a slew of declarations. Modern practice favours putting declarations closer to where the variables are used. This is particularly relevant for trivial loop control variables. You can't write this:

for (int i = 0; i < 10; i++)...

You have to write this:

int i;
...
...
for (i = 0; i < 10; i++)...

This is undeniably a nuisance, but not a huge problem in most cases.

A bigger problem is the lack of any const modifier, either to indicate constant quantities or constant pointers. This makes compile-time error checking less thorough. I've found that a lot of mistakes that would be spotted at compile time by a modern compiler don't get picked up by the CP/M compiler. Of course, this annoyance has to be considered alongside the fact that the entire compiler is only 20kB in size.
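For example, a function that only reads a string has no way of saying so. Here's a minimal sketch -- the function is made up for illustration. With a modern compiler the parameter would be declared const char *s, and any accidental write through it would be caught at compile time; Aztec has to take it on trust.

int count_spaces (s)
char *s;             /* a modern compiler would let us write 'const char *s' */
  {
  int n;

  n = 0;
  while (*s)
    {
    if (*s == ' ') n++;
    s++;
    }
  return n;
  }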

Function prototypes

The Aztec compiler does not support full function prototypes, and trying to declare them will fail. You must declare the return value of a function if it is not int and not in the same file, but you can't declare the arguments. So

double my_func (double x); /* no */
double my_func(); /* yes */

Because the arguments can't be declared, you must use the correct data type in the function call. So

double x = my_func (10); /* no */
double x = my_func (10.0); /* yes */

The problem is that, lacking a prototype, the compiler does not know to treat the literal "10" as the double "10.0". Modern compilers don't usually have this kind of problem.

Because I like to be able to test my code with a modern compiler as well as run it on CP/M, I usually write function prototypes both ways, with a compiler switch to select which to use:

#ifdef CPM
double my_func(); 
#else
double my_func (double x); 
#endif

Data type sizes

CP/M C compilers generally offer integer data types with smaller ranges than modern compilers. For example, the Aztec compiler takes an int to be 16 bits, so its range will be 0-65535 if unsigned and -32768 to 32767 if signed. 16 bits is a good choice for the fundamental unit of calculation, as the Z80 CPU has 16-bit registers that can take part in arithmetic. Still, modern compilers, designed for contemporary CPUs, usually take an int to be 32 or 64 bits, and getting used to the smaller range can be a nuisance.
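As a small illustration of how easily those limits are hit -- the numbers here are only examples, and on this 16-bit, two's-complement target the results simply wrap around:

unsigned u;
int i;

u = 65535;       /* the largest value an unsigned 16-bit int can hold */
u = u + 1;       /* wraps around to 0 */

i = 30000;
i = i + 10000;   /* exceeds 32767, so the result wraps to a negative value */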

Because an 8-bit microcomputer typically has a 16-bit address bus, an int is large enough to store a pointer. Pointer types are also 16-bit quantities.

The Aztec compiler supports a long data type which is 32 bits in size. However, the Z80 CPU has no built-in arithmetic operations on data types of that size, so 32-bit operations will be comparatively slow.
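Where 32-bit values are genuinely needed, it's worth keeping them out of the housekeeping. A sketch -- the values array is made up for illustration:

int values[1000];
long total;
int i;                         /* keep the loop counter 16-bit */

total = 0L;
for (i = 0; i < 1000; i++)
  total = total + values[i];   /* only the running sum needs 32-bit arithmetic */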

The compiler has float and double types which are 32-bit and 64-bit respectively. Double-precision arithmetic gives about 12 significant figures in practice. Both types need to be used with care, because all the floating-point arithmetic is done in software -- there is no floating-point hardware -- and it is not particularly speedy.

Standard library limitations

The Aztec standard C library is minimal by modern standards. Most of the basic file and console I/O functions are present, along with a good set of math functions. We shouldn't expect networking functions, or selectors, or thread management -- all things that make no sense in the CP/M world. However, you'll find yourself implementing your own versions of very basic functions like strdup and memcpy. These are the kinds of functions that are easy to implement in very inefficient ways, which wouldn't be noticed on a modern CPU. You'd likely get away with writing this kind of thing in modern C:

for (int i = 0; i < strlen (str); i++) {...}

This is bad code, of course, because the strlen() function will be executed on the same string repeatedly. A modern compiler will optimize this redundancy away and, even if it can't, modern CPUs are so fast that it might not even matter. On a Z80, it matters. All coding requires paying attention to efficiency, but functions that get called many times are particularly significant.
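On CP/M the fix is simple enough to do by hand, since the compiler won't do it for you. A sketch, assuming str is an ordinary nul-terminated string:

int i, len;

len = strlen (str);              /* call strlen() once, not on every pass */
for (i = 0; i < len; i++) {...}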

Command line arguments

CP/M, in general, is a system in which uppercase and lowercase characters are not strongly distinguished. The Aztec C compiler presents the program with the conventional argc and argv arguments that represent the command line -- but they will all be in uppercase, whatever the user actually enters. That isn't the fault of the compiler -- it's the way the command line is delivered by the CP/M CCP. Among other things, you need to be careful about processing command-line switches -- there's no point using a mixture of upper- and lowercase switches, for example.
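In practice that means testing each switch against a single, uppercase form. A minimal sketch -- the "-V" switch is made up for illustration:

/* The CCP delivers the command tail in upper case, so only "-V" can
   ever arrive; checking for "-v" as well would be wasted code. */
if (argc > 1 && strcmp (argv[1], "-V") == 0)
  {
  /* ... handle the switch ... */
  }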

Unix/Linux programmers will be used to getting command-line arguments that are "globbed". That is, any filename wild-card characters on the command line will already have been expanded into a list of matching files. On CP/M, if you want to be able to handle arguments like *.txt, you'll have to expand them yourself.

MSDOS and Windows programmers will be used to doing this, because their command-line processors follow the CP/M model. Forcing the program to expand wild-cards allows CP/M to devote only a small amount of RAM to storing the command line. This was very important in the days when many desktop computers had memory sizes measured in kilobytes.
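The first step is usually to turn an argument like *.TXT into the 11-character, space-padded, '?'-filled name field that the BDOS file-search functions (used later in this article) expect. Here's a hypothetical helper, written as a sketch -- it does no validation of malformed names:

/* Convert a command-line argument like "*.TXT" into the 11-character
   pattern used in a file control block: eight characters of name, then
   three of extension, space-padded, with '*' expanded to '?'. */
void arg_to_pattern (arg, pattern)
char *arg; char *pattern;
  {
  int i, p, limit;

  for (i = 0; i < 11; i++) pattern[i] = ' ';

  p = 0;        /* position in pattern: 0-7 is the name, 8-10 the extension */
  limit = 8;
  for (i = 0; arg[i]; i++)
    {
    if (arg[i] == '.')
      {
      p = 8;                                  /* move to the extension field */
      limit = 11;
      }
    else if (arg[i] == '*')
      {
      while (p < limit) pattern[p++] = '?';   /* '*' fills the rest of the field */
      }
    else if (p < limit)
      pattern[p++] = arg[i];
    }
  }

The resulting 11 bytes can be copied straight into bytes 1-11 of an FCB before calling the search functions.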

Redirection

Another way in which programming command-line applications in CP/M is different from Unix is that CP/M provides no file redirection facilities. If the user has to be able to direct program output to a file, e.g., by running

A> myprog > myfile.out

then the programmer needs to make this happen.

This task is made easier because the Aztec C library has built-in support for redirection. When the program starts, the initialization code parses any redirection tokens on the command line, and sets up stdin, stdout, and stderr accordingly.

Of course, this library-based redirection only applies if you do input and output using the C library features. If you call BIOS or BDOS functions directly, the redirection won't apply. That's the same in Unix, though -- even if output is redirected, you can still output to the console by writing to /dev/tty, for example.
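For example, a character written with BDOS function 2 (console output) through the library's bdos() call goes straight to the screen, whatever the command line said -- a sketch:

#define BDOS_CONOUT 2       /* BDOS "console output" function */

bdos (BDOS_CONOUT, '!');    /* appears on the console even if stdout
                               has been redirected to a file */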

Device I/O

This will be relatively familiar to Windows programmers, I think, and certainly to those of us who programmed for MSDOS. At the C level, you'd communicate with a device by opening its pseudo-file. So to send data to the printer, you'd start with:

FILE *f = fopen ("PRN:", "w");
fprintf (f, "Something to print...");
...

You can even write to the paper punch device by opening PUN: although, since few (if any) CP/M machines ever had a paper punch, I doubt it will have much effect. Amazingly, PUN: remains a valid device identifier on Windows.

System interface

Although the C compiler's standard library has a good selection of basic file I/O functions, there's still a lot of functionality missing, compared to what we'd expect in a modern C library. For example, there are no built-in functions for enumerating files on a drive. Aztec C provides functions for calling BDOS and BIOS entry points directly, which can be used in situations like this. To use them, you do need a good working knowledge of CP/M internals.

For example, here is some code to enumerate all the files on drive A:.

  #define BDOS_DFIRST 17    /* BDOS "search for first" */
  #define BDOS_DNEXT 18     /* BDOS "search for next" */
  #define FCB 0x005c        /* Default file control block */
  #define DMABUF 0x0080     /* Default DMA (directory) buffer */
  #define CHAR_MASK 0x7F    /* Strip the attribute bit from filename characters */

  int n, i;
  char *fcb = (char *)FCB;

  fcb[0] = 1; /* Drive A */
  strcpy (fcb + 1, "???????????"); /* Match any file */

  if ((n = bdos (BDOS_DFIRST, FCB)) == 255)... /* Handle error */
  do
    {
    char name [12];
    char *fcbbuf = (char *)DMABUF + 32 * n;

    for (i = 0; i < 11; i++)
      {
      name[i] = fcbbuf[1 + i] & CHAR_MASK;
      }
    name[11] = 0;
    /* Process the file called "name" */
    } while ((n = bdos (BDOS_DNEXT, FCB)) != 255);

To make sense of this code, you need to understand the following.

- BDOS function 17 ("search for first") finds the first directory entry that matches a file control block (FCB), and function 18 ("search for next") finds subsequent matches. Both return 255 when there are no (more) matches.
- The CCP sets up a default FCB at address 0x005C; filling its name and extension fields with '?' characters makes it match every file, and setting its first byte to 1 selects drive A:.
- Matching directory entries are returned in the default DMA buffer at address 0x0080, which holds four 32-byte entries; the value returned by the BDOS call is an index (0-3) selecting one of those entries.
- Bytes 1 to 11 of a directory entry hold the filename and extension; the top bit of each of these bytes can carry a file attribute, so it has to be masked off with 0x7F before the name can be treated as text.

I mention all this not just to fill space, but to point out that using C rather than assembly language doesn't necessarily take away the need to understand CP/M internals fairly well. Happily, you can bury all this complexity in a library once you've got it working.

Calling convention

The Aztec compiler uses traditional C argument passing to functions: the caller places the arguments on the stack, and then takes them off afterwards. Any return value is returned in the A register, for an 8-bit value, or the HL register pair for a 16-bit value. Modern practice favours passing parameters in registers where possible. This is much, much faster than stack-based argument passing, but works better when there are many registers available. The 8080 CPU only has a total of 7 bytes of register capacity, so not many arguments could be passed that way.

Using the stack to pass arguments should allow for more, or larger, arguments. In practice, I've found that passing more than three long arguments is problematic. I don't know what the maximum stack size on CP/M is -- I would have thought it would be limited only by available memory. However, I've noticed other indications of limited stack size. For example, "automatic" (local) variables, which are usually stored on the stack, behave badly when they are more than a few tens of bytes in size.

I do not know if this is a defect, or whether there is some specific setting that has to be used to set the stack size. If it's a defect, it's highly unlikely to be fixed at this stage. Bear in mind that a double value is 8 bytes in size, so I doubt it will be possible to pass many of these as parameters (but two is definitely OK).
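One possible workaround -- a sketch of my own, rather than anything the Aztec documentation prescribes -- is to pass a single 16-bit pointer to a struct instead of several large arguments:

struct params
  {
  double a;
  double b;
  double c;
  };

/* One 16-bit pointer goes on the stack instead of 24 bytes of doubles. */
double sum3 (p)
struct params *p;
  {
  return p->a + p->b + p->c;
  }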

Memory management

Aztec C provides the usual malloc() and free() functions, and they work as expected. It's almost certainly faster to allocate a large amount of memory and then manage it internally, than it is to make many small allocations. This is largely true with modern compilers as well. However, it's often convenient to allocate memory on an as-needed basis and, just as with a modern compiler, the developer has to work out an acceptable compromise.

Conventionally, the program checks the return value from a malloc() call to ensure the allocation succeeded. Many programmers, including myself, have gotten out of the habit of doing this on modern systems like Linux, because a malloc() call nearly always succeeds, regardless of how much memory is actually available. When working on a Z80, though, we need to be much more careful about this kind of thing.
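On CP/M it's worth checking every time, along these lines (a sketch -- the 4kB size is arbitrary):

char *malloc();        /* not an int-returning function, so declare it */
char *buf;

buf = malloc (4096);   /* one larger block, then manage it internally */
if (buf == 0)          /* on a 64kB machine this can genuinely happen */
  {
  fprintf (stderr, "Out of memory\n");
  exit (1);
  }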

Paging and overlays

CP/M systems rarely had more than 64kB of RAM, and CP/M 2.2 had no concept of paging or virtual memory. As a programmer you could implement software that required more than the available RAM by breaking it into segments, but the operating system gave little help with this.

The Aztec C compiler supports a simple paging mechanism based on a technology known in the CP/M world as "overlays". A program consists of a "base" or "core" segment that remains in memory all the time, and a number of "overlays" that are loaded from disk as required.

The tooling for compiling and using overlays is built into the compiler and C library so, for the programmer, it's pretty straightforward. Of course, there are subtle problems, like passing data from one overlay to another, so things aren't trivial. And, of course, with genuine 80s hardware, reading the overlays from disk is fairly slow, so it's a technique that has to be used with care.

Building and testing

While I think that using modern cross-compilers for CP/M development is cheating, I have no objection to using real CP/M tools on a modern CP/M emulator. This is usually much faster, and more convenient, than working on real 80s technology. But are these approaches really different?

It seems to me that, if we're interested in keeping these old technologies alive and thriving, we should actually be using them. Using a CP/M compiler on a CP/M emulator satisfies that objective -- at least to some extent -- while using modern tools that could never run on CP/M does not. At least, that's how it seems to me.

Consequently, I'm quite keen that the CP/M software I write is at least capable of being compiled and linked on 80s hardware. I might not actually do this very often, but I always check that it's possible to do so.

In any case, you'll need to test the software on real hardware, even if you build it using an emulator. A modern emulator will run CP/M applications hundreds of times faster than a real CP/M machine does natively. As a result, it's all too easy to write very inefficient code that seems to work perfectly well on an emulator, but struggles on real hardware.

Here's an example. It's often convenient, and expressive, to work with two-dimensional arrays. In that case, you might find yourself enumerating the complete array like this:

  int elems[5][300];
  ...
  int i, j;
  for (i = 0; i < 5; i++)
    {
    for (j = 0; j < 300; j++)
      {
      int elem = elems[i][j];
      ... process the value ...
      }
    }

There's nothing wrong with this code structurally and, if you only test it on an emulator, most likely it will work fine. The problem is the amount of math required to determine the value of elems[i][j]. This will require a 16-bit multiplication -- for which there is no hardware support -- and an addition, followed by an index into memory. This whole process will be repeated 1500 times.

It's hugely faster to consider the array as a single block of data, and enumerate it by maintaining a single index which gets incremented, like this:

  int elems[5][300];
  ...
  int *_elems = (int *)elems;
  int i;
  for (i = 0; i < 1500; i++) 
    {
    int elem = _elems[i];
    ... process the value ...
    }

This strategy is less readable, but it completely eliminates the need to perform 1500 16-bit multiplications. Of course, this saving can be made only because we happen to be reading the array sequentially; sometimes this isn't practicable. However, there's always a need, when programming for 40-year-old hardware, to think very carefully about efficiency. We've mostly gotten out of the habit, because modern compilers can do this sort of optimization implicitly, and our CPUs are thousands of times faster.

This is why testing as often as possible on original hardware is so important -- it's just too easy to write inefficient code if you work too much on an emulator.

At the same time, I've found that it's very helpful to be able to build and run code destined for CP/M on a modern Linux system. I suppose it would be equally straightforward -- or not -- to run it on Windows. Modern compilers can do much more extensive compile-time checking and, at runtime, we can use tools like valgrind to check for invalid memory references and careless memory management. None of this is possible under CP/M. I've found that GCC will compile K&R-style C perfectly well, and anything that Aztec C can compile can also be compiled by GCC. It might not work, of course -- nothing's ever that simple.

In practice, you'll probably only be able to unit-test certain parts of the program on a modern platform, because all the I/O will be different. Still, even that is an improvement over the testing it's practicable to do natively on a Z80 system.

Closing remarks

If you want to write really efficient code for 80s hardware, using an 80s C compiler is only one step up from writing assembly language. The C language is minimal, as is the C library. You'll have to do all the optimisation yourself that a modern compiler would do automatically. Compile-time error checking is minimal, and you'll still need to be familiar with the internals of the platform.

But if it were easy, it wouldn't be fun.