The importance of enabling compiler warnings in C and C++ programming

C logo I see too many C (and C++) programs misbehave at runtime, for reasons that could easily have been detected at build time using checks built into all modern compilers. C is not -- or should not be -- a "try it and see" language. Nevertheless, many developers struggle to find faults at runtime that could easily have been eliminated before the program was even executed.

Some of these faults are old chestnuts, like putting an extra semi-colon on the end of a for(...) statement; some are more subtle. All are avoidable, with the proper compiler settings.

This article is specifically about the GNU gcc compiler, which is pretty-much ubiquitous these days. Many developers are using it in the traditional way, with Makefiles and command-line tools, while others are using it built into an integrated development tool. While I focus on gcc here, I have no reason to think that other compilers behave very differently.

A C compiler will always report actual faults in program syntax -- the kinds of error that prevent the program being compiled at all. However, it can also report "warnings", about suspicious constructions that do not prevent the program being compiled.

gcc takes a number of command-line switches that affect its warning level. If none of these switches are used, the compiler takes a pretty relaxed approach to reporting potential problems. I would suggest that the warning mode C and C++ programmers should use is at least as fussy the one enabled by the -Wall command-line switch. In principle, this switch means "all warnings" but, as we shall see, this mode does not report all the warnings that might be of interest.

The first part of this article gives examples of potential problems that gcc will warn about if the -Wall switch is used. Later, I'll describe a few problem areas that need a fussier warning level if they are to be reported.

Potential programming errors that are reported when using gcc -Wall

Misplaced semicolons

This is the archetypal trivial error that does a lot of damage. It's perfectly legal in C and C++ to write

if (at_war == TRUE);
  fire_missiles();

But the behaviour is almost certainly not what the programmer intended. The if statement is complete, and the call to fire_missiles() is made whether the if() evaluates to TRUE or FALSE.

A similar problem plagues loops. Probably the intention in this example was to run the loop body ten times:

  for (int i = 0; i < 10; i++);
    printf ("Hello, World\n");

But actually it runs just once.

There can't be many programmers, however experienced, who never make a mistake like this.

Neither of the suspicious constructs above is in any way illegal under the C language specification. A compiler can't really complain about a semicolon on the end of an if or a for by itself, because there are circumstances where it is appropriate to write code like this. What the compiler can complain about, if so configured, is the discrepancy between the terminating semicolon and the following indentation.

When running gcc with the -Wall option, the previous example gives rise to this warning:

warning: this ‘for’ clause does not guard... [-Wmisleading-indentation]
    5 |   for (int i = 0; i < 10; i++);
      |   ^~~
...this statement, but the latter is misleadingly indented 
as if it were guarded by the ‘for’
    6 |     printf ("Hello, World\n");

The "does not guard..." message indicates that the structure of the for is not consistent with the following indentation. That isn't necessarily a fault in the logic -- it might just be careless indentation. But, either way, it should be a red flag.

Writing a comparison as an assignment

This is a another old warhorse and, again, it's a mistake that even experienced programmers make.

 defcon = 5;
  if (defcon = 1)
    fire_missiles(); 

From the context, it's pretty clear that the developer meant a test, not an assignment:

  if (defcon == 1)...

But the compiler doesn't know that, and using if() with an assignment is perfectly legal -- and sometimes appropriate.

With -Wall, we receive this warning:

warning: suggest parentheses around assignment used as truth 
value [-Wparentheses]
   if (at_war = 1)
       ^~~~~~

If the assignment isn't a mistake, enclosing it in an extra pair of parentheses will remove the warning.

Array index too small

In C it's perfectly legal to index an array using a char variable, as in this example:

  int data[1000];
  char c = 0;
  data[c] = 0;

The problem with a construction like this is that a char almost certainly won't have sufficient range to address the complete length of the array. That's particularly true if the char is signed -- then the index will be constrained to lie between 0 and 127. Most of the elements in data will be inaccessible.

Because we now rarely try to achieve the tiny efficiency gains that come from using 8-bit arithmetic, indexing an array with a char nearly always indicates a logical error. The compiler will generate a warning like this:

warning: array subscript has type ‘char’ [-Wchar-subscripts]
    8 |   data[c] = 0;

This error is rarely as blatant as my example, when it does appear.

Missing return statements

With its default settings, gcc will not complain about this construction:

int foo (void)
  {
  int x = 42;
  }

even though it is obviously broken. There's simply no logical reason to define a function to return an int, and then not specify what value to return.

What makes this problem so pernicious is that, amazingly, sometimes the code works. If it does work, that's because a calculation result just happens to be in whatever CPU register the compiler uses to carry the return value. Needless to say, this code will not be robust or portable.

When properly configured, gcc will warn about this problem very clearly:

warning: control reaches end of non-void function [-Wreturn-type]
    7 |   }

Uninitialized values

Another very common class of programming error is to use variables without assigning values to them. In practice, these errors are rarely as blatant as this one:

  int a;
  if (a)
    printf ("Hello, World\n");

I particularly see this error made by programmers who are more familiar with languages where unassigned variables do take a specific value. Even in those languages, I would argue that the developer should always assign a specific value, if the value actually matters. In C and C++, though, the value of an unassigned local variable will be whatever happens to be on the stack when the function is executed.

With the -Wall option, gcc will warn about mistakes of this kind like this:

In function ‘foo’:
11:6: warning: ‘a’ is used uninitialized in this function [-Wuninitialized]
   11 |   if (a)
      |      ^

In practice, the compiler can detect problems with uninitialized variables that are much more subtle, and hard to detect, than my trivial example.

Unhandled cases

A very common construct in C programming is to use an enum to define a variable that can take one of a small number of discrete values. It's always possible to use a simple int rather than an enum, but using an enum gives additional information, both to a reader and to the compiler.

For example, if an enum is used in a switch statement, it's reasonable to assume that all the distinct values of the enum are to be handled. Consider the following example:

enum Foo {hot, warm, cold};

enum Foo foo = ...; 
switch (foo)
  {
  case hot: printf ("Hot\n"); break;
  case warm: printf ("Warm\n"); break;
  }

The enum can take one of three values: hot, warm, and cold. However, the switch has case statements for only two of these three values.

Again, it's unlikely that a programmer will make a mistake as blatant as this one; it's more of a problem when the enum has many values. With -Wall, gcc gives the following warning:

warning: enumeration value ‘cold’ not handled in switch [-Wswitch]
    9 |   switch (foo)
      |   ^~~~~~

Potential programming errors that are not reported when using gcc -Wall

The -Wall option to gcc doesn't really mean "all warnings", despite the name. Rather, it means, "all the warnings that the gcc maintainers think ought to be easy to enable". There are many potential warnings that need a highly level of fussiness. So, for example, while using -Wall will report unused functions and unused variables, it won't report unused function parameters. gcc won't complain about the following function, even with -Wall:

void foo (int bar)
  {
  //... bar not used anywhere
  }

It isn't entirely clear to me why unused parameters are considered to be less worthy of reporting than unused variables, but so it is.

"Unused parameters" is one of an additional class of warnings that is enabled using -Wextra. Are there important potential errors that are not reported even with -Wextra? Sadly, there are. Consider this example:

  int i = 12345;
  char c = i;

There are, presumably, valid reasons to assign the value of an int variable to a char, or a double to a float. However, very often such a construction represents a logical error -- or a misunderstanding of the ranges of data types, which is as bad. The example above does not raise a warning with -Wall or even with -Wextra -- assignments of a larger data type to a smaller one are only reported with -Wconversion. In that case, we see:

warning: conversion from ‘int’ to ‘char’ may change value [-Wconversion]
    9 |   char c = i;
      |            ^

There are other classes of warnings that have their own settings -- a careful reading of the gcc documentation will indicate what they are, and whether you ought to be concerned about them.

Closing remarks

I hope I've shown how much easier life is, if you configure your compiler to report warnings freely. In practice, warnings often indicate logical errors. At worst, they show constructions that are likely to confuse another person looking at your code, which is no bad thing.

A question that sometimes arises is: if -Wall doesn't really show all warnings, is there a setting that does? So far as I know, there isn't. Enabling all possible warnings in gcc (and there are hundreds of them) would be extremely pedantic. Still, I have to confess that I was taken by surprise to find that -Wall does not include warnings about assigning a value to a variable that does not have sufficient range to contain it.

Finally: a useful adjust to -Wall, etc., is -Werror. This options makes gcc treat all all warnings as errors, and stop the compiler in its tracks. This can be useful when compiling a large, complex program, because otherwise warning messages will tend to get lost in the rest of the output.