The importance of enabling compiler warnings in C and C++ programming
I see too many C (and C++) programs misbehave at runtime, for reasons that could easily have been detected at build time using checks built into all modern compilers. C is not -- or should not be -- a "try it and see" language. Nevertheless, many developers struggle to find faults at runtime that could easily have been eliminated before the program was even executed.
Some of these faults are old chestnuts, like
putting an extra semi-colon on the end of a for(...)
statement; some
are more subtle. All are avoidable, with the proper compiler
settings.
This article is specifically about the GNU gcc
compiler, which
is pretty-much ubiquitous these days. Many developers are using it in the
traditional way, with Makefiles and command-line tools, while others
are using it built into an integrated development tool. While I
focus on gcc
here, I have no reason to
think that other compilers behave very differently.
A C compiler will always report actual faults in program syntax -- the kinds of error that prevent the program being compiled at all. However, it can also report "warnings", about suspicious constructions that do not prevent the program being compiled.
gcc
takes a number of command-line switches that affect its
warning level. If none of these switches are used, the compiler takes a pretty
relaxed approach to reporting potential problems.
I would suggest that the warning mode C and C++ programmers should use
is at least as fussy the one enabled by the
-Wall
command-line switch.
In principle, this switch means "all warnings" but, as we shall see, this mode
does not report all the warnings that might be of interest.
The first part of this article gives examples of potential problems that
gcc
will warn about if the -Wall
switch is
used. Later, I'll describe a few problem areas that need a fussier
warning level if they are to be reported.
Potential programming errors that are reported when using gcc -Wall
Misplaced semicolons
This is the archetypal trivial error that does a lot of damage. It's perfectly legal in C and C++ to write
if (at_war == TRUE); fire_missiles();
But the behaviour is almost certainly not what the programmer intended.
The if
statement is complete, and the call to
fire_missiles()
is made whether the if()
evaluates to TRUE or FALSE.
A similar problem plagues loops. Probably the intention in this example was to run the loop body ten times:
for (int i = 0; i < 10; i++); printf ("Hello, World\n");
But actually it runs just once.
There can't be many programmers, however experienced, who never make a mistake like this.
Neither of the suspicious constructs above is in any way illegal under
the C language specification. A
compiler can't really complain about a semicolon on the end of an
if
or a for
by itself, because there are
circumstances where it is appropriate to write code like this. What the
compiler can complain about, if so configured, is the discrepancy
between the terminating semicolon and the following indentation.
When running gcc
with the -Wall
option, the
previous example gives rise to this warning:
warning: this ‘for’ clause does not guard... [-Wmisleading-indentation] 5 | for (int i = 0; i < 10; i++); | ^~~ ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘for’ 6 | printf ("Hello, World\n");
The "does not guard..." message indicates that the structure of the
for
is not consistent with the following indentation. That
isn't necessarily a fault in the logic -- it might just be careless indentation.
But, either way, it should be a red flag.
Writing a comparison as an assignment
This is a another old warhorse and, again, it's a mistake that even experienced programmers make.
defcon = 5; if (defcon = 1) fire_missiles();
From the context, it's pretty clear that the developer meant a test, not an assignment:
if (defcon == 1)...
But the compiler doesn't know that, and using if()
with
an assignment is perfectly legal -- and sometimes appropriate.
With -Wall
, we receive this warning:
warning: suggest parentheses around assignment used as truth value [-Wparentheses] if (at_war = 1) ^~~~~~
If the assignment isn't a mistake, enclosing it in an extra pair of parentheses will remove the warning.
Array index too small
In C it's perfectly legal to index an array using a char
variable, as in this example:
int data[1000]; char c = 0; data[c] = 0;
The problem with a construction like this is that a char
almost
certainly won't have sufficient range to address the complete length of
the array. That's particularly true if the char
is
signed -- then the index will be constrained to lie between 0 and 127.
Most of the elements in data
will be inaccessible.
Because we now rarely
try to achieve the tiny efficiency gains that come from using 8-bit
arithmetic, indexing an array with a char
nearly always
indicates a logical error. The compiler will generate a warning like this:
warning: array subscript has type ‘char’ [-Wchar-subscripts] 8 | data[c] = 0;
This error is rarely as blatant as my example, when it does appear.
Missing return statements
With its default settings, gcc
will not complain about
this construction:
int foo (void) { int x = 42; }
even though it is obviously broken. There's simply no logical reason to
define a function to return an int
, and then not specify
what value to return.
What makes this problem so pernicious is that, amazingly, sometimes the code works. If it does work, that's because a calculation result just happens to be in whatever CPU register the compiler uses to carry the return value. Needless to say, this code will not be robust or portable.
When properly configured, gcc
will warn about this
problem very clearly:
warning: control reaches end of non-void function [-Wreturn-type] 7 | }
Uninitialized values
Another very common class of programming error is to use variables without assigning values to them. In practice, these errors are rarely as blatant as this one:
int a; if (a) printf ("Hello, World\n");
I particularly see this error made by programmers who are more familiar with languages where unassigned variables do take a specific value. Even in those languages, I would argue that the developer should always assign a specific value, if the value actually matters. In C and C++, though, the value of an unassigned local variable will be whatever happens to be on the stack when the function is executed.
With the -Wall
option, gcc
will warn about
mistakes of this kind like this:
In function ‘foo’: 11:6: warning: ‘a’ is used uninitialized in this function [-Wuninitialized] 11 | if (a) | ^
In practice, the compiler can detect problems with uninitialized variables that are much more subtle, and hard to detect, than my trivial example.
Unhandled cases
A very common construct in C programming is to use an enum
to define a variable that can take one of a small number of discrete
values. It's always possible to use a simple int
rather
than an enum
, but using an enum
gives additional
information, both to a reader and to the compiler.
For example, if an enum
is used in a switch
statement, it's reasonable to assume that all the distinct values of
the enum
are to be handled. Consider the following
example:
enum Foo {hot, warm, cold}; enum Foo foo = ...; switch (foo) { case hot: printf ("Hot\n"); break; case warm: printf ("Warm\n"); break; }
The enum
can take one of three values: hot
,
warm
, and cold
. However, the switch
has case
statements for only two of these three values.
Again, it's unlikely that a programmer will make a mistake as blatant
as this one; it's more of a problem when the enum
has
many values. With -Wall
, gcc
gives the following
warning:
warning: enumeration value ‘cold’ not handled in switch [-Wswitch] 9 | switch (foo) | ^~~~~~
Potential programming errors that are not reported when using gcc -Wall
The -Wall
option to gcc
doesn't really mean
"all warnings", despite the name. Rather, it means, "all the warnings that
the gcc
maintainers think ought to be easy to enable". There
are many potential warnings that need a highly level of fussiness.
So, for example, while
using -Wall
will report unused functions and unused variables,
it won't report unused function parameters. gcc
won't
complain about the following function, even with -Wall
:
void foo (int bar) { //... bar not used anywhere }
It isn't entirely clear to me why unused parameters are considered to be less worthy of reporting than unused variables, but so it is.
"Unused parameters" is one of an additional class of warnings that is
enabled using -Wextra
. Are there important potential
errors that are not reported even with -Wextra
? Sadly,
there are. Consider this example:
int i = 12345; char c = i;
There are, presumably, valid reasons to assign the value of an
int
variable to a char
, or a double
to a float
. However, very often such a construction
represents a logical error -- or a misunderstanding of the
ranges of data types, which is as bad. The example above does not raise
a warning
with -Wall
or even with -Wextra
-- assignments of a larger data type to a smaller one are only
reported with -Wconversion
. In that case, we see:
warning: conversion from ‘int’ to ‘char’ may change value [-Wconversion] 9 | char c = i; | ^
There are other classes of warnings that have their own settings --
a careful reading of the gcc
documentation will indicate
what they are, and whether you ought to be concerned about them.
Closing remarks
I hope I've shown how much easier life is, if you configure your compiler to report warnings freely. In practice, warnings often indicate logical errors. At worst, they show constructions that are likely to confuse another person looking at your code, which is no bad thing.
A question that sometimes arises is: if -Wall
doesn't
really show all warnings, is there a setting that does?
So far as I know, there isn't. Enabling all possible warnings in
gcc
(and there are hundreds of them) would be
extremely pedantic. Still, I have to confess that I was taken by
surprise to find that -Wall
does not include
warnings about assigning a value to a variable that does not have
sufficient range to contain it.
Finally: a useful adjust to -Wall
, etc., is
-Werror
. This options makes gcc
treat
all all warnings as errors, and stop the compiler in its tracks.
This can be useful when compiling a large, complex program, because
otherwise warning messages will tend to get lost in the rest of
the output.