C, C++ Programming Tips: Vishal Patil Summer 2003
Vishal Patil
Summer 2003
Contents
1 Introduction
5 Build options
7.4 Use exceptions
7.5 Virtual functions
7.6 Don't ignore API function return values
7.7 Be consistent
7.8 Make your code const correct
7.8.1 The many faces of const
7.8.2 Understanding the const_cast operator
7.8.3 const and data hiding
8 References
Introduction
This document gives a quick overview of various aspects of software development using C and C++. The document is in an ad hoc form and lacks
a proper structure and flow of information. However, I shall be revising the
structure from time to time as I add new information to it.
2
2.1
test suite is when you find a bug while ordinarily using the library. Then,
before you even fix the bug, write a test program that detects the bug.
Then go fix it. This way, as you add new features to your libraries you have
insurance that they won't reawaken old bugs.
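As a hedged illustration of this habit, the sketch below assumes a hypothetical library routine stats_mean that once misbehaved on empty input. Both the routine and the bug are invented for illustration; the point is only that the test written to detect the bug stays in the suite after the fix.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical library routine: arithmetic mean of a sequence.
// Assumed past bug: division by zero for an empty sequence.
double stats_mean(const std::vector<double>& values)
{
    if (values.empty())
        return 0.0;  // the fix: define the mean of nothing as 0

    double sum = 0.0;
    for (std::size_t i = 0; i < values.size(); ++i)
        sum += values[i];
    return sum / static_cast<double>(values.size());
}

// Regression test written when the bug was found; returns true on success.
bool test_mean_of_empty_input()
{
    return stats_mean(std::vector<double>()) == 0.0;
}

// Ordinary behaviour is checked as well, so the fix cannot silently break it.
bool test_mean_of_three_values()
{
    std::vector<double> v;
    v.push_back(1.0);
    v.push_back(2.0);
    v.push_back(3.0);
    return stats_mean(v) == 2.0;
}
```

A test program for the library would simply call each test function and report any that return false.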
Please keep documentation up to date as you go. The best time to write
documentation is right after you get a few new test programs working. You
might feel that you are too busy to write documentation, but the truth of
the matter is that you will always be too busy. After long hours debugging
these seg faults, think of it as a celebration of triumph to fire up the editor
and document your brand-spanking new cool features.
Please make sure that computational code is completely separated from
I/O code so that someone else can reuse your computational code without
being forced to also follow your I/O model. Then write programs that
invoke your collection of libraries to solve various problems. By dividing
and conquering the problem library by library with a test suite for each
step along the way, you can write good and robust code. Also, if you are
developing numerical software, please don't expect that other users of your
code will be getting a high while entering data for your input files. Instead
write an interactive utility that will allow users to configure input files in a
user friendly way. Granted, this is too much work in Fortran. Then again,
you do know more powerful languages, don't you?
Examples of useful libraries are things like linear algebra libraries, general ODE solvers, interpolation algorithms, and so on. As a result you end
up with two packages. A package of libraries complete with a test suite, and
a package of applications that invoke the libraries. The package of libraries
is well-tested code that can be passed down to future developers. It is code
that won't have to be rewritten if it's treated with respect. The package
of applications is something that each developer will probably rewrite since
different people will probably want to solve different problems. The effect of
having a package of libraries is that C++ is elevated to a Very High Level
Language that's closer to the problems you are solving. In fact a good rule
of thumb is to make the libraries sufficiently sophisticated so that each executable that you produce can be expressed in one source file. All this may
sound like common sense, but you will be surprised at how many scientific
developers maintain just one does-everything-program that they perpetually
hack until it becomes impossible to maintain. And then you will be even
more surprised when you find that some professors don't understand why a
simple mathematical modification of someone else's code is taking you so
long.
Every library must have its own directory and Makefile. So a library
package will have many subdirectories, each directory being one library.
And perhaps if you have too many of them, you might want to group them
even further down. Then there are the applications. If you've done everything
right, there should be enough stuff in your libraries to enable you to have
one source file per application. Which means that all the source files can
probably go down under the same directory.
Very often you will come to a situation where there's something that
your libraries to date can't do, so you implement it and stick it in
your source file for the application. If you find yourself cutting and pasting that
implementation into other source files, then this means that you have to put
it in a library somewhere. And if it doesn't belong to any library you've
written so far, maybe in a new library. When you are in a deadline crunch,
there's a tendency not to do this since it's easier to cut and paste. The
problem is that if you don't take action right then, eventually your code will
degenerate to a hard-to-use mess. Keeping the entropy down is something
that must be done on a daily basis.
In this section I shall explain how to manage multi-file C and C++ projects using
the automake and autoconf tools. These tools spare the developer
the tedium of writing complicated Makefiles for large projects and also
provide portability across various platforms. These tools have been specifically
designed for managing GNU projects. Software developed using automake
and autoconf needs to adhere to the GNU software engineering principles.
3.1
3.2
Enabling Portability
#ifdef HAVE_CONFIG_H
#include <config.h>
#endif
The config.h file is generated by the configure script from a template, config.h.in,
and the HAVE_CONFIG_H flag is passed along with the -D option of the
compiler by the generated Makefiles at the time of building the project. Use
the autoheader utility to generate the config.h.in template for the project.
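A minimal sketch of how the macros in config.h are typically consumed is given below. It assumes that configure.ac also contains an AC_CHECK_HEADERS([unistd.h]) check, which defines HAVE_UNISTD_H when the header is found; that check is not part of the sample project in this document. The function name and the fallback value are likewise only illustrative.

```cpp
#ifdef HAVE_CONFIG_H
#include <config.h>   // generated by the configure script
#endif

#ifdef HAVE_UNISTD_H  // assumed to come from AC_CHECK_HEADERS([unistd.h])
#include <unistd.h>
#endif

// Returns the memory page size, or a guessed default on platforms
// where it cannot be queried.
long page_size_or_default()
{
#if defined(HAVE_UNISTD_H) && defined(_SC_PAGESIZE)
    long size = sysconf(_SC_PAGESIZE);
    if (size > 0)
        return size;
#endif
    return 4096;  // assumed fallback value, for illustration only
}
```

The portable code path is always present, and the platform-specific one is compiled in only when configure has confirmed that it is available.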
3.3
3.4
A brief example
I shall explain the use of these tools with the help of a small sample project.
The project will involve creation of an executable file called main (source
main.c) which will call functions from the user-generated libraries geom (source
circle.c) and stats (source mean.c). The code for the main executable
is present in the src directory, while that for the geom and stats libraries
is present inside the subdirectories geom and stats respectively, inside the src
directory. Thus this project will cover compiling source files, creating
static libraries and linking the libraries to create the final executable.
3.5
The configure.ac file is used by the autoconf tool to generate the platform-specific
configure script.
The configure.ac file for the example is shown below.
AC_INIT(reconf)
AM_CONFIG_HEADER(config.h)
AM_INIT_AUTOMAKE(test,0.1)
AC_PROG_CC
AC_PROG_RANLIB
AC_PROG_INSTALL
AC_CONFIG_FILES([Makefile
doc/Makefile
m4/Makefile
src/Makefile
src/geom/Makefile
src/stats/Makefile
lib/Makefile
])
AC_OUTPUT
In the above sample configure.ac the parameters passed to the AM_INIT_AUTOMAKE
macro, namely test and 0.1, represent the package name and version
number respectively.
The AC_CONFIG_FILES macro needs to be passed the paths of
the various Makefiles which need to be generated for the various
subdirectories. Please note that a Makefile.am must be present in each
subdirectory under the project directory.
3.6
A Makefile.am is a set of assignments. These assignments imply the Makefile, a set of targets, dependencies and rules, and the Makefile implies the
execution of building. The first set of assignments looks like this:
INCLUDES = -I/geom -I/stats ....
LDFLAGS = -L/geom -L/stats ....
LDADD = -lgeom -lstats ...
The INCLUDES assignment is where you insert the -I flags that you need
to pass to your compiler. If the stuff in this directory is dependent on
a library in another directory of the same package, then the -I flag
must point to that directory.
The LDFLAGS assignment is where you insert the -L flags that are
needed by the compiler when it links all the object files to an executable.
The LDADD assignment is where you list a long set of installed libraries
that you want to link in with all of your executables. Use the -l flag
only for installed libraries. You can list libraries that have been built
but not installed yet as well, but do this only by providing the full
path to these libraries.
If your package contains subdirectories with libraries and you want to link
these libraries in another subdirectory you need to put -I and -L flags in
the two variables above. To express the path to these other subdirectories,
use the $(top_srcdir) variable. For example if you want to access a library
under src/geom you can put something like:
INCLUDES = ... -I$(top_srcdir)/src/geom ...
LDFLAGS = ... -L$(top_srcdir)/src/geom ...
in the Makefile.am of every directory level that wants access to these libraries. Also, you must make sure that the libraries are built before the
directory level is built. To guarantee that, list the library directories in
SUBDIRS before the directory levels that depend on them. One way to do
this is to put all the library directories under a lib directory and all the
executable directories under a bin directory, and in the Makefile.am for
the directory level that contains lib and bin list them as:
SUBDIRS = lib bin
3.6.1
You need to declare the set of files that are sources of the program, the set
of libraries that must be linked with the program and (optionally) a set of
dependencies that need to be built before the program is built. These are
declared in assignments that look like this:
main_SOURCES = main.c
main_LDADD = -lgeom -lstats
main_LDFLAGS = -L$(top_srcdir)/src/geom -L$(top_srcdir)/src/stats
main_DEPENDENCIES = geom stats
main_SOURCES : Here you list all the source (*.c or *.cc) and *.h files that compose the source code of the program. The presence of a header file
here doesn't cause the file to be installed at /prefix/include but it
does cause it to be added to the distribution when you do make dist.
To cause header files to be installed you must also put them in include_HEADERS.
main_LDADD : Here you add primarily the -l flags for linking whatever
libraries are needed by your code. You may also list object files which
have been compiled in an exotic way, as well as paths to libraries that are
not yet installed.
main_LDFLAGS : Here you add the -L flags that are needed to resolve
the libraries you passed in main_LDADD. Certain flags that need to
be passed to every program can be expressed on a global basis by
assigning them to LDFLAGS.
main_DEPENDENCIES : If for any reason you want certain other
targets to be built before building this program, you can list them
here.
3.6.2
Build options
You need to run the configure script before building the project using
make. After successfully running the configure script the following options
are available for make:
make Builds the project and creates the executables and libraries.
make clean Cleans the project, i.e. removes the object files, libraries and executables produced by the build.
make install Builds and installs the project, i.e. the executable is
copied to /prefix/bin, headers to /prefix/include and libraries to
/prefix/lib, where prefix is usually /usr/local.
make uninstall Uninstalls the project, i.e. removes the files added to
the /prefix/bin, /prefix/include and /prefix/lib directories.
make dist Creates a distribution of the project (a <package-name>-<version>.tar.gz file).
6
6.1
In C, variables and functions are by default public, so that any C source file
may refer to global variables and functions from another C source file. This
is true even if the file in question does not have a declaration or prototype
for the variable or function. You must, therefore, ensure that the same symbol name is not used in two different files. If you don't do this you will get
linker errors and possibly warnings during compilation.
One way of doing this is to prefix public symbols with some string which
depends on the source file they appear in. For example, all the routines in
gfx.c might begin with the prefix gfx_. If you are careful with the way you
split up your program, use sensible function names, and don't go overboard
with global variables, this shouldn't be a problem anyway.
To prevent a symbol from being visible from outside the source file it is
defined in, prefix its definition with the keyword static. This is useful for
small functions which are used internally by a file, and won't be needed by
any other file.
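The two techniques above, prefixing public symbols and hiding internal ones with static, can be sketched as follows. The file name and both function names are hypothetical.

```cpp
// gfx.cpp (hypothetical source file)

// Internal helper: 'static' gives it internal linkage, so another source
// file may define its own clamp() without a link-time clash.
static int clamp(int value, int low, int high)
{
    if (value < low)
        return low;
    if (value > high)
        return high;
    return value;
}

// Public routine: carries the gfx_ prefix because other files can see it.
int gfx_brighten(int channel, int amount)
{
    return clamp(channel + amount, 0, 255);
}
```

Only gfx_brighten would be declared in the corresponding header; clamp stays private to the source file.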
6.2
A header file is literally substituted into your C code in place of the #include statement. Consequently, if the header file is included in more than
one source file all the definitions in the header file will occur in both source
files. This causes them to be defined more than once, which gives a linker
error (see above).
Solution: don't define variables in header files. You only want to declare
them in the header file, and define them (once only) in the appropriate C
source file, which should #include the header file of course for type checking.
The distinction between a declaration and a definition is easy to miss for
beginners; a declaration tells the compiler that the named symbol should
exist and should have the specified type, but it does not cause the compiler
to allocate storage space for it, while a definition does allocate the space. To
make a declaration rather than a definition, put the keyword extern before
the definition.
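A minimal sketch of the declaration/definition split is given below. It is shown in one listing for brevity; the variable gfx_frame_count, the function gfx_next_frame and the file names in the comments are hypothetical.

```cpp
// --- would live in gfx.h (hypothetical header) ---
extern int gfx_frame_count;   // declaration: no storage is allocated
int gfx_next_frame();         // function declaration (prototype)

// --- would live in gfx.cpp, which includes gfx.h ---
int gfx_frame_count = 0;      // definition: storage allocated exactly once

int gfx_next_frame()
{
    return ++gfx_frame_count;
}
```

Any number of source files may include the header, since it only declares the symbols; the linker sees a single definition, the one in the source file.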
So, if we have an integer called counter which we want to be publicly
6.3
Consider what happens if a C source file includes both a.h and b.h, and also
a.h includes b.h (which is perfectly sensible; b.h might define some types
that a.h needs). Now, the C source file includes b.h twice. So every #define
in b.h occurs twice, every declaration occurs twice (not actually a problem),
every typedef occurs twice, etc. In theory, since they are exact duplicates it
shouldn't matter, but in practice it is not valid C and you will probably get
compiler errors or at least warnings.
The solution to this problem is to ensure that the body of each header
file is included only once per source file. This is generally achieved using
preprocessor directives. We will define a macro for each header file, as we
enter the header file, and only use the body of the file if the macro is not
already defined. In practice it is as simple as putting this at the start of
each header file:
#ifndef FILENAME_H
#define FILENAME_H
and then putting this at the end of it:
#endif
replacing FILENAME_H with the (capitalised) filename of the header file,
using an underscore instead of a dot. Some people like to put a comment
after the #endif to remind them what it is referring to, e.g.
#endif /* #ifndef FILENAME_H */
Personally I don't do that since it's usually pretty obvious, but it is a matter
of style.
You only need to do this trick to header files that generate the compiler
errors, but it doesn't hurt to do it to all header files.
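The effect of the guard can be sketched by writing out what the compiler sees when a guarded header is included twice. The header name point.h and its contents are assumptions made for illustration.

```cpp
// First inclusion of a hypothetical point.h: POINT_H is not yet defined,
// so the body is compiled and the macro is set.
#ifndef POINT_H
#define POINT_H
struct Point { int x; int y; };
#endif

// Second inclusion (for example via another header): POINT_H is already
// defined, so the preprocessor skips the body and Point is not redefined.
#ifndef POINT_H
#define POINT_H
struct Point { int x; int y; };
#endif

// Uses the single definition of Point that survived preprocessing.
int point_sum(const Point& p)
{
    return p.x + p.y;
}
```

Without the #ifndef lines the second copy of struct Point would be a redefinition and the file would not compile.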
7
7.1
7.2
There are lots of error conditions that happen in the normal life of a program.
For instance, file not found, out of memory, or invalid user input. You should
always handle these conditions gracefully (by re-prompting for a filename,
by freeing memory or telling the user to quit other applications, or by telling
the user there is an error in his input, respectively). However, there are other
conditions which are not real error conditions, but are the result of bugs.
For example, say you have a routine which copies a string into a buffer, and
no one is supposed to pass in a NULL pointer to the routine. You do not
want to do something like this:
void CopyString(char* szBuffer, int nBufSize)
{
    if (NULL == szBuffer)
        return; // quietly fail if NULL pointer
    else
    {
        strncpy(szBuffer, "Hello", nBufSize);
    }
}
7.3
Use asserts liberally in debug builds. You normally don't want to put assert
code in release builds, because you don't want the user to see your bug
messages. ANSI C (that's ISO for you purists) provides the assert() macro
in assert.h - if the symbol NDEBUG is defined somewhere before assert.h
is included, then assert() will have no effect. Otherwise, it will print out
a diagnostic message and abort the program if its argument evaluates to
false (zero). Since a NULL pointer compares equal to zero, you can use assert() on pointers to test them
for non-NULL:
void myFunction(char* szFoo)
{
    assert(szFoo); // same as assert(NULL != szFoo);
}
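Applied to the CopyString routine shown earlier, the same idea might look like the sketch below. The name CopyStringChecked and the explicit termination of the buffer are additions made for illustration, not the original author's code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Variant of CopyString that treats a NULL buffer as a bug (caught by
// assert in debug builds) rather than failing silently.
void CopyStringChecked(char* szBuffer, std::size_t nBufSize)
{
    assert(szBuffer != NULL);  // programming error, not a runtime condition
    assert(nBufSize > 0);      // need room for at least the terminator

    std::strncpy(szBuffer, "Hello", nBufSize - 1);
    szBuffer[nBufSize - 1] = '\0';  // strncpy does not always terminate
}
```

Compiling with -DNDEBUG removes both checks, so they cost nothing in a release build.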
7.4
Use exceptions
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
7.5
Virtual functions
7.6
Don't ignore API function return values
Most API functions will return a particular value which represents an error.
You should test for these values every time you call the API function. If you
don't want to clutter your code with error-testing, then wrap the API
call in another function (do this when you are thinking about portability,
too) which tests the return value and either asserts, handles the problem,
or throws an exception. The above example of OpenDataFile is a primitive
way of wrapping fopen with error-checking code which throws an exception
if fopen fails.
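The OpenDataFile listing is not fully preserved in this copy of the document, so the following is only a sketch of what such a wrapper might look like, not the original code. It assumes a C++11 compiler; the FileCloser and FileHandle names and the error message are hypothetical.

```cpp
#include <cstdio>
#include <memory>
#include <stdexcept>
#include <string>

// Deleter that closes the file when the owning handle goes out of scope.
struct FileCloser
{
    void operator()(std::FILE* fp) const
    {
        if (fp != nullptr)
            std::fclose(fp);
    }
};

typedef std::unique_ptr<std::FILE, FileCloser> FileHandle;

// Hypothetical wrapper: opens a file for reading or throws on failure,
// so callers never receive an unchecked NULL from fopen.
FileHandle OpenDataFile(const std::string& path)
{
    std::FILE* fp = std::fopen(path.c_str(), "r");
    if (fp == nullptr)
        throw std::runtime_error("cannot open data file: " + path);
    return FileHandle(fp);
}
```

The return value of fopen is tested in exactly one place, and every caller either gets a usable file or an exception it can handle at the appropriate level.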
7.7
Be consistent
Be consistent in the way you write your code. Use the same indentation and
bracketing style everywhere. If you put the constant on the left in a conditional, do it everywhere. If you assert on your pointers, do it everywhere.
Use the same kind of comment style for the same kind of comments. If you
are the type to go in for a naming convention (like Hungarian notation),
then you have to stick to it everywhere. Don't do int iCount in one place
and int nCount in another.
7.8
Make your code const correct
7.8.1
The many faces of const
const int x;      // constant int
x = 2;            // illegal: x is const and cannot be assigned to
The const keyword is more involved when used with pointers. A pointer
is itself a variable which holds a memory address of another variable - it
can be used as a handle to the variable whose address it holds. Note that
there is a difference between a read-only handle to a changeable variable
and a changeable handle to a read-only variable.
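The two kinds of handle can be sketched as follows. The function and variable names are hypothetical, and the commented-out lines are the ones the compiler would reject.

```cpp
// Demonstrates both kinds of handle and returns 10 + 2.
int const_handles_demo()
{
    int a = 1;
    int b = 2;

    // Read-only handle to a changeable variable: the pointer is fixed,
    // but the int it points at may be modified through it.
    int* const fixedHandle = &a;
    *fixedHandle = 10;        // OK: a is now 10
    // fixedHandle = &b;      // error: the handle itself is const

    // Changeable handle to a read-only variable: the pointer may be
    // re-aimed, but nothing can be written through it.
    const int* movableHandle = &a;
    movableHandle = &b;       // OK: now points at b
    // *movableHandle = 3;    // error: target is read-only via this handle

    return *fixedHandle + *movableHandle;
}
```

Reading the declaration from right to left helps: int* const is a const pointer to int, while const int* is a pointer to a const int.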
7.8.2
Understanding the const_cast operator
const int x = 4;
const int* pX = &x;
// prints "4"
// result is undefined
// who knows what it prints?
The const_cast operator is more specific than normal type-casts because
it can only be used to remove the const-ness of a variable, and trying to
change its type in other ways is a compile error. For instance, say that you
changed x in the above example to a double and changed pX to double*,
but the variable pX2 is still cast as
int* pX2 = ( int* ) (pX);
The code would still compile, but pX2 would be treating the double as an int.
Reading or writing through it is undefined behaviour and at best yields meaningless values, and
the code would certainly be confusing. Also, if you were using user-defined
classes instead of numeric types, the code would still compile, but it would
almost certainly crash your program. If you use const_cast, you can be sure
that the compiler will only let you change the const-ness of a variable, and
never its type.
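A minimal sketch of a well-defined use of const_cast is shown below. Unlike the example above, the underlying object here is not itself declared const, which is what makes writing through the cast pointer legal. The function and variable names are hypothetical.

```cpp
// Well-defined use: the underlying object was never declared const.
int write_through_const_view()
{
    int value = 4;
    const int* view = &value;                // read-only view of value
    int* writable = const_cast<int*>(view);  // removes const only
    *writable = 5;                           // fine: value itself is not const

    // double* bad = const_cast<double*>(view);  // would not compile:
    //                                           // const_cast cannot change the type
    return value;
}
```

The commented-out line shows the protection described above: the C-style cast would accept the change of type, whereas const_cast rejects it at compile time.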
7.8.3
const and data hiding
References
http://www.gmonline.demon.co.uk/cscene/CS2/CS2-01.html
http://autotoolset.sourceforge.net/tutorial.html