
Introduction to Compiling Code in Linux


1 Introduction

The purpose of this repo is to walk through a few examples of tools one commonly encounters when compiling code on a Linux machine. We will be working with a few simple example code snippets, compiling them using different tools.

For consistency with other courses, I've borrowed examples and descriptions from the ME 333 textbook (used in the required MSR course). Specifically, I've borrowed content from Appendix A: A Crash Course in C, which is available in the book's sample chapters. If you are unfamiliar with basic C code or, more importantly, with the compilation procedure in general, it is highly recommended that you read that appendix before studying this document.

Throughout this document we will be working with this repo: https://github.com/NU-MSR/msr_toolchain_demos

2 Basic GCC Examples

2.1 Compile and run invest.c

To compile this example, we will directly call gcc with a minimum number of arguments. The GNU Compiler Collection, gcc, is a set of compiler frontends for a few languages (C, C++, Objective-C, Fortran, Ada, and Go), as well as implementations of common libraries for these languages. To compile invest.c we will run the following commands from the root of the Git repo we are working with today.

cd invest
mkdir bin
gcc invest.c -o bin/invest

Then we could run the code with ./bin/invest. Note that when the executable asks for input, it expects three space-separated fields. We put the executable in the bin directory so that Git doesn't try to track it.

Note that you could use the file command to figure out that this executable really is an ELF 64-bit LSB shared object (modern GCC produces position-independent executables by default). Also note that GCC has automatically set the executable permission bits on this file.

2.2 Compiling with multiple files

In this example, we will compile an executable made up of a header file declaring some functions, a source file defining those functions, and a main file using those functions.

cd helper
mkdir bin
gcc main.c helper.c -lm -o bin/geocalcs

Then we could run this with ./bin/geocalcs. Here we are using -o to give our executable a more descriptive name than something generic like main. We did not need the -I switch to tell GCC where to search for the include file because it is in the current directory. We passed both *.c files in a single call to GCC, and it automatically compiled and linked both to produce a single executable. The -lm switch is actually an option passed to the linker that tells GCC it needs to link against the library libm.so (we'll talk more about this linking process later). For now, note that the primary reason we need it is that helper.c includes math.h, and math.h contains only the declarations of the standard math functions, not their definitions. Included with GCC is a pre-compiled library containing the compiled definitions of these functions, and in order to use them we need to tell the linker to link against that library.

We could easily call gcc multiple times to produce object files and then call gcc again to link the object files into a single executable.

gcc -c helper.c -o bin/helper.o
gcc -c main.c -o bin/main.o
gcc bin/*.o -lm -o bin/geocalcs

The primary reason we might want to do this is to speed up subsequent compilations. The -c option tells GCC to "compile or assemble the source files, but do not link". Imagine we had many files, each defining functions used by a single file containing main(). If we break the compilation into many separate .o files, then after editing a single .c file we only have to re-compile that one .o file and re-link all the .o files into the executable. If we did it in one step, we'd have to re-process all of the files. Makefiles (discussed below) are a good way to automate generating the many object files, and to manage the dependencies determining what needs to be recompiled when any given part of a large project changes. Also note that tools like ccache are easy to incorporate into a project and can greatly speed up compilation.

2.3 Compiling a static library

In this example, we will compile helper.c into a static library that could then be statically linked into any program. On Linux, and in GCC itself, some libraries are distributed this way. The advantage of static libraries is that they only need to be compiled once, and then they can easily be incorporated into any executable just by changing the linking options passed to GCC. One disadvantage is that if you end up changing the static library, every executable that uses it must be re-linked to incorporate the change. Another disadvantage is that they use more disk space and memory, because you may end up effectively storing or loading many copies of the library's code.

gcc -c helper.c -o bin/helper.o
ar rcs bin/libhelper.a bin/helper.o
gcc main.c bin/libhelper.a -lm -o bin/geocalcs

In the above we used ar to create the static library. From ar --help, the rcs options do the following:

  • r: replace existing or insert new file(s) into the archive
  • c: do not warn if the library had to be created
  • s: add an index of the archive's symbols (equivalent to running ranlib on the archive)

In the gcc command used for linking, we simply provided the full path to the static library we created. The more common and flexible way to do this is to add the library's directory to the linker search path with -L and then link using the library's name, just as we did with -lm. Here's an example:

gcc -c helper.c -o bin/helper.o
ar rcs bin/libhelper.a bin/helper.o
gcc main.c -L./bin -lhelper -lm -o bin/geocalcs

One other important note on the GCC options we've been passing:

Generally, the order of arguments to GCC does not matter, except for the arguments involved in the linking step: specifically, the -l option for specifying libraries to link against, and the -Wl option for passing arguments to the linker. A general rule of thumb is to list these arguments last. Here's a good answer explaining why.

2.4 Compiling a dynamic library

Dynamic libraries are loaded at runtime, and once they are loaded, they can stay in system memory and be used by any subsequent executable that needs access to the library. This saves memory and disk space, and allows us to automatically roll out library updates to all executables that use a given library at the same time. This style of library management is far more common on Linux. An excellent document describing how shared libraries are used, how they are loaded at runtime, and how they are named is The Linux Documentation Project's Shared Libraries page. As another excellent reference, check out The Inside Story on Shared Libraries and Dynamic Loading from a 2001 issue of Computing in Science and Engineering.

gcc -c -fpic helper.c  -o bin/helper.o
gcc -shared bin/helper.o -o bin/libhelper.so
gcc main.c -L./bin -lhelper -lm -o bin/geocalcs

Note that at this point if we run the script we will have an error:

jarvis@test2018:~/msr_toolchain_demos/helper⟫ ./bin/geocalcs
./bin/geocalcs: error while loading shared libraries: libhelper.so: cannot open shared object file: No such file or directory

This error is because the runtime dynamic linker, ld.so, does not know where to look for the newly created libhelper.so. From the ld.so man page, there are a few directories that are searched by ld.so. A rough description of these locations, in order, is as follows:

  1. Value of DT_RPATH section of the binary – this option is deprecated, and up-to-date versions of GCC don't set this unless forced to with -Wl,--disable-new-dtags
  2. Values in the environment variable LD_LIBRARY_PATH
  3. Value of DT_RUNPATH section of the binary – this is a value that can be set when compiling to store a search path directly in the binary
  4. From the cache file /etc/ld.so.cache
  5. In the default path /lib, and then /usr/lib. (On some 64-bit architectures, the default paths for 64-bit shared objects are /lib64, and then /usr/lib64.) If the binary was linked with the -z nodeflib linker option, this step is skipped.

If a combination of these strategies is not enough to help ld.so find the necessary libraries, then you typically see an error like the one above (cannot open shared object file: No such file or directory). A very helpful tool for seeing which libraries are found, and which are missing, is ldd. Simply run ldd <NAME OF EXECUTABLE OR OTHER *.SO FILE> and it will print all required libraries and where each was found (or not found).

Let's see all of these options in action.

2.4.1 Using LD_LIBRARY_PATH

We can use the LD_LIBRARY_PATH environment variable to modify where the runtime linker searches for libraries to dynamically load. See examples below:

jarvis@test2018:~/msr_toolchain_demos/helper⟫ gcc main.c -L./bin -lhelper -lm -o bin/geocalcs
jarvis@test2018:~/msr_toolchain_demos/helper⟫ ./bin/geocalcs
./bin/geocalcs: error while loading shared libraries: libhelper.so: cannot open shared object file: No such file or directory
jarvis@test2018:~/msr_toolchain_demos/helper⟫ export LD_LIBRARY_PATH=$(pwd)/bin
jarvis@test2018:~/msr_toolchain_demos/helper⟫ echo $LD_LIBRARY_PATH
/home/jarvis/Desktop/msr_toolchain_demos/helper/bin
jarvis@test2018:~/msr_toolchain_demos/helper⟫ ./bin/geocalcs
Pi is approximated as 3.14159260000000006840537.
The surface area of the sphere is 113.0973.
The volume of the sphere is 113.0973.
jarvis@test2018:~/msr_toolchain_demos/helper⟫ unset LD_LIBRARY_PATH
jarvis@test2018:~/msr_toolchain_demos/helper⟫ ./bin/geocalcs
./bin/geocalcs: error while loading shared libraries: libhelper.so: cannot open shared object file: No such file or directory
jarvis@test2018:~/msr_toolchain_demos/helper⟫ LD_LIBRARY_PATH=./bin/ ./bin/geocalcs
Pi is approximated as 3.14159260000000006840537.
The surface area of the sphere is 113.0973.
The volume of the sphere is 113.0973.

Often if I find myself regularly needing to update the LD_LIBRARY_PATH variable for a particular executable, I'll either create an alias/script that runs the executable while ensuring the LD_LIBRARY_PATH is set, or I'll figure out how to set the DT_RUNPATH if the missing library is a project I'm compiling.
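For instance, a wrapper script along these lines keeps the variable scoped to a single program (hypothetical paths; it assumes the libraries live in a bin/ directory next to the script):

```shell
cd "$(mktemp -d)"
# Hypothetical wrapper for the geocalcs example; adjust paths to taste
cat > run_geocalcs.sh <<'EOF'
#!/bin/sh
# Find the directory this script lives in, then point the runtime linker
# at the bundled libraries only for this one process.
dir="$(dirname "$(readlink -f "$0")")"
LD_LIBRARY_PATH="$dir/bin${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" \
    exec "$dir/bin/geocalcs" "$@"
EOF
chmod +x run_geocalcs.sh
```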

2.4.2 Setting DT_RUNPATH

We can tell gcc to store a particular path as the DT_RUNPATH field in the executable it creates, and then this will automatically be searched when running the executable. See example below:

jarvis@test2018:~/msr_toolchain_demos/helper⟫ gcc main.c -L./bin -lhelper -lm -Wl,-rpath,$(pwd)/bin -o bin/geocalcs
jarvis@test2018:~/msr_toolchain_demos/helper⟫ ./bin/geocalcs
Pi is approximated as 3.14159260000000006840537.
The surface area of the sphere is 113.0973.
The volume of the sphere is 113.0973.
jarvis@test2018:~/msr_toolchain_demos/helper⟫ readelf -d bin/geocalcs | grep PATH
 0x000000000000001d (RUNPATH)            Library runpath: [/home/jarvis/Desktop/msr_toolchain_demos/helper/bin]
jarvis@test2018:~/msr_toolchain_demos/helper⟫ ldd bin/geocalcs
        linux-vdso.so.1 (0x00007ffe6c782000)
        libhelper.so => /home/jarvis/Desktop/msr_toolchain_demos/helper/bin/libhelper.so (0x00007fdac2b6e000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fdac27d0000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fdac23df000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fdac2f72000)

Note that this option is often turned on by default in CMake. The CMake wiki has a good description of the CMake behaviors regarding the DT_RUNPATH.

2.4.3 Using cached library locations

The tool ldconfig is responsible for keeping a cache of what shared object files are available in a set of trusted directories, a set of directories specified in /etc/ld.so.conf, and directories specified on the command line. When updating your system with apt-get you'll often see a message about running ldconfig to update the cache. This cache is used by ld.so rather than actually scanning the system (which would be slow). Let's update this cache to include our new library:

jarvis@test2018:~/msr_toolchain_demos/helper⟫ gcc main.c -L./bin -lhelper -lm -o bin/geocalcs
jarvis@test2018:~/msr_toolchain_demos/helper⟫ ./bin/geocalcs
./bin/geocalcs: error while loading shared libraries: libhelper.so: cannot open shared object file: No such file or directory
127 jarvis@test2018:~/msr_toolchain_demos/helper⟫ sudo ldconfig $(pwd)/bin
[sudo] password for jarvis:
jarvis@test2018:~/msr_toolchain_demos/helper⟫ ./bin/geocalcs
Pi is approximated as 3.14159260000000006840537.
The surface area of the sphere is 113.0973.
The volume of the sphere is 113.0973.

Note that in the above usage, I've only temporarily updated the cache. The next time ldconfig runs without the project's bin directory explicitly included, that directory will be dropped from the cache. So the above usage is great for transiently solving a problem, but it is not permanent. You could make the change permanent by editing the /etc/ld.so.conf file, but you shouldn't do this regularly; rather, treat it as a last resort for when you really do need to permanently add a library in a nonstandard location. For example, say you compiled some open-source project that you'll be linking against regularly, but you are nervous about installing it system-wide because you are afraid it will conflict with other libraries on your system. Then this might be a great option.
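On Debian-style systems, /etc/ld.so.conf typically just contains include /etc/ld.so.conf.d/*.conf, so if you do need a permanent entry, a drop-in file is cleaner than editing /etc/ld.so.conf itself. A sketch with a hypothetical path (requires root; shown for illustration only, not run here):

```shell
# Make a nonstandard library directory permanently visible to ld.so
# (/opt/myproject/lib is a made-up example path):
echo "/opt/myproject/lib" | sudo tee /etc/ld.so.conf.d/myproject.conf
sudo ldconfig    # rebuild the cache so the new directory takes effect
```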

2.5 Library names

You've likely noticed that the names we use to refer to libraries when compiling and when runtime linking don't quite match the actual filenames. For example, we use -lhelper to link against libhelper.so, and -lm to link against libm.so. This pattern is very common. A library has a "library name" (e.g. helper) that we use when we tell the compiler we want to link against it; the compiler automatically adds the "lib" prefix and the ".so" suffix when it searches for files (it can also try ".a" for static libraries). Note that we could also provide a full filename to the -l option by adding a colon, e.g. -l:libhelper.so.

So some mapping may occur between library names and actual filenames when linking during compilation (typically just adding the "lib" and ".so"), but there is another mapping that happens during runtime linking, and ldconfig plays an important role in resolving it. Every shared library has an attribute called SONAME, which consists of the "lib", the library name, the ".so", and a final ".X" indicating a binary-compatibility version. The idea is to allow multiple versions of a library with incompatible function definitions to coexist, while letting the runtime linker ensure you link against a compatible version. So very often on Linux you'll see the actual file for a shared library named with the SONAME plus additional version information appended. That file's SONAME attribute specifies the correct SONAME, and when ldconfig runs it creates a symbolic link, named after the SONAME, pointing at the actual file.

Additionally, package maintainers often go the extra step of providing another symbolic link that completely strips the versioning, for the specific purpose of providing an easy way to link against the library during compilation.

For example, let's look at the libraries installed for libtar, whose package description reads: "C library for manipulating tar archives. libtar allows programs to create, extract and test tar archives. It supports both the strict POSIX tar format and many of the commonly-used GNU extensions." On my system there is an apt-get package called libtar0. The "0" at the end indicates that the SONAME we get by installing this package will specifically be for version "0". If I install libtar0 using apt-get, I get a new file of interest on my machine: /usr/lib/libtar.so.0.0.0. The install scripts that ship with this package run ldconfig after installing the relevant files. ldconfig knows that /usr/lib is one of the directories it should add to its cache and manage links for, so it scans the directory, notices this new file, reads its SONAME field, and realizes it needs to create a symbolic link: /usr/lib/libtar.so.0, pointing at the libtar.so.0.0.0 file.

Additionally, the maintainers of this package have created a package called libtar-dev; the -dev tells us we should install this package if we want to "develop" our own code that links against libtar. This package does two things: (1) it provides a static libtar.a that we could statically link against; (2) it creates a symbolic link, just /usr/lib/libtar.so, pointing at the libtar.so.0.0.0 file, which gcc can easily find when linking against the tar library at compile time. The command-line printout below demonstrates this example:

jarvis@test2018:~⟫ cd /usr/lib
jarvis@test2018:/usr/lib⟫ ll libtar* # below we can see the symbolic links that exist on my machine:
-rw-r--r-- 1 root root 67K Oct 11  2016 libtar.a
lrwxrwxrwx 1 root root  15 Oct 11  2016 libtar.so -> libtar.so.0.0.0
lrwxrwxrwx 1 root root  15 Nov 21 08:16 libtar.so.0 -> libtar.so.0.0.0
-rw-r--r-- 1 root root 35K Oct 11  2016 libtar.so.0.0.0
jarvis@test2018:/usr/lib⟫ readelf -d libtar.so.0.0.0 |grep SONAME
 0x000000000000000e (SONAME)             Library soname: [libtar.so.0]
jarvis@test2018:/usr/lib⟫ apt-file search /usr/lib/libtar.so # we can use "apt-file" to search for which packages provide particular files
libtar-dev: /usr/lib/libtar.so
libtar0: /usr/lib/libtar.so.0
libtar0: /usr/lib/libtar.so.0.0.0
jarvis@test2018:/usr/lib⟫ sudo rm libtar.so.0 # not recommended to do this, I'm just illustrating that "ldconfig" will update this symbolic link
jarvis@test2018:/usr/lib⟫ ll libtar* # notice the symbolic link is gone:
-rw-r--r-- 1 root root 67K Oct 11  2016 libtar.a
lrwxrwxrwx 1 root root  15 Oct 11  2016 libtar.so -> libtar.so.0.0.0
-rw-r--r-- 1 root root 35K Oct 11  2016 libtar.so.0.0.0
jarvis@test2018:/usr/lib⟫ sudo ldconfig
jarvis@test2018:/usr/lib⟫ ll libtar* # running ldconfig has returned the link:
-rw-r--r-- 1 root root 67K Oct 11  2016 libtar.a
lrwxrwxrwx 1 root root  15 Oct 11  2016 libtar.so -> libtar.so.0.0.0
lrwxrwxrwx 1 root root  15 Nov 21 08:24 libtar.so.0 -> libtar.so.0.0.0
-rw-r--r-- 1 root root 35K Oct 11  2016 libtar.so.0.0.0

This stuff gets quite complicated, and I wouldn't expect you to know and understand it all on first study. What you should know is that there are multiple ways of referring to libraries at runtime and compile time, and that if you're ever faced with a problem that feels like it's related to the names of libraries or their versions the term "SONAME" could be something to start researching.

3 GNU Make

Quoting from the Make homepage:

GNU Make is a tool which controls the generation of executables and other non-source files of a program from the program's source files.

Make gets its knowledge of how to build your program from a file called the Makefile, which lists each of the non-source files and how to compute it from other files. When you write a program, you should write a Makefile for it, so that it is possible to use Make to build and install the program.

Make can be used with many tools (not just GCC) to specify recipes for steps that need to be taken to produce a particular output. For example, I've written Makefiles for editing photos or videos, producing PDFs, and for generating GCode for use on a laser cutter. Not only will Make allow you to follow these rules, but it will automatically check timestamps on files and only re-run steps that need to be re-run to produce a particular output.

Writing Makefiles can be cumbersome as the syntax is a bit convoluted and the debugging process can be tricky. That said, for relatively simple projects, this is an awesome way to get the project compiled and to share the project with others so that they can compile it. This document certainly won't teach you everything you need to know about Makefiles, but it will show a few simple examples to give you a sense of what a Makefile actually is.

3.1 Make without a Makefile

For simple projects, Make actually has a set of implicit rules that can sometimes be used with no particular configuration. In my experience, this isn't the most useful, but it is good to know about.

jarvis@test2018:~⟫ cd ~/msr_toolchain_demos/invest/
jarvis@test2018:~/msr_toolchain_demos/invest [master]⟫ make invest
cc     invest.c   -o invest

3.2 A simple Makefile

Let's write a very simple Makefile for invest.c:

 1: APP=invest
 2: BIN_DIR=bin
 3: CC=gcc
 4: 
 5: all: $(BIN_DIR)/$(APP)
 6: 
 7: $(BIN_DIR)/$(APP):
 8:     $(CC) $(APP).c -o $(BIN_DIR)/$(APP)
 9: 
10: clean:
11:     rm -f $(BIN_DIR)/$(APP)

The general structure of the file above is to first create a few variables on the first three lines. Then on the 5th line we create a target called all that has a dependency on bin/invest. When make parses this file with no passed args (i.e. the user just typed make at the command line), it takes the first target in the file (here, all) as its default goal. It then looks for rules to build that target's dependencies. In this case, the rule for the only dependency is on line 7, and that target has no dependencies of its own, so make runs its command to create bin/invest, at which point the original target's dependencies are all built. Nothing further is done for all itself, since it has no associated commands; the purpose of the all target is just to serve as the default when no target is passed at the command line.

All targets can be invoked manually by passing them at the command line. With the above Makefile, I could use make, make all, or make bin/invest to identically produce the executable, and I could use make clean to remove the executable.

Note that a slight improvement to the above Makefile would be to ensure that the output directory actually exists:

 1: APP=invest
 2: BIN_DIR=bin
 3: CC=gcc
 4: 
 5: all: init $(BIN_DIR)/$(APP)
 6: 
 7: $(BIN_DIR)/$(APP):
 8:     $(CC) $(APP).c -o $(BIN_DIR)/$(APP)
 9: 
10: init:
11:     mkdir -p $(BIN_DIR)
12: 
13: clean:
14:     rm -f $(BIN_DIR)/$(APP)

3.3 Slightly more complex Makefile

Now let's write a Makefile for the helper executable:

 1: APP=geocalcs
 2: BIN_DIR=bin
 3: CC=gcc
 4: LDFLAGS=-lm
 5: 
 6: SOURCES=$(wildcard *.c)
 7: OBJECTS=$(foreach obj, $(SOURCES), $(BIN_DIR)/$(obj:.c=.o))
 8: 
 9: all: init $(BIN_DIR)/$(APP)
10: 
11: $(BIN_DIR)/$(APP): $(OBJECTS)
12:     @echo creating executable $@
13:     $(CC) $(OBJECTS) $(LDFLAGS) -o $(BIN_DIR)/$(APP)
14: 
15: $(BIN_DIR)/%.o: %.c
16:     @echo creating object $@
17:     $(CC) $(CFLAGS) -o $@ -c $<
18: 
19: init:
20:     mkdir -p $(BIN_DIR)
21: 
22: clean:
23:     rm -f $(BIN_DIR)/$(APP) $(OBJECTS)

This is not all that different from the Makefile above, but we have used some more advanced features to help automate things. For example, on lines 6 and 7, we use a wildcard and a for loop to automatically build a list of all *.c files and then convert those names into a list of targets. So the value of $(OBJECTS) is bin/main.o bin/helper.o, and thus the target on line 11 depends on building the object files from both *.c files. The target and commands on lines 15-17 use pattern rules to provide targets, dependencies, and commands for both bin/main.o and bin/helper.o. By leveraging tools like these, it is possible to write very powerful, generic Makefiles capable of handling much more complex projects.

4 Introduction to CMake

Quoting from the CMake website:

CMake is an open-source, cross-platform family of tools designed to build, test and package software. CMake is used to control the software compilation process using simple platform and compiler independent configuration files, and generate native Makefiles and workspaces that can be used in the compiler environment of your choice.

So basically, CMake is a tool capable of generating common build scripts for the host OS. With one well-written CMakeLists.txt file, one could automatically generate both a Makefile for use on Linux or an MSBuild file for Visual Studio in Windows. For open-source C++ projects it seems to me that the usage of CMake has grown significantly in recent years. CMake is one of the most important build tools in many robotics-related projects including ROS, OpenCV, PCL, OMPL, trajopt, DART, and more.

In my experience, the real power of CMake is not really its cross-platform or cross-compiler features (I don't tend to do much of that type of development), but rather its tools for dependency management. It makes it trivial to develop a project that depends on many other projects and to automatically generate a Makefile with all of the right compiler flags to build your project. The Makefiles in the examples above were quite simple, but as the complexity of your project and its dependencies grows, it becomes quite challenging to write reliable, portable Makefiles by hand. This is where CMake comes in.

CMake has its share of issues as well. The syntax is cumbersome, the documentation can be difficult to understand and navigate, and the features of CMake have evolved quickly, resulting in tons of information on Stack Overflow, blogs, etc. that is incorrect or out-of-date. Perhaps the most frustrating of all is that there is a real lack of guidance for how to do things the "right" way. On top of all of that, when things aren't working it can be very difficult to parse the error messages that you are getting and to figure out how to fix them.

CMake is huge, and there is no way I can teach you everything you need to know about CMake in one short lesson. Instead what I will do is present a simple example, and then give some tips for developing your own CMake files.

4.1 First simple CMake example

For this example, we will use the same helper project as in previous examples. We will build the project as a shared library that we link against with our primary executable. Change directory into helper/, and you'll find the following CMakeLists.txt file:

 1: # likely, we could get away with an older minimum version number with this
 2: # particular CMake file
 3: cmake_minimum_required (VERSION 3.0)
 4: # let's tell CMake that we are specifically using only C... this will make
 5: # compilation faster and more portable.
 6: project(helper VERSION 1.0 DESCRIPTION "Simple CMake example" LANGUAGES C)
 7: 
 8: # ensure we are looking in current directory for header files (this is probably
 9: # not needed, but I wanted to show the syntax)
10: include_directories("${PROJECT_SOURCE_DIR}")
11: 
12: # tell CMake that we will compile "libhelper.so"
13: add_library(helper SHARED helper.c)
14: 
15: # add an executable and tell CMake to link against "libhelper.so" and "libm.so"
16: add_executable(geocalcs main.c)
17: target_link_libraries(geocalcs helper m)

Now to build this project, we will follow a procedure that is used for nearly every CMake project:

jarvis@test2018:~⟫ cd msr_toolchain_demos/helper/
jarvis@test2018:~/msr_toolchain_demos/helper⟫ mkdir build
jarvis@test2018:~/msr_toolchain_demos/helper⟫ cd build/
jarvis@test2018:~/helper/build⟫ cmake ..
-- The C compiler identification is GNU 7.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: /home/jarvis/msr_toolchain_demos/helper/build
jarvis@test2018:~/helper/build⟫ make
Scanning dependencies of target helper
[ 25%] Building C object CMakeFiles/helper.dir/helper.c.o
[ 50%] Linking C shared library libhelper.so
[ 50%] Built target helper
Scanning dependencies of target geocalcs
[ 75%] Building C object CMakeFiles/geocalcs.dir/main.c.o
[100%] Linking C executable geocalcs
[100%] Built target geocalcs
jarvis@test2018:~/helper/build⟫ ./geocalcs
Pi is approximated as 3.14159260000000006840537.
The surface area of the sphere is 113.0973.
The volume of the sphere is 113.0973.
jarvis@test2018:~/helper/build⟫ ldd geocalcs
        linux-vdso.so.1 (0x00007ffeb45b8000)
        libhelper.so => /home/jarvis/msr_toolchain_demos/helper/build/libhelper.so (0x00007fb0d70c9000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fb0d6d2b000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fb0d693a000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fb0d74cd000)
1 jarvis@test2018:~/helper/build⟫ readelf -d geocalcs |grep PATH
 0x000000000000001d (RUNPATH)            Library runpath: [/home/jarvis/msr_toolchain_demos/helper/build]

Note that it is very common to use an "out-of-source" build directory with CMake to hold all intermediate files and compiled targets (we used build/ above). Also note that CMake automatically added the RUNPATH entry for our .so file to the geocalcs executable.

4.2 Depending on other projects

As stated above, one of the most powerful features of CMake is its ability to help you create projects that depend on other CMake projects. To illustrate some of this functionality, I've put together a simple demo that relies on OpenCV (for image processing) and Eigen (for matrix math). See below for an annotated CMakeLists.txt file:

cmake_minimum_required(VERSION 3.0)
project(eigen_opencv_demo VERSION 1.0 DESCRIPTION "Demo of depending on other libraries" LANGUAGES CXX)

# Let's build as a "Release"... this controls optimization level and inclusion
# of debug symbols. We wrap in an if statement to allow users to override
# setting:
if( NOT CMAKE_BUILD_TYPE )
  set(CMAKE_BUILD_TYPE Release)
endif()

# we are using Eigen, so let's see if we can find it:
find_package(Eigen3 REQUIRED)
# Eigen is header only, so we need to tell CMake that we want to look in the
# path containing Eigen's headers
include_directories(${EIGEN3_INCLUDE_DIR})

# let's see if we can find OpenCV, note that by requiring specific COMPONENTS,
# we are not linking against all OpenCV which speeds compile time and reduces
# the requirements of having a full OpenCV
find_package(OpenCV REQUIRED COMPONENTS highgui core imgproc)

# let's add an executable:
add_executable(demo demo.cpp)
# we need to link against OpenCV; in below, the ${OpenCV_LIBS} variable will
# only include the libraries explicitly listed as required components above
target_link_libraries(demo ${OpenCV_LIBS})

# I like to turn this on to produce a compile_commands.json file. Useful for
# debugging compilation and for many C++ autocomplete and code navigation tools
# (e.g. ycmd - https://github.com/Valloric/ycmd )
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)

In the above, all of the magic really happens in the find_package lines. The find_package function reads configuration files from other CMake projects; those files define useful variables such as OpenCV_LIBS and EIGEN3_INCLUDE_DIR, and they do it in a portable way. When you call find_package(module), CMake looks in a set of directories for a file called moduleConfig.cmake or Findmodule.cmake, and when one of these files is found, its purpose is to define all the variables that other packages need in order to depend on module. Writing these files from scratch is challenging, but for many major packages they are already available.

4.3 General Tips

Below are a collection of general tips that have helped me debug CMake issues in the past.

4.3.1 Setting CMake variables from the command line

Often it can be convenient to set CMake variables from the command line. A common example you might see is cmake .. -DCMAKE_BUILD_TYPE=Release. The -D switch is used to define/set cached variables. The full documentation on this can be found on the official CMake site.

4.3.2 Seeing all compilation commands

Often it is very useful to determine exactly what commands are being sent to your compiler, especially when you suspect that your code won't compile because the wrong flags are being passed. This Stack Overflow question has a few tips on getting this information. Usually, I just run make VERBOSE=1 to see the complete compiler invocations, and then combine that with less or grep to filter out what I'm looking for.

4.3.3 Reading all CMake variables

Often it can be very useful to see all of the CMake variables that are currently defined. As far as I know, the only way to see all of them is to include a snippet in your CMakeLists.txt and then remove it once you are done debugging (you could also guard the snippet so that it only runs when the user sets a variable from the command line). The snippet I usually use is as follows:

get_cmake_property(_variableNames VARIABLES)
list (SORT _variableNames)
foreach (_variableName ${_variableNames})
    message(STATUS "${_variableName}=${${_variableName}}")
endforeach()
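The parenthetical idea above (only running the dump when asked) can be sketched with a cached option; the name DUMP_VARIABLES is just an example:

```cmake
# Enable with: cmake .. -DDUMP_VARIABLES=ON
option(DUMP_VARIABLES "Print all CMake variables at configure time" OFF)
if(DUMP_VARIABLES)
  get_cmake_property(_variableNames VARIABLES)
  list(SORT _variableNames)
  foreach(_variableName ${_variableNames})
    message(STATUS "${_variableName}=${${_variableName}}")
  endforeach()
endif()
```

This way the snippet can stay in your CMakeLists.txt permanently without cluttering normal configure output.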

4.3.4 Interactive CMake configuration

There are several tools that you can use to interactively see which variables CMake currently has cached and what their values are. I typically use either ccmake or cmake-gui for this purpose. Note that these tools are packaged on apt-get as cmake-curses-gui and cmake-qt-gui, respectively.

4.3.5 Depending on packages without a FindXXX.cmake or XXXConfig.cmake

Likely the most frustrating complication that I've encountered when developing CMake projects is when the project I'm developing depends on a different project that doesn't have a readily-available FindXXX.cmake or XXXConfig.cmake. CMake ships with a bunch of these scripts out-of-the-box that support a wide variety of popular packages; on my system, these files are located in /usr/share/cmake-3.10/Modules/. If the project you depend on doesn't have a script in that directory, and you are getting an error about not finding package configuration files, below are a few strategies. For reference, let's say you were depending on a package called custompackage by putting find_package(custompackage) in your CMakeLists.txt. If CMake couldn't find the package configuration files, the error would be something like:

Could not find a package configuration file provided by "custompackage" with any of
the following names:

custompackageConfig.cmake
custompackage-config.cmake
  1. Using pkg-config

    Linux distributions often bundle a tool called pkg-config. When a package is installed through the package manager (e.g. dpkg or apt-get), it often also installs a .pc file that the pkg-config command line tool can read to report information about what the package installed. For example, I could query what libraries OpenCV installed with the following command:

     jarvis@test2018:~⟫ pkg-config --libs opencv
    -lopencv_shape -lopencv_stitching -lopencv_superres -lopencv_videostab
    -lopencv_aruco -lopencv_bgsegm -lopencv_bioinspired -lopencv_ccalib
    -lopencv_datasets -lopencv_dpm -lopencv_face -lopencv_freetype -lopencv_fuzzy
    -lopencv_hdf -lopencv_line_descriptor -lopencv_optflow -lopencv_video
    -lopencv_plot -lopencv_reg -lopencv_saliency -lopencv_stereo
    -lopencv_structured_light -lopencv_phase_unwrapping -lopencv_rgbd -lopencv_viz
    -lopencv_surface_matching -lopencv_text -lopencv_ximgproc -lopencv_calib3d
    -lopencv_features2d -lopencv_flann -lopencv_xobjdetect -lopencv_objdetect
    -lopencv_ml -lopencv_xphoto -lopencv_highgui -lopencv_videoio -lopencv_imgcodecs
    -lopencv_photo -lopencv_imgproc -lopencv_core
    

    It turns out that CMake provides the FindPkgConfig module, which allows you to query pkg-config from CMake in order to define the variables needed to rely on packages without CMake configuration files; of course, that is only useful if a .pc file exists for your package. Here's an example of using this capability with OpenCV and the previous demo.cpp file:

    find_package(PkgConfig REQUIRED)
    pkg_search_module(PKG_OPENCV REQUIRED opencv)
    include_directories(${PKG_OPENCV_INCLUDE_DIRS})
    
    add_executable(demo demo.cpp)
    target_link_libraries(demo ${PKG_OPENCV_LDFLAGS})
    
  2. Relying on non-standard files

    Often, even if your CMake installation isn't bundled with a find script for the project you are depending on, someone online may have written one. If you can find a FindPACKAGE.cmake file for your package, it is easy to bundle that script with your project and tell CMake to use it. As an example, check out how many FindPACKAGE.cmake files OpenCV bundles with its source code. Many of these scripts are custom written, but some are borrowed from other sources. To tell CMake where to search for these custom find scripts, use the CMAKE_MODULE_PATH variable.
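    As a sketch, suppose you saved a borrowed Findcustompackage.cmake into a cmake/ directory inside your project (both names are just examples); you could then point CMake at it like this:

```cmake
# Let find_package() also search this project's own cmake/ directory
list(APPEND CMAKE_MODULE_PATH "${CMAKE_SOURCE_DIR}/cmake")
# CMake can now use cmake/Findcustompackage.cmake
find_package(custompackage REQUIRED)
```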

  3. Custom package configuration files

    Writing these find scripts to be robust across different operating systems and portable from machine to machine is fairly challenging. However, writing a simple one that gets the job done often isn't all that hard. Here's a blog post that I came across recently with some good advice on CMake in general, an analysis of what is included in these find scripts, and a very simple example: https://pabloariasal.github.io/2018/02/19/its-time-to-do-cmake-right/
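    As a rough illustration of the moving parts, a bare-bones Findcustompackage.cmake (all names hypothetical) might look like this:

```cmake
# Findcustompackage.cmake -- minimal, non-robust example
# Locate the header and the library; the user can help via CMAKE_PREFIX_PATH
find_path(custompackage_INCLUDE_DIR NAMES custompackage.h)
find_library(custompackage_LIBRARY NAMES custompackage)

# Handles the QUIET/REQUIRED arguments and sets custompackage_FOUND
include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(custompackage
  REQUIRED_VARS custompackage_LIBRARY custompackage_INCLUDE_DIR)

if(custompackage_FOUND)
  # Conventional plural-name variables for consumers to use
  set(custompackage_LIBRARIES ${custompackage_LIBRARY})
  set(custompackage_INCLUDE_DIRS ${custompackage_INCLUDE_DIR})
endif()
```

    A consumer project would then combine this with the CMAKE_MODULE_PATH approach described above.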

Creative Commons License
ME 495: Embedded Systems in Robotics by Jarvis Schultz is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.