Introduction to Compiling Code in Linux
1 Introduction
The purpose of this repo is to walk through a few examples of tools one commonly encounters when compiling code on a Linux machine. We will be working with a few simple example code snippets, compiling them using different tools.
For consistency with other courses, I've borrowed examples and descriptions from the ME 333 textbook (used in the required MSR course). Specifically, I've borrowed content from Appendix A: A Crash Course in C, which is available in the book's sample chapters. If you are unfamiliar with basic C code or, more importantly, with the compilation procedure in general, it is highly recommended that you read that appendix before studying this document.
Throughout this document we will be working with this repo: https://github.com/NU-MSR/msr_toolchain_demos
2 Basic GCC Examples
2.1 Compile and run invest.c
To compile this example, we will directly call gcc with a minimal number of arguments. The GNU Compiler Collection, gcc, is a set of compiler frontends for several languages (C, C++, Objective-C, Fortran, Ada, and Go), as well as implementations of common libraries for these languages. To compile invest.c we will run the following commands from the root of the Git repo we are working with today.
cd invest
mkdir bin
gcc invest.c -o bin/invest
Then we could run the code with ./bin/invest. Note that when the executable asks for input, it expects three space-separated fields. Also note that we put the executable in the bin directory so that Git doesn't try to track it.
Note that you could use the file command to verify that this executable really is an ELF 64-bit LSB shared object. Also note that GCC has automatically added the correct executable permissions to this file.
2.2 Compiling with multiple files
For this example, we will see a simple example of compiling an executable made up of a header file declaring some functions, a source file defining those functions, and a main file using those functions.
cd helper
mkdir bin
gcc main.c helper.c -lm -o bin/geocalcs
Then we could run this with ./bin/geocalcs. Here we are using -o to provide a more descriptive name for our executable than something generic like main. We did not need to use the -I switch to tell GCC where to search for the include file because it is in the current directory. We are able to pass both *.c files in a single call to GCC, and it automatically compiles and links both to produce a single executable. The -lm switch is actually an option passed to the linker telling GCC that it needs to link against the library libm.so (we'll talk more about this linking process later). For now, note that the primary reason we need to do this is that helper.c includes math.h, but math.h contains only the declarations of the standard math functions, not their definitions. Included with GCC is a pre-compiled library containing the compiled definitions of these functions. In order to use these functions, we need to tell the linker to link against that library.
We could easily call gcc multiple times to produce object files and then call gcc again to link the object files into a single executable.
gcc -c helper.c -o bin/helper.o
gcc -c main.c -o bin/main.o
gcc bin/*.o -lm -o bin/geocalcs
The primary reason we might want to do this is to speed up subsequent compilations. Note that the -c option tells GCC to "compile or assemble the source files, but do not link." Imagine if we had many files that only defined functions to be used in a single file with a main() function. If we broke the compilation into many separate .o files, then after editing a single .c file we'd only have to re-compile that single .o file and re-link all the .o files into the executable. If we did it in one step, we'd have to re-process all of the files. Note that Makefiles (discussed below) are a good way to automate the generation of all of the many object files, and to manage the dependencies of what needs to be recompiled when any given part of a large project changes. Also note that tools like ccache are easy to incorporate into a project and can greatly speed up compilation.
2.3 Compiling a static library
In this example, we will compile helper.c into a static library that could then be statically linked into any program. On Linux, and in GCC itself, some libraries are distributed this way. The advantage of using static libraries is that they only need to be compiled once, and then they can easily be incorporated into any executable just by changing the linking options passed to GCC. One disadvantage of static libraries is that if you end up changing the static library, then every executable that uses it must be re-compiled to incorporate that change. Another disadvantage is that they use more disk space and memory, because you may end up effectively storing or running many copies of the library.
gcc -c helper.c -o bin/helper.o
ar rcs bin/libhelper.a bin/helper.o
gcc main.c bin/libhelper.a -lm -o bin/geocalcs
In the above we used ar to create the static library. From ar --help, the rcs options do the following:
- r: replace existing or insert new file(s) into the archive
- c: do not warn if the library had to be created
- s: act as ranlib (basically tells ar to index the .o files in the static library)
In the gcc command used when linking, we just provided the full path to the static library we created. Likely the more common and flexible way to do this is to add the path to the library to the linker search path with -L and then link using the library's name, just like we did with -lm. Here's an example:
gcc -c helper.c -o bin/helper.o
ar rcs bin/libhelper.a bin/helper.o
gcc main.c -L./bin -lhelper -lm -o bin/geocalcs
One other important note on all of these GCC options that we've been passing: generally the order of arguments to GCC does not matter, except for arguments to the linking step. Specifically, this applies to the -l option for specifying libraries to link against, and the -Wl option for passing arguments to the linker. A general rule of thumb is to list these arguments last. Here's a good answer explaining why.
2.4 Compiling a dynamic library
Dynamic libraries are loaded at runtime, and once they are loaded, they can stay in system memory and be used by any subsequent executable that needs access to the library. This saves memory and disk space, and allows us to automatically roll out library updates to all executables that use a given library at the same time. This style of library management is far more common on Linux. An excellent document describing how shared libraries are used, how they are loaded at runtime, and how they are named is The Linux Documentation Project's Shared Libraries page. As another excellent reference, check out The Inside Story on Shared Libraries and Dynamic Loading from a 2001 issue of Computing in Science and Engineering.
gcc -c -fpic helper.c -o bin/helper.o
gcc -shared bin/helper.o -o bin/libhelper.so
gcc main.c -L./bin -lhelper -lm -o bin/geocalcs
Note that at this point if we run the script we will have an error:
jarvis@test2018:~/msr_toolchain_demos/helper⟫ ./bin/geocalcs
./bin/geocalcs: error while loading shared libraries: libhelper.so: cannot open shared object file: No such file or directory
This error occurs because the runtime dynamic linker, ld.so, does not know where to look for the newly created libhelper.so. From the ld.so man page, there are a few directories that are searched by ld.so. A rough description of these locations, in order, is as follows:
- Value of the DT_RPATH section of the binary. This option is deprecated, and up-to-date versions of GCC don't set it unless forced to with -Wl,--disable-new-dtags
- Values in the environment variable LD_LIBRARY_PATH
- Value of the DT_RUNPATH section of the binary. This is a value that can be set when compiling to store a search path directly in the binary
- Entries in the cache file /etc/ld.so.cache
- The default paths /lib and then /usr/lib (on some 64-bit architectures, the default paths for 64-bit shared objects are /lib64 and then /usr/lib64). If the binary was linked with the -z nodeflib linker option, this step is skipped
If a combination of these strategies is not enough to help ld.so find the necessary libraries, then you typically see an error like the one above ("cannot open shared object file: No such file or directory"). A very helpful tool for seeing which libraries are found, and which are missing, is ldd. You can simply run ldd <NAME OF EXECUTABLE OR OTHER *.SO FILE> and it will print out all required libraries and where they are found (or not found).
Let's see all of these options in action.
2.4.1 Using LD_LIBRARY_PATH
We can use the LD_LIBRARY_PATH environment variable to modify where the runtime linker searches for libraries to dynamically load. See the examples below:
jarvis@test2018:~/msr_toolchain_demos/helper⟫ gcc main.c -L./bin -lhelper -lm -o bin/geocalcs
jarvis@test2018:~/msr_toolchain_demos/helper⟫ ./bin/geocalcs
./bin/geocalcs: error while loading shared libraries: libhelper.so: cannot open shared object file: No such file or directory
jarvis@test2018:~/msr_toolchain_demos/helper⟫ export LD_LIBRARY_PATH=$(pwd)/bin
jarvis@test2018:~/msr_toolchain_demos/helper⟫ echo $LD_LIBRARY_PATH
/home/jarvis/Desktop/msr_toolchain_demos/helper/bin
jarvis@test2018:~/msr_toolchain_demos/helper⟫ ./bin/geocalcs
Pi is approximated as 3.14159260000000006840537.
The surface area of the sphere is 113.0973.
The volume of the sphere is 113.0973.
jarvis@test2018:~/msr_toolchain_demos/helper⟫ unset LD_LIBRARY_PATH
jarvis@test2018:~/msr_toolchain_demos/helper⟫ ./bin/geocalcs
./bin/geocalcs: error while loading shared libraries: libhelper.so: cannot open shared object file: No such file or directory
jarvis@test2018:~/msr_toolchain_demos/helper⟫ LD_LIBRARY_PATH=./bin/ ./bin/geocalcs
Pi is approximated as 3.14159260000000006840537.
The surface area of the sphere is 113.0973.
The volume of the sphere is 113.0973.
Often, if I find myself regularly needing to set the LD_LIBRARY_PATH variable for a particular executable, I'll either create an alias or script that runs the executable while ensuring LD_LIBRARY_PATH is set, or I'll figure out how to set the DT_RUNPATH if the missing library belongs to a project I'm compiling.
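As a sketch of the script approach (the script name and paths are hypothetical), something like the following could be saved in the project root:

```shell
#!/bin/sh
# run_geocalcs: hypothetical wrapper that prepends the project's bin/
# directory to the runtime linker's search path before launching the
# executable, so users never have to export LD_LIBRARY_PATH themselves.
# The ${VAR:+...} expansion keeps any pre-existing value without leaving
# a stray ":" when the variable was unset.
DIR="$(cd "$(dirname "$0")" && pwd)"
export LD_LIBRARY_PATH="$DIR/bin${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
# hand control to the real program if it has been built:
[ -x "$DIR/bin/geocalcs" ] && exec "$DIR/bin/geocalcs" "$@"
echo "bin/geocalcs has not been built yet; LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
```

Because the script uses exec, the wrapper replaces itself with the real program rather than leaving an extra shell process around.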
2.4.2 Setting DT_RUNPATH
We can tell gcc to store a particular path in the DT_RUNPATH field of the executable it creates; this path will then automatically be searched when running the executable. See the example below:
jarvis@test2018:~/msr_toolchain_demos/helper⟫ gcc main.c -L./bin -lhelper -lm -Wl,-rpath,$(pwd)/bin -o bin/geocalcs
jarvis@test2018:~/msr_toolchain_demos/helper⟫ ./bin/geocalcs
Pi is approximated as 3.14159260000000006840537.
The surface area of the sphere is 113.0973.
The volume of the sphere is 113.0973.
jarvis@test2018:~/msr_toolchain_demos/helper⟫ readelf -d bin/geocalcs | grep PATH
0x000000000000001d (RUNPATH) Library runpath: [/home/jarvis/Desktop/msr_toolchain_demos/helper/bin]
jarvis@test2018:~/msr_toolchain_demos/helper⟫ ldd bin/geocalcs
linux-vdso.so.1 (0x00007ffe6c782000)
libhelper.so => /home/jarvis/Desktop/msr_toolchain_demos/helper/bin/libhelper.so (0x00007fdac2b6e000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fdac27d0000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fdac23df000)
/lib64/ld-linux-x86-64.so.2 (0x00007fdac2f72000)
Note that this option is often turned on by default in CMake. The CMake wiki has a good description of CMake's behaviors regarding DT_RUNPATH.
2.4.3 Using cached library locations
The tool ldconfig is responsible for keeping a cache of the shared object files available in a set of trusted directories, in directories specified in /etc/ld.so.conf, and in directories specified on the command line. When updating your system with apt-get you'll often see a message about running ldconfig to update the cache. This cache is used by ld.so rather than actually scanning the filesystem (which would be slow). Let's update this cache to include our new library:
jarvis@test2018:~/msr_toolchain_demos/helper⟫ gcc main.c -L./bin -lhelper -lm -o bin/geocalcs
jarvis@test2018:~/msr_toolchain_demos/helper⟫ ./bin/geocalcs
./bin/geocalcs: error while loading shared libraries: libhelper.so: cannot open shared object file: No such file or directory
127 jarvis@test2018:~/msr_toolchain_demos/helper⟫ sudo ldconfig $(pwd)/bin
[sudo] password for jarvis:
jarvis@test2018:~/msr_toolchain_demos/helper⟫ ./bin/geocalcs
Pi is approximated as 3.14159260000000006840537.
The surface area of the sphere is 113.0973.
The volume of the sphere is 113.0973.
Note that in the above usage, I've only temporarily updated the cache. The next time ldconfig is run without explicitly including the project's bin directory, that directory will be cleared from the cache. So the above usage is great for transiently solving a problem, but it is not permanent. You could make a change permanent by editing the /etc/ld.so.conf file, but you don't want to do this regularly. Rather, use this option as a last resort if you really need to permanently add a library in a nonstandard location. For example, let's say you compiled some open-source project that you are going to be linking against regularly, but you are nervous to install it because you are afraid it will conflict with other libraries on your system. Then this might be a great option.
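As a sketch of that last-resort option (the project path here is hypothetical), most distributions let you drop a file into /etc/ld.so.conf.d/, which is sourced by /etc/ld.so.conf, rather than editing /etc/ld.so.conf directly:

```shell
# /etc/ld.so.conf.d/myproject.conf would contain the single line
# "/opt/myproject/lib"; create it, rebuild the cache, and verify:
echo "/opt/myproject/lib" | sudo tee /etc/ld.so.conf.d/myproject.conf
sudo ldconfig
ldconfig -p | grep myproject   # list cached libraries, filtered to ours
```

Unlike the command-line usage above, this survives subsequent runs of ldconfig because the directory is now part of the standard configuration.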
2.5 Library names
You've likely noticed that the names we use to refer to libraries when compiling and when runtime linking don't quite match the actual names of the files. For example, we use -lhelper to link against libhelper.so, and -lm to link against libm.so. The pattern demonstrated in those examples is very common. A library has a "library name" (e.g. helper) that we use to refer to the library when we tell the compiler that we want to link against it. The compiler automatically adds the "lib" and the ".so" to the library name when it searches for files (it can also add ".a" for static libraries). Note that we could also provide a full filename to the -l option by adding a colon, e.g. -l:libhelper.so. So there is some mapping that may occur between library names and actual filenames when linking during a compilation step (typically just adding the "lib" and ".so"), but it turns out there is another mapping that happens during runtime linking, and ldconfig plays an important role in resolving these mappings.
Every shared library has an attribute called SONAME that includes the "lib", the library name, the ".so", and a final ".X" that indicates a binary compatibility version. The idea is to be able to have multiple versions of a library that don't have compatible function definitions, and for the runtime linker to ensure you are linking against a compatible version at runtime. So very often on Linux you'll see an actual filename for a shared library existing somewhere on the system; this filename is typically the SONAME with additional version information appended. This file's SONAME attribute will specify the correct SONAME, and when ldconfig is run it will create a symbolic link named using the SONAME and pointing to the actual file. Additionally, package maintainers often go the extra step of providing another symbolic link that completely strips the versioning, for the specific purpose of providing an easy way to link against the library during compilation.
For example, let's look at the libraries installed for libtar. The libtar description says it is a "C library for manipulating tar archives. libtar allows programs to create, extract and test tar archives. It supports both the strict POSIX tar format and many of the commonly-used GNU extensions." On my system there is an apt-get package called libtar0. The "0" at the end indicates that the SONAME we get if we install this package will specifically be for version "0". If I install libtar0 using apt-get, I get a new file of interest on my machine: /usr/lib/libtar.so.0.0.0. The install scripts that ship with that package ask for ldconfig to be run after installing the relevant files. ldconfig knows that /usr/lib is one of the directories it should be adding to its cache and managing links for, so it scans the directory and notices this new file. It reads the SONAME field for this library and realizes that it needs to create a symbolic link. The link it creates is /usr/lib/libtar.so.0, and it points to the libtar.so.0.0.0 file.
Additionally, the maintainers of this package have also created a package called libtar-dev; the -dev tells us that we should install this package if we want to "develop" our own code that links against libtar. This package does two things: (1) it provides us with a static version, libtar.a, that we could statically link against; (2) it creates a symbolic link called just /usr/lib/libtar.so that points to the libtar.so.0.0.0 file, which gcc will easily be able to find when linking against the tar library at compile time. See the command line printout below for a demonstration of this example:
jarvis@test2018:~⟫ cd /usr/lib
jarvis@test2018:/usr/lib⟫ ll libtar* # below we can see the symbolic links that exist on my machine:
-rw-r--r-- 1 root root 67K Oct 11  2016 libtar.a
lrwxrwxrwx 1 root root  15 Oct 11  2016 libtar.so -> libtar.so.0.0.0
lrwxrwxrwx 1 root root  15 Nov 21 08:16 libtar.so.0 -> libtar.so.0.0.0
-rw-r--r-- 1 root root 35K Oct 11  2016 libtar.so.0.0.0
jarvis@test2018:/usr/lib⟫ readelf -d libtar.so.0.0.0 | grep SONAME
 0x000000000000000e (SONAME)             Library soname: [libtar.so.0]
jarvis@test2018:/usr/lib⟫ apt-file search /usr/lib/libtar.so # we can use "apt-file" to search for which packages provide particular files
libtar-dev: /usr/lib/libtar.so
libtar0: /usr/lib/libtar.so.0
libtar0: /usr/lib/libtar.so.0.0.0
jarvis@test2018:/usr/lib⟫ sudo rm libtar.so.0 # not recommended; I'm just illustrating that "ldconfig" will update this symbolic link
jarvis@test2018:/usr/lib⟫ ll libtar* # notice the symbolic link is gone:
-rw-r--r-- 1 root root 67K Oct 11  2016 libtar.a
lrwxrwxrwx 1 root root  15 Oct 11  2016 libtar.so -> libtar.so.0.0.0
-rw-r--r-- 1 root root 35K Oct 11  2016 libtar.so.0.0.0
jarvis@test2018:/usr/lib⟫ sudo ldconfig
jarvis@test2018:/usr/lib⟫ ll libtar* # running ldconfig has restored the link:
-rw-r--r-- 1 root root 67K Oct 11  2016 libtar.a
lrwxrwxrwx 1 root root  15 Oct 11  2016 libtar.so -> libtar.so.0.0.0
lrwxrwxrwx 1 root root  15 Nov 21 08:24 libtar.so.0 -> libtar.so.0.0.0
-rw-r--r-- 1 root root 35K Oct 11  2016 libtar.so.0.0.0
This stuff gets quite complicated, and I wouldn't expect you to know and understand it all on first study. What you should know is that there are multiple ways of referring to libraries at runtime and compile time, and that if you're ever faced with a problem that feels like it's related to the names of libraries or their versions the term "SONAME" could be something to start researching.
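To make the SONAME machinery concrete, here is a small sketch (the library name, paths, and version numbers are made up for illustration) showing how a versioned library like libtar's could be produced for our own helper code; the -Wl,-soname linker option sets the SONAME field at link time:

```shell
# build a versioned shared library whose SONAME differs from its filename
mkdir -p /tmp/soname_demo && cd /tmp/soname_demo
cat > helper.c <<'EOF'
double half(double x) { return x / 2.0; }
EOF
gcc -c -fpic helper.c -o helper.o
gcc -shared -Wl,-soname,libhelper.so.1 helper.o -o libhelper.so.1.0.0
# the two symbolic links a package would normally provide:
ln -sf libhelper.so.1.0.0 libhelper.so.1   # what ldconfig creates (runtime)
ln -sf libhelper.so.1.0.0 libhelper.so     # the "-dev" link (compile time)
readelf -d libhelper.so.1.0.0 | grep SONAME
```

The final readelf command reports the embedded soname (libhelper.so.1), mirroring the libtar.so.0 example above.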
3 GNU Make
Quoting from the Make homepage:
GNU Make is a tool which controls the generation of executables and other non-source files of a program from the program's source files.
Make gets its knowledge of how to build your program from a file called the Makefile, which lists each of the non-source files and how to compute it from other files. When you write a program, you should write a Makefile for it, so that it is possible to use Make to build and install the program.
Make can be used with many tools (not just GCC) to specify recipes for steps that need to be taken to produce a particular output. For example, I've written Makefiles for editing photos or videos, producing PDFs, and for generating GCode for use on a laser cutter. Not only will Make allow you to follow these rules, but it will automatically check timestamps on files and only re-run steps that need to be re-run to produce a particular output.
Writing Makefiles can be cumbersome as the syntax is a bit convoluted and the debugging process can be tricky. That said, for relatively simple projects, this is an awesome way to get the project compiled and to share the project with others so that they can compile it. This document certainly won't teach you everything you need to know about Makefiles, but it will show a few simple examples to give you a sense of what a Makefile actually is.
3.1 Make without a Makefile
For simple projects, Make actually has a set of implicit rules that can sometimes be used with no particular configuration. In my experience, this isn't the most useful, but it is good to know about.
jarvis@test2018:~⟫ cd ~/msr_toolchain_demos/invest/
jarvis@test2018:~/msr_toolchain_demos/invest [master]⟫ make invest
cc invest.c -o invest
3.2 A simple Makefile
Let's write a very simple Makefile for invest.c:
1: APP=invest
2: BIN_DIR=bin
3: CC=gcc
4: 
5: all: $(BIN_DIR)/$(APP)
6: 
7: $(BIN_DIR)/$(APP):
8: 	$(CC) $(APP).c -o $(BIN_DIR)/$(APP)
9: 
10: clean:
11: 	rm -f $(BIN_DIR)/$(APP)
The general structure of the file above is to first create a few variables on the first three lines. Then on the 5th line we create a target called all that has a dependency on bin/invest. When make parses this file with no passed args (i.e. the user just typed make at the command line), it scans line-by-line looking for the first target that has at least one dependency. Once that first target is found, it looks for rules to make the target's dependencies. In this case, the rule for the only dependency is on line 7, and that target has no dependencies. Therefore, make will create that target and then be able to create the original target, as all of its dependencies are now built. In this particular case, there is nothing to be done for the original target as there are no associated commands; the purpose of the all target is just to ensure that it is the default target if no target is passed at the command line.
All targets can be invoked manually by passing them at the command line. With the above Makefile, I could use make, make all, or make bin/invest to identically produce the executable, and I could use make clean to remove the executable.
Note that a slight improvement to the above Makefile would be to check that the output directory actually exists:
1: APP=invest
2: BIN_DIR=bin
3: CC=gcc
4: 
5: all: init $(BIN_DIR)/$(APP)
6: 
7: $(BIN_DIR)/$(APP):
8: 	$(CC) $(APP).c -o $(BIN_DIR)/$(APP)
9: 
10: init:
11: 	mkdir -p $(BIN_DIR)
12: 
13: clean:
14: 	rm -f $(BIN_DIR)/$(APP)
3.3 Slightly more complex Makefile
Now let's write a Makefile for the helper executable:
1: APP=geocalcs
2: BIN_DIR=bin
3: CC=gcc
4: LDFLAGS=-lm
5: 
6: SOURCES=$(wildcard *.c)
7: OBJECTS=$(foreach obj, $(SOURCES), $(BIN_DIR)/$(obj:.c=.o))
8: 
9: all: init $(BIN_DIR)/$(APP)
10: 
11: $(BIN_DIR)/$(APP): $(OBJECTS)
12: 	@echo creating executable $@
13: 	$(CC) $(OBJECTS) $(LDFLAGS) -o $(BIN_DIR)/$(APP)
14: 
15: $(BIN_DIR)/%.o: %.c
16: 	@echo creating object $@
17: 	$(CC) $(CFLAGS) -o $@ -c $<
18: 
19: init:
20: 	mkdir -p $(BIN_DIR)
21: 
22: clean:
23: 	rm -f $(BIN_DIR)/$(APP) $(OBJECTS)
This is not all that different from the Makefile above, but we have used some more advanced features to help automate things. For example, on lines 6 and 7, we have used a wildcard and a for loop to automatically create a list of all *.c files and then convert those names into a list of targets. So the value of $(OBJECTS) is bin/main.o bin/helper.o. Thus the target on line 11 depends on building the object files from both *.c files. The target and commands on lines 15-17 use pattern rules to provide targets, dependencies, and commands for both bin/main.o and bin/helper.o. By leveraging more complex tools like this, it is possible to write very powerful, generic Makefiles capable of working with much more complex projects.
4 Introduction to CMake
Quoting from the CMake website:
CMake is an open-source, cross-platform family of tools designed to build, test and package software. CMake is used to control the software compilation process using simple platform and compiler independent configuration files, and generate native Makefiles and workspaces that can be used in the compiler environment of your choice.
So basically, CMake is a tool capable of generating common build scripts for the host OS. With one well-written CMakeLists.txt file, one could automatically generate both a Makefile for use on Linux and an MSBuild file for Visual Studio on Windows. For open-source C++ projects, it seems to me that the usage of CMake has grown significantly in recent years. CMake is one of the most important build tools in many robotics-related projects including ROS, OpenCV, PCL, OMPL, trajopt, DART, and more.
In my experiences, the real power of CMake is not really the cross-platform or cross-compiler features (I don't tend to do much of that type of development), but is instead its tools for dependency management. It's trivial to develop a project that depends on many other projects and automatically generate a Makefile that has all of the right compiler flags to build your project. In the examples above, the Makefiles were quite simple, but as the complexity of your project and its dependencies goes up it starts to get quite challenging to develop reliable, portable Makefiles. This is where CMake comes in.
CMake has its share of issues as well. The syntax is cumbersome, the documentation can be difficult to understand and navigate, and the features of CMake have evolved quickly, resulting in tons of information on Stack Overflow, blogs, etc. that is incorrect or out-of-date. Perhaps the most frustrating of all is that there is a real lack of guidance for how to do things the "right" way. On top of all of that, when things aren't working it can be very difficult to parse the error messages that you are getting and to figure out how to fix them.
CMake is huge, and there is no way I can teach you everything you need to know about CMake in one short lesson. Instead what I will do is present a simple example, and then give some tips for developing your own CMake files.
4.1 First simple CMake example
For this example, we will use the same helper project as in the previous examples. We will build the project as a shared library that we link against with our primary executable. Change directory into helper/, and you'll find the following CMakeLists.txt file:
# likely, we could get away with an older minimum version number with this
# particular CMake file
cmake_minimum_required(VERSION 3.0)
# let's tell CMake that we are specifically using only C... this will make
# compilation faster and more portable.
project(helper VERSION 1.0 DESCRIPTION "Simple CMake example" LANGUAGES C)

# ensure we are looking in the current directory for header files (this is
# probably not needed, but I wanted to show the syntax)
include_directories("${PROJECT_SOURCE_DIR}")

# tell CMake that we will compile "libhelper.so"
add_library(helper SHARED helper.c)

# add an executable and tell CMake to link against "libhelper.so" and "libm.so"
add_executable(geocalcs main.c)
target_link_libraries(geocalcs helper m)
Now to build this project, we will follow a procedure that is used for nearly every CMake project:
jarvis@test2018:~⟫ cd msr_toolchain_demos/helper/
jarvis@test2018:~/msr_toolchain_demos/helper⟫ mkdir build
jarvis@test2018:~/msr_toolchain_demos/helper⟫ cd build/
jarvis@test2018:~/helper/build⟫ cmake ..
-- The C compiler identification is GNU 7.3.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: /home/jarvis/msr_toolchain_demos/helper/build
jarvis@test2018:~/helper/build⟫ make
Scanning dependencies of target helper
[ 25%] Building C object CMakeFiles/helper.dir/helper.c.o
[ 50%] Linking C shared library libhelper.so
[ 50%] Built target helper
Scanning dependencies of target geocalcs
[ 75%] Building C object CMakeFiles/geocalcs.dir/main.c.o
[100%] Linking C executable geocalcs
[100%] Built target geocalcs
jarvis@test2018:~/helper/build⟫ ./geocalcs
Pi is approximated as 3.14159260000000006840537.
The surface area of the sphere is 113.0973.
The volume of the sphere is 113.0973.
jarvis@test2018:~/helper/build⟫ ldd geocalcs
	linux-vdso.so.1 (0x00007ffeb45b8000)
	libhelper.so => /home/jarvis/msr_toolchain_demos/helper/build/libhelper.so (0x00007fb0d70c9000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fb0d6d2b000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fb0d693a000)
	/lib64/ld-linux-x86-64.so.2 (0x00007fb0d74cd000)
jarvis@test2018:~/helper/build⟫ readelf -d geocalcs | grep PATH
 0x000000000000001d (RUNPATH)            Library runpath: [/home/jarvis/msr_toolchain_demos/helper/build]
Note that it is very common to use an "out-of-source" build directory with CMake to store all intermediate files and compiled targets (we used build/ above). Also note that CMake automatically included the RUNPATH header information for our .so file in the geocalcs executable.
4.2 Depending on other projects
As stated above, one of the most powerful features of CMake is its ability to help you create projects that depend on other CMake projects. To illustrate some of this functionality, I've put together a simple demo that relies on OpenCV (for image processing) and Eigen (for matrix math). See below for an annotated CMakeLists.txt file:
cmake_minimum_required(VERSION 3.0)
project(eigen_opencv_demo VERSION 1.0 DESCRIPTION "Demo of depending on other libraries" LANGUAGES CXX)

# Let's build as a "Release"... this controls optimization level and inclusion
# of debug symbols. We wrap in an if statement to allow users to override the
# setting:
if( NOT CMAKE_BUILD_TYPE )
  set(CMAKE_BUILD_TYPE Release)
endif()

# we are using Eigen, so let's see if we can find it:
find_package(Eigen3 REQUIRED)
# Eigen is header only, so we need to tell CMake that we want to look in the
# path containing Eigen's headers
include_directories(${EIGEN3_INCLUDE_DIR})

# let's see if we can find OpenCV; note that by requiring specific COMPONENTS,
# we are not linking against all of OpenCV, which speeds compile time and
# reduces the requirement of having a full OpenCV installation
find_package(OpenCV REQUIRED COMPONENTS highgui core imgproc)

# let's add an executable:
add_executable(demo demo.cpp)

# we need to link against OpenCV; below, the ${OpenCV_LIBS} variable will
# only include the libraries explicitly listed as required components above
target_link_libraries(demo ${OpenCV_LIBS})

# I like to turn this on to produce a compile_commands.json file. Useful for
# debugging compilation and for many C++ autocomplete and code navigation tools
# (e.g. ycmd - https://github.com/Valloric/ycmd )
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
In the above, all of the magic really happens in the find_package lines. The find_package function reads configuration files from other CMake projects; these files define useful variables such as OpenCV_LIBS and EIGEN3_INCLUDE_DIR, and they do it in a portable way. When you call find_package(module), CMake looks in a set of directories for a file called moduleConfig.cmake or Findmodule.cmake, and when one of these files is found, its purpose is to define all of the variables that other packages need in order to depend on module. Writing these files from scratch is challenging, but for many major packages they are already available.
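To give a flavor of what such a file contains, here is a hypothetical, minimal helperConfig.cmake for the helper library from earlier sections (the paths are made up, and real config files are usually generated by CMake itself and are considerably more involved):

```cmake
# helperConfig.cmake (hypothetical sketch): installed somewhere on
# CMAKE_PREFIX_PATH so that find_package(helper) can locate it. Its only
# job is to define the variables that consumers of the library need.
set(helper_INCLUDE_DIRS "/opt/helper/include")
set(helper_LIBRARIES    "/opt/helper/lib/libhelper.so")

# a consuming project would then write:
#   find_package(helper REQUIRED)
#   include_directories(${helper_INCLUDE_DIRS})
#   target_link_libraries(myapp ${helper_LIBRARIES})
```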
4.3 General Tips
Below are a collection of general tips that have helped me debug CMake issues in the past.
4.3.1 Setting CMake variables from the command line
Often it can be convenient to set CMake variables from the command line. A common version of this you might see is cmake .. -DCMAKE_BUILD_TYPE=Release. The -D switch is used to define/set cached variables. The full documentation on this can be found on the official CMake site.
4.3.2 Seeing all compilation commands
Often it is very useful to determine exactly what commands are being sent to your compiler, especially when you suspect that your code won't compile because the wrong flags are being passed. This Stack Overflow question has a few tips on getting this information. Usually, I just run make VERBOSE=1 to see every command, and then pipe the output through less or grep to filter out what I'm looking for.
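As a sketch of that workflow (assuming a Makefile-based build directory), the following pipeline shows only the lines that actually invoke the compiler:

```shell
# force a full rebuild so every compiler command is printed,
# then keep only the lines that invoke the compiler
make clean
make VERBOSE=1 2>&1 | grep -E '\b(gcc|g\+\+)\b'
```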
4.3.3 Reading all CMake variables
Often it can be very useful to see all CMake variables that are currently available. As far as I know, the only way to see them all is to include a snippet in your CMakeLists.txt and then remove it after you are done debugging (I suppose you could also define a variable and only run the snippet if the user passed an argument from the command line). The snippet I usually use is as follows:
```cmake
get_cmake_property(_variableNames VARIABLES)
list(SORT _variableNames)
foreach(_variableName ${_variableNames})
  message(STATUS "${_variableName}=${${_variableName}}")
endforeach()
```
4.3.4 Interactive CMake configuration
There are several tools that you can use to interactively see which variables CMake currently has cached, and what their values are. I typically use either ccmake or cmake-gui for this purpose. Note that these tools are available through apt-get as the cmake-curses-gui and cmake-qt-gui packages, respectively.
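For example, assuming a build directory that has already been configured with CMake, the curses interface can be installed and launched as follows:

```shell
# install the curses-based interface (Ubuntu/Debian)
sudo apt-get install cmake-curses-gui
# from the build directory: browse/edit cached variables, then
# press 'c' to re-configure and 'g' to generate
ccmake ..
```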
4.3.5 Depending on packages without a FindXXX.cmake or XXXConfig.cmake
Likely the most frustrating complication that I've encountered when developing CMake projects is when the project I'm developing depends on another project that doesn't have a readily-available FindXXX.cmake or XXXConfig.cmake. CMake ships with a bunch of these scripts out of the box that support a wide variety of popular packages; on my system, these files are located in /usr/share/cmake-3.10/Modules/. If the project you depend on doesn't have a script in that directory, and you are getting an error about not finding package configuration files, below are a few strategies. For reference, let's say you were depending on a package called custompackage by putting find_package(custompackage) in your CMakeLists.txt. If CMake couldn't find the package configuration files, the error would be something like:
```
Could not find a package configuration file provided by "custompackage"
with any of the following names:

  custompackageConfig.cmake
  custompackage-config.cmake
```
- Using pkg-config

  Linux distributions often bundle a package called pkg-config. When a package is installed through the package manager (e.g. a .deb installed via apt-get), it often also distributes a .pc file that can be used with the pkg-config command-line tool to retrieve information about what content was installed by the package. For example, I could query which libraries OpenCV installed with the following command:

  ```shell
  jarvis@test2018:~⟫ pkg-config --libs opencv
  -lopencv_shape -lopencv_stitching -lopencv_superres -lopencv_videostab
  -lopencv_aruco -lopencv_bgsegm -lopencv_bioinspired -lopencv_ccalib
  -lopencv_datasets -lopencv_dpm -lopencv_face -lopencv_freetype
  -lopencv_fuzzy -lopencv_hdf -lopencv_line_descriptor -lopencv_optflow
  -lopencv_video -lopencv_plot -lopencv_reg -lopencv_saliency
  -lopencv_stereo -lopencv_structured_light -lopencv_phase_unwrapping
  -lopencv_rgbd -lopencv_viz -lopencv_surface_matching -lopencv_text
  -lopencv_ximgproc -lopencv_calib3d -lopencv_features2d -lopencv_flann
  -lopencv_xobjdetect -lopencv_objdetect -lopencv_ml -lopencv_xphoto
  -lopencv_highgui -lopencv_videoio -lopencv_imgcodecs -lopencv_photo
  -lopencv_imgproc -lopencv_core
  ```
  It turns out that CMake provides the FindPkgConfig.cmake module, which allows you to query pkg-config from CMake to define the variables needed to rely on packages without CMake configuration files; of course, that is only useful if you have a .pc file for your package. Here's an example of using this capability with OpenCV and the previous demo.cpp file:

  ```cmake
  find_package(PkgConfig REQUIRED)
  pkg_search_module(PKG_OPENCV REQUIRED opencv)
  include_directories(${PKG_OPENCV_INCLUDE_DIRS})
  add_executable(demo demo.cpp)
  target_link_libraries(demo ${PKG_OPENCV_LDFLAGS})
  ```
- Relying on non-standard files

  Often, even if your CMake isn't bundled with a find script for the project you are depending on, someone online may have written one. If you can find a FindPACKAGE.cmake file for your package, it is easy to bundle the find script with your project and tell CMake to use it. As an example, check out how many FindPACKAGE.cmake files OpenCV bundles with its source code. Many of these scripts are custom written, but some are borrowed from other sources. If you'd like to tell CMake where to search for these custom find scripts, you can use the CMAKE_MODULE_PATH variable.

- Custom package configuration files
  Writing these find scripts to be robust across different operating systems and portable from machine to machine is fairly challenging. However, writing a simple one that gets the job done often isn't all that hard. Here's a blog post I came across recently with some good advice on CMake in general, an analysis of what goes into these find scripts, and a very simple example: https://pabloariasal.github.io/2018/02/19/its-time-to-do-cmake-right/
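As a rough sketch of how short a simple find script can be (the Foo package name, header/library names, and paths below are all hypothetical), the standard approach uses CMake's find_path/find_library commands plus the FindPackageHandleStandardArgs helper. Such a file might live at cmake/FindFoo.cmake inside your project:

```cmake
# cmake/FindFoo.cmake -- minimal, hypothetical find script
find_path(Foo_INCLUDE_DIR NAMES foo.h PATH_SUFFIXES foo)
find_library(Foo_LIBRARY NAMES foo)

include(FindPackageHandleStandardArgs)
# sets Foo_FOUND, and errors out (if REQUIRED) when either var is missing
find_package_handle_standard_args(Foo
  REQUIRED_VARS Foo_LIBRARY Foo_INCLUDE_DIR)

if(Foo_FOUND)
  set(Foo_LIBRARIES ${Foo_LIBRARY})
  set(Foo_INCLUDE_DIRS ${Foo_INCLUDE_DIR})
endif()
```

A consuming CMakeLists.txt would then point CMAKE_MODULE_PATH at the directory holding the script and use it as usual:

```cmake
list(APPEND CMAKE_MODULE_PATH "${CMAKE_SOURCE_DIR}/cmake")
find_package(Foo REQUIRED)
include_directories(${Foo_INCLUDE_DIRS})
target_link_libraries(demo ${Foo_LIBRARIES})
```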