Effective C++20 Modules

Overview

This is a prescriptive conceptual guide for using C++20 modules effectively. It does not cover all the nuances or possibilities, but rather focuses on one method of organizing C++20 modules and how that method maps onto traditional C++ library organization.

It assumes basic knowledge of C++20 modules and CMake.

Caveats

These notes are based primarily on experience gathered on Linux using:

clang 19.1.7
gcc 15.0.1 20250222 (experimental) (i.e., an unreleased version that I compiled from git).
cmake 3.31

As of April 2025, this experience revealed stark mismatches between how clang and gcc interpret code, with many instances of code being valid in clang but not in gcc or vice versa, and occasional gcc crashes. While I did not pore over the standard, I assume that clang's implementation is more correct than gcc's, as gcc's implementation is explicitly incomplete and experimental. The method I outline here did not work completely with gcc 14, which is why I use a newer version (which magically fixed many problems). Overall, it seems that much progress is being made!

The Old Way

Prior to C++ modules, most C++ libraries were organized into several header (.hpp) files and implementation (.cpp) files (yes, there are header-only libraries too). The creator of the library compiles the .cpp files, together with the .hpp files that they #include, into a single binary (e.g., a .a or .so file).

Users need the .hpp files and the binary (.a) to use the library. The user #includes the header files and links against the .a file. As a result, the user's compiler must re-compile each header in every source file that includes it, which is bad for compilation time. Some compilers offer pre-compiled headers, a non-standard way to speed up compilation for headers that don't change much.
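
As a rough illustration of the traditional layout, the sketch below collapses three hypothetical files (shapes.hpp, shapes.cpp, and a user's user.cpp) into one listing so that it compiles as a single translation unit; comments mark where each file would begin. The names mylib, area, and user_code are made up for this example.

```cpp
// ---- shapes.hpp: header shipped to users (hypothetical) ----
#ifndef MYLIB_SHAPES_HPP
#define MYLIB_SHAPES_HPP

namespace mylib {
// Declaration only; the definition lives in shapes.cpp.
int area(int width, int height);
}

#endif

// ---- shapes.cpp: compiled once by the library author into the .a ----
// #include "shapes.hpp"   // present in the real file
namespace mylib {
int area(int width, int height) { return width * height; }
}

// ---- user.cpp: user code, linked against the .a ----
// #include "shapes.hpp"   // the header text is re-parsed in every file that includes it
int user_code() { return mylib::area(2, 3); }
```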

The New Way

C++20 modules provide a method for the compiler to compile each "header" file only once, greatly speeding up compilation time (in theory). There is a cost, however: the compiler must determine dependencies between modules, which at times has been shown to increase compilation time [citation needed].

Here are some differences between using traditional C++ headers and modules:

Instead of a development library consisting of a binary and .hpp files, the module is distributed as a binary (.a or .so, same as before) and module interface .ixx files (somewhat analogous to .hpp files).
Instead of header files being #included, a module interface is imported.
For module writers, the module interface is compiled every time it or something it depends on changes.
- The build system automatically detects these changes and does "the right thing".
For module users, the module interface is compiled only once.
- The compilation artifact from compiling a module interface is compiler-specific, so interface files (.ixx) should be distributed in source-code form (just as headers are distributed as source code).
- Modules can also have separate implementation files that get compiled into the binary library: these could be distributed in binary form.
In summary, not much has changed with modules, except you distribute module interface files and the library rather than header files and a library. When you do this, the user needs to compile the module interface files only one time, as opposed to every time for every .cpp file that includes the header.
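
To make the comparison concrete, here is a minimal sketch of a one-file module and a consumer. The two file names, the mylib namespace, and the area function are assumptions made for illustration; each comment banner marks a separate file, and building it requires a modules-aware compiler and build system.

```cpp
// ---- MyLib.ixx: module interface, distributed as source (hypothetical) ----
export module MyLib;

export namespace mylib {
    // Exported, so it is visible to any file that imports MyLib.
    int area(int width, int height) { return width * height; }
}

// ---- main.cpp: user code (hypothetical) ----
import MyLib;   // takes the place of #include "shapes.hpp"

int main() {
    // Exit status 0 if the imported function behaves as expected.
    return mylib::area(2, 3) == 6 ? 0 : 1;
}
```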

Migration

Here is a suggested migration path for existing projects.

Each module corresponds to a library (call it MyLib).
- There are other choices, but it does not seem to make sense to divide a library up into many modules (as evidenced by the entire C++ standard library being in a single std module).
Each module has a single primary interface file named MyLib.ixx that declares the module name MyLib (i.e., export module MyLib).
- According to the specification (I believe), each module can have only one primary interface.
- I have adopted the .ixx convention used by MSVC. Other compilers don't require this extension, but don't mind it either.
- Users will import MyLib and gain access to everything in the library. There should not be separate imports for separate features.
  - Of course, you can (and should!) still use namespaces within a module to avoid polluting the global namespace when importing.
Each individual .hpp file in your library now becomes a .ixx file with the same name.
- Each .ixx file is set up as a partition of MyLib.
- So each former header declares export module MyLib:PartitionName to make its code available to the main module interface.
The main MyLib.ixx file should export import each partition (e.g., export import :PartitionName), making it available to users who import MyLib.
The .cpp files are renamed to .cxx (just for clarity, not actually necessary).
- They all declare themselves as module MyLib at the top, marking them as implementation files for the MyLib module.
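- They import :PartitionName for any of the partitions that they need to use.
- They do not (and cannot) declare themselves to be module MyLib:PartitionName, because a partition name may belong to only one file and it is already taken by the interface partition. In this scheme, partitions are used for interfaces, not implementations.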
The .cxx files are included in the library as normal source files. The .ixx files are listed as part of the FILE_SET CXX_MODULES FILES in the target_sources command in CMake.
```
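# Implementation units are ordinary sources; interface units go in a CXX_MODULES file set.
add_library(mylib impl1.cxx impl2.cxx)
target_compile_features(mylib PUBLIC cxx_std_20)  # module scanning needs C++20 or newer
target_sources(mylib PUBLIC FILE_SET CXX_MODULES FILES MyLib.ixx other_impl.ixx)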
```
In each module interface partition file, export the entities that users of your library need to use directly.
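
The steps above can be sketched for a library with a single former header. This is an illustration under assumed names, not a definitive layout: the partition name Shapes, the mylib namespace, Rect, and area are all hypothetical, and each comment banner marks a separate file.

```cpp
// ---- Shapes.ixx: interface partition (formerly Shapes.hpp) ----
export module MyLib:Shapes;

export namespace mylib {
    struct Rect {
        int width;
        int height;
    };

    // Declared and exported here; defined in Shapes.cxx.
    int area(const Rect& r);
}

// ---- MyLib.ixx: primary module interface ----
export module MyLib;

export import :Shapes;   // re-export the partition to everyone who imports MyLib

// ---- Shapes.cxx: implementation unit (formerly Shapes.cpp) ----
module MyLib;            // no `export` and no partition name

// Explicit import as suggested in the steps. It is redundant here, since
// `module MyLib;` already implicitly imports the primary interface.
import :Shapes;

namespace mylib {
    int area(const Rect& r) { return r.width * r.height; }
}
```

A user would then write import MyLib; and call mylib::area without knowing that the Shapes partition exists, since partition names are not visible outside the module.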

Explanation

There were some opinionated choices in the above migration guide. Here is an explanation for some of them.

The guidelines try to maintain as close a mapping as possible between files in traditional setups and modules.
- There should be little need to, for example, take all the code in a .cpp file and implement it in the header file instead.
Maintaining a separation between .ixx and .cxx files is less necessary (and less advantageous) with modules than with traditional headers. However, there are still benefits to maintaining the separation:
- Interface files need not be re-compiled when the implementation changes.
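- Implementation files need not be recompiled if the interface changes in a way that is unrelated to the implementation file.
  - E.g., if interface1.ixx changes but impl2.cxx doesn't depend on :interface1, then impl2.cxx need not be recompiled when interface1.ixx changes. One caveat: a file that declares module MyLib implicitly imports the primary interface, which re-exports every partition, so a build system may still rebuild impl2.cxx in this case.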
Maintaining the module as separate partitions improves compile times for the library creator (versus using a single interface file without partitions).
- When an interface file is changed, only the files that import it (directly or indirectly) need to be recompiled.
Some sources suggest using a convention of MyLib.SmallerLib to divide up the library into parts.
- With modules, the . is just a character like any other; it has no semantic meaning.
- It seems unnecessary to introduce a naming hierarchy to modules, as a single large module can be imported quickly (like std).
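
The point about dots can be seen in a small sketch. The module names MyLib.Core and MyLib and the function core_version are hypothetical, and each comment banner marks a separate file.

```cpp
// ---- MyLib.Core.ixx ----
export module MyLib.Core;    // one flat name; the dot implies no nesting

export int core_version() { return 1; }

// ---- MyLib.ixx ----
export module MyLib;         // to the language, a module unrelated to MyLib.Core

// Users of MyLib see core_version() only because of this explicit re-export.
export import MyLib.Core;
```

Unlike partitions, MyLib.Core is a complete module of its own, so users could also import it directly, which is exactly the kind of separate per-feature import this guide recommends against.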