Compiler frontends: ccache
This is part two of a series on compiler frontends. The first part, David Cantrell’s presentation on distcc, can be read at http://www.lugatgt.org/articles/distcc/.
Introduction
Ccache is a frontend to your favorite C/C++ compiler that caches object files, so that when a build is run multiple times, unchanged code is not recompiled. It uses the C preprocessor output and the compiler flags as inputs to its hash function, so any object file pulled out of the cache is identical to what the compiler would have produced. Compiler messages are also stored and retrieved along with the object files. To the user or the linker, the only noticeable effect of ccache is in speed.
The uses of this are less obvious than those of distcc, but for many, recompilation of identical code is a common task. For example, if you have a project that was compiled using the -O2 flag passed to gcc, and you want to switch that to -g to debug, a recompilation is necessary. However, when using a cache, a second recompilation for unchanged code is not needed when changing back to -O2. Also, package build systems like RPM or dpkg often run “make clean” as the first step before compiling a package. Ccache prevents this make clean from throwing away useful information.
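To make the -O2 / -g scenario concrete, here is a minimal sketch of how a cache key could be derived from the compiler flags and the preprocessed source. The cache_key helper and the sample source text are hypothetical, and md5sum is only a stand-in: ccache computes its hash internally and uses these two inputs only as part of it.

```shell
# Hypothetical helper: combine compiler flags and preprocessed source
# into one key. md5sum stands in for ccache's real hash function.
cache_key() {
    flags=$1
    preprocessed=$2
    printf '%s\n%s\n' "$flags" "$preprocessed" | md5sum | cut -d' ' -f1
}

# Stand-in for the output of the preprocessor on an unchanged source file.
src='int main(void) { return 0; }'

key_o2=$(cache_key "-O2" "$src")        # first build, optimized
key_g=$(cache_key "-g" "$src")          # switch to a debug build
key_o2_again=$(cache_key "-O2" "$src")  # switch back to -O2

# Different flags give different keys, so -g needs a real compile.
[ "$key_o2" != "$key_g" ] && echo "-O2 and -g map to different cache entries"

# Same flags and same source give the same key, so the object file
# from the first build can be reused.
[ "$key_o2" = "$key_o2_again" ] && echo "switching back to -O2 finds the original entry"
```

Because the key depends only on the flags and the preprocessed text, a "make clean" that deletes the object files in the build tree does not invalidate anything in the cache.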
Erik Thiele’s compilercache is an earlier caching compiler frontend, and uses a collection of shell scripts and the md5sum utility to store object files. Ccache is a reimplementation of compilercache in C, with some added improvements for cache management.
Ccache was written by Andrew Tridgell, who primarily uses it to make Samba build faster.
Installation
The cache is created the first time ccache is run, so for most people no setup is needed beforehand. Just download, compile, and install.
wget http://ccache.samba.org/ftp/ccache/ccache-2.2.tar.gz
gzip -cd ccache-2.2.tar.gz | tar -xvf -
cd ccache-2.2
./configure --prefix=/usr/local
make
make install
Configuration
By default, the cache is limited to 1GB in size and may hold an unlimited number of files. These limits can be changed using ccache -F numfiles and ccache -M size. The size is in gigabytes unless the number is followed by an M or K suffix.
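As a sketch of the two options above: the limits chosen here (2 gigabytes, 50000 files) are arbitrary, and the scratch CCACHE_DIR is an assumption made so that the example does not touch a real cache. The -s option, which prints cache statistics including the current limits, is not covered above.

```shell
# Use a scratch cache directory so a real cache is left alone.
CCACHE_DIR=$(mktemp -d)
export CCACHE_DIR

# Only run the commands if ccache is actually installed.
if command -v ccache >/dev/null 2>&1; then
    ccache -M 2G      # limit the cache to 2 gigabytes
    ccache -F 50000   # limit the cache to 50000 files
    ccache -s         # show statistics and the limits now in effect
    limits_applied=yes
else
    echo "ccache not found on PATH; skipping" >&2
    limits_applied=no
fi
```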
The cache will be located in ${HOME}/.ccache unless otherwise specified. This can be changed by setting the CCACHE_DIR environment variable. The cache can be shared among several users, but care must be taken to keep the permissions of the cache consistent. And, of course, you have to trust all of the users with access to the cache. Everyone using the cache must be able to write to it, so a umask of 002 would probably be desirable. Also, on systems that use SysV style directory permissions (like Linux), the setgid bit needs to be set on the cache directory to ensure that all created subdirectories are owned by the cache group.
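The shared-cache setup described above might look roughly like the following. The location is hypothetical (a temporary directory stands in for a real shared path), and the group name in the commented-out chgrp line is also an assumption.

```shell
# Hypothetical shared location; override SHARED_CACHE to use a real path.
SHARED_CACHE=${SHARED_CACHE:-$(mktemp -d)/ccache}
mkdir -p "$SHARED_CACHE"

# Hand the directory to the group that shares the cache.
# "devel" is a hypothetical group name, so the line is left commented out.
# chgrp devel "$SHARED_CACHE"

# rwxrwsr-x: group-writable, with the setgid bit set so that new
# subdirectories inherit the cache group.
chmod 2775 "$SHARED_CACHE"

# Files created from this shell onward stay group-writable.
umask 002

# Point ccache at the shared directory.
CCACHE_DIR=$SHARED_CACHE
export CCACHE_DIR

ls -ld "$CCACHE_DIR"
```

Each user of the cache would need the umask and CCACHE_DIR settings in their own environment, for example in a shell startup file.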
Compiling
Like distcc, ccache runs the C/C++ preprocessor itself and then tries to speed up the compiling and assembling steps, in ccache's case by skipping them whenever possible. If for some reason all of your work is done in the preprocessor, ccache will not offer much benefit.
For Makefiles and configure scripts that honor the CC variable, ccache can be used by setting CC to "ccache gcc". For example:
CC="ccache gcc" ./configure
make
Similarly, CXX can be set to "ccache g++" for C++ code.
Another way to ensure that ccache is always called is to create a symlink to ccache under the name of your compiler, and to put that symlink before the real compiler in PATH. For example, if gcc is installed as /usr/bin/gcc, and /usr/local/bin is before /usr/bin in the path, you could create a symlink /usr/local/bin/gcc that points to ccache. When ccache is called as /usr/local/bin/gcc, it will search the path for the first executable program named gcc that is not a symlink to ccache.
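A minimal sketch of the symlink approach follows. To keep it runnable without root, a temporary directory stands in for /usr/local/bin, and the fallback path for the ccache binary assumes the --prefix=/usr/local install shown earlier; both are assumptions rather than requirements.

```shell
# Stand-in for /usr/local/bin (any directory ahead of the real compiler works).
BIN_DIR=$(mktemp -d)

# Locate ccache; fall back to the install prefix used in this article.
CCACHE_BIN=$(command -v ccache || echo /usr/local/bin/ccache)

# One symlink per compiler name that should go through the cache.
for compiler in gcc g++; do
    ln -s "$CCACHE_BIN" "$BIN_DIR/$compiler"
done

# Put the symlinks ahead of the real compilers.
PATH="$BIN_DIR:$PATH"
export PATH

ls -l "$BIN_DIR"
```

With this in place, a plain "gcc" or "g++" in any Makefile resolves to the symlink first, so no CC or CXX override is needed.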
Since ccache needs to know which arguments are options and which are files, it assumes that any argument that is also a valid filename is referring to that file. If you have filenames that look a lot like your compiler flags, you may need to use the --ccache-skip flag, which tells ccache that the argument following it is not really a file.
Real-world example
Kernel recompilation is a common task, so let’s see how this can be improved with ccache. The environment:
- Computer: AMD Athlon 900MHz, 768MB RAM
- Kernel Version: 2.4.21
- gcc version: 3.2.3
The same build script was used as for the distcc example, only modified slightly for Intel architectures:
#!/bin/bash
make mrproper
cp /boot/config .config
make oldconfig
make dep clean
make MAKE="make -j 5" CC="$1" bzImage
make MAKE="make -j 5" CC="$1" modules
Three runs were made: the first passing gcc as the argument to the build script, and the next two passing ccache gcc. Here are the results:
Using ccache with distcc
Since distcc and ccache do independent things, they can work together with ccache calling distcc, as in CC="ccache distcc gcc". You can also set the environment variable CCACHE_PREFIX to distcc, and then use "ccache gcc" for CC, and ccache will prefix distcc to the compiler command. Using this combination, only uncached code will be compiled, and code that is compiled will be sent over the network as needed.
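The CCACHE_PREFIX variant could be set up as in the sketch below. Only the environment is prepared here; the build itself is left as a comment. The DISTCC_HOSTS line is an assumption about how distcc is configured on your machines, and the host names in it are hypothetical.

```shell
# Ask ccache to put distcc in front of the real compiler on a cache miss.
CCACHE_PREFIX="distcc"
export CCACHE_PREFIX

# Build tools still only need to know about ccache.
CC="ccache gcc"
export CC

# distcc needs to know where to send jobs; the host names are hypothetical.
# export DISTCC_HOSTS="localhost buildhost1 buildhost2"

echo "cache hit:  object file comes straight from the cache"
echo "cache miss: ccache runs '$CCACHE_PREFIX gcc ...' and stores the result"

# Then build as usual, for example:
#   ./configure && make
```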
Resources and Further Reading
- ccache home page
- compilercache home page
- Towers of Hanoi in C preprocessor (not good for caching)