Imagine we’re developing a C++ library. We need to deliver this library to a client, and the agreement is to deliver it as a shared library for linux. How to do this?
Let’s define the setup and assumptions:
- we develop on linux system using well-known compiler, like gcc or clang;
- we use cmake as build system (backed by make or ninja);
- we know how to build for the target platform (for example the client provides a toolchain);
- the target platform may be different from the development platform, and the produced library may require an emulator to run.
The setup for our example project includes:
- libfoois the library we need to deliver;
- appis our internal developer app that uses- libfoo;
The code of example project can be found here.
Build configuration
cmake has a concept of Build Configurations
which controls what options are passed to the compiler.
There are 4 default configurations -
Debug, Release, RelWithDebInfo and MinSizeRel -
and CMAKE_BUILD_TYPE variable
controls this.
- Debug - for development use only. No optimizations, debug info included.
- Release - produces the final deliverable binary. Optimized for speed, no debug info included.
- RelWithDebInfo - same as Release, but includes debug info. Debug info significantly increases the size of the binary, but allows to analyze crash dumps.
- MinSizeRel - similar to Release, but optimized for size of the binary rather than execution speed.
We can consider shipping the library without debug info - Release or MinSizeRel configuration - if we don’t expect to receive and analyze crash dumps from the clients. If we do need to investigate crashes, we need debug symbols, but we also need optimizations, so RelWithDebInfo is our only option. But we don’t want to ship a library polluted with debug info.
The solution is to put debug info in a separate file.
Strip the binary
file command or
readelf tool
from binutils package can show if a binary contains debug info.
Let’s build the library in RelWithDebInfo mode:
cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo -DBUILD_SHARED_LIBS=ON ..
cmake --build .  # build succeeds
and inspect it:
$ file libfoo.so
libfoo.so: ELF 64-bit LSB shared object, ..., with debug_info, not stripped
$ readelf --sections libfoo.so | grep debug
...
  [28] .debug_aranges    PROGBITS         0000000000000000  000030b5
  [29] .debug_info       PROGBITS         0000000000000000  00003145
...
file prints with debug_info, and readelf --sections shows debug sections, which means
the library contains debug symbols. If we would use -DCMAKE_BUILD_TYPE=Release, there would be
no debug sections in readelf and file wouldn’t show with debug_info.
objcopy tool from binutils can
copy debug info into a separate file:
objcopy --only-keep-debug libfoo.so libfoo.so.debug
objcopy --strip-debug --add-gnu-debuglink=libfoo.so.debug libfoo.so
cmake usually provides CMAKE_OBJCOPY variable
that points to objcopy executable. We can use it to add a custom command to our cmake target
and extract debug info during the build.
If we rebuild the library with LIBFOO_STRIP=ON:
# `LIBFOO_STRIP` is a custom option in the example project
cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo -DBUILD_SHARED_LIBS=ON -DLIBFOO_STRIP=ON ..
cmake --build .  # build succeeds
and inspect the produced binary:
$ file libfoo.so
libfoo.so: ELF 64-bit LSB shared object, ..., not stripped
$ readelf --sections libfoo.so | grep debug
  [28] .gnu_debuglink    PROGBITS         0000000000000000  000030b8
readelf shows .gnu_debuglink section only which is a link to a file containing debug info
(caused by --add-gnu-debuglink option in the example project). file doesn’t show with debug_info,
but still shows not stripped - this means that our binary still contains additional unneeded info -
for example .symtab section. Unneeded sections can be removed if objcopy is invoked
with --strip-unneeded parameter instead of --strip-debug.
cmake --install has --strip option
which performs such aggressive stripping during installation. If we use it after the build:
# `LIBFOO_STRIP` is a custom option in the example project
cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo -DBUILD_SHARED_LIBS=ON -DLIBFOO_STRIP=ON ..
cmake --build .  # build succeeds
ctest
cmake --install . --prefix=../out --strip
and inspect the produced and installed binaries, we’ll see that the installed binary is finally stripped:
$ file libfoo.so
libfoo.so: ELF 64-bit LSB shared object, ..., not stripped
$ file ../out/lib/libfoo.so
../out/lib/libfoo.so: ELF 64-bit LSB shared object, ..., stripped
We need to save the file with debug info (libfoo.so.debug) for every shipped binary, then we will be able
to analyze crash dumps that customers may send to us.
Visibility of exported symbols
Shared libraries provide functionality via exported dynamic symbols.
If we build libfoo as shared
in Release mode:
cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON ..
cmake --build .  # build succeeds
ctest  # tests run successfully
we’ll get libfoo.so as our shared library.
To list dynamic symbols we can use nm tool
from binutils package:
$ nm --dynamic libfoo.so  # or "nm -D"
...
                 w _ITM_registerTMCloneTable
                 U _Unwind_Resume@GCC_3.0
00000000000012c0 T _ZN6libfoo3fooEv
00000000000013d0 T _ZN6libfoo4foo2Ev
0000000000001450 T _ZN6libfoo8internal12foo_internalB5cxx11Ev
                 U _ZNKSt5ctypeIcE13_M_widen_initEv@GLIBCXX_3.4.11
0000000000001440 W _ZNKSt5ctypeIcE8do_widenEc
                 U _ZNSo3putEc@GLIBCXX_3.4
                 U _ZNSo5flushEv@GLIBCXX_3.4
...
Looking at this list, there are 2 observations that raise questions:
- this list contains “ugly” names instead of pretty C++ names. This is called
name mangling and compilers do this to C++ symbols
to make them unique. We can add --demangleparameter tonmto get pretty symbols back:Another option is$ nm --dynamic --demangle libfoo.so # or "nm -DC" ... w _ITM_registerTMCloneTable U _Unwind_Resume@GCC_3.0 00000000000012c0 T libfoo::foo() 00000000000013d0 T libfoo::foo2() 0000000000001450 T libfoo::internal::foo_internal[abi:cxx11]() U std::ctype<char>::_M_widen_init() const@GLIBCXX_3.4.11 0000000000001440 W std::ctype<char>::do_widen(char) const U std::ostream::put(char)@GLIBCXX_3.4 U std::ostream::flush()@GLIBCXX_3.4 ...c++filttool frombinutils:nm --dynamic libfoo.so | c++filtwill give similar output.
- there are way too many symbols in the list including our internal symbols
(libfoo::internal::foo_internal) and symbols from library dependencies (std::ostream::flush()). This happens because by default all statically linked symbols are visible and exported from dynamic libraries. Users of the library can try to use these symbols which is undesirable. We need to limit symbols visibility to keep the public library API clean.
Let’s inspect nm output in more details.
Symbols with address in the first column are “real” symbols exported from the library.
Users that link against our library can use these symbols (call the functions) freely.
The second column is the type of the symbol. What’s important for now:
- Umeans “undefined symbol” - the symbol is required, and must be provided at runtime via dependencies (note- @GLIBCXX_3.4suffix for example).
- Tmeans global symbol “in .text section” - exported from the library.
- talso means a symbol “in .text section”, but it’s local and not exported (- nm --dynamicdoesn’t show them).
- w/- Wmeans “weak symbol”. When linking the final application, the linker will pick a non-weak symbol over weak symbols, and pick any weak symbol if no non-weak symbols exist. Typically, weak symbols are default constructors/destructors and templates instantiations. They don’t violate ODR rule, and the linker will eliminate duplicates.
Our goal is to have all symbols forming public API of our library to be exported (in dynamic section), and no other internal symbols should be exported.
Pass “version script” file to linker
Widely used linkers (like GNU ld, gold and mold or LLVM lld) support
version script files
via --version-script parameter. Version script files can be used to define visibility of symbols.
An example of such file to export symbols from libfoo:: namespace only can look like this:
{
    global:
        _ZN6libfoo*;
    local:
        *;
};
This file uses mangled symbol names by default, so we need know them upfront (by running nm for example).
It’s possible to specify the programming language of symbols
explicitly via extern "lang" directive and offload the mangling to the link time:
{
    global:
        extern "C++" {
            libfoo::*;
        };
    local:
        *;
};
Note: version script files can also be used to assign versions to symbols, so the dynamic linker can check the provided functionality of a library at runtime. But that’s out of scope for this page.
To pass a version script file to the linker we need to add -Wl,--version-script=FILENAME linker option
(or add this flag to LINK_FLAGS property
of the cmake target).
Let’s build the library and inspect exported symbols:
# `LIBFOO_USE_VERSION_SCRIPT` is a custom option in the example project
cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DLIBFOO_USE_VERSION_SCRIPT=ON ..
cmake --build .
ctest
nm --dynamic --demangle libfoo.so
...
                 U _Unwind_Resume@GCC_3.0
00000000000012c0 T libfoo::foo()
00000000000013d0 T libfoo::foo2()
0000000000001450 T libfoo::internal::foo_internal[abi:cxx11]()
                 U std::ctype<char>::_M_widen_init() const@GLIBCXX_3.4.11
...
We see that only symbols from libfoo:: namespace(s) are exported. But internal symbols
from libfoo::internal:: namespace are also exported, which we want to avoid.
And here comes the problem: it’s not possible to refine the filter by adding libfoo::internal::*
in the local section:
{
    global:
        extern "C++" {
            libfoo::*;
        };
    local:
        _ZN6libfoo8internal*;  # won't work :(
        *;
};
If a symbol matches any wild-star pattern in global section, this symbol
will not be checked
against patterns in local section.
One potential way to overcome this limitation is to list all symbols we want to export explicitly
without globbing, but that’s tedious work. A script to fetch symbols from nm output can be handy,
but requires additional effort.
Pros: no code changes required. Configuration lives in a separate file which can be dynamically created or adjusted.
Cons: limitation for visibility of nested namespaces.
Explicitly annotate exported symbols
A better way is to tell linker to hide all symbols by default and explicitly annotate symbols
we want to export. Use -fvisibility=hidden linker flag (or set CXX_VISIBILITY_PRESET hidden
cmake property) to make
all symbols hidden by default.
__attribute__((visibility("default"))) annotation
(for GCC
and clang)
marks symbols for exporting. We can define a macro to avoid typing it every time:
#define PUBLIC_API_FOO __attribute__((visibility("default")))
PUBLIC_API_FOO void foo();
It’s a common practice to annotate symbols in public header files, but these headers are also
usually shipped to customers, and customers don’t need this annotation in their code.
This macro needs to be defined to nothing when used outside of our build system.
cmake automatically provides <target>_EXPORTS
compiler definition
when a library is built as shared, so we can use it:
#ifdef foo_EXPORTS
#  define PUBLIC_API_FOO __attribute__((visibility("default")))
#else
#  define PUBLIC_API_FOO
#endif
PUBLIC_API_FOO void foo();
If we now build the library and inspect exported symbols:
# `LIBFOO_API_VISIBILITY` is a custom option in the example project
cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DLIBFOO_API_VISIBILITY=ON ..
cmake --build .
ctest
nm --dynamic --demangle libfoo.so
...
                 U _Unwind_Resume@GCC_3.0
00000000000012a0 T libfoo::foo()
                 U std::ctype<char>::_M_widen_init() const@GLIBCXX_3.4.11
...
we’ll see that only annotated symbols are exported.
Note: Don’t forget to #include public headers that define exported symbols
into compilable files (cpp/cxx/cc).
If a header file is never included in any translation unit, it’s not processed and effectively ignored.
Pros: all exported symbols are explicitly annotated. It’s a conscious decision and low risk of mistakes.
Cons:
- public headers are “polluted” with PUBLIC_API_FOOmacro, which is meaningless for clients;
- if different clients need to have access to different set of symbols, this approach requires bulky fine-tuning (for example, split API into categories and export different categories for different customers);
Exported symbols and testing
Hidden symbols are not visible for the users of the library. Tests (unit tests, components test, etc) are also users of the library, they cannot access hidden symbols.
Shared libraries need well-written interface tests to verify the produced binary. The rest of the testing can be performed on a dedicated build that doesn’t hide symbols.
Dependencies
Shared libraries as any other binaries may have dependencies on other shared libraries.
readelf tool can show what dependencies
our library has. Let’s build libfoo:
cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON ..
cmake --build .  # build succeeds
and inspect the produced library:
$ readelf --dynamic libfoo.so
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000e (SONAME)             Library soname: [libfoo.so]
...
NEEDED records are shared libraries that our library depends on.
Alternatively we can use ldd tool
to print all (including transitive) shared dependencies.
ldd is a runtime tool: it actually invokes the dynamic linker to find dependencies,
so it might not always work (for example if the library is cross-compiled for another architecture).
$ ldd libfoo.so
        linux-vdso.so.1 (0x00007ffc04386000)
        libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007734bbc00000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007734bbe9e000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007734bb800000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007734bbb19000)
        /lib64/ld-linux-x86-64.so.2 (0x00007734bbeca000)
If our library depends on another our library, we have to ship this dependency along with the library itself. A better way is statically link dependencies into the final shared library, that will simplify management a lot, but in some cases it’s not possible or allowed.
Note: I’m talking about first-party dependencies (dependencies that we as developers produce).
System dependencies should not be statically linked or packaged with the deliverables.
Third-party dependencies (like openssl) can follow both approaches and they should be handled on
case-by-case basis.
When cmake --build produces a library it embeds full paths to dependencies as RUNPATH records:
readelf --dynamic ...so
...
 0x0000000000000001 (NEEDED)             Shared library: [libbar.so]
 0x000000000000001d (RUNPATH)            Library runpath: [/home/andrey/projects/learning-playground/linux-shared-lib/build/libbar:]
...
which allows any executable in the project (tests or apps) to run without additional configuration,
but that’s not portable.
cmake --install strips these records and leaves just NEEDED record for each dependency:
readelf --dynamic installed/...so
...
 0x0000000000000001 (NEEDED)             Shared library: [libbar.so]
This moves the responsibility to provide runtime search paths to the final application. The clients application may be shipped along with its dependencies, or can be an application that expects dependencies to be in specific locations within the application bundle (like Android APK for example). In such cases better let the client deal with search paths.
Note: there’s a very good talk
“C++ Shared Libraries and Where To Find Them”
that explains RPATH/RUNPATH handling at compile time and runtime.
ABI versioning via SONAME
readelf --dynamic shows SONAME record, which contains a value similar to the filename of the shared library.
This value will be embedded into the client application as dynamic dependency when the app is linked
against our library. Even though the app links against libfoo.so during the build,
at runtime the app will look for a file with the name taken from SONAME record of libfoo.so.
This mechanism allows updates of libraries without rebuilding client applications. Libraries that use ABI version management are usually shipped with symlinks, for example:
libfoo.so -> libfoo.so.1  # symlink
libfoo.so.1 -> libfoo.so.1.0.0  # symlink
libfoo.so.1.0.0  # actual library file
and SONAME record of the library contains libfoo.so.1.
When an app is linked against libfoo.so, at runtime this app will look for libfoo.so.1 file
(value of SONAME record). This allows users to update libfoo to version 1.0.1 or 1.1.0
and the app will continue to work (as long as the update process updates symlinks:
libfoo.so.1 -> libfoo.so.1.1.0).
Users can even install multiple major versions of the same library (1.1.0 and 2.0.0) and
apps will be able to find the correct dependency at runtime
(one app that depends on libfoo.so.1 will pick libfoo.so.1.1.0 while another app
that depends on libfoo.so.2 will pick libfoo.so.2.0.0).
Note: it’s the responsibility of the library authors to actually maintain ABI compatibility.
In cmake this can be configured via VERSION and SOVERSION properties.
Let’s build libfoo:
# `LIBFOO_VERSIONING` is a custom option in the example project
cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo -DBUILD_SHARED_LIBS=ON -DLIBFOO_VERSIONING=ON ..
cmake --build .
and inspect the produced files:
$ ls -la *.so*
lrwxrwxrwx 1 user user     11 Sep  2 19:56 libfoo.so -> libfoo.so.1*
lrwxrwxrwx 1 user user     15 Sep  2 19:56 libfoo.so.1 -> libfoo.so.1.2.3*
-rwxr-xr-x 1 user user 106840 Sep  2 19:56 libfoo.so.1.2.3*
$ readelf --dynamic libfoo.so | grep SONAME
 0x000000000000000e (SONAME)             Library soname: [libfoo.so.1]
If an app is linked against libfoo.so, it will depend at runtime on libfoo.so.1:
$ readelf --dynamic ./app
...
 0x0000000000000001 (NEEDED)             Shared library: [libfoo.so.1]
...
This might be useful if the shared library that we deliver may be updated on-the-fly, and client apps must continue to work. If the client app is used as a single package and library updates can’t happen (for example if our library is packaged in an Android APK file), this versioning can be safely ignored.
Usage
When clients want to use our library, they need to link against libfoo.so and
add the path to public headers of our library to their include path.
cmake has a concept of imported targets for this purpose:
add_library(foo SHARED IMPORTED)
set_target_properties(foo PROPERTIES
    IMPORTED_LOCATION path/to/libfoo.so
    IMPORTED_SONAME libfoo.so
)
target_include_directories(foo INTERFACE path/to/libfoo/headers)
IMPORTED_SONAME property
must match SONAME record in libfoo.so.
After that foo target can be used as any other target:
target_link_libraries(app PRIVATE foo)
Let’s build and install libfoo:
cd libfoo/build
cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DLIBFOO_API_VISIBILITY=ON -DLIBFOO_STRIP=ON ..
cmake --build .  # build succeeds
ctest  # tests run and pass - the library is usable
cmake --install . --prefix=../out --strip  # install libfoo into 'libfoo/out'
build the app:
cd app/build
# `LIBFOO_BASE_DIR` is a custom option in the example project
cmake -DCMAKE_BUILD_TYPE=Release -DLIBFOO_BASE_DIR=../libfoo/out ..
cmake --build .
./app  # runs and prints output
and inspect the executable:
$ readelf --dynamic app
...
 0x0000000000000001 (NEEDED)             Shared library: [libfoo.so]
 0x000000000000001d (RUNPATH)            Library runpath: [/home/andrey/projects/learning-playground/linux-shared-lib/libfoo/out/lib:]
...
$ ldd app
        linux-vdso.so.1 (0x00007ffe6416f000)
        libfoo.so => /home/andrey/projects/learning-playground/linux-shared-lib/libfoo/out/lib/libfoo.so (0x0000771d1f915000)
...
This is the executable in the cmake build tree, it contains RUNPATH to locate the exact library
it was linked with.
To make this application portable, it needs to be installed via cmake --install:
cmake --install . --prefix=../out --strip
readelf --dynamic ../out/bin/app
...
 0x0000000000000001 (NEEDED)             Shared library: [libfoo.so]
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
...
It has no RPATH/RUNPATH records by default. If we try to run the app, it will fail:
$ ../out/bin/app
../out/bin/app: error while loading shared libraries: libfoo.so: cannot open shared object file: No such file or directory
And that’s expected, because libfoo.so is not in any standard search path of the system.
We need to either explicitly set relative RPATH for the application during the build and
put our libraries there,
or use LD_LIBRARY_PATH environment variable:
linux-shared-lib$ LD_LIBRARY_PATH=libfoo/out/lib/ ./app/out/bin/app
Hello world!
cmake project for the client application can be configured to also copy shared libraries from dependencies, copy debug info files, and more, but that’s out of scope for this page.
References
- StackOverflow: What are CMAKE_BUILD_TYPE: Debug, Release, RelWithDebInfo and MinSizeRel?
- Controlling the Exported Symbols of Shared Libraries
- Stripped binaries (wiki)
- C++ Shared Libraries and Where To Find Them
- GNU Wiki: Visibility attribute
- Slides for “An introduction to building and using shared libraries” talk on Linux Conf 2006 (see this slide for symbol versioning via version script)
- How to write shared libraries - a paper by Ulrich Drepper