Imagine we’re developing a C++ library. We need to deliver this library to a client, and the agreement is to deliver it as a shared library for Linux. How do we do this?
Let’s define the setup and assumptions:
- we develop on a Linux system using a well-known compiler, like gcc or clang;
- we use cmake as the build system (backed by make or ninja);
- we know how to build for the target platform (for example the client provides a toolchain);
- the target platform may be different from the development platform, and the produced library may require an emulator to run.
The setup for our example project includes:
- libfoo is the library we need to deliver;
- app is our internal developer app that uses libfoo.
The code of the example project can be found here.
Build configuration
cmake has a concept of Build Configurations, which control what options are passed to the compiler. There are 4 default configurations (Debug, Release, RelWithDebInfo and MinSizeRel), and the CMAKE_BUILD_TYPE variable selects which one is used.
- Debug - for development use only. No optimizations, debug info included.
- Release - produces the final deliverable binary. Optimized for speed, no debug info included.
- RelWithDebInfo - same as Release, but includes debug info. Debug info significantly increases the size of the binary, but makes it possible to analyze crash dumps.
- MinSizeRel - similar to Release, but optimized for size of the binary rather than execution speed.
We can consider shipping the library without debug info - Release or MinSizeRel configuration - if we don’t expect to receive and analyze crash dumps from the clients. If we do need to investigate crashes, we need debug symbols, but we also need optimizations, so RelWithDebInfo is our only option. But we don’t want to ship a library polluted with debug info.
The solution is to put debug info in a separate file.
Strip the binary
The file command or the readelf tool from the binutils package can show whether a binary contains debug info.
Let’s build the library in RelWithDebInfo mode:
cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo -DBUILD_SHARED_LIBS=ON ..
cmake --build . # build succeeds
and inspect it:
$ file libfoo.so
libfoo.so: ELF 64-bit LSB shared object, ..., with debug_info, not stripped
$ readelf --sections libfoo.so | grep debug
...
[28] .debug_aranges PROGBITS 0000000000000000 000030b5
[29] .debug_info PROGBITS 0000000000000000 00003145
...
The file command prints with debug_info, and readelf --sections shows debug sections, which means the library contains debug symbols. If we had used -DCMAKE_BUILD_TYPE=Release, there would be no debug sections in the readelf output and file wouldn’t show with debug_info.
The objcopy tool from binutils can copy debug info into a separate file:
objcopy --only-keep-debug libfoo.so libfoo.so.debug
objcopy --strip-debug --add-gnu-debuglink=libfoo.so.debug libfoo.so
cmake usually provides the CMAKE_OBJCOPY variable that points to the objcopy executable. We can use it to add a custom command to our cmake target and extract debug info during the build.
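As a rough illustration, such a custom command could look like the sketch below. It assumes the library target is called foo and is guarded by an option named LIBFOO_STRIP; the example project may wire this up differently.

```cmake
# Sketch only: the target name `foo` and the option name are assumptions.
option(LIBFOO_STRIP "Extract debug info into a separate file" OFF)

# Must live in the same directory as the add_library(foo ...) call.
if(LIBFOO_STRIP AND CMAKE_OBJCOPY)
  add_custom_command(TARGET foo POST_BUILD
    # Copy the debug sections into <library file>.debug next to the library.
    COMMAND ${CMAKE_OBJCOPY} --only-keep-debug
            $<TARGET_FILE:foo> $<TARGET_FILE:foo>.debug
    # Drop the debug sections from the library and record a link to the debug file.
    COMMAND ${CMAKE_OBJCOPY} --strip-debug
            --add-gnu-debuglink=$<TARGET_FILE:foo>.debug $<TARGET_FILE:foo>
    COMMENT "Extracting debug info from libfoo"
    VERBATIM)
endif()
```

Because the commands run as a POST_BUILD step, the .debug file is regenerated every time the library is relinked.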
If we rebuild the library with LIBFOO_STRIP=ON:
# `LIBFOO_STRIP` is a custom option in the example project
cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo -DBUILD_SHARED_LIBS=ON -DLIBFOO_STRIP=ON ..
cmake --build . # build succeeds
and inspect the produced binary:
$ file libfoo.so
libfoo.so: ELF 64-bit LSB shared object, ..., not stripped
$ readelf --sections libfoo.so | grep debug
[28] .gnu_debuglink PROGBITS 0000000000000000 000030b8
readelf shows only the .gnu_debuglink section, which is a link to the file containing debug info (created by the --add-gnu-debuglink option in the example project). file no longer shows with debug_info, but still shows not stripped. This means that our binary still contains additional unneeded info, for example the .symtab section. Such unneeded info can be removed if objcopy is invoked with the --strip-unneeded parameter instead of --strip-debug.
cmake --install has a --strip option which performs such aggressive stripping during installation. If we use it after the build:
# `LIBFOO_STRIP` is a custom option in the example project
cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo -DBUILD_SHARED_LIBS=ON -DLIBFOO_STRIP=ON ..
cmake --build . # build succeeds
ctest
cmake --install . --prefix=../out --strip
and inspect the produced and installed binaries, we’ll see that the installed binary is finally stripped:
$ file libfoo.so
libfoo.so: ELF 64-bit LSB shared object, ..., not stripped
$ file ../out/lib/libfoo.so
../out/lib/libfoo.so: ELF 64-bit LSB shared object, ..., stripped
We need to save the file with debug info (libfoo.so.debug) for every shipped binary, so that we are able to analyze crash dumps that customers may send to us.
Visibility of exported symbols
Shared libraries provide functionality via exported dynamic symbols.
If we build libfoo as a shared library in Release mode:
cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON ..
cmake --build . # build succeeds
ctest # tests run successfully
we’ll get libfoo.so as our shared library.
To list dynamic symbols we can use the nm tool from the binutils package:
$ nm --dynamic libfoo.so # or "nm -D"
...
w _ITM_registerTMCloneTable
U _Unwind_Resume@GCC_3.0
00000000000012c0 T _ZN6libfoo3fooEv
00000000000013d0 T _ZN6libfoo4foo2Ev
0000000000001450 T _ZN6libfoo8internal12foo_internalB5cxx11Ev
U _ZNKSt5ctypeIcE13_M_widen_initEv@GLIBCXX_3.4.11
0000000000001440 W _ZNKSt5ctypeIcE8do_widenEc
U _ZNSo3putEc@GLIBCXX_3.4
U _ZNSo5flushEv@GLIBCXX_3.4
...
Looking at this list, there are 2 observations that raise questions:
- this list contains “ugly” names instead of pretty C++ names. This is called name mangling, and compilers do this to C++ symbols to make them unique. We can add the --demangle parameter to nm to get pretty symbols back:
  $ nm --dynamic --demangle libfoo.so # or "nm -DC"
  ...
  w _ITM_registerTMCloneTable
  U _Unwind_Resume@GCC_3.0
  00000000000012c0 T libfoo::foo()
  00000000000013d0 T libfoo::foo2()
  0000000000001450 T libfoo::internal::foo_internal[abi:cxx11]()
  U std::ctype<char>::_M_widen_init() const@GLIBCXX_3.4.11
  0000000000001440 W std::ctype<char>::do_widen(char) const
  U std::ostream::put(char)@GLIBCXX_3.4
  U std::ostream::flush()@GLIBCXX_3.4
  ...
  Another option is the c++filt tool from binutils: nm --dynamic libfoo.so | c++filt will give similar output.
- there are way too many symbols in the list, including our internal symbols (libfoo::internal::foo_internal) and symbols that originate from library dependencies (std::ostream::flush() is listed as an undefined reference, while the weak std::ctype<char>::do_widen is actually exported). This happens because by default every symbol with external linkage that ends up in the library, including symbols from statically linked code, is visible and exported from a shared library. Users of the library can try to use these symbols, which is undesirable. We need to limit symbol visibility to keep the public library API clean.
Let’s inspect the nm output in more detail.
Symbols with an address in the first column are “real” symbols exported from the library.
Users that link against our library can use these symbols (call the functions) freely.
The second column is the type of the symbol. What’s important for now:
- U means “undefined symbol” - the symbol is required, and must be provided at runtime via dependencies (note the @GLIBCXX_3.4 suffix for example).
- T means global symbol “in .text section” - exported from the library.
- t also means a symbol “in .text section”, but it’s local and not exported (nm --dynamic doesn’t show them).
- w/W means “weak symbol”. When linking the final application, the linker will pick a non-weak symbol over weak symbols, and pick any weak symbol if no non-weak symbols exist. Typically, weak symbols are inline functions (including implicitly generated constructors/destructors) and template instantiations. They don’t violate the ODR, and the linker will eliminate duplicates.
Our goal is to export all symbols that form the public API of our library (so that they appear in the dynamic section) and nothing else.
Pass “version script” file to linker
Widely used linkers (like GNU ld and gold or LLVM lld) support version script files via the --version-script parameter. Version script files can be used to define the visibility of symbols.
An example of such a file that exports symbols from the libfoo:: namespace only can look like this:
{
  global:
    _ZN6libfoo*;
  local:
    *;
};
This file uses mangled symbol names, so we need to know them upfront (by running nm for example).
To pass a version script file to the linker we need to add the -Wl,--version-script=FILENAME linker option (or add this flag to the LINK_FLAGS property of the cmake target).
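In cmake this might be wired up as in the following sketch. The target name foo and the file name libfoo.version are assumptions made for illustration; target_link_options with the LINKER: prefix (cmake 3.13 or newer) is used here as an alternative to editing LINK_FLAGS directly.

```cmake
# Sketch only: target `foo` and the script file name are hypothetical.
set(LIBFOO_VERSION_SCRIPT "${CMAKE_CURRENT_SOURCE_DIR}/libfoo.version")

# LINKER: is translated to the compiler driver's syntax, e.g. -Wl,--version-script=...
target_link_options(foo PRIVATE
  "LINKER:--version-script=${LIBFOO_VERSION_SCRIPT}")

# Relink the library whenever the version script changes.
set_property(TARGET foo APPEND PROPERTY LINK_DEPENDS "${LIBFOO_VERSION_SCRIPT}")
```

Without the LINK_DEPENDS entry, edits to the version script would not trigger a relink, and the exported symbol list could silently go stale.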
Let’s build the library and inspect exported symbols:
# `LIBFOO_USE_VERSION_SCRIPT` is a custom option in the example project
cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DLIBFOO_USE_VERSION_SCRIPT=ON ..
cmake --build .
ctest
nm --dynamic --demangle libfoo.so
...
U _Unwind_Resume@GCC_3.0
00000000000012c0 T libfoo::foo()
00000000000013d0 T libfoo::foo2()
0000000000001450 T libfoo::internal::foo_internal[abi:cxx11]()
U std::ctype<char>::_M_widen_init() const@GLIBCXX_3.4.11
...
We see that only symbols from the libfoo:: namespace(s) are exported. But internal symbols from the libfoo::internal:: namespace are also exported, which we want to avoid.
And here comes the problem: it’s not possible to refine the filter by adding libfoo::internal::* to the local section:
{
  global:
    _ZN6libfoo*;
  local:
    _ZN6libfoo8internal*; # won't work :(
    *;
};
If a symbol matches any wildcard pattern in the global section, this symbol will not be checked against patterns in the local section.
One potential way to overcome this limitation is to explicitly list all symbols we want to export, but that’s tedious work. A script that fetches symbols from the nm output can be handy, but requires additional effort.
Pros: no code changes required. Configuration lives in a separate file which can be dynamically created or adjusted.
Cons: limitation for visibility of nested namespaces.
Explicitly annotate exported symbols
A better way is to tell the compiler to hide all symbols by default and explicitly annotate the symbols we want to export. Use the -fvisibility=hidden compiler flag (or set the CXX_VISIBILITY_PRESET hidden cmake property) to make all symbols hidden by default.
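A minimal sketch of the cmake side, assuming the library target is called foo. VISIBILITY_INLINES_HIDDEN is an optional extra that is not required by the approach described here; it is shown because the two properties are often set together.

```cmake
# Sketch only: the target name `foo` is an assumption.
set_target_properties(foo PROPERTIES
  CXX_VISIBILITY_PRESET hidden     # adds -fvisibility=hidden when compiling foo
  VISIBILITY_INLINES_HIDDEN ON)    # optional: adds -fvisibility-inlines-hidden
```

Setting the properties per target keeps the change local to the library; setting the CMAKE_CXX_VISIBILITY_PRESET variable before the targets are created would apply the same default to all of them.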
The __attribute__((visibility("default"))) annotation (for GCC and clang) marks symbols for exporting. We can define a macro to avoid typing it every time:
#define PUBLIC_API_FOO __attribute__((visibility("default")))
PUBLIC_API_FOO void foo();
It’s a common practice to annotate symbols in public header files, but these headers are also
usually shipped to customers, and customers don’t need this annotation in their code.
This macro needs to be defined to nothing when used outside of our build system.
cmake automatically provides the <target>_EXPORTS compiler definition when a library is built as shared, so we can use it:
#ifdef foo_EXPORTS
# define PUBLIC_API_FOO __attribute__((visibility("default")))
#else
# define PUBLIC_API_FOO
#endif
PUBLIC_API_FOO void foo();
If we now build the library and inspect exported symbols:
# `LIBFOO_API_VISIBILITY` is a custom option in the example project
cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DLIBFOO_API_VISIBILITY=ON ..
cmake --build .
ctest
nm --dynamic --demangle libfoo.so
...
U _Unwind_Resume@GCC_3.0
00000000000012a0 T libfoo::foo()
U std::ctype<char>::_M_widen_init() const@GLIBCXX_3.4.11
...
we’ll see that only annotated symbols are exported.
Note: Don’t forget to #include public headers that declare exported symbols into compilable files (cpp/cxx/cc). If a header file is never included in any translation unit, it’s not processed and is effectively ignored.
Pros: all exported symbols are explicitly annotated. Exporting is a conscious decision, so the risk of mistakes is low.
Cons:
- public headers are “polluted” with the PUBLIC_API_FOO macro, which is meaningless for clients;
- if different clients need access to different sets of symbols, this approach requires bulky fine-tuning (for example, splitting the API into categories and exporting different categories for different customers).
Exported symbols and testing
Hidden symbols are not visible to the users of the library. Tests (unit tests, component tests, etc.) are also users of the library, so they cannot access hidden symbols.
Shared libraries need well-written interface tests to verify the produced binary. The rest of the testing can be performed on a dedicated build that doesn’t hide symbols.
Dependencies
Shared libraries, like any other binaries, may depend on other shared libraries. The readelf tool can show what dependencies our library has. Let’s build libfoo:
cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON ..
cmake --build . # build succeeds
and inspect the produced library:
$ readelf --dynamic libfoo.so
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libstdc++.so.6]
0x0000000000000001 (NEEDED) Shared library: [libgcc_s.so.1]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x000000000000000e (SONAME) Library soname: [libfoo.so]
...
NEEDED records are shared libraries that our library depends on.
Alternatively we can use the ldd tool to print all (including transitive) shared dependencies. ldd is a runtime tool: it actually invokes the dynamic linker to find dependencies, so it might not always work (for example if the library is cross-compiled for another architecture).
$ ldd libfoo.so
linux-vdso.so.1 (0x00007ffc04386000)
libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007734bbc00000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007734bbe9e000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007734bb800000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007734bbb19000)
/lib64/ld-linux-x86-64.so.2 (0x00007734bbeca000)
If our library depends on another library of ours, we have to ship this dependency along with the library itself. A better way is to statically link dependencies into the final shared library. That simplifies management a lot, but in some cases it’s not possible or allowed.
Note: I’m talking about first-party dependencies (dependencies that we as developers produce). System dependencies should not be statically linked or packaged with the deliverables. Third-party dependencies (like openssl) can follow either approach and should be handled on a case-by-case basis.
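For a first-party dependency, static linking into the shared library might look like the sketch below. The target names and source file names (bar, bar.cpp, foo.cpp) are hypothetical and not taken from the example project.

```cmake
# Sketch only: target and source file names are assumptions.
# The explicit STATIC keyword overrides BUILD_SHARED_LIBS=ON for this target.
add_library(bar STATIC bar.cpp)

# Object code that ends up inside a shared library must be position independent.
set_target_properties(bar PROPERTIES POSITION_INDEPENDENT_CODE ON)

add_library(foo SHARED foo.cpp)

# PRIVATE: bar's code is linked into libfoo.so and is not a link dependency of foo's users.
target_link_libraries(foo PRIVATE bar)
```

With this layout libfoo.so has no NEEDED record for bar. Symbols from bar follow the same visibility rules as the library’s own symbols, so they stay exported unless they are hidden.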
When cmake --build produces a library, it embeds full paths to dependencies as RUNPATH records:
readelf --dynamic ...so
...
0x0000000000000001 (NEEDED) Shared library: [libbar.so]
0x000000000000001d (RUNPATH) Library runpath: [/home/andrey/projects/learning-playground/linux-shared-lib/build/libbar:]
...
which allows any executable in the project (tests or apps) to run without additional configuration,
but that’s not portable.
cmake --install strips these records and leaves just a NEEDED record for each dependency:
readelf --dynamic installed/...so
...
0x0000000000000001 (NEEDED) Shared library: [libbar.so]
This moves the responsibility to provide runtime search paths to the final application. The client’s application may be shipped along with its dependencies, or can be an application that expects dependencies to be in specific locations within the application bundle (like an Android APK for example). In such cases it’s better to let the client deal with search paths.
Note: there’s a very good talk “C++ Shared Libraries and Where To Find Them” that explains RPATH/RUNPATH handling at compile time and runtime.
ABI versioning via SONAME
readelf --dynamic shows a SONAME record, which contains a value similar to the filename of the shared library. This value will be embedded into the client application as a dynamic dependency when the app is linked against our library. Even though the app links against libfoo.so during the build, at runtime the app will look for a file with the name taken from the SONAME record of libfoo.so.
This mechanism allows updates of libraries without rebuilding client applications. Libraries that use ABI version management are usually shipped with symlinks, for example:
libfoo.so -> libfoo.so.1 # symlink
libfoo.so.1 -> libfoo.so.1.0.0 # symlink
libfoo.so.1.0.0 # actual library file
and the SONAME record of the library contains libfoo.so.1.
When an app is linked against libfoo.so, at runtime this app will look for the libfoo.so.1 file (the value of the SONAME record). This allows users to update libfoo to version 1.0.1 or 1.1.0 and the app will continue to work (as long as the update process updates the symlink: libfoo.so.1 -> libfoo.so.1.1.0).
Users can even install multiple major versions of the same library (1.1.0 and 2.0.0) and apps will be able to find the correct dependency at runtime (one app that depends on libfoo.so.1 will pick libfoo.so.1.1.0 while another app that depends on libfoo.so.2 will pick libfoo.so.2.0.0).
Note: it’s the responsibility of the library authors to actually maintain ABI compatibility.
In cmake this can be configured via the VERSION and SOVERSION target properties.
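A minimal sketch of these two properties, assuming the library target is called foo. The version numbers are chosen to match the file names shown in this section; how the example project sets them is not shown here.

```cmake
# Sketch only: the target name `foo` is an assumption.
set_target_properties(foo PROPERTIES
  VERSION 1.2.3    # real file name: libfoo.so.1.2.3
  SOVERSION 1)     # SONAME record and symlink: libfoo.so.1
```

cmake creates the libfoo.so and libfoo.so.1 symlinks in the build tree, and an install(TARGETS foo ...) rule installs them along with the real file.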
Let’s build libfoo:
# `LIBFOO_VERSIONING` is a custom option in the example project
cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo -DBUILD_SHARED_LIBS=ON -DLIBFOO_VERSIONING=ON ..
cmake --build .
and inspect the produced files:
$ ls -la *.so*
lrwxrwxrwx 1 user user 11 Sep 2 19:56 libfoo.so -> libfoo.so.1*
lrwxrwxrwx 1 user user 15 Sep 2 19:56 libfoo.so.1 -> libfoo.so.1.2.3*
-rwxr-xr-x 1 user user 106840 Sep 2 19:56 libfoo.so.1.2.3*
$ readelf --dynamic libfoo.so | grep SONAME
0x000000000000000e (SONAME) Library soname: [libfoo.so.1]
If an app is linked against libfoo.so, it will depend at runtime on libfoo.so.1:
$ readelf --dynamic ./app
...
0x0000000000000001 (NEEDED) Shared library: [libfoo.so.1]
...
This might be useful if the shared library that we deliver may be updated on-the-fly, and client apps must continue to work. If the client app is used as a single package and library updates can’t happen (for example if our library is packaged in an Android APK file), this versioning can be safely ignored.
Usage
When clients want to use our library, they need to link against libfoo.so and add the path to the public headers of our library to their include path. cmake has a concept of imported targets for this purpose:
add_library(foo SHARED IMPORTED)
set_target_properties(foo PROPERTIES
  IMPORTED_LOCATION path/to/libfoo.so
  IMPORTED_SONAME libfoo.so
)
target_include_directories(foo INTERFACE path/to/libfoo/headers)
The IMPORTED_SONAME property must match the SONAME record in libfoo.so.
After that the foo target can be used like any other target:
target_link_libraries(app PRIVATE foo)
Let’s build and install libfoo:
cd libfoo/build
cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_SHARED_LIBS=ON -DLIBFOO_API_VISIBILITY=ON -DLIBFOO_STRIP=ON ..
cmake --build . # build succeeds
ctest # tests run and pass - the library is usable
cmake --install . --prefix=../out --strip # install libfoo into 'libfoo/out'
build the app:
cd app/build
# `LIBFOO_BASE_DIR` is a custom option in the example project
cmake -DCMAKE_BUILD_TYPE=Release -DLIBFOO_BASE_DIR=../libfoo/out ..
cmake --build .
./app # runs and prints output
and inspect the executable:
$ readelf --dynamic app
...
0x0000000000000001 (NEEDED) Shared library: [libfoo.so]
0x000000000000001d (RUNPATH) Library runpath: [/home/andrey/projects/learning-playground/linux-shared-lib/libfoo/out/lib:]
...
$ ldd app
linux-vdso.so.1 (0x00007ffe6416f000)
libfoo.so => /home/andrey/projects/learning-playground/linux-shared-lib/libfoo/out/lib/libfoo.so (0x0000771d1f915000)
...
This is the executable in the cmake build tree. It contains a RUNPATH record to locate the exact library it was linked with.
To make this application portable, it needs to be installed via cmake --install:
cmake --install . --prefix=../out --strip
readelf --dynamic ../out/bin/app
...
0x0000000000000001 (NEEDED) Shared library: [libfoo.so]
0x0000000000000001 (NEEDED) Shared library: [libstdc++.so.6]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
...
It has no RPATH/RUNPATH records by default. If we try to run the app, it will fail:
$ ../out/bin/app
../out/bin/app: error while loading shared libraries: libfoo.so: cannot open shared object file: No such file or directory
And that’s expected, because libfoo.so is not in any standard search path of the system. We need to either explicitly set a relative RPATH for the application during the build and put our libraries there, or use the LD_LIBRARY_PATH environment variable:
linux-shared-lib$ LD_LIBRARY_PATH=libfoo/out/lib/ ./app/out/bin/app
Hello world!
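The relative RPATH option mentioned above could be sketched as follows. It assumes the executable target is called app and that the installed layout is <prefix>/bin/app with the libraries copied to <prefix>/lib; both are assumptions for illustration.

```cmake
# Sketch only: target name and install layout are assumptions.
# $ORIGIN is expanded by the dynamic linker to the directory containing the executable.
set_target_properties(app PROPERTIES
  INSTALL_RPATH "$ORIGIN/../lib")
```

cmake --install then writes this value into the installed executable instead of removing the search path entirely, so the app finds libfoo.so as long as the bin and lib directories are moved together.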
The cmake project for the client application can be configured to also copy shared libraries from dependencies, copy debug info files, and more, but that’s out of scope for this page.