Heads up - New ELF check in nightly check_rtime tests
Date: Tue, 22 May 2007 12:10:12 -0600
From: Ali Bahrami <Ali.Bahrami at Sun dot COM>
To: onnv-gate at onnv dot eng dot sun dot com
My putback of
6518331 Eliminate duplicate addresses from ON ELF symbol sort sections
adds a new check to the check_rtime script
(usr/src/tools/scripts/check_rtime.pl), which runs when nightly
builds are run with the -r flag.
A typical message from this check in nightly build output looks like:
./lib/amd64/libc.so.1: .SUNW_dynsymsort: duplicate 0x00000000000de9c0:
membar_enter, membar_exit
This message says that the two functions, membar_enter() and membar_exit(),
are really a single function that is known by two names. This represents
an ambiguity for observability tools (DTrace, debuggers) that attempt to
map addresses to names, since either name is valid. As a result, our
tools may end up presenting confusing or suboptimal results to the user.
The change to check_rtime is intended to prevent this situation from
occurring in ON.
You may safely ignore the rest of this heads up message if you don't do
any of the following things:
- Use C++
- Use Assembler
- Use the C "#pragma weak" directive
- Work on libc
Please read on if you do any of the above, are a gatekeeper,
or are generally interested in low level ELF details.
----------------------------------------------------------------------------
ELF symbol sort sections were added to Solaris with
PSARC 2007/026 ELF symbol sort sections
6475344 DTrace needs ELF function and data symbols sorted by address
The underlying idea is that our ELF files need to have a section that gives
access to the symbols in the dynamic symbol tables, sorted in order of
increasing address, and omitting those symbols that do not reference a
variable or function. DTrace will (in time) use these sections to map
addresses to function or variable names, in O(log n) time, from within
the kernel.
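To make the O(log n) lookup concrete, here is a minimal sketch of an
address-to-name search over a table that is already sorted by address.
The sortsym_t type and the addr_to_name() function are hypothetical
names invented for this illustration. The sketch flattens each entry
into an (address, size, name) record for readability and does not
reproduce the on-disk layout of the real sort sections.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of one sorted symbol entry. */
typedef struct {
    uint64_t addr;      /* starting address of the function or variable */
    uint64_t size;      /* size in bytes */
    const char *name;   /* symbol name */
} sortsym_t;

/*
 * Return the name of the symbol whose [addr, addr + size) range
 * contains 'target', or NULL if there is none. 'tab' must be sorted
 * in order of increasing addr.
 */
const char *
addr_to_name(const sortsym_t *tab, size_t n, uint64_t target)
{
    size_t lo = 0, hi = n;

    /* Binary search for the first entry with addr > target. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (tab[mid].addr <= target)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo == 0)
        return (NULL);      /* target is below every symbol */

    /* The candidate is the last entry with addr <= target. */
    const sortsym_t *s = &tab[lo - 1];
    return (target - s->addr < s->size ? s->name : NULL);
}
```

Note that if two entries in the table share an address, the name this
sketch returns depends only on the order in which they happen to
appear.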
ELF allows more than one symbol to reference a given address, so a
given address can be represented by more than one entry in a symbol
sort section. In principle, duplicates are harmless,
because every symbol represents a valid name for the address it
references. However:
- The names displayed by DTrace and other utilities should be the
ones that users would expect to see (e.g. foo()), rather than
an internal name it happens to be aliased to for implementation
reasons (e.g. _private_foox()).
- The same name should be used consistently, even when the libraries
are rebuilt and the number of symbols changes, and if possible across
releases.
- The duplication is a waste of space.
The overall goal is for our observability tools to show users a
consistent set of public names, when possible.
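The kind of duplicate being described can be found with a single pass
over a table sorted by address. The following is only a sketch of the
idea in C. It is not the check_rtime.pl implementation, and the
symaddr_t type and report_duplicates() function are names made up for
this example. The output line imitates the tail of the message format
used by the check.

```c
#include <stdio.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical (address, name) pair taken from a sorted symbol table. */
typedef struct {
    uint64_t addr;
    const char *name;
} symaddr_t;

/*
 * Scan 'tab', which must be sorted by address, and print one line for
 * each address that is known by more than one name. Returns the number
 * of such addresses.
 */
size_t
report_duplicates(const symaddr_t *tab, size_t n, FILE *out)
{
    size_t ndup = 0;
    size_t i = 0;

    while (i < n) {
        size_t j = i + 1;

        /* Extend the run of entries sharing tab[i].addr. */
        while (j < n && tab[j].addr == tab[i].addr)
            j++;

        if (j - i > 1) {
            fprintf(out, "duplicate 0x%016llx:",
                (unsigned long long)tab[i].addr);
            for (size_t k = i; k < j; k++)
                fprintf(out, "%s %s", k == i ? "" : ",", tab[k].name);
            fputc('\n', out);
            ndup++;
        }
        i = j;
    }
    return (ndup);
}
```

Because the table is sorted, all names for one address are adjacent,
so no extra memory or second pass is needed.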
I know of three ways in which duplicate, or alias, symbols can
occur:
- In assembly language, it is easy to give a single block
of code more than one name. The main use for this feature
seems to be to eliminate duplicate code without introducing
additional calling overhead.
- In C, the "#pragma weak" directive can be used to create
symbols that alias a given symbol. (Solaris uses this
feature heavily via synonyms.h. Note that we hope to see
much of this go away in favor of the use of direct bindings).
- C++ compilers use symbol aliasing to implement inheritance
under some circumstances.
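As an illustration of the "#pragma weak" case, the following sketch
creates two names for one function body and confirms that both resolve
to the same address. It assumes a compiler and object format that
accept the "#pragma weak alias = target" form (the Sun compilers, and
GCC on ELF targets); other toolchains may reject it. The function
names, including the names_share_address() helper, are hypothetical.

```c
/* Prototypes for both names. */
int bar(int);
int foo(int);

/* Make foo a weak alias for bar: two symbols, one address. */
#pragma weak foo = bar

int
bar(int arg)
{
    return (arg + 1);
}

/* Returns nonzero when both names resolve to the same address. */
int
names_share_address(void)
{
    /* volatile keeps the compiler from folding the comparison. */
    int (*volatile pf)(int) = foo;
    int (*volatile pb)(int) = bar;

    return (pf == pb);
}
```

A tool that maps this address back to a name has no basis for
preferring foo over bar, which is the ambiguity at issue.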
Most of the work for 6518331 was to eliminate duplicates in our
existing symbol sort sections. The new test for check_rtime
is intended to prevent new duplicates from being introduced.
If your code should trigger this check_rtime message, here is a list
of possible solutions. I've tried to list the better options first.
[1] If the code in question is C++, then you should modify
usr/src/tools/scripts/check_rtime.pl and put the affected ELF
file on the $SkipSymSort exclusion list. We don't currently worry
about sort section duplication in C++.
[2] In most contexts, the calling overhead for a utility
function is not significant. Rather than alias symbols,
you can move the code into a utility routine and write
the duplicate symbols as wrappers. For example:
    #pragma weak foo = bar
    int
    bar(arg)
    {
        ...
    }
might be re-written as (please forgive the violation
of cstyle, done here to keep things short):
    static int foo_bar_body(arg) { ... }
    int foo(arg) { return foo_bar_body(arg); }
    int bar(arg) { return foo_bar_body(arg); }
or even as:
    int foo(arg) { ... }
    int bar(arg) { return foo(arg); }
[3] If you have an aliased function for which calling overhead
is significant (i.e. it is small, and used frequently), then
it might be reasonable to make a duplicate copy of the code
body. Note that this is only for the shortest of very important
routines. This is a rare measure, not a common solution.
[4] You can tag the aliased names you wish to drop from the sort
sections with the NODYNSORT mapfile keyword. This approach
can be convenient, since it avoids code changes. NODYNSORT
is described in the Linkers and Libraries Manual. Examples of
its use can be found in the libc mapfiles.
For example, the 32-bit sparc libc contains the following
globally visible functions, implemented as aliased names
on a single function body:
    atomic_add_ptr_nv     atomic_add_long_nv    atomic_add_int
    _atomic_add_ptr_nv    atomic_add_int_nv     atomic_add_ptr
    _atomic_add_32        _atomic_add_int_nv    _atomic_add_ptr
    atomic_add_long       atomic_add_32         _atomic_add_32_nv
    _atomic_add_long_nv   atomic_add_32_nv      _atomic_add_int
    _atomic_add_long
If we allow these duplicates to appear in the symbol sort
section, then any one of these names might be found when we
map the address of the underlying function to a name. The
essential operation being represented by all of these names
is 32-bit addition. So, I used NODYNSORT mapfile directives
to exclude all of these names, except for atomic_add_32, from
the symbol sort sections.
[5] In the rare case where your code has to do symbol aliasing,
and the solutions detailed above are not acceptable, you can
modify usr/src/tools/scripts/check_rtime.pl and put your
ELF file on the $SkipSymSort exclusion list. This should almost
never be necessary. Currently, there is one such item (a DTrace
test that uses a weak symbol).
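For reference, here is a fully typed sketch of the wrapper approach
from option [2], with a small helper that confirms the two public
names now have distinct addresses. The function bodies are
placeholders and wrappers_are_distinct() is a hypothetical name used
only for this illustration.

```c
/* Shared implementation, not visible outside this file. */
static int
foo_bar_body(int arg)
{
    return (arg * 2);       /* placeholder body */
}

/* Each public name is a real function with its own address. */
int foo(int arg) { return (foo_bar_body(arg)); }
int bar(int arg) { return (foo_bar_body(arg)); }

/* Returns nonzero when foo and bar have different addresses. */
int
wrappers_are_distinct(void)
{
    /* volatile keeps the compiler from folding the comparison. */
    int (*volatile pf)(int) = foo;
    int (*volatile pb)(int) = bar;

    return (pf != pb);
}
```

Since each name has its own address, neither shows up as a duplicate
in the sort section, at the cost of one extra call per invocation.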
- Ali