OpenSolaris

Heads up - New ELF check in nightly check_rtime tests

Date: Tue, 22 May 2007 12:10:12 -0600
From: Ali Bahrami <Ali.Bahrami at Sun dot COM>
To: onnv-gate at onnv dot eng dot sun dot com
Subject: Heads up - New ELF check in nightly check_rtime tests

My putback of

     6518331 Eliminate duplicate addresses from ON ELF symbol sort sections

adds a new check to the check_rtime script
(usr/src/tools/scripts/check_rtime.pl), which is run when nightly
builds are run with the -r flag.

A typical message from this check in nightly build output looks like:

     ./lib/amd64/libc.so.1: .SUNW_dynsymsort: duplicate 0x00000000000de9c0:
         membar_enter, membar_exit

This message says that the two functions (membar_enter() and membar_exit())
are really a single function that is known by two names. This represents
an ambiguity for observability tools (DTrace, debuggers) that attempt to
map addresses to names, since either name is valid. As a result, our
tools may end up presenting confusing or suboptimal results to the user.
The change to check_rtime is intended to prevent this situation from
occurring in ON.

You may safely ignore the rest of this heads up message if you don't do
any of the following things:

     - Use C++
     - Use Assembler
     - Use the C "#pragma weak" directive
     - Work on libc

Please read on if you do any of the above, are a gatekeeper,
or are generally interested in low level ELF details.
----------------------------------------------------------------------------

ELF symbol sort sections were added to Solaris with

     PSARC 2007/026 ELF symbol sort sections
     6475344 DTrace needs ELF function and data symbols sorted by address

The underlying idea is that our ELF files need to have a section that gives
access to the symbols in the dynamic symbol tables, sorted in order of
increasing address, and omitting those symbols that do not reference a
variable or function. DTrace will (in time) use these sections to map
addresses to function or variable names, in O(log n) time, from within
the kernel.
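
As a rough illustration of why an address-sorted view makes the lookup
O(log n), here is a minimal binary search sketch. It is not DTrace's code.
The sorted_sym_t type and the addr_to_name() function are hypothetical
names invented for this example, and the sketch assumes each entry has
already been flattened into an (address, size, name) triple:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flattened view of one symbol sort entry. */
typedef struct {
	uintptr_t	addr;	/* symbol start address */
	size_t		size;	/* symbol size in bytes */
	const char	*name;	/* symbol name */
} sorted_sym_t;

/*
 * Return the name of the symbol whose [addr, addr + size) range
 * contains pc, or NULL if there is none. syms[] must be sorted by
 * increasing addr.
 */
const char *
addr_to_name(const sorted_sym_t *syms, size_t nsyms, uintptr_t pc)
{
	size_t lo = 0, hi = nsyms;

	/* Find the first entry whose start address is greater than pc. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (syms[mid].addr <= pc)
			lo = mid + 1;
		else
			hi = mid;
	}

	/* The candidate is the entry just before that one, if any. */
	if (lo == 0)
		return (NULL);
	if (pc - syms[lo - 1].addr < syms[lo - 1].size)
		return (syms[lo - 1].name);
	return (NULL);
}
```

Note that if two entries share an address, this sketch simply returns
whichever of them happens to sort last.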

ELF allows more than one symbol to reference a given address, and as
such, it is possible for a given address to be represented by more than
one entry in a symbol sort section. In principle, duplicates are harmless,
because every symbol represents a valid name for the address it
references. However:

     - The names displayed by DTrace and other utilities should be the
       ones that users would expect to see (e.g. foo()), rather than
       an internal name it happens to be aliased to for implementation
       reasons (e.g. _private_foox()).
     - The same name should be used consistently, even when the libraries
       are rebuilt and the number of symbols changes, and if possible
       across releases.
     - The duplication is a waste of space.

The overall goal is for our observability tools to show users a
consistent set of public names, when possible.

I know of three ways in which duplicate, or alias, symbols can
occur:

         - In assembly language, it is easy to give a given block
           of code more than one name. The main use for this feature
           seems to be to eliminate duplicate code without introducing
           additional calling overhead.
         - In C, the "#pragma weak" directive can be used to create
           symbols that alias a given symbol. (Solaris uses this
           feature heavily via synonyms.h. Note that we hope to see
           much of this go away in favor of the use of direct bindings).
         - C++ compilers use symbol aliasing to implement inheritance
           under some circumstances.

Most of the work for 6518331 was to eliminate duplicates in our
existing symbol sort sections. The new test for check_rtime
is intended to prevent new duplicates from being introduced.
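
The real check lives in the Perl script, but the idea can be sketched in
a few lines of C. This is only an illustration under my own assumptions:
the sort_ent_t type and count_dup_addrs() function are hypothetical, and
the input is assumed to be already sorted by address, as it is in a
symbol sort section:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical (address, name) pair taken from a symbol sort section. */
typedef struct {
	uintptr_t	addr;
	const char	*name;
} sort_ent_t;

/*
 * Count the distinct addresses that are referenced by more than one
 * entry. ents[] must be sorted by increasing addr, so entries that
 * share an address are adjacent.
 */
size_t
count_dup_addrs(const sort_ent_t *ents, size_t n)
{
	size_t i = 0, ndup = 0;

	while (i < n) {
		size_t j = i + 1;

		/* Skip over the run of entries sharing ents[i].addr. */
		while (j < n && ents[j].addr == ents[i].addr)
			j++;
		if (j - i > 1)
			ndup++;
		i = j;
	}
	return (ndup);
}
```

Because the entries are sorted, one linear pass is enough to find every
aliased address.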

If your code should trigger this check_rtime message, here is a list
of possible solutions. I've tried to list the better options first.

         [1] If the code in question is C++, then you should modify
             usr/src/tools/scripts/check_rtime.pl and put it on the
             $SkipSymSort exclusion list. We don't currently worry
             about sort section duplication in C++.

         [2] In most contexts, the calling overhead for a utility
             function is not significant. Rather than alias symbols,
             you can move the code into a utility routine and write
             the duplicate symbols as wrappers. For example:

                 #pragma weak foo = bar
                 int
                 bar(arg)
                 {
                         ...
                 }

             might be re-written as (please forgive the violation
             of cstyle, done here to keep things short):

                 static int foo_bar_body(arg) { ... }
                 int foo(arg) { return foo_bar_body(arg); }
                 int bar(arg) { return foo_bar_body(arg); }

             or even as:

                 int foo(arg) { ... }
                 int bar(arg) { return foo(arg); }

         [3] If you have an aliased function for which calling overhead
             is significant (i.e. it is small, and used frequently), then
             it might be reasonable to make a duplicate copy of the code
             body. Note that this is only for the shortest of very important
             routines. This is a rare measure, not a common solution.

         [4] You can tag the aliased names you wish to drop from the sort
             sections with the NODYNSORT mapfile keyword. This approach
             can be convenient, since it avoids code changes. NODYNSORT
             is described in the Linkers and Libraries Manual. Examples of
             its use can be found in the libc mapfiles.

             For example, the 32-bit sparc libc contains the following
             globally visible functions, implemented as aliased names
             on a single function body:

                 atomic_add_32        atomic_add_32_nv     atomic_add_int
                 atomic_add_int_nv    atomic_add_long      atomic_add_long_nv
                 atomic_add_ptr       atomic_add_ptr_nv    _atomic_add_32
                 _atomic_add_32_nv    _atomic_add_int      _atomic_add_int_nv
                 _atomic_add_long     _atomic_add_long_nv  _atomic_add_ptr
                 _atomic_add_ptr_nv

             If we allow these duplicates to appear in the symbol sort
             section, then any one of these names might be found when we
             map the address of the underlying function to a name. The
             essential operation being represented by all of these names
             is 32-bit addition. So, I used NODYNSORT mapfile directives
             to exclude all of these names, except for atomic_add_32, from
             the symbol sort sections.

         [5] In the rare case where your code has to do symbol aliasing,
             and the solutions detailed above are not acceptable, you can
             modify usr/src/tools/scripts/check_rtime.pl and put your
             ELF file on the $SkipSymSort exclusion list. This should almost
             never be necessary. Currently, there is one such item (a DTrace
             test that uses a weak symbol).
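
For anyone who wants a version of the wrapper approach from [2] that can
be compiled as it stands, here is a small sketch. The int argument type
and the doubling body are placeholders I made up for illustration. Only
the shape (one shared body, two thin public wrappers) matters:

```c
/* Shared implementation. The arithmetic is only a placeholder. */
static int
foo_bar_body(int arg)
{
	return (arg * 2);
}

/*
 * foo() and bar() are separate functions rather than aliases, so each
 * public name has its own address.
 */
int
foo(int arg)
{
	return (foo_bar_body(arg));
}

int
bar(int arg)
{
	return (foo_bar_body(arg));
}
```

The cost is one extra call (or whatever remains after the compiler
inlines the static body), which is rarely significant.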

- Ali