Replies: 4 - Last Post: Nov 20, 2006 10:58 AM by: rie
shared library symbols at address 0x00000000
Posted: Nov 17, 2006 8:18 AM
Hi all,
[ please CC me on replies ]
I fix bugs in Nexenta part time, and for the second time I have hit a bug where a library libA.so has been linked against another shared library libB.so and some symbols were incorrectly resolved to absolute address 0x0. Note that I'm talking about symbols representing regular functions like pthread_create, dlopen, ...
Some recent examples of this bug can be found in the bug reports:
http://www.gnusolaris.org/cgi-bin/trac.cgi/ticket/409
http://www.gnusolaris.org/cgi-bin/trac.cgi/ticket/347
This is most probably due to a bug in GNU ld which, given a certain wrong set of command-line switches, resolves the symbols incorrectly. The application then crashes when the resolved symbol is first used (a plain old segfault on jumping to address 0x0).
While investigating this, I saw that on Solaris some symbols in some libraries are deliberately put at address 0x0, and since this happens in libraries like libc.so and libpthread.so, I don't believe it is a bug.
I'm just curious why this happens, what these symbols mean, and what they are used for. It seems that GNU ld is picking them up in situations where it shouldn't, and I would like to build a test case in which ld reliably exhibits this bug.
thanx, Martin
P.S. an excerpt of
  $ nm -D libc.so | grep '00000000 A'
  ...
  00000000 A dladdr
  00000000 A dladdr1
  00000000 A dlclose
  00000000 A dldump
  00000000 A dlerror
  00000000 A dlinfo
  00000000 A dlmopen
  00000000 A dlopen
  00000000 A dlsym
  00000000 A frexp
  00000000 A isnan
  00000000 A isnand
  00000000 A isnanf
  00000000 A ldexp
  00000000 A logb
  00000000 A modf
  00000000 A modff
  ...
--
http://martinman.net
_______________________________________________
opensolaris-discuss mailing list
opensolaris-discuss at opensolaris dot org
Re: [osol-code] shared library symbols at address 0x00000000
Posted: Nov 17, 2006 9:13 AM
in response to: mmman
>I'm part time fixing some bugs in Nexenta, and I have for a second time
>hit the bug, where library libA.so has been linked against some other
>shared library libB.so and some symbols were incorrectly resolved to be
>at absolute address 0x0. Note that I'm talking about symbols
>representing regular functions like pthread_create, dlopen, ...

>While investigating this, I could see that on Solaris, some symbols in
>some libraries are deliberately put at the address 0x0, and since this
>happens in libraries like libc.so, libpthread.so, I don't believe it is
>a bug.
Correct.
>I'm just curious why this happens, what these symbols mean, and what are
>they used for. Seems that GNU ld is picking them up in situations where
>it shouldn't be, and I would like to reproduce a test case where ld can
>deliberately exhibit this bug.
These symbols are "filter" symbols; they live in different libraries.
libc is a "filter" of libdl.so; this means it also exports a view on the symbols found in libdl.so.
Casper
Re: shared library symbols at address 0x00000000
Posted: Nov 17, 2006 9:22 AM
in response to: mmman
Martin Man wrote:
> I'm just curious why this happens, what these symbols mean, and what are
> they used for. Seems that GNU ld is picking them up in situations where
> it shouldn't be, and I would like to reproduce a test case where ld can
> deliberately exhibit this bug.
> ...
> P.S. an excerpt of
> $ nm -D libc.so | grep '00000000 A'
> ...
> 00000000 A dladdr
> 00000000 A dladdr1
These are filters. For a long time we have produced shared objects that act as filters on other shared objects. The whole object has acted as a filter:
  oxpoly 490. elfdump -d /usr/lib/libdl.so.1 | fgrep FILTER
    [1] FILTER 0xe6 /usr/lib/ld.so.1
  oxpoly 491. elfdump -d /usr/lib/libsys.so.1 | fgrep FILTER
    [1] FILTER 0x758 /usr/lib/libc.so.1
In Solaris 10, we added per-symbol filtering, a mechanism where individual symbols can be identified as filters. In fact this was backported to Solaris 9 9/04. See:
http://docs.sun.com/app/docs/doc/817-1984/6mhm7pl1q?a=view
The filtering is triggered because we maintain an auxiliary array of information for the symbol table: the .SUNW_syminfo section, of type SHT_SUNW_syminfo. You can dump this with:
  oxpoly 493. elfdump -y /lib/libc.so.1 | grep dladdr
    [1176] F [1] /usr/lib/ld.so.1 dladdr1
    [1291] F [1] /usr/lib/ld.so.1 dladdr
    [1641] F [1] /usr/lib/ld.so.1 _dladdr1
    [1913] F [1] /usr/lib/ld.so.1 _dladdr
           ^
           SYMINFO_FLG_FILTER
When ld(1) resolves an object to a filter symbol, it simply creates the appropriate reference. For function references, this would be the creation of a procedure linkage table entry, .plt:
  oxpoly 498. elfdump -sN.dynsym /lib/libc.so.1 | grep dlopen
    [2328] 0x00000000 0x00000000 FUNC GLOB D 5 ABS dlopen
  oxpoly 499. cc -o main main.c
  oxpoly 500. elfdump -r main | fgrep dlopen
    R_SPARC_JMP_SLOT 0x20ca4 0 .rela.plt dlopen
When the runtime linker binds the process, it redirects the binding to the filtee. In this case, the call is resolved to ld.so.1 itself. Because of this redirection, there is no need for any code to back the filter symbol definition - hence it is defined as ABS.
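As a rough illustration of the dlopen entry above, the following C sketch flags dynamic-symbol entries that have the same shape: a FUNC symbol whose index is ABS, with value 0 and size 0. The struct dynsym_entry and the function looks_like_filter_placeholder are hypothetical names made up for this example; only the constants STT_FUNC (2) and SHN_ABS (0xfff1) come from the ELF specification. The check is a heuristic and nothing more: the authoritative marker is the SYMINFO_FLG_FILTER flag in .SUNW_syminfo, which this sketch does not read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Constants from the ELF specification. */
#define EX_STT_FUNC 2u       /* symbol type: function */
#define EX_SHN_ABS  0xfff1u  /* section index: absolute value */

/* Hypothetical, simplified view of one .dynsym entry
 * (assumption: this is not the real Elf32_Sym layout). */
struct dynsym_entry {
    uint32_t value;  /* st_value */
    uint32_t size;   /* st_size */
    uint8_t  type;   /* ELF32_ST_TYPE(st_info) */
    uint16_t shndx;  /* st_shndx */
};

/* Heuristic only: true for entries shaped like the dlopen line above,
 * i.e. a function with no backing code in the defining object. */
bool looks_like_filter_placeholder(const struct dynsym_entry *s)
{
    assert(s != NULL);
    return s->type == EX_STT_FUNC && s->shndx == EX_SHN_ABS
        && s->value == 0 && s->size == 0;
}
```

Fed the values from the dlopen line (value 0, size 0, FUNC, ABS) the function returns true; fed an entry that lives in .text with a non-zero address and size it returns false.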
Another form of filtering is auxiliary filtering. This redirects the binding at runtime if a "better" implementation exists, and otherwise falls back to the original library:
  oxpoly 503. elfdump -y /lib/libc.so.1 | grep memcpy
    [93] A [2] /platform/$PLATFORM/lib/libc_psr.so.1 memcpy
As this function has backing code, the symbol defines the associated code:
  oxpoly 504. elfdump -sN.dynsym /lib/libc.so.1 | fgrep memcpy
    [93] 0x0003fed0 0x000001b0 FUNC WEAK D 38 .text memcpy
    [583] 0x0003fed0 0x000001b0 FUNC GLOB D 36 .text _memcpy
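The difference between the two forms of filtering described above can be sketched as a toy model in C. Everything here is an assumption made for illustration: the types toy_object and filter_kind and the functions toy_defines and toy_bind are invented names, the enum values are not the real syminfo flag values, and the real runtime linker does far more than this. The sketch only shows the decision: a standard filter symbol can be satisfied only by the filtee, while an auxiliary filter symbol prefers the filtee and falls back to the filter's own code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical filter kinds for this sketch (not real syminfo flag values). */
enum filter_kind { FK_NONE, FK_STANDARD, FK_AUXILIARY };

/* Hypothetical toy object: a name plus the symbols it backs with code. */
struct toy_object {
    const char *name;
    const char *const *defined;  /* NULL-terminated list of symbol names */
};

/* Returns 1 if obj exists and backs sym with code, 0 otherwise. */
int toy_defines(const struct toy_object *obj, const char *sym)
{
    if (obj == NULL)
        return 0;
    for (size_t i = 0; obj->defined[i] != NULL; i++)
        if (strcmp(obj->defined[i], sym) == 0)
            return 1;
    return 0;
}

/*
 * Picks the object a reference to sym binds to at run time.
 *   filter: the object the reference was linked against
 *   filtee: the redirection target, or NULL if it is not available
 * Returns NULL when nothing can satisfy the reference.
 */
const struct toy_object *toy_bind(const struct toy_object *filter,
                                  const struct toy_object *filtee,
                                  enum filter_kind kind, const char *sym)
{
    assert(filter != NULL && sym != NULL);
    switch (kind) {
    case FK_STANDARD:
        /* No backing code in the filter: only the filtee can satisfy it. */
        return toy_defines(filtee, sym) ? filtee : NULL;
    case FK_AUXILIARY:
        /* Prefer the filtee, otherwise use the filter's own code. */
        if (toy_defines(filtee, sym))
            return filtee;
        /* fall through */
    case FK_NONE:
    default:
        return toy_defines(filter, sym) ? filter : NULL;
    }
}
```

In this model, a dlopen reference against libc with kind FK_STANDARD binds to the ld.so.1 object, and a memcpy reference with kind FK_AUXILIARY binds to libc_psr.so.1 when that object is present and to libc itself when it is not.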
With the introduction of per-symbol filters, we were able to simplify and refine many object filtering mechanisms. For example, the dl* family could be defined in libc (you don't need to link with -ldl anymore).
Send me mail if you need more information.
--
Rod.
Re: shared library symbols at address 0x00000000
Posted: Nov 20, 2006 2:11 AM
in response to: rie
Hi Rod,
Thank you for a very detailed explanation.
As I understand it now, on Linux systems the same effect can be achieved by linking a library libA.so against another library libB.so. The resulting binary then needs to be linked only against libA.so and will automatically resolve symbols from libB.so.
On Solaris, the symbols linked to libA.so from libB.so are moreover marked as filter symbols.
I now have to investigate how well GNU ld(1) and GNU nm(1) support filter symbols, and eventually fix them. It seems that neither of them is aware of filter symbols or the concepts behind them.
Has any of you hit the same bug or seen similar problems?
thanx, Martin
-- http://martinman.net
Re: shared library symbols at address 0x00000000
Posted: Nov 20, 2006 10:58 AM
in response to: mmman
Martin Man wrote:
> As I understand it now, on Linux systems, the same effect can be
> achieved by linking a library libA.so against another library libB.so,
> in which case, the resulting binary can be linked only against libA.so
> and will automatically resolve symbols from libB.so.
This is different. Filters are an abstraction in which a dependency established at link-edit time is redirected to an alternative implementation at runtime.
The scenario you have outlined is a trick played with dependencies, which the Solaris ld(1) will frown upon :-). If a binary requires interfaces within libB.so, it should have its own dependency on libB.so. Assuming some other object will make libB.so appear in the address space is risky.
> On Solaris, the symbols linked to libA.so from libB.so are moreover
> marked as filter symbols.
The focal point seems to be our interpretation of ABS. On Solaris, an ABS symbol index defines how the symbol should be interpreted *within* the object that contains the symbol. A binary that references this symbol should establish its own reference model based on the symbol's type: FUNC (the referring object creates a .plt entry), DATA (the referring object creates a GOT reference), etc.
It looks like the GNU linker is propagating the destination symbol index (ABS) to the referring object.
At runtime, the referring object should bind to the definition as normal. What the defining implementation does (act as a filter, dlopen() something, or define an absolute offset) is up to the implementation - and can change from one runtime environment to another.
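A minimal sketch in C of the rule described above, under stated assumptions: the names ref_model, shared_def and choose_ref_model are hypothetical and are not taken from either linker's source. The constants STT_OBJECT (1), STT_FUNC (2) and SHN_ABS (0xfff1) are the ELF specification's values. The point of the sketch is that the referring object picks its reference model from the symbol's type and never looks at the index recorded in the defining object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Constants from the ELF specification. */
#define EX_STT_OBJECT 1u       /* symbol type: data object */
#define EX_STT_FUNC   2u       /* symbol type: function */
#define EX_SHN_ABS    0xfff1u  /* section index: absolute value */

/* Hypothetical names for the kinds of reference a linker could emit. */
enum ref_model { REF_PLT, REF_GOT, REF_OTHER };

/* Hypothetical, simplified view of a definition found in a shared object. */
struct shared_def {
    uint8_t  type;   /* ELF32_ST_TYPE(st_info) */
    uint16_t shndx;  /* st_shndx inside the defining object */
};

/*
 * Chooses the reference model from the symbol type alone.
 * def->shndx is deliberately ignored: per the description above it is
 * meaningful only within the object that defines the symbol.
 */
enum ref_model choose_ref_model(const struct shared_def *def)
{
    assert(def != NULL);
    switch (def->type) {
    case EX_STT_FUNC:
        return REF_PLT;    /* function: procedure linkage table entry */
    case EX_STT_OBJECT:
        return REF_GOT;    /* data: global offset table reference */
    default:
        return REF_OTHER;
    }
}
```

With this rule a FUNC definition marked ABS and a FUNC definition in .text both yield REF_PLT, so the ABS marking never reaches the referring object.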
--
Rod.