Replies: 12 - Last Post: Jun 23, 2007 9:22 PM by: rie
Why the __STATIC_CONSTRUCTOR() calls are not in proper order?
Posted: Jun 19, 2007 11:41 PM
To: Communities » tools » linking
Cc: Communities » i18n » discuss
Cc: Communities » tools » discuss
I ran into a strange problem recently: the __STATIC_CONSTRUCTOR() calls are not made in the proper order.
I have one shared object (im-scim.so) that depends on another shared library (libscim-1.0.so). In im-scim.so there is a static object instance whose constructor calls a global function in libscim-1.0.so, and that global function relies on another static object instance. I wrote a small test program that just uses dlopen() to load im-scim.so.
From the debugger, I can see that the static constructor in im-scim.so is called before the static constructor in libscim-1.0.so. As a result, the test program dumps core.
I want to know how the static constructor chain is handled. Could this be a bug in libCrun or ld.so?
P.S. With snv_62 the shared objects work fine, but they fail on snv_66.
Here is the backtrace:
[4] scim::__initialize_config(), line 131 in "scim_global_config.cpp"
[5] scim::scim_global_config_read(key = CLASS, defVal = 5000), line 183 in "scim_global_config.cpp"
[6] scim::scim_get_default_socket_timeout(), line 1214 in "scim_socket.cpp"
[7] scim::PanelClient::PanelClientImpl::PanelClientImpl(this = 0x8085318), line 86 in "scim_panel_client.cpp"
[8] scim::PanelClient::PanelClient(this = 0xfbfe9510), line 566 in "scim_panel_client.cpp"
[9] __SLIP.INIT_I(0xfbfe11e8, 0xfbfc33f9, 0x8046e8c, 0xfbfcd807, 0xfeffa7d0, 0xfe660438), at 0xfbfb035f
[10] __STATIC_CONSTRUCTOR(), line 301 in "gtkimcontextscim.cpp"
[11] __cplus_fini_at_exit(0xfbf93760, 0xfe660438, 0xfeffa7d0, 0xfbc10df8, 0xfefd05dc, 0xfbc10df8), at 0xfbfcd807
[12] call_init(0xfbc10df8, 0x3), at 0xfefd380f
[13] is_dep_init(0xfe660438, 0xfbf80490), at 0xfefd357c
[14] elf_bndr(0xfbf80490, 0x2d8, 0xfbe6ccbf), at 0xfefde53c
[15] elf_rtbndr(0x2d8, 0xfbe6ccbf, 0xfbfe9594, 0x0, 0xfbf2f200, 0xfbe6cc79), at 0xfefc8c24
[16] 0xfbf80490(0xfbf2f200, 0xfbe6cda9, 0x8046fcc, 0xfbf0890b, 0xfeffa7d0, 0xfbf80490), at 0xfbf80490
[17] __STATIC_CONSTRUCTOR(), line 42 in "string.cc"
[18] __cplus_fini_at_exit(0xfeffa2d8, 0xfe660438, 0xfeffa7d0, 0xc, 0x8047024, 0xfefd7d91), at 0xfbf0890b
[19] call_init(0xfbc10db0, 0x1), at 0xfefd380f
[20] load_completion(0xfe660438, 0xfec425b0), at 0xfefd3dfa
[21] dlmopen_intn(0xfeffa2d8, 0x8066c40, 0xd01, 0xfec425b0, 0x0, 0x0, 0x80470d0), at 0xfefd7ff0
[22] dlmopen_check(0xfeffa2d8, 0x8066c40, 0xd01, 0xfec425b0, 0x80470d0), at 0xfefd80e0
[23] _dlopen(0x8066c40, 0x101), at 0xfefd81a1
[24] _g_module_open(0x8066c40, 0x1, 0x0), at 0xfc59118a
[25] g_module_open(0x8065008, 0x0), at 0xfc59175c
[26] query_module(0x8064308, 0xfee84042), at 0x805137a
[27] main(0x1, 0x80471f0, 0x80471f8), at 0x8051651
Re: Why the __STATIC_CONSTRUCTOR() calls are not in proper order?
Posted: Jun 20, 2007 8:58 AM
in response to: yongsun
Perhaps you have a cyclic dependency.
This trace shows that ld.so.1 has kicked off a .init [19] as part of the dlopen [23]. From exercising the __STATIC_CONSTRUCTOR [17], ld.so.1 has been asked to bind to another function [15], which is presumably in an object whose .init hasn't fired yet. Before the binding is established, ld.so.1 calls the defining object's .init [12].
If you turned on the runtime linker's debugging, I'd expect you'd see something like:
09086: calling .init (dynamically triggered): ./libXXXX.so.1
For a description of the complexities of init/fini processing, see:
http://blogs.sun.com/rie/entry/init_and_fini_processing_who
--
Rod. _______________________________________________ tools-linking mailing list tools-linking at opensolaris dot org
Re: Why the __STATIC_CONSTRUCTOR() calls are not in proper order?
Posted: Jun 20, 2007 7:03 PM
in response to: rie
Hi, Rod,
Thank you so much! Yes, you are right, there is a "dynamically triggered" entry when the .init sections are called:
$ LD_DEBUG=init,detail,demangle ./test
02473: 1: calling .init (from sorted order): /usr/lib/libscim-1.0.so.8
02473: 1: calling .init (dynamically triggered): /usr/lib/gtk-2.0/immodules/im-scim.so
And it seems that the "dynamically triggered" call is caused by resolving the __null_string_ref_rep symbol:
$ LD_DEBUG=init,symbols,bindings,demangle ./test
25357: 1: calling .init (from sorted order): /usr/lib/libscim-1.0.so.8
25357: 1:
25357: 1: symbol=__rwstd::__null_string_ref_rep<unsigned,std::char_traits<unsigned>,std::allocator<unsigned>,__rwstd::__string_ref_rep<std::allocator<unsigned> > >::__null_string_ref_rep(); lookup in file=test [ ELF ]
25357: 1: symbol=__rwstd::__null_string_ref_rep<unsigned,std::char_traits<unsigned>,std::allocator<unsigned>,__rwstd::__string_ref_rep<std::allocator<unsigned> > >::__null_string_ref_rep(); lookup in file=/lib/libc.so.1 [ ELF ]
25357: 1: symbol=__rwstd::__null_string_ref_rep<unsigned,std::char_traits<unsigned>,std::allocator<unsigned>,__rwstd::__string_ref_rep<std::allocator<unsigned> > >::__null_string_ref_rep(); lookup in file=/usr/lib/gtk-2.0/immodules/im-scim.so [ ELF ]
25357: 1: binding file=/usr/lib/libscim-1.0.so.8 to file=/usr/lib/gtk-2.0/immodules/im-scim.so: symbol `__rwstd::__null_string_ref_rep<unsigned,std::char_traits<unsigned>,std::allocator<unsigned>,__rwstd::__string_ref_rep<std::allocator<unsigned> > >::__null_string_ref_rep()'
25357: 1:
25357: 1: calling .init (dynamically triggered): /usr/lib/gtk-2.0/immodules/im-scim.so
25320: 1: calling .init (from sorted order): /usr/lib/libscim-1.0.so.8
25320: 1: calling .init (dynamically triggered): /usr/lib/gtk-2.0/immodules/im-scim.so
And from the nm output for libscim-1.0.so, we can tell that the symbol is also available in libscim-1.0.so:
$ nm /usr/lib/libscim-1.0.so | c++filt | grep __null_string_ref_rep
[2593] | 820544| 84|FUNC |GLOB |0 |14 |__rwstd::__null_string_ref_rep<unsigned,std::char_traits<unsigned>,std::allocator<unsigned>,__rwstd::__string_ref_rep<std::allocator<unsigned> > >::__null_string_ref_rep()
So, is the lookup and binding order incorrect?
Regards,
Re: Why the __STATIC_CONSTRUCTOR() calls are not in proper order?
Posted: Jun 21, 2007 1:12 AM
in response to: yongsun
Hi, Rod,
I finally isolated the problem to a set of very simple example C/C++ source files, attached. If you are interested, you can extract the tar file and run gmake. You will then see that test dumps core, while test2 succeeds.
Regards,
Re: Why the __STATIC_CONSTRUCTOR() calls are not in proper order?
Posted: Jun 21, 2007 5:45 PM
in response to: yongsun
Yong Sun wrote:
> Hi, Rod,
>
> I finally isolated the problem to very simple example C/C++ source
> files, as attached. If you have interests, you could extract the tar
> file, and run gmake. Then you could see, test would core, while test2
> would succeed.
Someone from C++ land is going to have to unravel this.
There are multiple instances of the same symbol in different libraries, cyclic dependencies, and .init code that jumps all over the place.
Set LD_DEBUG=init and we start seeing:
04973: 1: calling .init (from sorted order): /usr/lib/libCrun.so.1
04973: 1:
04973: 1: calling .init (dynamically triggered): /usr/lib/libCstd.so.1
04973: 1:
04973: 1: warning: calling /usr/lib/libCrun.so.1 whose init has not completed
04973: 1:
04973: 1: calling .init (dynamically triggered): ./libtest.so
04973: 1:
04973: 1: calling .init (dynamically triggered): /home/rie/dltest/libbase.so
04973: 1:
04973: 1: warning: calling ./libtest.so whose init has not completed
04973: 1:
04973: 1: warning: calling ./libtest.so whose init has not completed
04973: 1:
04973: 1: warning: calling /usr/lib/libCrun.so.1 whose init has not completed
04973: 1:
04973: 1: warning: calling /usr/lib/libCrun.so.1 whose init has not completed
Now I can't tell you whether these indicate problems or not, as there is no way to determine whether a reference to another object requires that object to have completed its .init for the reference to be valid. Meaning, if data is updated by a .init, and that data is referenced before the .init has completed, are you in trouble?
If you expand a little with init,bindings:
04948: 1: calling .init (from sorted order): /usr/lib/libCrun.so.1
......
04948: 1: binding file=/usr/lib/libCrun.so.1 to file=/usr/lib/libCstd.so.1: \
    symbol `__SUNW_force_load_of_inits'
04948: 1:
04948: 1: calling .init (dynamically triggered): /usr/lib/libCstd.so.1
Hmmm, __SUNW_force_load_of_inits - that looks scary.
04948: 1: binding file=/usr/lib/libCstd.so.1 to file=/usr/lib/libCrun.so.1: \
    symbol `__1cG__CrunSregister_exit_code6FpG_v_v_'
04948: 1:
04948: 1: warning: calling /usr/lib/libCrun.so.1 whose init has not completed
So, you have libCrun's .init calling libCstd.so.1, and libCstd's .init calling libCrun.so.1.
The scenario continues through your other objects.
The runtime linker is simply jumping from object to object as directed, and trying frantically to fire .init's before an object is called. When cyclic dependencies exist, you can't programmatically determine a "correct" order, so the dynamic firing attempts to compensate - and from experience we know that without this "compensation" a whole mess of applications would already be falling over.
I'll stick by my concluding remarks from
http://blogs.sun.com/rie/entry/init_and_fini_processing_who
and let's see if someone from C++ can enlighten us some more.
--
Rod.
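The firing behavior described in this message can be modelled in a few lines of C++. This is only a toy sketch, not ld.so.1's implementation: Object, fire_init, bind_call and trace_log are hypothetical names, and the log strings simply imitate the LD_DEBUG output quoted above.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical model of one loaded object and the state of its .init section.
enum class InitState { NotStarted, Running, Done };

struct Object {
    std::string name;
    InitState state = InitState::NotStarted;
    std::function<void()> init;  // stands in for the object's .init code
};

// Collected messages, shaped like the LD_DEBUG=init lines in the thread.
std::vector<std::string> trace_log;

// Run an object's .init and record why it was fired.
void fire_init(Object& obj, const char* reason) {
    trace_log.push_back("calling .init (" + std::string(reason) + "): " + obj.name);
    obj.state = InitState::Running;
    if (obj.init) obj.init();
    obj.state = InitState::Done;
}

// Model of binding a call into obj: fire its .init first if that has not
// started yet, and warn if the .init is still in progress (a cycle).
void bind_call(Object& obj) {
    if (obj.state == InitState::NotStarted) {
        fire_init(obj, "dynamically triggered");
    } else if (obj.state == InitState::Running) {
        trace_log.push_back("warning: calling " + obj.name +
                            " whose init has not completed");
    }
}
```

If two objects call into each other from their init code, firing the first one "from sorted order" produces a dynamically triggered .init for the second, followed by a warning when the second calls back into the first.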
Re: Why the __STATIC_CONSTRUCTOR() calls are not in proper order?
Posted: Jun 21, 2007 6:29 PM
in response to: rie
Hi, Rod,
Thanks. If I dlopen with RTLD_NOW, or build libbase.so with -Bdirect, the problem does not happen. I also wrote up this problem on my blog: http://blogs.sun.com/yongsun/entry/static_constructors_and_symbol_looking
Regards,
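For readers following along, the binding mode is the second argument to dlopen(). Below is a minimal sketch of a loader that takes the mode as a parameter; can_load is a hypothetical helper written for this note, not part of the attached test case, and the path is whatever the caller passes in.

```cpp
#include <dlfcn.h>

#include <cstdio>

// Try to load `path` with the given binding mode (RTLD_LAZY or RTLD_NOW).
// Passing nullptr asks for a handle to the main program itself.
// Returns true if the object loaded; otherwise prints dlerror() and returns false.
bool can_load(const char* path, int mode) {
    void* handle = dlopen(path, mode);
    if (handle == nullptr) {
        const char* why = dlerror();
        std::fprintf(stderr, "dlopen(%s) failed: %s\n",
                     path ? path : "(self)", why ? why : "unknown error");
        return false;
    }
    dlclose(handle);
    return true;
}
```

Calling can_load("./libtest.so", RTLD_LAZY) and then can_load("./libtest.so", RTLD_NOW) in separate runs would exercise the two modes discussed here. On some systems the program has to be linked with -ldl.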
Re: Why the __STATIC_CONSTRUCTOR() calls are not in proper order?
Posted: Jun 22, 2007 9:07 AM
in response to: yongsun
Yong Sun wrote:
> Hi, Rod,
>
> Thanks, if I dlopen with RTLD_NOW, or build the libbase.so with
> -Bdirect, this problem would not happen.
I'd be careful of -Bdirect and C++. -Bdirect can result in different callers binding to different definitions of the same named symbol. This is very often useful, and what you want to achieve. But C++ is littered with implementation details that expect interposition to occur with some of their multiply defined symbols. -Bdirect can break this expectation.
RTLD_NOW results in ld.so.1 performing all relocations on the loaded objects - thus, the interdependencies of the function calls that have been bound are added to the mix to determine the topological sort of the .init sections. I suspect this sort has resulted in a different firing order than the sort that is determined when only the data symbol relocation bindings have been processed.
You might still be skating on thin ice ... cyclic dependencies between .init code should be avoided.
--
Rod.
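To make the "topological sort of the .init sections" concrete, here is a small sketch that orders objects so that each one is initialized after the objects it depends on, and reports failure when the dependencies are cyclic. It uses Kahn's algorithm as a stand-in; it is not claimed to be what ld.so.1 does, and DepGraph and init_order are hypothetical names.

```cpp
#include <map>
#include <optional>
#include <queue>
#include <string>
#include <vector>

// Maps each object to the objects it depends on (assumed input format).
using DepGraph = std::map<std::string, std::vector<std::string>>;

// Returns an order in which every object follows its dependencies,
// or std::nullopt if the graph has a cycle and no such order exists.
std::optional<std::vector<std::string>> init_order(const DepGraph& deps) {
    std::map<std::string, int> pending;  // uninitialized dependencies per object
    std::map<std::string, std::vector<std::string>> dependents;

    for (const auto& [obj, needs] : deps) {
        pending.emplace(obj, 0);
        for (const auto& dep : needs) {
            pending.emplace(dep, 0);
        }
    }
    for (const auto& [obj, needs] : deps) {
        for (const auto& dep : needs) {
            ++pending[obj];
            dependents[dep].push_back(obj);
        }
    }

    std::queue<std::string> ready;  // objects with nothing left to wait for
    for (const auto& [obj, count] : pending) {
        if (count == 0) ready.push(obj);
    }

    std::vector<std::string> order;
    while (!ready.empty()) {
        std::string obj = ready.front();
        ready.pop();
        order.push_back(obj);
        for (const auto& user : dependents[obj]) {
            if (--pending[user] == 0) ready.push(user);
        }
    }

    // Anything left over is part of a cycle (or depends on one).
    if (order.size() != pending.size()) return std::nullopt;
    return order;
}
```

With libtest.so depending on libbase.so the sketch yields libbase.so first. With two objects that depend on each other it yields nothing, which is the situation where a runtime linker has to fall back on something like dynamic firing.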
Re: Why the __STATIC_CONSTRUCTOR() calls are not in proper order?
Posted: Jun 23, 2007 6:55 PM
in response to: rie
Hi, Rod,
> I'd be careful of -Bdirect and C++. -Bdirect can result in different
> callers binding to different definitions of the same named symbol.
> This is very often useful, and what you want to achieve. But C++
> is littered with implementation details that expect interposition
> to occur with some of their multiply defined symbols. -Bdirect can
> break this expectation.
Could you give me an example of that? I'm very interested in that :)
Regards,
Re: Why the __STATIC_CONSTRUCTOR() calls are not in proper order?
Posted: Jun 23, 2007 9:22 PM
in response to: yongsun
Yong Sun wrote:
> Hi, Rod,
>
>> I'd be careful of -Bdirect and C++. -Bdirect can result in different
>> callers binding to different definitions of the same named symbol.
>> This is very often useful, and what you want to achieve. But C++
>> is littered with implementation details that expect interposition
>> to occur with some of their multiply defined symbols. -Bdirect can
>> break this expectation.
> Would you like to give me an example of that? I'm very interested in
> that :)
Well, Steve's last posting stated:
The std::string<T> class template depends on having only one __null_string_ref_rep<T> in the entire program.
So, I guess if two objects contained this symbol, and each object was built with -Bdirect, they would both bind to their own instance of the symbol. I guess this would be the same as using -Bsymbolic. C++ doesn't like either option.
-- Rod
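A toy model of why duplicate instances of such a symbol would hurt is sketched below. It assumes the string implementation recognizes its shared empty representation by address, which is a guess at the kind of dependency meant here and not a description of the real library code. Rep and Library are hypothetical names; each Library stands for one shared object that bound to its own copy of the symbol.

```cpp
// Hypothetical stand-in for a shared, reference-counted string representation.
struct Rep {
    int refs;
};

// Each "library" carries its own copy of the null representation, as would
// happen if every object bound directly to its own definition of the symbol.
struct Library {
    Rep null_rep{0};

    // Hand out this library's idea of "the" empty representation.
    const Rep* make_empty() const { return &null_rep; }

    // Identify the empty representation by address.
    bool is_null(const Rep* rep) const { return rep == &null_rep; }
};
```

An empty string created by one library passes that library's own is_null check but fails the check made by another library, because the two compare against different addresses. With a single interposed definition both would agree.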
Re: Why the __STATIC_CONSTRUCTOR() calls are not in proper order?
Posted: Jun 22, 2007 9:56 AM
in response to: rie
We have a deliberate, although possibly misguided, interaction between libCrun and libCstd initialization. To ensure that C++ iostreams are initialized soon enough, CCrti (program startup code for C++) calls a routine in libCrun that has a table of initialization routines that are done first. The only entry in the table currently is the one to initialize iostreams in libCstd. (It uses a weak symbol definition for the target, so if libCstd is not linked, nothing happens.)
If a user program has a proper dependency on libCstd, this bouncing between libCrun and libCstd, and their respective initializations, should finish before any user library starts its initialization. At least, that's how it seems to me.
Do we have a case where a user program has a dependency on libCstd, but still mixes its initialization with that of libCstd? If so, I'd like to know how that happened, and what we could do to ensure that doesn't happen.
We are now looking again at the iostream issue. I think we can arrange for libCrun to initialize itself without any reference to libCstd, and for libCstd to ensure that iostreams are initialized first.
If we make that change, programs that now use libCstd but have no dependency on it, might stop working. I consider such a situation to be user error (correct me if I am wrong), in which case the change doesn't bother me.
--- Steve Clamage, stephen dot clamage at sun dot com
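The "table of initialization routines with a weak target" idea can be sketched roughly as follows. This is not the libCrun source: hypothetical_iostream_init, early_inits and run_early_inits are made-up names, and the weak reference uses the GCC/Clang attribute syntax as an assumption about the toolchain.

```cpp
// Weak reference: if no linked object defines this function, its address
// is null instead of the link failing. (Hypothetical symbol name.)
extern "C" void hypothetical_iostream_init() __attribute__((weak));

using InitFn = void (*)();

// Table of routines to run before anything else; currently a single entry.
static InitFn const early_inits[] = {hypothetical_iostream_init};

// Call every entry that is actually present and return how many were called.
int run_early_inits() {
    int called = 0;
    for (InitFn fn : early_inits) {
        if (fn != nullptr) {
            fn();
            ++called;
        }
    }
    return called;
}
```

When nothing defines the target, the loop finds a null entry and does nothing, which matches the "if libCstd is not linked, nothing happens" behavior described above.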
Re: Why the __STATIC_CONSTRUCTOR() calls are not in proper order?
Posted: Jun 22, 2007 8:32 PM
in response to: clamage
Hi, Steve,
Thank you very much!
In my case (the dltest.tar), when the main program calls dlopen("libtest.so"), the runtime linker resolves the dependencies and then places libbase.so ahead of libtest.so in the initialization sequence (the order is reversed). And we all know that the static constructors of a shared library are run from its .init routine.
While it is initializing the static object "bar" in libbase.so, it encounters a reference to the symbol "xxx::__null_string_ref_rep<xxx>" (introduced by std::basic_string<unsigned int>) and looks the symbol up in the loaded objects. It first looks in the main program, then in libc.so, and then finds a match in libtest.so and stops searching (although libbase.so also defines this symbol). It then initializes libtest.so, which constructs the static object "foo". Unfortunately, the constructor of Foo calls an external function in libbase.so, and that function accesses the static instance "bar", which is not initialized yet (its "buf" is not allocated).
So the main program dumps core.
I am not sure, but maybe ld.so should first look up the symbol in the shared library that is currently being initialized, i.e., libbase.so?
Regards,
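
The failure described above comes down to one static object being used before its constructor has run. A common way to make code robust against that, independent of the order in which .init sections fire, is the construct-on-first-use idiom. The following is only a minimal single-file sketch: the names Bar, Foo, get_bar and base_lookup are hypothetical stand-ins for the objects in dltest.tar, not the actual test sources.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stand-in for the static object "bar" in libbase.so.
class Bar {
public:
    Bar() : buf_(new char[16]) { std::strcpy(buf_, "ready"); }
    ~Bar() { delete[] buf_; }
    const char* buf() const { return buf_; }
private:
    Bar(const Bar&);             // non-copyable (declared, not defined)
    Bar& operator=(const Bar&);
    char* buf_;
};

// Construct-on-first-use: the Bar instance is a function-local static,
// so it is constructed the first time get_bar() is called, no matter
// which library's initialization happens to run first.
Bar& get_bar() {
    static Bar bar;
    return bar;
}

// Hypothetical function exported by libbase that relies on "bar".
std::string base_lookup() {
    return get_bar().buf();
}

// Hypothetical stand-in for the static object "foo" in libtest.so.
class Foo {
public:
    Foo() : value_(base_lookup()) {}
    const std::string& value() const { return value_; }
private:
    std::string value_;
};

// Constructed during static initialization; its constructor reaches
// into Bar through the accessor rather than through a bare global.
static Foo foo;
```

With this arrangement the constructor of Foo can run at any point during initialization and still find a fully constructed Bar. It does not remove the duplicate symbol or the cyclic binding between the libraries; it only makes the code tolerant of them.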
Posts: 33
From: US
Registered: 9/28/06

Re: Why the __STATIC_CONSTRUCTOR() calls are not in proper order?
Posted: Jun 22, 2007 9:57 PM
in response to: yongsun

The std::basic_string<T> class template depends on having only one __null_string_ref_rep<T> in the entire program.
The std::basic_string template is pre-instantiated on char and wchar_t, in which case the only __null_string_ref_rep object is already in libCstd, and does not get replicated.
I think the problem comes from instantiating std::basic_string on another type, unsigned int in your case. You need to take steps to ensure only one copy of __null_string_ref_rep<T> exists, for any T other than char or wchar_t.
You can do that by building libraries in hierarchical order, bottom up, and using the -instlib option to prevent instances of the same template from being generated in more than one library. Here is how:
1. Determine the (partial) ordering of library dependencies. There must be no cycle in the dependency graph.
2. Build the lowest-level library first, the one which depends on no other project library. Call it libA.so.
3. Build the next-higher library, call it libB, which depends on libA. For each CC command that creates a .o file that will go into libB, add the option
   -instlib=/path/to/libA.so
4. Build the next-higher library, libC.so. For each CC command that creates a .o file that will go into libC, add the options
   -instlib=/path/to/libA.so -instlib=/path/to/libB.so
   (assuming libC depends on both libA and libB).
You need an -instlib option only for a library on which the current library depends (directly or transitively).
You can read more about the -instlib option in the C++ User's Guide appendix on command-line options. The option causes the compiler to scan the named library for template instances (and inline functions generated out of line), and omit generating them in the current .o file. In my example, libB will contain no template instance that occurs in libA, and libC will contain no template instance that occurs in libA or libB.
We are planning to add the capability to the compiler and Solaris linker to specify that an object must be a singleton, which will eliminate the need for this elaborate build process just to fix the __null_string_ref_rep problem. We do not have a projected date for the new option. You might still need to use -instlib to eliminate other circular references, however.
--- Steve Clamage, stephen dot clamage at sun dot com
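
To illustrate why a string template might depend on a single shared "null" representation, here is a minimal sketch. It is not libCstd's implementation: NullRep and TinyString are hypothetical names, and the design (an empty string pointing at a per-type static sentinel that is recognized by address) is only an assumption about how such a dependency can arise.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical shared "empty" representation, one static per element type T.
template <class T>
struct NullRep {
    static T empty[1];   // holds only the terminating element
};

template <class T>
T NullRep<T>::empty[1] = { T() };

// Hypothetical minimal string: an empty string points at the shared
// sentinel instead of allocating storage of its own.
template <class T>
class TinyString {
public:
    TinyString() : data_(NullRep<T>::empty), size_(0) {}

    TinyString(const T* s, std::size_t n) : data_(new T[n + 1]), size_(n) {
        for (std::size_t i = 0; i < n; ++i) data_[i] = s[i];
        data_[n] = T();
    }

    ~TinyString() {
        // The sentinel is recognized by address. This is only correct
        // if every module binds to the same NullRep<T>::empty.
        if (data_ != NullRep<T>::empty) delete[] data_;
    }

    bool uses_shared_null() const { return data_ == NullRep<T>::empty; }
    std::size_t size() const { return size_; }

private:
    TinyString(const TinyString&);             // non-copyable
    TinyString& operator=(const TinyString&);
    T* data_;
    std::size_t size_;
};
```

In a single program there is exactly one NullRep<unsigned int>::empty, so the address test is reliable. When the same instance is generated into two shared libraries, which copy a reference binds to depends on the runtime linker's lookup order, and that is the situation the -instlib procedure above is meant to prevent.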
Posts: 207
Registered: 6/15/05

Re: Why the __STATIC_CONSTRUCTOR() calls are not in proper order?
Posted: Jun 23, 2007 5:05 AM
in response to: clamage

Hi, Steve,
Thank you so much!!! I will try the -instlib option :)
Regards,