Reading notes: Library Bindings - we should make them a little more precise
Original author: Michael Walker
Original article:
http://blogs.sun.com/roller/page/msw
Annotated by: Badcoffee
Email: blog.oliver@gmail.com
Blog: http://blog.csdn.net/yayong
June 2005
Library Bindings - let's be a little bit more precise shall we
But first a little history on what we currently do. Solaris (and
*nix's in general) does the following when a process is executed. The
kernel loads the required program (an ELF object) into memory and
also loads the runtime linker (ld.so.1(1)) into memory. The kernel then
transfers control initially to the runtime linker. It's the runtime
linker's job to examine the program loaded and find any dependencies it
has (in the form of shared objects), load those shared objects into
memory, and then bind all of the symbol references (function calls,
data references, etc.) from the program to each of those dependencies.
Of course, as it loads each shared object it must in turn do the same
examination on each of them and load any dependencies they require.
Once all of the dependencies are loaded and their symbols have been
bound, the runtime linker will fire the .init sections for each shared
object loaded and finally transfer control to the executable, which
calls main(). Most people think a process starts with main(), but
amazing things happen before we even get there.
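Notes:
1. When a process is executed, the runtime linker (ld.so.1) is mapped into memory along with the ELF file being loaded.
2. The kernel initially transfers control to the runtime linker. The runtime linker's job is to examine the shared libraries the program depends on, map them into memory, and complete the symbol bindings.
3. Once all of the dependencies are loaded and their symbols are bound, the runtime linker calls the .init section of each shared library and then transfers control to the executable, which calls main().
All of the above is similar on Linux and Solaris.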
Here we will specifically look at how the runtime linker binds the various symbol references between all of the objects loaded into memory. Let's take a simple example first: an application which links against a couple of shared objects and then libc.
The sources below build a program, prog, which is bound against three shared objects, foo.so, bar.so and libc.so. The program makes two function calls, one to foo() and one to bar(), located in its dependent shared objects. Running ldd on the executable shows its dependencies, and a run of it shows the execution flow:
% more *.c
::::::::::::::
bar.c
::::::::::::::
#include <stdio.h>
void bar()
{
    printf("inside of bar\n");
}
::::::::::::::
foo.c
::::::::::::::
#include <stdio.h>
void foo() {
    printf("inside of foo\n");
}
::::::::::::::
prog.c
::::::::::::::
#include <stdio.h>
int
main(int argc, char *argv[]){
    extern void foo();
    extern void bar();
    foo();
    bar();
    return (0);
}
% cc -G -o foo.so -Kpic foo.c -lc
% cc -G -o bar.so -Kpic bar.c -lc
% cc -o prog prog.c ./foo.so ./bar.so
% ldd prog
./foo.so => ./foo.so
./bar.so => ./bar.so
libc.so.1 => /lib/libc.so.1
libm.so.2 => /lib/libm.so.2
/platform/SUNW,Sun-Blade-1000/lib/libc_psr.so.1
% ./prog
inside of foo
inside of bar
%
As the objects are loaded they are placed on a list (the link-map list), in load order:
prog -> foo.so -> bar.so -> libc.so.1 -> libm.so.2 -> libc_psr.so.1
When the runtime linker needs to find a definition for a symbol it starts at the head of the list and searches each object for that symbol. If the symbol is found, it binds to it; if not, it proceeds to the next object on the list. The following should help demonstrate what's happening. I will run the prog program, but with some runtime linker diagnostics turned on to trace what it is doing. I'm concentrating specifically on foo and bar for this example; of course there are thousands of other bindings going on:
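Notes:
6. The runtime linker loads the dependent shared libraries and creates a data structure called the link-map. What is shown here is a simplified representation of the link-map, namely a linear list. Linux has a similar link-map.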
% LD_DEBUG=symbols,bindings ./prog
...
20579: 1: symbol=foo; lookup in file=./prog [ ELF ]
20579: 1: symbol=foo; lookup in file=./foo.so [ ELF ]
20579: 1: binding file=./prog to file=./foo.so: symbol `foo'
...
20579: 1: symbol=bar; lookup in file=./prog [ ELF ]
20579: 1: symbol=bar; lookup in file=./foo.so [ ELF ]
20579: 1: symbol=bar; lookup in file=./bar.so [ ELF ]
20579: 1: binding file=./prog to file=./bar.so: symbol `bar'
...
That is fine for a small program, but now consider a larger one such as Firefox, whose loaded objects are listed below. On average, each of those objects has a symbol table with over 2,500 symbols. Doing a linear search from the beginning of the link-map list until you find each symbol just doesn't seem that practical anymore. Firefox is average for modern applications these days; if you were to take a look at Star Office you would find a single program which depends upon over 90 different shared objects.
% pldd `pgrep firefox-bin`
28294: /disk3/local/firefox/firefox-bin
/lib/libpthread.so.1
/lib/libthread.so.1
/lib/libc.so.1
/disk3/local/firefox/libmozjs.so
/disk3/local/firefox/libxpcom.so
/usr/sfw/lib/libgtk-1.2.so.0.9.1
/usr/sfw/lib/libgmodule-1.2.so.0.0.10
/usr/sfw/lib/libglib-1.2.so.0.0.10
/usr/openwin/lib/libXext.so.0
/usr/openwin/lib/libX11.so.4
/lib/libsocket.so.1
/lib/libnsl.so.1
/lib/libm.so.2
/usr/sfw/lib/libgdk-1.2.so.0.9.1
/disk3/local/firefox/libssl3.so
/disk3/local/firefox/libnss3.so
/disk3/local/firefox/libplc4.so
/disk3/local/firefox/libplds4.so
/disk3/local/firefox/libnspr4.so
/disk3/local/firefox/libsoftokn3.so
/lib/librt.so.1
/lib/libdl.so.1
/lib/libaio.so.1
/lib/libmd5.so.1
/usr/openwin/lib/libXt.so.4
/platform/sun4u-us3/lib/libc_psr.so.1
/usr/lib/libCrun.so.1
/usr/lib/libdemangle.so.1
/disk3/local/firefox/cpu/sparcv8plus/libnspr_flt4.so
/lib/libm.so.1
/disk3/local/firefox/libsmime3.so
/usr/openwin/lib/libXp.so.1
/disk3/local/firefox/libxpcom_compat.so
/usr/lib/libCstd.so.1
/usr/lib/cpu/sparcv8plus/libCstd_isa.so.1
/lib/libw.so.1
/lib/libmp.so.2
/lib/libscf.so.1
/lib/libuutil.so.1
/usr/openwin/lib/libSM.so.6
/usr/openwin/lib/libICE.so.6
/usr/lib/iconv/646%UTF-16BE.so
/usr/lib/iconv/UTF-16BE%646.so
/usr/jdk/instances/jdk1.5.0/jre/plugin/sparc/ns7/libjavaplugin_oji.so
/platform/sun4u/lib/libmd5_psr.so.1
/usr/jdk/instances/jdk1.5.0/jre/lib/sparc/libjavaplugin_nscp.so
/disk3/local/firefox/components/libjar50.so
/usr/dt/lib/libXm.so.4
/disk3/local/firefox/libfreebl_hybrid_3.so
/usr/sfw/lib/mozilla/libnssckbi.so
%
There's got to be a better way, right? There is - we call it direct bindings. Instead of doing the linear search at runtime you can simply ask the link-editor to record not only which shared objects you bound against, but also which symbols you obtained from each shared object. So, if you are bound with Direct Bindings, the runtime linker changes how it looks up symbol bindings and instead binds directly to the object that offered the symbol at link time. This is a much more efficient model. Here's the same prog, but this time built with direct bindings, which is done by passing the -Bdirect link-editor option on the link-line:
% cc -Bdirect -o prog prog.c ./foo.so ./bar.so
% elfdump -y prog
Syminfo Section:  .SUNW_syminfo
     index  flgs        bound to        symbol
...
      [15]  DBL     [1] ./foo.so        foo
      [19]  DBL     [3] ./bar.so        bar
...
%
Notice that in the trace below we now find each symbol in the first object we look in, which is much better:
% LD_DEBUG=symbols,bindings ./prog
...
20728: 1: symbol=foo; lookup in file=./foo.so [ ELF ]
20728: 1: binding file=./prog to file=./foo.so: symbol `foo'
...
20728: 1: symbol=bar; lookup in file=./bar.so [ ELF ]
20728: 1: binding file=./prog to file=./bar.so: symbol `bar'
...
%
Direct Bindings has been in Solaris for a few releases now, although because it's not the default, not everyone is familiar with it. It has matured quite a bit over the last few years and we are now starting to use it for some of our core shared objects. If you look at the X11 shared objects delivered with Solaris, you'll find that they are bound with direct bindings:
% elfdump -y /usr/lib/libX11.so | head
Syminfo Section:  .SUNW_syminfo
     index  flgs        bound to        symbol
       [1]  D           <self>          _XimXTransDisconnect
       [2]  D       [8] libc.so.1       snprintf
       [3]  D           <self>          _XcmsFreeIntensityMaps
       [4]  D           <self>          _XcmsTableSearch
       [5]  D           <self>          _XDeq
       [6]  D           <self>          XGetWMSizeHints
       [7]  D           <self>          XUnmapWindow
%
Along these lines, it's worth giving a cautionary note for those
re-linking their existing applications with Direct Bindings enabled. As
we apply Direct Bindings to more and more applications we have found a
few cases where there are multiple definitions of a single symbol. By
changing the binding model you can change which definition a reference
binds to, and so change the behavior of the application. In most, if
not all, cases this was a bug in the design of the application, but a
program can become dependent upon the old behavior and then fail when
run with Direct Bindings.
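Notes:
15. Direct bindings can cause runtime errors in programs where the same symbol is defined in more than one place. In most cases this is a design error in the application.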
Further details on Direct Bindings specifically and the runtime linker (ld.so.1(1)) and link-editor (ld(1)) in general can be found in the Linker and Libraries Guide which is part of the standard Solaris Documentation.
Examples of tracing what the runtime linker is doing can be found in a
blog entry by Rod titled Tracing a link-edit.
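Notes:
16. The Linker and Libraries Guide covers the basic concepts of the runtime linker (ld.so.1(1)) and the link-editor (ld(1)).
17. This article uses LD_DEBUG to trace the runtime linker. That approach is described in Rod's article Tracing a link-edit.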