Why (and how) do I have (seemingly!) duplicate symbols in my shared libraries?

I happened to need to work out which symbols were exported by which library from a flat list of exported shared library symbols. There were just enough symbols (20 or so) in the list that I wasn’t going to manually cross-reference each one.

I found that nm -A -D -f sysv <library-name> seemed to produce useful output – I could search for lines containing FUNC that also listed an address in the second column. So I ran this command on everything in /usr/lib, redirecting it to a file.
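
The filtering step described above can be sketched as a small awk filter. This is only an illustration, not the script that was actually used: the function name defined_funcs is hypothetical, and it assumes the '|'-separated column layout that nm -f sysv prints (name, value, class, type, size, line, section).

```shell
# Hypothetical helper: read `nm -A -D -f sysv` output on stdin and print
# "library:symbol" for every FUNC entry that has an address (i.e. is defined).
defined_funcs() {
  awk -F'|' '
    {
      name = $1; addr = $2; type = $4
      sub(/ +$/, "", name)   # short names are padded with trailing spaces
      gsub(/ /, "", addr)    # undefined symbols have a blank value column
      gsub(/ /, "", type)    # the type column is right-aligned with spaces
    }
    type == "FUNC" && addr != "" { print name }
  '
}
```

It could be used as, for example, nm -A -D -f sysv /usr/lib/libasound.so.2.0.0 | defined_funcs. Header lines printed by nm contain no '|' separators, so they fall through the filter.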

To my surprise, the sanity checking in the script I wrote to parse the file reported duplicate symbols! Upon investigation, I found that libraries really did appear to be exporting duplicate symbols?!?

I verified this using some shell scripting, and I was able to turn the commands I used into this (technically) one-liner:

readlink -f /lib/* /usr/lib/* \
  | grep -F .so. | sort | uniq \
  | while read -r x; do nm -A -D -f sysv "$x"; done \
  | grep FUNC | cut -d'|' -f1 | sort -g | uniq -c | sort -g \
  | sed -n '/^ \+1/!{s/^ \+[0-9]\+ \+//p}' | sed 's/ //g' \
  | tr '\n' '\v' \
  | sed ':1;s/\([^ ]\+\):\([^\v$]\+\)\v\1:/\1:\2|/g;t1' \
  | tr '\v' '\n' \
  | while IFS=: read -a x; do \
     nm -A -D -f sysv "${x[0]}" | grep ":\\(${x[1]//|/\\|}\\).*FUNC"; \
  done

The command above will make your disk seek for a short period of time. You can break it into chunks that redirect to temporary files if you want. Also the output will be very wide (~150 cols).
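
For readers who find the sed/tr stages hard to follow, the duplicate detection can also be sketched as a single awk pass over saved nm output. This is an alternative illustration rather than a drop-in equivalent of the one-liner: the function name dup_func_lines is hypothetical, and it assumes each library was scanned only once (otherwise every symbol would look duplicated).

```shell
# Hypothetical helper: read `nm -A -D -f sysv` output on stdin and print the
# full lines of every defined FUNC symbol that occurs more than once under
# the same "library:symbol" name.
dup_func_lines() {
  awk -F'|' '
    {
      name = $1; addr = $2; type = $4
      sub(/ +$/, "", name); gsub(/ /, "", addr); gsub(/ /, "", type)
      if (type != "FUNC" || addr == "") next
      count[name]++                      # occurrences of this library:symbol
      lines[name] = lines[name] $0 "\n"  # remember the original lines
    }
    END {
      for (n in count)
        if (count[n] > 1) printf "%s", lines[n]
    }
  ' | sort
}
```

With the raw nm output redirected to a temporary file, say nm.out (an assumed name), this would be run as dup_func_lines < nm.out, which avoids invoking nm a second time on each library.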

I initially ran the original script on a Debian Squeeze chroot I was working within, but out of curiosity I ran the above on my host system to find out if the chroot was somehow not sane.

Well… the chroot reported just over 90 duplicates, but my host (Arch) system apparently has about 267.

The command works by grepping nm’s output, so the results are a bit noisy, but they look like this:

/usr/lib/libasound.so.2.0.0:snd_pcm_hw_params_get_access|00043700|   T  |              FUNC|00000037|     |.text
/usr/lib/libasound.so.2.0.0:snd_pcm_hw_params_get_access|000436c0|   T  |              FUNC|0000003c|     |.text
/usr/lib/libasound.so.2.0.0:snd_pcm_hw_params_get_access_mask|000458f0|   T  |              FUNC|00000058|     |.text
/usr/lib/libasound.so.2.0.0:snd_pcm_hw_params_get_buffer_size|000432a0|   T  |              FUNC|00000037|     |.text
/usr/lib/libasound.so.2.0.0:snd_pcm_hw_params_get_buffer_size|00043260|   T  |              FUNC|0000003c|     |.text
/usr/lib/libasound.so.2.0.0:snd_pcm_hw_params_get_buffer_size_max|00044ff0|   T  |              FUNC|0000003c|     |.text
/usr/lib/libasound.so.2.0.0:snd_pcm_hw_params_get_buffer_size_max|00045030|   T  |              FUNC|00000037|     |.text
/usr/lib/libasound.so.2.0.0:snd_pcm_hw_params_get_buffer_size_min|000453c0|   T  |              FUNC|00000037|     |.text
/usr/lib/libasound.so.2.0.0:snd_pcm_hw_params_get_buffer_size_min|00045380|   T  |              FUNC|0000003c|     |.text

Notice how there are two of each symbol. The address is different, yes, but… I thought dynamic linking worked by symbol name, and that was it. My confusion is compounded by the fact that (if you scroll rightwards in the listing above) the symbols are all of type FUNC and from the .text section.

I’m posting this to learn what interesting magic is going on under the hood here. (Since my system is working…)

If anyone has any good ideas where I could dump about 600 lines of text – Pastebin doesn’t seem in vogue anymore, and I don’t use GitHub – I’d be happy to share the full output.

Answer

The symbols appear duplicated because the information provided by nm is incomplete: the symbols in question are versioned. You can see this with objdump -T:

0000000000059d00 g    DF .text  0000000000000044 (ALSA_0.9)   snd_pcm_hw_params_get_access
0000000000056040 g    DF .text  000000000000004b  ALSA_0.9.0rc4 snd_pcm_hw_params_get_access

or nm’s --with-symbol-versions option:

/usr/lib/x86_64-linux-gnu/libasound.so.2.0.0:snd_pcm_hw_params_get_access|0000000000059d00|   T  |              FUNC|0000000000000044|     |.text@ALSA_0.9
/usr/lib/x86_64-linux-gnu/libasound.so.2.0.0:snd_pcm_hw_params_get_access|0000000000056040|   T  |              FUNC|000000000000004b|     |.text@@ALSA_0.9.0rc4

Binaries are linked against a specific version of each symbol: the version is recorded in the binary at link time, and the dynamic linker resolves that exact version when the binary is loaded. This allows APIs to be changed while preserving backwards compatibility.
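
To see at a glance which symbols in a library exist under several versions, the objdump -T output can be grouped by symbol name. The following is a rough sketch, not an established tool: the function name symbol_versions is hypothetical, and it assumes the column layout shown in the objdump excerpt above, with the version in the second-to-last field and the symbol name in the last.

```shell
# Hypothetical helper: read `objdump -T` output on stdin and, for each defined
# symbol listed under more than one version, print the versions seen.
# Versions are printed exactly as objdump shows them, parentheses included;
# in the sample above the parenthesised entry matches nm's single-@ form.
symbol_versions() {
  awk '
    # Keep symbol rows only: first field is a hex address, skip undefined ones.
    $1 ~ /^[0-9a-fA-F]+$/ && NF >= 6 && !/\*UND\*/ {
      name = $NF; ver = $(NF - 1)
      n[name]++
      vers[name] = vers[name] (n[name] > 1 ? " " : "") ver
    }
    END { for (s in n) if (n[s] > 1) print s ": " vers[s] }
  ' | sort
}
```

For example, objdump -T /usr/lib/x86_64-linux-gnu/libasound.so.2.0.0 | symbol_versions should print one line per multiply-versioned symbol, such as snd_pcm_hw_params_get_access followed by its two versions.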

Attribution
Source: Link, Question Author: i336_, Answer Author: Stephen Kitt
