This is the mail archive of the cygwin-developers mailing list for the Cygwin project.



Re: About the dll search algorithm of dlopen


Hi Corinna,

On 08/20/2016 09:32 PM, Corinna Vinschen wrote:
>>>>
>>>> One way around YA code duplication could be some kind of path iterator
>>>> class which could be used from find_exec as well as from
>>>> get_full_path_of_dll.

>>> 0001.patch is a draft for some new cygwin::pathfinder class, with
>>> 0002.patch adding the executable's directory as searchpath, and
>>> 0003.patch to search the PATH environment as well.
>>>
>>> Thoughts?
> 
> Ok, that might be disappointing now because you already put so much work
> into it, but I actually expected some more discussion first.  I have two
> problems with this.
> 
> I'm not a big fan of templates.

Never mind, it was a template exercise for me anyway.

> What I had in mind was a *simple* class which gets told if it searches
> for libs or executables and then checks the different paths accordingly,
> kind of a copy of find_exec as a class, just additionally handling the
> prefix issue for DLLs.

What interests me more about such a class is the actual API used by
dlopen() and exec(), and the final list of files searched for. These
use cases come to mind:

Libraries/dlls with final search path "/lib:/morelibs":
  L1) dlopen("libN.so")
  L2) dlopen("libN.dll")
  L3) dlopen("cygN.dll")
  L4) dlopen("N.so")
  L5) dlopen("N.dll")
Executables with final search path "/bin:/moreexes"
  X1) exec("X")
  X2) exec("X.exe")
  X3) exec("X.com")

Instead of API calls similar to:
  L1) find(dll, "N", ["/lib", "/morelibs"])
  L2) find(dll, "N", ["/lib", "/morelibs"])
  L3) find(dll, "N", ["/lib", "/morelibs"])
  L4) find(dll, "N", ["/lib", "/morelibs"])
  L5) find(dll, "N", ["/lib", "/morelibs"])
  X1) find(exe, "X", ["/bin", "/moreexes"])
  X2) find(exe, "X", ["/bin", "/moreexes"])
  X3) find(exe, "X", ["/bin", "/moreexes"])
it feels necessary to support more explicit naming, as in:
  L1) find(["libN.so", "cygN.dll", "libN.dll"], ["/lib/../bin","/lib","/morelibs"])
  L2) find([           "cygN.dll", "libN.dll"], ["/lib/../bin","/lib","/morelibs"])
  L3) find([           "cygN.dll", "libN.dll"], ["/lib/../bin","/lib","/morelibs"])
  L4) find(["N.so", "N.dll"                  ], ["/lib/../bin","/lib","/morelibs"])
  L5) find([        "N.dll"                  ], ["/lib/../bin","/lib","/morelibs"])
  X1) find(["X", "X.exe","X.com"], ["/bin","/moreexes"])
  X2) find(["X", "X.exe"        ], ["/bin","/moreexes"])
  X3) find(["X", "X.com"        ], ["/bin","/moreexes"])

Where the find method does not need to actually know whether it searches
for a dll or an exe, but dlopen() and exec() instead define the file
names to search for. This is what the patch draft does in dlopen.
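
To make the explicit-naming idea concrete, here is a minimal sketch. It is
not the code from the patch draft: dll_candidates, exe_candidates and
find_first are hypothetical names, the existence check is passed in as a
callback so nothing touches the real filesystem, and trying every name in
one directory before moving on to the next directory is an assumption the
tables above do not settle.

```cpp
#include <functional>
#include <string>
#include <vector>

using name_list = std::vector<std::string>;

// True if s ends with suffix.
bool ends_with(const std::string &s, const std::string &suffix)
{
  return s.size() >= suffix.size()
    && s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical: file names dlopen(name) would ask for, per cases L1..L5.
name_list dll_candidates(const std::string &name)
{
  std::string ext = ends_with(name, ".so") ? ".so"
                  : ends_with(name, ".dll") ? ".dll" : "";
  if (ext.empty())
    return {name};                 // not covered by the table: keep as given
  name_list names;
  std::string stem = name.substr(0, name.size() - ext.size());
  if (ext == ".so")
    names.push_back(name);         // the literal ".so" name is tried first
  bool prefixed = stem.compare(0, 3, "lib") == 0
               || stem.compare(0, 3, "cyg") == 0;
  if (prefixed) {
    std::string base = stem.substr(3);
    names.push_back("cyg" + base + ".dll");
    names.push_back("lib" + base + ".dll");
  } else {
    names.push_back(stem + ".dll");
  }
  return names;
}

// Hypothetical: file names exec(name) would ask for, per cases X1..X3.
name_list exe_candidates(const std::string &name)
{
  if (ends_with(name, ".exe") || ends_with(name, ".com"))
    return {name.substr(0, name.size() - 4), name};
  return {name, name + ".exe", name + ".com"};
}

// find() itself knows nothing about dlls or exes: it returns the first
// dir/name combination for which exists() is true, or "" if there is none.
std::string find_first(const name_list &names,
                       const std::vector<std::string> &dirs,
                       const std::function<bool(const std::string &)> &exists)
{
  for (const auto &dir : dirs)
    for (const auto &name : names) {
      std::string path = dir + "/" + name;
      if (exists(path))
        return path;
    }
  return "";
}
```

With this split, dlopen() would call find_first(dll_candidates(name), dirs,
...) and exec() would call find_first(exe_candidates(name), dirs, ...), so
only the callers carry the naming rules.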

>>>>>>> *) The directory of the current main executable should be searched
>>>>>>>    after LD_LIBRARY_PATH and before /usr/bin:/usr/lib.
>>>>>>>    And PATH should be searched before /usr/bin:/usr/lib as well.
>>>>>>
>>>>>> Checking the executable path and $PATH are Windows concepts.  dlopen
>>>>>> doesn't do that on POSIX systems and we're not doing that either.
>>>>>
>>>>> Agreed, but POSIX also does have the concept of embedded RUNPATH,
>>>>> which is completely missing in Cygwin as far as I can see.
>>>>
>>>> RPATH and RUNPATH are ELF dynamic loader features, not supported by
>>>> PE/COFF.
>>>
>>> In any case, to me it does feel quite important to have the (almost) same
>>> dll search algorithm with dlopen() as with CreateProcess().
> 
> Last but not least I'm not yet convinced if it's *really* a good idea to
> prepend the executable path to the DLL search path unconditionally.  Be
> it as it is in terms of DT_RUNPATH, why is the application dir a good
> choice at all, unless we're talking Windows semantics?  Which we don't.
> Also, if loading from the application's dir from dlopen is important for
> you, you can emulate it by adding the application dir to LD_LIBRARY_PATH.

As long as there is no Cygwin-specific dll loader to find the dlls to
load during process startup, we're bound to Windows semantics.

For dlopen, it is more important to find the same dll file that would be
found if the exe had been linked against that dll than to use the
Linux-known algorithm and environment variables and thereby differ from
process startup. Both really should use the same algorithm here, even if
that means some difference compared to Linux.

As far as I understand, the lack of DT_RUNPATH (and /etc/ld.so.conf)
support during process start was the main reason dlls are installed
into /lib/../bin instead of /lib in the first place: residing in the
application's bin dir, they are found at process start.
Why should that be different for dlopen?
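
As a sketch of how the "/lib/../bin" entries in the search lists further up
could be derived: the function below splits a colon-separated search path
and puts the sibling bin dir in front of every entry whose last component
is "lib". The name expand_lib_dirs and the exact rule are assumptions made
for illustration, chosen only so that "/lib:/morelibs" yields the list used
in the examples.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: turn "/lib:/morelibs" into the list of directories
// to search, trying "<dir>/../bin" before each "<dir>" that ends in "/lib".
std::vector<std::string> expand_lib_dirs(const std::string &search_path)
{
  const std::string tail = "/lib";
  std::vector<std::string> dirs;
  std::istringstream in(search_path);
  std::string dir;
  while (std::getline(in, dir, ':')) {
    if (dir.empty())
      continue;                          // skip empty path components
    bool is_lib = dir.size() >= tail.size()
      && dir.compare(dir.size() - tail.size(), tail.size(), tail) == 0;
    if (is_lib)
      dirs.push_back(dir + "/../bin");   // dlls usually live next to the exes
    dirs.push_back(dir);
  }
  return dirs;
}
```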

> I checked for the usage of DT_RUNPATH/DT_RPATH on Fedora 23 and only a
> limited number of packages use it (texlive, samba, python, man-db,
> swipl, and a few more).  Some of them, like texlive, even use it wrongly,
> with RPATH pointing to a non-existing build dir.  There are also a few
> stray "/usr/lib64" settings, but all in all it's not used to point to
> the dir the application is installed to, but rather to some package specific
> subdir, e.g.  /usr/lib64/samba, /usr/lib64/swipl-7.2.3/lib/x86_64-linux,
> etc.

On Linux, the binaries installed in /usr usually rely on the Linux
loader to be configured via /etc/ld.so.conf to find their runtime
libs in /usr/lib.

Please remember: This whole thing is not a problem with packages
installed to /usr, but with packages installed to /somewhere/else
that provide runtime libraries that are also available in /usr.

Using LD_LIBRARY_PATH pointing to /somewhere/else/lib may break the
binaries found in /usr/bin - and agreed, searching PATH doesn't make
it better, as PATH is the "LD_LIBRARY_PATH" for Windows.

> IMHO this means just adding the application's bin dir is most of the time
> an unused or even wrong workaround.

GetModuleHandle may reduce that pressure for dlopen. Still, as long as
the application's bin dir is searched at process start, it really should
be searched by dlopen too. For /usr/bin/* this might indeed become
redundant, as we always add /usr/bin in dlopen, which effectively mimics
the /etc/ld.so.conf content, although that file is unavailable to
process startup.
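
The order I have in mind for dlopen could be sketched like this. It is an
illustration of the proposal only, not of what Cygwin does today:
dlopen_search_dirs is a hypothetical name, both inputs are passed in as
plain strings instead of being read from the environment or the process,
and PATH is left out on purpose.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical: directories dlopen would search under the proposal, in
// order: LD_LIBRARY_PATH entries, the main executable's directory, then
// the built-in /usr/bin and /usr/lib defaults.
std::vector<std::string> dlopen_search_dirs(const std::string &ld_library_path,
                                            const std::string &exe_dir)
{
  std::vector<std::string> dirs;
  std::istringstream in(ld_library_path);
  std::string dir;
  while (std::getline(in, dir, ':'))
    if (!dir.empty())
      dirs.push_back(dir);
  if (!exe_dir.empty())
    dirs.push_back(exe_dir);   // same place process startup looks first
  dirs.push_back("/usr/bin");
  dirs.push_back("/usr/lib");
  return dirs;
}
```

For a binary in /usr/bin the executable's directory merely duplicates the
default, which is the redundancy mentioned above.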

Thanks!
/haubi/

