Summary: | Xorg start fails with misleading log entries: Module [...] does not have a [...] data object. | ||||||
---|---|---|---|---|---|---|---|
Product: | xorg | Reporter: | Knut Petersen <Knut_Petersen> | ||||
Component: | Server/DDX/Xorg/dlloader | Assignee: | Adam Jackson <ajax> | ||||
Status: | RESOLVED INVALID | QA Contact: | Xorg Project Team <xorg-team> | ||||
Severity: | normal | ||||||
Priority: | medium | CC: | daniel | ||||
Version: | git | Keywords: | patch | ||||
Hardware: | x86 (IA32) | ||||||
OS: | Linux (All) | ||||||
Whiteboard: | 2011BRB_Reviewed | ||||||
i915 platform: | i915 features: | ||||||
Description
Knut Petersen
2011-09-25 23:31:28 UTC
Hmm, it seems like you're starting with LD_BIND_NOW or RTLD_NOW enabled, which isn't supported. Is that the case?

No. set | grep LD shows none of the LD/RTLD variables.

cu,
knut

There's two places where you can ask for LD_BIND_NOW-style behaviour, in the ld.so environment (quite difficult with suid executables actually) and at ld time itself. I suspect if you run 'readelf -a foo.so | grep NOW' against one of your compiled modules you'll see something like:

0x00000018 (BIND_NOW)
0x6ffffffb (FLAGS_1) Flags: NOW

Which means you've put '-z now' into your ldflags. Don't have done that.

No. There is no -z in LDFLAGS.
No. 'readelf -a foo.so | grep NOW' does not find "NOW".

I did not install a new glibc, binutils or anything like that.

Xorg is built using the following script. I believe it is ok.

export PREFIX=/usr
export PKG_CONFIG_PATH=$PREFIX/lib/pkgconfig
export PATH=$PREFIX/bin:$PATH
export ACLOCAL="aclocal -I $PREFIX/share/aclocal"
export LD_LIBRARY_PATH=$PREFIX/lib
export PYTHONPATH=$PREFIX/lib/python2.7/site-packages
export CFLAGS="-v -O3 "
util/modular/build.sh $PREFIX --modfile modules_to_build --autoresume built-modules.txt \
    --confflags "--enable-kdrive --with-dri-drivers=i915 --disable-gallium --localstatedir=/var"

A full new build after make clean, make realclean, git reset --hard does not help.

ltrace shows that dlopen is called with flags 257. That is ok.

vsnprintf("(II) Loading /usr/lib/xorg/modules/extensions/libglx.so\n", 1024, "(II) Loading %s\n", 0xbf8166f8) = 56
fwrite("(II) Loading /usr/lib/xorg/modules/extensions/libglx.so\n", 56, 1, 0x8220d28) = 1
dlopen("/usr/lib/xorg/modules/extensions/libglx.so", 257) = NULL
dlerror() = "/usr/lib/xorg/modules/extensions/libglx.so: undefined symbol: DRIGetDrawableInfo"
snprintf("(EE) Failed to load %s: %s\n", 1024, "%s%s%s", "(EE)", " ", "Failed to load %s: %s\n") = 27
clock_gettime(1, 0xbf816240, 0x79732064, 0x6c6f626d, 0x5244203a) = 0
sprintf("[ 71701.622] ", "[%10.3f] ", ...) = 13
fwrite("[ 71701.622] ", 13, 1, 0x8220d28) = 1
vsnprintf("(EE) Failed to load /usr/lib/xorg/modules/extensions/libglx.so: /usr/lib/xorg/modules/extensions/libglx.so: undefined symbol: DRIGetDrawableInfo\n", 1024, "(EE) Failed to load %s: %s\n", 0xbf8166f8) = 145
fwrite("(EE) Failed to load /usr/lib/xorg/modules/extensions/libglx.so: /usr/lib/xorg/modules/extensions/libglx.so: undefined symbol: DRIGetDrawableInfo\n", 145, 1, 0xb7537560(EE) Failed to load /usr/lib/xorg/modules/extensions/libglx.so: /usr/lib/xorg/modules/extensions/libglx.so: undefined symbol: DRIGetDrawableInfo
) = 1
fwrite("(EE) Failed to load /usr/lib/xorg/modules/extensions/libglx.so: /usr/lib/xorg/modules/extensions/libglx.so: undefined symbol: DRIGetDrawableInfo\n", 145, 1, 0x8220d28) = 1
__strdup(0x822b9b0, 0xbf8167ec, 0xbf8167e8, 0xbf81677c, 0) = 0x822b480
strchr("glx", '.') = NULL
asprintf(0xbf81678c, 0x81e8905, 0x8229c58, 0xbf81677c, 0) = 13
dlsym(NULL, "glxModuleData") = NULL
dlopen(NULL, 257) = 0xb7837900
dlsym(0xb7837900, "glxModuleData") = NULL
snprintf("(EE) LoadModule: Module %s does not have a %s data object.\n", 1024, "%s%s%s", "(EE)", " ", "LoadModule: Module %s does not have a %s data object.\n") = 59
clock_gettime(1, 0xbf816270, 1, 0x8220d28, 1) = 0
sprintf("[ 71701.647] ", "[%10.3f] ", ...) = 13
fwrite("[ 71701.647] ", 13, 1, 0x8220d28) = 1
vsnprintf("(EE) LoadModule: Module glx does not have a glxModuleData data object.\n", 1024, "(EE) LoadModule: Module %s does not have a %s data object.\n", 0xbf816728) = 71
fwrite("(EE) LoadModule: Module glx does not have a glxModuleData data object.\n", 71, 1, 0xb7537560(EE) LoadModule: Module glx does not have a glxModuleData data object.
) = 1
fwrite("(EE) LoadModule: Module glx does not have a glxModuleData data object.\n", 71, 1, 0x8220d28) = 1
snprintf("(II) UnloadModule: "%s"\n", 1024, "%s%s%s", "(II)", " ", "UnloadModule: "%s"\n") = 24
clock_gettime(1, 0xbf816230, 71, -1, 0xb7536ff4) = 0
sprintf("[ 71701.660] ", "[%10.3f] ", ...) = 13
fwrite("[ 71701.660] ", 13, 1, 0x8220d28) = 1
vsnprintf("(II) UnloadModule: "glx"\n", 1024, "(II) UnloadModule: "%s"\n", 0xbf8166ec) = 25
fwrite("(II) UnloadModule: "glx"\n", 25, 1, 0x8220d28) = 1
snprintf("(II) Unloading %s\n", 1024, "%s%s%s", "(II)", " ", "Unloading %s\n") = 18
clock_gettime(1, 0xbf816210, -1, 0xbf816230, 3) = 0
sprintf("[ 71701.669] ", "[%10.3f] ", ...) = 13
fwrite("[ 71701.669] ", 13, 1, 0x8220d28) = 1
vsnprintf("(II) Unloading glx\n", 1024, "(II) Unloading %s\n", 0xbf8166c8) = 19
fwrite("(II) Unloading glx\n", 19, 1, 0x8220d28) = 1

libdri etc do provide the symbols required for libglx. Xorg finds them and loads them after libglx failed. ldconfig -p | grep /usr/lib/xorg does show all the required libraries. According to the man page dlopen should load those libraries, shouldn't it?!

I'd suspect a problem with ld*, but why is only Xorg module loading broken? What is so special about Xorg? I am perplexed.

cu,
Knut

> libdri etc do provide the symbols required for libglx. Xorg finds them
> and loads them after libglx failed. ldconfig -p | grep /usr/lib/xorg does show
> all the required libraries. According to the man page dlopen should load those
> libraries, shouldn't it?!
Yes, it should.
To use libglx as an example, the only reference it makes to DRIGetDrawableInfo is as a function call:
glx/glxdri.c: retval = DRIGetDrawableInfo(pScreen, drawable->base.pDraw, index, stamp,
These are _normally_ resolved lazily (i.e., when called) by the dynamic loader. However, if you force symbols to be resolved before they're all available, dlopen will fail. That's why I keep asking about -z now: that's the thing that changes function call resolution from lazy to up-front.
You _must_ be getting that behaviour from somewhere. Check the Xorg binary itself. Check for wrapper scripts. Check your OS for security policy changes (-z now lets you do some additional security hardening).
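For reference, the lazy-versus-up-front choice is also visible in the flags the server passes to dlopen; the ltrace output in this report shows the value 257. Below is a minimal sketch for decoding such a value, assuming the flag numbers glibc uses on Linux (other platforms may differ). The helper name decode_dlopen_flags is made up for illustration.

```python
# Flag values as defined by glibc on Linux (bits/dlfcn.h).
# Assumption: other C libraries or platforms may use different numbers.
GLIBC_RTLD_FLAGS = {
    0x0001: "RTLD_LAZY",
    0x0002: "RTLD_NOW",
    0x0004: "RTLD_NOLOAD",
    0x0008: "RTLD_DEEPBIND",
    0x0100: "RTLD_GLOBAL",
    0x1000: "RTLD_NODELETE",
}


def decode_dlopen_flags(value):
    """Return the names of the known flag bits set in `value` (hypothetical helper)."""
    names = [name for bit, name in sorted(GLIBC_RTLD_FLAGS.items()) if value & bit]
    known_bits = 0
    for bit in GLIBC_RTLD_FLAGS:
        known_bits |= bit
    leftover = value & ~known_bits
    if leftover:
        # Report unknown bits rather than silently dropping them.
        names.append(hex(leftover))
    return names


if __name__ == "__main__":
    print(decode_dlopen_flags(257))  # ['RTLD_LAZY', 'RTLD_GLOBAL']
```

Under that assumption, 257 decodes to RTLD_LAZY plus RTLD_GLOBAL, so the dlopen call itself is not what requests up-front resolution.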
A full extra verbose build log does not show -z now.

After exporting LD_DEBUG and LD_DEBUG_OUTPUT I got "log". grep "relocation" log shows

14821: relocation processing: /lib/libgpg-error.so.0 (lazy)
14821: relocation processing: /lib/libc.so.6 (lazy)
14821: relocation processing: /lib/librt.so.1 (lazy)
14821: relocation processing: /lib/libm.so.6 (lazy)
14821: relocation processing: /usr/local/lib/libXdmcp.so.6 (lazy)
14821: relocation processing: /usr/local/lib/libXau.so.6 (lazy)
14821: relocation processing: /lib/libz.so.1 (lazy)
14821: relocation processing: /usr/lib/libfontenc.so.1 (lazy)
14821: relocation processing: /usr/lib/libfreetype.so.6 (lazy)
14821: relocation processing: /usr/lib/libXfont.so.1 (lazy)
14821: relocation processing: /usr/lib/libpixman-1.so.0 (lazy)
14821: relocation processing: /lib/libpthread.so.0 (lazy)
14821: relocation processing: /usr/lib/libpciaccess.so.0 (lazy)
14821: relocation processing: /lib/libdl.so.2 (lazy)
14821: relocation processing: /lib/libgcrypt.so.11 (lazy)
14821: relocation processing: /lib/libdbus-1.so.3 (lazy)
14821: relocation processing: /usr/lib/libhal.so.1 (lazy)
14821: relocation processing: Xorg (lazy)
14821: relocation processing: /lib/ld-linux.so.2
14821: relocation processing: /lib/libgcc_s.so.1 (lazy)
14821: relocation processing: /usr/lib/xorg/modules/extensions/libextmod.so (lazy)
14821: relocation processing: /usr/lib/xorg/modules/extensions/libdbe.so (lazy)
14821: relocation processing: /usr/lib/xorg/modules/extensions/libglx.so (lazy)
14821: relocation processing: /usr/lib/xorg/modules/extensions/librecord.so (lazy)
14821: relocation processing: /usr/lib/libdrm.so.2 (lazy)
14821: relocation processing: /usr/lib/xorg/modules/extensions/libdri.so (lazy)
14821: relocation processing: /usr/lib/xorg/modules/extensions/libdri2.so (lazy)
14821: relocation processing: /usr/lib/libdrm_intel.so.1 (lazy)
14821: relocation processing: /usr/lib/xorg/modules/drivers/intel_drv.so (lazy)

Everything is processed lazily, as it should be.

Here is the section related to the search for DRIGetDrawableInfo during processing of libglx:

14821: symbol=DRIGetDrawableInfo; lookup in file=Xorg [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/usr/lib/libhal.so.1 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/libdbus-1.so.3 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/libgcrypt.so.11 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/libdl.so.2 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/usr/lib/libpciaccess.so.0 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/libpthread.so.0 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/usr/lib/libpixman-1.so.0 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/usr/lib/libXfont.so.1 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/usr/lib/libfreetype.so.6 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/usr/lib/libfontenc.so.1 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/libz.so.1 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/usr/local/lib/libXau.so.6 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/usr/local/lib/libXdmcp.so.6 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/libm.so.6 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/librt.so.1 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/libc.so.6 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/libgpg-error.so.0 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/ld-linux.so.2 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/usr/lib/xorg/modules/extensions/libextmod.so [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/usr/lib/xorg/modules/extensions/libdbe.so [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/usr/lib/xorg/modules/extensions/libglx.so [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/libdl.so.2 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/libm.so.6 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/librt.so.1 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/libc.so.6 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/ld-linux.so.2 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/libpthread.so.0 [0]
14821: /usr/lib/xorg/modules/extensions/libglx.so: error: symbol lookup error: undefined symbol: DRIGetDrawableInfo (fatal)

dlopen does not look at libdri.so.

readelf shows four needed shared libs for libglx.so. I don't know how the linker exactly finds symbols in the various libraries, but: Shouldn't there be a "NEEDED" entry for libdri.so in libglx.so?

Dynamic section at offset 0x5dee8 contains 27 entries:
  Tag        Type         Name/Value
 0x00000001 (NEEDED)     Shared library: [libdl.so.2]
 0x00000001 (NEEDED)     Shared library: [libm.so.6]
 0x00000001 (NEEDED)     Shared library: [librt.so.1]
 0x00000001 (NEEDED)     Shared library: [libc.so.6]
 0x0000000e (SONAME)     Library soname: [libglx.so]
 0x0000000c (INIT)       0xe818
 0x0000000d (FINI)       0x4ee28
 0x00000004 (HASH)       0x138
 0x6ffffef5 (GNU_HASH)   0x57c
 0x00000005 (STRTAB)     0xefc
 0x00000006 (SYMTAB)     0x63c
 0x0000000a (STRSZ)      2090 (bytes)
 0x0000000b (SYMENT)     16 (bytes)
 0x00000003 (PLTGOT)     0x5dff4
 0x00000002 (PLTRELSZ)   16 (bytes)
 0x00000014 (PLTREL)     REL
 0x00000017 (JMPREL)     0xe808
 0x00000011 (REL)        0x18b0
 0x00000012 (RELSZ)      53080 (bytes)
 0x00000013 (RELENT)     8 (bytes)
 0x00000016 (TEXTREL)    0x0
 0x0000001e (FLAGS)      TEXTREL STATIC_TLS
 0x6ffffffe (VERNEED)    0x1840
 0x6fffffff (VERNEEDNUM) 2
 0x6ffffff0 (VERSYM)     0x1726
 0x6ffffffa (RELCOUNT)   3970
 0x00000000 (NULL)       0x0

cu,
Knut

(In reply to comment #6)
> I don't know how the linker exactly finds symbols in the various libraries,
> but: Shouldn't there be a "NEEDED" entry for libdri.so in libglx.so?

If these were actual shared libraries, then yes, but these are loadable modules which rely on the symbols being found at runtime in either the loading program or the other objects it has already dlopen'ed.
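The point that a module's symbols must already be present in the server or in a previously loaded object can be checked against an LD_DEBUG log like the one quoted in this report. Here is a rough sketch, assuming the "symbol=...; lookup in file=..." line format shown there; the function name and the shortened sample are illustrative only.

```python
import re

# Matches LD_DEBUG lines of the form seen in this report, e.g.
#   14821: symbol=DRIGetDrawableInfo; lookup in file=Xorg [0]
LOOKUP_RE = re.compile(r"symbol=(?P<sym>[^;]+);\s+lookup in file=(?P<file>\S+)")


def objects_searched(log_text, symbol):
    """List the objects searched for `symbol`, in order, without repeats."""
    seen = []
    for match in LOOKUP_RE.finditer(log_text):
        if match.group("sym") == symbol and match.group("file") not in seen:
            seen.append(match.group("file"))
    return seen


# Shortened sample taken from the log in this report.
SAMPLE = """\
14821: symbol=DRIGetDrawableInfo; lookup in file=Xorg [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/libc.so.6 [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/usr/lib/xorg/modules/extensions/libglx.so [0]
14821: symbol=DRIGetDrawableInfo; lookup in file=/lib/libc.so.6 [0]
"""

if __name__ == "__main__":
    searched = objects_searched(SAMPLE, "DRIGetDrawableInfo")
    print(searched)
    # Was libdri.so part of the search scope when libglx.so was loaded?
    print(any(path.endswith("libdri.so") for path in searched))
```

Run against the full log, a check like this would confirm that libdri.so was not yet in the lookup scope at the time libglx.so was opened.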
That said, the Solaris packages do include a patch to add that dependency, as part of our checking all symbols are resolvable at build time (-z defs):
http://src.opensolaris.org/source/xref/x-cons/xnv-clone/open-src/xserver/xorg/dixmods-deps.patch

If that was useful to other platforms, I'd be happy to contribute upstream, similar to the recently submitted http://patchwork.freedesktop.org/patch/7209/

Created attachment 51740
fix of longstanding error handling bug in the module loader

We could argue about the error message...

I think we should give a hint to Fred Foobar how he could fix the problem on his system. But as dlopen() never should fail, we also should ask for a bug report.

cu,
Knut

Please send your patch to xorg-devel for review.

(In reply to comment #8)
> Created attachment 51740
> fix of longstanding error handling bug in the module loader
>
> We could argue about the error message...
>
> I think we should give a hint to Fred Foobar how he could fix the problem on
> his system. But as dlopen() never should fail, we also should ask for a bug
> report.
>
> cu,
> Knut

Someone else fixed the 2nd Bug.

The main problem still exists in current git master - Xorg fails to start without a manually created "Section Modules", even after I installed a fresh version of openSuSE.

Knut

This came up again on intel-gfx@, and it does indeed appear to be a toolchain issue:
http://lists.freedesktop.org/archives/intel-gfx/2012-July/019079.html

The solution is pretty simple:

=============================================
Never ever include -v or --verbose in CFLAGS!
=============================================

Why?

Because otherwise there will be some output to stdout during the -fPIC test compile executed from configure, and that output causes the build system to erroneously assume that -fPIC does not work. Hence xorg parts that normally would be built with -fPIC will be built without that flag.
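One way to spot a module that ended up without -fPIC is its dynamic section: the readelf output for libglx.so quoted in this report lists a TEXTREL entry and TEXTREL in FLAGS, which usually indicates relocations against the text segment that the loader applies at load time. The following is a hedged sketch that scans "readelf -d" style text for those markers; the function name and the trimmed sample are illustrative, and real output may be laid out differently.

```python
import re

# One dynamic-section entry, e.g. " 0x00000016 (TEXTREL) 0x0"
ENTRY_RE = re.compile(r"^\s*0x[0-9a-fA-F]+\s+\((?P<tag>\w+)\)\s*(?P<value>.*)$")


def summarize_dynamic(readelf_text):
    """Collect NEEDED libraries and the TEXTREL / BIND_NOW markers (sketch)."""
    needed, tags, flags = [], set(), set()
    for line in readelf_text.splitlines():
        match = ENTRY_RE.match(line)
        if not match:
            continue
        tag, value = match.group("tag"), match.group("value").strip()
        tags.add(tag)
        if tag == "NEEDED":
            lib = re.search(r"\[(.+?)\]", value)
            if lib:
                needed.append(lib.group(1))
        elif tag in ("FLAGS", "FLAGS_1"):
            flags.update(value.replace("Flags:", "").split())
    return {
        "needed": needed,
        "textrel": "TEXTREL" in tags or "TEXTREL" in flags,
        "bind_now": "BIND_NOW" in tags or "BIND_NOW" in flags or "NOW" in flags,
    }


# Trimmed sample based on the libglx.so output in this report.
SAMPLE = """\
 0x00000001 (NEEDED) Shared library: [libdl.so.2]
 0x00000001 (NEEDED) Shared library: [libc.so.6]
 0x0000000e (SONAME) Library soname: [libglx.so]
 0x00000016 (TEXTREL) 0x0
 0x0000001e (FLAGS) TEXTREL STATIC_TLS
"""

if __name__ == "__main__":
    print(summarize_dynamic(SAMPLE))
```

For the sample, textrel comes out true and bind_now false, which matches what was seen here: no -z now anywhere, yet symbols were still resolved at load time.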
The resulting Xorg server will fail to start with the normal configuration setup as lazy resolution is assumed but impossible. It will work perfectly if you add a suitable Section "Module" that loads all necessary modules in the right order.

I think the test for "-fPIC" support is fundamentally broken and should be fixed. Or would it be better to check for -v and --verbose in CFLAGS?

cu,
Knut
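The second option raised here, checking CFLAGS for -v or --verbose before configuring, could look roughly like this. It is only a sketch: the function name is hypothetical, and only the two spellings named in this report are treated as verbose flags.

```python
import shlex

# Only the two spellings mentioned in this report; this is an assumption,
# not an exhaustive list of compiler options that produce extra output.
VERBOSE_FLAGS = {"-v", "--verbose"}


def verbose_flags_in(cflags):
    """Return the verbose flags found in a CFLAGS-style string."""
    return [token for token in shlex.split(cflags) if token in VERBOSE_FLAGS]


if __name__ == "__main__":
    # The CFLAGS value from the build script in this report.
    print(verbose_flags_in("-v -O3 "))                  # ['-v']
    print(verbose_flags_in("-O2 -fvisibility=hidden"))  # []
```

A build wrapper could refuse to start, or at least warn, whenever the returned list is non-empty.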