Summary: | ZaphodHeads doesn't work after upgrading to xorg 1.14 | |
---|---|---|---
Product: | xorg | Reporter: | Damian Nowak <nowaker>
Component: | Driver/nouveau | Assignee: | Nouveau Project <nouveau>
Status: | RESOLVED FIXED | QA Contact: | Xorg Project Team <xorg-team>
Severity: | critical | |
Priority: | medium | CC: | jasonbstubbs, max, nowaker
Version: | unspecified | |
Hardware: | x86-64 (AMD64) | |
OS: | Linux (All) | |

Description
Damian Nowak
2013-03-29 19:01:13 UTC
To be clear, the message "failed to set drm interface version" appears in CASE 1 as well. That's why these two cases are considered one bug.

I've got the same problem.

X.Org X Server 1.13.1
Release Date: 2012-12-13
[ 6286.574] X Protocol Version 11, Revision 0
[ 6286.574] Build Operating System: Linux 3.6.11-gentoo x86_64 Gentoo
[ 6286.575] Current Operating System: Linux vaio 3.7.10-gentoo #1 SMP Sun Mar 17 20:02:26 MSK 2013 x86_64
[ 6286.575] Kernel command line: root=/dev/sda2
[ 6286.576] Build Date: 16 February 2013 12:46:40AM
[ 6286.576]
[ 6286.577] Current version of pixman: 0.28.0
.....
[ 6286.581] (==) Using system config directory "/usr/share/X11/xorg.conf.d"
[ 6286.581] (==) ServerLayout "MyLayout"
[ 6286.581] (**) |-->Screen "LCDScreen" (0)
[ 6286.581] (**) | |-->Monitor "MyVaioLCD"
[ 6286.581] (**) | |-->Device "NvidiaLCD"
[ 6286.581] (**) |-->Screen "FlatronScreen" (1)
[ 6286.581] (**) | |-->Monitor "MyFlatron"
[ 6286.581] (**) | |-->Device "NvidiaCRT"
[ 6286.581] (**) Option "Xinerama" "1"
.....
[ 6286.582] (II) Module ABI versions:
[ 6286.582] X.Org ANSI C Emulation: 0.4
[ 6286.582] X.Org Video Driver: 13.1
[ 6286.582] X.Org XInput driver : 18.0
[ 6286.582] X.Org Server Extension : 7.0
[ 6286.583] (II) config/udev: Adding drm device (/dev/dri/card0)
[ 6286.585] (--) PCI:*(0:1:0:0) 10de:0a75:104d:9067 rev 162, Mem @ 0xe2000000/16777216, 0xd0000000/268435456, 0xe0000000/33554432, I/O @ 0x0000d000/128, BIOS @ 0x????????/524288
[ 6286.585] (II) Open ACPI successful (/var/run/acpid.socket)
.....
[ 6286.597] (II) Module glx: vendor="X.Org Foundation"
[ 6286.597] compiled for 1.13.1, module version = 1.0.0
[ 6286.597] ABI class: X.Org Server Extension, version 7.0
[ 6286.597] (==) AIGLX enabled
[ 6286.598] Loading extension GLX
[ 6286.598] (II) LoadModule: "nouveau"
[ 6286.598] (II) Loading /usr/lib64/xorg/modules/drivers/nouveau_drv.so
[ 6286.598] (II) Module nouveau: vendor="X.Org Foundation"
[ 6286.598] compiled for 1.13.1, module version = 1.0.6
[ 6286.598] Module class: X.Org Video Driver
[ 6286.598] ABI class: X.Org Video Driver, version 13.1
[ 6286.598] (II) NOUVEAU driver
.....
[ 6286.598] (--) using VT number 7
[ 6286.600] (II) [drm] nouveau interface version: 1.1.0
[ 6286.600] (II) [drm] nouveau interface version: 1.1.0
[ 6286.600] (II) Loading sub module "dri"
[ 6286.600] (II) LoadModule: "dri"
[ 6286.600] (II) Module "dri" already built-in
[ 6286.600] (II) NOUVEAU(0): Loaded DRI module
[ 6286.600] (--) NOUVEAU(0): Chipset: "NVIDIA NVa8"
[ 6286.601] (**) NOUVEAU(0): Depth 24, (--) framebuffer bpp 32
[ 6286.601] (==) NOUVEAU(0): RGB weight 888
[ 6286.601] (==) NOUVEAU(0): Default visual is TrueColor
[ 6286.601] (**) NOUVEAU(0): Option "ZaphodHeads" "LVDS-1,VGA-1"
[ 6286.601] (==) NOUVEAU(0): Using HW cursor
[ 6286.601] (==) NOUVEAU(0): GLX sync to VBlank disabled.
[ 6286.601] (==) NOUVEAU(0): Page flipping enabled
[ 6286.601] (==) NOUVEAU(0): Swap limit set to 2 [Max allowed 2]
[ 6286.646] (II) NOUVEAU(0): Output LVDS-1 using monitor section MyVaioLCD
[ 6286.646] (**) NOUVEAU(0): Option "Enable" "true"
[ 6286.685] (II) NOUVEAU(0): Output VGA-1 using monitor section MyFlatron
[ 6286.685] (**) NOUVEAU(0): Option "Enable" "true"
[ 6286.689] (II) NOUVEAU(0): EDID for output LVDS-1
[ 6286.689] (II) NOUVEAU(0): Manufacturer: SNY Model: 6fa Serial#: 0
.....
[ 6286.729] (II) NOUVEAU(0): Output LVDS-1 enabled by config file
[ 6286.729] (II) NOUVEAU(0): Output VGA-1 enabled by config file
[ 6286.729] (II) NOUVEAU(0): Using exact sizes for initial modes
[ 6286.729] (II) NOUVEAU(0): Output LVDS-1 using initial mode 1600x900
[ 6286.729] (II) NOUVEAU(0): Output VGA-1 using initial mode 1600x900
.....
[ 6286.730] (II) Module shadowfb: vendor="X.Org Foundation"
[ 6286.730] compiled for 1.13.1, module version = 1.0.0
[ 6286.730] ABI class: X.Org ANSI C Emulation, version 0.4
[ 6286.730] (II) Loading sub module "dri"
[ 6286.730] (II) LoadModule: "dri"
[ 6286.730] (II) Module "dri" already built-in
[ 6286.730] (II) NOUVEAU(1): Loaded DRI module
[ 6286.730] (EE) NOUVEAU(1): [drm] failed to set drm interface version.
[ 6286.730] (EE) NOUVEAU(1): [drm] error opening the drm
[ 6286.730] (EE) NOUVEAU(1): 819:
[ 6286.730] (II) UnloadModule: "nouveau"
[ 6286.730] (--) Depth 24 pixmap format is 32 bpp
[ 6286.733] (II) NOUVEAU(0): Opened GPU channel 0
[ 6286.747] (II) NOUVEAU(0): [DRI2] Setup complete
[ 6286.747] (II) NOUVEAU(0): [DRI2] DRI driver: nouveau
[ 6286.747] (II) NOUVEAU(0): [DRI2] VDPAU driver: nouveau
[ 6286.750] (II) EXA(0): Driver allocated offscreen pixmaps

OK, so it looks like this time it's DRM to blame, not xorg 1.14. I've upgraded x11-drivers/xf86-video-nouveau from 1.0.6 to 1.0.7, the result is still the same. How can I find (or refine) the reason?

About the kernel:
/usr/src/linux-3.7.10-gentoo/drivers/gpu/drm/nouveau/nouveau_drm.h:#define DRIVER_DATE "20120801"

Unfortunately, I cannot use the stand-alone drm module with my 3.7.10 kernel:
/var/tmp/portage/x11-base/nouveau-drm-20121015/work/master/drivers/gpu/drm/drm_gem.c: In function ‘drm_gem_mmap’:
/var/tmp/portage/x11-base/nouveau-drm-20121015/work/master/drivers/gpu/drm/drm_gem.c:709:19: error: ‘VM_RESERVED’ undeclared (first use in this function)
/var/tmp/portage/x11-base/nouveau-drm-20121015/work/master/drivers/gpu/drm/drm_gem.c:709:19: note: each undeclared identifier is reported only once for each function it appears in

The new 3.8.7 kernel contains the same version of drm:
/usr/src/linux-3.7.10-gentoo/drivers/gpu/drm/nouveau/nouveau_drm.h:#define DRIVER_DATE "20120801"
/usr/src/linux-3.8.7-gentoo/drivers/gpu/drm/nouveau/nouveau_drm.h:#define DRIVER_DATE "20120801"

A _very_ wild guess - this is an existing drm race bug, exposed by recent X work.
Keep xf86-video-nouveau (ddx) at 1.0.7, and bisect X.
Note you may need to rebuild the ddx as well.
Thanks

Any instructions on how to "bisect X"? I have no idea what you meant. :)

> Any instructions on how to "bisect X"?
http://git-scm.com/docs/git-bisect
http://git-scm.com/book/en/Git-Tools-Debugging-with-Git#Binary-Search
A bit farther afield (it helped me with my first bisect, though it may be redundant with the project-agnostic docs above):
http://wiki.winehq.org/RegressionTesting

I can reproduce the issue, but there is no "good" combination on my system, i.e. any combination of
xf86-video-nouveau -> 1.0.6-1 1.0.7-1
xorg-server -> 1.13.2.901-1 1.14.0-2
produces "Failed to set drm interface version".

In terms of "how to bisect", the links provided are quite nice, although I would suggest confirming exactly which package caused the issue before jumping into blind bisection:
1. Revert xorg-server to the previous version (1.13.2.901-1)
2. Keep xf86-video-nouveau 1.0.7-1, but rebuild it on top of the old/good xserver, and vice versa
Note: to avoid rebuilding the input drivers, have SSH handy and observe Xorg.log for the offending lines.
Note: if unsure about the specific build options, patches and so on, take a look at the distro packaging system.

Dave pushed a fix for the issue:

commit d3b52efe959f255784f5ead16d7276ca0fb4cdb1
Author: Dave Airlie <airlied@redhat.com>
Date: Mon May 13 13:35:12 2013 +1000

nouveau: attempt to fix zaphod since dri1 code removal

j_v on #nouveau bisected b1a630b48210d6a3c44994fce1b73273000ace5c has breaking zaphod, on review it was trying to open the drm fd a second time which was unnecessary. Avoid the problem by storing the nv fd in an entity and have share it between the two scrn info recs.

I will validate it today.

Dear guys, I had the same issue on an NV80 GPU card. I've just tried your patch and it's a little bit better. I can now see this kind of line:
[ 4957.508] (II) NOUVEAU(1): Output DVI-I-2 connected
Which is great and better than before, but now I get a segfault ...

Created attachment 79294 [details]
My Xorg.log
Here is my Xorg.log for the segfault ...
A quick look at your log indicates another issue:
(EE) AIGLX error: Calling driver entry point failed
Please open another bug with more information [1]
Cheers
Emil
[1] http://nouveau.freedesktop.org/wiki/Bugs/

The fix didn't work for me. I get the very same exception as in my first comment (titled CASE 1). Using bf72ae1f6574c540f0afc2d7845d41df43507a8f.
2013-05-19 log: http://upload.nowaker.net/nwkr/1368964535_Xorg.0.log
2013-03-29 log: http://upload.nowaker.net/nwkr/1364581970_Xorg.0.log-HEAD

Indeed MasterPremium's error is a different issue. I have never had such a message.

Two separate issues in a single bug report :P
AFAICS the patch did resolve the second issue, although the first one seems quite specific to your system/setup.
There are two fronts you can take to tackle this:
1. Carry on with a bisection - first with xf86-video-nouveau and after that with any other package that was updated when the breakage occurred
2. Bisect your xorg.conf, see which line(s) are causing the issue
I would recommend you try them in the order in which they are listed.

I am too weak to do this bisect thing. Maybe when I invite my hacker-friend for a beer I will manage to do that. ;-) But I took a look at `git log` and checked out 27a1a0616304e9b9f0ae842899b7d614f1026578 (which is actually your fix for my #56347), compiled and... it works! Will this info help you to figure out what actually happened?
Arch Linux 20130519
xorg-server 1.14.1-1
libdrm 2.4.45-1
A complete list of all X/drm/dri-related packages: http://wklej.org/id/1042829/

An alternative to bisecting: you know where the crash is, it's at
[ 360.877] (EE) 3: /usr/local/lib/xorg/modules/drivers/nouveau_drv.so (0x7f24ae751000+0x255c0) [0x7f24ae7765c0]
[ 360.877] (EE) 4: /usr/bin/X (xf86CrtcSetModeTransform+0x12a) [0x4aad4a]
So load up nouveau_drv.so in gdb (gdb /usr/local/lib/xorg/modules/drivers/nouveau_drv.so) and, inside gdb, run
disassemble 0x255c0
That should tell you what function it's dying in, and the exact instructions it's dying on. This is what I did for bug 63263 (see my initial comment there), and that was able to pinpoint the crash exactly. (Of course figuring out the circumstances that lead to that condition can be trickier to work out.)
You can also compile the whole thing with -g, e.g. ./configure CFLAGS=-g or something like that. And make sure that the final thing isn't stripped. That might make gdb more cooperative if you're having trouble.

@Ilia, thanks for the suggestions. Leaving debug symbols in sounds good. Will check it soon.

Hi, I was getting the "[drm] error opening the drm" error, which is what led me to this bug, and d3b52efe solves it for me too. I think the crash is a different error, as the "error opening the drm" error existed even on 1.0.4 whereas that version didn't have the crash.
Having said that, I'm getting the crash too and have got some of the information you've asked for, so seeing as it's already been discussed on this bug, I'll go ahead and post it here.
git bisect led me to this commit:

commit 1fdd7db94b55c65ea62cc9eaefff620b20e9e4ea
Author: Dave Airlie <airlied@redhat.com>
Date: Mon Jan 7 15:28:53 2013 +1000

nouveau: add reverse prime support

This allows the nvidia card to scanout Intel cards rendering.

Signed-off-by: Dave Airlie <airlied@redhat.com>

It didn't revert cleanly against master and how to fix the conflict wasn't clear to me, so I haven't been able to test that.
The disassemble points to the function drmmode_set_mode_major. I'll attach that output, along with my Xorg.0.log, after this.

Created attachment 80077 [details]
My Xorg.0.log
Created attachment 80078 [details]
disassemble of segfault location
Backtrace:
disassemble 0x263f7 taken from frame 3 below
0: /usr/bin/X (xorg_backtrace+0x36) [0x53ccca]
1: /usr/bin/X (0x400000+0x13fe6b) [0x53fe6b]
2: /lib64/libpthread.so.0 (0x7f89dcbe8000+0x11070) [0x7f89dcbf9070]
3: /usr/lib64/xorg/modules/drivers/nouveau_drv.so (0x7f89da6e0000+0x263f7) [0x7f89da7063f7]
4: /usr/bin/X (xf86CrtcSetModeTransform+0x14f) [0x493916]
5: /usr/bin/X (xf86SetDesiredModes+0x251) [0x493fb4]
6: /usr/lib64/xorg/modules/drivers/nouveau_drv.so (0x7f89da6e0000+0xee51) [0x7f89da6eee51]
7: /usr/lib64/xorg/modules/drivers/nouveau_drv.so (0x7f89da6e0000+0xf78e) [0x7f89da6ef78e]
8: /usr/bin/X (0x400000+0x91c1e) [0x491c1e]
9: /usr/bin/X (0x400000+0x29362) [0x429362]
10: /lib64/libc.so.6 (__libc_start_main+0xed) [0x7f89db89464d]
11: /usr/bin/X (0x400000+0x29779) [0x429779]
(In reply to comment #20)
> commit 1fdd7db94b55c65ea62cc9eaefff620b20e9e4ea
> Author: Dave Airlie <airlied@redhat.com>
> Date: Mon Jan 7 15:28:53 2013 +1000
>
> It didn't revert cleanly against master and how to fix the conflict wasn't
> clear to me so I haven't been able to test that.

I spoke too soon. Reverting was as easy as deleting all the additions, whereas I was confused in thinking that the additions needed to be kept.
So, to cut a long story short, HEAD with the above patch reverted works for me.

Could you post the disassembly of the function? Would like to see exactly where in that method it's dying. (I wonder if it's crtc->randr_crtc... something with a 0x2d0 struct offset.)
Also, the latest version of that code has added a #ifdef NOUVEAU_PIXMAP_SHARING around the added code in drmmode_set_mode_major... (which I assume you've tried running as well). What happens if you just remove that whole ifdef? (But leave the rest of the commit in.)

Created attachment 80079 [details]
Disassembly of drmmode_set_mode_major
This is taken running rev 1fdd7db94b55c65ea62cc9eaefff620b20e9e4ea
The segfault happens in:
349 if (crtc->randr_crtc->scanout_pixmap)
0x00000000000263e2 <+498>: mov 0x1b8(%rbp),%rax
0x00000000000263e9 <+505>: xor %ecx,%ecx
0x00000000000263eb <+507>: xor %r8d,%r8d
0x00000000000263f7 <+519>: cmpq $0x0,0x2d0(%rax)
0x00000000000263ff <+527>: je 0x264f0 <drmmode_set_mode_major+768>
350 x = y = 0;
HEAD with the patch reverted and #ifdef'd chunks removed succeeds, but patch reverted and #ifdef'd chunks kept (and a func declaration added) failed. Testing a pristine HEAD with the #ifdef'd chunks removed also succeeds.
Just so you don't think I've fallen off the face of the earth, I'm about to go home for the weekend and I can only reproduce the issue on my work PC (the only place I'm lucky enough to have three monitors!) so I won't be able to test again until Monday.

So the faulting address is
0x00000000000263f7 <+519>: cmpq $0x0,0x2d0(%rax)
Which means that crtc->randr_crtc is NULL (and ->scanout_pixmap is 0x2d0 bytes into the structure). Hopefully this should provide enough info to someone more knowledgeable than myself to figure out what's going on.

Guys, please try xf86-video-nouveau 1.0.9.
It contains the following patch, which should handle the crtc->randr_crtc == NULL case as spotted by Ilia:

commit be44e7804862b4c276ed4d4717b1212920f428e6
Author: Dave Airlie <airlied@gmail.com>
Date: Tue Jul 30 15:26:46 2013 +1000

nouveau: fix crash when xinerama is enabled.

Signed-off-by: Dave Airlie <airlied@redhat.com>

Closing for now