Bug 8537 - dual head/non xinerama crashes server if aiglx is enabled
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: 7.1 (2006.05)
Hardware: x86 (IA32) Linux (All)
Importance: high normal
Assignee: Alan Hourihane
Duplicates: 8554
Reported: 2006-10-06 16:45 UTC by Carl Michal
Modified: 2007-05-15 13:19 UTC (History)

Attachments
Xorg log on dual head startup with crash (103.57 KB, text/plain)
2006-10-10 16:03 UTC, Carl Michal
log from subsequent start-up attempt (19.89 KB, text/plain)
2006-10-10 16:03 UTC, Carl Michal
Set driverprivate to NULL at startup (723 bytes, patch)
2006-10-12 01:36 UTC, Alan Hourihane
Set driverprivate to NULL (721 bytes, patch)
2006-10-12 01:38 UTC, Alan Hourihane
Set driverprivate to NULL (653 bytes, patch)
2006-10-12 01:40 UTC, Alan Hourihane
log with patched dri.c (152.81 KB, text/plain)
2006-10-12 08:52 UTC, Carl Michal
check dri private (1.16 KB, patch)
2006-10-12 14:56 UTC, Alan Hourihane

Description Carl Michal 2006-10-06 16:45:27 UTC
Unless aiglx is explicitly turned off, I find a dual head session that is not in
xinerama crashes the X server every time, a few seconds after both screens are up.

There is no sign of anything wrong - windows can be opened and moved around, or
nothing done at all, then after 5-10s, the server crashes.  Happens every time.

This is an i915 in a Dell Inspiron laptop.
Comment 1 Alan Hourihane 2006-10-08 13:56:29 UTC
Logs would be useful.
Comment 2 Carl Michal 2006-10-10 16:03:10 UTC
Created attachment 7340 [details]
Xorg log on dual head startup with crash

Here's the log from an X server startup where all looks good until a few
seconds after it's up, and then it crashes.
Comment 3 Carl Michal 2006-10-10 16:03:57 UTC
Created attachment 7341 [details]
log from subsequent start-up attempt

This is the log from the next attempt to start the X server.
Comment 4 Alan Hourihane 2006-10-11 00:40:44 UTC
The second log is similar to your other bug report, and it's only caused because
of the first server crash.

Now, as for the first server crash you are going to have to run the Xserver
under gdb, and make sure you've compiled the Xserver yourself with debugging
flags enabled. Then get a real backtrace of where the crash is happening.
Comment 5 Carl Michal 2006-10-11 09:41:00 UTC
I will try to recompile the server with -g and get a real backtrace.

I filed these separately because they seemed distinct: with AIGLX enabled,
it crashes every time (even immediately after booting), but not until a few
seconds after it has started up.

Without AIGLX, it seems to either run ok, or not start up at all.

Comment 6 Carl Michal 2006-10-11 14:46:34 UTC
Is there a trick to starting the X server under gdb?

with: 
gdb /usr/bin/X
gdb> run -ignoreABI


The server says:
(==) Using config file: "/etc/X11/xorg.conf"
[tcsetpgrp failed in terminal_inferior: Operation not permitted]
(WW) module ABI major version (0) doesn't match the server's version (1)
(WW) I810: No matching Device section for instance (BusID PCI:0:2:1) found
PIPECONF (1) BEFORE 0x80000000
DSPCNTR (1) BEFORE 0x49000000
PIPECONF (1) AFTER 0x80000000
DSPCNTR (1) AFTER 0xc9000000
I830InitVideo
I830SetupImageVideoOverlay
I830ResetVideo: base: 0xa78f6000, offset: 0xfffa000, obase: 0xb78f0000
Original gamma: 0x80808 0x101010 0x202020 0x404040 0x808080 0xc0c0c0
Bounded  gamma: 0x80808 0x101010 0x202020 0x404040 0x808080 0xc0c0c0
I830SetupImageVideoOverlay

and then locks up.  The tcsetpgrp line seems to be new.


I was able to attach the debugger after starting up X. When started like this
(directly starting /usr/bin/X), the server starts up fine and stays up.  But
when started with startx (configured in gentoo to start a gnome session), it
crashes after a few seconds.

(Compiled with -O2  -g -march=pentium-m)

This brings to mind another bug I didn't file here because it didn't seem to be 
a server problem - but it may be a piece of the puzzle:

If I enable xinerama, the screen resolutions seem to get reported all wrong
somewhere, i.e., I can't move windows to some parts of the combined screen (lower
and farthest right).  But that also only happens if the server is started from
startx to start a gnome session.  If it's started from gdm or with /usr/bin/X it
seems to behave properly.

A difference though is that if AIGLX is enabled with two heads, non xinerama -
it crashes from startx or from gdm, but not if started with /usr/bin/X.

Back to the issue at hand. With AIGLX enabled without xinerama, the server now
spits out:

Backtrace:
0: X(xf86SigHandler+0x84) [0x80b85e8]
1: [0xffffe420]
2: /usr/lib/xorg/modules/extension/libglx.so [0xb7c65beb]
3: /usr/lib/xorg/modules/extensions/libglx.so(__glXleavServer+0x22) [0xb7c41d52]
4: /usr/lib/xorg/modules/extensions/libglx.so [0xb7c4236e]
5: X(Dispatch+0x19b) [0x8086c0b]
6: X(main+0x488) [0x806e608]
7: /lib/libc.so.6(__libc_start_main+0xd8) [0xb7ce7878]
8: X(FontFileCompleteXLFD+0xad) [0x806d931]

and gdb says:

Program received signal SIGSEGV, Segmentation fault.
0xb7efa4bd in DRIDoBlockHandler (screenNum=1, blockData=0x0, pTimeout=0x0,
    pReadmask=0x0) at dri.c:1399
1399    dri.c: No such file or directory.
        in dri.c
(gdb) bt
#0  0xb7efa4bd in DRIDoBlockHandler (screenNum=1, blockData=0x0, pTimeout=0x0,
    pReadmask=0x0) at dri.c:1399
#1  0xb7c0bbeb in __glXDRIleaveServer () at glxdri.c:146
#2  0xb7be7d52 in __glXleaveServer () at glxext.c:447
#3  0xb7be836e in __glXDispatch (client=0x8507210) at glxext.c:520
#4  0x08086c0b in Dispatch () at dispatch.c:459
#5  0x0806e608 in main (argc=7, argv=0xbfe2d004, envp=0x0) at main.c:447



Comment 7 Alan Hourihane 2006-10-11 15:26:59 UTC
OK, so in your sources, what does line 1399 of
xserver/hw/xfree86/dri/dri.c say?
Comment 8 Carl Michal 2006-10-11 18:07:19 UTC
line 1399 is the third line of DRIDoBlockHandler:

if (pDRIPriv->pDriverInfo->driverSwapMethod == DRI_HIDE_X_CONTEXT) {

Comment 9 Alan Hourihane 2006-10-12 01:36:14 UTC
Created attachment 7372 [details] [review]
Set driverprivate to NULL at startup
Comment 10 Alan Hourihane 2006-10-12 01:37:02 UTC
Can you try the patch I just posted in the previous comment.

It applies to the dri.c file. Make sure you install the new libdri.so that gets
built.
Comment 11 Alan Hourihane 2006-10-12 01:38:54 UTC
Created attachment 7373 [details] [review]
Set driverprivate to NULL

Oops. Use this one instead as that last patch is bogus.
Comment 12 Alan Hourihane 2006-10-12 01:40:08 UTC
Created attachment 7374 [details] [review]
Set driverprivate to NULL

Ugh. Now I go grab a coffee. Use this one as the test.
Comment 13 Carl Michal 2006-10-12 08:52:02 UTC
Created attachment 7385 [details]
log with patched dri.c

I'm afraid it still crashes... log attached.
Comment 14 Alan Hourihane 2006-10-12 09:08:43 UTC
O.k. when in gdb you'll need to print the values of the following when the crash
occurs.

So do...

print pDRIPriv
print pDRIPriv->pDriverInfo
print pDRIPriv->pDriverInfo->driverSwapMethod

and let me know what the results are. You'll need to have started X with gdb,
rather than attaching to it.
Comment 15 Carl Michal 2006-10-12 10:26:12 UTC
I can't seem to run the X server from gdb directly. There are actually two
issues.  One is that when I start X from within gdb, it never starts: it
locks up.  The second issue is that starting X directly from the command line
doesn't reproduce the crash; it only happens when started by a gnome-session.

Attaching to the process, I find:

Program received signal SIGSEGV, Segmentation fault.
0xb7f0149d in DRIDoBlockHandler (screenNum=1, blockData=0x0, pTimeout=0x0,
    pReadmask=0x0) at dri.c:1404
1404        if (pDRIPriv->pDriverInfo->driverSwapMethod == DRI_HIDE_X_CONTEXT) {
(gdb) print pDRIPriv
$1 = (DRIScreenPrivPtr) 0x0
(gdb) print pDRIPriv->pDriverInfo
$2 = (DRIInfoPtr) 0xf000eec2
(gdb) print pDRIPriv->pDriverInfo->driverSwapMethod
Cannot access memory at address 0xf000ef3e
(gdb)
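
Reading the three results together: pDRIPriv is NULL for screen 1, so the value shown for pDRIPriv->pDriverInfo was read through a null base pointer and is not a usable DRIInfoPtr, and the final print faults at that bogus pointer plus the offset of the member being read. The sketch below, in C, only redoes that address arithmetic from the two values gdb printed. The macro and function names are made up for illustration and do not appear in the X server sources.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the gdb session above. */
#define DRIVER_INFO_VALUE 0xf000eec2u /* printed for pDRIPriv->pDriverInfo */
#define FAULT_ADDRESS     0xf000ef3eu /* address gdb could not access */

/* Distance between the bogus pointer and the faulting address, i.e. the
 * implied offset of driverSwapMethod inside the record it would point to
 * (hypothetical helper, for illustration only). */
static uint32_t implied_member_offset(void)
{
    assert(FAULT_ADDRESS >= DRIVER_INFO_VALUE);
    return (uint32_t)(FAULT_ADDRESS - DRIVER_INFO_VALUE);
}
```

The difference is 0x7c, which suggests that in this build driverSwapMethod sits 0x7c bytes into the record. The crash is therefore the chained dereference starting from a NULL pDRIPriv, not a corrupted driverSwapMethod value.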
Comment 16 Alan Hourihane 2006-10-12 14:56:36 UTC
Created attachment 7393 [details] [review]
check dri private

Try this one.
Comment 17 Carl Michal 2006-10-12 18:10:15 UTC
ok, so now when it crashes:

Program received signal SIGSEGV, Segmentation fault.
0xb7f594b5 in DRIDoBlockHandler (screenNum=1, blockData=0x0, pTimeout=0x0,
    pReadmask=0x0) at dri.c:1417
1417            DRM_SPINUNLOCK(&pDRIPriv->pSAREA->drawable_lock, 1);
(gdb) list
1412                                                  DRI_2D_CONTEXT,
1413                                                  pDRIPriv->partial3DContextStore);
1414        }
1415
1416        if (pDRIPriv->windowsTouched)
1417            DRM_SPINUNLOCK(&pDRIPriv->pSAREA->drawable_lock, 1);
1418        pDRIPriv->windowsTouched = FALSE;
1419
1420        DRIUnlock(pScreen);
1421    }
(gdb)

So I made the obvious change of putting lines 1416-1420 inside:
if (pDRIPriv){

}
 
and tried again and it no longer crashes.

Thanks for all your help.

I left in the earlier patch setting driverprivate to NULL.
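
The guard described in this comment can be sketched in C as follows. This is a simplified model under stated assumptions, not the actual dri.c code or the attached patch: SketchDRIPriv, screenPriv and sketch_block_handler are hypothetical stand-ins, the lock release is replaced by a counter, and the NULL entry for screen 1 simply models what gdb reported (pDRIPriv = 0x0 with screenNum=1).

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SCREENS 2

/* Hypothetical stand-in for the per-screen DRI private record. */
typedef struct {
    int windowsTouched;
    int unlockCount;   /* counts simulated lock releases, for illustration */
} SketchDRIPriv;

/* Per-screen privates; a NULL entry models a screen with no DRI private. */
static SketchDRIPriv *screenPriv[MAX_SCREENS];

/* Returns 1 if the handler did its work, 0 if it skipped the screen. */
static int sketch_block_handler(int screenNum)
{
    assert(screenNum >= 0 && screenNum < MAX_SCREENS);
    SketchDRIPriv *pDRIPriv = screenPriv[screenNum];

    /* The guard: never dereference a missing private record. */
    if (pDRIPriv == NULL)
        return 0;

    if (pDRIPriv->windowsTouched)
        pDRIPriv->unlockCount++;   /* stands in for the spin-unlock */
    pDRIPriv->windowsTouched = 0;
    return 1;
}
```

Returning early at the top of the handler covers every later use of pDRIPriv at once, whereas wrapping only the final lines relies on the earlier dereferences being guarded separately.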
Comment 18 Alan Hourihane 2006-10-15 10:50:11 UTC
*** Bug 8554 has been marked as a duplicate of this bug. ***
Comment 19 Alan Hourihane 2006-10-17 04:08:15 UTC
O.k. This is already fixed in the git repository for the upcoming Xorg 7.2 release.
Comment 20 Timo Aaltonen 2007-05-15 03:18:15 UTC
It's not applied to 1.3 branch, master or elsewhere.
Comment 21 Alan Hourihane 2007-05-15 03:43:51 UTC
Yes it is.

If you've got a problem I suggest you open a new bug report.
Comment 22 Timo Aaltonen 2007-05-15 09:03:27 UTC
OK, can you show me the commit? The attached patch has been in Ubuntu against 1.2 and applied just fine, ditto for 1.3, so if it has been fixed by other means then I'd like to know about it.
Comment 23 Alan Hourihane 2007-05-15 09:17:25 UTC
The patch here isn't needed as it's fixed elsewhere in the GLX code.

If you are experiencing a crash with X.Org 7.2 or later then I suggest you log a new bug with details.
Comment 24 Timo Aaltonen 2007-05-15 13:19:54 UTC
No, that's all I needed to know, thanks!

