Bug 28193 - [regression] glxgears_fbconfig crashed X caused by xserver
Summary: [regression] glxgears_fbconfig crashed X caused by xserver
Status: VERIFIED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Server/General
Version: unspecified
Hardware: x86 (IA32) Linux (All)
Importance: high major
Assignee: Kristian Høgsberg
QA Contact: Xorg Project Team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-05-20 23:11 UTC by Yi Sun
Modified: 2010-05-30 20:27 UTC
0 users

See Also:
i915 platform:
i915 features:


Description Yi Sun 2010-05-20 23:11:30 UTC
System Environment:
----------------------
Platform:               G45
Arch:           x86_64
Libdrm:		(master)a3305b076c005e0d3bd55da0214e91413cf65b48
Mesa:		(master)7234dc19afeac1e5cf39ebb10d07362dc6572d33
Xserver:		(master)345eb171264325d73ea2c50ba8c692cf589c2a9b
Xf86_video_intel:		(master)2c69709d8afa6e9c0990efc463df0061536585e1
Cairo:		(master)7ef1bd22ded512f4fad3959796d7f40c4ddc5824
Kernel:		(for-linus)722154e4cacf015161efe60009ae9be23d492296

Bug Description:
---------------------
glxgears_fbconfig crashes X as soon as it runs. We found that the regression is caused by xserver.
The information printed when X crashed is as follows:

XIO:  fatal IO error 11 (Resource temporarily unavailable) on X server ":0.0"
      after 201 requests (200 known processed) with 0 events remaining.
Comment 1 Chris Wilson 2010-05-21 01:25:08 UTC
The Xorg.log from the crashing X server will be useful, and attaching gdb and performing a bt, even more so.
Comment 2 Yi Sun 2010-05-27 02:03:41 UTC
By bisecting, we found that commit 345eb171264325d73ea2c50ba8c692cf589c2a9b is bad and commit 315041762313598aad90df84226e2d2def4a0fc9 is good.
There are still 13 commits between the two, but X will not start with any of those 13 commits checked out, so we cannot continue bisecting.
Comment 3 Yi Sun 2010-05-28 01:45:10 UTC
(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x00007f5a59b8a454 in intelDRI2Invalidate (drawable=0x435bdb0)
    at intel_screen.c:122
warning: Source file is more recent than executable.
122         dri2InvalidateDrawable,

(gdb) c
Continuing.

Program received signal SIGABRT, Aborted.
0x0000003b3ae32f05 in raise (sig=<value optimized out>)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
64        return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);


back trace:
#0  0x0000003b3ae32f05 in raise (sig=<value optimized out>)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x0000003b3ae34a73 in abort () at abort.c:88
#2  0x000000000046388e in OsAbort () at utils.c:1263
#3  0x000000000046cfbd in ddxGiveUp () at xf86Init.c:1225
#4  0x000000000045b0dd in AbortServer () at log.c:417
#5  0x000000000045b8c0 in FatalError (
    f=0x568a18 "Caught signal %d (%s). Server aborting\n") at log.c:545
#6  0x000000000045c584 in OsSigHandler (signo=11, sip=0x23ff00002209,
    unused=<value optimized out>) at osinit.c:156
#7  <signal handler called>
#8  0x00007f5a59b8a454 in intelDRI2Invalidate (drawable=0x435bdb0)
    at intel_screen.c:122
#9  0x00007f5a5acb2bf2 in DRI2InvalidateDrawable (pDraw=0x435b4a0)
    at dri2.c:510
#10 0x00007f5a5acb3c69 in DRI2SwapBuffers (client=0x436be10, pDraw=0x435b4a0,
    target_msc=0, divisor=0, remainder=0, swap_target=0x7ffffd7172a8,
    func=0x7f5a5acb4440 <DRI2SwapEvent>, data=0x435b4a0) at dri2.c:823
#11 0x00007f5a5acb47c4 in ProcDRI2SwapBuffers () at dri2ext.c:408
#12 ProcDRI2Dispatch (client=0x436be10) at dri2ext.c:584
#13 0x0000000000429249 in Dispatch () at dispatch.c:432
#14 0x0000000000421685 in main (argc=2, argv=0x7ffffd717468,
    envp=<value optimized out>) at main.c:283
Comment 4 Chris Wilson 2010-05-28 02:23:34 UTC
The function that caused the crash was removed from mesa with:

commit e67c338b415c983bee570e6644b9684d8d1fc99b
Author: Kristian Høgsberg <krh@bitplanet.net>
Date:   Tue May 18 21:50:44 2010 -0400

    intel: Throttle after doing copyregion/swapbuffers round trip
    
    Before we would throttle in the flush callback prior to round-tripping
    to the server to do copyregion or swapbuffer.  Now, instead just note
    that we need to throttle and do it in intel_prepare_render(), which
    will be called after receiving the response from the server but before
    we start rendering the next frame.  Even if the server also throttles
    us in swapbuffer, this just makes the throttling a no-op when we hit
    intel_prepare_render().  With that we can drop the
    using_dri2_swapbuffers hack and just always throttle.

I presume the generic function that is called instead is safe...
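
For illustration only, the deferred throttling described in the commit message could be sketched roughly as below. This is not the Mesa code: the struct, the function names, and the counter are hypothetical stand-ins, and the real driver waits on the GPU where the sketch merely increments a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver context; not the real Mesa struct. */
struct sketch_context {
    bool need_throttle;   /* set when a swap/copy round trip has been issued */
    int  throttle_count;  /* how many times the sketch actually throttled */
};

/* Flush callback: do not throttle here, only note that throttling is due. */
void sketch_flush(struct sketch_context *ctx)
{
    assert(ctx != NULL);
    ctx->need_throttle = true;
}

/* Runs before rendering the next frame, i.e. after the server has replied. */
void sketch_prepare_render(struct sketch_context *ctx)
{
    assert(ctx != NULL);
    if (ctx->need_throttle) {
        ctx->throttle_count++;       /* placeholder for the real GPU wait */
        ctx->need_throttle = false;  /* at most one throttle per pending request */
    }
}
```

In this sketch, several flushes before the next frame still lead to a single throttle, and calling sketch_prepare_render() with nothing pending does nothing.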
Comment 5 Kristian Høgsberg 2010-05-28 05:02:57 UTC
(In reply to comment #4)
> The function that caused the crash was removed from mesa with:
> 
> commit e67c338b415c983bee570e6644b9684d8d1fc99b
> Author: Kristian Høgsberg <krh@bitplanet.net>
> Date:   Tue May 18 21:50:44 2010 -0400
> 
>     intel: Throttle after doing copyregion/swapbuffers round trip
> 
>     Before we would throttle in the flush callback prior to round-tripping
>     to the server to do copyregion or swapbuffer.  Now, instead just note
>     that we need to throttle and do it in intel_prepare_render(), which
>     will be called after receiving the response from the server but before
>     we start rendering the next frame.  Even if the server also throttles
>     us in swapbuffer, this just makes the throttling a no-op when we hit
>     intel_prepare_render().  With that we can drop the
>     using_dri2_swapbuffers hack and just always throttle.
> 
> I presume the generic function that is called instead is safe...

It is. The problem before was that when libGL invalidates a drawable, that drawable isn't necessarily bound to a context. That means when the Intel DRI driver goes to set a flag in the context to disable throttling, it sometimes follows a NULL pointer.

The commit above changes throttling to have much less overhead so we can always do it, which removes the need for setting that flag and thus the crash.
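
As a rough illustration of the failure mode (not the actual driver source), the crash pattern looks something like the sketch below. All type, field, and function names here are hypothetical. The guarded variant is shown only for contrast: the actual fix, as described above, was to remove the flag altogether rather than to add a NULL check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's context and drawable. */
struct inval_ctx {
    bool skip_throttle;          /* the flag the old code tried to set */
};

struct inval_drawable {
    struct inval_ctx *bound_ctx; /* NULL when no context is bound */
};

/* Old pattern: assumes a context is always bound. With bound_ctx == NULL
 * this dereferences a NULL pointer, which matches the SIGSEGV in the trace. */
void inval_unsafe(struct inval_drawable *d)
{
    assert(d != NULL);
    d->bound_ctx->skip_throttle = true;
}

/* Contrast only: a guarded variant that reports whether the flag was set. */
bool inval_guarded(struct inval_drawable *d)
{
    assert(d != NULL);
    if (d->bound_ctx == NULL)
        return false;            /* unbound drawable: nothing to update */
    d->bound_ctx->skip_throttle = true;
    return true;
}
```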
Comment 6 Yi Sun 2010-05-30 20:27:24 UTC
The issue has been fixed with the latest code, so I am changing the status to verified.

