Bug 27285 - modeset under Composite WM may crash X
Status: VERIFIED FIXED
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: unspecified
Hardware: All Linux (All)
Importance: high critical
Assigned To: Chris Wilson
QA Contact: Xorg Project Team
Depends on:
Blocks:
Reported: 2010-03-24 02:20 UTC by fangxun
Modified: 2010-05-17 03:40 UTC (History)

Attachments
Xorg log (168.89 KB, text/plain)
2010-03-24 02:20 UTC, fangxun
dmesg file (37.17 KB, text/plain)
2010-03-24 02:21 UTC, fangxun
Handle reference counting across page flipping. (6.03 KB, patch)
2010-05-11 08:06 UTC, Chris Wilson

Description fangxun 2010-03-24 02:20:28 UTC
Created attachment 34399
Xorg log

System Environment:
--------------------------
Platform:        G45
Mesa:           (7.8)2ded27b2f0a7b01f5db77ea9a2a25df5baa876b3
Xserver:        (master)3083c5d0c4386cdd7083b7a83ac72fdad2f1e61e
Xf86_video_intel: (master)9c037f61a490c96f9095f7ff3fecbf41f5efe9f7
Libdrm:         (master)c1c8bbf80b1f734e23996bf805dc78f32ebaf56f
Kernel:  (master)60b341b778cc2929df16c0a504c91621b3c6a4ad

Bug detailed description:
-------------------------
On a GNOME desktop with compiz enabled, X crashes after the mode has been set about 6 times, with the following error messages:

intel_bufmgr_gem.c:1247: Error setting memory domains 966 (00000040 00000000): Input/output error .
Tue May 25 00:21:44 CST 2010
intel_bufmgr_gem.c:1070: Error setting domain 910: Input/output error
(EE) intel(0): Failed to submit batch buffer, expect rendering corruption or even a frozen display: Input/output error.
intel_bufmgr_gem.c:1070: Error setting domain 910: Input/output error
intel_bufmgr_gem.c:1070: Error setting domain 706: Input/output error
intel_bufmgr_gem.c:1070: Error setting domain 795: Input/output error
intel_bufmgr_gem.c:1070: Error setting domain 792: Input/output error
intel_bufmgr_gem.c:986: Error setting to CPU domain 957: Input/output error
Fatal server error:
Failed to map batchbuffer: Input/output error

With compiz disabled, it works fine.

Reproduce steps:
----------------
1. Start X and gnome-session
2. xrandr --output VGA1 --mode 1024x768
3. xrandr --output VGA1 --mode 800x600
4. xrandr --output VGA1 --mode 1280x1024
5. Repeat steps 2-4
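
For convenience, the mode-cycling steps can be scripted. The following is only a sketch, not part of the original report: the helper names, the delay between mode sets, and the assumption that xrandr exits non-zero once the X server has gone away are illustrative choices. The output name and modes are the ones listed in the steps.

```python
import subprocess
import time

OUTPUT = "VGA1"                                # output name from the report
MODES = ["1024x768", "800x600", "1280x1024"]   # modes used in the report


def xrandr_commands(output, modes, cycles):
    """Return the xrandr argument lists for `cycles` passes over `modes`."""
    return [
        ["xrandr", "--output", output, "--mode", mode]
        for _ in range(cycles)
        for mode in modes
    ]


def run(commands, delay=2.0, dry_run=False):
    """Run each command in order.

    Returns the 1-based index of the first command that failed,
    or None if every command succeeded (or dry_run is set).
    """
    for index, cmd in enumerate(commands, 1):
        print(f"[{index}/{len(commands)}] {' '.join(cmd)}")
        if dry_run:
            continue
        # Assumption: a failing xrandr call means the server is gone.
        if subprocess.run(cmd).returncode != 0:
            return index
        time.sleep(delay)  # arbitrary pause to let the mode set settle
    return None
```

Calling `run(xrandr_commands(OUTPUT, MODES, cycles=3))` from a terminal inside the session performs nine mode sets, which is more than the roughly six the report says were needed to trigger the crash.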
Comment 1 fangxun 2010-03-24 02:21:33 UTC
Created attachment 34400
dmesg file
Comment 2 Chris Wilson 2010-03-30 11:38:38 UTC
As this is a hang, and not something raised by an error interrupt, there is a good chance that it is caused by an incorrect batchbuffer (though it could equally be a serialisation problem with the changing of the framebuffer). So I think your kernel should fill /sys/kernel/debug/dri/0/i915_error_state with the failing batchbuffer. Could you please upload that?
Comment 3 fangxun 2010-03-30 20:37:08 UTC
Maybe I didn't make the X crash status clear. This is not a hang. After setting modes, X crashes but the GPU doesn't hang. We can restart X after the crash.

Tested with kernel 2.6.34-rc1: setting modes in gnome-session with compiz still causes X to crash. Reading "/sys/kernel/debug/dri/0/i915_error_state" returns "no error state collected".
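
The check described above can be sketched as a small script. This is an illustrative helper, not something from the thread: the function names are made up, and the only things taken from the report are the debugfs path and the "no error state collected" message. Reading debugfs normally requires root and a mounted debugfs, so the reader returns None when the file is unavailable.

```python
from pathlib import Path

# Path and "empty" message as quoted in the report.
ERROR_STATE = Path("/sys/kernel/debug/dri/0/i915_error_state")
NO_ERROR = "no error state collected"


def has_error_state(text):
    """True if the contents look like a captured error state."""
    stripped = text.strip()
    return bool(stripped) and not stripped.lower().startswith(NO_ERROR)


def read_error_state(path=ERROR_STATE):
    """Return the file contents, or None if the file cannot be read."""
    try:
        return path.read_text(errors="replace")
    except OSError:
        # Missing file, debugfs not mounted, or insufficient permissions.
        return None
```

If `read_error_state()` returns text for which `has_error_state` is true, that text is what would be attached to the bug.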

 
Comment 4 Gordon Jin 2010-04-11 20:07:41 UTC
Promoting priority back to high, after the Q1 release cycle.
Comment 5 Gordon Jin 2010-05-09 23:39:31 UTC
Carl/Chris, can you reproduce this?

We can also reproduce it in MeeGo (with mutter) (on Pineview), so it's not compiz specific.
Comment 6 Chris Wilson 2010-05-11 08:06:06 UTC
Created attachment 35568
Handle reference counting across page flipping.

The likelihood is that this is the same bug as 27922, in which case the attached patch should work.
Comment 7 Chris Wilson 2010-05-12 13:51:36 UTC
I've pushed:

commit 9f54107f866a25cf670f81f7c52b8c108728c6a5
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue May 11 14:55:16 2010 +0100

    dri2: Handle reference counting across page flipping
    
    1. Instead of swapping bos, swap the entire private structure.
    
    2. If we update the pixmap bo for the Screen, make sure we update the
    reference inside intel->front_buffer so that xrandr still functions.
    
    Fixes:
    
      Bug 27922 - i965: Rapidly resizing OpenGL window causes GPU to hang.
      https://bugs.freedesktop.org/show_bug.cgi?id=27922

which I think should fix this bug as well.
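
(Illustrative note, not part of the commit.) The two points in the commit message can be pictured with a toy reference-counting model. This is not the driver's code: every class and method below is hypothetical, and only the ideas of swapping the whole private structure and of keeping the front_buffer reference in step with the screen pixmap come from the commit message.

```python
class Bo:
    """Toy stand-in for a reference-counted buffer object."""

    def __init__(self, name):
        self.name = name
        self.refs = 1  # the creator's reference

    def ref(self):
        self.refs += 1
        return self

    def unref(self):
        assert self.refs > 0, f"{self.name}: unref of a freed buffer"
        self.refs -= 1


class Private:
    """Toy per-pixmap private structure; owns one reference on its bo."""

    def __init__(self, bo):
        self.bo = bo


class Pixmap:
    def __init__(self, private):
        self.private = private


class Driver:
    def __init__(self, screen_pixmap):
        self.screen_pixmap = screen_pixmap
        # front_buffer holds its own reference on the scanout bo.
        self.front_buffer = screen_pixmap.private.bo.ref()

    def exchange_buffers(self, front, back):
        # Point 1: swap the whole private structures, so each reference
        # travels with its owner and no counts change here.
        front.private, back.private = back.private, front.private
        # Point 2: if the screen pixmap now has a different bo,
        # move the front_buffer reference over to it.
        if front is self.screen_pixmap:
            new_bo = front.private.bo
            if new_bo is not self.front_buffer:
                new_bo.ref()
                self.front_buffer.unref()
                self.front_buffer = new_bo
```

In this model, skipping the second step would leave front_buffer pointing at the buffer that has just become the back buffer, which is the kind of stale reference the commit message says would stop xrandr from functioning.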
Comment 8 fangxun 2010-05-13 03:31:08 UTC
X still crashes with this patch. It also introduces a regression: while dragging a window, there is severe flickering in the redrawn area.
Comment 9 Chris Wilson 2010-05-15 11:32:20 UTC
This should be the missing piece:

commit 030d56279bf14d9ddd42d8fdbeaa66ef3f557b4d
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri May 14 16:53:40 2010 +0100

    drm: don't overwrite the old intel->front_buffer
    
    It's now handled in the common ExchangeBuffers() path.
Comment 10 fangxun 2010-05-17 03:40:25 UTC
Works fine. Marking bug as verified.