Bug 39524

Summary: [Sandybridge] GPU Hang During x11perf
Product: DRI Reporter: Matthew Ross <matthew.s.ross>
Component: DRM/Intel Assignee: Chris Wilson <chris>
Status: CLOSED FIXED QA Contact:
Severity: normal    
Priority: medium CC: ben, chris, jbarnes, mavoga
Version: XOrg git   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
See Also: https://bugzilla.kernel.org/show_bug.cgi?id=27892
Whiteboard:
i915 platform: i915 features:
Attachments:
Description        Flags
i915 Error State   none
Kernel Log         none
Xorg Log           none

Description Matthew Ross 2011-07-25 06:51:09 UTC
Created attachment 49521 [details]
i915 Error State

Seeing consistent GPU hangs when running x11perf tests, specifically the copywinwin and copypixpix tests.

-- chipset: Sandybridge - Xeon E31245, Intel HD Graphics P3000
-- system architecture: i686
-- xf86-video-intel: Reproduced with 2.14 and 2.15
-- xserver: 1.10.1
-- mesa: 
-- libdrm: 2.4.23
-- kernel: Tested on 2.6.38-10 (Ubuntu) and 2.6.39-3 (Kernel.org).
-- Linux distribution: Ubuntu 11.04
-- Machine or mobo model: HP z210 Workstation
-- Display connector: DisplayPort

Have also reproduced using the Xorg edgers stack.
Comment 1 Matthew Ross 2011-07-25 06:52:28 UTC
Created attachment 49522 [details]
Kernel Log
Comment 2 Matthew Ross 2011-07-25 06:53:09 UTC
Created attachment 49523 [details]
Xorg Log
Comment 3 Chris Wilson 2011-07-25 07:47:57 UTC
Known issue, but I haven't seen it triggered from x11perf before. The only workaround so far is DebugFlushCaches. The only explanation I have so far is a hw bug...
Comment 4 Matthew Ross 2011-07-25 10:33:15 UTC
The patch for intel_uxa.c you listed under bug 27892 seems to have resolved this for the time being, making this a very good day for me. Thank you.

Until now I had never successfully run x11perf -all, as it would always hang on the copywinwin10 or copypixpix10 tests. For some reason, running under a GNOME environment rather than in single-user mode allowed the copywinwin tests to pass, only to fail on copypixpix10.
Comment 5 Chris Wilson 2011-10-17 05:22:33 UTC
Hi Matthew, it seems that although we still haven't got a clue as to why it dies, we have a workaround that doesn't penalise performance too much. Can you please retry with the current master of xf86-video-intel?

commit 46f97127c22ea42bc8fdae59d2a133e4b8b6c997
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sun Oct 16 21:40:15 2011 +0100

    snb,ivb: Workaround unknown blitter death
    
    The first workaround was a performance killing MI_FLUSH_DW after every
    op. This workaround appears to be a stable compromise instead, only
    requiring a redundant command after every BLT command with little
    impact on throughput.
    
    Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=27892
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=39524
    Tested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
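
The idea described in the commit message above (append one cheap, redundant command after every BLT command instead of a full flush after every operation) can be sketched roughly as follows. This is only an illustrative model, not the driver's code: the batch structure, the function names and both opcode values are hypothetical placeholders, and the commit message does not say which command is actually emitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for illustration only; NOT real hardware encodings. */
#define CMD_BLT_COPY  0x1u
#define CMD_REDUNDANT 0x2u /* stand-in for the harmless extra command */

#define BATCH_MAX 64

/* Toy model of a command batch: a fixed array of dwords. */
struct batch {
    uint32_t dwords[BATCH_MAX];
    size_t used;
};

/* Append one dword to the batch, checking for overflow. */
static void batch_emit(struct batch *b, uint32_t dword)
{
    assert(b->used < BATCH_MAX);
    b->dwords[b->used++] = dword;
}

/*
 * Emit one blit (opcode, source, destination). When needs_workaround is
 * non-zero, follow it with a single redundant command, which costs one
 * extra dword per blit rather than a flush per operation.
 */
static void emit_blt_copy(struct batch *b, int needs_workaround,
                          uint32_t src, uint32_t dst)
{
    batch_emit(b, CMD_BLT_COPY);
    batch_emit(b, src);
    batch_emit(b, dst);
    if (needs_workaround)
        batch_emit(b, CMD_REDUNDANT);
}
```

In this model, a caller would pass a non-zero needs_workaround only on the affected generations (snb, ivb per the commit subject), so other hardware sees an unchanged command stream.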
Comment 6 Chris Wilson 2011-10-18 06:29:09 UTC
*** Bug 41266 has been marked as a duplicate of this bug. ***
Comment 7 Chris Wilson 2011-10-19 06:27:28 UTC
I think we have a winner!
