Bug 50668

Summary: [PNV uxa regression] Piglit texturing/lodbias causes GPU hang
Product: xorg
Reporter: lu hua <huax.lu>
Component: Driver/intel
Assignee: Chris Wilson <chris>
Status: VERIFIED FIXED
QA Contact: Xorg Project Team <xorg-team>
Severity: major
Priority: high
CC: ben, chris, daniel, jbarnes, xunx.fang, yi.sun
Version: git   
Hardware: All   
OS: Linux (All)   
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description    Flags
dmesg          none
error_state    none

Description lu hua 2012-06-04 01:37:04 UTC
Created attachment 62490 [details]
dmesg

System Environment:
--------------------------
Arch:             i386
Platform:         Pineview
Libdrm:         (master)2.4.34-2-g481234f2909c0506962a2f42da862da6a9b13fd8
Mesa:           (8.0)3d657b14b4cab98a2945904823e78cd8950944f4
Xserver:         (server-1.12-branch)xorg-server-1.12.1
Xf86_video_intel:(master)2.19.0-176-g1f78a934a423911e18d340f0585e31941f6e8663
Libva:          (master)90460bb796d99eaaec71a0a25be936544645446e
Libva_intel_driver:(master)af76df40082e321ebad114bc45eee76e4a455bf9
Kernel:         (drm-intel-fixes) 9e612a008fa7fe493a473454def56aa321479495

Bug detailed description:
-----------------------------
It happens on Pineview with the drm-intel-fixes kernel. It doesn't happen on the drm-intel-next-queued kernel.
The last known good commit: c3b20037926e607b6cdaecbf9d3103e2ca63bc31
The last known bad commit: 9e612a008fa7fe493a473454def56aa321479495

The cases 'general_triangle-rasterization-overdraw' and 'shaders_glsl-clamp-vertex-color' also have this issue.
Reproduce steps:
----------------
1. start X
2. ./bin/lodbias -auto
3. dmesg
Comment 1 Daniel Vetter 2012-06-04 01:46:22 UTC
You say that c3b20037926e60 is the last good and that 9e612a008fa7fe4 is bad, which means this regression has been introduced in 9e612a008fa7fe4.

Can you confirm that this is correct by reverting that commit?

If that's indeed the case, then this is really strange - this kernel commit changes the hotplug behaviour. So presuming nothing other than the lodbias piglit test runs, this code shouldn't even get executed.

Hence can you please double-check whether this is not a regression in mesa?
Comment 2 lu hua 2012-06-05 01:52:11 UTC
Created attachment 62563 [details]
error_state
Comment 3 lu hua 2012-06-05 02:04:48 UTC
This issue happens during nightly testing. It collects the error in /debug/dri/0/i915_error_state.
I attached the i915_error_state. I can't reproduce the hang by testing manually.
Comment 4 Chris Wilson 2012-06-05 03:04:12 UTC
So this was meant to be fixed by:

commit 14667a4bde4361b7ac420d68a2e9e9b9b2df5231
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Apr 3 17:58:35 2012 +0100

    drm/i915: Finish any pending operations on the framebuffer before disabling

as both pipes are off. The other condition where this goes wrong is when the DDX doesn't notice the kernel switching the pipes off before submitting the wait, and that should have been fixed by

commit 3f3bde4f0c72f6f31aae322bcdc20b95eade6631
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu May 24 11:58:46 2012 +0100

    uxa: Only consider an output valid if the kernel reports it attached
Comment 5 Chris Wilson 2012-06-05 03:04:41 UTC
Considering this is a ddx/kernel interaction bug, throw SNA into the mix as well :-p
Comment 6 Chris Wilson 2012-06-05 08:10:56 UTC
Considering the problem again, I realised the effect would be DPMS:

commit c4eb5528a456b65c673f7c984d14a622ac67cdca
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Jun 5 16:04:16 2012 +0100

    uxa: Check for DPMS off before scheduling a WAIT_ON_EVENT
    
    Regression from commit 3f3bde4f0c72f6f31aae322bcdc20b95eade6631
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date:   Thu May 24 11:58:46 2012 +0100
    
        uxa: Only consider an output valid if the kernel reports it attached
    
    When backporting from SNA, a key difference that UXA does not track DPMS
    state in its enabled flag and that a DPMS off CRTC is still bound to the
    fb. So we do need to rescan the outputs and check that we have a
    connector enabled *and* the pipe is running prior to emitting a scanline
    wait.
    
    References: https://bugs.freedesktop.org/show_bug.cgi?id=50668
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
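
The check described in the commit message above can be sketched roughly as follows. This is only an illustration of the stated condition (a connector enabled *and* the pipe running), not the actual xf86-video-intel code: the `sketch_*` types, the `SKETCH_DPMS_*` constants and the function name are all hypothetical stand-ins, and it assumes DPMS state is recorded per output.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical DPMS modes; stand-ins for whatever the DDX really tracks. */
enum sketch_dpms { SKETCH_DPMS_ON, SKETCH_DPMS_STANDBY, SKETCH_DPMS_OFF };

/* Hypothetical per-output state. */
struct sketch_output {
    bool connected;        /* kernel reports the connector attached */
    int crtc_id;           /* CRTC this output is routed to, -1 if none */
    enum sketch_dpms dpms; /* last DPMS mode applied to this output */
};

/* Hypothetical per-CRTC state. */
struct sketch_crtc {
    int id;
    bool bound_to_fb;      /* in UXA this stays true even with DPMS off */
};

/*
 * Return true only if it looks safe to emit a scanline wait on this CRTC:
 * the CRTC is bound to a framebuffer and at least one connected output
 * routed to it has DPMS on. Being bound to the fb alone is not enough,
 * because a DPMS-off CRTC is still bound but its pipe is not running.
 */
static bool sketch_crtc_can_wait_for_scanline(const struct sketch_crtc *crtc,
                                              const struct sketch_output *outputs,
                                              size_t num_outputs)
{
    if (!crtc->bound_to_fb)
        return false;

    /* Rescan the outputs instead of trusting a cached "enabled" flag. */
    for (size_t i = 0; i < num_outputs; i++) {
        const struct sketch_output *out = &outputs[i];

        if (out->crtc_id != crtc->id)
            continue;
        if (out->connected && out->dpms == SKETCH_DPMS_ON)
            return true;
    }
    return false;
}
```

A caller would skip the wait whenever the function returns false, which is the DPMS-off case that the earlier backport missed.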
Comment 7 Chris Wilson 2012-06-06 00:28:48 UTC
*** Bug 50753 has been marked as a duplicate of this bug. ***
Comment 8 lu hua 2012-06-10 22:42:11 UTC
Verified. It has been fixed on commit 3d657b14b4cab98a2945.
