Bug 44610

Summary: [IVB GT2] urbanterror makes machine hang
Product: DRI
Reporter: libo <bo.c.li>
Component: DRM/Intel
Assignee: Eugeni Dodonov <eugeni>
Status: VERIFIED FIXED
QA Contact:
Severity: critical
Priority: high
CC: ben, chris, daniel, jbarnes, kenneth
Version: XOrg git   
Hardware: All   
OS: Linux (All)   
Whiteboard:
i915 platform:
i915 features:
Bug Depends on:    
Bug Blocks: 42991, 44622    

Description libo 2012-01-09 20:20:55 UTC
System Environment:
--------------------------
 Libdrm:	(master)2.4.30
 Mesa:		(master)37240d2132d25588ad05ae5394c237f45d8ad881
 Xserver:	(master)xorg-server-1.11.99.901
 Xf86_video_intel:	(master)2.17.0-352-g6c70558ae7298db94724c931d88a730ef0151608
 Cairo:		(master)fefc273c53c39c750b27d35964ec250547b948af
 Libva:		(vaapi-ext)4aeaa296febf2f71200ff30380902e2c80cbf679
 Libva_intel_driver:	(vaapi-ext)f0358f252e13a619550d37e5a720f9894c7aa18c
 Kernel:	(drm-intel-next) d8e70a254d8f2da141006e496a51502b79115e80

Bug detailed description:
-------------------------
The machine hangs when running Urbanterror.
The hang occurs only on IVB.


Reproduce steps:
-------------------------
1. xinit &
2. gnome-session
3. vblank_mode=0 ./urbanterror
Comment 1 Daniel Vetter 2012-01-10 00:01:40 UTC
As usual, please attach full dmesg and the i915_error_state plus any other relevant details.
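
For readers following along, the request above could be scripted roughly as follows. This is only a sketch under stated assumptions: debugfs is mounted at /sys/kernel/debug, the GPU is DRM minor 0, and the commands run as root. The helper name collect_gpu_logs is hypothetical.

```shell
#!/bin/sh
# Sketch: gather the logs requested above into one directory.
# Assumptions: debugfs is mounted at /sys/kernel/debug and the GPU is
# DRM minor 0. collect_gpu_logs is a hypothetical helper name.
collect_gpu_logs() {
    out_dir=$1
    # Optional second argument overrides the assumed debugfs path.
    error_state=${2:-/sys/kernel/debug/dri/0/i915_error_state}

    mkdir -p "$out_dir" || return 1

    # Full kernel log; this may need root when dmesg access is restricted.
    dmesg > "$out_dir/dmesg.txt" 2>/dev/null ||
        echo "warning: could not read dmesg" >&2

    # GPU error state recorded by the i915 driver after a hang.
    if [ -r "$error_state" ]; then
        cat "$error_state" > "$out_dir/i915_error_state.txt"
    else
        echo "warning: $error_state is not readable" >&2
        return 1
    fi
}
```

Running `collect_gpu_logs /tmp/bug44610` right after a hang would leave two files that can be attached to the report.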
Comment 2 Chris Wilson 2012-01-10 01:30:47 UTC
Also try with i915.reset=0 and without gnome-session to simplify the test environment and hopefully reduce the system hang into a GPU hang.
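
A quick way to confirm that the machine really booted with the suggested option is to look at the kernel command line. The sketch below assumes the option was passed on the boot command line (not as a modprobe option); has_kernel_param is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: check whether a given word appears on the kernel command line.
# Assumption: the option was added at boot, so it shows up in /proc/cmdline.
# has_kernel_param is a hypothetical helper name.
has_kernel_param() {
    param=$1
    cmdline_file=${2:-/proc/cmdline}

    # Pad with spaces so only whole words match.
    case " $(cat "$cmdline_file") " in
        *" $param "*) return 0 ;;
    esac
    return 1
}
```

For example, `has_kernel_param i915.reset=0 && echo "GPU reset disabled"` prints the message only when the option is present.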
Comment 3 libo 2012-01-10 19:32:08 UTC
The bug still reproduces in the simplified test environment.
(In reply to comment #2)
> Also try with i915.reset=0 and without gnome-session to simplify the test
> environment and hopefully reduce the system hang into a GPU hang.
Comment 4 libo 2012-01-10 19:41:21 UTC
I tried to capture the kernel dmesg with netconsole, but I couldn't get anything. Do I need to add a kernel parameter? Or do you have other advice for getting more messages?
(In reply to comment #1)
> As usual, please attach full dmesg and the i915_error_state plus any other
> relevant details.
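
On the netconsole question: the kernel's netconsole documentation describes the parameter as `[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]`. The sketch below only assembles that string; every address, port, and interface name in the usage note is a placeholder, and netconsole_param is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: build a netconsole parameter string in the documented form
#   [src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]
# netconsole_param is a hypothetical helper name.
netconsole_param() {
    src_port=$1 src_ip=$2 dev=$3
    tgt_port=$4 tgt_ip=$5 tgt_mac=$6
    printf '%s@%s/%s,%s@%s/%s\n' \
        "$src_port" "$src_ip" "$dev" "$tgt_port" "$tgt_ip" "$tgt_mac"
}
```

With placeholder values, `netconsole_param 4444 10.0.0.1 eth1 9353 10.0.0.2 12:34:56:78:9a:bc` yields a string that can be appended to the kernel command line as `netconsole=...` or passed to `modprobe netconsole netconsole=...`. Adding `ignore_loglevel` to the kernel command line may also help, since messages below the console log level are otherwise not sent.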
Comment 5 Gordon Jin 2012-01-12 00:30:58 UTC
Could any developer confirm this is reproducible or not?
Comment 6 Daniel Vetter 2012-01-13 04:22:02 UTC
Is this a regression? If so, please bisect it.
Comment 7 Kenneth Graunke 2012-01-13 10:33:03 UTC
I can reproduce this on my IVB.  I doubt it's a regression.  At least, not a recent one.

I'll see if I can create an apitrace that reproduces the issue.
Comment 8 libo 2012-01-15 17:08:10 UTC
I don't think it's a regression: the bug has always existed, but it could not be reproduced reliably before.
Comment 9 libo 2012-01-15 21:41:07 UTC
This bug only exists on IVB GT2. 
I have (In reply to comment #8)
> I don't think it's a regression because the bug has always exists but it can't
> be reproduced steadily before.
Comment 10 Chris Wilson 2012-01-22 09:20:40 UTC
Ken, is it still reproducible if you pretend it's a GT1 and use the more conservative defaults in mesa?
Comment 11 libo 2012-01-22 09:20:50 UTC
I'm on vacation from Jan.19 to Feb.5. My email response may be slow.
Comment 12 Chris Wilson 2012-02-08 06:51:34 UTC
Ken, Eugeni, can you test whether the recent batch of IVB w/a help here?
Comment 13 Eugeni Dodonov 2012-02-08 09:28:45 UTC
The magic IVB hang-fixing patches seem to have solved this one as well.
Comment 14 Eugeni Dodonov 2012-02-23 06:26:53 UTC
The Ivy Bridge hang workarounds have made it to drm-intel-fixes and have landed in Linus' tree. They should be available in 3.3-rc5.
Comment 15 Chris Wilson 2012-02-23 06:35:24 UTC
commit d71de14ddf423ccc9a2e3f7e37553c99ead20d7c
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Wed Feb 8 12:53:52 2012 -0800

    drm/i915: gen7: Disable the RHWO optimization as it can cause GPU hangs.
    
    The BSpec Workarounds page states that bits 10 and 26 must be set to
    avoid 3D ring hangs.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41353
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44610
    Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
    Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
    Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

commit db099c8f963fe656108e0a068274c5580a17f69b
Author: Eugeni Dodonov <eugeni.dodonov@intel.com>
Date:   Wed Feb 8 12:53:51 2012 -0800

    drm/i915: gen7: work around a system hang on IVB
    
    This adds the workaround for WaCatErrorRejectionIssue which could result
    in a system hang.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41353
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44610
    Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
    Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
    Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
    Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>

commit e4e0c058a19c41150d12ad2d3023b3cf09c5de67
Author: Eugeni Dodonov <eugeni.dodonov@intel.com>
Date:   Wed Feb 8 12:53:50 2012 -0800

    drm/i915: gen7: Implement an L3 caching workaround.
    
    This adds two cache-related workarounds for Ivy Bridge which can lead to
    3D ring hangs and corruptions.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41353
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44610
    Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
    Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
    Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
    Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
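
As a worked illustration of the "bits 10 and 26" wording in the RHWO commit message above, the register value with exactly those two bits set can be computed as follows. This is plain arithmetic, not the driver code; bit_mask is a hypothetical helper name. Bit 26 is bit 10 shifted up by 16, which looks like a value bit paired with a write-enable mask bit, though the commit message itself does not say so.

```shell
#!/bin/sh
# Sketch: OR together the given bit positions and print the result in hex.
# bit_mask is a hypothetical helper name, not part of the i915 driver.
bit_mask() {
    mask=0
    for bit in "$@"; do
        mask=$(( mask | (1 << bit) ))
    done
    printf '0x%08x\n' "$mask"
}

bit_mask 10 26   # prints 0x04000400
```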
Comment 16 Florian Mickler 2012-02-27 14:30:58 UTC
A patch referencing this bug report has been merged in Linux v3.3-rc5:

commit eae66b50c760233fad526edf4a0d327be17a055d
Author: Eugeni Dodonov <eugeni.dodonov@intel.com>
Date:   Wed Feb 8 12:53:49 2012 -0800

    drm/i915: gen7: implement rczunit workaround
Comment 17 libo 2012-03-30 19:11:00 UTC
I have verified that the issue is fixed with
commit eae66b50c760233fad526edf4a0d327be17a055d
Comment 18 libo 2012-03-30 19:56:24 UTC
Hi Daniel, has your branch integrated this patch?

commit eae66b50c760233fad526edf4a0d327be17a055d
Author: Eugeni Dodonov <eugeni.dodonov@intel.com>
Date:   Wed Feb 8 12:53:49 2012 -0800

    drm/i915: gen7: implement rczunit workaround

(In reply to comment #1)
> As usual, please attach full dmesg and the i915_error_state plus any other
> relevant details.
Comment 19 libo 2012-03-30 20:36:54 UTC
I can't find it in the -next (or -queued) branch.

(In reply to comment #18)
> Hi Daniel, has your branch integrated this patch?
> 
> commit eae66b50c760233fad526edf4a0d327be17a055d
> Author: Eugeni Dodonov <eugeni.dodonov@intel.com>
> Date:   Wed Feb 8 12:53:49 2012 -0800
> 
>     drm/i915: gen7: implement rczunit workaround
> 
> (In reply to comment #1)
> > As usual, please attach full dmesg and the i915_error_state plus any other
> > relevant details.
Comment 20 Daniel Vetter 2012-03-31 02:30:16 UTC
> --- Comment #19 from libo <bo.c.li@intel.com> 2012-03-30 20:36:54 PDT ---
> I can't find it in -next (or -queued) branch .

It's not there yet, it's only in -testing. I plan to merge -fixes into
-next before the next testing cycle (assuming that all the things I'd like
to end up in next are indeed there).
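
Whether a given commit has reached a branch can also be checked locally with git. A minimal sketch, assuming a git version that provides `git merge-base --is-ancestor` and a clone that has the branch fetched; commit_in_branch is a hypothetical helper name, and the remote name in the usage note is an assumption.

```shell
#!/bin/sh
# Sketch: succeed if <commit> is reachable from <branch>.
# Must be run inside a git work tree. commit_in_branch is a hypothetical
# helper name.
commit_in_branch() {
    commit=$1
    branch=$2
    # Exit status 0 means the commit is an ancestor of the branch tip.
    git merge-base --is-ancestor "$commit" "$branch"
}
```

For example, `commit_in_branch eae66b50c760233fad526edf4a0d327be17a055d origin/drm-intel-next && echo merged` prints "merged" only once the rczunit workaround is part of that branch.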
Comment 21 Ouping Zhang 2012-04-05 01:31:29 UTC
Running openarena more than 10 times in a row causes a GPU hang on IVB.
Comment 22 Daniel Vetter 2012-04-09 08:51:09 UTC
Ouping Zhang, please open a new bug report for the openarena hangs, it sounds like they are a new (and maybe unrelated) sighting.
Comment 23 Ouping Zhang 2012-04-10 01:37:51 UTC
This patch also fixes the issue quoted below.
(In reply to comment #22)
> Ouping Zhang, please open a new bug report for the openarena hangs, it sounds
> like they are a new (and maybe unrelated) sighting.