Bug 102286 - [BDW] Unigine Valley & GfxBench T-Rex fail to GPU hang
Summary: [BDW] Unigine Valley & GfxBench T-Rex fail to GPU hang
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: git
Hardware: Other All
Importance: medium normal
Assignee: Intel 3D Bugs Mailing List
QA Contact: Intel 3D Bugs Mailing List
Reported: 2017-08-18 08:13 UTC by Eero Tamminen
Modified: 2018-01-15 16:16 UTC

Attachments
Example i915 error state (354.61 KB, text/plain)
2017-08-18 08:13 UTC, Eero Tamminen
Latest BDW GT3 error state (69.10 KB, text/plain)
2017-09-28 16:10 UTC, Eero Tamminen
BDW GT3 error state (Mesa: 24fe4e6143) (78.71 KB, text/plain)
2017-10-24 09:45 UTC, Eero Tamminen

Description Eero Tamminen 2017-08-18 08:13:50 UTC
Created attachment 133592 [details]
Example i915 error state

Unigine Valley (v1.0) started failing randomly on BDW with Mesa git from around mid-July:
---------------
ATTENTION: default value of option vblank_mode overridden by environment.
intel_do_flush_locked failed: Input/output error
---------------
[  561.009349] [drm] GPU HANG: ecode 8:0:0x84dffec4, in valley_x64 [2125], reason: Hang on rcs0, action: reset
---------------

It happens approximately once every five runs, on both BDW GT2 & GT3, with both the drm-tip kernel and a month-older kernel build.

(This is too random to be bisected.)
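
To put a rough number on why bisecting is impractical: if a bad commit hangs in only about one run out of five, a single clean run says very little about that commit. The sketch below assumes that runs are independent and that the per-run hang probability is constant; neither assumption is verified in this report, and `clean_runs_needed` is a hypothetical helper name used only for illustration.

```python
import math


def clean_runs_needed(hang_rate: float, confidence: float = 0.95) -> int:
    """Return the smallest number of runs n such that a commit with the
    given per-run hang probability would hang at least once in n runs
    with probability >= confidence.

    Assumes independent runs with a constant hang probability.
    """
    if not 0.0 < hang_rate < 1.0:
        raise ValueError("hang_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # P(no hang in n runs) = (1 - hang_rate) ** n must be <= 1 - confidence.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - hang_rate))


if __name__ == "__main__":
    for rate in (0.2, 0.05):
        print(f"hang rate {rate}: {clean_runs_needed(rate)} clean runs")
```

Under these assumptions, a 1-in-5 hang rate needs 14 clean runs before a commit can be called good with 95% confidence, and a 1-in-20 rate needs 59, for every bisection step.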
Comment 1 Mark Janes 2017-08-18 18:24:32 UTC
I saw one GPU hang on BDW with Valley in the past 20 runs.
Comment 2 Eero Tamminen 2017-08-21 10:36:36 UTC
I've also seen a couple of system hangs on BYT during Valley runs. The last one was with Mesa 6f8a577ed2 (and a drm-tip kernel from around the same time).
Comment 3 Eero Tamminen 2017-09-07 10:00:39 UTC
The last Valley GPU hang I've seen on BDW GT2 was with:
f24cf82d6db290a88abfff0669d2c5e2aa463901 2017-08-19

Somewhat later, there were GPU hangs in GfxBench T-Rex onscreen & offscreen with:
4d807d7fe272db97fb9e20800872d5970fa2696d 2017-08-22
f0602dc92044ea6d738d0e539e52f938a41f6093 2017-08-23
dc9e08b0c3b04ba77ed59b8700e9f43edccb3168 2017-09-02


On BDW GT3, the last Valley GPU hang was a bit later, with:
5d2205fafb5d244af658de5e3c38c6cc805ae345 2017-08-24

However, on the same commit and later, there have also been GfxBench T-Rex onscreen & offscreen GPU hangs with:
1eb58960bfd30d575cca4fa3c600512751aab467 2017-08-25
43145bbf097dd0c973bb19afd9227cf3ce75f52a 2017-09-01
49b428470e28ae6ab22083e43fa41abf622f3b0d 2017-09-03
ad160c2273ce4c76fd4713badc126e05dfe9cb81 2017-09-06
Comment 4 Eero Tamminen 2017-09-28 16:10:11 UTC
Created attachment 134547 [details]
Latest BDW GT3 error state

Still getting T-Rex (onscreen) hangs on BDW now and then. The latest was on BDW GT3 with Mesa 52ed3bca91ff13217378196d6800ca7113641a63; see the attached error state.
Comment 5 Eero Tamminen 2017-10-24 09:45:34 UTC
Created attachment 135019 [details]
BDW GT3 error state (Mesa: 24fe4e6143)

The GfxBench T-Rex offscreen test is still GPU hanging. The last error is from a few days ago:
-------------------------
i965: Failed to submit batchbuffer: Input/output error
...
[ 4052.788996] [drm] GPU HANG: ecode 8:0:0x84dfffc4, in testfw_app [2869], reason: Hang on rcs0, action: reset
-------------------------

It happened on both BDW GT2 & GT3 with the following Mesa version:
2017-10-21 24fe4e6143

See the attached error log.
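
For tracking these hangs across many automated runs, the kernel log line quoted above can be picked out of dmesg output with a small script. This is only a sketch: the regular expression is inferred from the two "GPU HANG" lines quoted in this report, other kernel versions may format the message differently, and the `GpuHang` and `parse_gpu_hang` names are hypothetical. The ecode value is kept as an opaque string rather than interpreted.

```python
import re
from typing import NamedTuple, Optional


class GpuHang(NamedTuple):
    timestamp: float  # seconds since boot, from the dmesg prefix
    ecode: str        # kept verbatim, e.g. "8:0:0x84dfffc4"
    process: str
    pid: int
    reason: str
    action: str


# Pattern inferred from the log lines quoted in this report (an assumption).
_HANG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[drm\] GPU HANG: ecode (?P<ecode>\S+), "
    r"in (?P<proc>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def parse_gpu_hang(line: str) -> Optional[GpuHang]:
    """Parse one dmesg line; return None if it is not a GPU HANG message."""
    m = _HANG_RE.search(line)
    if m is None:
        return None
    return GpuHang(
        timestamp=float(m.group("ts")),
        ecode=m.group("ecode"),
        process=m.group("proc"),
        pid=int(m.group("pid")),
        reason=m.group("reason"),
        action=m.group("action"),
    )


if __name__ == "__main__":
    sample = ("[ 4052.788996] [drm] GPU HANG: ecode 8:0:0x84dfffc4, "
              "in testfw_app [2869], reason: Hang on rcs0, action: reset")
    print(parse_gpu_hang(sample))
```

Grouping the parsed records by `process` and `ecode` would make it easier to see whether the Valley and T-Rex hangs share a signature, although the report does not establish that they do.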


Valley hasn't GPU hung within the last 2 months. It segfaulted with Mesa:
2017-09-03 49b428470e

But that's the last problem I've seen with Valley, i.e. the Valley part of this bug is no longer reproducible.
Comment 6 Eero Tamminen 2017-11-27 09:21:39 UTC
The last T-Rex hang with the latest 3D stack and/or Mesa was at the end of October, so this issue seems to be gone, but I'll keep this ticket open until next year to be sure.

(With the latest kernel, the last hang was early this month, so I'm not sure where the fix actually is, in Mesa or the kernel, if there is one, or whether the timings have just changed so that it no longer gets triggered.)
Comment 7 Eero Tamminen 2018-01-15 16:16:08 UTC
(In reply to Eero Tamminen from comment #6)
> The last T-Rex hang with the latest 3D stack and/or Mesa was at the end of
> October, so this issue seems to be gone, but I'll keep this ticket open
> until next year to be sure.

Haven't seen these anymore, so setting this as (assumed) fixed.