Bug 94380 - [gen3] GPU hanging and killing application.
Summary: [gen3] GPU hanging and killing application.
Status: RESOLVED MOVED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i915
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Ian Romanick
QA Contact: Intel 3D Bugs Mailing List
Reported: 2016-03-03 05:30 UTC by Lucas
Modified: 2019-09-18 19:39 UTC


Attachments
The error file generated at /sys/class/drm/card0/error when reproducing the game crash. (1.49 MB, text/plain)
2016-03-03 05:30 UTC, Lucas
Error file generated when running Geeks3D GpuTest Piano benchmark. (1.49 MB, text/plain)
2016-03-03 05:31 UTC, Lucas

Description Lucas 2016-03-03 05:30:34 UTC
Created attachment 122091 [details]
The error file generated at /sys/class/drm/card0/error when reproducing the game crash.

While I was playing The Sims 3 with the Pets expansion on a 945G GPU, using the graphics drivers from oibaf's Ubuntu PPA, the game crashed when I was creating a pet dog.

Looking at dmesg, I found the following error:

> Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.804071] [drm] stuck on render ring
> Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805899] [drm] GPU HANG: ecode 4:0:0x87f5fefe, in TS3W.exe [1815], reason: Ring hung, action: reset
> Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805905] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
> Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805907] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
> Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805908] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
> Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805910] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
> Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805911] [drm] GPU crash dump saved to /sys/class/drm/card0/error
> Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805914] i915: render error detected, EIR: 0x00000010
> Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805917] i915:   IPEIR: 0x00000000
> Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805919] i915:   IPEHR: 0x780a0101
> Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805921] i915:   INSTDONE_0: 0xffffffff
> Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805922] i915:   INSTDONE_1: 0xbfffeff0
> Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805924] i915:   INSTDONE_2: 0x00000000
> Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805925] i915:   INSTDONE_3: 0x00000000
> Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805927] i915:   INSTPS: 0x8001e023
> Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805929] i915:   ACTHD: 0x06e84f60
> Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805931] i915: page table error
> Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.805932] i915:   PGTBL_ER: 0x00000001
> Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.806001] [drm:i915_handle_error [i915]] *ERROR* EIR stuck: 0x00000010, masking
> Feb 27 22:30:46 tassiana-laptop kernel: [ 3921.807019] drm/i915: Resetting chip after gpu hang
> Feb 27 22:30:52 tassiana-laptop kernel: [ 3927.804065] [drm] stuck on render ring
> Feb 27 22:30:52 tassiana-laptop kernel: [ 3927.805739] [drm] GPU HANG: ecode 4:0:0x87f5fefe, in TS3W.exe [1815], reason: Ring hung, action: reset
> Feb 27 22:30:52 tassiana-laptop kernel: [ 3927.805903] [drm:i915_set_reset_status [i915]] *ERROR* gpu hanging too fast, banning!
> Feb 27 22:30:52 tassiana-laptop kernel: [ 3927.805996] drm/i915: Resetting chip after gpu hang

I thought the problem might be related to oibaf's unstable version of the graphics driver, so I reverted to the original version in Ubuntu 15.10 and restarted. Unfortunately, I forgot to save the error file for that crash.
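
Because /sys/class/drm/card0/error lives in sysfs, its contents do not survive a reboot, so it is worth copying right after a hang. A minimal sketch of doing that, assuming the path printed in the dmesg message above; the function name `save_error_state` and the output file name pattern are made up for illustration, and reading the file may require root:

```python
import shutil
import time
from pathlib import Path

# Path taken from the dmesg message above; "card0" is specific to this machine.
ERROR_STATE = Path("/sys/class/drm/card0/error")


def save_error_state(src=ERROR_STATE, dest_dir="."):
    """Copy the GPU error state to a timestamped file and return its path.

    Hypothetical helper, not part of any driver tooling.
    """
    dest = Path(dest_dir) / time.strftime("gpu-error-%Y%m%d-%H%M%S.txt")
    # Stream the contents instead of relying on the size reported for the sysfs file.
    with open(src, "rb") as fin, open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest
```

Calling `save_error_state()` before rebooting or changing driver versions keeps a copy that can be attached to the report later.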

After restarting, I was able to reproduce the error with the same game. The error file is the attached "TS3W.exe.error". The message in dmesg was the following:

> [  685.816087] [drm] stuck on render ring
> [  685.817915] [drm] GPU HANG: ecode 4:0:0x87e5fefe, in TS3W.exe [1557], reason: Ring hung, action: reset
> [  685.817919] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
> [  685.817921] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
> [  685.817923] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
> [  685.817925] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
> [  685.817926] [drm] GPU crash dump saved to /sys/class/drm/card0/error
> [  685.818213] drm/i915: Resetting chip after gpu hang
> [  691.816039] [drm] stuck on render ring
> [  691.817958] [drm] GPU HANG: ecode 4:0:0x87f5fefe, in TS3W.exe [1557], reason: Ring hung, action: reset
> [  691.818146] [drm:i915_set_reset_status [i915]] *ERROR* gpu hanging too fast, banning!
> [  691.818243] drm/i915: Resetting chip after gpu hang
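
The two hangs in the log above report slightly different ecodes (0x87e5fefe and 0x87f5fefe). To compare hangs across runs, the "GPU HANG" lines can be pulled out of a saved dmesg log. A rough sketch, assuming the message format shown in the logs in this report; the function name `parse_hangs` is hypothetical and other kernel versions may word the message differently:

```python
import re

# Pattern derived from the "GPU HANG" lines quoted in this report.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>[^,\s]+), in (?P<proc>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>[^,]+), action: (?P<action>\w+)"
)


def parse_hangs(log_text):
    """Return one dict per GPU HANG line found in log_text."""
    hangs = []
    for line in log_text.splitlines():
        match = HANG_RE.search(line)
        if match:
            fields = match.groupdict()
            fields["pid"] = int(fields["pid"])
            hangs.append(fields)
    return hangs


sample = """\
[  685.817915] [drm] GPU HANG: ecode 4:0:0x87e5fefe, in TS3W.exe [1557], reason: Ring hung, action: reset
[  691.817958] [drm] GPU HANG: ecode 4:0:0x87f5fefe, in TS3W.exe [1557], reason: Ring hung, action: reset
"""

for hang in parse_hangs(sample):
    print(hang["ecode"], hang["proc"], hang["pid"], hang["reason"])
```

The same function can be run over the GpuTest log to list its ecodes next to the ones from the game.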

Reproducing an error with a proprietary game is a big hassle, but fortunately I was able to trigger the same problem (or at least a very similar one) with Geeks3D GpuTest, a freely available tool at http://www.geeks3d.com/gputest/ (it seems it won't download if you have an ad-blocker).

To reproduce, download and unpack Geeks3D GpuTest, then run the Python script called gputest_gui.py:

> $ python gputest_gui.py

then select PixMark Piano (OpenGL 2.1/3.0) and click on "Run benchmark".

The above steps crashed the benchmark and left the following message in dmesg:

> [  430.816064] [drm] stuck on render ring
> [  430.817671] [drm] GPU HANG: ecode 4:0:0xf98df17c, in GpuTest [1647], reason: Ring hung, action: reset
> [  430.817673] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
> [  430.817674] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
> [  430.817676] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
> [  430.817677] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
> [  430.817679] [drm] GPU crash dump saved to /sys/class/drm/card0/error
> [  430.817844] drm/i915: Resetting chip after gpu hang
> [  436.804065] [drm] stuck on render ring
> [  436.805797] [drm] GPU HANG: ecode 4:0:0xf989e17c, in GpuTest [1647], reason: Ring hung, action: reset
> [  436.805944] [drm:i915_set_reset_status [i915]] *ERROR* gpu hanging too fast, banning!
> [  436.806005] drm/i915: Resetting chip after gpu hang

The corresponding error file will also be attached.
Comment 1 Lucas 2016-03-03 05:31:44 UTC
Created attachment 122092 [details]
Error file generated when running Geeks3D GpuTest Piano benchmark.
Comment 2 Matt Turner 2016-03-03 16:24:33 UTC
The driver for i945 (called i915 in both the kernel and Mesa) is basically unmaintained. You're going to have to be very persistent in order to get this fixed. I'd recommend that you start by building Mesa yourself, and then bisecting to the commit that caused the failure.
Comment 3 Kenneth Graunke 2016-03-03 16:47:34 UTC
I think there's some kind of mistake here - both of the error states are clearly from a GM45 (Gen4.5) not 945G (Gen3) machine.
Comment 4 Lucas 2016-03-03 19:06:48 UTC
(In reply to Kenneth Graunke from comment #3)
> I think there's some kind of mistake here - both of the error states are
> clearly from a GM45 (Gen4.5) not 945G (Gen3) machine.

I guess you are right. I mixed up the names. The machine is a Pentium Dual-Core from 2009, which fits the time when GM45 was released. Unfortunately, I don't have it with me now to confirm.
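
One way to check the chipset without access to the machine is to look at the attached error files. The sketch below assumes the error state contains a line of the form "PCI ID: 0x...." (an assumption about the dump format, not something quoted in this report) and uses a small, deliberately incomplete table of Intel device IDs. The names `KNOWN_IDS` and `identify_gpu` are hypothetical, and this is not necessarily how the error states were identified in comment #3:

```python
import re

# Small illustrative subset of Intel graphics device IDs; not a complete table.
KNOWN_IDS = {
    0x2772: "945G (Gen3)",
    0x27A2: "945GM (Gen3)",
    0x2A42: "GM45 (Gen4.5)",
}

# Assumed format of the device ID line in the error state dump.
PCI_ID_RE = re.compile(r"^PCI ID:\s*(0x[0-9a-fA-F]+)", re.MULTILINE)


def identify_gpu(error_state_text):
    """Return a chipset name for the PCI ID in the text, or None if no ID line is found."""
    match = PCI_ID_RE.search(error_state_text)
    if match is None:
        return None
    device_id = int(match.group(1), 16)
    return KNOWN_IDS.get(device_id, f"unknown device 0x{device_id:04x}")
```

Passing the text of a saved error file to `identify_gpu` would return a name from the table when the ID is listed, and an "unknown device" string otherwise.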
Comment 5 GitLab Migration User 2019-09-18 19:39:25 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/762.

