Bug 101402 - [bdw] GPU hang in starbound
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Intel 3D Bugs Mailing List
QA Contact: Intel 3D Bugs Mailing List
Duplicates: 101411
 
Reported: 2017-06-13 05:11 UTC by Richard Wilson
Modified: 2017-06-16 17:11 UTC (History)

Attachments
/sys/class/drm/card0/error (35.98 KB, application/gzip)
2017-06-13 05:11 UTC, Richard Wilson
uncompressed (24.21 KB, text/plain)
2017-06-16 02:38 UTC, leozinho29_eu

Description Richard Wilson 2017-06-13 05:11:41 UTC
Created attachment 131913 [details]
/sys/class/drm/card0/error

Ubuntu 16.04 LTS

The GPU hung recently while launching Starbound. Other games have had the issue in the past, though not all of them.

It originally happened on kernel 4.4; I have since upgraded to 4.8.0-54-generic.

dmesg told me to report it here:


```
[   79.635062] [drm] GPU HANG: ecode 8:0:0x85dffffb, in starbound [3449], reason: Hang on render ring, action: reset
[   79.635063] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[   79.635063] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[   79.635064] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[   79.635064] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[   79.635065] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[   79.635158] drm/i915: Resetting chip after gpu hang
[   89.624974] drm/i915: Resetting chip after gpu hang
```
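
For triage, the `GPU HANG` line above can be split into its fields programmatically. The following is a minimal sketch, assuming the line format shown in this report. The function name `parse_gpu_hang` and the group names are illustrative only, and reading the first `ecode` field as the hardware generation is an assumption based on the two reports here (8 on the Broadwell machine, 9 on the Skylake one).

```python
import re

# Pattern for the i915 "GPU HANG" dmesg line as it appears in this report.
# Group names are illustrative; "gen" and "engine" are assumed meanings.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<gen>\d+):(?P<engine>\d+):(?P<code>0x[0-9a-fA-F]+), "
    r"in (?P<process>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def parse_gpu_hang(line):
    """Return the fields of a GPU HANG line as a dict, or None if no match."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["pid"] = int(info["pid"])
    return info


line = (
    "[   79.635062] [drm] GPU HANG: ecode 8:0:0x85dffffb, in starbound [3449], "
    "reason: Hang on render ring, action: reset"
)
hang = parse_gpu_hang(line)
print(hang["process"], hang["pid"], hang["code"], hang["reason"])
```

Two hang lines with the same hex code (here `0x85dffffb`) are a hint, not proof, that the reports share a cause.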

I am willing to help debug the issue.
Comment 1 Elizabeth 2017-06-14 17:49:57 UTC
Good evening Richard. There is a similar case to the bug you are reporting. Could you please take a look at it and try the steps listed there to check whether they help with the problem: https://bugs.freedesktop.org/show_bug.cgi?id=101411
Thank you.
Comment 2 Chris Wilson 2017-06-15 12:53:00 UTC
*** Bug 101411 has been marked as a duplicate of this bug. ***
Comment 3 Chris Wilson 2017-06-15 12:54:39 UTC
From bug 101411:
> Created attachment 131931 [details]
> error
> 
> Hello
> 
> When opening Starbound, after the initial loading screen, when the main menu
> should appear, the computer freezes and, once it becomes responsive again, the
> game crashes.
> 
> When the game is opened from the terminal, a message appears at the
> moment the game crashes:
> intel_do_flush_locked failed: Input/output error
> 
> After this freeze and crash, some messages appear on dmesg:
> 
> [ 1868.754963] [drm] GPU HANG: ecode 9:0:0x85dffffb, in starbound [5869], reason: Hang on render ring, action: reset
> [ 1868.754964] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
> [ 1868.754965] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
> [ 1868.754966] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
> [ 1868.754966] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
> [ 1868.754967] [drm] GPU crash dump saved to /sys/class/drm/card0/error
> [ 1868.755030] drm/i915: Resetting chip after gpu hang
> [ 1868.755197] [drm] RC6 on
> [ 1868.770399] [drm] GuC firmware load skipped
> [ 1876.750371] drm/i915: Resetting chip after gpu hang
> [ 1876.750473] [drm] RC6 on
> [ 1876.764402] [drm] GuC firmware load skipped
> [ 1890.798070] drm/i915: Resetting chip after gpu hang
> [ 1890.798216] [drm] RC6 on
> [ 1890.813528] [drm] GuC firmware load skipped
> [ 1898.797893] drm/i915: Resetting chip after gpu hang
> [ 1898.798138] [drm] RC6 on
> [ 1898.815332] [drm] GuC firmware load skipped
> [ 1906.797768] drm/i915: Resetting chip after gpu hang
> [ 1906.797894] [drm] RC6 on
> [ 1906.811883] [drm] GuC firmware load skipped
> 
> This problem happens every time the game is opened. The only way to avoid
> it is to launch the game from a terminal with LIBGL_ALWAYS_SOFTWARE=1 set,
> but then the performance is terrible, because software rendering is used.
> 
> Information from glxinfo:
> 
> Extended renderer info (GLX_MESA_query_renderer):
>     Vendor: Intel Open Source Technology Center (0x8086)
>     Device: Mesa DRI Intel(R) HD Graphics 520 (Skylake GT2) (0x1916)
>     Version: 12.0.6
>     Accelerated: yes
>     Video memory: 3072MB
>     Unified memory: yes
>     Preferred profile: core (0x1)
>     Max core profile version: 4.3
>     Max compat profile version: 3.0
>     Max GLES1 profile version: 1.1
>     Max GLES[23] profile version: 3.1
> OpenGL vendor string: Intel Open Source Technology Center
> OpenGL renderer string: Mesa DRI Intel(R) HD Graphics 520 (Skylake GT2)
> OpenGL core profile version string: 4.3 (Core Profile) Mesa 12.0.6
> OpenGL core profile shading language version string: 4.30
> OpenGL core profile context flags: (none)
> OpenGL core profile profile mask: core profile 
> 
> Specifications:
> 
> Processor: Intel Core i3-6100U;
> Video: Intel HD Graphics 520;
> Architecture: amd64
> Mesa: 12.0.6;
> libdrm: 2.4.70-1
> Kernel version: 4.11.4-041104-generic (from
> http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.11.4/). This also happened
> with generic (4.4), generic-hwe (4.8) and generic hwe-edge (4.10).
> Distribution: Xubuntu 16.04.2 amd64
> Machine: Lenovo Ideapad 310-14ISK (LENOVO 80UG/Toronto 4A2, BIOS 0XCN42WW
> 04/21/2017)
> 
> Steps to reproduce:
> 1) Open the game Starbound;
> 2) Wait for it to load;
> 3) Note how the entire system freezes when the main menu should appear;
> 4) The game crashes.
> 
> Additional info:
> 
> I am not sure this can be helpful, but from the game logs, it hangs when the
> following message appears:
> [Info] detected supported OpenGL texture size 8192, using atlasNumCells 256
>
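
The `LIBGL_ALWAYS_SOFTWARE=1` workaround quoted above can be scripted so the variable is set only for the game process. This is a minimal sketch: the helper names are hypothetical, and the command passed to `run_with_software_rendering` is whatever launches the game on your system. As the reporter notes, this trades the hang for slow software rendering.

```python
import os
import subprocess


def software_rendering_env(base_env=None):
    """Return a copy of the environment with Mesa software rendering forced."""
    env = dict(os.environ if base_env is None else base_env)
    env["LIBGL_ALWAYS_SOFTWARE"] = "1"
    return env


def run_with_software_rendering(command):
    """Run command (a list of arguments) with software rendering enabled."""
    return subprocess.run(command, env=software_rendering_env()).returncode


# Show the variable that would be passed to the child process.
print(software_rendering_env({"DISPLAY": ":0"}))
```

For example, `run_with_software_rendering(["./starbound"])` would launch the game this way, assuming the binary is in the current directory.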
Comment 4 leozinho29_eu 2017-06-16 02:38:21 UTC
Created attachment 131991 [details]
uncompressed
Comment 5 leozinho29_eu 2017-06-16 02:46:26 UTC
Hello

The bug I opened was marked as a duplicate and closed.

I did what was suggested there before it was closed, so I have uploaded the uncompressed error file and the test result here.

I installed Mesa 17.0.3 and was able to open and play Starbound with it (I confirmed the version was 17.0.3 with glxinfo). It had no bugs and worked fine.

I can say Mesa 17.0.3 doesn't have this bug.
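
The glxinfo version check can be automated. This is a minimal sketch that extracts the Mesa version from a version string like the one quoted from bug 101411 and compares it with 17.0.3, the version confirmed working in this comment. The names `mesa_version` and `KNOWN_GOOD` are illustrative, and the fix may have landed in a release earlier than 17.0.3; this thread does not say.

```python
import re

# Version reported as working in this thread; not necessarily the first fixed one.
KNOWN_GOOD = (17, 0, 3)


def mesa_version(glxinfo_text):
    """Return the Mesa version as a tuple of ints, or None if not found."""
    match = re.search(r"Mesa (\d+)\.(\d+)\.(\d+)", glxinfo_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


sample = "OpenGL core profile version string: 4.3 (Core Profile) Mesa 12.0.6"
reported = mesa_version(sample)
print(reported, reported >= KNOWN_GOOD)
```

In practice the input would be the output of `glxinfo` rather than a hard-coded string.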

Is there a way to make applications detect i965_dri.so and the other DRI files in the folder I created, so that I could use the newer version normally, without having to set their paths in the terminal before each use?

Thank you.
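
One possible way to avoid typing the paths each time is a small launcher that sets Mesa's `LIBGL_DRIVERS_PATH` variable for the child process. This is a sketch under assumptions: the helper name `local_mesa_env` and the driver directory are hypothetical, and a locally built Mesa may also need `LD_LIBRARY_PATH` pointed at its libGL, which is not handled here.

```python
import os


def local_mesa_env(driver_dir, base_env=None):
    """Return an environment that points Mesa's DRI loader at driver_dir."""
    driver_dir = os.path.abspath(driver_dir)
    # Fail early if the directory does not contain the i965 driver.
    if not os.path.isfile(os.path.join(driver_dir, "i965_dri.so")):
        raise FileNotFoundError("i965_dri.so not found in " + driver_dir)
    env = dict(os.environ if base_env is None else base_env)
    env["LIBGL_DRIVERS_PATH"] = driver_dir
    return env
```

The returned dictionary can be passed as `env=` to `subprocess.run` together with the game's command line.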
Comment 6 Mark Janes 2017-06-16 17:11:20 UTC
leozinho29_eu:

You can update the version of Mesa on your Ubuntu LTS system with the oibaf PPA:

https://launchpad.net/~oibaf/+archive/ubuntu/graphics-drivers

Your comments indicate that this bug has already been fixed in mesa.

