Bug 99924 - [drm] stuck on render ring
Summary: [drm] stuck on render ring
Status: CLOSED INVALID
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86 (IA32) Linux (All)
Importance: medium major
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-02-23 15:19 UTC by Chris Healy
Modified: 2017-09-07 20:27 UTC
CC List: 2 users

See Also:
i915 platform: IVB
i915 features: GPU hang


Attachments
The contents of /sys/class/drm/card0/error (3.54 MB, text/plain)
2017-02-23 15:19 UTC, Chris Healy

Description Chris Healy 2017-02-23 15:19:30 UTC
Created attachment 129875
The contents of /sys/class/drm/card0/error

3.14.5 kernel
2.4.52 libdrm
1.6.2 libX11
10.0.2 Mesa
2.21.15 xf86-video-intel
1.12.4 xserver-xorg

This issue occurred while running a custom OpenGL application, after which the application stopped rendering.

Kernel messages as follows:

[51630.220152] [drm] stuck on render ring
[51630.220163] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[51630.220167] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[51630.220170] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[51630.220173] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[51630.220176] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[51636.201578] [drm] stuck on render ring
[51642.202965] [drm] stuck on render ring
..... keeps going indefinitely ....
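
For anyone triaging a similar log, the repeating hang messages above can be pulled out of saved kernel output with a short script. This is only a sketch: it assumes the log lines keep the "[timestamp] [drm] stuck on <name> ring" shape shown in this report, and the names HANG_RE, hang_events and intervals are made up for the illustration.

```python
import re

# Matches lines such as "[51630.220152] [drm] stuck on render ring".
# Assumption: the message format is the one quoted in this report.
HANG_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+\[drm\]\s+stuck on (\w+) ring")


def hang_events(log_text):
    """Return a list of (timestamp_seconds, ring_name) for each hang line."""
    events = []
    for line in log_text.splitlines():
        match = HANG_RE.match(line.strip())
        if match:
            events.append((float(match.group(1)), match.group(2)))
    return events


def intervals(events):
    """Return the time in seconds between consecutive hang events."""
    times = [t for t, _ in events]
    return [round(later - earlier, 6) for earlier, later in zip(times, times[1:])]
```

Applied to the three "stuck on render ring" lines quoted above, the intervals come out at roughly six seconds each, which shows how regularly the message repeated.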

After I copied /sys/class/drm/card0/error to /tmp, the system automatically recovered and continued rendering as if there was no problem.
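
Since the kernel message asks for the crash dump to be attached, here is a minimal sketch of saving it to a timestamped file. The function name save_error_state and the output file naming are hypothetical; the source path is the one named in the kernel messages, and reading it normally requires root. The sketch only copies the file and makes no claim about why the system recovered after the copy.

```python
import shutil
import time
from pathlib import Path


def save_error_state(src="/sys/class/drm/card0/error", dest_dir="/tmp"):
    """Copy the GPU error state to a timestamped file and return its path."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(dest_dir) / f"i915-error-{stamp}.txt"
    # Stream the contents until EOF rather than trusting the reported
    # file size, which sysfs does not always set meaningfully.
    with Path(src).open("rb") as fin, dest.open("wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest
```

Calling save_error_state() with no arguments reads the default path and writes the copy under /tmp; the returned path can then be attached to the bug report.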
Comment 1 Chris Healy 2017-02-23 15:20:57 UTC
Additionally, this was on an Ivy Bridge i7-3517U.
Comment 2 Chris Wilson 2017-02-23 15:25:12 UTC
Looks like the HW context state was corrupted. You definitely need a newer Mesa and kernel.
Comment 3 Chris Healy 2017-02-23 15:47:34 UTC
Is there a better way to figure out what needs to change to address this issue without moving to a trunk kernel and Mesa, or is that pretty difficult due to the age of the code and the number of differences?

Ideally, we would take an individual bug fix and apply it to our current kernel/Mesa/libdrm to address this issue as this box exists on a commercial aircraft and making large changes can be challenging for non-technical reasons.  Is this reasonably feasible?
Comment 4 Elizabeth 2017-07-05 21:10:30 UTC
(In reply to Chris Healy from comment #3)
> ... 
> Ideally, we would take an individual bug fix and apply it to our current
> kernel/Mesa/libdrm to address this issue as this box exists on a commercial
> aircraft and making large changes can be challenging for non-technical
> reasons.  Is this reasonably feasible?

Good evening Chris,
Your problem is quite understandable, but it is especially risky to keep the same old kernel and rely on patches or customized fixes, because this can break other features, which may then work poorly or not at all. That is why it is highly recommended to move from the current version to the latest kernel. I hope this information helps.
BR. Elizabeth
Comment 5 Elizabeth 2017-08-25 18:47:06 UTC
Hello again, I'm closing this bug since the last update was in February. If the problem persists, please file a new bug with HW and SW information and logs attached. Thanks.

