Bug 92441

Summary: GPU HANG: ecode 6:0:0x000097ff // kernel 4.1.10 x86_64
Product: DRI
Reporter: lexi81 <bugzilla>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED FIXED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: critical
Priority: medium
CC: bugzilla, intel-gfx-bugs
Version: XOrg git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
i915 platform: SNB
i915 features: GPU hang
Attachments:
Description    Flags
dmesg          none
gpu dump       none

Description lexi81 2015-10-12 19:51:27 UTC
Created attachment 118845 [details]
dmesg

At the start of playback, the GPU hangs after displaying the first frame. This happens on OpenElec kernel 4.1.10 (new VAAPI EGL branch). The file being played is the AVS forums white clipping file.

[   61.340143] [drm] stuck on render ring
[   61.342340] [drm] GPU HANG: ecode 6:0:0x000097ff, in kodi.bin [462], reason: Ring hung, action: reset
[   61.342344] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[   61.342346] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[   61.342349] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[   61.342351] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[   61.342354] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[   61.344936] drm/i915: Resetting chip after gpu hang
[   67.339418] [drm] stuck on render ring
[   67.340439] [drm] GPU HANG: ecode 6:0:0x000097ff, in kodi.bin [462], reason: Ring hung, action: reset
[   67.342597] drm/i915: Resetting chip after gpu hang
[   68.062008] kodi.bin[462]: segfault at 30 ip 0000000000000030 sp 00007ffcbee89778 error 14 in kodi.bin[400000+1418000]

Full dmesg and GPU dump attached.
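
For anyone comparing several such logs, the "GPU HANG" lines above can be pulled out of a saved dmesg with a short script. This is only a minimal sketch based on the line format shown in this report. The function name parse_gpu_hangs and the field names are hypothetical, and reading the three ecode fields as hardware generation, ring id and a hash value is an assumption, not something stated in this report.

```python
import re

# Pattern matching the "GPU HANG" line format seen in the dmesg excerpt above.
# Assumption: the three colon-separated ecode fields are generation, ring id
# and a hash value; the group names below are illustrative only.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<gen>\d+):(?P<ring>\d+):(?P<ecode>0x[0-9a-fA-F]+), "
    r"in (?P<process>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>[^,]+), action: (?P<action>\w+)"
)


def parse_gpu_hangs(dmesg_text):
    """Return one dict per GPU HANG line found in the given dmesg text."""
    hangs = []
    for line in dmesg_text.splitlines():
        match = HANG_RE.search(line)
        if match:
            fields = match.groupdict()
            fields["pid"] = int(fields["pid"])  # pid as a number, rest as strings
            hangs.append(fields)
    return hangs


# Sample line copied from the log in this report.
sample = (
    "[   61.342340] [drm] GPU HANG: ecode 6:0:0x000097ff, "
    "in kodi.bin [462], reason: Ring hung, action: reset"
)

for hang in parse_gpu_hangs(sample):
    print(hang["process"], hang["pid"], hang["ecode"], hang["reason"], hang["action"])
```

Lines that do not match the pattern, such as the "stuck on render ring" and "Resetting chip" messages, are skipped.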
Comment 1 lexi81 2015-10-12 19:52:45 UTC
Created attachment 118846 [details]
gpu dump
Comment 2 yann 2016-09-21 16:17:32 UTC
From the GPU crash dump, we can see that the actual heads to the batch buffers for the render and BSD rings are corrupted (not pointing to the current ones), and we can see:
ERROR: 0x00000001
    TLB page fault error (GTT entry not valid)


Since improvements have been pushed to the kernel and Mesa that will benefit your system, please re-test with the latest kernel and Mesa to see whether this issue still occurs.
Comment 3 yann 2016-11-15 11:31:22 UTC
(In reply to yann from comment #2)
> From the GPU crash dump, we can see that the actual heads to the batch
> buffers for the render and BSD rings are corrupted (not pointing to the
> current ones), and we can see:
> ERROR: 0x00000001
>     TLB page fault error (GTT entry not valid)
> 
> 
> Since improvements have been pushed to the kernel and Mesa that will
> benefit your system, please re-test with the latest kernel and Mesa to see
> whether this issue still occurs.

Timeout. Assuming that it is fixed by now. If this is not the case, please re-test with the latest kernel and Mesa (12-13) to see whether this issue still occurs, since improvements have been pushed to the kernel and Mesa that will benefit your system.
