Bug 101422 - [SKL] GPU HANG: ecode 9:1:0xcab6fff5, in queue_encoding: [1120], reason: Ring hung, action: reset
Summary: [SKL] GPU HANG: ecode 9:1:0xcab6fff5, in queue_encoding: [1120], reason: Ring...
Status: CLOSED WORKSFORME
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: x86-64 (AMD64) Linux (All)
Importance: high major
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2017-06-14 12:24 UTC by Veeranna Tadala
Modified: 2018-04-20 14:14 UTC

See Also:
i915 platform: SKL
i915 features: GPU hang


Attachments
gpu crash dump (2.97 MB, text/plain)
2017-06-14 12:24 UTC, Veeranna Tadala
Full dmesg (41.10 KB, text/plain)
2017-06-26 04:43 UTC, Veeranna Tadala

Description Veeranna Tadala 2017-06-14 12:24:52 UTC
Created attachment 131953
gpu crash dump

Hi,

Observed this on an Intel Skylake platform while running multiple GStreamer pipelines that include decode, encode and display.

Linux image: custom, built using Yocto
Yocto version: krogoth
Kernel version: 4.4
Hardware details: Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz

More information on the observed hang:

Jun 13 06:32:04 64006A23AEE9 user.info kernel: [drm] stuck on bsd ring
Jun 13 06:32:04 64006A23AEE9 user.info kernel: [drm] GPU HANG: ecode 9:1:0xcab6fff5, in queue_encoding: [1120], reason: Ring hung, action: reset
Jun 13 06:32:04 64006A23AEE9 user.info kernel: [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Jun 13 06:32:04 64006A23AEE9 user.info kernel: [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Jun 13 06:32:04 64006A23AEE9 user.info kernel: [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Jun 13 06:32:04 64006A23AEE9 user.info kernel: [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Jun 13 06:32:04 64006A23AEE9 user.info kernel: [drm] GPU crash dump saved to /sys/class/drm/card0/error
Jun 13 06:32:04 64006A23AEE9 user.notice kernel: drm/i915: Resetting chip after gpu hang
Jun 13 06:32:04 64006A23AEE9 user.err kernel: [drm:i915_gem_init_hw] *ERROR* Failed to initialize GuC, error -5 (ignored)
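
For readers triaging similar reports, the "GPU HANG" line above can be split into its fields with a short script. This is a minimal sketch, not part of the original report. It assumes the ecode triple is formatted as generation, ring id, and hash, which matches i915 of this era as far as I know; reading ring id 1 as the BSD (video) ring is likewise an assumption, though it is consistent with the "stuck on bsd ring" line. The function name parse_gpu_hang is hypothetical.

```shell
#!/bin/sh
# Sketch: split an i915 "GPU HANG" log line into labelled fields.
# Assumption: ecode is <gen>:<ring id>:<hash>; parse_gpu_hang is a made-up name.
parse_gpu_hang() {
    # $1 = one log line; prints nothing if the line does not match
    printf '%s\n' "$1" | sed -n \
        's/.*GPU HANG: ecode \([0-9]*\):\([0-9]*\):\(0x[0-9a-fA-F]*\), in \(.*\) \[\([0-9]*\)], reason: \(.*\), action: \(.*\)$/gen=\1 ring=\2 ecode=\3 comm=\4 pid=\5 reason="\6" action=\7/p'
}

line='[drm] GPU HANG: ecode 9:1:0xcab6fff5, in queue_encoding: [1120], reason: Ring hung, action: reset'
parse_gpu_hang "$line"
# prints: gen=9 ring=1 ecode=0xcab6fff5 comm=queue_encoding: pid=1120 reason="Ring hung" action=reset
```

Here gen=9 agrees with the SKL platform tag, and the comm field is printed exactly as the kernel logged it, trailing colon included.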

GPU crash dump also attached here.
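
A dump like the attached one can be copied out of the sysfs path named in the log above. The following is a rough sketch under stated assumptions: the helper name save_gpu_error and the default output file name are hypothetical, and reading the sysfs file typically needs root.

```shell
#!/bin/sh
# Sketch: copy the i915 error state to a regular file for attaching to a bug.
# save_gpu_error is a hypothetical helper name; the default source path is the
# one printed in the kernel log, the default destination name is arbitrary.
save_gpu_error() {
    src=${1:-/sys/class/drm/card0/error}
    dst=${2:-gpu_crash_dump.txt}
    if [ ! -r "$src" ]; then
        echo "cannot read $src (not present, or insufficient permissions)" >&2
        return 1
    fi
    cat "$src" > "$dst"
}

# Usage, typically as root, right after the hang is logged:
#   save_gpu_error
#   save_gpu_error /sys/class/drm/card0/error /tmp/gpu_crash_dump.txt
```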

Please let me know if you need more information. Thanks in advance.
Comment 1 Elizabeth 2017-06-14 16:14:28 UTC
Hello Veeranna, is this problem 100% reproducible, i.e. can you always replicate it? Is there any special step or sequence that makes it happen? Also, could you please boot with the drm.debug=0xe and quiet splash parameters set in GRUB and provide the full dmesg? Thank you.
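One way to confirm that the requested parameter actually took effect before collecting the log is sketched below. It is only an illustration: has_param and save_debug_dmesg are hypothetical helper names, the output file name is arbitrary, and how the parameter gets onto the kernel command line depends on the bootloader setup of the image in question.

```shell
#!/bin/sh
# Sketch only. has_param and save_debug_dmesg are hypothetical helper names.

# Succeeds if the exact token $2 appears in the command-line string $1.
has_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Saves the kernel log to $1 (default dmesg-full.txt), but only if
# drm.debug=0xe is active on the running kernel.
save_debug_dmesg() {
    out=${1:-dmesg-full.txt}
    if has_param "$(cat /proc/cmdline)" "drm.debug=0xe"; then
        dmesg > "$out"
    else
        echo "drm.debug=0xe is not on the kernel command line" >&2
        return 1
    fi
}

# Usage after rebooting with the parameter set (dmesg may need root):
#   save_debug_dmesg dmesg-full.txt
```

The same has_param check can be reused for other boot parameters by changing the second argument.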
Comment 2 Elizabeth 2017-06-14 19:06:38 UTC
I forgot to mention: once you provide more information, please change the tag from "NEEDINFO" o "REOPEN". Thank you.
Comment 3 Elizabeth 2017-06-14 19:16:32 UTC
(In reply to elizabethx.de.la.torre.mena from comment #2)
> I forgot to mention: once you provide more information, please change the tag
> from "NEEDINFO" o "REOPEN". Thank you.

My error, sorry: from "NEEDINFO" to "REOPEN".
Comment 4 Veeranna Tadala 2017-06-26 04:43:01 UTC
Created attachment 132243
Full dmesg
Comment 5 Veeranna Tadala 2017-06-26 04:45:07 UTC
Yes, it's 100% reproducible with the command-line pipeline below; there is no need to follow any specific steps.

gst-launch-1.0 rtspsrc location=rtsp://192.168.7.108:8556/cv ! rtph264depay ! tee name=t1 ! h264parse ! 'video/x-h264, stream-format=byte-stream' ! tee name=t4 t4. ! queue ! appsink t4. ! queue ! appsink t1. ! queue ! h264parse ! 'video/x-h264, stream-format=byte-stream' ! vaapidecodebin ! tee name=t2 t2. ! queue ! vaapipostproc scale-method=2 width=640 height=360 deinterlace-mode=2 ! appsink t2. ! queue ! videorate ! 'video/x-raw, format=NV12, framerate=10/1' ! vaapipostproc scale-method=2 deinterlace-mode=2 ! vaapih264enc init-qp=20 keyframe-period=10 ! 'video/x-h264, stream-format=byte-stream' ! tee name=t3 t3. ! queue ! appsink t3. ! queue ! appsink t2. ! queue ! vaapipostproc scale-method=2 deinterlace-mode=2 ! vaapisink

Attached full dmesg dump with drm.debug=0xe and quiet splash parameters.
Comment 6 Veeranna Tadala 2017-06-26 04:48:17 UTC
(In reply to Veeranna Tadala from comment #5)
> Yes its 100% reproducible with below command line pipeline and no need to
> follow any specific steps.
> 
> gst-launch-1.0 rtspsrc location=rtsp://192.168.7.108:8556/cv ! rtph264depay
> ! tee name=t1 ! h264parse ! 'video/x-h264, stream-format=byte-stream' ! tee
> name=t4 t4. ! queue ! appsink t4. ! queue ! appsink t1. ! queue ! h264parse
> ! 'video/x-h264, stream-format=byte-stream' ! vaapidecodebin ! tee name=t2
> t2. ! queue ! vaapipostproc scale-method=2 width=640 height=360
> deinterlace-mode=2 ! appsink t2. ! queue ! videorate ! 'video/x-raw,
> format=NV12, framerate=10/1' ! vaapipostproc scale-method=2
> deinterlace-mode=2 ! vaapih264enc init-qp=20 keyframe-period=10 !
> 'video/x-h264, stream-format=byte-stream' ! tee name=t3 t3. ! queue !
> appsink t3. ! queue ! appsink t2. ! queue ! vaapipostproc scale-method=2
> deinterlace-mode=2 ! vaapisink
> 
> Attached full dmesg dump with drm.debug=0xe and quiet splash parameters.

I am using gstreamer-vaapi-1.9.90 and libva-1.7.2.
Comment 7 Elizabeth 2017-06-26 21:06:52 UTC
Adding the ReadyForDev tag to the "Whiteboard" field:
*Status is correct
*Platform is included
*Feature is included
*Priority and Severity correctly set
*Logs included
Comment 8 Elizabeth 2017-10-20 17:42:49 UTC
IOMMU enabled?: -1
Could you please try intel_iommu=igfx_off in GRUB?
It hung on the bsd ring. Have you tried the drm-tip branch (https://cgit.freedesktop.org/drm-tip) or the latest mainline (https://www.kernel.org)?
Comment 9 Elizabeth 2018-01-25 22:38:46 UTC
Any luck with intel_iommu=igfx_off??
Comment 10 Jani Saarinen 2018-03-29 07:10:27 UTC
First of all, sorry about the spam.
This is a mass update for our bugs.

Sorry if you find this annoying, but with this we are trying to understand whether the bug is still valid.
If the bug investigation is still in progress, please ignore this, and I apologize!

If you think this bug is no longer valid, please comment on the bug so that it can be closed.
If you haven't tested with our latest pre-upstream tree (drm-tip), can you do that as well to see whether the issue is still valid there? If you cannot see the issue there, please comment on the bug.
Comment 11 Jani Saarinen 2018-04-20 14:14:47 UTC
Closing; please re-open if the issue still occurs.

