Bug 104377 - GPU HANG: ecode 9:0:0x8fd8ffff, in ffmpeg [3458], reason: Hang on rcs0, action: reset
Status: CLOSED WORKSFORME
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
 
Reported: 2017-12-23 14:56 UTC by Erik Oomen
Modified: 2018-04-20 15:51 UTC

i915 platform: SKL
i915 features: GPU hang


Attachments
/sys/class/drm/card0/error (24.50 KB, text/plain)
2017-12-23 14:56 UTC, Erik Oomen
dmesg detailing hang (58.80 KB, text/x-log)
2017-12-23 17:34 UTC, Axel Fischer
GPU crash dump (58.29 KB, text/plain)
2017-12-23 17:37 UTC, Axel Fischer
/sys/class/drm/card0/error (24.37 KB, text/plain)
2017-12-26 20:05 UTC, Erik Oomen
Headless server dmesg (156.03 KB, text/plain)
2017-12-27 16:18 UTC, Erik Oomen
Headless server dmesg after ffmpeg run. (232.93 KB, text/plain)
2017-12-27 16:21 UTC, Erik Oomen
/sys/class/drm/card0/error (24.39 KB, text/plain)
2017-12-27 16:23 UTC, Erik Oomen
linux 4.15-rc6, /sys/class/drm/card0/error (18.95 KB, text/plain)
2018-01-05 09:58 UTC, Erik Oomen
linux 4.15-rc6 dmesg (207.41 KB, text/plain)
2018-01-05 10:01 UTC, Erik Oomen

Description Erik Oomen 2017-12-23 14:56:25 UTC
Created attachment 136376
/sys/class/drm/card0/error

The GPU hangs when I try to transcode a movie (HEVC and H.264 -> H.264) with ffmpeg 3.4.1 on a Debian 9 machine.

Happens with all 4.14 kernels.
Comment 1 Axel Fischer 2017-12-23 17:34:15 UTC
Same problem here.
GPU HANG: ecode 9:0:0x85dffffb, in X [3938], reason: Hang on rcs0, action: reset

In my case it is usually triggered by visiting Google Maps in Firefox.

dmesg and dump attached.
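
For anyone comparing hang signatures across reports, the summary line can be split into its parts mechanically. The following is a minimal sketch, not an official tool: the regular expression is derived only from the two hang lines quoted in this bug, and the field names (`gen`, `engine`, `code`) are labels assumed for illustration, not documented i915 terms.

```python
import re
from typing import NamedTuple, Optional

# Pattern derived from the hang lines quoted in this report.
# The group names are this sketch's own labels (assumptions).
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<gen>\d+):(?P<engine>[0-9a-fA-F]+):(?P<code>0x[0-9a-fA-F]+), "
    r"in (?P<process>.+?) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


class GpuHang(NamedTuple):
    gen: int        # first ecode field
    engine: str     # second ecode field, kept as text
    code: int       # third ecode field, parsed from hex
    process: str    # process name reported with the hang
    pid: int        # process id in square brackets
    reason: str     # e.g. "Hang on rcs0"
    action: str     # e.g. "reset"


def parse_gpu_hang(line: str) -> Optional[GpuHang]:
    """Return the parsed hang summary, or None if the line does not match."""
    m = HANG_RE.search(line)
    if m is None:
        return None
    return GpuHang(
        gen=int(m["gen"]),
        engine=m["engine"],
        code=int(m["code"], 16),
        process=m["process"],
        pid=int(m["pid"]),
        reason=m["reason"],
        action=m["action"],
    )


if __name__ == "__main__":
    sample = ("GPU HANG: ecode 9:0:0x85dffffb, in X [3938], "
              "reason: Hang on rcs0, action: reset")
    print(parse_gpu_hang(sample))
```

Comparing the `process` and `code` fields of two reports gives a quick first hint as to whether they describe the same hang; here one report names ffmpeg and the other X, with different codes.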
Comment 2 Axel Fischer 2017-12-23 17:34:59 UTC
Created attachment 136379
dmesg detailing hang
Comment 3 Axel Fischer 2017-12-23 17:37:23 UTC
Created attachment 136380
GPU crash dump
Comment 4 Axel Fischer 2017-12-23 18:38:45 UTC
In my case it is reproducible with kernels 4.12.14 and 4.14.8. I will try to find out which package update caused this problem.
Comment 5 Elizabeth 2017-12-26 16:09:22 UTC
Hello everyone. Just in case, could you try Mesa 17.3?
Comment 6 Erik Oomen 2017-12-26 20:04:42 UTC
Original bug reporter here.

I am not running Xorg; this is a headless server. However, I upgraded the Mesa libraries to 17.3.1-1 (Debian sid) and the kernel to 4.14.9.
No effect: ffmpeg still hangs.
Comment 7 Erik Oomen 2017-12-26 20:05:59 UTC
Created attachment 136391
/sys/class/drm/card0/error
Comment 8 Elizabeth 2017-12-27 15:53:27 UTC
(In reply to Axel Fischer from comment #4)
> In my case it is reproducible with kernels 4.12.14 and 4.14.8. I will try to
> find out which package update caused this problem.
That will be really helpful.

(In reply to Erik Oomen from comment #6)
> Original bug reporter here.
> 
> I am not running Xorg; this is a headless server. However, I upgraded the
> Mesa libraries to 17.3.1-1 (Debian sid) and the kernel to 4.14.9.
> No effect: ffmpeg still hangs.
Could you attach a dmesg or kern.log covering boot through the hang, with debug information enabled by adding the drm.debug=0xe kernel parameter in GRUB?

Thanks.
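
Before capturing the log, it can be worth confirming that the parameter actually reached the kernel. The sketch below is one assumed way to do that: `drm_debug_mask` is a hypothetical helper that scans a kernel command-line string for `drm.debug` and returns its numeric value. Returning the last occurrence when the option appears more than once is a choice made for this sketch, not a statement about how the kernel resolves duplicates.

```python
from typing import Optional


def drm_debug_mask(cmdline: str) -> Optional[int]:
    """Return the drm.debug value found in a kernel command line, or None."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == "drm.debug" and sep:
            # Base 0 accepts both hexadecimal ("0xe") and decimal ("14").
            value = int(val, 0)
    return value  # last occurrence, by this sketch's convention


if __name__ == "__main__":
    # Hypothetical command line used purely as an example.
    sample = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet drm.debug=0xe"
    mask = drm_debug_mask(sample)
    print("drm.debug not set" if mask is None else f"drm.debug = {mask:#x}")
```

On a running Linux system the string to check can be read from /proc/cmdline after rebooting with the new GRUB entry.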
Comment 9 Erik Oomen 2017-12-27 16:18:19 UTC
Created attachment 136405
Headless server dmesg

dmesg of headless server.
Comment 10 Erik Oomen 2017-12-27 16:21:37 UTC
Created attachment 136406
Headless server dmesg after ffmpeg run.

dmesg after ffmpeg was run.
Comment 11 Erik Oomen 2017-12-27 16:23:58 UTC
Created attachment 136407
/sys/class/drm/card0/error
Comment 12 Erik Oomen 2017-12-27 21:01:18 UTC
Update: The error is reproducible on a NUC 6i3SYB.
Comment 13 Axel Fischer 2017-12-28 11:32:39 UTC
(In reply to Elizabeth from comment #8)
> (In reply to Axel Fischer from comment #4)
> > In my case it is reproducible with kernels 4.12.14 and 4.14.8. I will try to
> > find out which package update caused this problem.
> That will be really helpful.
 
I have now tested this, and the problem is reproducible with Mesa versions 17.3.0 and 17.3.1. Downgrading Mesa to version 17.2.7 or any earlier version fixes the problem. Please let me know if I can provide additional information that would be helpful.
Comment 14 Elizabeth 2017-12-28 15:33:05 UTC
(In reply to Axel Fischer from comment #13)
> ... I tested it now and this problem is reproducible with mesa versions 17.3.0
> and 17.3.1. Downgrading mesa to version 17.2.7 or any other prior version
> fixes the problem. Please let me know if I can provide additional
> information that would be helpful.
These seem to be different problems. Erik, could you confirm whether downgrading Mesa fixes the issue for you? If not, a new bug will need to be filed with Axel's report so that the Mesa team can work on it.
Comment 15 Erik Oomen 2017-12-28 21:02:43 UTC
Downgrading did not help. I also tried kernels 4.12, 4.11 and 4.10 and downgraded the motherboard BIOS. No luck.

I've had VA-API working perfectly in this setup; the only things that changed were the addition of a second PCIe Ethernet card and regular Debian upgrades.
Comment 16 Axel Fischer 2017-12-29 06:46:44 UTC
(In reply to Elizabeth from comment #14)
> (In reply to Axel Fischer from comment #13)
> > ... I tested it now and this problem is reproducible with mesa versions 17.3.0
> > and 17.3.1. Downgrading mesa to version 17.2.7 or any other prior version
> > fixes the problem. Please let me know if I can provide additional
> > information that would be helpful.
> These seem to be different problems. Erik, could you confirm whether
> downgrading Mesa fixes the issue for you? If not, a new bug will need to be
> filed with Axel's report so that the Mesa team can work on it.

I created a new bug report for mesa (#104411).
Comment 17 Erik Oomen 2018-01-05 09:58:18 UTC
Created attachment 136565
linux 4.15-rc6, /sys/class/drm/card0/error
Comment 18 Erik Oomen 2018-01-05 10:01:13 UTC
Created attachment 136566
linux 4.15-rc6 dmesg
Comment 19 Erik Oomen 2018-01-05 11:15:18 UTC
OK, got it working.

On Debian:
I downgraded libva-dev and i965-va-driver to 1.7.3-2 (stable) and built ffmpeg from git.
Comment 20 Jani Saarinen 2018-03-29 07:11:36 UTC
First of all, sorry about the spam.
This is a mass update for our bugs.

Sorry if you find this annoying, but we are trying to understand whether this bug is still valid.
If the bug investigation is still in progress, please ignore this, and I apologize!

If you think this bug is no longer valid, please comment on the bug so that it can be closed.
If you haven't tested with our latest pre-upstream tree (drm-tip), could you also do that to see whether the issue is still present there? If you cannot reproduce the issue there, please comment on the bug.
Comment 21 Jani Saarinen 2018-04-20 15:51:48 UTC
Closing. Please re-open if this still occurs.