Bug 104446 - GPU HANG: ecode 9:0:0x00000000, reason: No progress on rcs0, bcs0, vcs0, vecs0, action: reset
Status: CLOSED WORKSFORME
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-01-02 08:10 UTC by rvprasad
Modified: 2018-03-30 06:52 UTC

See Also:
i915 platform:
i915 features: GPU hang


Attachments
Contents of /sys/class/drm/card0/error file associated with GPU HANG (11.27 KB, text/plain)
2018-01-04 04:02 UTC, rvprasad

Description rvprasad 2018-01-02 08:10:28 UTC
I have PopOS (Ubuntu artful) installed with kernel 4.14.0-041400-generic #201711122031 on a System76 laptop with an Intel i915 integrated GPU (519b) and an NVIDIA 1070 (driver v387.26).  When I enable the "MSHYBRID" option in the BIOS, the screen goes blank during boot and stays blank, although the system finishes booting and is usable without the display (I can ssh into it).  When I enable the "DISCRETE" option in the BIOS, the system boots with the display driven by the NVIDIA card, but the Intel GPU is not visible in tools such as lspci.

The observed behavior is the same
 - with the stock 4.13.0 kernel in PopOS
 - after installing intel-microcode_3.20171117.1_amd64.deb

dmesg snippet:

[drm] GPU HANG: ecode 9:0:0x00000000, reason: No progress on rcs0, bcs0, vcs0, vecs0, action: reset
[drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
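
For anyone triaging from logs, the hang line in the snippet above can be split into its fields mechanically. The following is a minimal Python sketch, not part of any i915 tooling: the function name `parse_hang_line` and the keys of the returned dictionary are hypothetical names chosen here, and the pattern is written only against the message format shown in this report. It does not try to interpret what the three ecode fields mean.

```python
import re

# Matches the hang message exactly as it appears in this report's dmesg snippet.
HANG_RE = re.compile(
    r"GPU HANG: ecode (\d+):(-?\d+):(0x[0-9a-fA-F]+)"
    r", reason: (?P<reason>.*?), action: (?P<action>\w+)"
)


def parse_hang_line(line):
    """Return the fields of a 'GPU HANG' dmesg line, or None if it does not match."""
    m = HANG_RE.search(line)
    if m is None:
        return None
    reason = m.group("reason")
    return {
        # The three colon-separated ecode fields, kept as plain integers.
        "ecode": (int(m.group(1)), int(m.group(2)), int(m.group(3), 16)),
        "reason": reason,
        # Engine names such as rcs0 or vecs0: lowercase letters followed by digits.
        "engines": re.findall(r"\b[a-z]+\d+\b", reason),
        "action": m.group("action"),
    }


if __name__ == "__main__":
    sample = ("[drm] GPU HANG: ecode 9:0:0x00000000, reason: No progress on "
              "rcs0, bcs0, vcs0, vecs0, action: reset")
    print(parse_hang_line(sample))
```

Collecting the stalled engines as a list makes it easy to compare hang reports across kernels, for example to check whether all four engines stall every time.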

Since the display was off, I had to reboot the laptop, and afterwards I could not find the /sys/class/drm/card0/error file.
Comment 1 Chris Wilson 2018-01-02 08:50:10 UTC
As you've worked out, the error state is only kept in memory until the next reboot. Any chance you can log in remotely to obtain it and the dmesg? It sounds like very early bring-up fails. Can you also install a drm-tip kernel (https://cgit.freedesktop.org/drm-tip)? In the worst case we may have to do some exploratory patching to see why it fails.
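
Because the error state is lost on reboot, it has to be copied off while the machine is still up, for example from an ssh session. Below is a minimal Python sketch of that step. The function name `save_error_state` and the output file naming are hypothetical, and the check for a "no error state collected" placeholder is an assumption about what the file contains when nothing was captured. Reading the sysfs file normally requires root.

```python
import time
from pathlib import Path


def save_error_state(src="/sys/class/drm/card0/error", dest_dir="."):
    """Copy the i915 error state to a timestamped file.

    Returns the path of the copy, or None if there is nothing to save.
    """
    src = Path(src)
    if not src.exists():
        return None
    data = src.read_bytes()
    # Assumption: an empty file or a "no error state collected" placeholder
    # means that no hang dump is currently held in memory.
    if not data.strip() or b"no error state collected" in data.lower():
        return None
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(dest_dir) / f"i915-error-{stamp}.txt"
    dest.write_bytes(data)
    return dest


if __name__ == "__main__":
    saved = save_error_state()
    print(saved if saved else "no error state available")
```

Running this as root over ssh right after the hang, and saving the output of dmesg alongside it, gives the two files requested here without needing the local display.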
Comment 2 Elizabeth 2018-01-02 21:05:20 UTC
Adjusting priority while more information for triaging is collected.
Comment 3 rvprasad 2018-01-02 22:42:39 UTC
Based on feedback from folks at System76, I added

GRUB_GFXPAYLOAD_LINUX=1920*1080

to grub and I did not encounter the blank screen.  Would you still want the error log?
Comment 4 Elizabeth 2018-01-03 20:53:23 UTC
If possible, yes, please share it. Thank you.
Comment 5 rvprasad 2018-01-04 04:02:46 UTC
Created attachment 136542
Contents of /sys/class/drm/card0/error file associated with GPU HANG

1) Unlike when I first encountered the bug, I did not see the following messages when I reproduced it (and captured the error log) on 4.13.0-21-generic.

[drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.

2) When I collected the above error, all of the following were true.
  a) GRUB_GFXPAYLOAD_LINUX was not specified
  b) No external monitor cable was plugged into the laptop
  c) "Intel" GPU was selected in Nvidia X Server Settings

3) There was no GPU HANG message at boot time in configurations a && !b, !a && b, and !a && !b.  IIRC, content was displayed on the integrated display when GRUB_GFXPAYLOAD_LINUX=1920*1080.

4) System76 mentioned that the HDMI port is not connected to the iGPU on my Oryx Pro laptop.  [xrandr does not list the HDMI port if "Intel" GPU was selected.]  (I am interested in a software-based solution to verify HDMI-GPU connectivity.)
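
One software-side way to approach the connectivity question in item 4 is to look at which connectors each DRM device exposes in sysfs. The kernel lists connectors as directories named like card0-eDP-1 or card0-HDMI-A-1 under /sys/class/drm, each with a status file, and the device/driver symlink of the card names the kernel driver. The Python sketch below reads that layout; the function name `list_connectors` is hypothetical. Note the limits: this only shows what each driver advertises, not the physical wiring, and the proprietary NVIDIA driver may not expose its connectors here unless its kernel mode setting support is enabled.

```python
import os
import re

# Connector directories look like "card0-HDMI-A-1"; plain "card0" and
# "renderD128" entries are skipped.
CONNECTOR_RE = re.compile(r"^(card\d+)-(.+)$")


def list_connectors(drm_root="/sys/class/drm"):
    """Return (connector, card, driver, status) tuples for every DRM connector."""
    found = []
    for entry in sorted(os.listdir(drm_root)):
        m = CONNECTOR_RE.match(entry)
        if m is None:
            continue
        card, connector = m.groups()
        # The card's device/driver symlink points at the bound kernel driver.
        link = os.path.join(drm_root, card, "device", "driver")
        if os.path.exists(link):
            driver = os.path.basename(os.path.realpath(link))
        else:
            driver = "unknown"
        try:
            with open(os.path.join(drm_root, entry, "status")) as f:
                status = f.read().strip()
        except OSError:
            status = "unknown"
        found.append((connector, card, driver, status))
    return found


if __name__ == "__main__":
    for connector, card, driver, status in list_connectors():
        print(f"{connector:12} {card:6} {driver:10} {status}")
```

If no HDMI connector appears under the card bound to i915, that is consistent with what System76 reported, though it does not prove it on its own.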
Comment 6 Jani Saarinen 2018-03-29 07:11:48 UTC
First of all, sorry about the spam.
This is a mass update for our bugs.

Sorry if you find this annoying, but we are trying to understand whether this bug is still valid.
If the bug investigation is still in progress, please ignore this, and I apologize!

If you think the bug is no longer valid, please comment that it can be closed.
If you haven't tested with our latest pre-upstream tree (drm-tip), can you do that as well to see whether the issue is still present there? If you cannot reproduce it there, please comment on the bug.
Comment 7 rvprasad 2018-03-29 16:12:20 UTC
(In reply to Jani Saarinen from comment #6)

I have not encountered the same issue again.  I guess it is due to the Intel Graphics not being connected to the display port.  So, I am not sure if this is a software issue.  Does this help?
Comment 8 Jani Saarinen 2018-03-30 06:52:31 UTC
OK, thanks for the feedback, resolving.
If you see the issue again, please reopen the bug.

