Bug 105778 - [CI] igt@gem_ctx_switch@basic-all-heavy - Failed assertion: !"GPU hung"
Summary: [CI] igt@gem_ctx_switch@basic-all-heavy - Failed assertion: !"GPU hung"
Status: CLOSED WORKSFORME
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2018-03-28 06:18 UTC by Marta Löfstedt
Modified: 2018-06-19 14:25 UTC

See Also:
i915 platform: BXT
i915 features: GPU hang


Description Marta Löfstedt 2018-03-28 06:18:34 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3988/shard-apl3/igt@gem_ctx_switch@basic-all-heavy.html

(gem_ctx_switch:1516) igt_aux-CRITICAL: Test assertion failure function sig_abort, file ../lib/igt_aux.c:481:
(gem_ctx_switch:1516) igt_aux-CRITICAL: Failed assertion: !"GPU hung"
Subtest basic-all-heavy failed.
Comment 1 Chris Wilson 2018-03-28 07:55:00 UTC
Looks like a false proclamation from hangcheck.

<7>[  258.560579] [IGT] gem_ctx_switch: starting subtest basic-all-heavy
<6>[  260.835639] [drm] GPU HANG: ecode 9:1:0xfefffffe, reason: no progress on bcs0, action: reset
<6>[  260.835921] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
<6>[  260.835925] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
<6>[  260.835929] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
<6>[  260.835932] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
<6>[  260.835937] [drm] GPU crash dump saved to /sys/class/drm/card0/error
<7>[  260.837739] [drm:i915_reset_device [i915]] resetting chip
<5>[  260.840419] i915 0000:00:02.0: Resetting chip for no progress on bcs0
<7>[  260.843636] [drm:i915_gem_reset_engine [i915]] bcs0 pardoned
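
When triaging logs like the one above, it can help to pull the fields out of the "GPU HANG" summary line programmatically. The following is only a minimal sketch: the regular expression is derived from the log lines quoted in this report, not from the i915 source, and the names `HANG_RE`, `parse_hang`, and the dictionary keys are hypothetical labels chosen for illustration. It assumes the process name, when present, contains no spaces.

```python
import re

# Pattern inferred from the "GPU HANG" lines quoted in this bug report.
# Group names are labels chosen for this sketch, not kernel terminology.
HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>\d+:\d+:0x[0-9a-fA-F]+)"
    r"(?:, in (?P<process>\S+) \[(?P<pid>\d+)\])?"   # only present in some logs
    r", reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def parse_hang(line):
    """Return a dict of fields from a GPU HANG log line, or None if no match."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Assumption: the reason text ends with "on <engine>", as in the logs here.
    engine = re.search(r"\bon (\w+)$", fields["reason"])
    fields["engine"] = engine.group(1) if engine else None
    return fields


if __name__ == "__main__":
    sample = (
        "<6>[  260.835639] [drm] GPU HANG: ecode 9:1:0xfefffffe, "
        "reason: no progress on bcs0, action: reset"
    )
    print(parse_hang(sample))
```

Lines that carry a process name and PID, or that lack them, are both accepted, so the same helper can be used to compare a CI hang against a user-reported one.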
Comment 2 Marta Löfstedt 2018-04-18 06:37:22 UTC
Since this has only been seen once, I am lowering the priority to medium.
Comment 3 kgizdov 2018-05-15 21:35:54 UTC
May 15 23:01:13 kernel: [drm] GPU HANG: ecode 9:0:0xfefffffe, in gnome-shell [907], reason: Hang on rcs0, action: reset
May 15 23:01:13 kernel: [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
May 15 23:01:13 kernel: [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
May 15 23:01:13 kernel: [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
May 15 23:01:13 kernel: [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
May 15 23:01:13 kernel: [drm] GPU crash dump saved to /sys/class/drm/card0/error

I see this error on 75% of my boots like clockwork, and it completely stalls the machine. The only solution is to reboot until it works. I am on a Dell XPS 13 9350 with the latest BIOS, 1.7.0. This is probably very similar to https://bugs.freedesktop.org/show_bug.cgi?id=104522
Comment 4 Martin Peres 2018-06-15 08:22:53 UTC
(In reply to kgizdov from comment #3)

Hello, sorry, I only just saw your comment!

In any case, I am not aware of issues like this, especially with such a high reproduction rate! Please file a new bug, as this one exists only to keep track of CI failures :s
Comment 5 Martin Peres 2018-06-15 08:23:29 UTC
Last seen: CI_DRM_3988 (2 months, 2 weeks / 1123 runs ago)
Comment 6 Jani Saarinen 2018-06-19 14:25:50 UTC
Closing, thanks.

