Bug 104793 - GPU CRASH - GPU HANG: ecode 9:1:0xfffffffe, reason: Hang on bcs0
Summary: GPU CRASH - GPU HANG: ecode 9:1:0xfffffffe, reason: Hang on bcs0
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-01-25 20:22 UTC by Serge Pouliquen
Modified: 2018-03-02 16:06 UTC

See Also:
i915 platform:
i915 features:


Attachments
/sys/class/drm/card0/error (16.00 KB, text/plain)
2018-01-25 20:22 UTC, Serge Pouliquen

Description Serge Pouliquen 2018-01-25 20:22:38 UTC
Created attachment 136964
/sys/class/drm/card0/error

Hi,

I'm using Debian stable (with the kernel from backports).
I hit a GPU crash, and the syslog message invited me to file a bug.

Kernel: 4.14.13 (from Debian)

> uname -a
Linux lemon 4.14.0-0.bpo.3-amd64 #1 SMP Debian 4.14.13-1~bpo9+1 (2018-01-14) x86_64 GNU/Linux

Hardware: Gigabyte GA-Z270X-UD5 + Intel i5-7600 (Kaby Lake) + 4K monitor BDM3275

Debian is up to date.

I have no idea what the cause is, because the crash happened while the screensaver was running.
I cannot reproduce it: I don't know how to trigger the crash (it looks random and rare; this is the first time this week).
I have already filed a bug about a crash during the screensaver.

Regards,
Serge

syslog extract:
Jan 25 19:53:03 lemon kernel: [25129.965380] [drm] GPU HANG: ecode 9:1:0xfffffffe, reason: Hang on bcs0, action: reset
Jan 25 19:53:03 lemon kernel: [25129.965380] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Jan 25 19:53:03 lemon kernel: [25129.965381] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Jan 25 19:53:03 lemon kernel: [25129.965381] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Jan 25 19:53:03 lemon kernel: [25129.965381] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Jan 25 19:53:03 lemon kernel: [25129.965382] [drm] GPU crash dump saved to /sys/class/drm/card0/error
Jan 25 19:53:03 lemon kernel: [25129.965479] i915 0000:00:02.0: Resetting bcs0 after gpu hang
Jan 25 19:53:07 lemon kernel: [25133.960270] i915 0000:00:02.0: Resetting rcs0 after gpu hang
Jan 25 19:53:19 lemon kernel: [25145.960279] i915 0000:00:02.0: Resetting rcs0 after gpu hang
Jan 25 19:53:27 lemon kernel: [25153.960375] i915 0000:00:02.0: Resetting rcs0 after gpu hang
Jan 25 19:53:37 lemon kernel: [25163.944460] i915 0000:00:02.0: Resetting rcs0 after gpu hang
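
For triage, the syslog extract above can be summarized mechanically. The following is only a minimal sketch: the function name `summarize_hangs` and the `SAMPLE_LOG` list are hypothetical, and the regular expressions assume the message layout seen in this extract ("GPU HANG: ecode ..., reason: ..., action: ..." and "Resetting <engine> after gpu hang"), which may differ on other kernel versions.

```python
import re
from collections import Counter

# Patterns assume the message layout seen in this report's syslog extract.
HANG_RE = re.compile(
    r"GPU HANG: ecode ([0-9]+:[0-9]+:0x[0-9a-fA-F]+), reason: (.+?), action: (\w+)"
)
RESET_RE = re.compile(r"Resetting (\w+) after gpu hang")

# A few lines copied from the extract (hypothetical variable name).
SAMPLE_LOG = [
    "Jan 25 19:53:03 lemon kernel: [25129.965380] [drm] GPU HANG: ecode 9:1:0xfffffffe, reason: Hang on bcs0, action: reset",
    "Jan 25 19:53:03 lemon kernel: [25129.965479] i915 0000:00:02.0: Resetting bcs0 after gpu hang",
    "Jan 25 19:53:07 lemon kernel: [25133.960270] i915 0000:00:02.0: Resetting rcs0 after gpu hang",
    "Jan 25 19:53:19 lemon kernel: [25145.960279] i915 0000:00:02.0: Resetting rcs0 after gpu hang",
    "Jan 25 19:53:27 lemon kernel: [25153.960375] i915 0000:00:02.0: Resetting rcs0 after gpu hang",
    "Jan 25 19:53:37 lemon kernel: [25163.944460] i915 0000:00:02.0: Resetting rcs0 after gpu hang",
]


def summarize_hangs(lines):
    """Return (list of hang records, Counter of engine resets)."""
    hangs = []
    resets = Counter()
    for line in lines:
        hang = HANG_RE.search(line)
        if hang:
            hangs.append(
                {"ecode": hang.group(1), "reason": hang.group(2), "action": hang.group(3)}
            )
            continue
        reset = RESET_RE.search(line)
        if reset:
            resets[reset.group(1)] += 1
    return hangs, resets


if __name__ == "__main__":
    hangs, resets = summarize_hangs(SAMPLE_LOG)
    for hang in hangs:
        print(hang)
    for engine, count in sorted(resets.items()):
        print(engine, count)
```

On this extract the summary separates the single reported hang on bcs0 from the repeated rcs0 resets that follow it.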

Comment 1 Chris Wilson 2018-01-25 20:37:57 UTC
It's an issue in the DMC firmware.

commit 4f0aa1fa3e3849caee450ee5d14fcc289cf16703
Author: Anusha Srivatsa <anusha.srivatsa@intel.com>
Date:   Thu Nov 9 10:51:43 2017 -0800

    drm/i915/dmc: DMC 1.04 for Kabylake
    
    There is a new version of DMC available for KBL.
    
    The release notes mentions:
    1. Fix for the issue where DC_STATE was getting enabled even
    when disabled by driver causing data corruption.
    
    v2: Remove pull request from commit message (Rodrigo).
    
    Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
    Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Signed-off-by: Jani Nikula <jani.nikula@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/1510253503-12634-1-git-send-email-anusha.srivatsa@intel.com
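
To see whether a given boot already runs the DMC version named in the commit above, one option is to look for the firmware load message in the kernel log. This is a sketch under assumptions: it assumes the i915 driver logs a line of the form "Finished loading DMC firmware <path> (v<major>.<minor>)", the helper names `dmc_version` and `has_dmc_at_least` are hypothetical, and the sample log lines are invented for illustration rather than taken from this report.

```python
import re

# Assumed log format; verify against the kernel log of the machine in question.
DMC_RE = re.compile(r"Finished loading DMC firmware (\S+) \(v(\d+)\.(\d+)\)")


def dmc_version(log_lines):
    """Return (firmware_path, (major, minor)) for the first DMC load line, or None."""
    for line in log_lines:
        match = DMC_RE.search(line)
        if match:
            return match.group(1), (int(match.group(2)), int(match.group(3)))
    return None


def has_dmc_at_least(log_lines, minimum=(1, 4)):
    """True if a DMC load line is present and its version is >= minimum."""
    found = dmc_version(log_lines)
    return found is not None and found[1] >= minimum


if __name__ == "__main__":
    # Hypothetical sample lines, not from the reporter's system.
    old_boot = ["[    2.100000] [drm] Finished loading DMC firmware i915/kbl_dmc_ver1_01.bin (v1.1)"]
    new_boot = ["[    2.100000] [drm] Finished loading DMC firmware i915/kbl_dmc_ver1_04.bin (v1.4)"]
    print(dmc_version(old_boot), has_dmc_at_least(old_boot))
    print(dmc_version(new_boot), has_dmc_at_least(new_boot))
```

Comparing versions as integer tuples avoids the string-comparison pitfall where "1.10" would sort before "1.4".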

