Bug 104802

Summary: GPU error lock system
Product: DRI
Reporter: Jean-Paul <jp.pozzi>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED DUPLICATE
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: critical
Priority: medium
CC: intel-gfx-bugs
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Whiteboard:
i915 platform:
i915 features:
Attachments:
Description: GPU crash dump
Flags: none

Description Jean-Paul 2018-01-26 11:15:58 UTC
Created attachment 136973
Gpu crash dump

Hello,

One of my systems was running a recent kernel (4.14); yesterday I installed the latest, 4.14.15.
During the night the machine had a GPU hang, which froze the system.
I have enclosed part of the "kern.log" file and the GPU crash dump.
My system:
Debian Stretch, up to date
CPU: Core i5
Memory: 16 GB
Disks: several (SSD and conventional hard drives).

Regards

JP P
Comment 1 Jean-Paul 2018-01-26 11:17:54 UTC
Part of syslog before the crash:
Jan 26 03:33:32 portail dhcpd[3712]: DHCPREQUEST for 192.168.2.50 from ac:3c:0b:a7:e8:72 via br1
Jan 26 03:33:32 portail dhcpd[3712]: DHCPACK on 192.168.2.50 to ac:3c:0b:a7:e8:72 via br1
Jan 26 03:35:00 portail kernel: [94002.775916] DMAR: DRHD: handling fault status reg 3
Jan 26 03:35:00 portail kernel: [94002.775920] DMAR: [DMA Read] Request device [00:02.0] fault addr 108000 [fault reason 05] PTE Write access is not set
Jan 26 03:35:00 portail kernel: [94002.775923] DMAR: DRHD: handling fault status reg 2
Jan 26 03:35:00 portail kernel: [94002.775924] DMAR: [DMA Read] Request device [00:02.0] fault addr 10c000 [fault reason 05] PTE Write access is not set
Jan 26 03:35:00 portail kernel: [94002.775926] DMAR: DRHD: handling fault status reg 2
Jan 26 03:35:00 portail kernel: [94002.775927] DMAR: [DMA Read] Request device [00:02.0] fault addr 10e000 [fault reason 05] PTE Write access is not set
Jan 26 03:35:00 portail kernel: [94002.775929] DMAR: DRHD: handling fault status reg 2
Jan 26 03:35:00 portail kernel: [94002.775931] DMAR: [DMA Read] Request device [00:02.0] fault addr 11f000 [fault reason 05] PTE Write access is not set
Jan 26 03:35:00 portail kernel: [94002.775933] DMAR: DRHD: handling fault status reg 2
Jan 26 03:35:00 portail kernel: [94002.775934] DMAR: [DMA Read] Request device [00:02.0] fault addr 114000 [fault reason 05] PTE Write access is not set
Jan 26 03:35:00 portail kernel: [94002.775936] DMAR: DRHD: handling fault status reg 2
Jan 26 03:35:00 portail kernel: [94002.775938] DMAR: [DMA Read] Request device [00:02.0] fault addr 113000 [fault reason 05] PTE Write access is not set
Jan 26 03:35:00 portail kernel: [94002.775941] DMAR: DRHD: handling fault status reg 3
Jan 26 03:35:00 portail kernel: [94002.775943] DMAR: [DMA Read] Request device [00:02.0] fault addr 119000 [fault reason 05] PTE Write access is not set
Jan 26 03:35:00 portail kernel: [94002.775944] DMAR: DRHD: handling fault status reg 2
Jan 26 03:35:00 portail kernel: [94002.775945] DMAR: [DMA Read] Request device [00:02.0] fault addr 121000 [fault reason 05] PTE Write access is not set
Jan 26 03:35:00 portail kernel: [94002.775947] DMAR: DRHD: handling fault status reg 2
Jan 26 03:35:00 portail kernel: [94002.775948] DMAR: [DMA Read] Request device [00:02.0] fault addr 122000 [fault reason 05] PTE Write access is not set
Jan 26 03:35:00 portail kernel: [94002.775949] DMAR: DRHD: handling fault status reg 2
Jan 26 03:35:00 portail kernel: [94002.775951] DMAR: [DMA Read] Request device [00:02.0] fault addr 12c000 [fault reason 05] PTE Write access is not set
Jan 26 03:35:01 portail CRON[13333]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Jan 26 03:35:11 portail kernel: [94013.635132] [drm] GPU HANG: ecode 8:0:0x85dffffb, in Xorg [2791], reason: Hang on rcs0, action: reset
Jan 26 03:35:11 portail kernel: [94013.635134] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
Jan 26 03:35:11 portail kernel: [94013.635134] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
Jan 26 03:35:11 portail kernel: [94013.635134] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
Jan 26 03:35:11 portail kernel: [94013.635135] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
Jan 26 03:35:11 portail kernel: [94013.635135] [drm] GPU crash dump saved to /sys/class/drm/card0/error
Jan 26 03:35:11 portail kernel: [94013.635138] i915 0000:00:02.0: Resetting rcs0 after gpu hang
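
For anyone triaging a similar log, the excerpt above can be summarized mechanically: it contains ten DMAR faults from device 00:02.0 (the i915 device named in the reset message) roughly eleven seconds before the GPU hang. The following is a minimal sketch in Python, not part of the original report. The regular expressions are modelled only on the message wording quoted above, so other kernel versions may need different patterns, and the names `DMAR_FAULT`, `GPU_HANG`, `summarize`, and `SAMPLE` are hypothetical.

```python
import re

# Patterns modelled on the kernel messages quoted in this report (assumption:
# other kernel versions may word these lines differently).
DMAR_FAULT = re.compile(
    r"DMAR: \[DMA (?P<op>Read|Write)\] Request device \[(?P<dev>[0-9a-f:.]+)\] "
    r"fault addr (?P<addr>[0-9a-f]+) \[fault reason (?P<reason>\d+)\]"
)
GPU_HANG = re.compile(
    r"\[drm\] GPU HANG: ecode (?P<ecode>\S+), in (?P<proc>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def summarize(lines):
    """Return (faults, hangs) extracted from an iterable of syslog lines.

    faults: list of (device, operation, fault address as int, fault reason as int)
    hangs:  list of dicts with keys ecode, proc, pid, reason, action (all strings)
    """
    faults = []
    hangs = []
    for line in lines:
        m = DMAR_FAULT.search(line)
        if m:
            faults.append((m["dev"], m["op"], int(m["addr"], 16), int(m["reason"])))
            continue
        m = GPU_HANG.search(line)
        if m:
            hangs.append(m.groupdict())
    return faults, hangs


# Three lines copied from the log excerpt above.
SAMPLE = [
    "Jan 26 03:35:00 portail kernel: [94002.775920] DMAR: [DMA Read] Request device [00:02.0] fault addr 108000 [fault reason 05] PTE Write access is not set",
    "Jan 26 03:35:00 portail kernel: [94002.775924] DMAR: [DMA Read] Request device [00:02.0] fault addr 10c000 [fault reason 05] PTE Write access is not set",
    "Jan 26 03:35:11 portail kernel: [94013.635132] [drm] GPU HANG: ecode 8:0:0x85dffffb, in Xorg [2791], reason: Hang on rcs0, action: reset",
]

faults, hangs = summarize(SAMPLE)
for dev, op, addr, reason in faults:
    print(f"DMAR fault: device {dev}, DMA {op}, addr {addr:#x}, reason {reason}")
for hang in hangs:
    print(f"GPU hang: ecode {hang['ecode']} in {hang['proc']} [{hang['pid']}], {hang['reason']}")
```

To run it against a whole log, pass an open file instead of `SAMPLE`, for example `summarize(open("/var/log/kern.log"))`; that path is an assumption based on the "kern.log" file mentioned in the description.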
Comment 2 Chris Wilson 2018-01-26 12:22:51 UTC

*** This bug has been marked as a duplicate of bug 89360 ***
