Bug 90623 - Intel GPU hangs on resume from suspend - screen freezes, must hard reboot
Summary: Intel GPU hangs on resume from suspend - screen freezes, must hard reboot
Status: CLOSED WONTFIX
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other other
Importance: medium major
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
Reported: 2015-05-24 21:22 UTC by Rahul
Modified: 2017-03-03 18:07 UTC

i915 platform: BDW
i915 features: power/suspend-resume


Description Rahul 2015-05-24 21:22:14 UTC
Possible duplicate bug: https://bugs.launchpad.net/elementaryos/+bug/1404826, although my issue has nothing to do with Firefox. The problem occurs more frequently when I manually suspend the session and then try to resume. As far as I can tell, I have no custom hooks that run on resume from suspend. I am filing the bug here because the kernel message in /var/log/syslog asked me to.

> ll /etc/pm/sleep.d/
total 16
drwxr-xr-x 2 root root 4096 Apr 22 08:07 ./
drwxr-xr-x 5 root root 4096 Apr 22 08:05 ../
-rwxr-xr-x 1 root root  210 Apr  6 16:44 10_grub-common*
-rwxr-xr-x 1 root root  660 Mar  5 11:36 10_unattended-upgrades-hibernate*


> egrep "drm|i915|fbcon|fb0" /var/log/syslog

May 24 09:29:13 taprah kernel: [    0.641647] fb0: VESA VGA frame buffer device
May 24 09:29:13 taprah kernel: [   10.052343] [drm] Initialized drm 1.1.0 20060810
May 24 09:29:13 taprah kernel: [   10.528663] [drm] Memory usable by graphics device = 2048M
May 24 09:29:13 taprah kernel: [   10.528672] fb: switching to inteldrmfb from VESA VGA
May 24 09:29:13 taprah kernel: [   10.528785] [drm] Replacing VGA console driver
May 24 09:29:13 taprah kernel: [   10.551935] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
May 24 09:29:13 taprah kernel: [   10.551938] [drm] Driver supports precise vblank timestamp query.
May 24 09:29:13 taprah kernel: [   10.584551] [drm] Initialized i915 1.6.0 20141121 for 0000:00:02.0 on minor 0
May 24 09:29:13 taprah kernel: [   10.607909] fbcon: inteldrmfb (fb0) is primary device
May 24 09:29:13 taprah kernel: [   10.607962] i915 0000:00:02.0: fb0: inteldrmfb frame buffer device
May 24 09:29:13 taprah kernel: [   10.607964] i915 0000:00:02.0: registered panic notifier
May 24 09:29:13 taprah kernel: [   11.239990] Could not create debugfs 'i915_context_free' directory
May 24 09:29:13 taprah kernel: [   11.239993] Could not create debugfs 'i915_context_create' directory
May 24 09:29:13 taprah kernel: [   11.239995] Could not create debugfs 'i915_ppgtt_release' directory
May 24 09:29:13 taprah kernel: [   11.239997] Could not create debugfs 'i915_ppgtt_create' directory
May 24 09:29:13 taprah kernel: [   11.240001] Could not create debugfs 'i915_reg_rw' directory
May 24 09:29:13 taprah kernel: [   11.240004] Could not create debugfs 'i915_flip_complete' directory
May 24 09:29:13 taprah kernel: [   11.240006] Could not create debugfs 'i915_flip_request' directory
May 24 09:29:13 taprah kernel: [   11.240008] Could not create debugfs 'i915_ring_wait_end' directory
May 24 09:29:13 taprah kernel: [   11.240011] Could not create debugfs 'i915_ring_wait_begin' directory
May 24 09:29:13 taprah kernel: [   11.240013] Could not create debugfs 'i915_gem_request_wait_end' directory
May 24 09:29:13 taprah kernel: [   11.240015] Could not create debugfs 'i915_gem_request_wait_begin' directory
May 24 09:29:13 taprah kernel: [   11.240017] Could not create debugfs 'i915_gem_request_complete' directory
May 24 09:29:13 taprah kernel: [   11.240019] Could not create debugfs 'i915_gem_request_retire' directory
May 24 09:29:13 taprah kernel: [   11.240032] Could not create debugfs 'i915_gem_request_add' directory
May 24 09:29:13 taprah kernel: [   11.240034] Could not create debugfs 'i915_gem_ring_flush' directory
May 24 09:29:13 taprah kernel: [   11.240037] Could not create debugfs 'i915_gem_ring_dispatch' directory
May 24 09:29:13 taprah kernel: [   11.240039] Could not create debugfs 'i915_gem_ring_sync_to' directory
May 24 09:29:13 taprah kernel: [   11.240041] Could not create debugfs 'i915_gem_evict_vm' directory
May 24 09:29:13 taprah kernel: [   11.240044] Could not create debugfs 'i915_gem_evict_everything' directory
May 24 09:29:13 taprah kernel: [   11.240046] Could not create debugfs 'i915_gem_evict' directory
May 24 09:29:13 taprah kernel: [   11.240048] Could not create debugfs 'i915_gem_object_destroy' directory
May 24 09:29:13 taprah kernel: [   11.240050] Could not create debugfs 'i915_gem_object_clflush' directory
May 24 09:29:13 taprah kernel: [   11.240053] Could not create debugfs 'i915_gem_object_fault' directory
May 24 09:29:13 taprah kernel: [   11.240055] Could not create debugfs 'i915_gem_object_pread' directory
May 24 09:29:13 taprah kernel: [   11.240057] Could not create debugfs 'i915_gem_object_pwrite' directory
May 24 09:29:13 taprah kernel: [   11.240059] Could not create debugfs 'i915_gem_object_change_domain' directory
May 24 09:29:13 taprah kernel: [   11.240090] Could not create debugfs 'i915_vma_unbind' directory
May 24 09:29:13 taprah kernel: [   11.240092] Could not create debugfs 'i915_vma_bind' directory
May 24 09:29:13 taprah kernel: [   11.240095] Could not create debugfs 'i915_gem_object_create' directory
May 24 09:29:13 taprah kernel: [   11.240097] Could not create debugfs 'i915_pipe_update_end' directory
May 24 09:29:13 taprah kernel: [   11.240099] Could not create debugfs 'i915_pipe_update_vblank_evaded' directory
May 24 09:29:13 taprah kernel: [   11.240101] Could not create debugfs 'i915_pipe_update_start' directory
May 24 09:29:13 taprah kernel: [   11.240477] snd_hda_intel 0000:00:03.0: Cannot turn on display power on i915
May 24 16:13:16 taprah kernel: [11827.646199] i915 0000:00:02.0: BAR 6: [??? 0x00000000 flags 0x2] has bogus alignment
May 24 16:13:16 taprah kernel: [11827.646778] i915 0000:00:02.0: BAR 6: [??? 0x00000000 flags 0x2] has bogus alignment
May 24 16:14:30 taprah kernel: [11901.549648] [drm] stuck on render ring
May 24 16:14:30 taprah kernel: [11901.550629] [drm] GPU HANG: ecode 8:0:0x84dffefc, in Xorg [760], reason: Ring hung, action: reset
May 24 16:14:30 taprah kernel: [11901.550631] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
May 24 16:14:30 taprah kernel: [11901.550632] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
May 24 16:14:30 taprah kernel: [11901.550633] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
May 24 16:14:30 taprah kernel: [11901.550634] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
May 24 16:14:30 taprah kernel: [11901.550635] [drm] GPU crash dump saved to /sys/class/drm/card0/error
May 24 16:14:30 taprah kernel: [11901.557647] drm/i915: Resetting chip after gpu hang
May 24 16:14:36 taprah kernel: [11907.525897] [drm] stuck on render ring
May 24 16:14:36 taprah kernel: [11907.526888] [drm] GPU HANG: ecode 8:0:0x84dffefc, in Xorg [760], reason: Ring hung, action: reset
May 24 16:14:36 taprah kernel: [11907.527021] [drm:i915_set_reset_status [i915]] *ERROR* gpu hanging too fast, banning!
May 24 16:14:36 taprah kernel: [11907.533812] drm/i915: Resetting chip after gpu hang
May 24 16:14:49 taprah kernel: [11920.562243] [drm] stuck on render ring
May 24 16:14:49 taprah kernel: [11920.563205] [drm] GPU HANG: ecode 8:0:0x86dffffd, in Xorg [3061], reason: Ring hung, action: reset
May 24 16:14:49 taprah kernel: [11920.570194] drm/i915: Resetting chip after gpu hang
May 24 16:17:45 taprah kernel: [    0.645193] fb0: VESA VGA frame buffer device
May 24 16:17:45 taprah kernel: [   11.130407] [drm] Initialized drm 1.1.0 20060810
May 24 16:17:45 taprah kernel: [   11.881528] [drm] Memory usable by graphics device = 2048M
May 24 16:17:45 taprah kernel: [   11.881533] fb: switching to inteldrmfb from VESA VGA
May 24 16:17:45 taprah kernel: [   11.881622] [drm] Replacing VGA console driver
May 24 16:17:45 taprah kernel: [   11.904413] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
May 24 16:17:45 taprah kernel: [   11.904416] [drm] Driver supports precise vblank timestamp query.
May 24 16:17:45 taprah kernel: [   11.927408] [drm] Initialized i915 1.6.0 20141121 for 0000:00:02.0 on minor 0
May 24 16:17:45 taprah kernel: [   11.951135] fbcon: inteldrmfb (fb0) is primary device
May 24 16:17:45 taprah kernel: [   11.951219] i915 0000:00:02.0: fb0: inteldrmfb frame buffer device
May 24 16:17:45 taprah kernel: [   11.951220] i915 0000:00:02.0: registered panic notifier
May 24 16:17:45 taprah kernel: [   12.592929] Could not create debugfs 'i915_context_free' directory
May 24 16:17:45 taprah kernel: [   12.592933] Could not create debugfs 'i915_context_create' directory
May 24 16:17:45 taprah kernel: [   12.592937] Could not create debugfs 'i915_ppgtt_release' directory
May 24 16:17:45 taprah kernel: [   12.592940] Could not create debugfs 'i915_ppgtt_create' directory
May 24 16:17:45 taprah kernel: [   12.592946] Could not create debugfs 'i915_reg_rw' directory
May 24 16:17:45 taprah kernel: [   12.592949] Could not create debugfs 'i915_flip_complete' directory
May 24 16:17:45 taprah kernel: [   12.592953] Could not create debugfs 'i915_flip_request' directory
May 24 16:17:45 taprah kernel: [   12.592956] Could not create debugfs 'i915_ring_wait_end' directory
May 24 16:17:45 taprah kernel: [   12.592959] Could not create debugfs 'i915_ring_wait_begin' directory
May 24 16:17:45 taprah kernel: [   12.592963] Could not create debugfs 'i915_gem_request_wait_end' directory
May 24 16:17:45 taprah kernel: [   12.592966] Could not create debugfs 'i915_gem_request_wait_begin' directory
May 24 16:17:45 taprah kernel: [   12.592969] Could not create debugfs 'i915_gem_request_complete' directory
May 24 16:17:45 taprah kernel: [   12.592973] Could not create debugfs 'i915_gem_request_retire' directory
May 24 16:17:45 taprah kernel: [   12.592990] Could not create debugfs 'i915_gem_request_add' directory
May 24 16:17:45 taprah kernel: [   12.592993] Could not create debugfs 'i915_gem_ring_flush' directory
May 24 16:17:45 taprah kernel: [   12.592997] Could not create debugfs 'i915_gem_ring_dispatch' directory
May 24 16:17:45 taprah kernel: [   12.593000] Could not create debugfs 'i915_gem_ring_sync_to' directory
May 24 16:17:45 taprah kernel: [   12.593003] Could not create debugfs 'i915_gem_evict_vm' directory
May 24 16:17:45 taprah kernel: [   12.593006] Could not create debugfs 'i915_gem_evict_everything' directory
May 24 16:17:45 taprah kernel: [   12.593010] Could not create debugfs 'i915_gem_evict' directory
May 24 16:17:45 taprah kernel: [   12.593013] Could not create debugfs 'i915_gem_object_destroy' directory
May 24 16:17:45 taprah kernel: [   12.593016] Could not create debugfs 'i915_gem_object_clflush' directory
May 24 16:17:45 taprah kernel: [   12.593019] Could not create debugfs 'i915_gem_object_fault' directory
May 24 16:17:45 taprah kernel: [   12.593022] Could not create debugfs 'i915_gem_object_pread' directory
May 24 16:17:45 taprah kernel: [   12.593026] Could not create debugfs 'i915_gem_object_pwrite' directory
May 24 16:17:45 taprah kernel: [   12.593029] Could not create debugfs 'i915_gem_object_change_domain' directory
May 24 16:17:45 taprah kernel: [   12.593069] Could not create debugfs 'i915_vma_unbind' directory
May 24 16:17:45 taprah kernel: [   12.593073] Could not create debugfs 'i915_vma_bind' directory
May 24 16:17:45 taprah kernel: [   12.593076] Could not create debugfs 'i915_gem_object_create' directory
May 24 16:17:45 taprah kernel: [   12.593079] Could not create debugfs 'i915_pipe_update_end' directory
May 24 16:17:45 taprah kernel: [   12.593083] Could not create debugfs 'i915_pipe_update_vblank_evaded' directory
May 24 16:17:45 taprah kernel: [   12.593086] Could not create debugfs 'i915_pipe_update_start' directory
May 24 16:17:45 taprah kernel: [   12.593585] snd_hda_intel 0000:00:03.0: Cannot turn on display power on i915
May 24 16:42:49 taprah kernel: [ 1120.181916] [drm:hsw_unclaimed_reg_detect.isra.12 [i915]] *ERROR* Unclaimed register detected. Please use the i915.mmio_debug=1 to debug this problem.
May 24 16:43:16 taprah kernel: [ 1526.102062] [drm:hsw_unclaimed_reg_detect.isra.12 [i915]] *ERROR* Unclaimed register detected. Please use the i915.mmio_debug=1 to debug this problem.[drm:hsw_unclaimed_reg_detect.isra.12 [i915]] *ERROR* Unclaimed register detected. Please use the i915.mmio_debug=1 to debug this problem.
May 24 16:55:00 taprah kernel: [ 1552.990256] [drm:hsw_unclaimed_reg_detect.isra.12 [i915]] *ERROR* Unclaimed register detected. Please use the i915.mmio_debug=1 to debug this problem.[drm:hsw_unclaimed_reg_detect.isra.12 [i915]] *ERROR* Unclaimed register detected. Please use the i915.mmio_debug=1 to debug this problem.


> cat /proc/version_signature 

Ubuntu 3.19.0-18.18-generic 3.19.6

> sudo lspci -vnvn

00:00.0 Host bridge [0600]: Intel Corporation Broadwell-U Host Bridge -OPI [8086:1604] (rev 09)
	Subsystem: Intel Corporation Device [8086:2057]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ >SERR- <PERR- INTx-
	Latency: 0
	Capabilities: [e0] Vendor Specific Information: Len=0c <?>

00:02.0 VGA compatible controller [0300]: Intel Corporation Broadwell-U Integrated Graphics [8086:1616] (rev 09) (prog-if 00 [VGA controller])
	Subsystem: Intel Corporation Device [8086:2057]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 45
	Region 0: Memory at f6000000 (64-bit, non-prefetchable) [size=16M]
	Region 2: Memory at e0000000 (64-bit, prefetchable) [size=256M]
	Region 4: I/O ports at f000 [size=64]
	Expansion ROM at <unassigned> [disabled]
	Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: fee0f00c  Data: 4142
	Capabilities: [d0] Power Management version 2
		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [a4] PCI Advanced Features
		AFCap: TP+ FLR+
		AFCtrl: FLR-
		AFStatus: TP-
	Kernel driver in use: i915

00:03.0 Audio device [0403]: Intel Corporation Broadwell-U Audio Controller [8086:160c] (rev 09)
	Subsystem: Intel Corporation Device [8086:2057]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 16
	Region 0: Memory at f7134000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: [50] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [60] MSI: Enable- Count=1/1 Maskable- 64bit-
		Address: 00000000  Data: 0000
	Capabilities: [70] Express (v1) Root Complex Integrated Endpoint, MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0
			ExtTag- RBE-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
	Kernel driver in use: snd_hda_intel

00:14.0 USB controller [0c03]: Intel Corporation Wildcat Point-LP USB xHCI Controller [8086:9cb1] (rev 03) (prog-if 30 [XHCI])
	Subsystem: Intel Corporation Device [8086:2057]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 40
	Region 0: Memory at f7120000 (64-bit, non-prefetchable) [size=64K]
	Capabilities: [70] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
		Address: 00000000fee0500c  Data: 4191
	Kernel driver in use: xhci_hcd

00:16.0 Communication controller [0780]: Intel Corporation Wildcat Point-LP MEI Controller #1 [8086:9cba] (rev 03)
	Subsystem: Intel Corporation Device [8086:2057]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 43
	Region 0: Memory at f713d000 (64-bit, non-prefetchable) [size=32]
	Capabilities: [50] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [8c] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: 00000000fee0f00c  Data: 41e1
	Kernel driver in use: mei_me

00:19.0 Ethernet controller [0200]: Intel Corporation Ethernet Connection (3) I218-V [8086:15a3] (rev 03)
	Subsystem: Intel Corporation Device [8086:2057]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 42
	Region 0: Memory at f7100000 (32-bit, non-prefetchable) [size=128K]
	Region 1: Memory at f713b000 (32-bit, non-prefetchable) [size=4K]
	Region 2: I/O ports at f080 [size=32]
	Capabilities: [c8] Power Management version 2
		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
	Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: 00000000fee0400c  Data: 4182
	Capabilities: [e0] PCI Advanced Features
		AFCap: TP+ FLR+
		AFCtrl: FLR-
		AFStatus: TP-
	Kernel driver in use: e1000e

00:1b.0 Audio device [0403]: Intel Corporation Wildcat Point-LP High Definition Audio Controller [8086:9ca0] (rev 03)
	Subsystem: Intel Corporation Device [8086:2057]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 32, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 46
	Region 0: Memory at f7130000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: [50] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=55mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [60] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: 00000000fee0f00c  Data: 4162
	Kernel driver in use: snd_hda_intel

00:1c.0 PCI bridge [0604]: Intel Corporation Wildcat Point-LP PCI Express Root Port #1 [8086:9c90] (rev e3) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort+ <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Express (v2) Root Port (Slot-), MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0
			ExtTag- RBE+
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #1, Speed 5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <1us, L1 <4us
			ClockPM- Surprise- LLActRep+ BwNot+
		LnkCtl:	ASPM L0s L1 Enabled; RCB 64 bytes Disabled- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x0, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
		RootCap: CRSVisible-
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
		DevCap2: Completion Timeout: Range ABC, TimeoutDis+, LTR+, OBFF Via WAKE# ARIFwd-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled ARIFwd-
		LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
		Address: 00000000  Data: 0000
	Capabilities: [90] Subsystem: Intel Corporation Device [8086:2057]
	Capabilities: [a0] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Kernel driver in use: pcieport

00:1c.3 PCI bridge [0604]: Intel Corporation Wildcat Point-LP PCI Express Root Port #4 [8086:9c96] (rev e3) (prog-if 00 [Normal decode])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Bus: primary=00, secondary=02, subordinate=02, sec-latency=0
	Memory behind bridge: f7000000-f70fffff
	Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
	BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
		PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
	Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0
			ExtTag- RBE+
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #4, Speed 5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <512ns, L1 <16us
			ClockPM- Surprise- LLActRep+ BwNot+
		LnkCtl:	ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
		SltCap:	AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
			Slot #3, PowerLimit 10.000W; Interlock- NoCompl+
		SltCtl:	Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
			Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
		SltSta:	Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
			Changed: MRL- PresDet- LinkState+
		RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna- CRSVisible-
		RootCap: CRSVisible-
		RootSta: PME ReqID 0000, PMEStatus- PMEPending-
		DevCap2: Completion Timeout: Range ABC, TimeoutDis+, LTR+, OBFF Not Supported ARIFwd-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled ARIFwd-
		LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
		Address: 00000000  Data: 0000
	Capabilities: [90] Subsystem: Intel Corporation Device [8086:2057]
	Capabilities: [a0] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [100 v0] #00
	Capabilities: [200 v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
			  PortCommonModeRestoreTime=40us PortTPowerOnTime=10us
	Kernel driver in use: pcieport

00:1d.0 USB controller [0c03]: Intel Corporation Wildcat Point-LP USB EHCI Controller [8086:9ca6] (rev 03) (prog-if 20 [EHCI])
	Subsystem: Intel Corporation Device [8086:2057]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 23
	Region 0: Memory at f713a000 (32-bit, non-prefetchable) [size=1K]
	Capabilities: [50] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [58] Debug port: BAR=1 offset=00a0
	Capabilities: [98] PCI Advanced Features
		AFCap: TP+ FLR+
		AFCtrl: FLR-
		AFStatus: TP-
	Kernel driver in use: ehci-pci

00:1f.0 ISA bridge [0601]: Intel Corporation Wildcat Point-LP LPC Controller [8086:9cc3] (rev 03)
	Subsystem: Intel Corporation Device [8086:2057]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Capabilities: [e0] Vendor Specific Information: Len=0c <?>
	Kernel driver in use: lpc_ich

00:1f.2 SATA controller [0106]: Intel Corporation Wildcat Point-LP SATA Controller [AHCI Mode] [8086:9c83] (rev 03) (prog-if 01 [AHCI 1.0])
	Subsystem: Intel Corporation Device [8086:2057]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin B routed to IRQ 41
	Region 0: I/O ports at f0d0 [size=8]
	Region 1: I/O ports at f0c0 [size=4]
	Region 2: I/O ports at f0b0 [size=8]
	Region 3: I/O ports at f0a0 [size=4]
	Region 4: I/O ports at f060 [size=32]
	Region 5: Memory at f7139000 (32-bit, non-prefetchable) [size=2K]
	Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: fee0a00c  Data: 41b1
	Capabilities: [70] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [a8] SATA HBA v1.0 BAR4 Offset=00000004
	Kernel driver in use: ahci

00:1f.3 SMBus [0c05]: Intel Corporation Wildcat Point-LP SMBus Controller [8086:9ca2] (rev 03)
	Subsystem: Intel Corporation Device [8086:2057]
	Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin C routed to IRQ 5
	Region 0: Memory at f7138000 (64-bit, non-prefetchable) [size=256]
	Region 4: I/O ports at f040 [size=32]

02:00.0 Network controller [0280]: Intel Corporation Wireless 7265 [8086:095a] (rev 59)
	Subsystem: Intel Corporation Dual Band Wireless-AC 7265 [8086:9010]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 44
	Region 0: Memory at f7000000 (64-bit, non-prefetchable) [size=8K]
	Capabilities: [c8] Power Management version 3
		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: 00000000fee0100c  Data: 4122
	Capabilities: [40] Express (v2) Endpoint, MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 unlimited
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr+ NoSnoop+ FLReset-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L1, Exit Latency L0s <4us, L1 <32us
			ClockPM+ Surprise- LLActRep- BwNot-
		LnkCtl:	ASPM L1 Enabled; RCB 64 bytes Disabled- CommClk+
			ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Range B, TimeoutDis+, LTR+, OBFF Via WAKE#
		DevCtl2: Completion Timeout: 16ms to 55ms, TimeoutDis-, LTR+, OBFF Disabled
		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [100 v1] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
	Capabilities: [140 v1] Device Serial Number 34-13-e8-ff-ff-1e-0e-47
	Capabilities: [14c v1] Latency Tolerance Reporting
		Max snoop latency: 3145728ns
		Max no snoop latency: 3145728ns
	Capabilities: [154 v1] L1 PM Substates
		L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
			  PortCommonModeRestoreTime=30us PortTPowerOnTime=60us
	Kernel driver in use: iwlwifi

> lsb_release -rd
Description:    Ubuntu 15.04
Release:        15.04
Comment 1 Rahul 2015-05-24 21:23:39 UTC
> sudo cat /sys/class/drm/card0/error
[sudo] password for taprah: 
no error state collected
Comment 2 Ander Conselvan de Oliveira 2015-05-25 05:25:32 UTC
(In reply to Rahul from comment #1)
> > sudo cat /sys/class/drm/card0/error
> [sudo] password for taprah: 
> no error state collected

Did you try this after the message appeared in the log and before a reboot?
Comment 3 Rahul 2015-05-25 17:28:07 UTC
Here's what I do: I suspend the session and resume it by double-clicking my mouse. I then get a blank white screen where the mouse pointer still moves, but nothing else can be done other than reboot. I can only look at the log *after* reboot.
Comment 4 Ander Conselvan de Oliveira 2015-05-26 07:48:46 UTC
(In reply to Rahul from comment #3)
> Here's what I do, I suspend the session, I resume it by double-clicking my
> mouse, and I get a blank white screen where the mouse is still moving but
> nothing else can be done other than reboot. I can only look at the log
> *after* reboot.

So SSH doesn't work before a reboot?
Comment 5 Rahul 2015-05-26 11:48:34 UTC
Hi Ander - the screen is frozen on wakeup. Maybe ssh works. I have no other computer on the network to ssh into it from if that's what you are suggesting. I can't do anything on the screen - it's just plain white and blank. Maybe applications are still running - but I can't tell if they are. I can't bring up an xterm or anything to get more information. So if I understand you correctly, upon reboot /sys/class/drm/card0/error gets overwritten? Can I set up a daemon to watch the file and copy it to another location? Maybe that way I can get you a stacktrace?
Comment 6 Ander Conselvan de Oliveira 2015-05-26 12:13:51 UTC
(In reply to Rahul from comment #5)
> Hi Ander - the screen is frozen on wakeup. Maybe ssh works. I have no other
> computer on the network to ssh into it from if that's what you are
> suggesting. I can't do anything on the screen - it's just plain white and
> blank. Maybe applications are still running - but I can't tell if they are.
> I can't bring up an xterm or anything to get more information. So if I
> understand you correctly, upon reboot /sys/class/drm/card0/error gets
> overwritten?

In a way, yes. That file is actually an interface to i915 code that outputs the error state recorded in the driver. So when the driver is loaded again after the reboot, that error state has been cleared.

> Can I set up a daemon to watch the file and copy it to another
> location? Maybe that way I can get you a stacktrace?

That could work, though I'm not sure inotify would work for that particular file. You might have to constantly poll the contents of the file.

It is pretty much impossible to debug the GPU hang without that error state.
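
A minimal polling sketch along those lines, in Python. It assumes that the file reads "no error state collected" while the driver holds no dump (as in comment 1) and that anything else is worth saving. The function names, the output directory and the 2-second interval are illustrative choices, not part of any i915 tooling. Each copy is flushed with fsync so that it has a better chance of surviving the hard reboot.

```python
import os
import time
from pathlib import Path
from typing import Optional

# Path taken from the kernel log message in the description.
ERROR_FILE = Path("/sys/class/drm/card0/error")
# Text the file returned in comment 1 when no dump was recorded (assumed stable).
EMPTY_MARKER = "no error state collected"


def has_error_state(text: str) -> bool:
    """Return True if the content looks like a recorded error state."""
    stripped = text.strip()
    return bool(stripped) and not stripped.lower().startswith(EMPTY_MARKER)


def save_durably(text: str, dest: Path) -> None:
    """Write text to dest and force it to disk before returning."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    with open(dest, "w", encoding="utf-8") as fh:
        fh.write(text)
        fh.flush()
        os.fsync(fh.fileno())  # the machine may be power-cycled right after


def poll_once(source: Path, dest_dir: Path,
              last_saved: Optional[str] = None) -> Optional[Path]:
    """Read source once; save a timestamped copy if it holds a new dump."""
    try:
        text = source.read_text(encoding="utf-8", errors="replace")
    except OSError:
        return None  # file missing or unreadable (reading it needs root)
    if not has_error_state(text) or text == last_saved:
        return None
    dest = dest_dir / f"i915-error-{time.strftime('%Y%m%d-%H%M%S')}.txt"
    save_durably(text, dest)
    return dest


def watch(source: Path = ERROR_FILE,
          dest_dir: Path = Path.home() / "gpu-error-dumps",
          interval: float = 2.0) -> None:
    """Poll forever, keeping one copy of each distinct dump."""
    last_saved = None
    while True:
        saved = poll_once(source, dest_dir, last_saved)
        if saved is not None:
            last_saved = saved.read_text(encoding="utf-8")
        time.sleep(interval)
```

To use it, add a final watch() call and start the script as root before suspending. Under sudo the default output directory may resolve to root's home rather than your own. After the reboot, any saved file in that directory can be attached to this bug.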
Comment 7 Rahul 2015-05-26 14:37:03 UTC
Would tail -f /sys/class/drm/card0/error | tee ~/somewhere_else.log work? I'll try it tonight and update here, thanks.
Comment 8 Jani Nikula 2015-08-18 15:21:03 UTC
Please reproduce with a recent upstream kernel, and attach dmesg.
Comment 9 David Weinehall 2017-02-16 16:23:12 UTC
Can you still reproduce this on a recent kernel? This bug has been in NEEDINFO state for 1.5 years now; if there's no further update I'll close it.
Comment 10 Ricardo 2017-03-03 18:07:37 UTC
Closing bug due to inactivity by the submitter. Thanks, David, for also looking into this :)

