Bug 107067 - vblank wait timed out on Skylake
Summary: vblank wait timed out on Skylake
Status: CLOSED WORKSFORME
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: high normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: Triaged, ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2018-06-29 06:48 UTC by MichaelLong
Modified: 2018-11-05 14:34 UTC
CC List: 1 user

See Also:
i915 platform: SKL
i915 features: GPU hang


Attachments
Backtrace of the bug (5.14 KB, text/plain)
2018-06-29 06:48 UTC, MichaelLong
suspend-resume cycle with debugging-mode enabled (44.41 KB, text/plain)
2018-06-29 06:51 UTC, MichaelLong
lspci (1.66 KB, text/plain)
2018-06-29 06:52 UTC, MichaelLong
dmesg output - fresh boot (345.44 KB, text/plain)
2018-06-29 06:53 UTC, MichaelLong
dmesg output with several suspend/resume-cycles showing the problem at the end. (3.75 MB, text/x-log)
2018-07-02 18:10 UTC, MichaelLong
suspend-resume cycle trying with drm-tip showing issues. (299.58 KB, text/x-log)
2018-07-02 18:10 UTC, MichaelLong

Description MichaelLong 2018-06-29 06:48:28 UTC
Created attachment 140385 [details]
Backtrace of the bug

Hi,

On my hardware, a Lenovo ThinkPad T460p with Skylake Intel graphics, I'm increasingly experiencing a "vblank wait timed out on crtc 0" bug when resuming from s2ram.

Whenever this bug occurs and the docking station is not in use, the backlight of the internal display turns on but nothing is displayed. When resuming on a docking station with two 24" displays attached, one of the displays sometimes stays in standby.

Only a reboot recovers from this.

In my judgement, the likelihood of hitting this bug has increased from kernel 4.15 onwards (to about every 4th or 5th resume). Previously I had seen it only rarely, which is why I only started investigating recently.

Currently I'm on kernel 4.17.3 with the most recent firmware files.

Hardware details.

00:02.0 VGA compatible controller: Intel Corporation HD Graphics 530 (rev 06) (prog-if 00 [VGA controller])
	Subsystem: Lenovo HD Graphics 530
	Flags: bus master, fast devsel, latency 0, IRQ 127
	Memory at f2000000 (64-bit, non-prefetchable) [size=16M]
	Memory at c0000000 (64-bit, prefetchable) [size=512M]
	I/O ports at e000 [size=64]
	[virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
	Capabilities: [40] Vendor Specific Information: Len=0c <?>
	Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
	Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit-
	Capabilities: [d0] Power Management version 2
	Capabilities: [100] Process Address Space ID (PASID)
	Capabilities: [200] Address Translation Service (ATS)
	Capabilities: [300] Page Request Interface (PRI)
	Kernel driver in use: i915
	Kernel modules: i915

Please let me know if you need additional information or if I can test out anything.

Thanks.
Comment 1 MichaelLong 2018-06-29 06:51:47 UTC
Created attachment 140386 [details]
suspend-resume cycle with debugging-mode enabled

This suspend-resume cycle was recorded while connected to a docking station with only one 24" display attached. This display was not usable after resuming.
Comment 2 MichaelLong 2018-06-29 06:52:21 UTC
Created attachment 140387 [details]
lspci
Comment 3 MichaelLong 2018-06-29 06:53:28 UTC
Created attachment 140388 [details]
dmesg output - fresh boot
Comment 4 Jani Saarinen 2018-06-29 08:09:26 UTC
Could you also try using https://cgit.freedesktop.org/drm-tip and send dmesg with drm.debug=0x1e log_buf_len=4M?
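For reference, the drm.debug value requested above is a bitmask of DRM debug categories. Below is a minimal Python sketch of how 0x1e decomposes, assuming the category bits match the DRM_UT_* definitions in the kernel's DRM headers. `decode_drm_debug` is a hypothetical helper name, not part of any existing tool.

```python
# Debug category bits of the drm.debug module parameter (assumed to match
# the DRM_UT_* definitions in the kernel's DRM headers; higher bits omitted).
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
}


def decode_drm_debug(mask):
    """Return the names of the categories enabled in a drm.debug mask."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if mask & bit]


print(decode_drm_debug(0x1E))  # ['DRIVER', 'KMS', 'PRIME', 'ATOMIC']
```

With 0x1e the driver, KMS, PRIME and atomic categories are enabled. log_buf_len=4M enlarges the kernel log buffer so that the additional output is less likely to be overwritten before it is collected.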
Comment 5 Imre Deak 2018-06-29 09:16:08 UTC
Please also send the full log starting from boot-up up to the error.
Comment 6 MichaelLong 2018-07-02 18:09:10 UTC
I tried to use drm-tip to create a failure log, but its current state does not look good: the display manager, sddm, shows only a cursor on a black background. I tried a suspend/resume cycle anyway, but other issues surfaced on wake-up:

Jul 02 18:40:02 t460p kernel: [drm:i915_reset_device [i915]] resetting chip
Jul 02 18:40:02 t460p kernel: i915 0000:00:02.0: Resetting chip for hang on rcs0, bcs0, vcs0, vecs0
Jul 02 18:40:02 t460p kernel: i915 0000:00:02.0: GPU recovery failed

I've attached a log of the drm-tip output.

Going back to 4.17.3, I was finally able to reproduce the original problem after several suspend/resume cycles; please see the attached log.
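A minimal sketch of how the original error could be located in a large captured log, assuming the log was saved as plain text. The helper name `find_vblank_timeouts` and the sample lines are hypothetical; only the message text "vblank wait timed out on crtc" comes from this report.

```python
import re

# Message text as quoted in the description; the CRTC number is captured.
VBLANK_TIMEOUT = re.compile(r"vblank wait timed out on crtc (\d+)")


def find_vblank_timeouts(lines):
    """Return (line_number, crtc) pairs for every vblank timeout message."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = VBLANK_TIMEOUT.search(line)
        if match:
            hits.append((number, int(match.group(1))))
    return hits


# Made-up sample input, not taken from the attached logs.
sample = [
    "kernel: PM: suspend exit",
    "kernel: vblank wait timed out on crtc 0",
]
print(find_vblank_timeouts(sample))  # [(2, 0)]
```

The function accepts any iterable of lines, so an open file object for a saved dmesg log can be passed in directly.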
Comment 7 MichaelLong 2018-07-02 18:10:05 UTC
Created attachment 140438 [details]
dmesg output with several suspend/resume-cycles showing the problem at the end.
Comment 8 MichaelLong 2018-07-02 18:10:40 UTC
Created attachment 140439 [details]
suspend-resume cycle trying with drm-tip showing issues.
Comment 9 Lakshmi 2018-10-23 10:02:47 UTC
(In reply to MichaelLong from comment #8)
> Created attachment 140439 [details]
> suspend-resume cycle trying with drm-tip showing issues.

Sorry for the delay.
Can you attach the GPU crash dump from /sys/class/drm/card0/error?
The latest stable kernel is currently 4.19. Can you please verify whether the issue still occurs with it?
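One way to capture that file for attaching, as a minimal Python sketch. It assumes card0 is the i915 device, as in the path above; `save_error_state` and the output file name are hypothetical.

```python
import shutil
from pathlib import Path

# Path as given in the comment above; card0 is assumed to be the i915 device.
ERROR_STATE = Path("/sys/class/drm/card0/error")


def save_error_state(dest, source=ERROR_STATE):
    """Copy the error state file to dest and return the number of bytes written."""
    source = Path(source)
    dest = Path(dest)
    if not source.exists():
        raise FileNotFoundError(f"{source} not found (is this the right card?)")
    # The size reported for a sysfs file is nominal, so stream the content
    # until EOF instead of relying on stat().
    with source.open("rb") as src, dest.open("wb") as dst:
        shutil.copyfileobj(src, dst)
    return dest.stat().st_size
```

Reading the file usually requires root. A call such as save_error_state("i915_error_state.txt") writes a copy that can then be attached to the bug.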
Comment 10 Lakshmi 2018-11-05 09:16:41 UTC
(In reply to MichaelLong from comment #8)
> Created attachment 140439 [details]
> suspend-resume cycle trying with drm-tip showing issues.

Reporter, we need the GPU crash dump from /sys/class/drm/card0/error to proceed further.
Comment 11 MichaelLong 2018-11-05 10:01:51 UTC
Now I'm sorry for the delay.

We have a minor misunderstanding here. In comment 6 I described how I tried to reproduce my original problem with the drm-tip that was current at _that_ time. As the attached dmesg log shows, that hang happened with revision 4.18.0-rc3-gdf42fb566d33. I did not intend to divert from the original issue, so there is no crash dump. I'm running fairly recent kernels, and this particular hang has not occurred with any stable release kernel.

Additional information on the original "vblank wait timed out" issue: with kernels 4.18.x and newer I have not experienced it anymore.

Thanks!
Comment 12 Lakshmi 2018-11-05 14:34:19 UTC
Thanks for the feedback. Closing this bug as WORKSFORME.

