Summary: | [skl dp-mst] NULL pointer dereference after vblank/flip_done timed out | ||
---|---|---|---|
Product: | DRI | Reporter: | Chris Wilson <chris> |
Component: | DRM/Intel | Assignee: | Daniel Vetter <daniel> |
Status: | CLOSED FIXED | QA Contact: | Elio <elio.martinez.monroy> |
Severity: | critical | ||
Priority: | highest | CC: | bugs, chrischavez, cs_gon, danielnicoletti, diego.viola, erroneous, fd, felix.schwarz, freedesktop-bugs, freedesktop, Hi-Angel, intel-gfx-bugs, lists.jjorge, luke, nemesis, rees, tiwai, wferi |
Version: | XOrg git | Keywords: | bisected |
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | ReadyForDev | ||
i915 platform: | I965GM, SKL | i915 features: | display/atomic |
Attachments: |
Description
Chris Wilson
2016-07-02 09:45:13 UTC
Hi Chris, what are the steps to reproduce this bug?

Different path in the warn+crash (via drm_mode_getconnector), but this seems to have the same symptoms: https://apibugzilla.suse.com/show_bug.cgi?id=1006392

I reported http://bugzilla.suse.com/show_bug.cgi?id=1006392
I have in my laptop:
00:02.0 VGA compatible controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (primary) (rev 03)
00:02.1 Display controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (secondary) (rev 03)
Work-around: by adding i915.semaphores=1 as a kernel parameter I can boot kernel 4.8.3. The system is a bit slow in some situations though. E.g. when switching to/from the text console there are still some backtraces logged:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 2025 at ../drivers/gpu/drm/drm_irq.c:1224 drm_wait_one_vblank+0x17d/0x190 [drm]
vblank wait timed out on crtc 0
Modules linked in: ipt_REJECT nf_reject_ipv4 tun bridge stp llc fuse ebtable_filter ebtables ip6table_filter ip6_tables af_packet ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack xt_tcpudp nf_log_ipv4 nf_log_common xt_LOG xt_limit iptable_filter ip_tables x_tables snd_hda_codec_realtek snd_hda_codec_generic iTCO_wdt iTCO_vendor_support arc4 ppdev coretemp kvm_intel kvm irqbypass pcspkr joydev iwl4965 iwlegacy i2c_i801 i2c_smbus mac80211 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep lpc_ich mfd_core snd_pcm cfg80211 smsc_ircc2 sky2 rfkill irda i915 parport_pc snd_timer parport drm_kms_helper snd battery drm thermal fb_sys_fops syscopyarea sysfillrect sysimgblt i2c_algo_bit fjes shpchp soundcore acpi_cpufreq button fujitsu_laptop tpm_tis tpm_tis_core video ac tpm dm_crypt algif_skcipher af_alg hid_generic usbhid sr_mod cdrom ata_generic pcmcia ata_piix serio_raw sdhci_pci sdhci yenta_socket pcmcia_rsrc pcmcia_core mmc_core uhci_hcd ehci_pci ehci_hcd usbcore usb_common dm_mod sg
CPU: 0 PID: 2025 Comm: X Tainted: G U W 4.8.3-1-default #1
Hardware name: FUJITSU SIEMENS LIFEBOOK E8310/FJNB1CE, BIOS Version 1.14 08/20/2008
0000000000000000 ffffffffb53a3e62 ffff957168e6b8c8 0000000000000000
ffffffffb507ddde ffff957169210000 ffff957168e6b918 0000000000000000
000000000c000006 ffff9571698e0e08 ffff9571424e3c00 ffffffffb507de4f
Call Trace:
[<ffffffffb502eefe>] dump_trace+0x5e/0x310
[<ffffffffb502f2cb>] show_stack_log_lvl+0x11b/0x1a0
[<ffffffffb5030001>] show_stack+0x21/0x40
[<ffffffffb53a3e62>] dump_stack+0x5c/0x7a
[<ffffffffb507ddde>] __warn+0xbe/0xe0
[<ffffffffb507de4f>] warn_slowpath_fmt+0x4f/0x60
[<ffffffffc054b4ed>] drm_wait_one_vblank+0x17d/0x190 [drm]
[<ffffffffc06c69e7>] intel_pre_plane_update+0x157/0x180 [i915]
[<ffffffffc06c6d39>] intel_atomic_commit_tail+0x129/0x1060 [i915]
[<ffffffffc06c807c>] intel_atomic_commit+0x40c/0x510 [i915]
[<ffffffffc06cd6cf>] intel_release_load_detect_pipe+0x1f/0x80 [i915]
[<ffffffffc0706f2a>] intel_tv_detect+0x33a/0x5c0 [i915]
[<ffffffffc05f6dfd>] drm_helper_probe_single_connector_modes+0x26d/0x510 [drm_kms_helper]
[<ffffffffc0556314>] drm_mode_getconnector+0x324/0x360 [drm]
[<ffffffffc0549913>] drm_ioctl+0x1b3/0x440 [drm]
[<ffffffffb522c31f>] do_vfs_ioctl+0x8f/0x5d0
[<ffffffffb522c8d4>] SyS_ioctl+0x74/0x80
[<ffffffffb56d43f6>] entry_SYSCALL_64_fastpath+0x1e/0xa8
DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x1e/0xa8
Leftover inexact backtrace:
---[ end trace 2b5094a97699f467 ]---

(In reply to Rami from comment #1)
> Hi Chris,
> what are the steps to reproduce this bug?
I have the same bug. The steps are simply to boot the system with a 4.8.x kernel. With 4.7.5 I don't have this bug.
Same hardware: 00:02.0 VGA compatible controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller
I am using Mageia Cauldron (future Mageia 6).

Created attachment 127582 [details]
dmesg of a system with this bug
I have the same error on Fedora with the current 4.8.4 Kernel. It is also present in 4.9-rc3
[drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:26:pipe A] flip_done timed out
This leads to a system freeze after exiting a full screen game.
lspci -v
00:02.0 VGA compatible controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (primary) (rev 0c) (prog-if 00 [VGA controller])
Subsystem: Dell Latitude D630
Flags: bus master, fast devsel, latency 0, IRQ 28
Memory at f6e00000 (64-bit, non-prefetchable) [size=1M]
Memory at e0000000 (64-bit, prefetchable) [size=256M]
I/O ports at eff8 [size=8]
[virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
Capabilities: <access denied>
Kernel driver in use: i915
Kernel modules: i915

Created attachment 127741 [details]
dmesg log with drm.debug enabled
I did a kernel bisect and the flip_done timeout error came with commit ea0000f0d369a59c2086fe9c489e0a2a86e080ba - drm/i915: Roll out the helper nonblock tracking.
The last good commit was 1f7528c4dbea46bd266798d3c374a961b1228055 - drm/i915: Signal drm events for atomic
The vblank time out bug has been in the kernel since 4.4 or so and there is already a report here: https://bugs.freedesktop.org/show_bug.cgi?id=93782
I don't know if they are somehow related. The vblank bug has no visible effects on my laptop other than an error message in dmesg but the flip_done time out bug is more serious. It makes the boot hang for about 10 seconds with a blank screen and crashes the computer when switching back to the desktop from a full screen game.

Created attachment 128085 [details] [review]
revert ea0000f0d369a59c2086fe9c489e0a2a86e080ba for 4.8.9
fix for https://bugs.freedesktop.org/show_bug.cgi?id=96781
patch adapted from https://bugs.freedesktop.org/show_bug.cgi?id=97529

I'm not sure what happened to my comment on the attachment I just sent. This bug causes a flip_done timeout and crash when I exit Xorg on my Thinkpad x220 with i915. The attached patch fixes it. As this is a regression, this should go upstream and to stable. What's the procedure for pushing it up? Has there been any discussion on lkml?

(In reply to willma from comment #8)
> I did a kernel bisect and the flip_done timeout error came with commit
> ea0000f0d369a59c2086fe9c489e0a2a86e080ba - drm/i915: Roll out the helper
> nonblock tracking.
Bad commit:
commit ea0000f0d369a59c2086fe9c489e0a2a86e080ba
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Mon Jun 13 16:13:46 2016 +0200
drm/i915: Roll out the helper nonblock tracking

*** Bug 98554 has been marked as a duplicate of this bug. ***

*** Bug 95165 has been marked as a duplicate of this bug. ***

I sent a patch to i915 maintainers and list but got no response.
I wonder if commit e411072d57 "drm/i915: drop the struct_mutex when wedged or trying to reset" fixes the flip_done timeout. I'm going to do some testing.

Commit e411072d57 did not help. In fact it seems to have made things worse. I now get the flip_done timeout both on X startup and shutdown.

Kernel 4.9 went out with this bug unfixed.

4.8.13-1-ARCH
VGA compatible controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (primary) (rev 0c) (prog-if 00 [VGA controller])
Subsystem: Dell Latitude D630
Flags: bus master, fast devsel, latency 0, IRQ 28
Memory at f6e00000 (64-bit, non-prefetchable) [size=1M]
Memory at e0000000 (64-bit, prefetchable) [size=256M]
I/O ports at eff8 [size=8]
[virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
Capabilities: <access denied>
Kernel driver in use: i915
Kernel modules: i915
[drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:29:pipe B] flip_done timed out
It causes occasional Xorg freezes, often when switching from console mode. So it seems that the same/similar hardware is affected :/

(In reply to info.artur from comment #16)
> 4.8.13-1-ARCH
If you're chiming in "me too", please try to use the latest kernels, preferably the drm-tip branch of https://cgit.freedesktop.org/drm-tip. Thanks.

I have just finished testing drm-tip, and it fixes the flip_done timeout for me (Thinkpad x220 i915). I have not bisected so I do not know what the fix is. Just to repeat, kernel 4.9 is broken, so it would be nice if either the ea0000f revert could be pushed to -stable, or the fix isolated and pushed to -stable. Also it would be good if others could test drm-tip.

I still have not seen any comments on the ea0000f revert that I sent to the drm mailing list. This seems like a bug that affects a lot of people and several kernel versions, and is a regression from previous kernels. Should this be escalated? How?
Typed up a patch to avoid the oops and make the nonblocking helpers more robust: https://patchwork.freedesktop.org/patch/128918/
There will still be warnings in dmesg (and timeouts), but the driver should at least survive. For the vblank timeout issue itself I think the best approach is to bisect what fixed it, and then backport that.

Created attachment 128603 [details] [review]
[PATCH] drm/i915: Revert ea0000f0 "Roll out the helper nonblock tracking"
This is the patch I sent to the drm mailing list. It is the same as my previous patch but based on 4.9 and with summary and signoff.

(In reply to Jim Rees from comment #20)
> Created attachment 128603 [details] [review]
> [PATCH] drm/i915: Revert ea0000f0 "Roll out the helper nonblock tracking"
>
> This is the patch I sent to the drm mailing list. It is the same as my
> previous patch but based on 4.9 and with summary and signoff.
Please try the patch from comment #19.

(In reply to Jani Nikula from comment #21)
> (In reply to Jim Rees from comment #20)
> > Created attachment 128603 [details] [review]
> > [PATCH] drm/i915: Revert ea0000f0 "Roll out the helper nonblock tracking"
> >
> > This is the patch I sent to the drm mailing list. It is the same as my
> > previous patch but based on 4.9 and with summary and signoff.
>
> Please try the patch from comment #19.
Yeah, as soon as that has a tested-by (and note that it only fixes the hard crash/oops, there will still be timeouts) I can apply it and stuff it into the stable kernel queue.

If there will still be timeouts, what am I testing for? Is the flip_done timeout a separate bug, and should I open a separate bug report for it?

Created attachment 128686 [details]
linux-tip build with patch and drm.debug enabled dmesg output
dmesg output for drm-tip kernel with patch 128603 on Arch Linux. Still get the crash in addition to timeouts.
Is there some reason not to revert ea0000f0 while we debug this? It's clearly causing problems, is a regression from previous working behavior, and we have a fix that works. ea0000f0 was applied before it was ready, has broken previously working configurations, and should be reverted.

(In reply to Jim Rees from comment #25)
> Is there some reason not to revert ea0000f0 while we debug this? It's
> clearly causing problems, is a regression from previous working behavior,
> and we have a fix that works. ea0000f0 was applied before it was ready, has
> broken previously working configurations, and should be reverted.
We can only *backport* commits to stable. We can't apply commits to stable kernels unless the commits are present in Linus' master. So we need to have this debugged and fixed upstream first. Unfortunately, the revert working for you in v4.9 does not get us any closer to having this fixed upstream. The commit also doesn't cleanly revert on v4.10-rc1 anymore.
Can you please try v4.10-rc1 and/or drm-tip branch of https://cgit.freedesktop.org/drm-tip, plus https://patchwork.freedesktop.org/patch/128918/ on top?

(In reply to erroneous@gmail.com from comment #24)
> Created attachment 128686 [details]
> linux-tip build with patch and drm.debug enabled dmesg output
>
> dmesg output for drm-tip kernel with patch 128603 on Arch Linux. Still get
> the crash in addition to timeouts.
This patch will _not_ fix the WARNING backtrace, but it should fix the Oops/hard-hangs of the driver. Looking at dmesg, only the WARNING is left. Can you pls confirm that the hard hangs (not the long delays when vt switching or similar) are gone?

(In reply to Daniel Vetter from comment #27)
> (In reply to erroneous@gmail.com from comment #24)
> > Created attachment 128686 [details]
> > linux-tip build with patch and drm.debug enabled dmesg output
> >
> > dmesg output for drm-tip kernel with patch 128603 on Arch Linux. Still get
> > the crash in addition to timeouts.
>
> This patch will _not_ fix the WARNING backtrace, but it should fix the
> Oops/hard-hangs of the driver. Looking at dmesg, only the WARNING is left.
> Can you pls confirm that the hard hangs (not the long delays when vt
> switching or similar) are gone?
Sorry, didn't realize it was a warning BT. The system continues on without an oops, but now that I know the BT is just a warning I realize that it never crashed in the first place on my hardware with the 4.8.13 kernel. It only produces the same warning, not the same Oops. Please disregard my tests then, since I couldn't reproduce the same Oops.

I don't get the hard hang or the oops. Do you still want me to test? So should I file a new bug report for the timeout?

(In reply to Daniel Vetter from comment #27)
> This patch will _not_ fix the WARNING backtrace, but it should fix the
> Oops/hard-hangs of the driver. Looking at dmesg, only the WARNING is left.
> Can you pls confirm that the hard hangs (not the long delays when vt
> switching or similar) are gone?
I am currently using drm-tip with your patch applied. My system hangs for a couple of seconds while booting and when switching modes, but I did not experience any hard hangs or crashes as I did with kernel 4.8. I still have this line in my dmesg though:
[drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:34:pipe B] flip_done timed out
Is this what you mean by WARNING?

(In reply to Jim Rees from comment #29)
> I don't get the hard hang or the oops. Do you still want me to test?
On which kernel do you not get the hard hang or the oops?

Created attachment 128882 [details]
journalctl 4.8.17
I'm hitting this rarely when hitting a key to "wake up" the display; the system is not in suspend but gnome-shell has dimmed the display. Hitting a key is what happened 2 seconds before this crash. Including the entire journal output. Two call traces are there, maybe the first one caused the instability leading to the second. I was able to log in remotely with ssh, but the keyboard and mouse were unresponsive after this oops and I couldn't get to a VT.
4.8.17-200.fc24.x86_64
Parameters i915.enable_guc_loading=-1 i915.enable_guc_submission=-1 are used for this event.
[12446.222314] f25h kernel: WARNING: CPU: 3 PID: 1549 at drivers/gpu/drm/i915/intel_display.c:13714 intel_atomic_commit_tail+0x1043/0x1050 [i915]
[12446.222320] f25h kernel: pipe A vblank wait timed out
and then
[12534.964553] f25h kernel: BUG: unable to handle kernel paging request at 00007fb63e44c94b
[12534.964737] f25h kernel: IP: [<ffffffffa80e46eb>] __wake_up_common+0x2b/0x80
[12534.964865] f25h kernel: PGD 2b2190067 PUD 0
[12534.964946] f25h kernel: Oops: 0000 [#1] SMP
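Several comments above hinge on telling a WARNING backtrace and a flip_done timeout (noisy, but the driver survives) apart from a real Oops. A minimal Python sketch for sorting log lines into those groups follows. The regular expressions are taken from the excerpts quoted in this report; the category names and the helper functions `classify` and `summarize` are made up for this illustration and are not part of any kernel or DRM tooling.

```python
import re
from collections import Counter

# Patterns taken from the log excerpts in this report. The category names
# are labels invented for this sketch, not official kernel terminology.
PATTERNS = {
    "oops": re.compile(r"\bOops: |BUG: unable to handle kernel"),
    "warning": re.compile(r"\bWARNING: CPU: \d+ PID: \d+ at "),
    "vblank_timeout": re.compile(r"vblank wait timed out"),
    "flip_done_timeout": re.compile(r"flip_done timed out"),
}


def classify(line):
    """Return the first matching category for one log line, or None."""
    for name, pattern in PATTERNS.items():
        if pattern.search(line):
            return name
    return None


def summarize(lines):
    """Count how many log lines fall into each category."""
    counts = Counter()
    for line in lines:
        category = classify(line)
        if category is not None:
            counts[category] += 1
    return counts


# Sample lines copied from the comments in this report.
SAMPLE = [
    "[12446.222314] f25h kernel: WARNING: CPU: 3 PID: 1549 at drivers/gpu/drm/i915/intel_display.c:13714 intel_atomic_commit_tail+0x1043/0x1050 [i915]",
    "[12446.222320] f25h kernel: pipe A vblank wait timed out",
    "[12534.964553] f25h kernel: BUG: unable to handle kernel paging request at 00007fb63e44c94b",
    "[12534.964737] f25h kernel: IP: [<ffffffffa80e46eb>] __wake_up_common+0x2b/0x80",
    "[12534.964865] f25h kernel: PGD 2b2190067 PUD 0",
    "[12534.964946] f25h kernel: Oops: 0000 [#1] SMP",
    "[drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:34:pipe B] flip_done timed out",
]

if __name__ == "__main__":
    for category, count in summarize(SAMPLE).items():
        print(category, count)
```

Feeding it the lines of a saved dmesg or journalctl dump (for example `summarize(open(path))`, where `path` is whatever file the log was saved to) gives a quick answer to "is only the WARNING left, or is there still an Oops?". Note that the "oops" count is a count of matching lines, so one crash that logs both a BUG: line and an Oops: line counts twice.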
Hi, I have the same issue on Fedora 25 with kernel 4.9.3-200.fc25.x86_64. I have reported about it here https://bugzilla.redhat.com/show_bug.cgi?id=1409228
The system hard freezes when logging out of Gnome, it does not show the GDM login screen. Also when switching TTYs it is either slow or hangs.
I am on:
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)
Dell latitude e6320 with Sandy Bridge graphics, HD3000.
I get the following logs:
Jan 18 18:36:57 latitude kernel: [<ffffffff950a202b>] __warn+0xcb/0xf0
Jan 18 18:36:57 latitude kernel: [<ffffffff950a20af>] warn_slowpath_fmt+0x5f/0x80
Jan 18 18:36:57 latitude kernel: [<ffffffff950e7054>] ? finish_wait+0x54/0x70
Jan 18 18:36:57 latitude kernel: [<ffffffffc04cda6a>] drm_wait_one_vblank+0x1aa/0x1b0 [drm]
Jan 18 18:36:57 latitude kernel: [<ffffffff950e7270>] ? prepare_to_wait_event+0x100/0x100
Jan 18 18:36:57 latitude kernel: [<ffffffffc09aa1f9>] ironlake_crtc_enable+0x779/0xbe0 [i915]
Jan 18 18:36:57 latitude kernel: [<ffffffffc09a6420>] intel_update_crtc+0x50/0xe0 [i915]
Jan 18 18:36:57 latitude kernel: [<ffffffffc09a6516>] intel_update_crtcs+0x66/0x80 [i915]
Jan 18 18:36:57 latitude kernel: [<ffffffffc09a6c1e>] intel_atomic_commit_tail+0x33e/0xff0 [i915]
Jan 18 18:36:57 latitude kernel: [<ffffffffc09a7c23>] intel_atomic_commit+0x353/0x4c0 [i915]
Jan 18 18:36:57 latitude kernel: [<ffffffffc04e055a>] ? drm_atomic_check_only+0x30a/0x590 [drm]
Jan 18 18:36:57 latitude kernel: [<ffffffffc04e0ae3>] ?
drm_atomic_set_mode_prop_for_crtc+0x103/0x110 [drm] Jan 18 18:36:57 latitude kernel: [<ffffffffc04e0829>] drm_atomic_commit+0x49/0x50 [drm] Jan 18 18:36:57 latitude kernel: [<ffffffffc0649c3d>] drm_atomic_helper_set_config+0x7d/0xb0 [drm_kms_helper] Jan 18 18:36:57 latitude kernel: [<ffffffffc04d3895>] drm_mode_set_config_internal+0x65/0x110 [drm] Jan 18 18:36:57 latitude kernel: [<ffffffffc04d507d>] drm_mode_setcrtc+0x3fd/0x4f0 [drm] Jan 18 18:36:57 latitude kernel: [<ffffffffc04cbdcb>] drm_ioctl+0x21b/0x4c0 [drm] Jan 18 18:36:57 latitude kernel: [<ffffffffc04d4c80>] ? drm_mode_getcrtc+0x140/0x140 [drm] Jan 18 18:36:57 latitude kernel: [<ffffffff9526db43>] do_vfs_ioctl+0xa3/0x5f0 Jan 18 18:36:57 latitude kernel: [<ffffffff9526e109>] SyS_ioctl+0x79/0x90 Jan 18 18:36:57 latitude kernel: [<ffffffff9581bbf7>] entry_SYSCALL_64_fastpath+0x1a/0xa9 Jan 18 18:36:57 latitude kernel: ---[ end trace 337cd55a01ebeea6 ]--- Jan 18 18:36:57 latitude kernel: ------------[ cut here ]------------ Jan 18 18:36:57 latitude kernel: WARNING: CPU: 3 PID: 886 at drivers/gpu/drm/i915/intel_display.c:14191 intel_atomic_commit_tail+0xfd0/0xff0 [i915] Jan 18 18:36:57 latitude kernel: pipe A vblank wait timed out Jan 18 18:36:57 latitude kernel: Modules linked in: binfmt_misc fuse uas usb_storage ccm snd_hda_codec_hdmi snd_hda_codec_idt snd_hda_codec_generic intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel iTCO_wdt i915 mei_wdt i Jan 18 18:36:57 latitude kernel: CPU: 3 PID: 886 Comm: Xorg Tainted: G W 4.9.3-200.fc25.x86_64 #1 Jan 18 18:36:57 latitude kernel: Hardware name: Dell Inc. 
Latitude E6320/09PHH9, BIOS A19 11/14/2013 Jan 18 18:36:57 latitude kernel: ffff98e5414afa60 ffffffff953f3ddd ffff98e5414afab0 0000000000000000 Jan 18 18:36:57 latitude kernel: ffff98e5414afaa0 ffffffff950a202b 0000376fd279f000 0000000000000000 Jan 18 18:36:57 latitude kernel: 0000000000000000 0000000000000000 0000000000000001 ffff8b4ae0fda000 Jan 18 18:36:57 latitude kernel: Call Trace: Jan 18 18:36:57 latitude kernel: [<ffffffff953f3ddd>] dump_stack+0x63/0x86 Jan 18 18:36:57 latitude kernel: [<ffffffff950a202b>] __warn+0xcb/0xf0 Jan 18 18:36:57 latitude kernel: [<ffffffff950a20af>] warn_slowpath_fmt+0x5f/0x80 Jan 18 18:36:57 latitude kernel: [<ffffffff950e7054>] ? finish_wait+0x54/0x70 Jan 18 18:36:57 latitude kernel: [<ffffffffc09a78b0>] intel_atomic_commit_tail+0xfd0/0xff0 [i915] Jan 18 18:36:57 latitude kernel: [<ffffffff950e7270>] ? prepare_to_wait_event+0x100/0x100 Jan 18 18:36:57 latitude kernel: [<ffffffffc09a7c23>] intel_atomic_commit+0x353/0x4c0 [i915] Jan 18 18:36:57 latitude kernel: [<ffffffffc04e055a>] ? drm_atomic_check_only+0x30a/0x590 [drm] Jan 18 18:36:57 latitude kernel: [<ffffffffc04e0ae3>] ? drm_atomic_set_mode_prop_for_crtc+0x103/0x110 [drm] Jan 18 18:36:57 latitude kernel: [<ffffffffc04e0829>] drm_atomic_commit+0x49/0x50 [drm] Jan 18 18:36:57 latitude kernel: [<ffffffffc0649c3d>] drm_atomic_helper_set_config+0x7d/0xb0 [drm_kms_helper] Jan 18 18:36:57 latitude kernel: [<ffffffffc04d3895>] drm_mode_set_config_internal+0x65/0x110 [drm] Jan 18 18:36:57 latitude kernel: [<ffffffffc04d507d>] drm_mode_setcrtc+0x3fd/0x4f0 [drm] Jan 18 18:36:57 latitude kernel: [<ffffffffc04cbdcb>] drm_ioctl+0x21b/0x4c0 [drm] Jan 18 18:36:57 latitude kernel: [<ffffffffc04d4c80>] ? 
drm_mode_getcrtc+0x140/0x140 [drm] Jan 18 18:36:57 latitude kernel: [<ffffffff9526db43>] do_vfs_ioctl+0xa3/0x5f0 Jan 18 18:36:57 latitude kernel: [<ffffffff9526e109>] SyS_ioctl+0x79/0x90 Jan 18 18:36:57 latitude kernel: [<ffffffff9581bbf7>] entry_SYSCALL_64_fastpath+0x1a/0xa9 Jan 18 18:36:57 latitude kernel: ---[ end trace 337cd55a01ebeea7 ]--- Jan 18 18:37:07 latitude /usr/libexec/gdm-x-session[884]: (EE) intel(0): sna_mode_shutdown_crtc: invalid state found on pipe 1, disabling CRTC:30 Jan 18 18:37:07 latitude kernel: [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:26:pipe A] flip_done timed out Jan 18 18:37:17 latitude kernel: [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:26:pipe A] flip_done timed out Jan 18 18:37:28 latitude kernel: [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:26:pipe A] flip_done timed out Jan 18 18:41:39 latitude kernel: ------------[ cut here ]------------ Jan 18 18:41:39 latitude kernel: WARNING: CPU: 0 PID: 1127 at drivers/gpu/drm/i915/intel_display.c:14191 intel_atomic_commit_tail+0xfd0/0xff0 [i915] Jan 18 18:41:39 latitude kernel: pipe A vblank wait timed out Jan 18 18:41:39 latitude kernel: Modules linked in: ccm snd_hda_codec_hdmi snd_hda_codec_idt snd_hda_codec_generic intel_rapl snd_hda_intel x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_codec kvm_intel iTCO_wdt arc4 kvm mei_wdt iT Jan 18 18:41:39 latitude kernel: CPU: 0 PID: 1127 Comm: Xorg Not tainted 4.9.3-200.fc25.x86_64 #1 Jan 18 18:41:39 latitude kernel: Hardware name: Dell Inc. 
Latitude E6320/09PHH9, BIOS A19 11/14/2013 Jan 18 18:41:39 latitude kernel: ffffac8e417e77e0 ffffffff8d3f3ddd ffffac8e417e7830 0000000000000000 Jan 18 18:41:39 latitude kernel: ffffac8e417e7820 ffffffff8d0a202b 0000376f4de99800 0000000000000000 Jan 18 18:41:39 latitude kernel: 0000000000000000 0000000000000000 0000000000000001 ffff9f216190f000 Jan 18 18:41:39 latitude kernel: Call Trace: Jan 18 18:41:39 latitude kernel: [<ffffffff8d3f3ddd>] dump_stack+0x63/0x86 Jan 18 18:41:39 latitude kernel: [<ffffffff8d0a202b>] __warn+0xcb/0xf0 Jan 18 18:41:39 latitude kernel: [<ffffffff8d0a20af>] warn_slowpath_fmt+0x5f/0x80 Jan 18 18:41:39 latitude kernel: [<ffffffff8d0e7054>] ? finish_wait+0x54/0x70 Jan 18 18:41:39 latitude kernel: [<ffffffffc07768b0>] intel_atomic_commit_tail+0xfd0/0xff0 [i915] Jan 18 18:41:39 latitude kernel: [<ffffffff8d0e7270>] ? prepare_to_wait_event+0x100/0x100 Jan 18 18:41:39 latitude kernel: [<ffffffffc0776c23>] intel_atomic_commit+0x353/0x4c0 [i915] Jan 18 18:41:39 latitude kernel: [<ffffffffc04d255a>] ? drm_atomic_check_only+0x30a/0x590 [drm] Jan 18 18:41:39 latitude kernel: [<ffffffffc04d2d21>] ? drm_atomic_add_affected_connectors+0x61/0xf0 [drm] Jan 18 18:41:39 latitude kernel: [<ffffffffc04d2829>] drm_atomic_commit+0x49/0x50 [drm] Jan 18 18:41:39 latitude kernel: [<ffffffffc051bbdc>] restore_fbdev_mode+0x14c/0x270 [drm_kms_helper] Jan 18 18:41:39 latitude kernel: [<ffffffffc051d7b4>] drm_fb_helper_restore_fbdev_mode_unlocked+0x34/0x80 [drm_kms_helper] Jan 18 18:41:39 latitude kernel: [<ffffffffc051d82d>] drm_fb_helper_set_par+0x2d/0x60 [drm_kms_helper] Jan 18 18:41:39 latitude kernel: [<ffffffffc0790cd8>] intel_fbdev_set_par+0x18/0x70 [i915] Jan 18 18:41:39 latitude kernel: [<ffffffff8d4774f6>] fb_set_var+0x236/0x460 Jan 18 18:41:39 latitude kernel: [<ffffffff8d406559>] ? flex_array_get_ptr+0x9/0x20 Jan 18 18:41:39 latitude kernel: [<ffffffff8d38caa6>] ? type_attribute_bounds_av+0x46/0x1e0 Jan 18 18:41:39 latitude kernel: [<ffffffff8d1c5567>] ? 
find_get_entries+0x177/0x2b0 Jan 18 18:41:39 latitude kernel: [<ffffffff8d22fac1>] ? __slab_free+0xa1/0x2a0 Jan 18 18:41:39 latitude kernel: [<ffffffff8d46d31f>] fbcon_blank+0x30f/0x350 Jan 18 18:41:39 latitude kernel: [<ffffffff8d501592>] do_unblank_screen+0xd2/0x1a0 Jan 18 18:41:39 latitude kernel: [<ffffffff8d4f7277>] vt_ioctl+0x507/0x12a0 Jan 18 18:41:39 latitude kernel: [<ffffffff8d4eb795>] tty_ioctl+0x355/0xc40 Jan 18 18:41:39 latitude kernel: [<ffffffff8d37aa58>] ? selinux_inode_free_security+0x58/0x70 Jan 18 18:41:39 latitude kernel: [<ffffffff8d29e271>] ? fsnotify_destroy_marks+0x61/0x80 Jan 18 18:41:39 latitude kernel: [<ffffffff8d10d45d>] ? call_rcu_sched+0x1d/0x20 Jan 18 18:41:39 latitude kernel: [<ffffffff8d1df53d>] ? shmem_destroy_inode+0x2d/0x40 Jan 18 18:41:39 latitude kernel: [<ffffffff8d26db43>] do_vfs_ioctl+0xa3/0x5f0 Jan 18 18:41:39 latitude kernel: [<ffffffff8d26e109>] SyS_ioctl+0x79/0x90 Jan 18 18:41:39 latitude kernel: [<ffffffff8d81bbf7>] entry_SYSCALL_64_fastpath+0x1a/0xa9 Jan 18 18:41:39 latitude kernel: ---[ end trace 819c87644d1c8c2a ]--- Jan 18 18:41:39 latitude kernel: [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:26:pipe A] flip_done timed out Jan 18 18:41:39 latitude /usr/libexec/gdm-x-session[1125]: (II) Server terminated successfully (0). Closing log file. 
Jan 18 18:44:59 latitude kernel: [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:26:pipe A] flip_done timed out Jan 18 18:44:59 latitude kernel: ------------[ cut here ]------------ Jan 18 18:44:59 latitude kernel: WARNING: CPU: 0 PID: 853 at drivers/gpu/drm/i915/intel_display.c:14191 intel_atomic_commit_tail+0xfd0/0xff0 [i915] Jan 18 18:44:59 latitude kernel: pipe A vblank wait timed out Jan 18 18:44:59 latitude kernel: Modules linked in: ccm snd_hda_codec_hdmi snd_hda_codec_idt snd_hda_codec_generic i915 mei_wdt intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel arc4 iTCO_wdt iTCO_vendor_support ppdev snd Jan 18 18:44:59 latitude kernel: CPU: 0 PID: 853 Comm: Xorg Tainted: G W 4.9.3-200.fc25.x86_64 #1 Jan 18 18:44:59 latitude kernel: Hardware name: Dell Inc. Latitude E6320/09PHH9, BIOS A19 11/14/2013 Jan 18 18:44:59 latitude kernel: ffffb14e4136fa60 ffffffff8b3f3ddd ffffb14e4136fab0 0000000000000000 Jan 18 18:44:59 latitude kernel: ffffb14e4136faa0 ffffffff8b0a202b 0000376f1ff22b40 0000000000000000 Jan 18 18:44:59 latitude kernel: 0000000000000000 0000000000000000 0000000000000001 ffff89341ff91000 Jan 18 18:44:59 latitude kernel: Call Trace: Jan 18 18:44:59 latitude kernel: [<ffffffff8b3f3ddd>] dump_stack+0x63/0x86 Jan 18 18:44:59 latitude kernel: [<ffffffff8b0a202b>] __warn+0xcb/0xf0 Jan 18 18:44:59 latitude kernel: [<ffffffff8b0a20af>] warn_slowpath_fmt+0x5f/0x80 Jan 18 18:44:59 latitude kernel: [<ffffffff8b0e7054>] ? finish_wait+0x54/0x70 Jan 18 18:44:59 latitude kernel: [<ffffffffc07168b0>] intel_atomic_commit_tail+0xfd0/0xff0 [i915] Jan 18 18:44:59 latitude kernel: [<ffffffff8b0e7270>] ? prepare_to_wait_event+0x100/0x100 Jan 18 18:44:59 latitude kernel: [<ffffffffc0716c23>] intel_atomic_commit+0x353/0x4c0 [i915] Jan 18 18:44:59 latitude kernel: [<ffffffffc019855a>] ? drm_atomic_check_only+0x30a/0x590 [drm] Jan 18 18:44:59 latitude kernel: [<ffffffffc0197fa0>] ? 
drm_atomic_set_crtc_for_connector+0xc0/0xf0 [drm]
Jan 18 18:44:59 latitude kernel: [<ffffffffc0198829>] drm_atomic_commit+0x49/0x50 [drm]
Jan 18 18:44:59 latitude kernel: [<ffffffffc02d2c3d>] drm_atomic_helper_set_config+0x7d/0xb0 [drm_kms_helper]
Jan 18 18:44:59 latitude kernel: [<ffffffffc018b895>] drm_mode_set_config_internal+0x65/0x110 [drm]
Jan 18 18:44:59 latitude kernel: [<ffffffffc018d07d>] drm_mode_setcrtc+0x3fd/0x4f0 [drm]
Jan 18 18:44:59 latitude kernel: [<ffffffffc0183dcb>] drm_ioctl+0x21b/0x4c0 [drm]
Jan 18 18:44:59 latitude kernel: [<ffffffffc018cc80>] ? drm_mode_getcrtc+0x140/0x140 [drm]
Jan 18 18:44:59 latitude kernel: [<ffffffff8b26db43>] do_vfs_ioctl+0xa3/0x5f0
Jan 18 18:44:59 latitude kernel: [<ffffffff8b26e109>] SyS_ioctl+0x79/0x90
Jan 18 18:44:59 latitude kernel: [<ffffffff8b81bbf7>] entry_SYSCALL_64_fastpath+0x1a/0xa9
Jan 18 18:44:59 latitude kernel: ---[ end trace 65ae7d3243347a21 ]---
Jan 18 18:45:09 latitude kernel: [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:26:pipe A] flip_done timed out

I just discovered that the workaround described in bug 93782, comment 40 also solves the flip_done timeout on my machine (kernel 4.9.5-100.fc24.x86_64). All hangs are now gone and dmesg is clean.
Just add
video=SVIDEO-1:d
to the kernel command line and give it a try.

I can confirm that the patch in comment #19 gets rid of the crashes (leaving the timeouts) and that the workaround in comment #34 gets rid of the timeouts as well. I'll test drm-tip eventually.
00:02.0 VGA compatible controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (primary) (rev 03)
00:02.1 Display controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (secondary) (rev 03)

(In reply to willma from comment #34)
> I just discovered that the workaround described in bug 93782, comment 40
> also solves the flip_done timeout on my machine (kernel
> 4.9.5-100.fc24.x86_64). All hangs are now gone and dmesg is clean.
>
> Just add
>
> video=SVIDEO-1:d
>
> to the kernel command line and give it a try.
So far so good, haven't had any hangs/messages or slowdowns when switching TTYs yet. So it seems adding video=SVIDEO-1:d helps. Will keep an eye on it for a couple of days of uptime before I can tell if my GNU+Linux is great again :) Let's hope a definitive fix is coming soon.

(In reply to willma from comment #34)
> video=SVIDEO-1:d
Good catch here also. So the SVIDEO is wrongly enabled since kernels 4.8 ...

The workaround described in comment #34 does not fully solve the problem. Today I did some further testing, based on some of the hangs/freezes I got earlier without the mentioned workaround.
First I did:
systemctl isolate multi-user.target
and then I tried starting the GUI (X11) with
startx
Then I logged out, thereby going back to the TTY. I did this 3 times in total. The third time the system locked up just as before, with the following log:
Jan 29 10:39:14 latitude kernel: ------------[ cut here ]------------
Jan 29 10:39:14 latitude kernel: WARNING: CPU: 2 PID: 7178 at drivers/gpu/drm/i915/intel_display.c:14180 intel_atomic_commit_tail+0xfd0/0xff0 [i915]
Jan 29 10:39:14 latitude kernel: pipe A vblank wait timed out
Jan 29 10:39:14 latitude kernel: Modules linked in: ccm snd_hda_codec_hdmi intel_rapl snd_hda_codec_idt snd_hda_codec_generic x86_pkg_temp_thermal inte
Jan 29 10:39:14 latitude kernel: CPU: 2 PID: 7178 Comm: Xorg Not tainted 4.9.5-200.fc25.x86_64 #1
Jan 29 10:39:14 latitude kernel: Hardware name: Dell Inc.
Latitude E6320/09PHH9, BIOS A19 11/14/2013 Jan 29 10:39:14 latitude kernel: ffffb2ca80d6f7e0 ffffffff933f40bd ffffb2ca80d6f830 0000000000000000 Jan 29 10:39:14 latitude kernel: ffffb2ca80d6f820 ffffffff930a202b 0000376407463bc0 0000000000000000 Jan 29 10:39:14 latitude kernel: 0000000000000000 0000000000000000 0000000000000001 ffff97fc2a885000 Jan 29 10:39:14 latitude kernel: Call Trace: Jan 29 10:39:14 latitude kernel: [<ffffffff933f40bd>] dump_stack+0x63/0x86 Jan 29 10:39:14 latitude kernel: [<ffffffff930a202b>] __warn+0xcb/0xf0 Jan 29 10:39:14 latitude kernel: [<ffffffff930a20af>] warn_slowpath_fmt+0x5f/0x80 Jan 29 10:39:14 latitude kernel: [<ffffffff930e7054>] ? finish_wait+0x54/0x70 Jan 29 10:39:14 latitude kernel: [<ffffffffc0767920>] intel_atomic_commit_tail+0xfd0/0xff0 [i915] Jan 29 10:39:14 latitude kernel: [<ffffffff930e7270>] ? prepare_to_wait_event+0x100/0x100 Jan 29 10:39:14 latitude kernel: [<ffffffffc0767c93>] intel_atomic_commit+0x353/0x4c0 [i915] Jan 29 10:39:14 latitude kernel: [<ffffffffc04f156a>] ? drm_atomic_check_only+0x30a/0x590 [drm] Jan 29 10:39:14 latitude kernel: [<ffffffffc04f1d31>] ? drm_atomic_add_affected_connectors+0x61/0xf0 [drm] Jan 29 10:39:14 latitude kernel: [<ffffffffc04f1839>] drm_atomic_commit+0x49/0x50 [drm] Jan 29 10:39:14 latitude kernel: [<ffffffffc053abec>] restore_fbdev_mode+0x14c/0x270 [drm_kms_helper] Jan 29 10:39:14 latitude kernel: [<ffffffffc053c7c4>] drm_fb_helper_restore_fbdev_mode_unlocked+0x34/0x80 [drm_kms_helper] Jan 29 10:39:14 latitude kernel: [<ffffffffc053c83d>] drm_fb_helper_set_par+0x2d/0x60 [drm_kms_helper] Jan 29 10:39:14 latitude kernel: [<ffffffffc0781d68>] intel_fbdev_set_par+0x18/0x70 [i915] Jan 29 10:39:14 latitude kernel: [<ffffffff93478816>] fb_set_var+0x236/0x460 Jan 29 10:39:14 latitude kernel: [<ffffffff9322f795>] ? kmem_cache_alloc+0x195/0x1b0 Jan 29 10:39:14 latitude kernel: [<ffffffff933750a7>] ? avc_alloc_node+0x27/0x120 Jan 29 10:39:14 latitude kernel: [<ffffffff93406839>] ? 
flex_array_get_ptr+0x9/0x20 Jan 29 10:39:14 latitude kernel: [<ffffffff9338cdb6>] ? type_attribute_bounds_av+0x46/0x1e0 Jan 29 10:39:14 latitude kernel: [<ffffffff9346e63f>] fbcon_blank+0x30f/0x350 Jan 29 10:39:14 latitude kernel: [<ffffffff935028d2>] do_unblank_screen+0xd2/0x1a0 Jan 29 10:39:14 latitude kernel: [<ffffffff934f85b7>] vt_ioctl+0x507/0x12a0 Jan 29 10:39:14 latitude kernel: [<ffffffff934ecad5>] tty_ioctl+0x355/0xc40 Jan 29 10:39:14 latitude kernel: [<ffffffff9337ad68>] ? selinux_inode_free_security+0x58/0x70 Jan 29 10:39:14 latitude kernel: [<ffffffff9329e581>] ? fsnotify_destroy_marks+0x61/0x80 Jan 29 10:39:14 latitude kernel: [<ffffffff9326de03>] do_vfs_ioctl+0xa3/0x5f0 Jan 29 10:39:14 latitude kernel: [<ffffffff9326e3c9>] SyS_ioctl+0x79/0x90 Jan 29 10:39:14 latitude kernel: [<ffffffff9381cc77>] entry_SYSCALL_64_fastpath+0x1a/0xa9 Jan 29 10:39:14 latitude kernel: ---[ end trace 0022451fc8e72435 ]--- Jan 29 10:39:14 latitude kernel: [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:26:pipe A] flip_done timed out Jan 29 10:39:24 latitude kernel: [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:26:pipe A] flip_done timed out Jan 29 10:39:43 latitude kernel: [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:26:pipe A] flip_done timed out Jan 29 10:39:43 latitude pkexec[7945]: pam_systemd(polkit-1:session): Cannot create session: Already running in a session Jan 29 10:39:43 latitude audit[7945]: USER_START pid=7945 uid=1000 auid=1000 ses=6 subj=unconfined_u:unconfined_r:xserver_t:s0-s0:c0.c1023 msg='op=PAM: Jan 29 10:39:43 latitude pkexec[7945]: pam_unix(polkit-1:session): session opened for user root by (uid=1000) Jan 29 10:39:43 latitude pkexec[7945]: Kadir: Executing command [USER=root] [TTY=unknown] [CWD=/home/Kadir] [COMMAND=/usr/libexec/xf86-video-intel-back Jan 29 10:39:47 latitude chronyd[739]: Selected source 188.166.57.207 Jan 29 10:39:53 latitude kernel: 
[drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:26:pipe A] flip_done timed out Jan 29 10:39:53 latitude kernel: ------------[ cut here ]------------ Jan 29 10:39:53 latitude kernel: WARNING: CPU: 2 PID: 7940 at drivers/gpu/drm/i915/intel_display.c:14180 intel_atomic_commit_tail+0xfd0/0xff0 [i915] Jan 29 10:39:53 latitude kernel: pipe A vblank wait timed out Jan 29 10:39:53 latitude kernel: Modules linked in: ccm snd_hda_codec_hdmi intel_rapl snd_hda_codec_idt snd_hda_codec_generic x86_pkg_temp_thermal inte Jan 29 10:39:53 latitude kernel: CPU: 2 PID: 7940 Comm: Xorg Tainted: G W 4.9.5-200.fc25.x86_64 #1 Jan 29 10:39:53 latitude kernel: Hardware name: Dell Inc. Latitude E6320/09PHH9, BIOS A19 11/14/2013 Jan 29 10:39:53 latitude kernel: ffffb2ca80f37a60 ffffffff933f40bd ffffb2ca80f37ab0 0000000000000000 Jan 29 10:39:53 latitude kernel: ffffb2ca80f37aa0 ffffffff930a202b 00003764f6f14540 0000000000000000 Jan 29 10:39:53 latitude kernel: 0000000000000000 0000000000000000 0000000000000001 ffff97fc2a885000 Jan 29 10:39:53 latitude kernel: Call Trace: Jan 29 10:39:53 latitude kernel: [<ffffffff933f40bd>] dump_stack+0x63/0x86 Jan 29 10:39:53 latitude kernel: [<ffffffff930a202b>] __warn+0xcb/0xf0 Jan 29 10:39:53 latitude kernel: [<ffffffff930a20af>] warn_slowpath_fmt+0x5f/0x80 Jan 29 10:39:53 latitude kernel: [<ffffffff930e7054>] ? finish_wait+0x54/0x70 Jan 29 10:39:53 latitude kernel: [<ffffffffc0767920>] intel_atomic_commit_tail+0xfd0/0xff0 [i915] Jan 29 10:39:53 latitude kernel: [<ffffffff930e7270>] ? prepare_to_wait_event+0x100/0x100 Jan 29 10:39:53 latitude kernel: [<ffffffffc0767c93>] intel_atomic_commit+0x353/0x4c0 [i915] Jan 29 10:39:53 latitude kernel: [<ffffffffc04f156a>] ? drm_atomic_check_only+0x30a/0x590 [drm] Jan 29 10:39:53 latitude kernel: [<ffffffffc04f0fb0>] ? 
drm_atomic_set_crtc_for_connector+0xc0/0xf0 [drm] Jan 29 10:39:53 latitude kernel: [<ffffffffc04f1839>] drm_atomic_commit+0x49/0x50 [drm] Jan 29 10:39:53 latitude kernel: [<ffffffffc0538c4d>] drm_atomic_helper_set_config+0x7d/0xb0 [drm_kms_helper] Jan 29 10:39:53 latitude kernel: [<ffffffffc04e48a5>] drm_mode_set_config_internal+0x65/0x110 [drm] Jan 29 10:39:53 latitude kernel: [<ffffffffc04e608d>] drm_mode_setcrtc+0x3fd/0x4f0 [drm] Jan 29 10:39:53 latitude kernel: [<ffffffffc04dcdcb>] drm_ioctl+0x21b/0x4c0 [drm] Jan 29 10:39:53 latitude kernel: [<ffffffff931fa3e5>] ? do_wp_page+0x105/0x870 Jan 29 10:39:53 latitude kernel: [<ffffffffc04e5c90>] ? drm_mode_getcrtc+0x140/0x140 [drm] Jan 29 10:39:53 latitude kernel: [<ffffffff9326de03>] do_vfs_ioctl+0xa3/0x5f0 Jan 29 10:39:53 latitude kernel: [<ffffffff9326e3c9>] SyS_ioctl+0x79/0x90 Jan 29 10:39:53 latitude kernel: [<ffffffff9381cc77>] entry_SYSCALL_64_fastpath+0x1a/0xa9 Jan 29 10:39:53 latitude kernel: ---[ end trace 0022451fc8e72436 ]--- Jan 29 10:40:03 latitude kernel: [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:26:pipe A] flip_done timed out So the workaround does not fully fix the ussue for me. I just noticed that after a suspend and resume, a simple logout and back to GDM completely freezed the system. I am on Fedora 25 4.9.5-200.fc25.x86_64 Jan 29 11:25:09 latitude kernel: [<ffffffff870a202b>] __warn+0xcb/0xf0 Jan 29 11:25:09 latitude kernel: [<ffffffff870a20af>] warn_slowpath_fmt+0x5f/0x80 Jan 29 11:25:09 latitude kernel: [<ffffffff870e7054>] ? finish_wait+0x54/0x70 Jan 29 11:25:09 latitude kernel: [<ffffffffc0390920>] intel_atomic_commit_tail+0xfd0/0xff0 [i915] Jan 29 11:25:09 latitude kernel: [<ffffffff870e7270>] ? prepare_to_wait_event+0x100/0x100 Jan 29 11:25:09 latitude kernel: [<ffffffffc0390c93>] intel_atomic_commit+0x353/0x4c0 [i915] Jan 29 11:25:09 latitude kernel: [<ffffffffc017756a>] ? 
drm_atomic_check_only+0x30a/0x590 [drm] Jan 29 11:25:09 latitude kernel: [<ffffffffc0177d31>] ? drm_atomic_add_affected_connectors+0x61/0xf0 [drm] Jan 29 11:25:09 latitude kernel: [<ffffffffc0177839>] drm_atomic_commit+0x49/0x50 [drm] Jan 29 11:25:09 latitude kernel: [<ffffffffc01ffbec>] restore_fbdev_mode+0x14c/0x270 [drm_kms_helper] Jan 29 11:25:09 latitude kernel: [<ffffffffc02017c4>] drm_fb_helper_restore_fbdev_mode_unlocked+0x34/0x80 [drm_kms_helper] Jan 29 11:25:09 latitude kernel: [<ffffffffc020183d>] drm_fb_helper_set_par+0x2d/0x60 [drm_kms_helper] Jan 29 11:25:09 latitude kernel: [<ffffffffc03aad68>] intel_fbdev_set_par+0x18/0x70 [i915] Jan 29 11:25:09 latitude kernel: [<ffffffff87478816>] fb_set_var+0x236/0x460 Jan 29 11:25:09 latitude kernel: [<ffffffff87406839>] ? flex_array_get_ptr+0x9/0x20 Jan 29 11:25:09 latitude kernel: [<ffffffff8738cdb6>] ? type_attribute_bounds_av+0x46/0x1e0 Jan 29 11:25:09 latitude kernel: [<ffffffff871c55c7>] ? find_get_entries+0x177/0x2b0 Jan 29 11:25:09 latitude kernel: [<ffffffff8746e63f>] fbcon_blank+0x30f/0x350 Jan 29 11:25:09 latitude kernel: [<ffffffff875028d2>] do_unblank_screen+0xd2/0x1a0 Jan 29 11:25:09 latitude kernel: [<ffffffff874f85b7>] vt_ioctl+0x507/0x12a0 Jan 29 11:25:09 latitude kernel: [<ffffffff874ecad5>] tty_ioctl+0x355/0xc40 Jan 29 11:25:09 latitude kernel: [<ffffffff8737ad68>] ? selinux_inode_free_security+0x58/0x70 Jan 29 11:25:09 latitude kernel: [<ffffffff8729e581>] ? fsnotify_destroy_marks+0x61/0x80 Jan 29 11:25:09 latitude kernel: [<ffffffff8710d45d>] ? call_rcu_sched+0x1d/0x20 Jan 29 11:25:09 latitude kernel: [<ffffffff871df69d>] ? 
shmem_destroy_inode+0x2d/0x40 Jan 29 11:25:09 latitude kernel: [<ffffffff8726de03>] do_vfs_ioctl+0xa3/0x5f0 Jan 29 11:25:09 latitude kernel: [<ffffffff8726e3c9>] SyS_ioctl+0x79/0x90 Jan 29 11:25:09 latitude kernel: [<ffffffff8781cc77>] entry_SYSCALL_64_fastpath+0x1a/0xa9 Jan 29 11:25:09 latitude kernel: ---[ end trace 1506db7661e4884c ]--- Jan 29 11:25:09 latitude kernel: [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:26:pipe A] flip_done timed out I tested commit 8c3608f from drm-tip. During a plain bootup into lightdm, it gave several WARNING backtraces like: > [ 12.356033] ------------[ cut here ]------------ > [ 12.356081] WARNING: CPU: 1 PID: 5 at drivers/gpu/drm/drm_irq.c:1199 drm_wait_one_vblank+0x154/0x1a0 > [drm] > [ 12.356090] vblank wait timed out on crtc 0 > [ 12.356096] Modules linked in: hid_generic(E) usbhid(E) hid(E) arc4(E) psmouse(E) ahci(E) ata_piix(E) libahci(E) libata(E) scsi_mod(E) ath5k(E) mac80211(E) ath(E) cfg80211(E) rfkill(E) tg3(E) ptp(E) pps_core(E) libphy(E) thermal(E) i915(E) i2c_algo_bit(E) drm_kms_helper(E) ehci_pci(E) uhci_hcd(E) ehci_hcd(E) fjes(E) video(E) button(E) usbcore(E) drm(E) > [ 12.356148] CPU: 1 PID: 5 Comm: kworker/u4:0 Tainted: G E 4.10.0-rc5+ #1 > [ 12.356158] Hardware name: Acer Aspire 2920 /Calado , BIOS V1.13 02/14/2008 > [ 12.356172] Workqueue: events_unbound async_run_entry_fn > [ 12.356179] Call Trace: > [ 12.356189] ? dump_stack+0x5c/0x77 > [ 12.356196] ? __warn+0xc4/0xe0 > [ 12.356202] ? warn_slowpath_fmt+0x5f/0x80 > [ 12.356209] ? finish_wait+0x3c/0x80 > [ 12.356230] ? drm_wait_one_vblank+0x154/0x1a0 [drm] > [ 12.356236] ? remove_wait_queue+0x60/0x60 > [ 12.356322] ? intel_get_load_detect_pipe+0x5a8/0x610 [i915] > [ 12.356375] ? intel_tv_detect+0x156/0x520 [i915] > [ 12.356390] ? drm_helper_probe_single_connector_modes+0x2bb/0x510 [drm_kms_helper] > [ 12.356407] ? drm_setup_crtcs+0x7d/0xa10 [drm_kms_helper] > [ 12.356415] ? check_preempt_wakeup+0xeb/0x200 > [ 12.356421] ? 
sched_clock_cpu+0x41/0x90 > [ 12.356434] ? drm_fb_helper_initial_config+0x79/0x400 [drm_kms_helper] > [ 12.356441] ? ttwu_do_wakeup+0x14/0xe0 > [ 12.356494] ? intel_fbdev_initial_config+0x14/0x30 [i915] > [ 12.356501] ? async_run_entry_fn+0x34/0x160 > [ 12.356508] ? process_one_work+0x15e/0x420 > [ 12.356514] ? worker_thread+0x65/0x4b0 > [ 12.356520] ? rescuer_thread+0x390/0x390 > [ 12.356526] ? kthread+0x104/0x140 > [ 12.356532] ? kthread_park+0x80/0x80 > [ 12.356540] ? ret_from_fork+0x26/0x40 > [ 12.356546] ---[ end trace 591a4980f651c4a0 ]--- > [ 12.502499] ------------[ cut here ]------------ and also an *ERROR* (followed by some more similar backtraces): > [ 338.652100] [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:29:pipe A] flip_done timed out > [ 338.755173] ------------[ cut here ]------------ > [ 338.755231] WARNING: CPU: 1 PID: 1751 at drivers/gpu/drm/drm_irq.c:1199 drm_wait_one_vblank+0x154/0x1a0 [drm] > [ 338.755237] vblank wait timed out on crtc 0 > [ 338.755241] Modules linked in: ib_iser(E) rdma_cm(E) iw_cm(E) ib_cm(E) ib_core(E) configfs(E) ext4(E) crc16(E) jbd2(E) fscrypto(E) mbcache(E) uvcvideo(E) videobuf2_vmalloc(E) videobuf2_memops(E) videobuf2_v4l2(E) videobuf2_core(E) videodev(E) media(E) iTCO_wdt(E) iTCO_vendor_support(E) snd_hda_codec_hdmi(E) snd_hda_codec_realtek(E) snd_hda_codec_generic(E) snd_hda_intel(E) snd_hda_codec(E) snd_hda_core(E) snd_hwdep(E) snd_pcm(E) snd_timer(E) coretemp(E) snd(E) i2c_i801(E) joydev(E) soundcore(E) pcspkr(E) lpc_ich(E) mfd_core(E) serio_raw(E) ac(E) battery(E) evdev(E) acpi_cpufreq(E) shpchp(E) tpm_tis(E) tpm_tis_core(E) tpm(E) fuse(E) parport_pc(E) ppdev(E) lp(E) parport(E) autofs4(E) ecb(E) crypto_simd(E) glue_helper(E) cryptd(E) aes_x86_64(E) xts(E) gf128mul(E) algif_skcipher(E) af_alg(E) dm_crypt(E) > [ 338.755355] xfs(E) crc32c_generic(E) libcrc32c(E) dm_round_robin(E) iscsi_tcp(E) libiscsi_tcp(E) libiscsi(E) scsi_transport_iscsi(E) iscsi_ibft(E) iscsi_boot_sysfs(E) 
virtio_pci(E) virtio_net(E) virtio_ring(E) virtio(E) ctr(E) ccm(E) dm_service_time(E) dm_multipath(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) nf_conntrack(E) dm_mod(E) sg(E) sd_mod(E) sr_mod(E) cdrom(E) ata_generic(E) hid_generic(E) usbhid(E) hid(E) arc4(E) psmouse(E) ahci(E) ata_piix(E) libahci(E) libata(E) scsi_mod(E) ath5k(E) mac80211(E) ath(E) cfg80211(E) rfkill(E) tg3(E) ptp(E) pps_core(E) libphy(E) thermal(E) i915(E) i2c_algo_bit(E) drm_kms_helper(E) ehci_pci(E) uhci_hcd(E) ehci_hcd(E) fjes(E) video(E) button(E) usbcore(E) drm(E) > [ 338.755463] CPU: 1 PID: 1751 Comm: Xorg Tainted: G W E 4.10.0-rc5+ #1 > [ 338.755468] Hardware name: Acer Aspire 2920 /Calado , BIOS V1.13 02/14/2008 > [ 338.755473] Call Trace: > [ 338.755487] ? dump_stack+0x5c/0x77 > [ 338.755494] ? __warn+0xc4/0xe0 > [ 338.755500] ? warn_slowpath_fmt+0x5f/0x80 > [ 338.755508] ? finish_wait+0x3c/0x80 > [ 338.755543] ? drm_wait_one_vblank+0x154/0x1a0 [drm] > [ 338.755550] ? remove_wait_queue+0x60/0x60 > [ 338.755633] ? intel_get_load_detect_pipe+0x5a8/0x610 [i915] > [ 338.755702] ? intel_tv_detect+0x156/0x520 [i915] > [ 338.755730] ? drm_helper_probe_single_connector_modes+0x2bb/0x510 [drm_kms_helper] > [ 338.755769] ? drm_mode_getconnector+0x2f0/0x320 [drm] > [ 338.755804] ? drm_ioctl+0x200/0x430 [drm] > [ 338.755843] ? drm_mode_connector_property_set_ioctl+0x60/0x60 [drm] > [ 338.755957] ? xfs_file_write_iter+0x10b/0x150 [xfs] > [ 338.755966] ? do_vfs_ioctl+0x9b/0x600 > [ 338.755973] ? vfs_write+0x163/0x1a0 > [ 338.755979] ? SyS_ioctl+0x76/0x90 > [ 338.755987] ? entry_SYSCALL_64_fastpath+0x1e/0xad > [ 338.755993] ---[ end trace 591a4980f651c4a3 ]--- > [ 349.148087] [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:29:pipe A] flip_done timed out > [ 349.252053] ------------[ cut here ]------------ After all, it looks like this regression isn't fixed in drm-tip, though the hardening patch from comment #19 (24835e44) helps to avoid hard crashes. 
Please tell me if I can help by providing further info or testing.
Created attachment 129266 [details] [review]
4.10.0-rc6: drm/i915: Revert ea0000f0 "Roll out the helper nonblock tracking"
This is the patch to revert ea0000f0 "Roll out the helper nonblock tracking", updated to apply to 4.10.0-rc6.
Rami - please check if you can reproduce and push this forward with developers.
What's the status on pushing out this revert? Can we please aim to get this into 4.10?
I've been running for 2 weeks on 4.9 with the patch in comment #20, and the hard lockups have totally gone. This is certainly way, way better than the current stock experience, which results in my T460s entirely locking up daily.
Created attachment 129434 [details]
/sys/class/drm/card0/error
Less than a day after I made that comment I got a hard lockup, typical :-(
/sys/class/drm/card0/error is attached. No messages about flip_done timing out now, instead I get this:
[ 9945.797615] [drm] GPU HANG: ecode 9:0:0xfffffffe, in Xorg [1858], reason: Hang on render ring, action: reset
[...]
[ 9945.797628] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 9945.797708] drm/i915: Resetting chip after gpu hang
[ 9945.800332] [drm] RC6 on
[ 9945.813539] [drm] GuC firmware load skipped
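For triage it can help to tell a GPU hang apart from the flip_done timeouts discussed in the rest of this report. The following is only a minimal sketch, assuming the message layout of the GPU HANG line quoted above; the name `parse_gpu_hang` is hypothetical and not part of any i915 tooling.

```python
import re

# Pattern modelled on the single "GPU HANG" line quoted in this comment.
# Other kernel versions may format the message differently (assumption).
GPU_HANG = re.compile(
    r"GPU HANG: ecode (?P<ecode>\S+), in (?P<comm>\S+) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.*?), action: (?P<action>\w+)"
)


def parse_gpu_hang(line):
    """Return the fields of a GPU HANG log line as a dict, or None."""
    match = GPU_HANG.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["pid"] = int(fields["pid"])
    return fields


if __name__ == "__main__":
    line = ("[ 9945.797615] [drm] GPU HANG: ecode 9:0:0xfffffffe, in Xorg [1858], "
            "reason: Hang on render ring, action: reset")
    print(parse_gpu_hang(line))
```

A line that does not contain a GPU HANG message, such as a flip_done timeout, makes the function return None, so it can be used as a simple filter over dmesg output.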
(In reply to Chris Down from comment #44)
> Created attachment 129434 [details]
> /sys/class/drm/card0/error
>
> Less than a day after I made that comment I got a hard lockup, typical :-(
>
> /sys/class/drm/card0/error is attached. No messages about flip_done timing
> out now, instead I get this:
>
> [ 9945.797615] [drm] GPU HANG: ecode 9:0:0xfffffffe, in Xorg [1858], reason:
> Hang on render ring, action: reset
> [...]
> [ 9945.797628] [drm] GPU crash dump saved to /sys/class/drm/card0/error
> [ 9945.797708] drm/i915: Resetting chip after gpu hang
> [ 9945.800332] [drm] RC6 on
> [ 9945.813539] [drm] GuC firmware load skipped
This is unrelated to the bug here. It might explain your hard lockup though, so please file a new bug report with all the details.
Created attachment 129979 [details] [review]
Don't fall over flip_done failures that hard
Another hack on top of the already merged hack. Please make sure you have the referenced patch, so either use drm-tip or apply both patches. Again, it won't fix the stalls, but it should help with the full freeze.
I've been running with drm-tip+"Don't fall over flip_done failures that hard" (4.10.0+) for a couple of days, and didn't notice any change: there are stalls but no freezes, as advertised.
Created attachment 130168 [details] [review]
4.11.0-rc1: drm/i915: Revert ea0000f0 "Roll out the helper nonblock tracking"
This is the patch to revert ea0000f0 "Roll out the helper nonblock tracking", updated to apply to 4.11.0-rc1.
You can follow this on patchwork: https://patchwork.freedesktop.org/patch/124229/ link to series https://patchwork.freedesktop.org/series/16022/
We can't apply the revert because the entire atomic house will come crashing down on us. But Maarten fixed another potential oops with nonblocking commits with his atomic iterator patches. Those have all now landed in drm-tip. We need to retest; if that's ok we can figure out how to backport the entire pile (or apply the revert just to some old stable kernels).
Daniel, what exactly shall we test? Current drm-tip with or without attachment 129979 [details] [review] (Don't fall over flip_done failures that hard)? I haven't seen oopses for long, but the stalls with long timeouts are there on every reboot or mode switch. Just updated to Fedora kernel 4.10.5-200.fc25.x86_64. Until kernel 4.9.XX it said: [drm_kms_helper]] *ERROR* [CRTC:26:pipe A] flip_done timed out now it says: [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:31:pipe A] flip_done timed out I still get the timeouts, journalctl says: Mar 28 10:25:05 elif kernel: ------------[ cut here ]------------ Mar 28 10:25:05 elif kernel: WARNING: CPU: 2 PID: 839 at drivers/gpu/drm/i915/intel_display.c:14189 intel_atomic_commit_tail+0xf97/0xfc0 Mar 28 10:25:05 elif kernel: pipe A vblank wait timed out Mar 28 10:25:05 elif kernel: Modules linked in: ccm snd_hda_codec_hdmi snd_hda_codec_idt snd_hda_codec_generic i915 intel_rapl x86_pkg_t Mar 28 10:25:05 elif kernel: CPU: 2 PID: 839 Comm: Xorg Not tainted 4.10.5-200.fc25.x86_64 #1 Mar 28 10:25:05 elif kernel: Hardware name: Dell Inc. Latitude E6320/09PHH9, BIOS A19 11/14/2013 Mar 28 10:25:05 elif kernel: Call Trace: Mar 28 10:25:05 elif kernel: dump_stack+0x63/0x86 Mar 28 10:25:05 elif kernel: __warn+0xcb/0xf0 Mar 28 10:25:05 elif kernel: warn_slowpath_fmt+0x5f/0x80 Mar 28 10:25:05 elif kernel: ? finish_wait+0x67/0x80 Mar 28 10:25:05 elif kernel: intel_atomic_commit_tail+0xf97/0xfc0 [i915] Mar 28 10:25:05 elif kernel: ? __switch_to+0x227/0x460 Mar 28 10:25:05 elif kernel: ? 
remove_wait_queue+0x70/0x70 Mar 28 10:25:05 elif kernel: intel_atomic_commit+0x3cb/0x4f0 [i915] Mar 28 10:25:05 elif kernel: drm_atomic_commit+0x4b/0x50 [drm] Mar 28 10:25:05 elif kernel: restore_fbdev_mode+0x14c/0x2a0 [drm_kms_helper] Mar 28 10:25:05 elif kernel: drm_fb_helper_restore_fbdev_mode_unlocked+0x34/0x80 [drm_kms_helper] Mar 28 10:25:05 elif kernel: drm_fb_helper_set_par+0x2d/0x60 [drm_kms_helper] Mar 28 10:25:05 elif kernel: intel_fbdev_set_par+0x18/0x70 [i915] Mar 28 10:25:05 elif kernel: fb_set_var+0x236/0x460 Mar 28 10:25:05 elif kernel: ? kmem_cache_alloc+0x195/0x1b0 Mar 28 10:25:05 elif kernel: ? avc_alloc_node+0x27/0x120 Mar 28 10:25:05 elif kernel: ? flex_array_get_ptr+0x9/0x20 Mar 28 10:25:05 elif kernel: ? type_attribute_bounds_av+0x46/0x1e0 Mar 28 10:25:05 elif kernel: fbcon_blank+0x30f/0x350 Mar 28 10:25:05 elif kernel: do_unblank_screen+0xd2/0x1a0 Mar 28 10:25:05 elif kernel: vt_ioctl+0x507/0x12a0 Mar 28 10:25:05 elif kernel: tty_ioctl+0x355/0xc40 Mar 28 10:25:05 elif kernel: ? selinux_inode_free_security+0x6d/0x80 Mar 28 10:25:05 elif kernel: ? fsnotify_destroy_marks+0x61/0x80 Mar 28 10:25:05 elif kernel: ? call_rcu_sched+0x1d/0x20 Mar 28 10:25:05 elif kernel: do_vfs_ioctl+0xa3/0x5f0 Mar 28 10:25:05 elif kernel: SyS_ioctl+0x79/0x90 Mar 28 10:25:05 elif kernel: ? 
call_rcu_sched+0x1d/0x20 Mar 28 10:25:05 elif kernel: do_vfs_ioctl+0xa3/0x5f0 Mar 28 10:25:05 elif kernel: SyS_ioctl+0x79/0x90 Mar 28 10:25:05 elif kernel: do_syscall_64+0x67/0x180 Mar 28 10:25:05 elif kernel: entry_SYSCALL64_slow_path+0x25/0x25 Mar 28 10:25:05 elif kernel: RIP: 0033:0x7f5ac7f87787 Mar 28 10:25:05 elif kernel: RSP: 002b:00007ffc5e2587b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 Mar 28 10:25:05 elif kernel: RAX: ffffffffffffffda RBX: 000000000082c800 RCX: 00007f5ac7f87787 Mar 28 10:25:05 elif kernel: RDX: 0000000000000000 RSI: 0000000000004b3a RDI: 000000000000000a Mar 28 10:25:05 elif kernel: RBP: 000000000084b698 R08: 0000000001150a50 R09: 0000000001158e00 Mar 28 10:25:05 elif kernel: R10: 00007ffc5e258750 R11: 0000000000000246 R12: 000000000084b6a0 Mar 28 10:25:05 elif kernel: R13: 000000000084b6d8 R14: 000000000084f818 R15: 0000000000830bd8 Mar 28 10:25:05 elif kernel: ---[ end trace 4ca50fbca84bf134 ]--- Mar 28 10:25:05 elif kernel: [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:31:pipe A] flip_done timed out Mar 28 10:25:25 elif kernel: [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:31:pipe A] flip_done timed out Mar 28 10:25:25 elif kernel: [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:31:pipe A] flip_done timed out Mar 28 10:25:38 elif kernel: [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:31:pipe A] flip_done timed out (In reply to Ferenc Wágner from comment #52) > Daniel, what exactly shall we test? Current drm-tip with or without > attachment 129979 [details] [review] [review] (Don't fall over flip_done failures > that hard)? I haven't seen oopses for long, but the stalls with long > timeouts are there on every reboot or mode switch. drm-tip has all current patches. OK, compiled and booted the 2017y-03m-28d-08h-54m-35s UTC integration manifest. There are several hangs during bootup:
> [ 11.996006] [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:30:pipe A] flip_done timed out
> [ 12.100021] ------------[ cut here ]------------
> [ 12.100021] WARNING: CPU: 0 PID: 5 at drivers/gpu/drm/drm_irq.c:1242 drm_wait_one_vblank+0x154/0x1a0 [drm]
> [ 12.100021] vblank wait timed out on crtc 0
> [ 12.100021] Modules linked in: hid_generic(E) usbhid(E) hid(E) arc4(E) psmouse(E) ata_piix(E) ahci(E) libahci(E) libata(E) scsi_mod(E) ath5k(E) mac80211(E) ath(E) cfg80211(E) rfkill(E) i915(E) tg3(E) ptp(E) pps_core(E) prime_numbers(E) libphy(E) i2c_algo_bit(E) drm_kms_helper(E) thermal(E) uhci_hcd(E) ehci_pci(E) ehci_hcd(E) video(E) drm(E) button(E) usbcore(E)
> [ 12.100021] CPU: 0 PID: 5 Comm: kworker/u4:0 Tainted: G E 4.11.0-rc4+ #3
> [ 12.100021] Hardware name: Acer Aspire 2920 /Calado , BIOS V1.13 02/14/2008
> [ 12.100021] Workqueue: events_unbound async_run_entry_fn
> [ 12.100021] Call Trace:
> [ 12.100021] ? dump_stack+0x5c/0x77
> [ 12.100021] ? __warn+0xc4/0xe0
> [ 12.100021] ? warn_slowpath_fmt+0x5f/0x80
> [ 12.100021] ? finish_wait+0x3c/0x80
> [ 12.100021] ? drm_wait_one_vblank+0x154/0x1a0 [drm]
> [ 12.100021] ? remove_wait_queue+0x60/0x60
> [ 12.100021] ? intel_get_load_detect_pipe+0x5ea/0x640 [i915]
> [ 12.100021] ? intel_tv_detect+0x156/0x520 [i915]
> [ 12.100021] ? drm_helper_probe_single_connector_modes+0x2fc/0x550 [drm_kms_helper]
> [ 12.100021] ? drm_setup_crtcs+0x7d/0xa10 [drm_kms_helper]
> [ 12.100021] ? check_preempt_wakeup+0xeb/0x200
> [ 12.100021] ? drm_fb_helper_initial_config+0x79/0x420 [drm_kms_helper]
> [ 12.100021] ? try_to_wake_up+0x54/0x460
> [ 12.100021] ? intel_fbdev_initial_config+0x14/0x30 [i915]
> [ 12.100021] ? async_run_entry_fn+0x34/0x160
> [ 12.100021] ? process_one_work+0x15e/0x420
> [ 12.100021] ? worker_thread+0x65/0x4b0
> [ 12.100021] ? rescuer_thread+0x390/0x390
> [ 12.100021] ? kthread+0x104/0x140
> [ 12.100021] ? kthread_park+0x80/0x80
> [ 12.100021] ? ret_from_fork+0x26/0x40
> [ 12.100021] ---[ end trace 4fb69e1dcd9a9df7 ]---
> [ 12.235090] ------------[ cut here ]------------
> [ 12.235090] WARNING: CPU: 0 PID: 5 at drivers/gpu/drm/drm_irq.c:1242 drm_wait_one_vblank+0x154/0x1a0 [drm]
> [ 12.235090] vblank wait timed out on crtc 0
> [ 12.235090] Modules linked in: hid_generic(E) usbhid(E) hid(E) arc4(E) psmouse(E) ata_piix(E) ahci(E) libahci(E) libata(E) scsi_mod(E) ath5k(E) mac80211(E) ath(E) cfg80211(E) rfkill(E) i915(E) tg3(E) ptp(E) pps_core(E) prime_numbers(E) libphy(E) i2c_algo_bit(E) drm_kms_helper(E) thermal(E) uhci_hcd(E) ehci_pci(E) ehci_hcd(E) video(E) drm(E) button(E) usbcore(E)
> [ 12.235090] CPU: 0 PID: 5 Comm: kworker/u4:0 Tainted: G W E 4.11.0-rc4+ #3
> [ 12.235090] Hardware name: Acer Aspire 2920 /Calado , BIOS V1.13 02/14/2008
> [ 12.235090] Workqueue: events_unbound async_run_entry_fn
> [ 12.235090] Call Trace:
> [ 12.235090] ? dump_stack+0x5c/0x77
> [ 12.235090] ? __warn+0xc4/0xe0
> [ 12.235090] ? warn_slowpath_fmt+0x5f/0x80
> [ 12.235090] ? finish_wait+0x3c/0x80
> [ 12.235090] ? drm_wait_one_vblank+0x154/0x1a0 [drm]
> [ 12.235090] ? remove_wait_queue+0x60/0x60
> [ 12.235090] ? intel_pre_plane_update+0x10c/0x190 [i915]
> [ 12.235090] ? intel_atomic_commit_tail+0x9f/0xed0 [i915]
> [ 12.235090] ? __queue_work+0x13c/0x440
> [ 12.235090] ? intel_atomic_commit+0x452/0x4f0 [i915]
> [ 12.235090] ? intel_release_load_detect_pipe+0x58/0xa0 [i915]
> [ 12.235090] ? intel_tv_detect+0x374/0x520 [i915]
> [ 12.235090] ? drm_helper_probe_single_connector_modes+0x2fc/0x550 [drm_kms_helper]
> [ 12.235090] ? drm_setup_crtcs+0x7d/0xa10 [drm_kms_helper]
> [ 12.235090] ? check_preempt_wakeup+0xeb/0x200
> [ 12.235090] ? drm_fb_helper_initial_config+0x79/0x420 [drm_kms_helper]
> [ 12.235090] ? try_to_wake_up+0x54/0x460
> [ 12.235090] ? intel_fbdev_initial_config+0x14/0x30 [i915]
> [ 12.235090] ? async_run_entry_fn+0x34/0x160
> [ 12.235090] ? process_one_work+0x15e/0x420
> [ 12.235090] ? worker_thread+0x65/0x4b0
> [ 12.235090] ? rescuer_thread+0x390/0x390
> [ 12.235090] ? kthread+0x104/0x140
> [ 12.235090] ? kthread_park+0x80/0x80
> [ 12.235090] ? ret_from_fork+0x26/0x40
> [ 12.235090] ---[ end trace 4fb69e1dcd9a9df8 ]---
> [ 12.295126] fbcon: inteldrmfb (fb0) is primary device
> [ 13.189124] Console: switching to colour frame buffer device 160x50
> [ 13.208494] i915 0000:00:02.0: fb0: inteldrmfb frame buffer device
and several more which I don't include here, but each has drm_helper_probe_single_connector_modes and intel_tv_detect on its call trace.
Please tell me if there's anything more I can do here.
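Since the backtraces above differ mainly in which functions appear on the call trace, a small script can sort them. This is a minimal sketch, assuming the dmesg layout shown in this comment ("cut here" and "end trace" markers around each WARNING); the helper names `split_warnings` and `is_tv_detect_warning` are hypothetical.

```python
import re

CUT_HERE = "------------[ cut here ]------------"
END_TRACE = re.compile(r"---\[ end trace [0-9a-f]+ \]---")
# Matches "symbol+0xOFFSET/0xSIZE" as printed in kernel backtraces.
SYMBOL = re.compile(r"([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def split_warnings(log_text):
    """Return one list of symbol names per complete cut-here/end-trace block."""
    blocks = []
    current = None
    for line in log_text.splitlines():
        if CUT_HERE in line:
            current = []          # start a new block (drops an unfinished one)
        elif current is not None:
            if END_TRACE.search(line):
                blocks.append(current)
                current = None
            else:
                current.extend(SYMBOL.findall(line))
    return blocks


def is_tv_detect_warning(symbols):
    """True if the block went through the TV-out load detection path."""
    return "intel_tv_detect" in symbols


# Shortened sample built from lines quoted in this report.
SAMPLE = """\
> [ 12.100021] ------------[ cut here ]------------
> [ 12.100021] WARNING: CPU: 0 PID: 5 at drivers/gpu/drm/drm_irq.c:1242 drm_wait_one_vblank+0x154/0x1a0 [drm]
> [ 12.100021] ? intel_tv_detect+0x156/0x520 [i915]
> [ 12.100021] ---[ end trace 4fb69e1dcd9a9df7 ]---
Mar 28 10:25:05 elif kernel: ------------[ cut here ]------------
Mar 28 10:25:05 elif kernel: intel_atomic_commit_tail+0xf97/0xfc0 [i915]
Mar 28 10:25:05 elif kernel: restore_fbdev_mode+0x14c/0x2a0 [drm_kms_helper]
Mar 28 10:25:05 elif kernel: ---[ end trace 4ca50fbca84bf134 ]---
"""

if __name__ == "__main__":
    for symbols in split_warnings(SAMPLE):
        kind = "tv-detect path" if is_tv_detect_warning(symbols) else "other path"
        print(kind, symbols)
```

Blocks that go through intel_tv_detect can then be set aside from those that reach intel_atomic_commit_tail via fbdev restore or drm_mode_setcrtc.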
The tv out issue is separate and has its own bug: https://bugs.freedesktop.org/show_bug.cgi?id=93782
Please use the workaround listed there so we can concentrate on this bug here. :)
Are the issues still seen here actually _for this bug_?
Adding tag into "Whiteboard" field - ReadyForDev
The bug is still active
*Status is correct
*Platform is included
*Feature is included
*Priority and Severity correctly set
*Logs included
Created attachment 131280 [details] [review]
4.11.0: drm/i915: Revert ea0000f0 "Roll out the helper nonblock tracking"
This is the patch to revert ea0000f0 "Roll out the helper nonblock tracking", updated to apply to 4.11.0.
Ok, the original report was about the Oops, i.e. the backtrace containing:
[ 179.786889] Oops: 0000 [#1] SMP
I think those are gone now since the refcount patch was merged. If that's not the case, then please pipe up.
Can this still be reproduced with the current drm-tip? If so, what are the steps needed to reproduce this?
This should have been fixed in v4.11
commit 24835e442f289813aa568d142a755672a740503c
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Wed Dec 21 11:23:30 2016 +0100
drm: reference count event->completion
Since that commit explicitly mentions this bug, I think it's best to close this bug now.
(In reply to Maarten Lankhorst from comment #62)
> This should have been fixed in v4.11
>
> commit 24835e442f289813aa568d142a755672a740503c
> Author: Daniel Vetter <daniel.vetter@ffwll.ch>
> Date: Wed Dec 21 11:23:30 2016 +0100
>
> drm: reference count event->completion
>
> Since that commit explicitly mentions this bug, I think it's best to close
> this bug now.
I'm still getting "flip_done timed out" in 4.11. I haven't tried drm-tip though. Should I file a new issue or reopen this one?
(In reply to Christopher Chavez from comment #63)
> I'm still getting "flip_done timed out" in 4.11. I haven't tried drm-tip
> though. Should I file a new issue or reopen this one?
Forgot to mention steps for reproducing. I can think of several ways: commands like `xrandr --query`, programs such as mate-display-properties, and switching to/from a graphical VT are each enough to trigger it.
Created attachment 131591 [details]
dmesg
I'm having a similar issue here.
OS: Arch Linux (x86_64)
00:02.0 VGA compatible controller: Intel Corporation 4 Series Chipset Integrated Graphics Controller (rev 03)
mesa 17.1.0-1
Linux myhost 4.11.3-1-ARCH #1 SMP PREEMPT Sun May 28 10:40:17 CEST 2017 x86_64 GNU/Linux
I was playing a game (NFSIISE) when I got this. I remember switching the game to windowed mode and then tiling it to the right (I use i3wm); at that point my machine just crashed and I had to do a hard reboot.
Please see the dmesg I'm attaching with the information about the crash.
If you think my issue is different, please let me know and I'll open a different bug report.
I also see this in my journal: May 30 21:00:53 myhost kernel: ------------[ cut here ]------------ May 30 21:00:53 myhost kernel: WARNING: CPU: 1 PID: 362 at drivers/gpu/drm/i915/intel_display.c:14229 intel_atomic_commit_tail+0xfd5/0xfe0 [i915] May 30 21:00:53 myhost kernel: pipe A vblank wait timed out May 30 21:00:53 myhost kernel: Modules linked in: uas usb_storage fuse cfg80211 rfkill gpio_ich iTCO_wdt iTCO_vendor_support i915 mousedev coretemp input_leds joydev kvm_intel kvm snd_hda_codec_idt snd_hda_codec_generic video drm_kms_helper drm snd_hda_intel led_class lpc_ich syscopyarea irqbypass snd_hda_codec sysfillrect evdev mac_hid sysimgblt jme snd_hda_core fb_sys_fops mii i2c_algo_bit psmouse pcspkr i2c_i801 rng_core intel_agp thermal intel_gtt shpchp button snd_hwdep snd_pcm snd_timer snd soundcore acpi_cpufreq tpm_tis tpm_tis_core tpm sch_fq_codel ip_tables x_tables ext4 crc16 jbd2 fscrypto mbcache hid_generic usbhid hid sr_mod cdrom sd_mod ata_generic pata_acpi serio_raw atkbd libps2 uhci_hcd i8042 serio ata_piix libata scsi_mod ehci_pci ehci_hcd usbcore usb_common May 30 21:00:53 myhost kernel: CPU: 1 PID: 362 Comm: Xorg Tainted: G W 4.11.3-1-ARCH #1 May 30 21:00:53 myhost kernel: Hardware name: Positivo Informatica SA POS-ECIG41BS/POS-ECIG41BS, BIOS 080015 05/14/2010 May 30 21:00:53 myhost kernel: Call Trace: May 30 21:00:53 myhost kernel: dump_stack+0x63/0x81 May 30 21:00:53 myhost kernel: __warn+0xcb/0xf0 May 30 21:00:53 myhost kernel: warn_slowpath_fmt+0x5a/0x80 May 30 21:00:53 myhost kernel: intel_atomic_commit_tail+0xfd5/0xfe0 [i915] May 30 21:00:53 myhost kernel: ? wake_bit_function+0x60/0x60 May 30 21:00:53 myhost kernel: intel_atomic_commit+0x360/0x480 [i915] May 30 21:00:53 myhost kernel: ? 
drm_atomic_check_only+0x39e/0x580 [drm] May 30 21:00:53 myhost kernel: drm_atomic_commit+0x4b/0x50 [drm] May 30 21:00:53 myhost kernel: restore_fbdev_mode+0x222/0x280 [drm_kms_helper] May 30 21:00:53 myhost kernel: drm_fb_helper_restore_fbdev_mode_unlocked+0x2e/0x80 [drm_kms_helper] May 30 21:00:53 myhost kernel: drm_fb_helper_set_par+0x2d/0x60 [drm_kms_helper] May 30 21:00:53 myhost kernel: intel_fbdev_set_par+0x18/0x70 [i915] May 30 21:00:53 myhost kernel: fb_set_var+0x193/0x430 May 30 21:00:53 myhost kernel: ? update_curr+0xf2/0x1e0 May 30 21:00:53 myhost kernel: ? __enqueue_entity+0x6c/0x70 May 30 21:00:53 myhost kernel: ? put_prev_entity+0x80/0xc10 May 30 21:00:53 myhost kernel: ? set_next_entity+0x57/0xdb0 May 30 21:00:53 myhost kernel: fbcon_blank+0x206/0x390 May 30 21:00:53 myhost kernel: do_unblank_screen+0xa4/0x190 May 30 21:00:53 myhost kernel: complete_change_console+0x59/0xe0 May 30 21:00:53 myhost kernel: vt_ioctl+0x10e7/0x11e0 May 30 21:00:53 myhost kernel: ? __generic_file_write_iter+0x108/0x1c0 May 30 21:00:53 myhost kernel: ? __wake_up+0x44/0x50 May 30 21:00:53 myhost kernel: tty_ioctl+0x229/0xc40 May 30 21:00:53 myhost kernel: ? n_tty_open+0xd0/0xd0 May 30 21:00:53 myhost kernel: ? __fget+0x77/0xb0 May 30 21:00:53 myhost kernel: ? sock_poll+0x68/0x90 May 30 21:00:53 myhost kernel: do_vfs_ioctl+0xa5/0x600 May 30 21:00:53 myhost kernel: ? 
__fget+0x77/0xb0
May 30 21:00:53 myhost kernel: SyS_ioctl+0x79/0x90
May 30 21:00:53 myhost kernel: entry_SYSCALL_64_fastpath+0x1a/0xa9
May 30 21:00:53 myhost kernel: RIP: 0033:0x7f79fbea7cb7
May 30 21:00:53 myhost kernel: RSP: 002b:00007fffcc5fe908 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
May 30 21:00:53 myhost kernel: RAX: ffffffffffffffda RBX: 00007f79fc162ae0 RCX: 00007f79fbea7cb7
May 30 21:00:53 myhost kernel: RDX: 0000000000000001 RSI: 0000000000005605 RDI: 000000000000000a
May 30 21:00:53 myhost kernel: RBP: 000000000004b640 R08: 0000000000000000 R09: 0000000000000001
May 30 21:00:53 myhost kernel: R10: 00007fffcc5fe8b0 R11: 0000000000000246 R12: 000000000005b680
May 30 21:00:53 myhost kernel: R13: 00000000034b3980 R14: 000000000084c3c8 R15: 0000000002e2da70
May 30 21:00:53 myhost kernel: ---[ end trace c02755ea47d64b4d ]---
May 30 21:00:53 myhost kernel: [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:29:pipe A] flip_done timed out
May 30 21:00:53 myhost kernel: [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:29:pipe A] flip_done timed out
May 30 21:00:53 myhost kernel: [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:29:pipe A] flip_done timed out
May 30 21:01:03 myhost kernel: [drm:drm_atomic_helper_commit_cleanup_done [drm_kms_helper]] *ERROR* [CRTC:29:pipe A] flip_done timed out
May 30 21:01:03 myhost kernel: ------------[ cut here ]------------
May 30 21:01:03 myhost kernel: WARNING: CPU: 0 PID: 362 at drivers/gpu/drm/i915/intel_display.c:14229 intel_atomic_commit_tail+0xfd5/0xfe0 [i915]
May 30 21:01:03 myhost kernel: pipe A vblank wait timed out
May 30 21:01:03 myhost kernel: Modules linked in: uas usb_storage fuse cfg80211 rfkill gpio_ich iTCO_wdt iTCO_vendor_support i915 mousedev coretemp input_leds joydev kvm_intel kvm snd_hda_codec_idt snd_hda_codec_generic video drm_kms_helper drm snd_hda_intel led_class lpc_ich syscopyarea irqbypass snd_hda_codec sysfillrect evdev mac_hid sysimgblt jme snd_hda_core fb_sys_fops mii i2c_algo_bit psmouse pcspkr i2c_i801 rng_core intel_agp thermal intel_gtt shpchp button snd_hwdep snd_pcm snd_timer snd soundcore acpi_cpufreq tpm_tis tpm_tis_core tpm sch_fq_codel ip_tables x_tables ext4 crc16 jbd2 fscrypto mbcache hid_generic usbhid hid sr_mod cdrom sd_mod ata_generic pata_acpi serio_raw atkbd libps2 uhci_hcd i8042 serio ata_piix libata scsi_mod ehci_pci ehci_hcd usbcore usb_common
May 30 21:01:03 myhost kernel: CPU: 0 PID: 362 Comm: Xorg Tainted: G W 4.11.3-1-ARCH #1
May 30 21:01:03 myhost kernel: Hardware name: Positivo Informatica SA POS-ECIG41BS/POS-ECIG41BS, BIOS 080015 05/14/2010
May 30 21:01:03 myhost kernel: Call Trace:
May 30 21:01:03 myhost kernel: dump_stack+0x63/0x81
May 30 21:01:03 myhost kernel: __warn+0xcb/0xf0
May 30 21:01:03 myhost kernel: warn_slowpath_fmt+0x5a/0x80
May 30 21:01:03 myhost kernel: intel_atomic_commit_tail+0xfd5/0xfe0 [i915]
May 30 21:01:03 myhost kernel: ? wake_bit_function+0x60/0x60
May 30 21:01:03 myhost kernel: intel_atomic_commit+0x360/0x480 [i915]
May 30 21:01:03 myhost kernel: ? drm_atomic_check_only+0x39e/0x580 [drm]
May 30 21:01:03 myhost kernel: drm_atomic_commit+0x4b/0x50 [drm]
May 30 21:01:03 myhost kernel: drm_atomic_helper_set_config+0x83/0xe0 [drm_kms_helper]
May 30 21:01:03 myhost kernel: drm_mode_set_config_internal+0x65/0x110 [drm]
May 30 21:01:03 myhost kernel: drm_mode_setcrtc+0x10c/0x560 [drm]
May 30 21:01:03 myhost kernel: drm_ioctl+0x212/0x4d0 [drm]
May 30 21:01:03 myhost kernel: ? drm_mode_getcrtc+0x170/0x170 [drm]
May 30 21:01:03 myhost kernel: ? __vfs_write+0xe4/0x140
May 30 21:01:03 myhost kernel: do_vfs_ioctl+0xa5/0x600
May 30 21:01:03 myhost kernel: ? __fget+0x77/0xb0
May 30 21:01:03 myhost kernel: SyS_ioctl+0x79/0x90
May 30 21:01:03 myhost kernel: entry_SYSCALL_64_fastpath+0x1a/0xa9
May 30 21:01:03 myhost kernel: RIP: 0033:0x7f79fbea7cb7
May 30 21:01:03 myhost kernel: RSP: 002b:00007fffcc5fd828 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
May 30 21:01:03 myhost kernel: RAX: ffffffffffffffda RBX: 000000000000001a RCX: 00007f79fbea7cb7
May 30 21:01:03 myhost kernel: RDX: 00007fffcc5fd860 RSI: 00000000c06864a2 RDI: 000000000000000b
May 30 21:01:03 myhost kernel: RBP: 000000000083dbe0 R08: 0000000000000000 R09: 0000000002b60870
May 30 21:01:03 myhost kernel: R10: 00007fffcc5fda00 R11: 0000000000000246 R12: 00007fffcc5fdbe0
May 30 21:01:03 myhost kernel: R13: 0000000000000000 R14: 00007fffcc5fdc80 R15: 00007fffcc5fe994
May 30 21:01:03 myhost kernel: ---[ end trace c02755ea47d64b4e ]---

If you're still seeing the oops i.e.

[ 179.786793] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 179.786840] IP: [<ffffffff810983eb>] __wake_up_common+0x2b/0x90

then please reopen this one. Do not use this one to report about vblank wait time outs. File a new bug for that if you're seeing it on or after v4.11.

My problem was solved here: Bug 101261

Sorry for spamming unrelated information.

(In reply to Diego Viola from comment #69)
> My problem was solved here: Bug 101261
>
> Sorry for spamming unrelated information.

Thank you for the information. Closing the bug. |