The X server always hangs when I toggle fullscreen mode in the mpv media player. When it hangs I can still move the mouse and hear sound, but input doesn't work and the screen is frozen. Switching to a framebuffer terminal is not possible.

The hang only occurs when using SNA acceleration with TearFree. UXA works fine, and SNA without TearFree also works.

I have a Haswell (i7-4790K) CPU and connect monitors to all 3 connectors (HDMI, DP, DVI). All 3 monitors have the same resolution, 1920x1200. If I have several monitors active at the same time in a dual head setup the hang doesn't occur. It only hangs if a single head is active when I toggle fullscreen. The resolution and type of the video don't matter.

I'm using an xfce desktop without compositing enabled. mpv uses the opengl output (profile=opengl-hq) without any hardware accelerated video decoding.

I'm using xorg-server-1.19.2, kernel-4.4.52, xf86-video-intel-git and media-libs/mesa-13.0.5 on a Gentoo system.

My xorg.conf only contains:

Section "ServerFlags"
    Option "LogVerbose" "10"
EndSection

Section "Device"
    Identifier "IGP"
    Driver "intel"
    Option "TearFree" "true"
    Option "DRI" "3"
EndSection

There are no errors or other output in dmesg or Xorg.0.log as a result of the hang.

I ran "thread apply all bt full" in gdb on the X process after the hang. I only had debug symbols for xf86-video-intel; I can provide a better backtrace if needed.
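For anyone who wants to collect the same kind of backtrace, here is a minimal sketch of scripting the gdb step. It assumes gdb is installed and that the script runs with permission to attach to the X process (usually root), for example from a remote login since the local display is frozen. The function names and the 60 second timeout are made up for illustration.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that dumps all thread backtraces for pid."""
    return [
        "gdb",
        "-p", str(pid),                      # attach to the running process
        "-batch",                            # exit once the commands have run
        "-ex", "thread apply all bt full",   # same command as in the report
    ]


def capture_backtrace(pid, timeout_s=60):
    """Run gdb against pid and return its standard output as text."""
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    return result.stdout
```

Calling `capture_backtrace(pid)` with the PID of the X server would return the text that can then be attached to a report.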
Created attachment 130167
gdb backtrace, xorg.log and dmesg

The attachment I added when creating the bug shows up as (deleted), so let's add it again.
According to that log we are stuck waiting for an event following a flip. The question is where that went - was a failed flip misreported to userspace, or did it just vanish?
commit be913a3336bcc1c933ad448224f09da138f16c0a
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sun Mar 12 09:28:56 2017 +0000

    sna: Don't stall indefinitely for a missing flip event

Will lessen the impact, but still likely to be a frozen screen (just no longer a frozen X).
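Going only by the commit title, the idea is to bound the wait for the flip completion event instead of blocking forever. The sketch below illustrates that general pattern and is not the driver's code: a pipe stands in for the DRM file descriptor, and the helper name and the 0.1 second timeout are assumptions chosen for the demonstration.

```python
import os
import select


def wait_for_event(fd, timeout_s):
    """Wait until fd is readable or timeout_s seconds have passed.

    Returns True if an event arrived, False if the wait timed out.
    """
    readable, _, _ = select.select([fd], [], [], timeout_s)
    return bool(readable)


# A pipe stands in for the event source (an assumption for this demo).
read_fd, write_fd = os.pipe()

missing = wait_for_event(read_fd, 0.1)   # nothing was written: times out

os.write(write_fd, b"flip")
arrived = wait_for_event(read_fd, 0.1)   # data is pending: returns at once

os.close(read_fd)
os.close(write_fd)
```

With a bounded wait the caller gets control back when the event never shows up and can fall back to another path, which matches the "no longer a frozen X" outcome described above.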
I've been using the git version with commit be913a3336bcc1c933ad448224f09da138f16c0a for a few days now and I no longer experience any problems. There is no hang or other problem when switching to fullscreen. As far as I'm concerned the problem is fixed unless you want to spend more time figuring out what happened to that missing flip?
(In reply to Thomas Lindroth from comment #4)
> I've been using the git version with commit
> be913a3336bcc1c933ad448224f09da138f16c0a for a few days now and I no longer
> experience any problems. There is no hang or other problem when switching to
> fullscreen. As far as I'm concerned the problem is fixed unless you want to
> spend more time figuring out what happened to that missing flip?

Thanks Thomas. Since this is already upstream and you confirm it fixes your issue, I am closing the bug. If it occurs again, please reopen the ticket.
There's still the issue that we are running in a degraded mode (no pageflipping) if the original bug occurs.
The claim that I don't experience any problems was premature. I'm getting kernel warnings like the ones below now. I only used the intel driver with SNA for a few days before opening this bug, and I never saw warnings like these while using UXA.

I also noticed that I sometimes get graphical corruption in Firefox when I use Firefox's hardware acceleration ("GPU Accelerated Windows = 1/1 OpenGL (OMTC)" in about:support). It's easy to reproduce by opening any page, selecting some text and then deselecting it. Firefox then needs to redraw the area with the text, but this sometimes doesn't happen and the text area is left blank. It happens rarely unless another OpenGL window such as mpv or glxgears is displayed at the same time. I'm guessing missing page flips are more visible in Firefox since it only paints once instead of continuously painting new frames.

I'm using the same setup as before but with kernel-4.4.55 now.

[warning] ------------[ cut here ]------------ [warning] WARNING: CPU: 0 PID: 0 at /usr/src/linux-4.4.55/drivers/gpu/drm/i915/intel_display.c:11412 intel_check_page_flip+0x105/0x120() [warning] Kicking stuck page flip: queued at 729009, now 729013 Modules linked in: cfg80211 iptable_nat nf_nat_ipv4 nf_nat xt_limit xt_conntrack iptable_filter iptable_mangle ip_tables iTCO_wdt kvm_intel kvm snd_hda_codec_hdmi crc32_pclmul snd_hda_intel snd_hda_codec lpc_ich uas mfd_core usb_storage snd_hwdep joydev snd_hda_core hid_microsoft [warning] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.4.55 #67 [warning] Hardware name: Gigabyte Technology Co., Ltd. 
Z97X-Gaming G1/Z97X-Gaming G1, BIOS F9 07/31/2015 [warning] 0000000000000086 77b737bba2c6b8e2 ffff88042fa03d60 ffffffffb02f629b [warning] ffff88042fa03da8 ffffffffb0a70538 ffff88042fa03d98 ffffffffb0073f36 [warning] ffff88041c1ca800 ffff88041c7cd000 ffff88041c1ca9a8 0000000000000000 [warning] Call Trace: [warning] <IRQ> [<ffffffffb02f629b>] dump_stack+0x4d/0x72 [warning] [<ffffffffb0073f36>] warn_slowpath_common+0x86/0xc0 [warning] [<ffffffffb0073fcc>] warn_slowpath_fmt+0x5c/0x80 [warning] [<ffffffffb0484e2a>] ? __intel_pageflip_stall_check+0xfa/0x110 [warning] [<ffffffffb049e045>] intel_check_page_flip+0x105/0x120 [warning] [<ffffffffb0422faa>] ironlake_irq_handler+0x2da/0xbf0 [warning] [<ffffffffb008a480>] ? execute_in_process_context+0x70/0x70 [warning] [<ffffffffb008a498>] ? delayed_work_timer_fn+0x18/0x20 [warning] [<ffffffffb00bc0fc>] handle_irq_event_percpu+0x4c/0x1f0 [warning] [<ffffffffb00bc2d9>] handle_irq_event+0x39/0x60 [warning] [<ffffffffb00bf59f>] handle_edge_irq+0x6f/0x150 [warning] [<ffffffffb00063fd>] handle_irq+0x1d/0x30 [warning] [<ffffffffb0763dfb>] do_IRQ+0x4b/0xd0 [warning] [<ffffffffb0762404>] common_interrupt+0x84/0x84 [warning] <EOI> [<ffffffffb063faf2>] ? 
cpuidle_enter_state+0x132/0x2d0 [warning] [<ffffffffb063fcc7>] cpuidle_enter+0x17/0x20 [warning] [<ffffffffb00af16f>] cpu_startup_entry+0x30f/0x370 [warning] [<ffffffffb075b4a4>] rest_init+0x84/0x90 [warning] [<ffffffffb0d12ed6>] start_kernel+0x43f/0x460 [warning] [<ffffffffb0d12495>] x86_64_start_reservations+0x2a/0x2c [warning] [<ffffffffb0d12582>] x86_64_start_kernel+0xeb/0xee [warning] ---[ end trace f48c4261daf1609d ]--- [warning] ------------[ cut here ]------------ [warning] WARNING: CPU: 0 PID: 0 at /usr/src/linux-4.4.55/drivers/gpu/drm/i915/intel_display.c:11412 intel_check_page_flip+0x105/0x120() [warning] Kicking stuck page flip: queued at 607433, now 607437 [warning] Modules linked in: cfg80211 iptable_mangle xt_limit xt_conntrack iptable_filter iptable_nat nf_nat_ipv4 nf_nat ip_tables iTCO_wdt kvm_intel kvm snd_hda_codec_hdmi crc32_pclmul snd_hda_intel snd_hda_codec uas lpc_ich mfd_core usb_storage snd_hwdep snd_hda_core joydev hid_microsoft [warning] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.4.55 #67 [warning] Hardware name: Gigabyte Technology Co., Ltd. Z97X-Gaming G1/Z97X-Gaming G1, BIOS F9 07/31/2015 [warning] 0000000000000086 2122cfe03e3ca646 ffff88042fa03d60 ffffffffb32f629b [warning] ffff88042fa03da8 ffffffffb3a70538 ffff88042fa03d98 ffffffffb3073f36 [warning] ffff88041c1e2800 ffff88041c7de000 ffff88041c1e29a8 0000000000000001 [warning] Call Trace: [warning] <IRQ> [<ffffffffb32f629b>] dump_stack+0x4d/0x72 [warning] [<ffffffffb3073f36>] warn_slowpath_common+0x86/0xc0 [warning] [<ffffffffb3073fcc>] warn_slowpath_fmt+0x5c/0x80 [warning] [<ffffffffb3484e2a>] ? __intel_pageflip_stall_check+0xfa/0x110 [warning] [<ffffffffb349e045>] intel_check_page_flip+0x105/0x120 [warning] [<ffffffffb3422faa>] ironlake_irq_handler+0x2da/0xbf0 [warning] [<ffffffffb30a936f>] ? 
rebalance_domains+0xbf/0x2f0 [warning] [<ffffffffb30bc0fc>] handle_irq_event_percpu+0x4c/0x1f0 [warning] [<ffffffffb30bc2d9>] handle_irq_event+0x39/0x60 [warning] [<ffffffffb30bf59f>] handle_edge_irq+0x6f/0x150 [warning] [<ffffffffb30063fd>] handle_irq+0x1d/0x30 [warning] [<ffffffffb3763dfb>] do_IRQ+0x4b/0xd0 [warning] [<ffffffffb3762404>] common_interrupt+0x84/0x84 [warning] <EOI> [<ffffffffb363faf2>] ? cpuidle_enter_state+0x132/0x2d0 [warning] [<ffffffffb363fcc7>] cpuidle_enter+0x17/0x20 [warning] [<ffffffffb30af16f>] cpu_startup_entry+0x30f/0x370 [warning] [<ffffffffb375b4a4>] rest_init+0x84/0x90 [warning] [<ffffffffb3d12ed6>] start_kernel+0x43f/0x460 [warning] [<ffffffffb3d12495>] x86_64_start_reservations+0x2a/0x2c [warning] [<ffffffffb3d12582>] x86_64_start_kernel+0xeb/0xee [warning] ---[ end trace db235e151b59e394 ]---
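Both warnings above report a gap of 4 between the "queued at" and "now" counters. A small sketch for pulling those numbers out of a saved dmesg log follows; the function name is hypothetical, and the script makes no assumption about what unit the counters are in.

```python
import re

# Matches the message text printed by the warnings quoted above.
STUCK_FLIP = re.compile(r"Kicking stuck page flip: queued at (\d+), now (\d+)")


def stuck_flip_delays(dmesg_text):
    """Return (queued, now, now - queued) for every stuck-flip warning."""
    return [
        (int(queued), int(now), int(now) - int(queued))
        for queued, now in STUCK_FLIP.findall(dmesg_text)
    ]


sample = (
    "[warning] Kicking stuck page flip: queued at 729009, now 729013\n"
    "[warning] Kicking stuck page flip: queued at 607433, now 607437\n"
)
delays = stuck_flip_delays(sample)
```

Running it over a longer log would show whether the gap is always the same when the warning fires.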
I got another hang today. This one was a bit different, so I'm not sure if it's the same problem. The desktop froze but I could move the mouse, and the mouse cursor would change shape depending on what I was mousing over. Trying to switch to the framebuffer made my 2nd monitor go black while the primary monitor stayed frozen. After about a minute the framebuffer came up, and after that I could switch back to X and keep working.

Kernels 4.4.52 - 4.4.54 didn't give me any freezes, but I had several freezes with 4.4.55. 4.4.56 - 4.4.59 didn't freeze, but 4.4.60 froze almost immediately after booting with it. There are almost no patches for i915 in those releases, but recompiling the kernel will shuffle the kernel's memory layout. Perhaps there is a kernel bug that depends on a specific kernel memory layout?

I'm using the same setup as before but with kernel 4.4.60. Errors in dmesg after the hang:

[warning] ------------[ cut here ]------------ [warning] WARNING: CPU: 0 PID: 3131 at /usr/src/linux-4.4.60/drivers/gpu/drm/i915/intel_display.c:3965 intel_crtc_wait_for_pending_flips+0x1dc/0x240() [warning] WARN_ON(wait_event_timeout(dev_priv->pending_flip_queue, !intel_crtc_has_pending_flip(crtc), 60*HZ) == 0)Modules linked in: cfg80211 iptable_mangle xt_limit xt_conntrack iptable_filter iptable_nat nf_nat_ipv4 nf_nat ip_tables iTCO_wdt kvm_intel kvm snd_hda_codec_hdmi crc32_pclmul snd_hda_intel snd_hda_codec lpc_ich mfd_core uas usb_storage snd_hwdep joydev snd_hda_core hid_microsoft [warning] CPU: 0 PID: 3131 Comm: X Not tainted 4.4.60 #2 [warning] Hardware name: Gigabyte Technology Co., Ltd. 
Z97X-Gaming G1/Z97X-Gaming G1, BIOS F9 07/31/2015 [warning] 0000000000000286 000000007b7db219 ffff88040425baa0 ffffffffba2f644b [warning] ffff88040425bae8 ffffffffbaa73a20 ffff88040425bad8 ffffffffba073f86 [warning] 0000000000000000 ffff88040c1f8e10 ffff88040c0af000 ffff88040c1b9800 [warning] Call Trace: [warning] [<ffffffffba2f644b>] dump_stack+0x4d/0x72 [warning] [<ffffffffba073f86>] warn_slowpath_common+0x86/0xc0 [warning] [<ffffffffba07401c>] warn_slowpath_fmt+0x5c/0x80 [warning] [<ffffffffba0ae9a3>] ? finish_wait+0x53/0x70 [warning] [<ffffffffba499bac>] intel_crtc_wait_for_pending_flips+0x1dc/0x240 [warning] [<ffffffffba0aeb00>] ? wait_woken+0x80/0x80 [warning] [<ffffffffba49add1>] intel_pre_plane_update+0x111/0x140 [warning] [<ffffffffba49b465>] intel_atomic_commit+0x215/0x690 [warning] [<ffffffffba41b684>] ? drm_atomic_check_only+0x144/0x5d0 [warning] [<ffffffffba41bb47>] drm_atomic_commit+0x37/0x60 [warning] [<ffffffffba3f84ee>] drm_atomic_helper_disable_plane+0xae/0xf0 [warning] [<ffffffffba41a518>] ? drm_modeset_lock+0x68/0xe0 [warning] [<ffffffffba40b311>] __setplane_internal+0x171/0x270 [warning] [<ffffffffba41a620>] ? drm_modeset_lock_all_crtcs+0x90/0xa0 [warning] [<ffffffffba40f1a8>] drm_mode_setplane+0x138/0x1b0 [warning] [<ffffffffba40102b>] drm_ioctl+0x14b/0x510 [warning] [<ffffffffba40f070>] ? drm_plane_check_pixel_format+0x50/0x50 [warning] [<ffffffffba1afc94>] do_vfs_ioctl+0x2c4/0x4a0 [warning] [<ffffffffba2aea89>] ? tomoyo_file_ioctl+0x19/0x20 [warning] [<ffffffffba2a05a3>] ? security_file_ioctl+0x43/0x60 [warning] [<ffffffffba1afee9>] SyS_ioctl+0x79/0x90 [warning] [<ffffffffba001cba>] ? 
syscall_return_slowpath+0xaa/0x140 [warning] [<ffffffffba765f57>] entry_SYSCALL_64_fastpath+0x12/0x66 [warning] ---[ end trace e040b901003e878d ]--- [warning] ------------[ cut here ]------------ [warning] WARNING: CPU: 0 PID: 3131 at /usr/src/linux-4.4.60/drivers/gpu/drm/i915/intel_display.c:3970 intel_crtc_wait_for_pending_flips+0x22d/0x240() [warning] Removing stuck page flip [warning] Modules linked in: cfg80211 iptable_mangle xt_limit xt_conntrack iptable_filter iptable_nat nf_nat_ipv4 nf_nat ip_tables iTCO_wdt kvm_intel kvm snd_hda_codec_hdmi crc32_pclmul snd_hda_intel snd_hda_codec lpc_ich mfd_core uas usb_storage snd_hwdep joydev snd_hda_core hid_microsoft [warning] CPU: 0 PID: 3131 Comm: X Tainted: G W 4.4.60 #2 [warning] Hardware name: Gigabyte Technology Co., Ltd. Z97X-Gaming G1/Z97X-Gaming G1, BIOS F9 07/31/2015 [warning] 0000000000000086 000000007b7db219 ffff88040425baa0 ffffffffba2f644b [warning] ffff88040425bae8 ffffffffbaa73a20 ffff88040425bad8 ffffffffba073f86 [warning] ffff88040c1b99a8 ffff88040c1f8e10 ffff88040c0af000 ffff88040c1b9800 [warning] Call Trace: [warning] [<ffffffffba2f644b>] dump_stack+0x4d/0x72 [warning] [<ffffffffba073f86>] warn_slowpath_common+0x86/0xc0 [warning] [<ffffffffba07401c>] warn_slowpath_fmt+0x5c/0x80 [warning] [<ffffffffba0ae9a3>] ? finish_wait+0x53/0x70 [warning] [<ffffffffba499bfd>] intel_crtc_wait_for_pending_flips+0x22d/0x240 [warning] [<ffffffffba0aeb00>] ? wait_woken+0x80/0x80 [warning] [<ffffffffba49add1>] intel_pre_plane_update+0x111/0x140 [warning] [<ffffffffba49b465>] intel_atomic_commit+0x215/0x690 [warning] [<ffffffffba41b684>] ? drm_atomic_check_only+0x144/0x5d0 [warning] [<ffffffffba41bb47>] drm_atomic_commit+0x37/0x60 [warning] [<ffffffffba3f84ee>] drm_atomic_helper_disable_plane+0xae/0xf0 [warning] [<ffffffffba41a518>] ? drm_modeset_lock+0x68/0xe0 [warning] [<ffffffffba40b311>] __setplane_internal+0x171/0x270 [warning] [<ffffffffba41a620>] ? 
drm_modeset_lock_all_crtcs+0x90/0xa0 [warning] [<ffffffffba40f1a8>] drm_mode_setplane+0x138/0x1b0 [warning] [<ffffffffba40102b>] drm_ioctl+0x14b/0x510 [warning] [<ffffffffba40f070>] ? drm_plane_check_pixel_format+0x50/0x50 [warning] [<ffffffffba1afc94>] do_vfs_ioctl+0x2c4/0x4a0 [warning] [<ffffffffba2aea89>] ? tomoyo_file_ioctl+0x19/0x20 [warning] [<ffffffffba2a05a3>] ? security_file_ioctl+0x43/0x60 [warning] [<ffffffffba1afee9>] SyS_ioctl+0x79/0x90 [warning] [<ffffffffba001cba>] ? syscall_return_slowpath+0xaa/0x140 [warning] [<ffffffffba765f57>] entry_SYSCALL_64_fastpath+0x12/0x66 [warning] ---[ end trace e040b901003e878e ]---
Hello Thomas, is this problem still occurring? Have you changed any configuration in SW or HW? Do you have new logs that provide new information? Thank you.
Yes, I still get hangs. According to my logs I've been getting a hang on average every 10 days. Here is the software I use. Hardware is unchanged.

xorg-server-1.19.3
kernel-4.4.74
mesa-17.0.6
xf86-video-intel-git (from June 1)

The last hang I got was yesterday. The screen froze, audio kept playing, the mouse cursor moved and changed shape but nothing was redrawn. It will remain stuck in that state indefinitely unless I try to switch to a framebuffer terminal. Then it will be stuck for another 60 sec until some timeout fires and I get to the framebuffer. After that I can switch back to the X server like nothing happened.

Dmesg error:

2017 Jun 28 00:17:13 multivac [err] DMAR: DRHD: handling fault status reg 3
2017 Jun 28 00:17:13 multivac [err] DMAR: DMAR:[DMA Read] Request device [00:02.0] fault addr fa40d000
2017 Jun 28 00:17:13 multivac [err] DMAR:[fault reason 06] PTE Read access is not set
[...]
2017 Jun 28 01:26:50 multivac [warning] ------------[ cut here ]------------ 2017 Jun 28 01:26:50 multivac [warning] WARNING: CPU: 0 PID: 3139 at /usr/src/linux-4.4.74/drivers/gpu/drm/i915/intel_display.c:3965 intel_crtc_wait_for_pending_flips+0x1dd/0x230() 2017 Jun 28 01:26:50 multivac [warning] WARN_ON(wait_event_timeout(dev_priv->pending_flip_queue, !intel_crtc_has_pending_flip(crtc), 60*HZ) == 0)Modules linked in: cfg80211 iptable_nat nf_nat_ipv4 nf_nat xt_limit xt_conntrack iptable_filter iptable_mangle ip_tables iTCO_wdt kvm_intel kvm snd_hda_codec_hdmi crc32_pclmul snd_hda_intel snd_hda_codec lpc_ich mfd_core uas snd_hwdep usb_storage snd_hda_core hid_microsoft joydev 2017 Jun 28 01:26:50 multivac [warning] CPU: 0 PID: 3139 Comm: X Not tainted 4.4.74 #17 2017 Jun 28 01:26:50 multivac [warning] Hardware name: Gigabyte Technology Co., Ltd. 
Z97X-Gaming G1/Z97X-Gaming G1, BIOS F9 07/31/2015 2017 Jun 28 01:26:50 multivac [warning] 0000000000000286 b997d12cf73fc6e4 ffff880415b0baa0 ffffffff8a2f84bb 2017 Jun 28 01:26:50 multivac [warning] ffff880415b0bae8 ffffffff8aa6cdc8 ffff880415b0bad8 ffffffff8a073dc2 2017 Jun 28 01:26:50 multivac [warning] ffff88041c7ed1a8 ffff88041c208e10 ffff88041c7d3000 ffff88041c7ed000 2017 Jun 28 01:26:50 multivac [warning] Call Trace: 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a2f84bb>] dump_stack+0x4d/0x72 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a073dc2>] warn_slowpath_common+0x82/0xc0 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a073e5c>] warn_slowpath_fmt+0x5c/0x80 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a0aede3>] ? finish_wait+0x53/0x70 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a49b0ed>] intel_crtc_wait_for_pending_flips+0x1dd/0x230 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a0af0a0>] ? wake_atomic_t_function+0x70/0x70 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a49c311>] intel_pre_plane_update+0x111/0x140 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a49cae2>] intel_atomic_commit+0x352/0x6f0 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a41deee>] ? drm_atomic_check_only+0x18e/0x590 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a41e327>] drm_atomic_commit+0x37/0x60 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a3fa719>] drm_atomic_helper_disable_plane+0xa9/0xf0 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a41cae1>] ? drm_modeset_lock+0x81/0xd0 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a40db39>] __setplane_internal+0x169/0x250 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a41cbc0>] ? drm_modeset_lock_all_crtcs+0x90/0xa0 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a411606>] drm_mode_setplane+0x136/0x1b0 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a403292>] drm_ioctl+0x152/0x540 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a4114d0>] ? 
drm_plane_check_pixel_format+0x50/0x50 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a1b1338>] do_vfs_ioctl+0x298/0x480 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a2b09b9>] ? tomoyo_file_ioctl+0x19/0x20 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a2a2453>] ? security_file_ioctl+0x43/0x60 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a1b1599>] SyS_ioctl+0x79/0x90 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a1368fd>] ? context_tracking_enter+0x1d/0x20 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a767f17>] entry_SYSCALL_64_fastpath+0x12/0x66 2017 Jun 28 01:26:50 multivac [warning] ---[ end trace 1b805930a62a07c1 ]--- 2017 Jun 28 01:26:50 multivac [warning] ------------[ cut here ]------------ 2017 Jun 28 01:26:50 multivac [warning] WARNING: CPU: 0 PID: 3139 at /usr/src/linux-4.4.74/drivers/gpu/drm/i915/intel_display.c:3970 intel_crtc_wait_for_pending_flips+0x225/0x230() 2017 Jun 28 01:26:50 multivac [warning] Removing stuck page flip 2017 Jun 28 01:26:50 multivac [warning] Modules linked in: cfg80211 iptable_nat nf_nat_ipv4 nf_nat xt_limit xt_conntrack iptable_filter iptable_mangle ip_tables iTCO_wdt kvm_intel kvm snd_hda_codec_hdmi crc32_pclmul snd_hda_intel snd_hda_codec lpc_ich mfd_core uas snd_hwdep usb_storage snd_hda_core hid_microsoft joydev 2017 Jun 28 01:26:50 multivac [warning] CPU: 0 PID: 3139 Comm: X Tainted: G W 4.4.74 #17 2017 Jun 28 01:26:50 multivac [warning] Hardware name: Gigabyte Technology Co., Ltd. 
Z97X-Gaming G1/Z97X-Gaming G1, BIOS F9 07/31/2015 2017 Jun 28 01:26:50 multivac [warning] 0000000000000086 b997d12cf73fc6e4 ffff880415b0baa0 ffffffff8a2f84bb 2017 Jun 28 01:26:50 multivac [warning] ffff880415b0bae8 ffffffff8aa6cdc8 ffff880415b0bad8 ffffffff8a073dc2 2017 Jun 28 01:26:50 multivac [warning] ffff88041c7ed1a8 ffff88041c208e10 ffff88041c7d3000 ffff88041c7ed000 2017 Jun 28 01:26:50 multivac [warning] Call Trace: 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a2f84bb>] dump_stack+0x4d/0x72 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a073dc2>] warn_slowpath_common+0x82/0xc0 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a073e5c>] warn_slowpath_fmt+0x5c/0x80 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a0aede3>] ? finish_wait+0x53/0x70 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a49b135>] intel_crtc_wait_for_pending_flips+0x225/0x230 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a0af0a0>] ? wake_atomic_t_function+0x70/0x70 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a49c311>] intel_pre_plane_update+0x111/0x140 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a49cae2>] intel_atomic_commit+0x352/0x6f0 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a41deee>] ? drm_atomic_check_only+0x18e/0x590 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a41e327>] drm_atomic_commit+0x37/0x60 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a3fa719>] drm_atomic_helper_disable_plane+0xa9/0xf0 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a41cae1>] ? drm_modeset_lock+0x81/0xd0 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a40db39>] __setplane_internal+0x169/0x250 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a41cbc0>] ? drm_modeset_lock_all_crtcs+0x90/0xa0 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a411606>] drm_mode_setplane+0x136/0x1b0 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a403292>] drm_ioctl+0x152/0x540 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a4114d0>] ? 
drm_plane_check_pixel_format+0x50/0x50 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a1b1338>] do_vfs_ioctl+0x298/0x480 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a2b09b9>] ? tomoyo_file_ioctl+0x19/0x20 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a2a2453>] ? security_file_ioctl+0x43/0x60 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a1b1599>] SyS_ioctl+0x79/0x90 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a1368fd>] ? context_tracking_enter+0x1d/0x20 2017 Jun 28 01:26:50 multivac [warning] [<ffffffff8a767f17>] entry_SYSCALL_64_fastpath+0x12/0x66 2017 Jun 28 01:26:50 multivac [warning] ---[ end trace 1b805930a62a07c2 ]---

There was a DMAR error before the hang this time. I have the IOMMU on at all times with the intel_iommu=on kernel argument. There is a well-known bug on Haswell that results in broken audio over HDMI if the IOMMU is on. As far as I know there is no solution to that problem and most developers have given up on it. I don't use HDMI audio so I don't care about it, but perhaps this hang is related?
https://bugzilla.kernel.org/show_bug.cgi?id=60769

I could disable the IOMMU to test if the hangs go away, but since the hang only happens once every 10 days I would have to run without the IOMMU for a month or two to make sure. I need the IOMMU for virtualization and don't want to disable it for that long.

The "Request device [00:02.0]" in the DMAR error is 00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller.
Here is a dump of /proc/iomem right after the hang: 00000000-00000fff : reserved 00001000-0009d7ff : System RAM 0009d800-0009ffff : reserved 000a0000-000bffff : PCI Bus 0000:00 000c0000-000cfdff : Video ROM 000d0000-000d3fff : PCI Bus 0000:00 000d4000-000d7fff : PCI Bus 0000:00 000d8000-000dbfff : PCI Bus 0000:00 000dc000-000dffff : PCI Bus 0000:00 000e0000-000fffff : reserved 000e0000-000e3fff : PCI Bus 0000:00 000e4000-000e7fff : PCI Bus 0000:00 000f0000-000fffff : System ROM 00100000-a48b4fff : System RAM 0a000000-0a76cf21 : Kernel code 0a76cf22-0ace6c7f : Kernel data 0ae13000-0af20fff : Kernel bss a48b5000-a48bbfff : ACPI Non-volatile Storage a48bc000-a57d0fff : System RAM a57d1000-a607efff : reserved a607f000-c3226fff : System RAM c3227000-c32b8fff : reserved c32b9000-c3325fff : System RAM c3326000-c346cfff : ACPI Non-volatile Storage c346d000-c9ffefff : reserved c9fff000-c9ffffff : System RAM ca000000-caffffff : RAM buffer cb000000-cf1fffff : reserved cf200000-feafffff : PCI Bus 0000:00 d0000000-dfffffff : 0000:00:02.0 e0000000-f1ffffff : PCI Bus 0000:01 e0000000-f1ffffff : PCI Bus 0000:02 e0000000-f1ffffff : PCI Bus 0000:04 e0000000-efffffff : 0000:04:00.0 f0000000-f1ffffff : 0000:04:00.0 f6000000-f71fffff : PCI Bus 0000:01 f6000000-f70fffff : PCI Bus 0000:02 f6000000-f70fffff : PCI Bus 0000:04 f6000000-f6ffffff : 0000:04:00.0 f7000000-f707ffff : 0000:04:00.0 f7080000-f7083fff : 0000:04:00.1 f7100000-f713ffff : 0000:01:00.0 f7400000-f77fffff : 0000:00:02.0 f7800000-f78fffff : PCI Bus 0000:0e f7800000-f780ffff : 0000:0e:00.0 f7810000-f78101ff : 0000:0e:00.0 f7810000-f78101ff : ahci f7900000-f79fffff : PCI Bus 0000:0d f7900000-f790ffff : 0000:0d:00.0 f7910000-f79101ff : 0000:0d:00.0 f7910000-f79101ff : ahci f7a00000-f7afffff : PCI Bus 0000:07 f7a00000-f7a03fff : 0000:07:00.0 f7b00000-f7bfffff : PCI Bus 0000:06 f7b00000-f7b3ffff : 0000:06:00.0 f7b00000-f7b3ffff : alx f7c00000-f7c1ffff : 0000:00:19.0 f7c00000-f7c1ffff : e1000e f7c20000-f7c2ffff : 0000:00:14.0 
f7c20000-f7c2ffff : xhci-hcd f7c30000-f7c33fff : 0000:00:03.0 f7c30000-f7c33fff : ICH HD audio f7c34000-f7c340ff : 0000:00:1f.3 f7c35000-f7c357ff : 0000:00:1f.2 f7c35000-f7c357ff : ahci f7c36000-f7c363ff : 0000:00:1d.0 f7c36000-f7c363ff : ehci_hcd f7c37000-f7c373ff : 0000:00:1a.0 f7c37000-f7c373ff : ehci_hcd f7c38000-f7c38fff : 0000:00:19.0 f7c38000-f7c38fff : e1000e f7c39000-f7c3900f : 0000:00:16.0 f7c39000-f7c3900f : mei_me f7fe0000-f7feffff : pnp 00:06 f8000000-fbffffff : PCI MMCONFIG 0000 [bus 00-3f] f8000000-fbffffff : reserved f8000000-fbffffff : pnp 00:06 fec00000-fec00fff : reserved fec00000-fec003ff : IOAPIC 0 fed00000-fed03fff : reserved fed00000-fed003ff : HPET 0 fed00000-fed003ff : PNP0103:00 fed10000-fed17fff : pnp 00:06 fed18000-fed18fff : pnp 00:06 fed19000-fed19fff : pnp 00:06 fed1c000-fed1ffff : reserved fed1c000-fed1ffff : pnp 00:06 fed1f410-fed1f414 : iTCO_wdt.0.auto fed20000-fed3ffff : pnp 00:06 fed40000-fed44fff : pnp 00:00 fed45000-fed8ffff : pnp 00:06 fed90000-fed90fff : dmar0 fed91000-fed91fff : dmar1 fee00000-fee00fff : Local APIC fee00000-fee00fff : reserved ff000000-ffffffff : reserved ff000000-ffffffff : INT0800:00 ff000000-ffffffff : pnp 00:06 100000000-42fdfffff : System RAM 42fe00000-42fffffff : RAM buffer
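To compare the DMAR fault address against a dump like the one above, something along these lines could be used. It is only a sketch: the function names are hypothetical, the sample is a handful of lines copied from the dump, and the fault address is the address the device requested through the IOMMU, so it need not correspond to a physical range at all. Treat any match as a hint at most.

```python
import re

# One /proc/iomem entry: "start-end : name", with hexadecimal bounds.
IOMEM_LINE = re.compile(r"([0-9a-f]+)-([0-9a-f]+) : (.+)")


def parse_iomem(text):
    """Parse /proc/iomem style text into (start, end, name) tuples."""
    ranges = []
    for line in text.splitlines():
        match = IOMEM_LINE.match(line.strip())
        if match:
            start, end, name = match.groups()
            ranges.append((int(start, 16), int(end, 16), name.strip()))
    return ranges


def ranges_containing(ranges, addr):
    """Return the names of all ranges whose bounds include addr."""
    return [name for start, end, name in ranges if start <= addr <= end]


sample = """\
cf200000-feafffff : PCI Bus 0000:00
d0000000-dfffffff : 0000:00:02.0
f7400000-f77fffff : 0000:00:02.0
f8000000-fbffffff : PCI MMCONFIG 0000 [bus 00-3f]
f8000000-fbffffff : reserved
"""

# Fault address taken from the DMAR error in the previous comment.
hits = ranges_containing(parse_iomem(sample), 0xFA40D000)
```

On this excerpt the address lands outside both 0000:00:02.0 ranges and inside the f8000000-fbffffff entries, for whatever that is worth given the caveat above.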
(In reply to Thomas Lindroth from comment #10) Thanks for the update Thomas. If any other information is needed for this case, it will be commented below.
Quick note: DMAR error could be related to this one https://bugs.freedesktop.org/show_bug.cgi?id=89360
(In reply to Elizabeth from comment #12)
> Quick note: DMAR error could be related to this one
> https://bugs.freedesktop.org/show_bug.cgi?id=89360

Hello, it seems there has been no new progress on this case. You could still try the drm-tip branch (https://cgit.freedesktop.org/drm-tip) or the workaround from bug 89360, setting intel_iommu=igfx_off in grub.
I tried setting intel_iommu=on,igfx_off but as expected this broke the IOMMU in kvm. I don't know why that happens. As I understand it, igfx_off should only disable the IOMMU dedicated to the iGPU without changing anything else. With igfx_off I got errors like these when trying to start a VM with kvm:

DMAR: DRHD: handling fault status reg 3
DMAR: DMAR:[DMA Read] Request device [04:00.1] fault addr 1eac00000
DMAR:[fault reason 12] non-zero reserved fields in PTE

Device [04:00.1] is one of the devices I assign to the VM. "non-zero reserved fields in PTE" is an odd error. It makes me think there is some corruption of the IOMMU page tables caused by the igfx_off option.

While I was testing igfx_off I got lucky and did get a hang. The hang looked the same as all my other hangs: "WARNING: CPU: 0 PID: 3133 at /usr/src/linux-4.4.89/drivers/gpu/drm/i915/intel_display.c:3965 intel_crtc_wait_for_pending_flips+0x1dd/0x230()". Since the hangs happen even with igfx_off, I guess the problem is not IOMMU related.
I was looking in /sys for something and accidentally discovered that reading from /sys/kernel/debug/dri/0/i915_gem_pageflip can trigger the hang. The following content in i915_gem_pageflip will result in a hang:

Flip queued on pipe A (plane A)
Flip queued on blitter ring at seqno a9e95, next seqno a9e97 [current breadcrumb a9e96], completed? 1
Flip queued on frame 533858, (was ready on frame 0), now 533858
Stall check enabled, 1 prepares
Current scanout address 0x00b80000
New framebuffer address 0x02780000
MMIO update completed? 0
No flip due on pipe B (plane B)
Flip queued on pipe C (plane C)
Flip queued on blitter ring at seqno a9e96, next seqno a9e97 [current breadcrumb a9e96], completed? 1
Flip queued on frame 533715, (was ready on frame 0), now 533715
Stall check enabled, 1 prepares
Current scanout address 0x00b8f000
New framebuffer address 0x0278f000
MMIO update completed? 0

but this content will not hang:

No flip due on pipe A (plane A)
No flip due on pipe B (plane B)
No flip due on pipe C (plane C)

So basically reading from i915_gem_pageflip when there is a pageflip queued can cause the hang, but like before the hang also happens when I'm not reading from /sys.

The hang is like before. I can move the mouse and use keyboard shortcuts but the screen is frozen. If I try switching to the framebuffer it will come up after 60 sec.

Using UXA instead of SNA, or using modesetting, will never hang. i915_gem_pageflip will always read "No flip due on pipe ..." for those. If I try to use a compositor like compton with modesetting there are still no hangs even though i915_gem_pageflip shows pageflips.

I'm running kernel 4.4.110 now but I also tested 4.9.75 and it also hangs. The hang in 4.9.75 is worse because I couldn't move the mouse or switch to the framebuffer. I had to sysrq reboot.

Since I now have a deterministic way of triggering the hang (or a similar hang) it should be easier to test.
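The difference between the two cases can be checked mechanically on a saved copy of the file. The sketch below deliberately takes text as input instead of opening i915_gem_pageflip itself, since reading that file is what triggers the hang on the affected kernels. The function name is hypothetical and the samples are shortened excerpts of the output quoted above.

```python
import re

# Only the per-pipe header line counts; the "Flip queued on blitter ring"
# and "Flip queued on frame" detail lines are intentionally not matched.
QUEUED = re.compile(r"Flip queued on pipe ([A-Z])")


def pipes_with_queued_flip(pageflip_text):
    """Return the pipe letters that report a queued flip."""
    return QUEUED.findall(pageflip_text)


hanging = """\
Flip queued on pipe A (plane A)
Flip queued on blitter ring at seqno a9e95, next seqno a9e97 [current breadcrumb a9e96], completed? 1
Flip queued on frame 533858, (was ready on frame 0), now 533858
No flip due on pipe B (plane B)
Flip queued on pipe C (plane C)
"""

idle = """\
No flip due on pipe A (plane A)
No flip due on pipe B (plane B)
No flip due on pipe C (plane C)
"""

busy_pipes = pipes_with_queued_flip(hanging)
idle_pipes = pipes_with_queued_flip(idle)
```

For the first sample this reports pipes A and C, and for the second it reports none, which lines up with the hang and no-hang cases described above.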
First of all, sorry about the spam. This is a mass update for our bugs. Sorry if you find this annoying, but we are trying to understand whether this bug is still valid. If the bug investigation is still in progress, please ignore this, and I apologize! If you think this bug is no longer valid, please comment on the bug that it can be closed. If you haven't tested with our latest pre-upstream tree (drm-tip), can you do that as well to see whether the issue is still valid there? If you cannot see the issue there, please comment on the bug.
The file /sys/kernel/debug/dri/0/i915_gem_pageflip was removed with the legacy flip code in 4.14-rc1 so I can't easily test any kernel more recent than that. The hang still exists in 4.4.125 and the 4.4 series is supported for several more years so the bug is still valid.
OK, thanks for the feedback.
Thomas, sorry for the delay... Do you still have this issue with latest drm-tip? (https://cgit.freedesktop.org/drm-tip)
The hang doesn't seem to happen in recent kernels. I use 4.14 now and there are no hangs. I don't know what version fixed it, but I guess it was fixed by 4c01ded5732d6533a2858fae30c197f734745062 "drm/i915: Use atomic page flip for intel again" in 4.12.

The bug likely still exists in kernels 4.4 and 4.9 (I haven't tested in a while) and they are supported up until 2023. Realistically this bug will never get fixed in those kernels, so I might just as well let you close this bug as WONTFIX.
(In reply to Thomas Lindroth from comment #20)
> The hang doesn't seem to happen in recent kernels. I use 4.14 now and there
> are no hangs. I don't know what version fixed it but I guess it was fixed by
> 4c01ded5732d6533a2858fae30c197f734745062 "drm/i915: Use atomic page flip for
> intel again" in 4.12.
>
> The bug likely still exists in kernel 4.4 and 4.9 (I haven't tested in a
> while) and they are supported up until year 2023. Realistically this bug
> will never get fixed in those kernels so I'd might just as well let you
> close this bug as WONTFIX.

The bug has been fixed upstream. While the Linux Foundation is taking care of some of the backporting of fixes, invasive fixes are not going to be backported. As far as we are concerned, the latest LTS kernel is working, so that's all we can commit to.

Thanks for reporting back!

PS: Have you tried using the modesetting driver? I am adding support right now for TearFree, if this is what prevented you from using it.