Created attachment 137665 [details]
Bug description: My entire display froze while switching between windows in X11. Nothing else seems to have hung: music was still playing, and everything returned to normal after I sent SIGKILL to Blender, which was running on the nVidia GPU.
Details / Reproducing steps:
- Blender 2.79 was running on the nVidia GPU through primus (started with primusrun). CUDA was used to render Blender Cycles images. The Blender window had been inactive for a while and had not rendered any image for at least 10 minutes.
- I accidentally switched to the Blender window by clicking just below the window icon I was aiming for in a vertical xfce4-panel, then used the mouse wheel to reach another window above it in the list. I scrolled through 4 other windows before reaching Chromium's, which is where the hang happened.
- Xorg did not visually respond to VT switch requests in the minute or so following the freeze, although it turned out later that the switch itself had worked; I left tty2 active (still without visual feedback; X11 is on tty1).
- I suspended and then resumed the laptop; the display was the same before and after.
- I ssh'd into my machine, where I ran:
* `htop`, which did not show any CPU usage other than itself, sshd, firefox and pulseaudio (which were playing music in the background)
* `perf top` showed no graphics-related perf event samples
* `killall -9 blender`
- At this point the display had not updated but was on tty2 (I had expected that killing Blender would unclog the graphics stack and let the console render)
- Alt+F1, and X11 resumes
- Ctrl+Alt+F2 and tty2 displays properly
- Back to X11, read dmesg and report this bug
System environment (package versions as reported by `pacman`):
-- chipset: HD4000 (part of an Intel i5-3317U; Ivy Bridge)
-- system architecture: 64-bit
-- xf86-video-intel: 1:2.99.917+812+g75795523-1
-- xserver: 1.19.6+13+gd0d1a694f-1
-- mesa: 17.3.5-1
-- libdrm: 2.4.90-3
-- kernel: 4.15.5-1-ARCH #1 SMP PREEMPT Thu Feb 22 22:15:20 UTC 2018 x86_64
-- Linux distribution: Arch Linux
-- Machine or mobo model: ASUS K56CB
-- Display connector: LVDS panel
-- nvidia: 390.25-13
-- nvidia GPU: GeForce 740M
-- primus: 20151110-7
-- bumblebee: 3.2.1-16
-- bbswitch: 0.8-113
-- blender: 17:2.79-9
-- compton (X11 compositor in use): 0.1_beta2.5-10
-- chromium: 64.0.3282.167-1
In the process of resetting the i915, a fence wait timed out:
[95008.506693] i915 0000:00:02.0: Resetting chip after gpu hang
[95010.549217] asynchronous wait on fence i915:[global]:6fd684 timed out
[95016.501277] i915 0000:00:02.0: Resetting chip after gpu hang
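The timing between these messages can be read off with a small script. This is only a minimal sketch, assuming the log lines are available as plain text with the default dmesg timestamp format; the helper names `parse_dmesg` and `reset_times` are made up for illustration and are not part of any tool.

```python
import re

# The three kernel log lines quoted in this report.
LOG = """\
[95008.506693] i915 0000:00:02.0: Resetting chip after gpu hang
[95010.549217] asynchronous wait on fence i915:[global]:6fd684 timed out
[95016.501277] i915 0000:00:02.0: Resetting chip after gpu hang
"""

# Assumed format: "[seconds.microseconds] message" (default dmesg output).
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def parse_dmesg(text):
    """Return (timestamp_seconds, message) pairs for timestamped lines."""
    events = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            events.append((float(match.group(1)), match.group(2)))
    return events


def reset_times(events):
    """Return the timestamps of the i915 chip reset messages."""
    return [ts for ts, msg in events if "Resetting chip after gpu hang" in msg]


events = parse_dmesg(LOG)
resets = reset_times(events)

# Print the delay between each message and the one before it.
for (t0, _), (t1, msg) in zip(events, events[1:]):
    print(f"+{t1 - t0:.3f}s  {msg}")
```

On the lines above, the fence wait times out about 2 s after the first reset, and the second reset follows the first by roughly 8 s.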
Starting up or using primus-forwarded software sometimes causes graphics corruption in some windows, which is fixed when a redraw happens. The corruption also seems to affect all types of graphics buffers on the i915, such as font caches/atlases; some applications, like Steam, are particularly affected. It would not be surprising if more than just buffer contents were getting corrupted.
Booting with intel_iommu enabled prevents graphical output as soon as the kernel switches away from efifb to inteldrmfb (that is, early in boot); maybe it could have a beneficial impact on the graphics corruption problem if it worked...
Hi, could you try mesa 17.3.6 or the latest 18.0? Is there any way to trigger this more reliably? Does it still happen if you're not using the nVidia GPU?
Dorian, is this bug easily and regularly reproducible on your hw/sw configuration?
The bug could not be reproduced with a similar SW configuration.
Mesa 17.3.5 (and 17.3.7) and blender 17:2.79-10 were installed.