Created attachment 143685 [details]
crash dump output

[21763.262354] [drm] GPU HANG: ecode 9:0:0x86cdffff, in sway [26607], reason: hang on rcs0, action: reset
[21763.262356] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[21763.262356] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[21763.262357] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[21763.262357] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[21763.262358] [drm] GPU crash dump saved to /sys/class/drm/card1/error
[21763.263378] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
Crash from 2 days ago (perhaps seeing two crashes is helpful?)

[15600.080449] [drm] GPU HANG: ecode 9:0:0x85dffffb, in sway [2338], reason: hang on rcs0, action: reset
[15600.080453] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[15600.080454] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[15600.080456] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[15600.080457] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[15600.080460] [drm] GPU crash dump saved to /sys/class/drm/card1/error
[15600.081554] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0
[18606.079498] perf: interrupt took too long (3136 > 3132), lowering kernel.perf_event_max_sample_rate to 63000
[30631.655904] nouveau 0000:01:00.0: disp: 0x000064a8[0]: INIT_GENERIC_CONDITON: unknown 0x07
[30631.695021] nouveau 0000:01:00.0: disp: 0x000064a8[0]: INIT_GENERIC_CONDITON: unknown 0x07
[40061.910500] perf: interrupt took too long (4001 > 3920), lowering kernel.perf_event_max_sample_rate to 49000
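When comparing several hangs like the two above, it can help to pull the ecode, process, and engine out of the kernel log mechanically. The following is a minimal sketch, assuming the "GPU HANG" line keeps the layout shown in these logs; the names `HANG_RE` and `parse_gpu_hangs` are made up for illustration and are not part of any i915 or Mesa tooling.

```python
import re

# Matches the i915 "GPU HANG" line in the layout seen in this report.
# Assumption: other kernel versions may word this line differently.
HANG_RE = re.compile(
    r"\[\s*(?P<timestamp>[^\]]+)\]\s+\[drm\]\s+GPU HANG:\s+"
    r"ecode\s+(?P<ecode>[0-9a-fA-Fx:]+),\s+"
    r"in\s+(?P<process>\S+)\s+\[(?P<pid>\d+)\],\s+"
    r"reason:\s+(?P<reason>.+?),\s+action:\s+(?P<action>\w+)"
)


def parse_gpu_hangs(log_text):
    """Return one dict per GPU HANG line found in log_text."""
    hangs = []
    for match in HANG_RE.finditer(log_text):
        info = match.groupdict()
        info["pid"] = int(info["pid"])
        hangs.append(info)
    return hangs


if __name__ == "__main__":
    # Two hang lines copied from the logs in this report.
    sample = (
        "[21763.262354] [drm] GPU HANG: ecode 9:0:0x86cdffff, in sway [26607], "
        "reason: hang on rcs0, action: reset\n"
        "[15600.080449] [drm] GPU HANG: ecode 9:0:0x85dffffb, in sway [2338], "
        "reason: hang on rcs0, action: reset\n"
    )
    for hang in parse_gpu_hangs(sample):
        print(hang["timestamp"], hang["ecode"], hang["process"], hang["pid"], hang["reason"])
```

Feeding it the output of `dmesg` (or a saved log file) gives one record per hang, which makes it easier to see whether the ecode and the affected process stay the same between occurrences.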
Created attachment 143686 [details] crash dump output
What kernel & mesa version are you running?
Kernel: 4.20.14-200.fc29.x86_64
LibDRM: 2.4.97
Mesa: 18.3.4
Device: Mesa DRI Intel(R) HD Graphics 530 (Skylake GT2) (0x191b)
Created attachment 143703 [details] [review]
i965: make sure to have cs stall before vf cache invalidate

Is there any way you could give this patch a try? How frequent are the hangs? Thanks a lot!
The hangs aren't that frequent, so a week of testing may be necessary to determine whether this fixes the issue. I applied the patch to the RPM sources in Fedora 29 and it applied cleanly to 18.3.4.

The Fedora mesa package is split into a number of different RPM packages. Do you know if I can just update the mesa-dri-drivers package, or do I need to update all of them? For reference, the files in that package are:

/usr/lib64/dri/i915_dri.so
/usr/lib64/dri/i965_dri.so
/usr/lib64/dri/kms_swrast_dri.so
/usr/lib64/dri/nouveau_dri.so
/usr/lib64/dri/nouveau_drv_video.so
/usr/lib64/dri/nouveau_vieux_dri.so
/usr/lib64/dri/r200_dri.so
/usr/lib64/dri/r300_dri.so
/usr/lib64/dri/r600_dri.so
/usr/lib64/dri/r600_drv_video.so
/usr/lib64/dri/radeon_dri.so
/usr/lib64/dri/radeonsi_dri.so
/usr/lib64/dri/radeonsi_drv_video.so
/usr/lib64/dri/swrast_dri.so
/usr/lib64/dri/virtio_gpu_dri.so
/usr/lib64/dri/vmwgfx_dri.so
/usr/lib64/gallium-pipe
/usr/lib64/gallium-pipe/pipe_nouveau.so
/usr/lib64/gallium-pipe/pipe_r300.so
/usr/lib64/gallium-pipe/pipe_r600.so
/usr/lib64/gallium-pipe/pipe_radeonsi.so
/usr/lib64/gallium-pipe/pipe_swrast.so
/usr/lib64/gallium-pipe/pipe_vmwgfx.so
/usr/share/drirc.d
/usr/share/drirc.d/00-mesa-defaults.conf
I've just upgraded everything. Will report back sometime next week.
Created attachment 143741 [details] crash dump 3
Reporting back faster than I thought, the problem remains:

[56521.372029] [drm] GPU HANG: ecode 9:0:0x85dffffb, in sway [2371], reason: hang on rcs0, action: reset
[56521.373094] i915 0000:00:02.0: Resetting rcs0 for hang on rcs0

Attached error log above.
Hi, what are you usually doing when the hang occurs? Are any apps in use, or are you just navigating the system?
Seems to occur mostly when interacting with video. I know it's happened while using Bluejeans video conferencing, but I may have seen it while watching videos in the browser too.
I tried to install sway on my SKL, using Fedora 29 (KDE). I used this guide: https://nationpigeon.com/compiling-sway-on-fedora-29/

I could log in to sway, but the only thing I see is a clock (a working clock). Following the wiki entry at https://github.com/swaywm/sway/wiki#i-just-installed-sway-i-can-move-my-mouse-cursor-but-my-keyboard-does-not-work I added the defaults from the man page to .config/sway/config, but it didn't help.

As I understand it, I need to set up all bindings manually, is that correct? If so, it would be helpful if you could provide your config file so I can reuse it.
Created attachment 143758 [details]
sway config

Here's my sway config. You really just need to make sure you have your terminal shortcut set up correctly.
Aha, thanks for the clarification. I sorted out my problem, ran cmd and connected to the network. I have now been running firefox-wayland for about 2 hours (a YouTube video in one tab, a WebGL app (aquarium) in another). Still nothing so far; I'll keep waiting.
One thing I just realized is that the crash DID go away. I should not have used the word crash in the latest upload. I assumed the patch would fix the hang too. Is the patch supposed to fix the crash, hang, or both? Denis, thanks for your reproducing efforts. If there's any additional debug I can do since it sounds like your system is not behaving in the same way let me know.
(In reply to Jeff Peeler from comment #15)
> One thing I just realized is that the crash DID go away. I should not have
> used the word crash in the latest upload. I assumed the patch would fix the
> hang too. Is the patch supposed to fix the crash, hang, or both?

The patch would only help with the hang.

> Denis, thanks for your reproducing efforts. If there's any additional debug
> I can do since it sounds like your system is not behaving in the same way
> let me know.
(In reply to Jeff Peeler from comment #15)
> One thing I just realized is that the crash DID go away. I should not have
> used the word crash in the latest upload. I assumed the patch would fix the
> hang too. Is the patch supposed to fix the crash, hang, or both?
>
> Denis, thanks for your reproducing efforts. If there's any additional debug
> I can do since it sounds like your system is not behaving in the same way
> let me know.

I don't know, our configurations look very similar. Did you build sway from git or install it from a repository? Which browser do you use?
I'm using Firefox 65 and built sway from the 1.0 tag. A 1.0 release of sway doesn't exist in Fedora (yet).
Given that the patch did seem to help some, is this something that is going to be merged or is there anything else I can do to push this along?
Hi, sway dev here. i915 devs: let me know if you need info about userspace. Our DRM code behaves more or less like Weston, so I'm not sure what could be wrong here.

However, I see that the Intel card is card1 and there are nouveau logs. What is your setup exactly? Is card0 an NVIDIA card? You could also e.g. run https://github.com/ascent12/drm_info on your device to get information about which cards are plugged in and what their capabilities are.

On multi-GPU setups we render on one primary GPU and use DMA-BUFs to copy buffers from the primary GPU to the secondary one (we don't do direct scan-out yet, this really is a copy). If you could share sway debug logs (sway -d >sway.log 2>&1), that would help figure out the exact setup sway/wlroots runs on.

If you connect all monitors to one card, you could force sway/wlroots to use only this card. This might or might not help, and will hide connectors of the other cards. You can do so by exporting WLR_DRM_DEVICES=/dev/dri/card0 (or card1).
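As a quick cross-check of which /dev/dri/cardN node belongs to which kernel driver, the sysfs symlink `/sys/class/drm/cardN/device/driver` can be read directly. The following is a minimal sketch, not a replacement for drm_info (it reports no connectors or capabilities); the function name `list_drm_cards` and its `sysfs_drm` parameter are made up for this example.

```python
import os
import re


def list_drm_cards(sysfs_drm="/sys/class/drm"):
    """Map each cardN entry under sysfs_drm to the name of its bound kernel driver."""
    cards = {}
    if not os.path.isdir(sysfs_drm):
        return cards
    for entry in sorted(os.listdir(sysfs_drm)):
        # Keep plain cardN entries only; skip connectors such as card0-eDP-1.
        if not re.fullmatch(r"card\d+", entry):
            continue
        driver_link = os.path.join(sysfs_drm, entry, "device", "driver")
        if os.path.islink(driver_link):
            # The link target's last path component is the driver name.
            cards[entry] = os.path.basename(os.readlink(driver_link))
        else:
            # No driver bound, or an unexpected sysfs layout.
            cards[entry] = None
    return cards


if __name__ == "__main__":
    for card, driver in list_drm_cards().items():
        print(f"/dev/dri/{card}: {driver or 'unknown'}")
```

Once the card driven by the intended GPU is known, its path is the value to put in WLR_DRM_DEVICES.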
According to drm_info (how else would I have found this information?), card0 is i915 and card1 is nouveau. I set the graphics to hybrid mode instead of discrete as I was told the former was more stable.

I have a somewhat complex setup with a laptop in a dock. The dock connects to two external monitors and I have a third external monitor plugged directly into the laptop (I couldn't get the third screen working otherwise). I tried the commands you suggested to direct sway to use just one GPU, but what ended up happening is that with card0 enabled I just had my laptop screen, and with card1 just the 3 external monitors.
Created attachment 143942 [details] output from drm_info
Created attachment 143943 [details] sway debug log
After a system upgrade, the bug still persists. The title should probably be changed, as I see the issue in Gnome on Wayland too.

Kernel: 5.0.16-300.fc30.x86_64
LibDRM: 2.4.98
Mesa: 19.0.4

[Wed May 22 11:33:18 2019] [drm] GPU HANG: ecode 9:0:0x84dfdffb, in gnome-shell [5332], reason: hang on rcs0, action: reset
[Wed May 22 11:33:18 2019] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[Wed May 22 11:33:18 2019] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[Wed May 22 11:33:18 2019] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[Wed May 22 11:33:18 2019] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[Wed May 22 11:33:18 2019] [drm] GPU crash dump saved to /sys/class/drm/card0/error
Created attachment 144324 [details] crash dump output while using gnome
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1801.