|Summary:||[GLK] GPU HANG in kodi|
|Product:||Mesa||Reporter:||Erik Sandlund <erik.sandlund>|
|Component:||Drivers/DRI/i965||Assignee:||Intel 3D Bugs Mailing List <intel-3d-bugs>|
|Status:||RESOLVED MOVED||QA Contact:||Intel 3D Bugs Mailing List <intel-3d-bugs>|
|i915 platform:||GLK||i915 features:||GPU hang|
dmesg Ubuntu 18.04 server
dmesg with error
i965: flush render target before ISP disable
Mesa 18.1.3 error log
Mesa 18.1.3 dmesg log
Mesa 18.1.3 xorg log
Mesa 18.2 /sys/class/drm/card0/error
Description Erik Sandlund 2018-07-08 21:55:36 UTC
Created attachment 140515 [details] /sys/class/drm/card0/error
Hi, I'm using a NUC7PJYH for Kodi. It worked fine for a couple of weeks, but lately the i915 driver hangs when I stress it. I have tried reinstalling the system (Ubuntu 18.04 Server) and tried various LibreElec versions. I can reproduce the error by running glxgears on openbox. The time until the crash varies depending on the settings for the Xorg Intel driver. If I load up Kodi, it crashes when I move around in the menus. Videos seem to play OK. I'm supplying an error log from a LibreElec "Milhouse build" since I figure it's the least touched by my messing around.
Comment 1 Francesco Balestrieri 2018-07-09 05:19:36 UTC
Can you also send a dmesg from boot with kernel options drm.debug=0x1e log_buf_len=4M? And it would be great if you could try to reproduce using drm-tip (https://cgit.freedesktop.org/drm-tip)
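For readers unfamiliar with the option, drm.debug takes a bitmask of DRM debug categories. The sketch below decodes the requested value 0x1e. The category names and bit positions are an assumption taken from the kernel's DRM_UT_* definitions as I recall them; check include/drm/drm_print.h in your kernel tree before relying on them.

```python
# Assumed DRM debug categories (DRM_UT_* in the kernel's drm_print.h).
# The bit assignments below are an assumption; verify against your kernel.
DRM_DEBUG_BITS = [
    (0x01, "CORE"),
    (0x02, "DRIVER"),
    (0x04, "KMS"),
    (0x08, "PRIME"),
    (0x10, "ATOMIC"),
    (0x20, "VBL"),
]


def decode_drm_debug(mask):
    """Return the names of the debug categories enabled in `mask`."""
    return [name for bit, name in DRM_DEBUG_BITS if mask & bit]


# 0x1e is binary 11110: every category listed above except CORE and VBL.
enabled = decode_drm_debug(0x1E)
```

Under these assumptions, 0x1e enables driver, KMS, PRIME and atomic messages while leaving out the very verbose core messages.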
Comment 2 Erik Sandlund 2018-07-09 08:07:55 UTC
Created attachment 140517 [details] dmesg Ubuntu 18.04 server
Comment 3 Erik Sandlund 2018-07-09 08:08:41 UTC
Created attachment 140518 [details] dmesg with error
Comment 4 Erik Sandlund 2018-07-09 08:09:25 UTC
Created attachment 140519 [details] /sys/class/drm/card0/error Ubuntu
Comment 5 Erik Sandlund 2018-07-09 08:10:41 UTC
I've tried drm-tip from http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-tip/current/ and reproduced the error. Can install it again later tonight and provide logs.
Comment 6 Lionel Landwerlin 2018-07-09 08:45:20 UTC
The first error state reminds me of https://bugs.freedesktop.org/show_bug.cgi?id=106243. This hang seems to have happened after the fix, so it looks like we might need a bigger hammer before disabling the indirect state pointers... On the other hand, the second error state seems to indicate that the hang happened in a batch buffer that isn't part of the error state (could be a batch from i915?). Would you be able to try the attached patch for Mesa? Thanks a lot!
Comment 7 Lionel Landwerlin 2018-07-09 08:45:46 UTC
Created attachment 140520 [details] [review] i965: flush render target before ISP disable
Comment 8 Lionel Landwerlin 2018-07-09 10:22:41 UTC
If you could give your settings for the xorg intel driver and your version of Mesa, that would be really useful too. Thanks!
Comment 9 Erik Sandlund 2018-07-10 00:02:25 UTC
Created attachment 140531 [details] Mesa 18.1.3 error log
Comment 10 Erik Sandlund 2018-07-10 00:03:01 UTC
Created attachment 140532 [details] Mesa 18.1.3 dmesg log
Comment 11 Erik Sandlund 2018-07-10 00:03:24 UTC
Created attachment 140533 [details] Mesa 18.1.3 xorg log
Comment 12 Erik Sandlund 2018-07-10 00:06:11 UTC
Patch applied in logs above. I'm not used to compiling and applying patches though so I don't know if everything worked. Bug still occurs though. No grub-settings except for debug-string. No xorg-settings except Driver Intel and TearFree on. Mesa is now 18.1.3. Was Standard Bionic before, 18.0.0-rc5.
Comment 13 Lionel Landwerlin 2018-07-10 00:14:35 UTC
(In reply to Erik Sandlund from comment #12)
> Patch applied in logs above. I'm not used to compiling and applying patches
> though so I don't know if everything worked. Bug still occurs though.
>
Looking at the traces, it seems the patch wasn't applied :(
Comment 14 Erik Sandlund 2018-07-10 21:42:14 UTC
What should I look for to see if patch is in use? I've re-added patch and recompiled but I don't want to flood this bug report with my not-so-useful attachments.
Comment 15 Lionel Landwerlin 2018-07-10 23:52:22 UTC
Apologies, I must have downloaded the wrong attachment (or messed up locally). Looks like you're now hitting a different issue.

I'm looking at where the GPU stopped to figure out what's wrong. Here is how to do it:

If you compile the mesa repository with the intel tools activated (I usually use meson):

$ cd mesa
$ meson -Dgles2=true -Ddri-drivers=i915,i965 -Dplatforms=x11,drm,wayland,surfaceless -Dgallium-drivers= --buildtype=release -Dvulkan-drivers=intel -Dtools=intel -Dbuild-tests=true build .
$ ninja -C build

Then you can run the aubinator_error_decode tool:

$ ./build/src/intel/tools/aubinator_error_decode /path/to/my/card0/error

Then search for "ACTHD:". If I take the last error state you posted, it should be this line:

ACTHD: 0x00000000 001389f4

Then search with that address: 001389f4

0x00135b04: 0x78150009: 3DSTATE_CONSTANT_VS

This is the instruction triggering the GPU hang. In the previous error state, it was a PIPE_CONTROL (which was related to the other bug I mentioned). So it looks like the patch helps. Is your machine hanging as often with this patch?
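The manual search described above (find the "ACTHD:" line, then look for that address in the decoded output) could be scripted roughly as follows. This is only a sketch: it assumes the ACTHD line has the two-word layout quoted in this comment, and the helper names `find_acthd` and `lines_mentioning` are made up for illustration, not part of any Mesa tool.

```python
import re

# Matches a line such as "ACTHD: 0x00000000 001389f4" (layout assumed from
# the example quoted in this thread): a high word, then a low word.
ACTHD_RE = re.compile(r"ACTHD:\s*0x([0-9a-fA-F]{8})\s+([0-9a-fA-F]{8})")


def find_acthd(text):
    """Return a list of (high, low) hex strings, one per ACTHD line found."""
    return [(m.group(1).lower(), m.group(2).lower())
            for m in ACTHD_RE.finditer(text)]


def lines_mentioning(text, address):
    """Return the lines that contain `address`, skipping the ACTHD lines."""
    needle = address.lower()
    return [line for line in text.splitlines()
            if needle in line.lower() and "ACTHD" not in line]
```

To use it, save the output of aubinator_error_decode to a file, read it into a string, call `find_acthd` on it, and pass the low word of the result to `lines_mentioning`. The instruction reported at or near that address is the one to look at.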
Comment 16 Erik Sandlund 2018-07-11 09:29:47 UTC
Yes, no difference really. Directly after compile it worked pretty good, but after a few minutes it hung and started to hang more often after that. I see gfx corruption on the screen which sometimes looks the same even after a reboot. I've compiled drm-tip with https://patchwork.freedesktop.org/patch/237548/ applied. No difference though really. I also tried your patch on mesa 18.2.0 with no real difference. Should I supply more logs?
Comment 17 Erik Sandlund 2018-07-13 11:06:06 UTC
Created attachment 140623 [details] Mesa 18.2 /sys/class/drm/card0/error
/sys/class/drm/card0/error
OpenGL core profile version string: 4.5 (Core Profile) Mesa 18.2.0-devel (git-0288fe8d04)
name: i915
vermagic: 4.18.0-rc4+ SMP mod_unload
Comment 18 Lionel Landwerlin 2018-07-13 11:19:05 UTC
(In reply to Erik Sandlund from comment #16)
> Yes, no difference really. Directly after compile it worked pretty good but
> after a few minutes it hung and started to hang more often after that. I see
> gfx corruption on the screen which sometimes looks the same even after a
> reboot.
>
> I've compiled drm-tip with https://patchwork.freedesktop.org/patch/237548/
> applied. No difference though really. I also tried your patch on mesa 18.2.0
> with no real difference.
>
> Should I supply more logs?
Hi, thanks a lot for all the traces. I don't think we'll need more at this point. We need to find what the right fix is here; your last error state shows that the patch I attached doesn't help.
Comment 19 Erik Sandlund 2018-07-14 23:19:29 UTC
This might be a hardware issue since Windows 10 also produces strange artifacts and hangs after Intel-driver install.
Comment 20 GitLab Migration User 2019-09-25 19:12:17 UTC
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1737.