Created attachment 140515 [details]
Hi, I'm using a NUC7PJYH for Kodi. It worked fine for a couple of weeks, but lately the i915 driver hangs when I stress it. I have tried reinstalling the system (Ubuntu 18.04 Server) and tried various LibreElec versions. I can reproduce the error by running glxgears on openbox. Depending on the settings for the Xorg Intel driver, the time until the crash varies. If I load up Kodi, it crashes when I move around in the menus. Videos seem to play OK.
I supply an error-log from a LibreElec "Milhouse build" since I figure it's the least touched by my messing around.
Can you also send a dmesg from boot with kernel options drm.debug=0x1e log_buf_len=4M?
And it would be great if you could try to reproduce using drm-tip (https://cgit.freedesktop.org/drm-tip)
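Before capturing the dmesg requested above, it can help to confirm that both options actually reached the kernel. The helper below is only a sketch: `has_debug_opts` is a made-up name, and it checks nothing more than the two option strings quoted in this comment. On Ubuntu with GRUB the options would normally be added to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, followed by update-grub and a reboot; other distributions (LibreElec, for instance) set the command line differently.

```shell
# Succeed only if both requested options appear in the given kernel command line.
# The option strings are the ones asked for in this bug; nothing else is checked.
has_debug_opts() {
    case " $1 " in
        *" drm.debug=0x1e "*) ;;
        *) return 1 ;;
    esac
    case " $1 " in
        *" log_buf_len=4M "*) ;;
        *) return 1 ;;
    esac
}
```

After rebooting, something like `has_debug_opts "$(cat /proc/cmdline)" && dmesg > dmesg-debug.txt` would then only save the log when the options are active (the output file name is arbitrary).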
Created attachment 140517 [details]
dmesg Ubuntu 18.04 server
Created attachment 140518 [details]
dmesg with error
Created attachment 140519 [details]
I've tried drm-tip from http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-tip/current/ and reproduced the error. Can install it again later tonight and provide logs.
The first error state reminds me of https://bugs.freedesktop.org/show_bug.cgi?id=106243.
This hang seems to have happened after the fix, so it looks like we might need a bigger hammer before disabling the indirect state pointers...
On the other hand the second error state seems to indicate that the hang happened in a batch buffer that isn't part of the error state (could be a batch from i915?).
Would you be able to try the attached patch for Mesa?
Thanks a lot!
Created attachment 140520 [details] [review]
i965: flush render target before ISP disable
If you could give your settings for the xorg intel driver and your version of Mesa, that would be really useful too.
Created attachment 140531 [details]
Mesa 18.1.3 error log
Created attachment 140532 [details]
Mesa 18.1.3 dmesg log
Created attachment 140533 [details]
Mesa 18.1.3 xorg log
Patch applied in logs above. I'm not used to compiling and applying patches though so I don't know if everything worked. Bug still occurs though.
No grub-settings except for debug-string. No xorg-settings except Driver Intel and TearFree on.
Mesa is now 18.1.3. It was the standard Bionic version before, 18.0.0-rc5.
(In reply to Erik Sandlund from comment #12)
> Patch applied in logs above. I'm not used to compiling and applying patches
> though so I don't know if everything worked. Bug still occurs though.
Looking at the traces, it seems the patch wasn't applied :(
What should I look for to see if the patch is in use? I've re-added the patch and recompiled, but I don't want to flood this bug report with my not-so-useful attachments.
Apologies, I must have downloaded the wrong attachment (or messed up locally).
Looks like you're now hitting a different issue.
I'm looking at where the GPU stopped to figure out what's wrong.
Here is how to do it.
First compile the Mesa repository with the Intel tools enabled (I usually use meson):
$ cd mesa
$ meson -Dgles2=true -Ddri-drivers=i915,i965 -Dplatforms=x11,drm,wayland,surfaceless -Dgallium-drivers= --buildtype=release -Dvulkan-drivers=intel -Dtools=intel -Dbuild-tests=true build .
$ ninja -C build
Then you can run the aubinator_error_decode tool:
$ ./build/src/intel/tools/aubinator_error_decode /path/to/my/card0/error
Then search for "ACTHD:". If I take the last error state you posted, it should be this line:
ACTHD: 0x00000000 001389f4
Then search for that address: 001389f4
0x00135b04: 0x78150009: 3DSTATE_CONSTANT_VS
This is the instruction triggering the GPU hang.
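The two searches above can also be scripted. This is only a rough sketch: the function names are made up, and it assumes the decoded output has been saved to a text file and that the ACTHD line has the "ACTHD: 0x<high> <low>" layout shown above.

```shell
# Print the low dword of the first ACTHD line in a decoded error state.
# Assumes the address is the last whitespace-separated field on that line.
acthd_low() {
    awk '/ACTHD:/ { print $NF; exit }' "$1"
}

# List every decoded line that mentions that address, with line numbers.
# The ACTHD line itself is included in the matches.
lines_at_acthd() {
    addr=$(acthd_low "$1")
    [ -n "$addr" ] && grep -n -i "$addr" "$1"
}
```

For example, redirect the aubinator_error_decode output to decoded.txt (an arbitrary file name) and run `lines_at_acthd decoded.txt`. The surrounding lines in the decoded file then show which command the GPU was executing.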
In the previous error state, it was a PIPE_CONTROL (which was related to the other bug I mentioned).
So it looks like the patch helps.
Is your machine hanging as often with this patch?
Yes, no difference really. Directly after compiling it worked pretty well, but after a few minutes it hung and started to hang more often after that. I see gfx corruption on the screen, which sometimes looks the same even after a reboot.
I've compiled drm-tip with https://patchwork.freedesktop.org/patch/237548/ applied. No real difference, though. I also tried your patch on mesa 18.2.0 with no real difference.
Should I supply more logs?
Created attachment 140623 [details]
Mesa 18.2 /sys/class/drm/card0/error
OpenGL core profile version string: 4.5 (Core Profile) Mesa 18.2.0-devel (git-0288fe8d04)
vermagic: 4.18.0-rc4+ SMP mod_unload
(In reply to Erik Sandlund from comment #16)
> Yes, no difference really. Directly after compile it worked pretty good but
> after a few minutes it hung and started to hang more often after that. I see
> gfx corruption on the screen which sometimes looks the same even after a
> reboot.
> I've compiled drm-tip with https://patchwork.freedesktop.org/patch/237548/
> applied. No difference though really. I also tried your patch on mesa 18.2.0
> with no real difference.
> Should I supply more logs?
Thanks a lot for all the traces. I don't think we'll need more at this point.
I think we need to find the right fix here; your last error state shows that the patch I attached doesn't help.
This might be a hardware issue since Windows 10 also produces strange artifacts and hangs after Intel-driver install.
-- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1737.