Created attachment 140515 [details]
Hi, I'm using a NUC7PJYH for Kodi. It worked fine for a couple of weeks, but lately the i915 driver hangs when I stress it. I have tried reinstalling the system (Ubuntu 18.04 Server) and tried various LibreElec versions. I can reproduce the error by running glxgears on openbox. The time until the crash varies depending on the settings for the Xorg Intel driver. If I load up Kodi, it crashes when I move around in the menus. Videos seem to play OK.
I supply an error-log from a LibreElec "Milhouse build" since I figure it's the least touched by my messing around.
Can you also send a dmesg from boot with kernel options drm.debug=0x1e log_buf_len=4M?
And it would be great if you could try to reproduce using drm-tip (https://cgit.freedesktop.org/drm-tip)
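For reference, the drm.debug value is a bitmask of DRM debug categories. The following is a minimal sketch of decoding 0x1e, assuming the category bits defined in the kernel's include/drm/drm_print.h around this kernel version. The table is an assumption to verify against your own kernel source, and the function name is hypothetical.

```python
# Assumed DRM debug category bits (check include/drm/drm_print.h for your kernel).
DRM_DEBUG_BITS = {
    0x01: "core",
    0x02: "driver",
    0x04: "kms",
    0x08: "prime",
    0x10: "atomic",
    0x20: "vbl",
}


def decode_drm_debug(mask):
    """Return the names of the categories whose bit is set in mask, lowest bit first."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if mask & bit]


if __name__ == "__main__":
    # 0x1e is the value requested in this bug report.
    print(decode_drm_debug(0x1E))
```

Under that assumption, 0x1e turns on driver, KMS, PRIME and atomic messages while leaving the noisier core category off.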
Created attachment 140517 [details]
dmesg Ubuntu 18.04 server
Created attachment 140518 [details]
dmesg with error
Created attachment 140519 [details]
I've tried drm-tip from http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-tip/current/ and reproduced the error. Can install it again later tonight and provide logs.
The first error state reminds me of https://bugs.freedesktop.org/show_bug.cgi?id=106243.
This hang seems to have happened after the fix, so it looks like we might need a bigger hammer before disabling the indirect state pointers...
On the other hand the second error state seems to indicate that the hang happened in a batch buffer that isn't part of the error state (could be a batch from i915?).
Would you be able to try the attached patch for Mesa?
Thanks a lot!
Created attachment 140520 [details] [review]
i965: flush render target before ISP disable
If you could give your settings for the xorg intel driver and your version of Mesa, that would be really useful too.
Created attachment 140531 [details]
Mesa 18.1.3 error log
Created attachment 140532 [details]
Mesa 18.1.3 dmesg log
Created attachment 140533 [details]
Mesa 18.1.3 xorg log
Patch applied in logs above. I'm not used to compiling and applying patches though so I don't know if everything worked. Bug still occurs though.
No grub-settings except for debug-string. No xorg-settings except Driver Intel and TearFree on.
Mesa is now 18.1.3. Was Standard Bionic before, 18.0.0-rc5.
(In reply to Erik Sandlund from comment #12)
> Patch applied in logs above. I'm not used to compiling and applying patches
> though so I don't know if everything worked. Bug still occurs though.
Looking at the traces, it seems the patch wasn't applied :(
What should I look for to see if the patch is in use? I've re-added the patch and recompiled, but I don't want to flood this bug report with my not-so-useful attachments.
Apologies, I must have downloaded the wrong attachment (or messed up locally).
Looks like you're now hitting a different issue.
I'm looking at where the GPU stopped to figure out what's wrong.
Here is how to do it :
If you compile the mesa repository with the intel tools activated (I usually use meson) :
$ cd mesa
$ meson -Dgles2=true -Ddri-drivers=i915,i965 -Dplatforms=x11,drm,wayland,surfaceless -Dgallium-drivers= --buildtype=release -Dvulkan-drivers=intel -Dtools=intel -Dbuild-tests=true build .
$ ninja -C build
Then you can run the aubinator_error_decode tool :
$ ./build/src/intel/tools/aubinator_error_decode /path/to/my/card0/error
Then search for "ACTHD:". If I take the last error state you posted, it should be this line :
ACTHD: 0x00000000 001389f4
Then search with that address : 001389f4
0x00135b04: 0x78150009: 3DSTATE_CONSTANT_VS
This is the instruction triggering the GPU hang.
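The two search steps above can also be scripted. This is a minimal sketch, assuming the decoder output has been saved as text and that the ACTHD line has the shape quoted above. The helper names are hypothetical, and the sample text contains only the two lines quoted in this comment, not real decoder output.

```python
import re

# Matches a line such as "ACTHD: 0x00000000 001389f4" (format assumed from the line quoted above).
ACTHD_RE = re.compile(
    r"^\s*ACTHD:\s*0x([0-9a-fA-F]{8})\s+([0-9a-fA-F]{8})\s*$", re.MULTILINE
)


def acthd_low_dword(decoded_text):
    """Return the low 32 bits of ACTHD as lowercase hex, or None if no ACTHD line is found."""
    match = ACTHD_RE.search(decoded_text)
    return match.group(2).lower() if match else None


def lines_mentioning(decoded_text, needle):
    """Return every line that contains needle (case-insensitive), like a manual search."""
    needle = needle.lower()
    return [line for line in decoded_text.splitlines() if needle in line.lower()]


# Only the two lines quoted in this comment; a real decode is much longer.
SAMPLE = """\
ACTHD: 0x00000000 001389f4
0x00135b04: 0x78150009: 3DSTATE_CONSTANT_VS
"""

if __name__ == "__main__":
    addr = acthd_low_dword(SAMPLE)
    print("ACTHD low dword:", addr)
    for line in lines_mentioning(SAMPLE, addr):
        print(line)
```

To use it on a real error state, read the saved output of aubinator_error_decode into a string and pass that instead of SAMPLE.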
In the previous error state, it was a PIPE_CONTROL (which was related to the other bug I mentioned).
So looks like the patch helps.
Is your machine hanging as often with this patch?
Yes, no difference really. Directly after compiling it worked pretty well, but after a few minutes it hung and started to hang more often after that. I see gfx corruption on the screen which sometimes looks the same even after a reboot.
I've compiled drm-tip with https://patchwork.freedesktop.org/patch/237548/ applied. No real difference, though. I also tried your patch on mesa 18.2.0 with no real difference.
Should I supply more logs?
Created attachment 140623 [details]
Mesa 18.2 /sys/class/drm/card0/error
OpenGL core profile version string: 4.5 (Core Profile) Mesa 18.2.0-devel (git-0288fe8d04)
vermagic: 4.18.0-rc4+ SMP mod_unload
(In reply to Erik Sandlund from comment #16)
> Yes, no difference really. Directly after compile it worked pretty good but
> after a few minutes it hung and started to hang more often after that. I see
> gfx corruption on the screen which sometimes looks the same even after a
> reboot.
> I've compiled drm-tip with https://patchwork.freedesktop.org/patch/237548/
> applied. No difference though really. I also tried your patch on mesa 18.2.0
> with no real difference.
> Should I supply more logs?
Thanks a lot for all the traces. I don't think we'll need more at this point.
I think we need to find what the right fix is here; your last error state shows that the patch I've attached doesn't help.
This might be a hardware issue since Windows 10 also produces strange artifacts and hangs after Intel-driver install.