Maybe some details / logs would be nice?
# cat /var/log/messages
May 2 13:52:38 mypc kernel: [drm] GPU HANG: ecode 6:2:0xffeffffe, in kodi.bin , reason: Hang on bsd ring, action: reset
May 2 13:52:38 mypc kernel: [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
May 2 13:52:38 mypc kernel: [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
May 2 13:52:38 mypc kernel: [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
May 2 13:52:38 mypc kernel: [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
May 2 13:52:38 mypc kernel: [drm] GPU crash dump saved to /sys/class/drm/card0/error
May 2 13:52:38 mypc kernel: drm/i915: Resetting chip after gpu hang
May 2 13:52:46 mypc kernel: drm/i915: Resetting chip after gpu hang
May 2 13:52:54 mypc kernel: drm/i915: Resetting chip after gpu hang
May 2 13:53:01 mypc shutdown: shutting down for system halt
May 2 13:53:01 mypc init: Switching to runlevel: 0
May 2 13:53:02 mypc shutdown: shutting down for system halt
# cat /sys/class/drm/card0/error
no error state collected
This is totally reproducible.
Hi, which kernel and hardware/system are you using?
If you are not already on drm-tip, can you try the latest drm-tip (https://cgit.freedesktop.org/drm-tip) and send a dmesg taken with drm.debug=0x1e log_buf_len=4M on the kernel command line?
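Before collecting the log, it may help to confirm that the running kernel was actually booted with the requested parameters. The following is a minimal sketch, not an official procedure; has_param is a hypothetical helper name, and the script only assumes that /proc/cmdline is readable.

```shell
#!/bin/sh
# has_param is a hypothetical helper for this illustration.
# It succeeds if the command line string $1 contains the exact parameter $2.
has_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Read the command line of the running kernel (empty if /proc is unavailable).
cmdline=$(cat /proc/cmdline 2>/dev/null || true)

for p in drm.debug=0x1e log_buf_len=4M; do
    if has_param "$cmdline" "$p"; then
        echo "ok: $p"
    else
        echo "missing: $p (add it to the kernel command line in the bootloader)"
    fi
done
```

Once both parameters are reported as ok, running dmesg > dmesg.txt as root after reproducing the hang should give the requested log.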
$ uname --all
Linux mypc 4.9.95-gentoo #4 SMP Wed May 2 11:22:22 WEST 2018 x86_64 Intel(R) Core(TM) i7-2675QM CPU @ 2.20GHz GenuineIntel GNU/Linux
Hardware is a MacBook Pro (early 2011) with the Radeon switched off by vgaswitcheroo:
# cat /etc/init.d/switcheroo-my
# My switcheroo service
# To switch the gpu from amd to intel and switch the power off to amd at boot
description="Switch ON/OFF the discrete GPU"
ebegin "Switching to Integrated GPU, shutdown Discrete"
echo ON > /sys/kernel/debug/vgaswitcheroo/switch
echo IGD > /sys/kernel/debug/vgaswitcheroo/switch
echo OFF > /sys/kernel/debug/vgaswitcheroo/switch
ebegin "Power ON both GPUs to avoid suspend/shutdown problems"
echo ON > /sys/kernel/debug/vgaswitcheroo/switch
As for drm-tip, Gentoo does not have it in its repositories, so please guide me through building, installing and booting it.
I've already git cloned drm-tip, and menuconfig shows that it is a 4.17 kernel. So, should I copy my .config and run make silentoldconfig?
After that, are there any special config options? Any other instructions?
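For reference, one common way to build a kernel tree while reusing an existing configuration is sketched below. This is only an illustration under assumptions: build_tip, run and DRY_RUN are hypothetical names, the checkout is assumed to live in $HOME/drm-tip, and the current config is assumed to be /usr/src/linux/.config. It uses make olddefconfig, which takes defaults for new options without prompting; make oldconfig is the interactive alternative.

```shell
#!/bin/sh
# run, build_tip and DRY_RUN are hypothetical names for this illustration.

# With DRY_RUN=1, print the command instead of executing it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Build and install a kernel from the tree in $1, seeded with the .config in $2.
build_tip() {
    src=$1      # path to the drm-tip checkout (assumed location)
    oldcfg=$2   # .config of the currently running kernel (assumed location)
    run cp "$oldcfg" "$src/.config" &&
    run make -C "$src" olddefconfig &&
    run make -C "$src" -j"$(nproc)" &&
    run make -C "$src" modules_install &&
    run make -C "$src" install
}

# Dry run: only prints the five commands that would be executed.
DRY_RUN=1 build_tip "$HOME/drm-tip" /usr/src/linux/.config
```

Without DRY_RUN=1 the install steps need root, and the bootloader still has to be pointed at the new kernel afterwards; that part depends on the local setup and is not covered here.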
Forgot to say that kodi used to work normally about one month ago. It probably started after the upgrade from 4.9.76-gentoo-r1 to 4.9.95-gentoo.
OK, setting this to low priority now since it is a MacBook. Jani, what do you think?
Need the error state before reboot.
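A minimal sketch of saving the error state to persistent storage before rebooting follows. save_error_state is a hypothetical helper name; the only things taken from this report are the sysfs path /sys/class/drm/card0/error and the "no error state collected" message it prints when nothing has been captured.

```shell
#!/bin/sh
# save_error_state is a hypothetical helper for this illustration.
# Returns 0 if an error state was saved, 1 if none was collected,
# 2 if the source could not be read.
save_error_state() {
    src=$1   # e.g. /sys/class/drm/card0/error
    dst=$2   # a file on persistent storage
    if [ ! -r "$src" ]; then
        echo "cannot read $src" >&2
        return 2
    fi
    cat "$src" > "$dst" || return 2
    if grep -qi 'no error state collected' "$dst"; then
        echo "no error state collected yet" >&2
        rm -f "$dst"
        return 1
    fi
    sync   # flush to disk in case a hard reset follows
    echo "saved error state to $dst"
}
```

Typical use, as root, right after the hang and before any reboot: save_error_state /sys/class/drm/card0/error /root/i915-error.txt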
Tried but couldn't:
- Logged in by ssh from another machine
- Triggered the problem
- Launched dmesg by ssh
- Target machine prints 2 lines, stops and becomes unresponsive to other logins
The worst thing is that I have to hard reset and the filesystem is not unmounted cleanly; I fear a major problem.
I'm not sure why a GPU hang would hang the whole machine. Seems like there's more to this than meets the eye.
Please try a more recent kernel. Please try to get a full dmesg with drm.debug=14 out of the system. And hopefully the error state too.
In the end, I'm sorry, either vgaswitcheroo or macbook on its own would make this low priority; together they are a pretty off-putting combination.
Created attachment 139344 [details]
messages + dmesg
>> I'm not sure why a GPU hang would hang the whole machine. Seems like there's more to this than meets the eye.
I also find it strange but please note that everything else on this machine works perfectly.
I still can't get anything out by ssh. If I log in beforehand, the session hangs as soon as I run any command. And afterwards, it just refuses to connect.
So, I set drm.debug=14 and created a cron job:
56 10 * * * root dmesg > /home/myuser/dmesg.txt && sync && reboot
triggered the event around 10:54 and waited for the machine to reboot, which never happened. It just stayed with the last image frozen on the display. I tried ssh again, which was refused. A couple of minutes later I did a hard reset.
After logging in, the dmesg.txt file that I had scheduled was 0 bytes. But, looking at the messages file which I attach, you can see the machine was still running, which is strange: why doesn't it accept ssh?
The dmesg I attach was taken after the next login, not immediately after the event.
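Since a one-shot dmesg after the event produced an empty file, an alternative is to start streaming the kernel log to disk before triggering the hang, flushing after every line. The sketch below is only an illustration: capture_stream is a hypothetical helper name, and it assumes the util-linux dmesg, which supports --follow. Whether anything useful reaches the disk before the machine stops responding is not guaranteed.

```shell
#!/bin/sh
# capture_stream is a hypothetical helper for this illustration.
# It appends every line produced by a command to a file and syncs after each one.
capture_stream() {
    out=$1   # log file on persistent storage
    shift    # remaining arguments: the command that produces the log lines
    "$@" | while IFS= read -r line; do
        printf '%s\n' "$line" >> "$out"
        sync   # flush after every line; slow, but the data survives a hard reset
    done
}
```

Typical use, as root, started before reproducing the problem: capture_stream /home/myuser/dmesg-live.txt dmesg --follow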
Created attachment 139345 [details]
dmesg from the zip
Created attachment 139346 [details]
messages from the zip
Please always attach plain text logs as plain text.
Sorry, I didn't pay attention to protocol.
I was finally able to extract dmesg and messages from the machine by ssh without reboot; attached below.
But as soon as I run:
# cat /sys/class/drm/card0/error
or the same for card1, or even ls /sys/class/drm/card0/ (or card1),
ssh hangs and I have to hard reset.
Created attachment 139353 [details]
Created attachment 139354 [details]
jssilva, is this still an issue?
I dual boot macOS and Gentoo and, in the meantime, I stopped using Gentoo because it was taking too much of my time every day for upgrades, and then for fixing the dependencies and breakage they caused. So, I've been using macOS all the time except for chores that require Linux.
I just booted Gentoo and it's working. That is to say, I fixed it at some point, but can't remember how.
I'm sorry I didn't get back here to report.
Thanks for the update! Closing.