Bug 106357 - kodi crashes the system when viewing a *.ts video and pushing several times the right arrow to advance the video.
Summary: kodi crashes the system when viewing a *.ts video and pushing several times t...
Status: CLOSED WORKSFORME
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: x86-64 (AMD64) Linux (All)
Importance: low normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: Triaged, ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2018-05-02 15:09 UTC by jssilva
Modified: 2019-01-09 10:29 UTC
CC List: 3 users

See Also:
i915 platform: SNB
i915 features: GPU hang


Attachments
messages + dmesg (51.81 KB, application/zip)
2018-05-04 10:54 UTC, jssilva
dmesg from the zip (180.19 KB, text/plain)
2018-05-04 10:59 UTC, Jani Nikula
messages from the zip (162.82 KB, text/plain)
2018-05-04 10:59 UTC, Jani Nikula
messages (log) (38.16 KB, text/plain)
2018-05-04 17:48 UTC, jssilva
dmesg (182.07 KB, text/plain)
2018-05-04 17:49 UTC, jssilva

Description jssilva 2018-05-02 15:09:43 UTC

    
Comment 1 Jani Saarinen 2018-05-02 15:16:15 UTC
Maybe some details / logs would be nice?
Comment 2 jssilva 2018-05-02 15:18:28 UTC
# cat /var/log/messages
...
May  2 13:52:38 mypc kernel: [drm] GPU HANG: ecode 6:2:0xffeffffe, in kodi.bin [12696], reason: Hang on bsd ring, action: reset
May  2 13:52:38 mypc kernel: [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
May  2 13:52:38 mypc kernel: [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
May  2 13:52:38 mypc kernel: [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
May  2 13:52:38 mypc kernel: [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
May  2 13:52:38 mypc kernel: [drm] GPU crash dump saved to /sys/class/drm/card0/error
May  2 13:52:38 mypc kernel: drm/i915: Resetting chip after gpu hang
May  2 13:52:46 mypc kernel: drm/i915: Resetting chip after gpu hang
May  2 13:52:54 mypc kernel: drm/i915: Resetting chip after gpu hang
May  2 13:53:01 mypc shutdown[13054]: shutting down for system halt
May  2 13:53:01 mypc init[1]: Switching to runlevel: 0
May  2 13:53:02 mypc shutdown[13065]: shutting down for system halt

# cat /sys/class/drm/card0/error
no error state collected

This is fully reproducible.
Comment 3 Jani Saarinen 2018-05-02 15:45:42 UTC
Hi, which kernel and hardware/system are you using?
If you are not already on drm-tip, can you try the latest drm-tip (https://cgit.freedesktop.org/drm-tip) and send a dmesg captured with drm.debug=0x1e log_buf_len=4M on the kernel command line?
Comment 4 jssilva 2018-05-02 16:12:38 UTC
$ uname --all
Linux mypc 4.9.95-gentoo #4 SMP Wed May 2 11:22:22 WEST 2018 x86_64 Intel(R) Core(TM) i7-2675QM CPU @ 2.20GHz GenuineIntel GNU/Linux

Hardware is an early 2011 MacBook Pro with the Radeon switched off by vgaswitcheroo:

# cat /etc/init.d/switcheroo-my
#!/sbin/openrc-run
# My switcheroo service
# To switch the gpu from amd to intel and switch the power off to amd at boot

description="Switch ON/OFF the discrete GPU"

depend()
{
    need sysfs
}

start()
{
    ebegin "Switching to Integrated GPU, shutdown Discrete"
    echo ON > /sys/kernel/debug/vgaswitcheroo/switch
    echo IGD > /sys/kernel/debug/vgaswitcheroo/switch
    echo OFF > /sys/kernel/debug/vgaswitcheroo/switch
    return 0
}

stop()
{
    ebegin "Power ON both GPUs to avoid suspend/shutdown problems"
    echo ON > /sys/kernel/debug/vgaswitcheroo/switch
    return 0
}

As for drm-tip, Gentoo does not have it in its repositories, so please guide me through building, installing and booting it.
Comment 5 jssilva 2018-05-02 17:17:31 UTC
I've already git cloned drm-tip, and menuconfig shows it is the 4.17 kernel. So, should I copy my .config and run make silentoldconfig?

After that, are there any special config options? Any other instructions?

I forgot to say that kodi worked normally until about one month ago. The problem probably started after the upgrade from 4.9.76-gentoo-r1 to 4.9.95-gentoo.
Comment 6 Jani Saarinen 2018-05-02 19:29:08 UTC
OK, setting this to low priority now since it is a MacBook. Jani, what do you think?
Comment 7 Jani Nikula 2018-05-03 08:31:56 UTC
Need the error state before reboot.
Comment 8 jssilva 2018-05-03 09:08:20 UTC
Tried but couldn't:
- Logged in by ssh from another machine
- Triggered the problem
- Launched dmesg by ssh
- Target machine prints 2 lines, stops and becomes unresponsive to other logins

The worst thing is that I have to hard reset and the filesystem is not closed properly; I fear a major problem.

Any suggestion?
Comment 9 Jani Nikula 2018-05-03 09:33:03 UTC
I'm not sure why a GPU hang would hang the whole machine. Seems like there's more to this than meets the eye.

Please try a more recent kernel. Please try to get a full dmesg with drm.debug=14 out of the system. And hopefully the error state too.

In the end, I'm sorry, either vgaswitcheroo or macbook on its own would make this low priority; together they are a pretty off-putting combination.
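
For reference, the two drm.debug values requested in this bug (0x1e in comment 3 and 14 here) are bitmasks. A minimal sketch of decoding them, assuming the bit layout from the drm.debug parameter description in kernels of this period (bit 0 core, bit 1 driver, bit 2 kms, bit 3 prime, bit 4 atomic, bit 5 vbl); decode_drm_debug is a hypothetical helper name, not an existing tool:

```shell
#!/bin/sh
# Decode a drm.debug bitmask into category names.
# ASSUMPTION: bit meanings as described for the drm.debug module
# parameter in 4.9 to 4.17 era kernels; newer kernels define more bits.
decode_drm_debug() {
    mask=$(( $1 ))   # accepts decimal (14) or hex (0x1e)
    out=""
    bit=1
    for name in core driver kms prime atomic vbl; do
        if [ $(( mask & bit )) -ne 0 ]; then
            # Append the name, space-separated after the first entry.
            out="${out:+$out }$name"
        fi
        bit=$(( bit << 1 ))
    done
    echo "$out"
}

decode_drm_debug 14
decode_drm_debug 0x1e
```

Under that assumption, 14 selects driver, kms and prime messages, and 0x1e additionally selects atomic.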
Comment 10 jssilva 2018-05-04 10:54:57 UTC
Created attachment 139344 [details]
messages + dmesg
Comment 11 jssilva 2018-05-04 10:56:27 UTC
>> I'm not sure why a GPU hang would hang the whole machine. Seems like there's more to this than meets the eye.

I also find it strange but please note that everything else on this machine works perfectly.

I still can't get anything out by ssh. If I log in before the hang, the session hangs as soon as I run any command. After the hang, the machine just refuses the connection.

So I set drm.debug=14 and created a cron job:
56 10 * * *     root    dmesg > /home/myuser/dmesg.txt && sync && reboot

I then triggered the event around 10:54 and waited for the machine to reboot, which never happened. It just stayed with the last image frozen on the display. I tried ssh again, which was refused. A couple of minutes later I did a hard reset.

After logging in, I found that the dmesg.txt file the cron job should have written was 0 bytes. But the messages file I attach shows the machine was still running, which is strange: why doesn't it accept ssh?

The dmesg I attach was taken after the next login, not immediately after the event.
Comment 12 Jani Nikula 2018-05-04 10:59:03 UTC
Created attachment 139345 [details]
dmesg from the zip
Comment 13 Jani Nikula 2018-05-04 10:59:42 UTC
Created attachment 139346 [details]
messages from the zip

Please always attach plain text logs as plain text.
Comment 14 jssilva 2018-05-04 11:11:24 UTC
Sorry, I didn't pay attention to protocol.
Comment 15 jssilva 2018-05-04 17:47:14 UTC
I was finally able to extract dmesg and messages from the machine by ssh without reboot; attached below.

But as soon as I run:
# cat /sys/class/drm/card0/error

or the same for card1, or even ls /sys/class/drm/card0/ (or card1),

ssh hangs and I have to hard reset.
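
One way to get at least the dmesg onto disk before touching the sysfs file that triggers the hang might look like the sketch below. It is untested on this hardware. capture_logs and the output directory are hypothetical names, and the sketch assumes that a blocked read of the error file stalls only the background subshell, which may not hold if the whole machine locks up.

```shell
#!/bin/sh
# Save dmesg first, flush it to disk, and only then attempt to read the
# GPU error state in the background so a stuck read cannot block the caller.
# ASSUMPTION: the default path matches the one printed in the kernel log above.
capture_logs() {
    outdir=$1
    errfile=${2:-/sys/class/drm/card0/error}

    mkdir -p "$outdir" || return 1

    # Step 1: kernel log, written and synced before anything risky.
    # Keep going even if dmesg is restricted or unavailable.
    dmesg > "$outdir/dmesg.txt" 2>/dev/null || true
    sync

    # Step 2: error state, read in a background subshell.
    ( cat "$errfile" > "$outdir/error.txt" 2>/dev/null; sync ) &
}

# Example call (hypothetical directory):
#   capture_logs /home/myuser/gpu-hang
```

The ordering is the point of the sketch: the dmesg is already synced to disk before the read that has been observed to hang is attempted.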
Comment 16 jssilva 2018-05-04 17:48:24 UTC
Created attachment 139353 [details]
messages (log)
Comment 17 jssilva 2018-05-04 17:49:20 UTC
Created attachment 139354 [details]
dmesg
Comment 18 Francesco Balestrieri 2019-01-09 09:31:18 UTC
jssilva, is this still an issue?
Comment 19 jssilva 2019-01-09 10:14:51 UTC
I dual boot macOS and Gentoo and, in the meantime, I stopped using Gentoo because it was taking too much of my time every day for upgrades and then for fixing the dependencies and breakage they caused. So I've been using macOS all the time except for tasks that require Linux.

I just booted Gentoo and it's working. That is to say, I fixed it but can't remember how.

I'm sorry I didn't get back here to report.
Comment 20 Francesco Balestrieri 2019-01-09 10:28:53 UTC
Thanks for the update! Closing.

