Created attachment 118549 [details]
/sys/class/drm/card0/error

The hang always occurs while Chrome is running, but the exact trigger (time, action, environment) is unclear. I built drm-nightly, but since my current kernel has drm-i915 built in and reproducing the bug takes an unknown amount of time, I can't report anything about it yet. I first noticed this issue on kernel 4.2.0.

- system architecture: x86_64
- kernel version: 4.2.1-gentoo
- linux distribution: Gentoo
- machine: Lenovo T430 (2344BZU)
- display connector: panel connected via LVDS, VGA and DP++ disconnected
Created attachment 118550 [details]
dmesg.txt

Most relevant part of dmesg. Since I didn't have drm debug turned on, this is all the relevant info I have.
A few reports now with Sandybridge + kernel 4.2.0. Could you please double check with kernel 4.1.6+ to see if it is indeed the new kernel that introduces the fault?
Created attachment 118557 [details]
error-4.1.6.bz2

A friend of mine confirmed a GPU hang on 4.1.6. Sadly, his logging is somewhat misconfigured, so the only useful log info is:

Sep 30 08:42:13 workhorse kernel: [drm] GPU HANG: ecode 6:0:0x85fffffc, in plasmashell [4084], reason: Ring hung, action: reset

uname -m: x86_64
uname -r: 4.1.6-gentoo
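When full logs are not available, the one-line hang report above still carries the ecode, the offending process, the reason and the action taken. A minimal sketch of pulling those fields out of a log line follows. The regular expression is an assumption modelled only on the single line quoted in this comment, and `parse_gpu_hang` is a hypothetical helper name; other kernel versions may format the message differently.

```python
import re

# Assumption: pattern derived from the one log line quoted in this report;
# it is not taken from the kernel source and may not match other versions.
GPU_HANG_RE = re.compile(
    r"GPU HANG: ecode (?P<ecode>\S+), "
    r"in (?P<process>.+?) \[(?P<pid>\d+)\], "
    r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
)


def parse_gpu_hang(line):
    """Return the hang fields as a dict, or None if the line does not match."""
    match = GPU_HANG_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["pid"] = int(info["pid"])  # the pid is the only numeric field
    return info


if __name__ == "__main__":
    sample = (
        "Sep 30 08:42:13 workhorse kernel: [drm] GPU HANG: ecode 6:0:0x85fffffc, "
        "in plasmashell [4084], reason: Ring hung, action: reset"
    )
    print(parse_gpu_hang(sample))
```

Running the same function over every line of kern.log would make it easy to compare which process is named in each hang across kernel versions.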
(In reply to Denis Sokolovsky from comment #3)
> Created attachment 118557 [details]
> error-4.1.6.bz2

That error looks more likely to be a mesa bug - definite hang inside a batch suggesting a userspace bug as opposed to a bug in the kernel submission.
If the bug is in userspace then it is most probably in plasmashell, as we have the same mesa version (11.0.0). Also, I have a lot of options on the kernel command line, namely "i915.semaphores=1 i915.enable_rc6=7 i915.enable_fbc=1 i915.lvds_downclock=1"
(In reply to Denis Sokolovsky from comment #5)
> Also I have a lot of options in kernel command
> line, namely "i915.semaphores=1 i915.enable_rc6=7 i915.enable_fbc=1
> i915.lvds_downclock=1"

Please reproduce the issue without any of those set. They are basically debug options, and we don't support changing them from their platform specific defaults.
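Before retesting, it can help to confirm that no i915 overrides are left on the command line of the running kernel. A minimal sketch, assuming the options are passed in the `i915.name=value` form quoted above; `i915_options` is a hypothetical helper name, and the string it parses would typically be the contents of /proc/cmdline.

```python
def i915_options(cmdline):
    """Collect i915.<name>=<value> tokens from a kernel command line string."""
    options = {}
    for token in cmdline.split():
        if token.startswith("i915.") and "=" in token:
            name, value = token[len("i915."):].split("=", 1)
            options[name] = value
    return options


if __name__ == "__main__":
    # Reads the command line of the running kernel (Linux only).
    with open("/proc/cmdline") as handle:
        leftover = i915_options(handle.read())
    if leftover:
        print("i915 overrides still set:", leftover)
    else:
        print("no i915 overrides on the kernel command line")
```

This only inspects the boot command line; options set through a modprobe configuration file would not show up here.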
Okay, should I try 4.1.6 (4.1.9), 4.2.1 (4.2.2) or drm-intel-nightly?
Without the fancy kernel cmdline options things get much worse. The system just hangs completely: no magic sysrq, no network, broken filesystem, etc. I'm not sure right now, but, afair, I added "i915.semaphores=1" to the cmdline because I had stability problems before, which were fixed by turning semaphores on.
Created attachment 118581 [details]
kern-drm-nightly.log.bz2

On my system dmesg is saved into kern.log, but due to the system hang I'm not sure it contains all messages up to the system's death. I cut part of the messages, because the complete file, compressed with "bzip2 -9", didn't fit within the 3MB limit.
Things are not as straightforward as they seemed initially. With the same configuration and workload, the system on the 4.2.1 kernel runs a lot longer without the hang that I saw on drm-intel-nightly. And even when a hang does occur, the kernel driver manages to restart rendering.
Created attachment 118599 [details] dmesg.txt.bz2 dmesg from 4.2.1 with drm debug
Created attachment 118601 [details] error.bz2 Corresponding error info
Not sure if this bug is still relevant, as I haven't experienced a GPU hang since I switched to the 4.3.0 kernel.
Actually, not only the kernel was new. There were a few updates at almost the same time: chrome (47->48), mesa (11.0->11.1) and kernel (4.2.x->4.3.0). Since Dec 20 I can find only one GPU warning in the logs (see below), but no hangs with the current setup, usual workflow and an uptime of 17/38+ days (excluding/including sleeps).

Jan 9 20:17:19 isis kernel: [1151010.810868] WARNING: CPU: 0 PID: 0 at /usr/src/linux-4.3.0-gentoo/drivers/gpu/drm/i915/intel_display.c:11293 intel_check_page_flip+0xed/0x100()
Jan 9 20:17:19 isis kernel: [1151010.810871] Kicking stuck page flip: queued at 28897891, now 28897895
(In reply to Denis Sokolovsky from comment #14)
> Actually, not only kernel was new. Almost at the same time were few updates:
> chrome (47->48), mesa (11.0->11.1) and kernel (4.2.x->4.3.0). Since Dec 20,
> in logs, I can found only one warning about GPU (see below), but no hungs
> with current setup, usual workflow and uptime for 17/38+ days
> (excluding/including sleeps).
>
> Jan 9 20:17:19 isis kernel: [1151010.810868] WARNING: CPU: 0 PID: 0 at
> /usr/src/linux-4.3.0-gentoo/drivers/gpu/drm/i915/intel_display.c:11293
> intel_check_page_flip+0xed/0x100()
> Jan 9 20:17:19 isis kernel: [1151010.810871] Kicking stuck page flip:
> queued at 28897891, now 28897895

Closing this GPU hang issue now. Regarding the warning, please update your kernel and, if it still occurs, file a new bug and attach the kernel log.