Bug 95064 - [bdw-U] [broadwell-u] CPU/GPU halt for multiple display configurations i915 i5-5200U i5-5287U system freeze get-around: i915.enable_rc6=0
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: highest critical
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-04-22 10:34 UTC by Harry
Modified: 2016-12-15 08:12 UTC

See Also:
i915 platform: BDW
i915 features:



Description Harry 2016-04-22 10:34:30 UTC
I have had lots of troubles with GNOME Shell halting the cpu requiring power off.

I could provoke the same situation using the Weston reference implementation

Computer is MacBook Pro 2015 13"

Have an external display to the right of the internal one, with the display bottoms aligned.
Launch a terminal window on the external display and put it near the lower left corner.
Around the lower right corner of the internal display, move the mouse cursor back and forth between the internal display and the terminal window on the external display.

BUG
within about 30 s, cpu halts, computer has to be powered off
static video from the halt moment is displayed

Halting the cpu is not helpful at all

This problem did not exist in 1.08/15.10

The cause seems to be drawing the mouse cursor on multiple displays

without an external display no problems.

GNOME Shell does not crash until
- a window exists
https://bugzilla.gnome.org/show_bug.cgi?id=765285

There is some special handling of the mouse cursor on external displays within an inch of a cross-display boundary.
If a window edge is placed here, say a maximized window on the external display, displaying a different mouse cursor shape, the mouse shape change close to the cross-display boundary causes the halt to occur almost immediately.

dpkg -s weston | egrep ^V
Version: 1.9.0-3
dpkg -s libwayland-server0 | egrep ^V
Version: 1.10.0-1~xenial0
lsb_release -sd
Ubuntu 16.04 LTS
uname -r
4.4.0-21-generic
Comment 1 Pekka Paalanen 2016-04-25 08:58:54 UTC
Sounds like a driver bug. It has nothing to do with Wayland, and I believe gnome-shell and weston are also just triggers, not to blame here.

You say the problem did not exist on Weston 1.8.0? Does it occur with 1.8.0 now that you have the rest of the system upgraded? If yes, it is not a Weston regression but most likely with the drivers.

If simply downgrading Weston alone does make the problem go away, then please try weston 1.10. If 1.10 is also problematic, it would be good to bisect which change in weston caused it.
Comment 2 Harry 2016-04-25 18:22:45 UTC
This bug hangs the CPU (that is, not only the GPU)
The system has to be powered off with a 4-second power button press
It is clearly a kernel problem, does that mean it has to be drm?

I was reading over here:
https://01.org/linuxgraphics/documentation/how-report-bugs
- So I should use drm-intel-nightly
I can capture dmesg until halt on another networked machine
I am not sure if cat /sys/class/drm/card0/error will be possible since the cpu halts?
(kernel options: drm.debug=0x1e log_buf_len=1M)
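Before collecting logs it can help to confirm that the running kernel was actually booted with those options. The following is a minimal sketch, not part of the original report: the helper names cmdline_has and check_debug_params are made up for illustration, and the only things taken from the system and from this comment are /proc/cmdline and the two parameters.

```shell
#!/bin/sh
# Hypothetical helpers (names invented for this example).

# cmdline_has PARAM [FILE]
# Succeeds if FILE (default: /proc/cmdline) contains PARAM as a whole word.
cmdline_has() {
    param=$1
    file=${2:-/proc/cmdline}
    # Put every whitespace-separated word on its own line, then look for
    # an exact, fixed-string, whole-line match.
    tr ' \t' '\n\n' < "$file" | grep -qxF -e "$param"
}

# check_debug_params [FILE]
# Prints one line per debug option named in this comment.
check_debug_params() {
    for p in drm.debug=0x1e log_buf_len=1M; do
        if cmdline_has "$p" "${1:-/proc/cmdline}"; then
            echo "$p: present"
        else
            echo "$p: missing"
        fi
    done
}
```

Calling check_debug_params with no argument inspects the running kernel; passing a file name lets the same check run against a saved copy of a command line.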

However, this is a MacBook Pro 2015 13", "everybody" has one.
And it's the latest 16.04 "everybody" has that, too

cpu/gpu: i5-5287U CPU @ 2.90GHz
Two displays via two DisplayPort cables
- it crashes with a single external display, too
- it does not crash without external displays

Running Weston from 16.04, it's 100% reproducible

uname --kernel-release --machine
4.4.0-21-generic x86_64

I am running KDE/X now, b/c that does not crash.
Comment 3 Jani Nikula 2016-04-26 07:24:03 UTC
(In reply to Harry from comment #0)
> There is some special handling of the mouse cursor on external displays
> within and inch of a cross-display boundary
> if a window edge is placed here, say a maximized window on the external
> display, displaying a different mouse cursor shape, the mouse shape change
> close to a cross-display boundary causes the halt to occur almost immediately

Sounds familiar, and I think we may have fixed this already. Can you try v4.6-rc5 or drm-intel-nightly branch of http://cgit.freedesktop.org/drm-intel?
Comment 4 maria guadalupe 2016-05-09 18:25:16 UTC
With the following configuration on an i5-5200U 2.2 GHz processor, I can see the issue reported in this bug.
 
|=== Software information ===|

 ++ Kernel version                      : 4.6.0-rc7-drm-intel-nightly-ww20-commit-bcc6a84+
 ++ Linux distribution                  : Ubuntu 15.10
 ++ Architecture                        : 64-bit
 ++ Mesa version                        : 11.1.2 (git-7bcd827
 ++ xf86-video-intel version            : 2.99.917
 ++ Xorg-Xserver version                : 1.18.99.1
 ++ DRM version                         : 2.4.67
 ++ VAAPI version                       : Intel i965 driver for Intel(R) Broadwell - 1.7.1.pre1 (1.7.0-8-g2c1bec0)
 ++ Cairo version                       : 1.15.2
 ++ Intel GPU Tools version             : Tag [intel-gpu-tools-1.14-212-g1e9a3ac] / Commit [1e9a3ac]
 ++ Kernel driver in use                : i915
 ++ Hardware acceleration               : Enabled
 ++ Bios revision                       : 1.69
 ++ KSC revision                        : 1.69


 |=== Hardware information ===|

 ++ Platform                            : BDW
 ++ Motherboard model                   : 80E5
 ++ Motherboard type                    : Lenovo G50-80 Notebook
 ++ Motherboard manufacturer            : LENOVO
 ++ CPU family                          : Core i5
 ++ CPU information                     : Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
 ++ GPU Card                            : Intel Corporation Broadwell-U Integrated Graphics (rev 09) (prog-if 00 [VGA controller])
 ++ Memory ram                          : 6 GB
 ++ Maximum memory ram allowed          : 16 GB
 ++ Display resolution                  : 1366x768
 ++ CPU's number                        : 4




Kernel information
===============================================
commit bcc6a843e7e4a3f4794b90dbefb00174171365bd
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon May 9 13:48:17 2016 +0100

    drm-intel-nightly: 2016y-05m-09d-12h-47m-46s UTC integration manifest


Kernel version : 4.6.0-rc7
Architecture : source amd64 all
Homepage : http://www.kernel.org/
Comment 5 Harry 2016-05-10 22:11:20 UTC
Is there work on this bug? It's making my life kind of difficult. I got out of nVidia unsupported hell by upgrading to an Intel 3x4k gpu.

I get the halt approximately weekly on KDE/X which is the most stable configuration I have found.
When I was on GNOME Shell 3.18 or 3.20 I had an uptime of approximately 2 hours
And the 16.04 Weston has the problem 100% reproducible as described in this report.
Although halts are horrible, since it is reproducible it should be "easy" to fix

I did not see this prior to upgrading to 16.04, which likely means that 15.10 with 4.2.0-35.40 did not have it. I am currently on 4.4.0-21.

I believe it has to do with the mouse cursor on multiple displays and the phantom redraws of the displays below the mouse cursor that occur when the mouse cursor moves over a display boundary.

I just lost data again from a halt where fsck removed data that I considered safely written to disk. Like most people, I hate losing data.
Comment 6 Harry 2016-06-05 22:57:33 UTC
Not even with a legacy KDE/X can one avoid this problem.

KDE halts occasionally when using the "Present Windows" desktop effects function that displays all windows side-by-side.

Halting the cpu leads to data loss, which means this bug is Critical.

And fully reproducible.
Comment 7 Harry 2016-06-05 22:59:39 UTC
4.4.0-22-generic
Comment 8 yann 2016-07-15 09:33:12 UTC
Updating priority accordingly: System halt
Comment 9 Harry 2016-07-21 22:40:56 UTC
Despite everything I don't use, this bug still halts my machine on occasion:

- no GNOME
- no Wayland
- the most retarded KDE available

Anyone with multiple displays on Broadwell and possibly other chips is hosed.

Now it ran for 10 days.
Comment 10 Harry 2016-08-26 12:00:11 UTC
I did some tests and concluded that setting kernel command line parameter
i915.enable_rc6=0
made Wayland not crash anymore
- it appears acceleration is then off in Wayland
- GNOME 3.18/Wayland crashed within minutes

the setting can be checked:
cat /sys/module/i915/parameters/enable_rc6
0
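The check above can be wrapped so that it also copes with a missing parameter file. This is a sketch under assumptions: the function name rc6_status is invented here, and treating every value other than 0 as "not disabled" is a simplification rather than a statement of how i915 interprets each value.

```shell
#!/bin/sh
# rc6_status [FILE]  (hypothetical helper, name invented for this example)
# Reads the i915 enable_rc6 module parameter and prints a short verdict.
rc6_status() {
    file=${1:-/sys/module/i915/parameters/enable_rc6}
    # The file may be absent, for instance if i915 is not loaded or the
    # running kernel does not expose this parameter.
    if [ ! -r "$file" ]; then
        echo "unknown"
        return 1
    fi
    value=$(cat "$file")
    case $value in
        0) echo "disabled" ;;
        # Assumption: any other value means RC6 has not been turned off.
        *) echo "not disabled (value $value)" ;;
    esac
}
```

With the workaround active, rc6_status run without arguments should print "disabled".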

documentation:
modinfo i915 | less

rC-state is from ACPI, c0 is power on, c6 is deep power down, possibly off.

had no effect:
intel_idle.max_cstate=1
intel_pstate=disable

was already set:
i=0; for g in /sys/devices/system/cpu/cpu[0-9]/cpufreq/scaling_governor; do   echo powersave > "$g";   echo "cpu$i: $(cat "$g")";   ((i++)); done
Comment 11 maria guadalupe 2016-09-01 18:58:43 UTC
(In reply to Harry from comment #10)
> I did some tests and concluded that setting kernel command line parameter
> i915.enable_rc6=0
> made Wayland not crash anymore
> - it appears acceleration is then off in Wayland
> - GNOME 3.18/Wayland crashed within minutes
> 
> the setting can be checked:
> cat /sys/module/i915/parameters/enable_rc6
> 0
> 
> documentation:
> modinfo i915 | less
> 
> rC-state is from ACPI, c0 is power on, c6 is deep power down, possibly off.
> 
> had no effect:
> intel_idle.max_cstate=1
> intel_pstate=disable
> 
> was already set:
> for g in /sys/devices/system/cpu/cpu[0-9]/cpufreq/scaling_governor; do  
> echo powersave > $g;   echo cpu$i: $(cat $g);   ((i++)); done

Set the kernel command line parameter i915.enable_rc6=0; the issue is no longer reproduced, and hardware acceleration is active with the following configuration:
 
=== Software information ===	
Kernel version	4.8.0-rc4drm-intel-nighly-ww36-commit-f91144a+
Linux distribution	Ubuntu 16.04.1 LTS
Architecture	64-bit
Mesa version	11.2.2 (git-3a9f628
xf86-video-intel version	2.99.917
DRM version	2.4.70
Cairo version	1.15.2
Bios revision	1.69
KSC revision	1.69
	
 === Hardware information ===	
Platform	BDW
Motherboard type	Lenovo G50-80 Notebook
Motherboard manufacturer	LENOVO
CPU family	Core i5
CPU information	Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
GPU Card	Intel Corporation Broadwell-U Integrated Graphics (rev 09) (prog-if 00 [VGA controller])
Memory ram	6 GB
Maximum memory ram allowed	16 GB
Hard drive capacity	74GiB (80GB)
Comment 12 Jari Tahvanainen 2016-09-30 09:18:29 UTC
mailto:ismail@inbox247.com - can you let us know if you are still having problems with the following environment (used also by Maria):
Kernel version	4.8.0-rc4drm-intel-nighly-ww36-commit-f91144a+
Linux distribution	Ubuntu 16.04.1 LTS
Comment 13 Harry 2016-12-03 19:39:22 UTC
This problem is fixed by kernel command line parameter i915.enable_rc6=0

Some apps in GNOME/Wayland 3.20 halt in the same way when moved between displays.
https://bugs.freedesktop.org/show_bug.cgi?id=98985
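
For anyone wanting to make the workaround survive reboots, one possible approach on a GRUB-based install such as Ubuntu is to append the parameter to GRUB_CMDLINE_LINUX_DEFAULT. The sketch below is an illustration only: the function name add_grub_param is invented, it assumes the variable is written on one line in double quotes, and it prints the modified file instead of editing it in place.

```shell
#!/bin/sh
# add_grub_param PARAM FILE  (hypothetical helper, name invented here)
# Prints FILE with PARAM appended to GRUB_CMDLINE_LINUX_DEFAULT.
# If the line already mentions PARAM, FILE is printed unchanged.
add_grub_param() {
    param=$1
    file=$2
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*$param" "$file"; then
        cat "$file"
    else
        # Assumes the value is enclosed in double quotes on a single line.
        sed "s|^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"|GRUB_CMDLINE_LINUX_DEFAULT=\"\1 $param\"|" "$file"
    fi
}
```

A typical use would be to run add_grub_param i915.enable_rc6=0 /etc/default/grub, review the output, install it as the new /etc/default/grub, then run sudo update-grub and reboot.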
Comment 14 Jani Saarinen 2016-12-14 09:20:14 UTC
Reporter, Harry, can this be resolved or what?
Comment 15 Harry 2016-12-14 21:11:14 UTC
yes.

The same halt occurs in https://bugs.freedesktop.org/show_bug.cgi?id=98985

