Bug 94605 - [SKL] Flashing black screen ([drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun)
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: high blocker
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
Reported: 2016-03-17 23:07 UTC by Alex Xu (Hello71)
Modified: 2017-02-03 14:05 UTC
CC List: 37 users

i915 platform: SKL
i915 features: display/atomic


Attachments
dmesg.txt (98.73 KB, text/plain)
2016-03-17 23:07 UTC, Alex Xu (Hello71)
dmesg just after the screen has gone black for the first time (918.63 KB, text/plain)
2016-05-08 10:55 UTC, shdownnine
output of lspci -vvv (27.50 KB, text/plain)
2016-05-08 10:57 UTC, shdownnine
dmesg on Dell XPS 13 2016 model (66.11 KB, text/plain)
2016-09-18 09:57 UTC, Christer
lspci -vvv on Dell XPS 13 2016 model (15.87 KB, text/plain)
2016-09-18 09:58 UTC, Christer
Journalctl -r | grep underrun on Dell XPS 13 (2016 model) (6.05 KB, text/plain)
2016-09-18 10:02 UTC, Christer
dmesg (59.14 KB, text/plain)
2016-11-15 22:54 UTC, Dominik Klementowski
dmesg on current DRM Intel Nightly (55.85 KB, text/plain)
2016-11-16 22:11 UTC, Dominik Klementowski
dmesg.txt (196.38 KB, text/plain)
2016-12-06 17:01 UTC, Dominik Klementowski
dmesg with i915 debug mode both with and without firmware files (24.56 KB, application/x-bzip)
2016-12-07 23:45 UTC, Dominik Klementowski
dmesg (457.02 KB, text/plain)
2016-12-11 18:06 UTC, Dominik Klementowski
Possible fix, patch 1. (1.22 KB, patch)
2016-12-12 20:45 UTC, Paulo Zanoni
Possible fix, patch 2. (3.00 KB, patch)
2016-12-12 20:46 UTC, Paulo Zanoni

Description Alex Xu (Hello71) 2016-03-17 23:07:06 UTC
Created attachment 122395
dmesg.txt

Within ~5 minutes after booting, the screen begins intermittently flashing black, partially flashing black (rectangular portion of screen covering "top" or "bottom" of screen), or flashing "offset", i.e. the correct screen contents but offset ~20% of screen to the right.

This message appears to be related:

[drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun

System information:

$ dmesg | head -n 1
[    0.000000] Linux version 4.5.0-1-ARCH (builduser@tobias) (gcc version 5.3.0 (GCC) ) #1 SMP PREEMPT Tue Mar 15 09:41:03 CET 2016
$ lspci -s 00:02.0
00:02.0 VGA compatible controller: Intel Corporation Skylake Integrated Graphics (rev 07)
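
To quantify how often the message above appears, a log can be tallied per pipe. The following is a minimal sketch, not part of the original report: the function name count_underruns and the sample text are illustrative assumptions, and the pattern matches only the "CPU pipe X FIFO underrun" wording quoted in this bug.

```python
import re
from collections import Counter

# Matches the underrun message regardless of which IRQ handler logged it.
UNDERRUN_RE = re.compile(r"\*ERROR\* CPU pipe ([A-Z]) FIFO underrun")


def count_underruns(dmesg_text):
    """Return a Counter mapping pipe letter -> number of underrun messages."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = UNDERRUN_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Illustrative sample only; on a real system, feed in the output of dmesg.
    sample = (
        "[drm:intel_cpu_fifo_underrun_irq_handler [i915]] "
        "*ERROR* CPU pipe A FIFO underrun\n"
        "[drm] RC6 on\n"
    )
    for pipe, n in sorted(count_underruns(sample).items()):
        print(f"pipe {pipe}: {n}")
```

Running this against a saved dmesg file before and after changing kernels gives a rough way to compare how frequently each pipe underruns.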
Comment 1 Alex Xu (Hello71) 2016-03-19 11:16:13 UTC
I am using Plasma 5 with kwin_x11 set to use OpenGL 3.1 over GLX.

I have been unable to reproduce the issue with kwin_x11 set to use EGL.
Comment 2 Alex Xu (Hello71) 2016-03-19 23:43:49 UTC
Never mind, it happened again. I also turned on TearFree which didn't help (not that I thought it would).
Comment 3 Hanno Böck 2016-04-18 08:44:57 UTC
I think I am seeing the same bug. Here's a video of the phenomenon:
https://www.youtube.com/watch?v=VYxRvFsS-nY

I've reported it at bugzilla.kernel.org ( https://bugzilla.kernel.org/show_bug.cgi?id=116571 ), but got told it needs to be reported here.

Some further info: This must've been introduced between 4.4 and 4.5. I wanted to bisect it, but it's not easy as I don't have a reliable way to reproduce the bug.

The system is a Thinkpad X1 Carbon 2014 edition (20A7), the GPU is listed as "Haswell-ULT Integrated Graphics Controller (rev 0b)" by lspci.
Comment 4 Jani Nikula 2016-04-22 08:59:54 UTC
Please try drm-intel-nightly branch of http://cgit.freedesktop.org/drm-intel and report back.
Comment 5 shdownnine 2016-05-08 10:55:30 UTC
Created attachment 123549
dmesg just after the screen has gone black for the first time
Comment 6 shdownnine 2016-05-08 10:57:08 UTC
Created attachment 123550
output of lspci -vvv
Comment 7 shdownnine 2016-05-08 10:57:39 UTC
I have the same issue on an Asus X751M laptop with Linux kernels 4.5.0 and 4.6-rc6
(is using this one equivalent to using drm-intel-nightly?), but the screen
just goes black and only comes back when I move a mouse connected to the
laptop or touch the touchpad. It is not the usual power saving, since it
usually occurs when I switch between workspaces (I use i3wm). With Linux 4.5.0,
a reliable way to reproduce this is to open two workspaces and run

    while true; do i3-msg workspace back_and_forth >/dev/null; done

In a few seconds the screen goes black. With the 4.6-rc6 kernel the script does not
cause this behavior, and it generally occurs less often, but it still does.

When it goes black for the first time in the session, the following line
appears in dmesg:

[drm:valleyview_pipestat_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun

The same thing (the black screen, not the line) happens when I run

    xset dpms force off

(but in this case it also goes unblack if I press a key).

I haven't seen this behavior with the 4.4.0 kernel, but every Linux kernel I
have tried eventually hangs on this laptop. I don't know if that is
related.
Comment 8 Jani Nikula 2016-05-11 13:30:58 UTC
(In reply to shdownnine from comment #6)
> Created attachment 123550 [details]
> output of lspci -vvv

VLV M.
Comment 9 Ville Syrjala 2016-05-11 13:38:21 UTC
(In reply to Jani Nikula from comment #8)
> (In reply to shdownnine from comment #6)
> > Created attachment 123550 [details]
> > output of lspci -vvv
> 
> VLV M.

Which is totally different than the SKL the original bug reporter has. And there was another comment about HSW. Please open new bugs for different platforms.
Comment 10 anomaly256 2016-05-13 02:35:59 UTC
I'm seeing this on skylake, using the drm-intel nightlies (also sorry for cross posting about this on bug 89806 before I saw this bug)

Using 2 monitors, one on DVI and the other on HDMI.  The monitor that goes blank is running at 1920x1200, and if I swap the outputs the same monitor still goes blank, irrespective of the output it's on.  The second monitor, which behaves better, is running at 1920x1080, although I do get graphical artifacts on it that follow the mouse cursor, namely white lines stretching from the cursor to the far right edge of the display.


I've tried switching from sna to uxa, no effect.  Tried switching rendering from gl to egl, no effect.

[ 2320.744016] [drm:intel_cpu_fifo_underrun_irq_handler] *ERROR* CPU pipe A FIFO underrun
[ 2535.029670] [drm:intel_cpu_fifo_underrun_irq_handler] *ERROR* CPU pipe B FIFO underrun


Linux scorched 4.6.0-rc7+ #1 SMP Fri May 13 11:04:59 AEST 2016 x86_64 Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz GenuineIntel GNU/Linux

Using nightlies fetched and built about an hour ago.
Comment 11 anomaly256 2016-05-13 02:37:56 UTC
Should add that I've tried with and without i915.enable_rc6=0, no effect.  If I try i915.semaphores=1 the machine hangs outright during boot after X starts

Comment 12 anomaly256 2016-05-13 07:00:47 UTC
Adding i915.enable_psr=0 seems to have fixed or at least reduced the blanking (has yet to happen but I am reluctant to declare it gone for good just yet).  The FIFO underrun message still appears in dmesg though.
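
When comparing results between machines, it helps to record exactly which i915 options were passed at boot. The following is a minimal sketch, assuming the options were given on the kernel command line in the i915.name=value form used in this thread; the function name i915_options and the sample string are illustrative and not taken from any reporter's system.

```python
def i915_options(cmdline):
    """Extract i915.* module options from a kernel command line string."""
    options = {}
    for token in cmdline.split():
        if token.startswith("i915.") and "=" in token:
            key, value = token[len("i915."):].split("=", 1)
            options[key] = value
    return options


if __name__ == "__main__":
    # Illustrative sample; on a running system the string can be read
    # from /proc/cmdline instead.
    sample = "root=/dev/sda1 ro quiet i915.enable_psr=0 i915.enable_rc6=0"
    print(i915_options(sample))
```

Pasting the resulting dictionary alongside a dmesg attachment makes it clear which workarounds were active when the log was captured.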
Comment 13 anomaly256 2016-05-15 23:20:37 UTC
Is this still 'need info' if we've tested on the latest nightly and still see the problem?
Comment 14 anomaly256 2016-05-19 06:03:53 UTC
It is still happening with psr=0, just a lot less frequently.

Is this issue still getting attention?  This is a pretty big annoyance having just spent decent money on Intel's flagship consumer cpu and chipset.

Hello?
Comment 15 anomaly256 2016-05-23 01:12:29 UTC
And with today's nightly it's gotten 100x worse.  If you need more info please tell me what info you need
Comment 16 anomaly256 2016-05-23 06:33:14 UTC
Bumping the priority on this because devs are not responding and this is pretty much a show-stopper for us users experiencing it.
Comment 17 perrantrevan 2016-05-28 11:30:00 UTC
I believe I have problems as a result of this bug on the Intel Skull Canyon NUC with Iris Pro 580 integrated graphics.

I had to turn off modesetting to complete installation of Arch Linux.

Now I can successfully boot into a DE with no problems. However, when I boot into Kodi standalone or Retroarch in KMS/EGL mode the screen loses connection just after the systemd version number appears. 

Very, very occasionally the screen doesn't go black for Retroarch. However, my systemd retroarch.service fails (though I can manually start the systemd service or /usr/bin/retroarch itself).

The same setup works with no problems on my laptop (Intel HD Graphics 4400).

I ssh'd into the running system when the screen was black: -

$ sudo journalctl -b | grep -i drm
May 28 12:50:59 nuc kernel: [drm] Initialized drm 1.1.0 20060810
May 28 12:50:59 nuc kernel: [drm] Found 128MB of eLLC
May 28 12:50:59 nuc kernel: [drm] Memory usable by graphics device = 4096M
May 28 12:50:59 nuc kernel: fb: switching to inteldrmfb from EFI VGA
May 28 12:50:59 nuc kernel: [drm] Replacing VGA console driver
May 28 12:50:59 nuc kernel: [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
May 28 12:50:59 nuc kernel: [drm] Driver supports precise vblank timestamp query.
May 28 12:50:59 nuc kernel: [drm] Finished loading i915/skl_dmc_ver1.bin (v1.26)
May 28 12:50:59 nuc kernel: [drm] failed to retrieve link info, disabling eDP
May 28 12:50:59 nuc kernel: [drm] Initialized i915 1.6.0 20151218 for 0000:00:02.0 on minor 0
May 28 12:50:59 nuc kernel: fbcon: inteldrmfb (fb0) is primary device
May 28 12:50:59 nuc kernel: [drm:intel_dp_link_training_clock_recovery [i915]] *ERROR* failed to enable link training
May 28 12:50:59 nuc kernel: [drm:intel_dp_start_link_train [i915]] *ERROR* failed to start channel equalization
May 28 12:50:59 nuc kernel: [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
May 28 12:50:59 nuc kernel: i915 0000:00:02.0: fb0: inteldrmfb frame buffer device
May 28 12:50:59 nuc kernel: [drm] RC6 on
May 28 12:51:00 nuc retroarch[409]: RetroArch [WARN] :: [KMS]: Couldn't open DRM device.
May 28 12:51:00 nuc retroarch[409]: RetroArch [WARN] :: [KMS]: Couldn't open DRM device.
May 28 12:51:00 nuc retroarch[409]: RetroArch [WARN] :: [KMS]: Couldn't open DRM device.
May 28 12:51:00 nuc retroarch[409]: RetroArch [ERROR] :: [KMS]: Couldn't find a suitable DRM device.
Comment 18 Vsevolod Minkov 2016-06-01 19:59:01 UTC
Can confirm this error on my Dell e7470 too. 

The symptoms are the same: the screen flickers, followed by this message:

[ 4364.723080] [drm:gen8_irq_handler] *ERROR* CPU pipe A FIFO underrun


My environment is:
- kernel 4.7-rc1 drm-intel-nightly 2016y-06m-01d-09h-32m-53s @ 0a384375f34ee1a1e83f63f83cfac338b5ce37b
- xf86-video-intel 2.99.917
- xorg 1.18.3
- firmware skl_dmc_ver1_26.bin skl_guc_ver4_3.bin

Can I somehow help with testing?
Comment 19 anomaly256 2016-06-02 05:08:04 UTC
The devs seem to have given up on this.  No response from them for weeks.  I'm guessing we need to open new bugs or just write off skylake support on linux as completely defunct.
Comment 20 Flo H. 2016-06-08 15:23:01 UTC
Same happening here on my Fedora 23 Gnome 
with kernel 4.5.5-201.fc23.x86_64 
on a ThinkPad X230
(CPU: Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz)

lspci -v | grep VGA
00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09) (prog-if 00 [VGA controller])

xorg-x11-drv-intel-2.99.917-19.20151206.fc23.x86_64


Here is a snip from journalctl: https://paste.fedoraproject.org/376234/

I also noted that a kernel regression test (https://git.fedorahosted.org/git/kernel-tests.git) performed on that system does not come to an end. Logs are here: https://paste.fedoraproject.org/376201/ (with SELinux disabled), and here https://paste.fedoraproject.org/376204/ (SELinux enabled).
Comment 21 Jani Nikula 2016-06-17 16:17:17 UTC
(In reply to anomaly256 from comment #19)
> The devs seem to have given up on this.  No response from them for weeks. 
> I'm guessing we need to open new bugs or just write off skylake support on
> linux as completely defunct.

Please try current drm-intel-nightly. There are some relevant skl watermark fixes.
Comment 22 Jani Nikula 2016-06-17 16:18:13 UTC
(In reply to Flo H. from comment #20)
> Same happening here on my Fedora 23 Gnome 
> with kernel 4.5.5-201.fc23.x86_64 
> on an Thinkpad X230 
> (CPU: Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz)
> 
> lspci -v | grep VGA
> 00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor
> Graphics Controller (rev 09) (prog-if 00 [VGA controller])

This bug is about Skylake. That's Ivybridge. Please file a separate bug.
Comment 23 anomaly256 2016-06-21 02:15:31 UTC
(In reply to Jani Nikula from comment #21)
> (In reply to anomaly256 from comment #19)
> > The devs seem to have given up on this.  No response from them for weeks. 
> > I'm guessing we need to open new bugs or just write off skylake support on
> > linux as completely defunct.
> 
> Please try current drm-intel-nightly. There are some relevant skl watermark
> fixes.

New since 2016-05-23?  The watermark error message went away, but the flashing black screen did not.  I tried a later nightly and the problem got 10x (not exaggerating) worse and more frequent.  I notice that sometimes after it flashes black, my monitor reports that the GPU is sending it an unsupported/non-optimal refresh rate that isn't exactly the 60 Hz it expects.  A reboot (even a soft reboot) fixes it.

I'll try the current nightly again now, but I suspect the watermark fixes you are referring to have already been tested.
Comment 24 anomaly256 2016-06-21 02:34:21 UTC
..some time next month when the repo finishes cloning:

Receiving objects:   0% (11662/4807413), 5.68 MiB | 11.00 KiB/s    

You guys really need to move to better hosting :P

(Or maybe change the way you merge so we can do a pull on the nightly branch without *always* having to hand-resolve conflicts)
Comment 25 anomaly256 2016-06-21 03:17:41 UTC
Flickering and screen artefacts abound with the current nightly.

Instead of perpetually shooting in the dark and hoping a nightly eventually magically makes this go away, can you tell us what info you need to get this fixed?  I'm not seeing any errors in dmesg at all, but the flickering and glitching persists.  Somehow the GPU is trying to drive a refresh rate of 59.95 Hz instead of 60 Hz.
Comment 26 anomaly256 2016-06-21 03:20:09 UTC
What I don't understand, though: don't you have access to Skylake hardware?  If so many people are reporting these problems, can't you reproduce them in the lab?  You should be able to replicate this and diagnose it directly.
Comment 27 pcnoordhuis 2016-07-05 15:37:49 UTC
Seeing the same issue on Broadwell in an X1 Carbon 3rd generation.

Have two external monitors attached through the OneLink Pro Dock and see intermittent screen blanking (happens both for DisplayPort and DVI).

Just to say, this doesn't seem isolated to Skylake.
Comment 28 anomaly256 2016-07-06 02:01:30 UTC
Hi Intel driver devs, can we get an update on this?  And please don't just ask us to 'try the nightlies' again.  I've done that multiple times now and the problem just gets worse and worse which tells me you're not actively working on /this/ issue but just hoping fixes in other areas coincidentally fix it.
Comment 29 anomaly256 2016-07-10 09:44:24 UTC
Re-marking as 'new' since we've provided your requested feedback and the problem is not resolved by your nightlies.
Comment 30 Spricer 2016-07-11 04:08:48 UTC
I can confirm this bug as well in Mageia 5 latest kernel.
C/P from my post there:

After booting into 4.4.13-desktop-1.mga5, the screen starts flickering or going blank for a second when moving the mouse from one monitor to the other. Sometimes the screen stays black after the mouse leaves it, and moving the mouse back restores it. This does NOT happen with the 4.1.15-2.mga5 kernel.

*** ENV ***:
Mageia: Mageia release 5 (Official) for x86_64
Kernel: 4.4.13-desktop-1.mga5 #1 SMP Fri Jun 10 12:16:55 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
GUI:mate-desktop-1.8.1-4.mga5
MB: ASUS P7H55-M PRO
CPU: Intel(R) Core(TM) i5 CPU 650 @ 3.20GHz
X: x11-server-xorg-1.16.4-2.1.mga5
#lspci -s 00:02.0 -v
00:02.0 VGA compatible controller: Intel Corporation Core Processor Integrated Graphics Controller (rev 12) (prog-if 00 [VGA controller])
	Subsystem: ASUSTeK Computer Inc. Device 8383
	Flags: bus master, fast devsel, latency 0, IRQ 25
	Memory at f7800000 (64-bit, non-prefetchable) [size=4M]
	Memory at e0000000 (64-bit, prefetchable) [size=256M]
	I/O ports at dc00 [size=8]
	Expansion ROM at <unassigned> [disabled]
	Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
	Capabilities: [d0] Power Management version 2
	Capabilities: [a4] PCI Advanced Features
	Kernel driver in use: i915
	Kernel modules: i915

#cat /etc/X11/xorg.conf
Section "Module"
    Load "v4l" # Video for Linux
EndSection

Section "Monitor"
    Identifier "VGA"
    VendorName "LG"
    ModelName "E2240"
    HorizSync 30-83
    VertRefresh 56-75
    # Monitor preferred modeline (60.0 Hz vsync, 67.5 kHz hsync, ratio 16/9, 102 dpi)
    ModeLine "1920x1080" 148.5 1920 2008 2052 2200 1080 1084 1089 1125 +hsync +vsync
EndSection

Section "Monitor"
    Identifier "HDMI"
    VendorName "ViewSonic"
    ModelName "VX2245wm"
    HorizSync 30-82
    VertRefresh 50-75
    Option "RightOf" "VGA"
    # Monitor preferred modeline (60.0 Hz vsync, 65.3 kHz hsync, ratio 16/10, 90 dpi)
    ModeLine "1680x1050" 146.25 1680 1784 1960 2240 1050 1053 1059 1089 +hsync -vsync
EndSection

Section "Device"
    Identifier "IntelHD"
    VendorName "Intel Corporation"
    BoardName "i915"
    Driver "intel"
    BusID "PCI:0:2:0"
    Option "Monitor-HDMI1" "HDMI"
    Option "Monitor-VGA1" "VGA"
EndSection

Section "Screen"
    Identifier "screen0"
    Device "IntelHD"
    Monitor "VGA"
EndSection

Section "ServerLayout"
    Identifier "DualScreen"
    Screen "screen0"
EndSection


*** Related Logs entries***
dmesg:
[   23.253059] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
[   23.253192] [drm:intel_pch_fifo_underrun_irq_handler [i915]] *ERROR* PCH transcoder A FIFO underrun
[   24.185645] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
[   24.185676] [drm:intel_pch_fifo_underrun_irq_handler [i915]] *ERROR* PCH transcoder B FIFO underrun
[   43.650849] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
[   43.650985] [drm:intel_pch_fifo_underrun_irq_handler [i915]] *ERROR* PCH transcoder B FIFO underrun



Steps to Reproduce:
Boot to 4.4.13-desktop-1.mga5 kernel
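
The vertical refresh rate implied by a ModeLine like the two in the xorg.conf above is the pixel clock divided by the total pixels per frame (horizontal total times vertical total). The following is a minimal sketch of that arithmetic, assuming the xorg.conf ModeLine layout (name, clock in MHz, four horizontal values, four vertical values, then flags); the helper names parse_modeline and refresh_rate_hz are illustrative.

```python
def parse_modeline(line):
    """Return (name, clock_mhz, htotal, vtotal) from an xorg.conf ModeLine."""
    tokens = line.split()
    if tokens and tokens[0].lower() == "modeline":
        tokens = tokens[1:]
    name = tokens[0].strip('"')
    # clock, hdisp, hsyncstart, hsyncend, htotal, vdisp, vsyncstart, vsyncend, vtotal
    numbers = [float(t) for t in tokens[1:10]]
    return name, numbers[0], numbers[4], numbers[8]


def refresh_rate_hz(clock_mhz, htotal, vtotal):
    """Vertical refresh in Hz: pixel clock / (htotal * vtotal)."""
    return clock_mhz * 1e6 / (htotal * vtotal)


if __name__ == "__main__":
    modelines = [
        'ModeLine "1920x1080" 148.5 1920 2008 2052 2200 1080 1084 1089 1125 +hsync +vsync',
        'ModeLine "1680x1050" 146.25 1680 1784 1960 2240 1050 1053 1059 1089 +hsync -vsync',
    ]
    for line in modelines:
        name, clock, htotal, vtotal = parse_modeline(line)
        print(f"{name}: {refresh_rate_hz(clock, htotal, vtotal):.3f} Hz")
```

For the 1920x1080 mode this works out to exactly 60 Hz (148.5 MHz / (2200 x 1125)), while the 1680x1050 mode comes out at roughly 59.95 Hz, which the comment in the config rounds to 60.0.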
Comment 31 Spricer 2016-07-11 04:32:03 UTC
lib64drm_intel1-2.4.59-1.mga5
x11-driver-video-intel-2.99.917-14.2.mga5
Comment 32 Mauro Santos 2016-07-16 14:35:05 UTC
I believe I'm also seeing this problem. I haven't tried the drm-intel-nightly branch, but I wanted to add a data point that might be useful.

I'm running Arch Linux on a Lenovo E560 with an i7-6500U; the current kernel version is 4.6.4.

I did not experience this problem until I did some tuning (*) of the system to get deeper package C-states. Before, the CPU would not go lower than PC2; currently it is able to go down to PC7, and according to powertop there is no PC8~PC10 residency.

Since doing the changes I've seen the following messages in dmesg:
[drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
[drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=414495 end=414496) time 173 us, min 1073, max 1079, scanline start 1069, end 1081
[drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=38859 end=38860) time 410 us, min 1073, max 1079, scanline start 1063, end 1090

The first message seems to be associated with the first flicker, but I've seen the screen flicker without any new messages being written to dmesg.

I have not noticed any reliable way to trigger this problem; it just happens at random times.

(*) I have enabled SATA ALPM (echo min_power > /sys/class/scsi_host/host{0,1}/link_power_management_policy) which, according to powertop, allows the CPU to get into PC6, and I have further enabled USB autosuspend for a device that was preventing the CPU from getting into PC7.
Comment 33 yann 2016-07-21 06:20:49 UTC
Updating priority according to impact and frequency.
Comment 34 anomaly256 2016-07-21 07:54:49 UTC
Seriously..
Comment 35 anomaly256 2016-07-21 07:56:12 UTC
(In reply to yann from comment #33)
> Updating priority accordingly to impact and frequency.

This is an absolute show-stopper.  I had to get my employer to go buy a discrete non-intel card so my new workstation was even usable.  I think this deserves *much* more priority than Intel are giving it.  And yes it is still a problem, we're just not as active in this thread any more BECAUSE INTEL HAVE BEEN IGNORING IT
Comment 36 Peter Wu 2016-08-09 09:54:20 UTC
(Please keep it constructive, thanks.)

Would it help if we attach some dmesg with some value of drm.debug enabled? I am also experiencing flashing/moving screens for some weeks now on Arch Linux on SKL (i7-6700HQ) with Linux 4.6.5.

The "[drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun" message is sometimes logged when this happens for the first time in a while. It seems more likely to happen when selecting text or moving close to the screen borders.
Comment 37 Peter F. Patel-Schneider 2016-08-10 12:31:03 UTC
I too am experiencing this problem.  I am running on a Lenovo 2 Pro with Haswell-ULT integrated graphics.

The problem appears to affect a large number of devices with integrated Intel graphics and appears to be present in all Linux 4.6 kernels.   Many bugs have been reported for the problem in different locations, including https://bugzilla.redhat.com/show_bug.cgi?id=1355851#c6.  

I have not been able to find a bug report that is the right place to find out what is being done to address this bug.  If there is a main report for this bug could someone please add a comment pointing to that report?
Comment 38 Jimmy Merrild Krag 2016-08-12 13:18:05 UTC
I'm pretty sure that this is what I am experiencing on a MacBook Pro 12,1. Screen randomly goes black and comes back again. Afterwards I see
[drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
or
[drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
or both in dmesg.

Graphics is Intel® Iris 6100 (Broadwell GT3) 
I'm on Ubuntu 16.04 with default kernel. Linux JMK-MBP 4.4.0-34-generic #53-Ubuntu SMP Wed Jul 27 16:06:39 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Comment 39 Jani Nikula 2016-08-16 12:33:30 UTC
Would you all be so kind as to please report separate bugs for separate platforms, even when you're convinced it's the same bug? It will be trivial to later resolve the bugs as duplicates, once we've actually determined they are the same bugs. But it's virtually impossible to have a coherent debugging story on a mangle of platforms and environments. 

Additionally, please always indicate the version of the kernel you're running.
Comment 40 Peter Wu 2016-08-16 12:51:49 UTC
The original report was about SKL. By "platform", do you mean SKL vs. HSW, or specific laptop models? If this is about SKL, maybe it is an idea to prepend "[SKL]" to the subject.
Comment 41 Jani Nikula 2016-08-16 13:27:36 UTC
Platforms are Skylake, Broadwell, Haswell, and so on. There's this i915 platform field near the top, with the abbreviations. But I've now added SKL to the subject too.
Comment 42 Peter F. Patel-Schneider 2016-08-18 15:29:13 UTC
There are several bugs already that appear to be about this particular problem on Haswell devices.
https://bugs.freedesktop.org/show_bug.cgi?id=96704
https://bugs.freedesktop.org/show_bug.cgi?id=96736
https://bugs.freedesktop.org/show_bug.cgi?id=96916
https://bugs.freedesktop.org/show_bug.cgi?id=97056
Several of them have no response from developers.
Comment 43 ggg 2016-08-20 17:17:52 UTC
I've been reproducing this as well, occasionally, for the past 6 months on a NUC6i5SYB. After updating "Debian testing" two days ago, the "screen blanking" failure rate now makes openarena unplayable.

# dmidecode
BIOS Information
        Vendor: Intel Corp.
        Version: SYSKLi35.86A.0028.2015.1112.1822
        Release Date: 11/12/2015
...
        BIOS Revision: 5.6
...
Base Board Information
        Manufacturer: Intel corporation
        Product Name: NUC6i5SYB
        Version: H81131-502

Screen blanking for 2-3 seconds while playing openarena at 2560x1440 resolution (via DisplayPort output) happens every 5-20 seconds.

I also noticed the screen blanking doesn't correlate with the "FIFO Underrun" output in dmesg: 
grundler@gggnuc6:~$ dmesg | grep "gen8_irq_handler"
[  932.176734] [drm:gen8_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
[ 8067.746574] [drm:gen8_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
[ 8152.371139] [drm:gen8_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
[10635.449836] [drm:gen8_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
[50462.953067] [drm:gen8_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
[52083.930871] [drm:gen8_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
[52596.327741] [drm:gen8_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun

The last invocation of openarena blanked the screen three different times but only one message appeared.

Note that even during "normal use" (e.g. watching youtube or running chrome browser), the screen will sometimes blank.

Please advise if there are any parameters I should be trying to avoid the screen blanking or to dump more diagnostic info about the screen blanking.
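
One way to check the observation above (blanking every 5-20 seconds, but far fewer log messages) is to look at the spacing between logged underruns. The following is a minimal sketch, assuming dmesg lines with a leading "[seconds.microseconds]" timestamp as in the output quoted in this comment; the function names underrun_times and gaps are illustrative.

```python
import re

# Leading dmesg timestamp, followed anywhere on the line by the underrun text.
TIMESTAMP_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s.*FIFO underrun")


def underrun_times(log_text):
    """Return the timestamps (in seconds since boot) of underrun messages."""
    times = []
    for line in log_text.splitlines():
        match = TIMESTAMP_RE.match(line.strip())
        if match:
            times.append(float(match.group(1)))
    return times


def gaps(times):
    """Return the intervals between consecutive timestamps."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    # First three lines of the grep output quoted in this comment.
    sample = """\
[  932.176734] [drm:gen8_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
[ 8067.746574] [drm:gen8_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
[ 8152.371139] [drm:gen8_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
"""
    for gap in gaps(underrun_times(sample)):
        print(f"{gap:.1f} s between underrun messages")
```

Gaps of minutes to hours between messages, set against blanking every few seconds, would support the point that the logged underruns alone do not account for every blank.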
Comment 44 ggg 2016-08-20 17:29:12 UTC
More info from the NUC6i5SYB: it is connected to a Samsung U28D590D via DisplayPort.

tail /var/log/messages:
[50476.600499] systemd[1]: apt-daily.timer: Adding 9h 46min 57.697169s random time.
[50476.728743] systemd[1]: apt-daily.timer: Adding 11min 52.681564s random time.
[52083.930871] [drm:gen8_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
[52596.327741] [drm:gen8_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun

But /var/log/Xorg.0.log is getting spammed around the same time. Any other logs or output that would be helpful?

[ 52105.723] (II) modeset(0): EDID vendor "SAM", prod id 2944
[ 52105.724] (II) modeset(0): Using hsync ranges from config file
[ 52105.724] (II) modeset(0): Using vrefresh ranges from config file
[ 52105.724] (II) modeset(0): Printing DDC gathered Modelines:
[ 52105.724] (II) modeset(0): Modeline "3840x2160"x0.0  533.25  3840 3888 3920 4000  2160 2163 2168 2222 +hsync -vsync (133.3 kHz eP)
[ 52105.724] (II) modeset(0): Modeline "2560x1440"x0.0  241.50  2560 2608 2640 2720  1440 1443 1448 1481 +hsync -vsync (88.8 kHz e)
[ 52105.724] (II) modeset(0): Modeline "1920x1080"x0.0  148.50  1920 2008 2052 2200  1080 1084 1089 1125 +hsync +vsync (67.5 kHz e)
[ 52105.724] (II) modeset(0): Modeline "800x600"x0.0   40.00  800 840 968 1056  600 601 605 628 +hsync +vsync (37.9 kHz e)
[ 52105.724] (II) modeset(0): Modeline "800x600"x0.0   36.00  800 824 896 1024  600 601 603 625 +hsync +vsync (35.2 kHz e)
[ 52105.724] (II) modeset(0): Modeline "640x480"x0.0   31.50  640 656 720 840  480 481 484 500 -hsync -vsync (37.5 kHz e)
[ 52105.724] (II) modeset(0): Modeline "640x480"x0.0   31.50  640 664 704 832  480 489 492 520 -hsync -vsync (37.9 kHz e)
[ 52105.724] (II) modeset(0): Modeline "640x480"x0.0   30.24  640 704 768 864  480 483 486 525 -hsync -vsync (35.0 kHz e)
[ 52105.724] (II) modeset(0): Modeline "640x480"x0.0   25.18  640 656 752 800  480 490 492 525 -hsync -vsync (31.5 kHz e)
[ 52105.724] (II) modeset(0): Modeline "720x400"x0.0   28.32  720 738 846 900  400 412 414 449 -hsync +vsync (31.5 kHz e)
[ 52105.724] (II) modeset(0): Modeline "1280x1024"x0.0  135.00  1280 1296 1440 1688  1024 1025 1028 1066 +hsync +vsync (80.0 kHz e)
[ 52105.724] (II) modeset(0): Modeline "1024x768"x0.0   78.75  1024 1040 1136 1312  768 769 772 800 +hsync +vsync (60.0 kHz e)
[ 52105.724] (II) modeset(0): Modeline "1024x768"x0.0   75.00  1024 1048 1184 1328  768 771 777 806 -hsync -vsync (56.5 kHz e)
[ 52105.724] (II) modeset(0): Modeline "1024x768"x0.0   65.00  1024 1048 1184 1344  768 771 777 806 -hsync -vsync (48.4 kHz e)
[ 52105.724] (II) modeset(0): Modeline "832x624"x0.0   57.28  832 864 928 1152  624 625 628 667 -hsync -vsync (49.7 kHz e)
[ 52105.724] (II) modeset(0): Modeline "800x600"x0.0   49.50  800 816 896 1056  600 601 604 625 +hsync +vsync (46.9 kHz e)
[ 52105.724] (II) modeset(0): Modeline "800x600"x0.0   50.00  800 856 976 1040  600 637 643 666 +hsync +vsync (48.1 kHz e)
[ 52105.724] (II) modeset(0): Modeline "1152x864"x0.0  108.00  1152 1216 1344 1600  864 865 868 900 +hsync +vsync (67.5 kHz e)
[ 52105.724] (II) modeset(0): Modeline "1280x800"x0.0   83.50  1280 1352 1480 1680  800 803 809 831 -hsync +vsync (49.7 kHz e)
[ 52105.724] (II) modeset(0): Modeline "1280x720"x60.0   74.48  1280 1336 1472 1664  720 721 724 746 -hsync +vsync (44.8 kHz e)
[ 52105.724] (II) modeset(0): Modeline "1280x1024"x0.0  108.00  1280 1328 1440 1688  1024 1025 1028 1066 +hsync +vsync (64.0 kHz e)
[ 52105.724] (II) modeset(0): Modeline "1600x900"x60.0  119.00  1600 1696 1864 2128  900 901 904 932 -hsync +vsync (55.9 kHz e)
[ 52105.724] (II) modeset(0): Modeline "1680x1050"x0.0  146.25  1680 1784 1960 2240  1050 1053 1059 1089 -hsync +vsync (65.3 kHz e)
[ 52105.724] (II) modeset(0): Modeline "1440x900"x0.0  106.50  1440 1520 1672 1904  900 903 909 934 -hsync +vsync (55.9 kHz e)
[ 52105.724] (II) modeset(0): Modeline "720x576"x0.0   27.00  720 732 796 864  576 581 586 625 -hsync -vsync (31.2 kHz e)
[ 52105.903] (II) modeset(0): EDID vendor "SAM", prod id 2944
[ 52105.904] (II) modeset(0): Using hsync ranges from config file
[ 52105.904] (II) modeset(0): Using vrefresh ranges from config file
[ 52105.904] (II) modeset(0): Printing DDC gathered Modelines:
[ 52105.904] (II) modeset(0): Modeline "3840x2160"x0.0  533.25  3840 3888 3920 4000  2160 2163 2168 2222 +hsync -vsync (133.3 kHz eP)
[ 52105.904] (II) modeset(0): Modeline "2560x1440"x0.0  241.50  2560 2608 2640 2720  1440 1443 1448 1481 +hsync -vsync (88.8 kHz e)
[ 52105.904] (II) modeset(0): Modeline "1920x1080"x0.0  148.50  1920 2008 2052 2200  1080 1084 1089 1125 +hsync +vsync (67.5 kHz e)
...
[ 52105.904] (II) modeset(0): Modeline "720x576"x0.0   27.00  720 732 796 864  576 581 586 625 -hsync -vsync (31.2 kHz e)
[ 52106.039] (II) modeset(0): EDID vendor "SAM", prod id 2944
[ 52106.039] (II) modeset(0): Using hsync ranges from config file
[ 52106.039] (II) modeset(0): Using vrefresh ranges from config file
[ 52106.039] (II) modeset(0): Printing DDC gathered Modelines:
[ 52106.039] (II) modeset(0): Modeline "3840x2160"x0.0  533.25  3840 3888 3920 4000  2160 2163 2168 2222 +hsync -vsync (133.3 kHz eP)
...
[ 52106.039] (II) modeset(0): Modeline "720x576"x0.0   27.00  720 732 796 864  576 581 586 625 -hsync -vsync (31.2 kHz e)
[ 52106.175] (II) modeset(0): EDID vendor "SAM", prod id 2944
[ 52106.176] (II) modeset(0): Using hsync ranges from config file
[ 52106.176] (II) modeset(0): Using vrefresh ranges from config file
[ 52106.176] (II) modeset(0): Printing DDC gathered Modelines:
[ 52106.176] (II) modeset(0): Modeline "3840x2160"x0.0  533.25  3840 3888 3920 4000  2160 2163 2168 2222 +hsync -vsync (133.3 kHz eP)
...
[ 52106.176] (II) modeset(0): Modeline "720x576"x0.0   27.00  720 732 796 864  576 581 586 625 -hsync -vsync (31.2 kHz e)
[ 52600.127] (II) modeset(0): EDID vendor "SAM", prod id 2944
[ 52600.128] (II) modeset(0): Using hsync ranges from config file
[ 52600.128] (II) modeset(0): Using vrefresh ranges from config file
[ 52600.128] (II) modeset(0): Printing DDC gathered Modelines:
[ 52600.128] (II) modeset(0): Modeline "3840x2160"x0.0  533.25  3840 3888 3920 4000  2160 2163 2168 2222 +hsync -vsync (133.3 kHz eP)
[ 52600.128] (II) modeset(0): Modeline "2560x1440"x0.0  241.50  2560 2608 2640 2720  1440 1443 1448 1481 +hsync -vsync (88.8 kHz e)
...
[ 52600.128] (II) modeset(0): Modeline "720x576"x0.0   27.00  720 732 796 864  576 581 586 625 -hsync -vsync (31.2 kHz e)
[ 52600.263] (II) modeset(0): EDID vendor "SAM", prod id 2944
[ 52600.264] (II) modeset(0): Using hsync ranges from config file
[ 52600.264] (II) modeset(0): Using vrefresh ranges from config file
[ 52600.264] (II) modeset(0): Printing DDC gathered Modelines:
[ 52600.264] (II) modeset(0): Modeline "3840x2160"x0.0  533.25  3840 3888 3920 4000  2160 2163 2168 2222 +hsync -vsync (133.3 kHz eP)
[ 52600.264] (II) modeset(0): Modeline "2560x1440"x0.0  241.50  2560 2608 2640 2720  1440 1443 1448 1481 +hsync -vsync (88.8 kHz e)
...
[ 52600.264] (II) modeset(0): Modeline "720x576"x0.0   27.00  720 732 796 864  576 581 586 625 -hsync -vsync (31.2 kHz e)
[ 52600.451] (II) modeset(0): EDID vendor "SAM", prod id 2944
[ 52600.452] (II) modeset(0): Using hsync ranges from config file
[ 52600.452] (II) modeset(0): Using vrefresh ranges from config file
[ 52600.452] (II) modeset(0): Printing DDC gathered Modelines:
[ 52600.452] (II) modeset(0): Modeline "3840x2160"x0.0  533.25  3840 3888 3920 4000  2160 2163 2168 2222 +hsync -vsync (133.3 kHz eP)
[ 52600.452] (II) modeset(0): Modeline "2560x1440"x0.0  241.50  2560 2608 2640 2720  1440 1443 1448 1481 +hsync -vsync (88.8 kHz e)
...
[ 52600.452] (II) modeset(0): Modeline "1440x900"x0.0  106.50  1440 1520 1672 1904  900 903 909 934 -hsync +vsync (55.9 kHz e)
[ 52600.452] (II) modeset(0): Modeline "720x576"x0.0   27.00  720 732 796 864  576 581 586 625 -hsync -vsync (31.2 kHz e)
[ 52600.595] (II) modeset(0): EDID vendor "SAM", prod id 2944
[ 52600.596] (II) modeset(0): Using hsync ranges from config file
[ 52600.596] (II) modeset(0): Using vrefresh ranges from config file
[ 52600.596] (II) modeset(0): Printing DDC gathered Modelines:
[ 52600.596] (II) modeset(0): Modeline "3840x2160"x0.0  533.25  3840 3888 3920 4000  2160 2163 2168 2222 +hsync -vsync (133.3 kHz eP)
...
[ 52600.596] (II) modeset(0): Modeline "720x576"x0.0   27.00  720 732 796 864  576 581 586 625 -hsync -vsync (31.2 kHz e)
[ 52600.792] (II) modeset(0): EDID vendor "SAM", prod id 2944
[ 52600.792] (II) modeset(0): Using hsync ranges from config file
[ 52600.792] (II) modeset(0): Using vrefresh ranges from config file
[ 52600.792] (II) modeset(0): Printing DDC gathered Modelines:
[ 52600.792] (II) modeset(0): Modeline "3840x2160"x0.0  533.25  3840 3888 3920 4000  2160 2163 2168 2222 +hsync -vsync (133.3 kHz eP)
...
[ 52600.792] (II) modeset(0): Modeline "720x576"x0.0   27.00  720 732 796 864  576 581 586 625 -hsync -vsync (31.2 kHz e)
[ 52600.927] (II) modeset(0): EDID vendor "SAM", prod id 2944
[ 52600.927] (II) modeset(0): Using hsync ranges from config file
[ 52600.927] (II) modeset(0): Using vrefresh ranges from config file
[ 52600.927] (II) modeset(0): Printing DDC gathered Modelines:
[ 52600.927] (II) modeset(0): Modeline "3840x2160"x0.0  533.25  3840 3888 3920 4000  2160 2163 2168 2222 +hsync -vsync (133.3 kHz eP)
...
[ 52600.927] (II) modeset(0): Modeline "720x576"x0.0   27.00  720 732 796 864  576 581 586 625 -hsync -vsync (31.2 kHz e)
[ 52601.062] (II) modeset(0): EDID vendor "SAM", prod id 2944
[ 52601.062] (II) modeset(0): Using hsync ranges from config file
[ 52601.062] (II) modeset(0): Using vrefresh ranges from config file
[ 52601.062] (II) modeset(0): Printing DDC gathered Modelines:
[ 52601.062] (II) modeset(0): Modeline "3840x2160"x0.0  533.25  3840 3888 3920 4000  2160 2163 2168 2222 +hsync -vsync (133.3 kHz eP)
...
[ 52601.063] (II) modeset(0): Modeline "720x576"x0.0   27.00  720 732 796 864  576 581 586 625 -hsync -vsync (31.2 kHz e)
[ 52652.639] (II) modeset(0): EDID vendor "SAM", prod id 2944
[ 52652.639] (II) modeset(0): Using hsync ranges from config file
[ 52652.639] (II) modeset(0): Using vrefresh ranges from config file
[ 52652.639] (II) modeset(0): Printing DDC gathered Modelines:
...
Comment 45 ggg 2016-08-20 17:39:52 UTC
Sorry, forgot to mention I _was_ running 4.5.0-1-amd64 kernel from Debian testing release:
Linux gggnuc6 4.5.0-1-amd64 #1 SMP Debian 4.5.1-1 (2016-04-14) x86_64 GNU/Linux

I've installed the 4.6.0-1-amd64 kernel and will try that next after reading:
   http://blog.ffwll.ch/2016/03/neat-drmi915-stuff-for-46.html
Comment 46 ggg 2016-08-20 17:58:03 UTC
And 4.6.4-1 doesn't behave any better. Same symptom. Openarena is still unplayable. :(

# uname -a
Linux gggnuc6 4.6.0-1-amd64 #1 SMP Debian 4.6.4-1 (2016-07-18) x86_64 GNU/Linux
Comment 47 ggg 2016-08-20 18:20:18 UTC
Thought the current i915 driver parameters might be of interest:

/sys/module/i915/parameters# for i in * 
> do
> echo $i : $(cat $i)
> done
disable_display : N
disable_power_well : 1
edp_vswing : 0
enable_cmd_parser : 1
enable_dc : -1
enable_execlists : 1
enable_fbc : -1
enable_guc_submission : N
enable_hangcheck : Y
enable_ips : 1
enable_ppgtt : 3
enable_psr : 0
enable_rc6 : 0
fastboot : N
guc_log_level : -1
invert_brightness : 0
load_detect_test : N
lvds_channel_mode : 0
lvds_use_ssc : -1
mmio_debug : 0
modeset : -1
nuclear_pageflip : N
panel_ignore_lid : 1
prefault_disable : N
preliminary_hw_support : 0
reset : Y
semaphores : -1
use_mmio_flip : 0
vbt_sdvo_panel_type : -1
verbose_state_checks : Y

and while I'm at it, DRM parameters as well:
/sys/module/drm/parameters# for i in * ; do echo $i : $(cat $i); done
debug : 0
edid_fixup : 6
timestamp_monotonic : 1
timestamp_precision_usec : 20
vblankoffdelay : 5000

Lastly, when I hit Ctrl-N (open a new browser window) in Chrome, the screen sometimes blanks. This has happened too often to be a coincidence.
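
For comparing parameter dumps like the ones above between machines, a minimal Python sketch of the same loop follows. It is an illustration only: the function name `read_module_parameters` is hypothetical, and it assumes the usual `/sys/module/<module>/parameters` layout used in the shell loops above.

```python
from pathlib import Path


def read_module_parameters(module, sysfs_root="/sys/module"):
    """Return {parameter name: value} for a loaded kernel module.

    Raises FileNotFoundError if the module has no parameters directory
    (for example, because it is not loaded).
    """
    params_dir = Path(sysfs_root) / module / "parameters"
    values = {}
    for entry in sorted(params_dir.iterdir()):
        try:
            values[entry.name] = entry.read_text().strip()
        except OSError:
            # Some parameters may only be readable by root.
            values[entry.name] = None
    return values
```

Calling `read_module_parameters("i915")` and `read_module_parameters("drm")` would return dictionaries that can be diffed against another reporter's values.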
Comment 48 ggg 2016-08-20 18:34:18 UTC
Wow! Just discovered that scrolling the mouse wheel to zoom in/out while on maps.google.com (plain map view, not satellite or earth view) also triggers the screen to blank, with no corresponding output in Xorg.0.log, /var/log/messages, or dmesg.

After switching to "earth" view on maps.google.com, dragging the map with the mouse also triggers screen blanking.

Display output is set to 3840x2160:
[   108.184] (II) modeset(0): Modeline "3840x2160"x0.0  533.25  3840 3888 3920 4000  2160 2163 2168 2222 +hsync -vsync (133.3 kHz eP)
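
As a side note on reading that modeline: the first number after the mode name is the pixel clock in MHz, and the last value of each timing group is the total width or height including blanking. The short Python sketch below (hypothetical helper name, added for illustration only) derives the line rate and refresh rate from those three numbers, which is one way to sanity-check the "133.3 kHz" that Xorg prints.

```python
def modeline_rates(pixel_clock_mhz, htotal, vtotal):
    """Return (horizontal sync rate in Hz, vertical refresh rate in Hz)."""
    hsync_hz = pixel_clock_mhz * 1e6 / htotal   # scanlines per second
    vrefresh_hz = hsync_hz / vtotal             # frames per second
    return hsync_hz, vrefresh_hz


# Values taken from the 3840x2160 modeline above:
# pixel clock 533.25 MHz, htotal 4000, vtotal 2222.
hsync_hz, vrefresh_hz = modeline_rates(533.25, 4000, 2222)
print(f"{hsync_hz / 1000:.1f} kHz, {vrefresh_hz:.2f} Hz")
# prints: 133.3 kHz, 60.00 Hz
```

The same helper applied to the 2560x1440 modeline in the Xorg log (241.50 MHz, htotal 2720, vtotal 1481) gives a pixel clock less than half that of the 3840x2160 mode at roughly the same refresh rate.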
Comment 49 Eero Tamminen 2016-08-22 14:30:37 UTC
(In reply to ggg from comment #48)
> Wow! just discovered that scrolling the mouse wheel to zoom in/out while on
> maps.google.com (plain map view, not satellite or earth view) also triggers
> screen to blank - no corresponding output in either Xorg.0.log or
> /var/log/messages or dmesg.

This sounds more like the X issues with DRI3 than a kernel issue. Could you try both the Intel DDX and modesetting X drivers, and with the Intel DDX both DRI3 and DRI2?

If it's indeed an X issue, please file a separate bug about it.
Comment 50 Jimmy Merrild Krag 2016-08-23 09:46:57 UTC
I just realized something I hadn't before. I only have the issue on monitors I connect using DisplayPort, and not on my laptop's built-in monitor. Is there anything I can give you guys that will show what the difference is between the monitors?
Comment 51 martin.hagstrom 2016-08-23 09:52:48 UTC
(In reply to Jimmy Merrild Krag from comment #50)
> I just realized something I didn't before. I only have the issue on monitors
> I connect using displayport, and not on my laptops built-in monitor. Is
> there anything I can give you guys that will show what the difference is
> between the monitors?

I noticed the same thing, but with HDMI-connected monitors: those are fine, and I only see this problem with DisplayPort.
Comment 52 Christer 2016-09-18 09:57:53 UTC
Created attachment 126597 [details]
dmesg on Dell XPS 13 2016 model
Comment 53 Christer 2016-09-18 09:58:52 UTC
Created attachment 126598 [details]
lspci -vvv on Dell XPS 13 2016 model
Comment 54 Christer 2016-09-18 10:02:43 UTC
Created attachment 126599 [details]
Journalctl -r | grep underrun on Dell XPS 13 (2016 model)
Comment 55 Christer 2016-09-18 10:06:08 UTC
test
Comment 56 Christer 2016-09-18 10:17:04 UTC
I have this problem as well on my Dell XPS 13 connected to an external 27" Eizo monitor using DisplayPort.

- Sample frequency: clicking the same link in Google search results 20 times caused signal loss on DisplayPort 4 times as the page refreshed. Each signal loss lasts about a second.

- The bug doesn't seem to happen if I lower the resolution from 2560x1440 to 1920x1200, which is interesting and hopefully a clue to solving the issue... but of course I can't really use that resolution because text looks quite bad.

- The bug also doesn't seem to happen on HDMI. I can't confirm this with the same external monitor since it only has a DisplayPort connection, but I have an ASUS 305 with a similar chipset (also running Arch Linux) and I never get this bug on that machine when connected to an external monitor using HDMI.

- The bug also happens with no user activity (no mouse or keyboard usage), but more frequently when doing something active, like browsing the web and clicking links.

- The latest 4.8 rc6 kernel and the kernel option i915.enable_rc6=0 don't help.

- I'm using the latest drivers from Intel:

local/xf86-video-intel 1:2.99.917+703+g15c5ff1-1 (xorg-drivers xorg)
    X.org Intel i810/i830/i915/945G/G965+ video drivers
Comment 57 Christer 2016-09-18 13:51:47 UTC
Also tried https://aur.archlinux.org/packages/linux-drm-intel-nightly with no luck.

During compilation of this package, the screen lost signal so often that it was almost completely off, which again suggests that this bug happens a lot more when the computer is under heavy load.

After installation and reboot, the issue still remains. No change. :(

Linux dellbox 4.8.0-1-drm-intel-nightly #1 SMP PREEMPT Sun Sep 18 14:12:27 CEST 2016 x86_64 GNU/Linux
Comment 58 Rami 2016-09-21 14:33:03 UTC
Can you try with the latest drm-intel-nightly?
Comment 59 Russ Dill 2016-09-22 20:23:33 UTC
libdrm-intel1 2.4.70+git1609171830.065955~gd~x
Linux russ-laptop 4.7.3-040703-generic #201609070334 SMP Wed Sep 7 07:36:45 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
model name	: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz

With this latest drm on 4.7, I've been stable with no flickering so far. I have one 1920x1200 monitor connected through the laptop's HDMI port, another connected through the Thunderbolt docking station's HDMI port, and I am also using the 1920x1080 laptop display.
Comment 60 Timur Alperovich 2016-09-22 20:31:15 UTC
I built the drm-intel nightly from a pull on 09/21/2016 (commit: 463d07a32d87742a73e1ed352a6d6daa3f29d0c2). It appears to have resolved my issues with an external 4k monitor connected over DP. The monitor is at 3840x2160 and the internal screen at 3200x1800.

I have not observed any underrun messages in dmesg or any other issues when attaching/removing the external screen (attempted to attach/detach it multiple times).
Comment 61 Timur Alperovich 2016-09-22 22:16:18 UTC
After running for a while today without issues, I observed the following error:
[drm:intel_pipe_update_end] *ERROR* Atomic update failure on pipe A (start=708910 end=708911) time 167 us, min 2146, max 2159, scanline start 2144, end 2167

Should I open a new bug for it? It appears to be related to the changes made to fix the original issue.
Comment 62 Victor Trac 2016-09-23 04:08:44 UTC
@Timur I've been running drm-intel nightly for a couple of weeks and have been seeing that error in my logs as well, I think usually after a few hours. Also, after a few hours of running, I start to see some really quick tearing/blipping, usually around the Chromium window. That lasts for a while, but I have just been getting annoyed and rebooting to make it go away. Possibly related?
Comment 63 Victor Trac 2016-09-27 13:51:45 UTC
Bug isn't fixed. I just had it happen to me on a drm-intel nightly built on Sep 25. Same error message:

Sep 27 08:49:42 callisto kernel: [drm:intel_cpu_fifo_underrun_irq_handler] *ERROR* CPU pipe A FIFO underrun
Comment 64 Jani Nikula 2016-09-27 14:17:08 UTC
Please grab a fresh nightly, and try again, as a bunch of fixes just went in yesterday.
Comment 65 Paulo Zanoni 2016-10-03 17:36:28 UTC
(In reply to Jani Nikula from comment #64)
> Please grab a fresh nightly, and try again, as a bunch of fixes just went in
> yesterday.

And another important fix that solved specific cases of flickering on SKL (#97888) was merged 2 days after Jani's last comment. So please make sure you're testing today's drm-intel-nightly branch.
Comment 66 Direx 2016-10-05 06:09:00 UTC
Issue is *not fixed* on 2016y-09m-27d-16h-32m-56s. Will try a newer nightly soon.
Comment 67 syphyr 2016-10-05 13:44:25 UTC
System information:
dmesg | head -n 1
[    0.000000] Linux version 4.8.0-994-generic (root@droid) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.2) ) #201610040014 SMP Wed Oct 5 12:12:16 CEST 2016
lspci -s 00:02.0
00:02.0 VGA compatible controller: Intel Corporation Sky Lake Integrated Graphics (rev 07)

I'm using Ubuntu 16.04 with the following kernel:

git clone git://git.launchpad.net/~ubuntu-kernel-test/ubuntu/+source/linux/+git/mainline-crack
cd mainline-crack
git reset --hard c8d2bc9  # use latest stable Linux 4.8 kernel

git fetch git://git.launchpad.net/~ubuntu-kernel-test/ubuntu/+source/linux/+git/mainline-crack cod/tip/drm-intel-nightly/2016-10-04
git merge FETCH_HEAD

The FIFO underrun seems fixed now!

My only issue after merging the latest drm nightly into mainline is that ZFS support no longer builds in Ubuntu. I had to disable ZFS to get it to build. If anyone can tell me how to fix this, I would appreciate it.

diff --git a/debian.master/rules.d/amd64.mk b/debian.master/rules.d/amd64.mk
index 24d3cf6..ee9da01 100644
--- a/debian.master/rules.d/amd64.mk
+++ b/debian.master/rules.d/amd64.mk
@@ -17,4 +17,4 @@ do_tools_x86  = true
 do_tools_hyperv        = true
 do_extras_package = true
 do_tools_common = true
-do_zfs         = true
+#do_zfs                = true
Comment 68 yann 2016-10-11 06:52:06 UTC
Please re-test with Paulo's patch to apply memory workarounds for skylake: https://patchwork.freedesktop.org/series/13548/
Comment 69 Derek Scherger 2016-10-12 03:51:11 UTC
Adding to the reports: I see this error on a Gigabyte P35X v5 laptop running Ubuntu 14.04 with a 4.8.1 kernel. The error occurs when an external display is connected (VGA connector) *and* when the laptop has 32GB (2x16GB DIMM) of RAM installed. If I drop to 16GB (either 2x8GB or 1x16GB), the problem goes away.

Here's a stack dump and associated error messages from syslog.

Oct 11 13:47:53 localhost kernel: [ 1101.631938] init: Handling drm-device-changed event
Oct 11 13:47:54 localhost kernel: [ 1103.118129] init: Handling drm-device-changed event
Oct 11 13:47:56 localhost kernel: [ 1104.442387] ------------[ cut here ]------------
Oct 11 13:47:56 localhost kernel: [ 1104.442419] WARNING: CPU: 2 PID: 1587 at /home/kernel/COD/linux/drivers/gpu/drm/drm_irq.c:1215 drm_wait_one_vblank+0x16b/0x1b0 [drm]
Oct 11 13:47:56 localhost kernel: [ 1104.442420] vblank not available on crtc 0, ret=-22
Oct 11 13:47:56 localhost kernel: [ 1104.442422] Modules linked in: cmac snd_usb_audio snd_usbmidi_lib hid_logitech_hidpp hid_logitech_dj btusb btrtl hid_generic usbhid xfrm_user xfrm4_tunnel tunnel4 ipcomp xfrm_ipcomp esp4 ah4 af_key xfrm_algo ctr ccm arc4 snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event iwlmvm snd_rawmidi snd_seq intel_rapl mac80211 x86_pkg_temp_thermal intel_powerclamp snd_seq_device kvm_intel bnep rfcomm snd_timer sparse_keymap kvm mxm_wmi irqbypass iwlwifi hci_uart intel_cstate rtsx_pci_ms snd efi_pstore btbcm btqca joydev btintel cfg80211 intel_rapl_perf serio_raw efivars soundcore memstick bluetooth wmi ac battery rfkill i2c_hid pinctrl_sunrisepoint pinctrl_intel hid intel_lpss_acpi intel_lpss acpi_pad shpchp mei_me evdev mei acpi_als kfifo_buf industrialio fuse parport_pc ppdev nls_iso8859_1 coretemp lp parport ext4 crc16 jbd2 fscrypto mbcache dm_crypt dm_mod nvme nvme_core rtsx_pci_sdmmc mmc_core crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel i915 nvidia_drm(POE) aesni_intel nvidia_modeset(POE) aes_x86_64 lrw gf128mul i2c_algo_bit glue_helper ablk_helper drm_kms_helper cryptd syscopyarea psmouse sysfillrect sysimgblt r8169 fb_sys_fops rtsx_pci nvidia(POE) mii drm ahci libahci thermal fan video button fjes
Oct 11 13:47:56 localhost kernel: [ 1104.442555] CPU: 2 PID: 1587 Comm: Xorg Tainted: P        W  OE   4.8.1-040801-generic #201610071031
Oct 11 13:47:56 localhost kernel: [ 1104.442557] Hardware name: GIGABYTE P35V5/P35V5, BIOS FB0F 02/23/2016
Oct 11 13:47:56 localhost kernel: [ 1104.442560]  0000000000000286 0000000009c2ee6b ffffffff8555aec4 ffff98b3997b39c8
Oct 11 13:47:56 localhost kernel: [ 1104.442566]  0000000000000000 ffffffff8527f8ae ffff98b391b80000 ffff98b3997b3a20
Oct 11 13:47:56 localhost kernel: [ 1104.442571]  ffff98b394fd2400 ffff98b391b803d8 ffff98b391b80000 0000000000000000
Oct 11 13:47:56 localhost kernel: [ 1104.442575] Call Trace:
Oct 11 13:47:56 localhost kernel: [ 1104.442584]  [<ffffffff8555aec4>] ? dump_stack+0x5c/0x78
Oct 11 13:47:56 localhost kernel: [ 1104.442591]  [<ffffffff8527f8ae>] ? __warn+0xbe/0xe0
Oct 11 13:47:56 localhost kernel: [ 1104.442596]  [<ffffffff8527f92f>] ? warn_slowpath_fmt+0x5f/0x80
Oct 11 13:47:56 localhost kernel: [ 1104.442618]  [<ffffffffc029d076>] ? drm_vblank_get+0x76/0xc0 [drm]
Oct 11 13:47:56 localhost kernel: [ 1104.442638]  [<ffffffffc029d2bb>] ? drm_wait_one_vblank+0x16b/0x1b0 [drm]
Oct 11 13:47:56 localhost kernel: [ 1104.442693]  [<ffffffffc11748c0>] ? chv_write32+0x3c0/0x3c0 [i915]
Oct 11 13:47:56 localhost kernel: [ 1104.442725]  [<ffffffffc1124add>] ? skl_wm_flush_pipe+0xcd/0x100 [i915]
Oct 11 13:47:56 localhost kernel: [ 1104.442756]  [<ffffffffc112581b>] ? skl_update_wm+0x42b/0x6c0 [i915]
Oct 11 13:47:56 localhost kernel: [ 1104.442807]  [<ffffffffc11976d8>] ? haswell_crtc_enable+0x798/0x860 [i915]
Oct 11 13:47:56 localhost kernel: [ 1104.442856]  [<ffffffffc119374f>] ? intel_atomic_commit_tail+0x84f/0x10a0 [i915]
Oct 11 13:47:56 localhost kernel: [ 1104.442903]  [<ffffffffc119c070>] ? intel_prepare_plane_fb+0x100/0x2b0 [i915]
Oct 11 13:47:56 localhost kernel: [ 1104.442920]  [<ffffffffc1034a82>] ? drm_atomic_helper_setup_commit+0x252/0x320 [drm_kms_helper]
Oct 11 13:47:56 localhost kernel: [ 1104.442965]  [<ffffffffc11943e2>] ? intel_atomic_commit+0x442/0x560 [i915]
Oct 11 13:47:56 localhost kernel: [ 1104.443000]  [<ffffffffc02b6032>] ? drm_atomic_set_crtc_for_connector+0x92/0xf0 [drm]
Oct 11 13:47:56 localhost kernel: [ 1104.443016]  [<ffffffffc10357c9>] ? drm_atomic_helper_set_config+0x79/0xb0 [drm_kms_helper]
Oct 11 13:47:56 localhost kernel: [ 1104.443042]  [<ffffffffc02a4591>] ? drm_mode_set_config_internal+0x61/0x110 [drm]
Oct 11 13:47:56 localhost kernel: [ 1104.443069]  [<ffffffffc02a900b>] ? drm_mode_setcrtc+0x42b/0x560 [drm]
Oct 11 13:47:56 localhost kernel: [ 1104.443089]  [<ffffffffc029b95d>] ? drm_ioctl+0x1ad/0x460 [drm]
Oct 11 13:47:56 localhost kernel: [ 1104.443114]  [<ffffffffc02a8be0>] ? drm_mode_setplane+0x1c0/0x1c0 [drm]
Oct 11 13:47:56 localhost kernel: [ 1104.443120]  [<ffffffff85428fdf>] ? do_vfs_ioctl+0x9f/0x600
Oct 11 13:47:56 localhost kernel: [ 1104.443123]  [<ffffffff852f1b52>] ? do_setitimer+0x1d2/0x260
Oct 11 13:47:56 localhost kernel: [ 1104.443126]  [<ffffffff852f1d55>] ? SyS_setitimer+0xe5/0x120
Oct 11 13:47:56 localhost kernel: [ 1104.443129]  [<ffffffff854295b4>] ? SyS_ioctl+0x74/0x80
Oct 11 13:47:56 localhost kernel: [ 1104.443134]  [<ffffffff859401f6>] ? entry_SYSCALL_64_fastpath+0x1e/0xa8
Oct 11 13:47:56 localhost kernel: [ 1104.443137] ---[ end trace 5931305c28d051d6 ]---
Oct 11 13:47:56 localhost colord: device removed: xrandr-Ancor Communications Inc-ASUS PB287Q-10863
Oct 11 13:47:56 localhost kernel: [ 1104.581859] [drm:gen8_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
Oct 11 13:47:58 localhost kernel: [ 1107.117119] [drm:gen8_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
Oct 11 13:48:59 localhost kernel: [ 1167.604042] init: Handling drm-device-changed event

I've had this problem for a long time and have tried various earlier kernels (4.4.0, 4.6.4) all with similar results.

I can reproduce this more or less at will (install ram, plug in monitor, error) so if there's more info that would be helpful I'd be happy to do some digging.

You may wonder why in the world I am using a VGA connector. This is because I have even less luck with HDMI or DisplayPort connectors, which don't seem to display anything at all. Occasionally I'll get an image with DisplayPort for a second or two, but primarily this display is black.
Comment 70 Dennis Kieselhorst 2016-10-17 09:31:55 UTC
I'm also facing this issue after upgrading to Ubuntu 16.10 which contains Kernel 4.8.0-22.

dk@DK-TP:~$ dmesg | head -n 1
[    0.000000] Linux version 4.8.0-22-generic (buildd@lgw01-11) (gcc version 6.2.0 20161005 (Ubuntu 6.2.0-5ubuntu12) ) #24-Ubuntu SMP Sat Oct 8 09:15:00 UTC 2016 (Ubuntu 4.8.0-22.24-generic 4.8.0)
dk@DK-TP:~$ lspci -s 00:02.0
00:02.0 VGA compatible controller: Intel Corporation HD Graphics 520 (rev 07)
Comment 71 Mads 2016-10-17 09:38:53 UTC
(In reply to Derek Scherger from comment #69)
> Adding to the reports I see this error on a gigabyte p35x v5 laptop running
> ubuntu 14.04 with a 4.8.1 kernel. 

(In reply to Dennis Kieselhorst from comment #70)
> I'm also facing this issue after upgrading to Ubuntu 16.10 which contains
> Kernel 4.8.0-22.

Why don't you both try yann's suggestion? It would actually be useful if you get the same type of errors with the newest nightly...
Comment 72 Florian Kaiser 2016-10-18 12:27:16 UTC
I was experiencing the same issues (logs attached) - display randomly going blank for a short time while working (moving the mouse around etc.) with corresponding underruns in the kernel log.

Environment: 
Dell OptiPlex 7040/0Y7WYT with i7-6700 CPU
2x Dell U2415 displays connected via DP
OpenSUSE Leap 42.1 kernel 4.1.31-30-default and vanilla 4.8.0
2x 16GB DIMMs as well, however removing one of them didn't change anything

The nightly from 10/16/2016 seems to have fixed or at least greatly reduced it for me. Before, the error happened multiple times an hour. With the nightly, after one day so far, I haven't had the error while working.

However, I can still trigger the same (or a similar?) error by switching my displays off and on again:

# With nightly 4.8.0-drm-intel-20161016
Oct 18 13:36:59 fekpc org.kde.KScreen[1734]: kscreen.xrandr: Output 68 : connected = false , enabled = true
Oct 18 13:37:00 fekpc org.kde.KScreen[1734]: kscreen.xrandr: Emitting configChanged()
Oct 18 13:37:00 fekpc org.kde.KScreen[1734]: kscreen: Primary output changed from KScreen::Output(Id: 67 , Name: "DP1" ) ( "DP1" ) to KScreen::Output(Id: 67 , Name: "DP1" ) ( "DP1" )
Oct 18 13:37:00 fekpc kernel: [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
Comment 73 Jan 2016-10-25 14:34:09 UTC
I can confirm this problem on my Dell Latitude E7270 with the 4.8.4 Kernel. This problem only happens when connected via a DisplayPort cable. I do not get the flicker with a HDMI cable.
Comment 74 lesiehnie 2016-10-27 11:24:56 UTC
I want to confirm this bug on a Dell XPS 13 9350; it probably also appears with no external screen connected.
Comment 75 Doa379 2016-10-29 14:12:26 UTC
I have noticed that this bug still exists with Kernel 4.8 compiled with Intel IOMMU DMA Remapping enabled by default.

Device is a Dell 7370 with Intel Skylake m7-6Y75.
Comment 76 Jan 2016-10-31 09:57:39 UTC
(In reply to Jan from comment #73)

I have to correct my previous comment. I do get the glitches on all possible outputs: eDP, DP, HDMI

The glitch behaves differently though.

HDMI and eDP: The screen gets distorted for a second. It looks like an old CRT exposed to a magnet. But only for a second.

DP: The screen goes black for 1-2 seconds.
Comment 78 syphyr 2016-11-03 21:55:25 UTC
(In reply to Jan from comment #76)
> (In reply to Jan from comment #73)
> 
> I have to correct my previous comment. I do get the glitches on all possible
> outputs: eDP, DP, HDMI
> 
> The glitch behaves different though.
> 
> HDMI and eDP: The screen gets distorted for a second. It looks like an old
> CRT exposed to a magnet. But only for a second.
> 
> DP: The screen gets black for 1-2 seconds.

When enabling the Intel IOMMU, I had to use this option:

intel_iommu=igfx_off

With the Intel IOMMU completely enabled, I had memory errors in dmesg while watching videos. The Intel IOMMU does not work correctly with some of the integrated Intel graphics devices.
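
When comparing setups in this thread, it can help to confirm which options a machine actually booted with (for example intel_iommu=igfx_off or i915.enable_rc6=0). Below is a minimal Python sketch, added for illustration only: the function names are hypothetical, and it assumes the simple whitespace-separated key=value form of /proc/cmdline, so quoted values containing spaces are not handled.

```python
def cmdline_options(cmdline):
    """Split a kernel command line into {option: value}.

    Options without '=' (such as 'quiet') map to None.
    """
    options = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        options[key] = value if sep else None
    return options


def read_cmdline(path="/proc/cmdline"):
    """Parse the command line of the running kernel."""
    with open(path) as f:
        return cmdline_options(f.read())
```

For instance, `read_cmdline().get("intel_iommu")` would return "igfx_off" on a machine booted with the option above, and None if the option was not given.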
Comment 79 Mauro Santos 2016-11-03 22:17:32 UTC
(In reply to syphyr from comment #77)
> The pipe underruns are now fixed in latest stable 4.8.6 kernel with the
> following commits;
> 
> https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=00dcbda55115994c1feb6bbfe2c4ef21de7c59fb
> 
> https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=d64cdbd9291fbc569ba6a5ccef1dd697a10f8d20
> 
> https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=44e97ca6fb778ba7586cd5ad34afa0f789f88e17

You mean that the latest stable 4.8.6 kernel already has these fixes or that they will be included in 4.8.7?

Because if they are included in 4.8.6 they don't fix the problem, I'm still seeing "[drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun" quite frequently with the accompanying screen flicker.
Comment 80 Peter Wu 2016-11-03 23:00:32 UTC
(In reply to Mauro Santos from comment #79)
> (In reply to syphyr from comment #77)
> > The pipe underruns are now fixed in latest stable 4.8.6 kernel with the
> > following commits;
[..]
> You mean that the latest stable 4.8.6 kernel already has these fixes or that
> they will be included in 4.8.7?

"Fixed in the latest stable 4.8.6 kernel".
If you follow the links, you see the message "commit xyz upstream", which hints that it got backported. See also the changelog in https://lwn.net/Articles/705130/

> Because if they are included in 4.8.6 they don't fix the problem, I'm still
> seeing "[drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A
> FIFO underrun" quite frequently with the accompanying screen flicker.

Under what circumstances? Any special cmdline, patches or userspace packages? So far it seems to work for me (i7-6700HQ, Arch Linux, 4.8.6-1).
Comment 81 Mauro Santos 2016-11-03 23:50:43 UTC
(In reply to Peter Wu from comment #80)
> Under what circumstances? Any special cmdline, patches or userspace
> packages? So far it seems to work for me (i7-6700HQ, Arch Linux, 4.8.6-1).

I'm also on Arch with kernel 4.8.6-1. I have an i7-6500U.

I guess the only "special" thing I have on the cmdline is intel_iommu=on, but I don't see any memory errors as reported by syphyr.

I have a way to avoid the problems, but that will limit the package c-state to PC2, let me try to explain.

If I do not modify any of the kernel's defaults, the cpu never gets to a package c-state lower than PC2 (or higher, whichever way you want to see it) (1).

If I enable sata ALPM then the cpu will go into PC6 and that is when I start to see the problem.

Further, if I turn on auto suspend for a usb bluetooth device, the cpu will now get into PC7 and I also see the problem in this case. I have not been able to get the cpu to go into PC8~PC10 even when doing all the changes recommended by powertop.

(1) The cpu's datasheet states clearly that "Long term reliability cannot be assured unless all the Low-Power Idle States are enabled.". I'm not sure if this applies to package c-states but even if it doesn't I would like the cpu to be as efficient as possible so that I get more time when using battery and so that there is less heat and noise.
Comment 82 Daniel Schulte 2016-11-04 09:19:46 UTC
I am also still seeing this bug. I am on Arch 4.8.6-1 on a Xeon E3-1245v5.
I don't have anything CPU/GPU performance or power saving related in my kernel cmdline.

I am also still seeing
[50809.620559] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
[50809.620599] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun

in the dmesg output although it is less than before. It also seems that at least on my system the bug triggers less often than with kernel version 4.8.4.
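
To compare frequency between kernel versions, the underrun lines can be counted per pipe. This is only a sketch: the regular expression is derived from the two lines pasted above, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Pattern taken from the message text pasted in this thread.
UNDERRUN_RE = re.compile(r"\*ERROR\* CPU pipe ([A-Z]) FIFO underrun")


def count_underruns(dmesg_text: str) -> Counter:
    """Count FIFO underrun messages per pipe letter."""
    return Counter(UNDERRUN_RE.findall(dmesg_text))


sample = """\
[50809.620559] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
[50809.620599] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
"""
print(dict(count_underruns(sample)))
```

Running it over the saved dmesg output of each kernel gives numbers that can be compared directly, as long as the uptimes are similar.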
Comment 83 syphyr 2016-11-05 12:24:38 UTC
(In reply to Mauro Santos from comment #81)
> (In reply to Peter Wu from comment #80)
> > Under what circumstances? Any special cmdline, patches or userspace
> > packages? So far it seems to work for me (i7-6700HQ, Arch Linux, 4.8.6-1).
> 
> I'm also on Arch with kernel 4.8.6-1. I have an i7-6500U.
> 
> I guess the only "special" thing I have on the cmdline is intel_iommu=on,
> but I don't see any memory errors as reported by syphyr.
> 
> I have a way to avoid the problems, but that will limit the package c-state
> to PC2, let me try to explain.
> 
> If I do not do modifications to any of the kernel's defaults, the cpu never
> gets to a package c-state lower than PC2 (1)(or higher, whichever way you
> want to see it).
> 
> If I enable sata ALPM then the cpu will go into PC6 and that is when I
> start to see the problem.
> 
> Further, if I turn on auto suspend for a usb bluetooth device, the cpu will
> now get into PC7 and I also see the problem in this case. I have not been
> able to get the cpu to go into PC8~PC10 even when doing all the changes
> recommended by powertop.
> 
> (1) The cpu's datasheet states clearly that "Long term reliability cannot be
> assured unless all the Low-Power Idle States are enabled.". I'm not sure if
> this applies to package c-states but even if it doesn't I would like the cpu
> to be as efficient as possible so that I get more time when using battery
> and so that there is less heat and noise.

Could you try replacing "intel_iommu=on" with:
intel_iommu=igfx_off

I also still have problems with intel_iommu=on.
Comment 84 Mauro Santos 2016-11-05 13:55:11 UTC
(In reply to syphyr from comment #83)
> Could you try replacing "intel_iommu=on" with:
> intel_iommu=igfx_off
> 
> I also still have problems with intel_iommu=on.

I have been trying something along those lines. So far I've given it a good afternoon with intel_iommu=on,igfx_off and I didn't see any problems; now I'm testing without intel_iommu in the cmdline and I expect to see no problems.

If the problem is not solved, it is much harder to trigger now. However, partially disabling the iommu kind of defeats one of the main points of using it: restricting devices from having full unchecked access to the whole ram.

On another note, if I'm not wrong, using intel_iommu=igfx_off is the same as having the iommu completely off so you may want to check dmesg and make sure if the iommu is on or not.
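
To make the difference between the two command lines explicit, here is a small sketch that lists which intel_iommu= values were requested. It assumes the option is a comma-separated list, as in the cmdline quoted above; the function name and the root= value are placeholders. Whether igfx_off alone leaves the iommu enabled depends on the kernel's build-time default, which this sketch does not try to guess.

```python
from typing import List


def intel_iommu_options(cmdline: str) -> List[str]:
    """Collect every comma-separated value passed to intel_iommu=."""
    options: List[str] = []
    for token in cmdline.split():
        if token.startswith("intel_iommu="):
            value = token.split("=", 1)[1]
            options.extend(v for v in value.split(",") if v)
    return options


# Placeholder command lines; only the intel_iommu= part comes from this thread.
for cmdline in ("root=/dev/sda2 ro intel_iommu=on,igfx_off",
                "root=/dev/sda2 ro intel_iommu=igfx_off"):
    opts = intel_iommu_options(cmdline)
    print(opts, "-> 'on' requested:", "on" in opts)
```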
Comment 85 syphyr 2016-11-05 14:07:11 UTC
(In reply to Mauro Santos from comment #84)
> (In reply to syphyr from comment #83)
> > Could you try replacing "intel_iommu=on" with:
> > intel_iommu=igfx_off
> > 
> > I also still have problems with intel_iommu=on.
> 
> I have been trying something along those lines, so far I've given a good
> afternoon with intel_iommu=on,igfx_off and I didn't see any problems, now
> I'm testing without intel_iommu in the cmdline and I expect to see no
> problems.
> 
> If the problem is not solved it is much harder to trigger now, however
> partially disabling the iommu kind of defeats one of the main points of
> using it, restricting devices from having full unchecked access to the whole
> ram.
> 
> On another note, if I'm not wrong, using intel_iommu=igfx_off is the same as
> having the iommu completely off so you may want to check dmesg and make sure
> if the iommu is on or not.

Using intel_iommu=igfx_off is not the same as having the Intel IOMMU completely off. I am still seeing this using "igfx_off":

[    0.076183] DMAR: Host address width 39
[    0.076184] DMAR: DRHD base: 0x000000fed90000 flags: 0x0
[    0.076188] DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap 1c0000c40660462 ecap 7e3ff0505e
[    0.076189] DMAR: DRHD base: 0x000000fed91000 flags: 0x1
[    0.076193] DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap d2008c40660462 ecap f050da
[    0.076194] DMAR: RMRR base: 0x0000008aca9000 end: 0x0000008acc8fff
[    0.076194] DMAR: RMRR base: 0x0000008b800000 end: 0x0000008fffffff
[    0.076196] DMAR-IR: IOAPIC id 2 under DRHD base  0xfed91000 IOMMU 1
[    0.076197] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
[    0.076198] DMAR-IR: x2apic is disabled because BIOS sets x2apic opt out bit.
[    0.076198] DMAR-IR: Use 'intremap=no_x2apic_optout' to override the BIOS setting.
[    0.077630] DMAR-IR: Enabled IRQ remapping in xapic mode
Comment 86 Mauro Santos 2016-11-05 14:47:07 UTC
(In reply to syphyr from comment #85)
> [    0.076183] DMAR: Host address width 39
> [    0.076184] DMAR: DRHD base: 0x000000fed90000 flags: 0x0
> [    0.076188] DMAR: dmar0: reg_base_addr fed90000 ver 1:0 cap
> 1c0000c40660462 ecap 7e3ff0505e
> [    0.076189] DMAR: DRHD base: 0x000000fed91000 flags: 0x1
> [    0.076193] DMAR: dmar1: reg_base_addr fed91000 ver 1:0 cap
> d2008c40660462 ecap f050da
> [    0.076194] DMAR: RMRR base: 0x0000008aca9000 end: 0x0000008acc8fff
> [    0.076194] DMAR: RMRR base: 0x0000008b800000 end: 0x0000008fffffff
> [    0.076196] DMAR-IR: IOAPIC id 2 under DRHD base  0xfed91000 IOMMU 1
> [    0.076197] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
> [    0.076198] DMAR-IR: x2apic is disabled because BIOS sets x2apic opt out
> bit.
> [    0.076198] DMAR-IR: Use 'intremap=no_x2apic_optout' to override the BIOS
> setting.
> [    0.077630] DMAR-IR: Enabled IRQ remapping in xapic mode

As far as I know, if you don't see "kernel: DMAR: IOMMU enabled" the iommu is not being used, even though you see the confirmation "kernel: DMAR: Disable GFX device mapping" that the igfx_off option has been accepted.

I might be wrong though, if someone else could chime in and confirm how it works it would be great.
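
In the meantime, a minimal sketch for checking a saved dmesg for the messages discussed here. The two DMAR strings are the ones quoted in this comment and the DMAR-IR string is from the log in comment #85; treating them as reliable markers is an assumption, and the function name is hypothetical.

```python
def iommu_markers(dmesg_text: str) -> dict:
    """Report which of the DMAR messages discussed above appear in a log."""
    return {
        "iommu_enabled": "DMAR: IOMMU enabled" in dmesg_text,
        "gfx_mapping_disabled": "DMAR: Disable GFX device mapping" in dmesg_text,
        "irq_remapping": "DMAR-IR: Enabled IRQ remapping" in dmesg_text,
    }


# Two lines taken from the log in comment #85.
sample = """\
[    0.076183] DMAR: Host address width 39
[    0.077630] DMAR-IR: Enabled IRQ remapping in xapic mode
"""
print(iommu_markers(sample))
```

Interrupt remapping (the DMAR-IR lines) and DMA remapping are reported separately here because the log in comment #85 shows the former without showing either of the other two strings.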
Comment 87 Dominik Klementowski 2016-11-15 22:54:47 UTC
Created attachment 128001 [details]
dmesg

I think I'm experiencing the same issue.
Screen flickers, it can be reduced by passing i915.rc6=0 and (or) i915.psr=0, but not completely removed.

Having error messages like
[drm:drm_edid_block_valid [drm]] *ERROR* EDID checksum is invalid, remainder is 140

The weirdest thing is that I get the following notices with no external screen connected, and DP-1 is my VGA connector.
[ 4410.919992] [drm] HPD interrupt storm detected on connector DP-1: switching from hotplug detection to polling

My hardware/software summary:
System:    Host: acer Kernel: 4.8.7-1-ARCH x86_64 (64 bit gcc: 6.2.1)
           Desktop: KDE Plasma 5.8.3 (Qt 5.7.0) Distro: Antergos Linux
Machine:   Device: laptop System: Acer product: Aspire E5-574 v: V1.14
           Mobo: Acer model: Zoro_SL v: V1.14 UEFI: Insyde v: V1.14 date: 03/04/2016
Battery    BAT1: charge: 17.7 Wh 71.9% condition: 24.7/25.0 Wh (99%)
           model: SANYO AL15A32 status: Discharging
CPU:       Dual core Intel Core i5-6200U (-HT-MCP-) cache: 3072 KB
           flags: (lm nx sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx) bmips: 9602
           clock speeds: max: 2700 MHz 1: 499 MHz 2: 499 MHz 3: 499 MHz 4: 499 MHz
Graphics:  Card: Intel HD Graphics 520 bus-ID: 00:02.0
           Display Server: X.Org 1.18.4 driver: N/A Resolution: 1920x1080@60.05hz
           GLX Renderer: Mesa DRI Intel HD Graphics 520 (Skylake GT2)
           GLX Version: 3.0 Mesa 13.0.1 Direct Rendering: Yes
Audio:     Card Intel Sunrise Point-LP HD Audio driver: snd_hda_intel bus-ID: 00:1f.3
           Sound: Advanced Linux Sound Architecture v: k4.8.7-1-ARCH
Network:   Card-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
           driver: r8169 v: 2.3LK-NAPI port: 3000 bus-ID: 01:00.0
           IF: enp1s0 state: down mac: <filter>
           Card-2: Qualcomm Atheros QCA9377 802.11ac Wireless Network Adapter
           driver: ath10k_pci bus-ID: 02:00.0
           IF: wlp2s0 state: up mac: <filter>
Drives:    HDD Total Size: 253.4GB (26.6% used)
           ID-1: /dev/sda model: Intenso_SSD_Sat size: 253.4GB
Partition: ID-1: / size: 19G used: 12G (61%) fs: ext4 dev: /dev/sda2
           ID-2: /home size: 213G used: 52G (25%) fs: ext4 dev: /dev/sda3
Sensors:   System Temperatures: cpu: 41.5C mobo: N/A
           Fan Speeds (in rpm): cpu: N/A
Info:      Processes: 171 Uptime: 36 min Memory: 1669.3/7854.7MB Init: systemd Gcc sys: 6.2.1
           Client: Shell (bash 4.4.01) inxi: 2.3.4
Comment 88 Dominik Klementowski 2016-11-16 22:11:53 UTC
Created attachment 128027 [details]
dmesg on current DRM Intel Nightly

I installed current linux-drm-intel-nightly. Problem is still there but dmesg messages changed a little.

In the previous boot I also had this:
[    3.947456] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=501 end=502) time 508 us, min 1073, max 1079, scanline start 1064, end 1099
Comment 89 Jonas Grabber 2016-11-24 14:55:26 UTC
$  ~ lscpu | grep name 
Model name:            Intel(R) Core(TM) i5-6300U CPU @ 2.40GHz

$  ~ uname -a
Linux jgrabber-PC 4.8.10-040810-generic #201611210531 SMP Mon Nov 21 10:33:06 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

$  ~ cat /etc/modprobe.d/i915.conf
options i915 enable_fbc=0

$  ~ cat /usr/share/X11/xorg.conf.d/20-intel.conf 
Section "Device"
        Identifier  "Intel Graphics"
        Driver      "intel"
        Option      "Backlight"  "intel_backlight"
        Option      "AccelMethod"  "sna"
        Option      "TearFree"  "true"
        Option      "DRI"  "3"
        BusID       "PCI:0:2:0"
EndSection

$  ~ dmesg | grep -c 'ERROR'      
0

No flickering, no input lag in Slack/chromium/chrome. So far so good.
Kernel from http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.8.10/
Comment 90 Joe Doss 2016-12-01 20:45:18 UTC
This is still happening 

Dec 01 14:35:44 sts135 kernel: [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun

$ uname -r
4.8.11-300.fc25.x86_64

$ lscpu |grep "Model name"
Model name:            Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz

Booted the kernel with i915.enable_psr=0 i915.rc6=0

The external monitor connected via DisplayPort will randomly blank out entirely for 1 to 3 seconds, and kernel: [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun shows up in the journal.
Comment 91 R Janssen 2016-12-04 17:24:35 UTC
Dec  4 18:04:38 NUC kernel: [24494.620426] [drm:gen8_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
Dec  4 18:13:10 NUC kernel: [25007.102488] [drm:gen8_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun

uname -a
Linux NUC 4.8.2-040802-generic #201610161339 SMP Sun Oct 16 17:41:46 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

lscpu | grep name
Model name:            Intel(R) Core(TM) i5-6260U CPU @ 1.80GHz
Comment 92 Dominik Klementowski 2016-12-04 20:29:53 UTC
Latest intel-drm-nightly solved a lot of issues for my graphics - Intel HD 520 (Skylake GT2) with Core i5-6200U (it's Acer E5 laptop). 

BOOT_IMAGE=/boot/vmlinuz-4.9.0-994-generic root=... ro i915.enable_rc6=0 quiet splash

It has been running for 24 hours now (it was suspended at night) and I haven't seen any flickering or glitches so far. Previously the graphics was garbage (flickering, glitches, sometimes a black screen at boot, weird things after connecting/disconnecting an external screen), and now it feels stable. Also there is nothing negative about Intel DRM in dmesg! :)

I recommend installing the Intel DRM Nightly kernel to anyone who's still experiencing such symptoms. I'm curious whether more Skylake cards are fixed by the latest commits.
Comment 93 Victor Trac 2016-12-04 20:35:27 UTC
Yup. It seems like the visual problems have gone away for the last few weeks with drm-intel-nightly. I still get a ton of error messages in dmesg:

[12766.779430] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=98647 end=98648) time 162 us, min 1788, max 1799, scanline start 1785, end 1803
[12781.817306] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=99549 end=99550) time 147 us, min 1788, max 1799, scanline start 1787, end 1804
[12836.900699] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=102853 end=102854) time 151 us, min 1788, max 1799, scanline start 1785, end 1803
[12892.084106] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=106163 end=106164) time 133 us, min 1788, max 1799, scanline start 1785, end 1800
[13127.405640] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=120278 end=120279) time 151 us, min 1788, max 1799, scanline start 1786, end 1803


However, they don't seem to correlate to any tearing or other video problems.
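
If someone wants to look at these messages in bulk, the fields can be pulled out with a regular expression. This is a sketch based only on the line format pasted above; the function and key names are made up, and it does not assume anything about what the min/max values mean.

```python
import re

# Pattern follows the message text pasted in this thread.
ATOMIC_RE = re.compile(
    r"Atomic update failure on pipe (?P<pipe>[A-Z]) "
    r"\(start=(?P<start>\d+) end=(?P<end>\d+)\) "
    r"time (?P<time_us>\d+) us, min (?P<min>\d+), max (?P<max>\d+), "
    r"scanline start (?P<scan_start>\d+), end (?P<scan_end>\d+)"
)


def parse_atomic_failures(dmesg_text: str) -> list:
    """Return one dict per 'Atomic update failure' line, numbers as ints."""
    rows = []
    for match in ATOMIC_RE.finditer(dmesg_text):
        row = {key: (value if key == "pipe" else int(value))
               for key, value in match.groupdict().items()}
        rows.append(row)
    return rows


sample = ("[12766.779430] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update "
          "failure on pipe A (start=98647 end=98648) time 162 us, min 1788, "
          "max 1799, scanline start 1785, end 1803")
for row in parse_atomic_failures(sample):
    print(row)
```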
Comment 94 Paulo Zanoni 2016-12-06 13:32:12 UTC
(In reply to Dominik Klementowski from comment #92)
> Latest intel-drm-nightly solved a lot of issues for my graphics - Intel HD
> 520 (Skylake GT2) with Core i5-6200U (it's Acer E5 laptop). 
> 
> BOOT_IMAGE=/boot/vmlinuz-4.9.0-994-generic root=... ro i915.enable_rc6=0
> quiet splash

Why are you using i915.enable_rc6=0? Please test again without that.
Comment 95 Dominik Klementowski 2016-12-06 17:01:41 UTC
Created attachment 128357 [details]
dmesg.txt

I'll try with rc6 enabled and give you feedback in a day or two.

By the way, I noticed some black flashes with HDMI connected, but nothing interesting in dmesg (attachment). I didn't see them without an external screen, even when I watched movies the whole day.
Comment 96 Dominik Klementowski 2016-12-06 21:25:49 UTC
(In reply to Paulo Zanoni from comment #94)
> (In reply to Dominik Klementowski from comment #92)
> > Latest intel-drm-nightly solved a lot of issues for my graphics - Intel HD
> > 520 (Skylake GT2) with Core i5-6200U (it's Acer E5 laptop). 
> > 
> > BOOT_IMAGE=/boot/vmlinuz-4.9.0-994-generic root=... ro i915.enable_rc6=0
> > quiet splash
> 
> Why are you using i915.enable_rc6=0? Please test again without that.

OK, I can confirm now: without the i915.enable_rc6=0 parameter, flickering is still present and pretty frequent (more than once per minute, I guess).
I have today's intel-drm-nightly kernel.
dmesg is clear. I only have some pcie errors

[    3.682031] pcieport 0000:00:1c.5: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=00e5(Receiver ID)
[    3.682036] pcieport 0000:00:1c.5:   device [8086:9d15] error status/mask=00000001/00002000
[    3.682038] pcieport 0000:00:1c.5:    [ 0] Receiver Error         (First)

I'm not sure if this could be related. I can get rid of those errors using "pci=nomsi" boot param, but this causes another issue - graphics performance gets weak after resume from suspend. (I have only integrated Intel graphics, but I think this laptop has PCI-e port, because there are variants of the same laptop with some NVIDIA GPUs).
Comment 97 Paulo Zanoni 2016-12-07 15:44:03 UTC
(In reply to Dominik Klementowski from comment #96)
> (In reply to Paulo Zanoni from comment #94)
> > (In reply to Dominik Klementowski from comment #92)
> > > Latest intel-drm-nightly solved a lot of issues for my graphics - Intel HD
> > > 520 (Skylake GT2) with Core i5-6200U (it's Acer E5 laptop). 
> > > 
> > > BOOT_IMAGE=/boot/vmlinuz-4.9.0-994-generic root=... ro i915.enable_rc6=0
> > > quiet splash
> > 
> > Why are you using i915.enable_rc6=0? Please test again without that.
> 
> Ok, I can confirm now - without parameter i915.enable_rc6=0 flickering is
> still present and pretty often (more than once per minute I guess).
> I have today's intel-drm-nightly kernel.
> dmesg is clear. I only have some pcie errors

Wow. Can you please boot with drm.debug=0x1e i915.enable_rc6=1 and then retest, reproduce the problem, and attach the dmesg output here? (don't forget the drm.debug)

Also, it would be good to know the exact model of your laptop. We're trying to find which machines have these flickering issues. Your dmesg says "Aspire E5-574", but it looks like there are multiple models for the E5-574 laptop, so I'd like to know the "full name" in case you have it.

Also, can you please do another test? Just "sudo mv /lib/firmware/i915 /root/backup-lib-firmware-i915", then reboot and check if the problem persists.

All tests should be *without* the i915.enable_rc6=0 option.

Thanks,
Paulo
Comment 98 Paulo Zanoni 2016-12-07 15:47:06 UTC
(In reply to Paulo Zanoni from comment #97)
> (In reply to Dominik Klementowski from comment #96)
> > (In reply to Paulo Zanoni from comment #94)
> > > (In reply to Dominik Klementowski from comment #92)
> > > > Latest intel-drm-nightly solved a lot of issues for my graphics - Intel HD
> > > > 520 (Skylake GT2) with Core i5-6200U (it's Acer E5 laptop). 
> > > > 
> > > > BOOT_IMAGE=/boot/vmlinuz-4.9.0-994-generic root=... ro i915.enable_rc6=0
> > > > quiet splash
> > > 
> > > Why are you using i915.enable_rc6=0? Please test again without that.
> > 
> > Ok, I can confirm now - without parameter i915.enable_rc6=0 flickering is
> > still present and pretty often (more than once per minute I guess).
> > I have today's intel-drm-nightly kernel.
> > dmesg is clear. I only have some pcie errors
> 
> Wow. Can you please boot with drm.debug=0x1e i915.enable_rc6=1 and then
> retest, reproduce the problem, and attach the dmesg output here? (don't
> forget the drm.debug)

Please also use the log_buf_len=1M option in addition to the others.

> 
> Also, it would be good to know the exact model of your laptop. We're trying
> to find which machines have these flickering issues. Your dmesg says "Aspire
> E5-574", but it looks like there are multiple models for the E5-574 laptop,
> so I'd like to know the "full name" in case you have it.
> 
> Also, can you please do another test? Just "sudo mv /lib/firmware/i915
> /root/backup-lib-firmware-i915", then reboot and check if the problem
> persists.
> 
> All tests should be *without* the i915.enable_rc6=0 option.
> 
> Thanks,
> Paulo
Comment 99 Dominik Klementowski 2016-12-07 23:45:13 UTC
Created attachment 128373 [details]
dmesg with i915 debug mode both with and without firmware files

First of all - my laptop has a label on it that says "E5-574-524L".
Also, on the producer's page I can identify this model (to download Windows drivers) with SNID: 60900127176.
More about my hardware is here: http://pastebin.com/86n69E56

I've done as you said and in this package there is dmesg.txt
BOOT_IMAGE=/boot/vmlinuz-4.9.0-994-generic root=UUID=... ro drm.debug=0x1e i915.enable_rc6=1 quiet splash vt.handoff=7

I also moved firmware files - just as you said - and flickering persists. The log is in file dmesg-no-firmware.txt

It's probably less frequent than it was on the stable kernel with rc6 enabled. Sometimes I don't see anything for minutes, even now while writing this answer.
Besides the black screen or black rectangles, I also noticed horizontal lines on random parts of the screen.
Moreover, some parts of the screen get displayed somewhere else. All of this happens very briefly, for about the time of an eye blink.

Feel free to ask me for any tests.
Comment 100 Paulo Zanoni 2016-12-08 15:54:59 UTC
(In reply to Dominik Klementowski from comment #99)
> Created attachment 128373 [details]
> dmesg with i915 debug mode both with and without firmware files
> 
> First of all - my laptop has label on it and there is "E5-574-524L".
> Also, on producer page I can identify this model (to download Windows
> drivers) with SNID: 60900127176.
> More about my hardware is here: http://pastebin.com/86n69E56
> 
> I've done as you said and in this package there is dmesg.txt
> BOOT_IMAGE=/boot/vmlinuz-4.9.0-994-generic root=UUID=... ro drm.debug=0x1e
> i915.enable_rc6=1 quiet splash vt.handoff=7
> 
> I also moved firmware files - just as you said - and flickering persists.
> The log is in file dmesg-no-firmware.txt
> 
> It's probably less frequent than it was on stable kernel with rc6 enabled.
> Sometimes I don't see anything for minutes - even now when writing this
> answer.
> Except for black screen or black rectangles I also noticed horizontal lines
> on some random parts of screen.
> Moreover some parts of screen being displayed somewhere else. All this
> happens very shortly - like for time of eye blink.
> 
> Feel free to ask me for any tests.

Hi

Unfortunately these logs are not useful since they didn't include the moment where i915.ko got loaded: too much stuff got printed. Please try again with "drm.debug=0xe log_buf_len=1M i915.enable_rc6=1".

Thanks for the help!
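
For reference, the difference between the two drm.debug values is a single bit. The sketch below decodes the mask; the bit-to-category table is an assumption written from memory of the kernel's DRM debug definitions for kernels of this era and should be checked against the headers of the kernel actually in use. The function name is hypothetical.

```python
# Assumed category bits (from memory of the kernel's DRM debug definitions;
# verify against the headers of the kernel you are running).
DRM_DEBUG_BITS = {
    0x01: "core",
    0x02: "driver",
    0x04: "kms",
    0x08: "prime",
    0x10: "atomic",
    0x20: "vbl",
}


def decode_drm_debug(value: int) -> list:
    """List the debug categories enabled by a drm.debug bitmask."""
    return [name for bit, name in DRM_DEBUG_BITS.items() if value & bit]


for mask in (0x1e, 0xe):
    print(hex(mask), decode_drm_debug(mask))
```

Under that assumption, 0xe drops only the category behind bit 0x10 compared to 0x1e, which is consistent with the smaller value producing less output.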
Comment 101 Dominik Klementowski 2016-12-11 18:06:33 UTC
Created attachment 128413 [details]
dmesg

I'm sorry for the late answer, I was busy.
Yes, I forgot to add the parameter to make the log buffer bigger.
Attachment contains dmesg messages, current intel-drm-nightly is used and kernel was loaded with params:
drm.debug=0xe log_buf_len=2M i915.enable_rc6=1

Uptime says the system booted 15 minutes ago and I saw a black blink only once, so not that bad at all.
Comment 102 Paulo Zanoni 2016-12-12 20:45:48 UTC
Created attachment 128439 [details] [review]
Possible fix, patch 1.
Comment 103 Paulo Zanoni 2016-12-12 20:46:13 UTC
Created attachment 128440 [details] [review]
Possible fix, patch 2.
Comment 104 Paulo Zanoni 2016-12-12 20:48:29 UTC
(In reply to Dominik Klementowski from comment #101)
> Created attachment 128413 [details]
> dmesg
> 
> I'm sorry for the late answer as I was busy.
> Yes I forgot to add parameter to make log buffer bigger.
> Attachment contains dmesg messages, current intel-drm-nightly is used and
> kernel was loaded with params:
> drm.debug=0xe log_buf_len=2M i915.enable_rc6=1
> 
> Uptime says system was loaded 15 minutes ago and I saw black blink only once
> so not that bad at all.

Thanks a lot for this! While looking at the log files I spotted one thing about your machine that's different from my machine (where I can't reproduce the problem), so I decided to investigate and found one possible issue. Can you please test whether the two patches I attached fix the problem?

Just apply both patches, then test. If the problem goes away, it would be good to revert patch 2 and test to see if the problem also goes away with just patch 1 applied.

Thanks a lot for the help!
Comment 105 Mauro Santos 2016-12-12 21:37:00 UTC
Might these two patches help with screen flickering only when intel_iommu=on is used? (see https://bugs.freedesktop.org/show_bug.cgi?id=94605#c79 and following comments)

If yes, would it be ok to test against 4.8.x or does it need to be against drm-next?
Comment 106 Paulo Zanoni 2016-12-13 10:10:21 UTC
(In reply to Mauro Santos from comment #105)
> Might these two patches help with screen flickering only when intel_iommu=on
> is used? (see https://bugs.freedesktop.org/show_bug.cgi?id=94605#c79 and
> following comments)
> 
> If yes, would it be ok to test against 4.8.x or does it needs to be against
> drm-next?

I suggest you try it, just in case it does help.

If your issue really only happens with intel_iommu=on, please open a new bug report called something like "Skylake screen flickering only with intel_iommu=on", and attach the relevant logs (boot with drm.debug=0xe log_buf_len=1M, reproduce the problem, grab the dmesg output).

As a note, it's really really hard to deal with these bug reports with too many people describing similar-but-still-different symptoms. I'd really rather have 100 separate bug reports than a bug with 100 different people saying "me too".

Thanks everybody for the help!
Comment 107 Dominik Klementowski 2016-12-13 10:24:20 UTC
I compiled the latest stable Linux 4.9 with both patches and I didn't see any flickering for two hours running with i915.enable_rc6=1 with HDMI connected - so far so good. Unfortunately there are other bugs (like glitches and a disappearing wallpaper) which were solved with intel-drm-nightly, so I need to compile again.

Let me test this for a few days. I'll try different configurations, also without the second patch, and then I'll let you know. Thank you very much for your work!
Comment 108 Dominik Klementowski 2016-12-14 00:15:11 UTC
Yeah! The problem is gone for me with *only* patch 1. I have rc6 enabled and I haven't seen any flicker for a whole day.
Tested on Ubuntu 16.10 as well as current Arch Linux.
Thanks once again. Hope this gets into official stable kernel soon.
Comment 109 Paulo Zanoni 2016-12-14 12:58:49 UTC
(In reply to Dominik Klementowski from comment #108)
> Yeah! The problem is gone for me with *only* patch 1. I have rc6 enabled and
> I haven't seen any flicker for whole day.
> Tested on Ubuntu 16.10 as well as current Arch Linux.
> Thanks once again. Hope this gets into official stable kernel soon.

Thanks a lot for testing this! We really appreciate that.

A little comment: patch 1 prevents engine rings and FBC from being initialized in the first page of stolen memory. Without it, when the hardware writes to that page, it messes up the contents of the ring or whatever else is there, and on these things a single flipped bit can be a huge problem, so you see the screen flickering.

Patch 2 prevents us from using the first page of stolen as a frontbuffer inherited from the BIOS. The sort of corruption you'd see here would be distortions in the very first line of pixels on your monitor, when we're using that specific frontbuffer. I guess this is just way too hard to notice. But it would be cool if you could confirm that :). With double-buffering and other things it's probably going to be hard to spot this since not every frame will show it, and even when you're on the correct buffer you may not see the distortion or it may be so small that you'll need really really good eyes to notice it. But, still, my guess is that it's there :).
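
As a rough illustration of the idea behind patch 1 (not the actual i915 code), here is a toy allocator that either may or may not hand out the first page of a range. The class name, the bump-allocation strategy, and the 4096-byte page size are all assumptions made for this sketch.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class StolenRange:
    """Toy bump allocator over [0, size); not the i915 stolen-memory allocator."""

    def __init__(self, size: int, reserve_first_page: bool):
        self.size = size
        # With the reservation, nothing is ever placed in page 0.
        self.next_free = PAGE_SIZE if reserve_first_page else 0

    def alloc(self, length: int) -> int:
        """Return the start offset of a page-aligned allocation."""
        pages = -(-length // PAGE_SIZE)  # round up to whole pages
        start = self.next_free
        end = start + pages * PAGE_SIZE
        if end > self.size:
            raise MemoryError("range exhausted")
        self.next_free = end
        return start


unpatched = StolenRange(64 * PAGE_SIZE, reserve_first_page=False)
patched = StolenRange(64 * PAGE_SIZE, reserve_first_page=True)
print(unpatched.alloc(8192), patched.alloc(8192))
```

In the toy model the first object lands at offset 0 without the reservation, which is exactly the page the hardware is said to write to, and at offset 4096 with it.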
Comment 110 Dominik Klementowski 2016-12-18 12:26:07 UTC
I didn't see anything there. I tried to look very closely and I tried to zoom in on the top edge with a camera, but I still didn't notice anything. Maybe I can force this somehow?

Will patch 1 be merged into the kernel so I don't need to patch future releases?
Comment 111 Paulo Zanoni 2016-12-20 12:59:06 UTC
(In reply to Dominik Klementowski from comment #110)
> I didn't see anything there. I tried to look very close and I tried to zoom
> this top edge with camera, but anyway I didn't noticed anything. Maybe I can
> force this some way?
> 
> Will this patch1 be merged with kernel so I don't need to patch future
> releases?

The patch was just merged to our development tree. Patch 1 got marked for inclusion in the stable trees, so it should automatically land in your distro at some point.

Thanks a lot for the help testing this.
Comment 112 Paulo Zanoni 2016-12-20 13:12:47 UTC
So here's the problem with this bug report:

- It was created in March.
- The original bug reporter seems to be gone (no response to confirm/deny our fixes).
- There are a lot of people with "me too" in the bug report.
- 111 comments.
- Since then, we merged dozens of patches addressing problems that could cause Skylake screen flickering.

Usually I'd close the bug report if the original reporter confirmed his problem was solved, then I'd ask everybody else to file new bug reports in case their problems were not solved. Another process we have is that we close bugs if the reporter stops responding to us after some time.

Since I believe the problems of most people here should really be fixed in the latest development tree, and since the original bug reporter stopped responding, I'll just go ahead and mark this bug as fixed.

If you think the bug still happens to you, please make sure you test the latest drm-tip tree (https://cgit.freedesktop.org/drm-tip branch drm-tip), then open a new bug report describing your specific problem. Make sure you boot with drm.debug=0xe log_buf_len=1M, reproduce the problem, then attach the dmesg output.
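
Before attaching a log, the boot command line can be sanity-checked with something like the sketch below. It only looks for the two parameters requested here, plus the i915.enable_rc6=0 setting that comment #97 asked testers to drop; the function name is hypothetical and the example command line is adapted from one posted earlier in this bug.

```python
def check_debug_cmdline(cmdline: str) -> list:
    """Return a list of problems with a kernel command line for a debug boot."""
    params = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        params[key] = value

    problems = []
    if "drm.debug" not in params:
        problems.append("drm.debug is missing")
    if "log_buf_len" not in params:
        problems.append("log_buf_len is missing")
    if params.get("i915.enable_rc6") == "0":
        problems.append("i915.enable_rc6=0 is set")
    return problems


# Example adapted from a command line posted earlier in this bug.
example = ("BOOT_IMAGE=/boot/vmlinuz-4.9.0-994-generic ro "
           "drm.debug=0xe log_buf_len=1M quiet splash")
print(check_debug_cmdline(example))
```

On a running system the string to check would come from /proc/cmdline; an empty list means none of the three problems was found.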

It's much easier for us to work on multiple bug reports about the same bug instead of working on a single bug report with multiple actual bugs, so please help us get everything organized so we can focus more on solving the bugs instead of having to spend time trying to extract information from a 100+ comment bug report.

Thanks for your understanding,
Paulo
Comment 113 Rami 2017-02-03 14:05:54 UTC
Timeout: assuming this is fixed now, as many improvements have been pushed to the kernel. If this is not the case, please re-test with the latest drm-tip kernel (https://cgit.freedesktop.org/drm-tip branch drm-tip), then open a new bug.

