Summary: | [SKL] [regression] Random display flickering on Kernel 4.8 with dual-screen | |
---|---|---|---
Product: | DRI | Reporter: | Direx <direx>
Component: | DRM/Intel | Assignee: | Paulo Zanoni <przanoni>
Status: | CLOSED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: | blocker | |
Priority: | highest | CC: | bgamari, brettcsmith, carbonfreeze, intel-gfx-bugs, kolAflash, mike.auty, nospoonuser, przanoni, rockorequin, victor.trac
Version: | unspecified | Keywords: | regression
Hardware: | x86-64 (AMD64) | |
OS: | Linux (All) | |
Whiteboard: | | |
i915 platform: | SKL | i915 features: | display/watermark, power/Other, power/runtime PM
Attachments: |
|
I am on 4.8-rc4 now and it seems like the flickering could be related to these messages:

[21494.389755] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
[22199.026381] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe B (start=43427 end=43428) time 359 us, min 1192, max 1199, scanline start 1179, end 1206
[22585.897409] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe C FIFO underrun
[26998.328003] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe C FIFO underrun
[26998.999972] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun

There is still a very short random flickering about once per minute and a longer flickering every 5-10 minutes. In the latter case I am getting a pipe underrun in the kernel log.

Highest+Blocker due to Regression w/o workaround

Not reproduced with the last setup:

Setup:
======
Hardware Platform:
CPU : Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz (family: 6, model: 94 stepping: 3)
Software
Linux OS : Ubuntu 16.10 64 bits
Kernel: drm-intel-nightly: 2016y-09m-19d-20h-40m-51s UTC integration manifest
author: Daniel Vetter <daniel.vetter@ffwll.ch>
commit: 4c518aef024daa0223692124baa2d7399f54dd97
drm: libdrm-2.4.70-14-g0659558 from http://cgit.freedesktop.org/mesa/drm/
xorg-server-1.18.99.2 from git://git.freedesktop.org/git/xorg/xserver
mesa: mesa-12.0.0 78b061 from http://cgit.freedesktop.org/mesa/mesa/
cairo: tag 1.15.2 db8a7f1 from http://cgit.freedesktop.org/cairo
libva: libva-1.7.0-50-g7aa2dd9 from http://cgit.freedesktop.org/libva/
vaapi-intel-driver: 1.7.0-136-g36fbd81 from http://cgit.freedesktop.org/vaapi/intel-driver

Direx, please re-test and confirm if this issue is not occurring anymore now on your side.

With drm-intel-nightly the flickering is even worse than with 4.8-rc7. Only one of my displays flickers, but the flickering there is horrible.
After a while the display turns completely black for a few seconds (~5 seconds) and then seems to recover. But shortly after "coming back" the flickering re-appears.

FWIW, I used to have this problem (full screen flickering) in Ubuntu 16.10 with the 4.8 kernel up until 4.8-rc7, but now with both drm-intel-nightly (2016-09-26) and 4.8.0-17-generic I haven't seen it in an hour or so of use. I did see the unity top bar and the top part of the unity launcher flicker with drm-intel-nightly when I moved the mouse around in Firefox, but that might have been a compiz/unity issue and it isn't happening right now with 4.8.0-17-generic.

Does the problem go away if you revert the patch below?

05a76d3d6ad1ee9f9814f88949cc9305fc165460 is the first bad commit
commit 05a76d3d6ad1ee9f9814f88949cc9305fc165460
Author: Lyude <cpaul@redhat.com>
Date: Wed Aug 17 15:55:57 2016 -0400

drm/i915/skl: Ensure pipes with changed wms get added to the state

I spoke too soon: I'm still seeing full-screen flickering with 4.8.0-17-generic, just not as frequently as before. I'll try reverting the equivalent commit to 05a76d3d6ad1ee9f9814f88949cc9305fc165460 from the mainline kernel to see if that stops it.

I just saw a massive flicker on my eDP display with that commit reverted from the mainline kernel. It lasted around half a second and the screen was black with a bunch of colourful lines. There's a CPU pipe A FIFO underrun message in the syslog from about two minutes before the flicker, in case that's relevant.

Hello

Can you please confirm whether https://patchwork.freedesktop.org/patch/113642/ fixes the problem?

Thanks,
Paulo

@Paulo: I have been running 4.8.0 from git patched with https://patchwork.freedesktop.org/patch/113642/ for some hours now and so far I haven't seen this issue occur at all. (Thanks for the patch.)
In case it's of interest, there are some atomic update failure messages and one buffer under-run in dmesg:

[ 86.182013] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe B (start=3475 end=3476) time 103 us, min 1073, max 1079, scanline start 1072, end 1079
[ 7689.788816] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe B (start=124404 end=124405) time 105 us, min 1073, max 1079, scanline start 1072, end 1080
[10148.377778] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe B (start=55108 end=55109) time 139 us, min 1073, max 1079, scanline start 1071, end 1081
[10183.928593] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=57241 end=57242) time 107 us, min 2146, max 2159, scanline start 2145, end 2160
[10450.277698] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe B (start=73222 end=73223) time 143 us, min 1073, max 1079, scanline start 1070, end 1080
[12488.118382] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe B (start=3667 end=3668) time 103 us, min 1073, max 1079, scanline start 1072, end 1080
[14088.803554] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=4992 end=4993) time 393 us, min 2146, max 2159, scanline start 2127, end 2180
[15013.089253] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=60446 end=60447) time 102 us, min 2146, max 2159, scanline start 2145, end 2159
[15755.731898] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun

I can confirm this bug in 4.8.0 on Intel i7-6600U with Sky Lake Integrated Graphics. The flickering happens more often while I type in VIM.
I also see this in the logs:

[drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun

This did not happen with 4.7.6, 4.7.5, 4.7.4 nor 4.7.3

I just saw my eDP screen flicker even though I'm using my patched kernel (patched with https://patchwork.freedesktop.org/patch/113642/). Around the same time, this appeared in the syslog:

Oct 6 11:26:50 xps15-9550 kernel: [ 6296.557426] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun

So the problem isn't completely resolved by the patch, although its frequency is certainly reduced.

I also started experiencing this issue. I've seen most of the issues mentioned, plus the computer would sometimes freeze (screen is black and caps lock light does not toggle).
After applying the patch, I haven't had any problems yet (freezing, black screen, flickering).

$ lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation HD Graphics 520 (rev 07)
$ dmesg -l err -w
[ 3221.487848] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=87034 end=87035) time 301 us, min 763, max 767, scanline start 753, end 768
[ 4236.060368] tpm tpm0: A TPM error (325) occurred stopping the TPM
[ 4236.322922] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
[ 9858.626083] tpm tpm0: A TPM error (325) occurred stopping the TPM
[ 9858.915548] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
[ 9875.987934] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
[10794.506952] tpm tpm0: A TPM error (325) occurred stopping the TPM
[10794.779128] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
[10826.831623] tpm tpm0: A TPM error (325) occurred stopping the TPM
[10827.114490] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
[10862.320313] tpm tpm0: A TPM error (325) occurred stopping the TPM
[10867.941723] snd_hda_intel 0000:00:1f.3: azx_get_response timeout, switching to single_cmd mode: last cmd=0x206f0900
[10867.980104] snd_hda_codec_hdmi hdaudioC0D2: Unable to sync register 0x2f0d00. -5
[10868.176508] snd_hda_codec_realtek hdaudioC0D0: out of range cmd 0:20:400:ffffffff
[10868.195635] snd_hda_codec_realtek hdaudioC0D0: Unable to sync register 0x2b8000. -5
[10868.195769] snd_hda_codec_realtek hdaudioC0D0: Unable to sync register 0x2b8000. -5
[10869.673593] snd_hda_codec_realtek hdaudioC0D0: out of range cmd 0:20:400:ffffffff
[10909.607243] tpm tpm0: A TPM error (325) occurred stopping the TPM
[10909.892588] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
[10943.200346] tpm tpm0: A TPM error (325) occurred stopping the TPM
[10943.475999] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
[10958.114510] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
[10982.525656] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
[11002.254512] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
[11025.533533] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
[11054.397515] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
[11071.691503] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
[11346.956583] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun

The pipe underruns happen generally when I plug/unplug a monitor and/or move a mouse across monitor boundaries.

I have actually seen the flicker 4 or 5 times today now, even with the patched kernel (which is still better than every few minutes). This is after a suspend/resume cycle, in case that makes a difference - yesterday I ran the laptop continually from reboot.
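Several comments above compare how often underruns occur before and after a patch. As a rough aid for that kind of comparison, here is a minimal Python sketch that tallies FIFO underrun reports per pipe from saved dmesg text. The function name `count_underruns` and the regular expression are illustrative assumptions, written only against the message format quoted in this report; the sample lines are copied from the log above.

```python
import re
from collections import Counter

# Matches the underrun messages quoted in this report, for example:
# "[drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun"
UNDERRUN_RE = re.compile(r"\*ERROR\* CPU pipe ([A-C]) FIFO underrun")


def count_underruns(log_text):
    """Return a Counter mapping pipe letter to the number of underrun reports."""
    return Counter(UNDERRUN_RE.findall(log_text))


# Sample lines taken from the dmesg output quoted above.
SAMPLE = """\
[ 4236.322922] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
[ 9858.915548] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun
[ 9875.987934] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
[10794.506952] tpm tpm0: A TPM error (325) occurred stopping the TPM
"""

if __name__ == "__main__":
    for pipe, n in sorted(count_underruns(SAMPLE).items()):
        print(f"pipe {pipe}: {n} underrun(s)")
```

Note that other comments in this thread report flicker with no matching kernel message, so a low count from a script like this does not by itself show that the flicker is gone.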
Please re-test with Paulo's patch to apply memory workarounds for skylake: https://patchwork.freedesktop.org/series/13548/

(In reply to Paulo Zanoni from comment #10)
> Hello
>
> Can you please confirm whether
> https://patchwork.freedesktop.org/patch/113642/ fixes the problem?
>
> Thanks,
> Paulo

I've been testing this for a few days now (4.8.1 with your patch) and I have not seen the bad flickering ever since. At least my Skylake machine is usable again. But the flickering issue has not been resolved completely. Every ~10-15 minutes I am getting one of these messages in my system log, accompanied by a very short display flicker (on one of my screens):

[19523.129445] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe C (start=132656 end=132657) time 383 us, min 1043, max 1049, scanline start 1040, end 1065
[20898.089311] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun

(In reply to yann from comment #17)
> Please re-test with Paulo's patch to apply memory workarounds for skylake:
> https://patchwork.freedesktop.org/series/13548/

The patch does not apply on top of 4.8.1

> Please re-test with Paulo's patch to apply memory workarounds for skylake:
> https://patchwork.freedesktop.org/series/13548/
Is there a proposed version of these patches that will apply to 4.8.1? drm-intel-nightly has a separate issue where the unity toolbar often flickers annoyingly when I move the mouse around in Firefox so I'd rather test against mainline if that's possible.
(In reply to rockorequin from comment #20)
> > Please re-test with Paulo's patch to apply memory workarounds for skylake:
> > https://patchwork.freedesktop.org/series/13548/
>
> Is there a proposed version of these patches that will apply to 4.8.1?
> drm-intel-nightly has a separate issue where the unity toolbar often
> flickers annoyingly when I move the mouse around in Firefox so I'd rather
> test against mainline if that's possible.

Side note, we'd of course appreciate you reporting a separate bug on that *now* instead of waiting until 4.9 or 4.10 for it to hit you in mainline!

Created attachment 127253 [details] [review]
This is combined Paulo patchset that applies to 4.8.1

>This is combined Paulo patchset that applies to 4.8.1

I tried that patchset against 4.8.1 but I get this:

root@xps15-9550:/usr/src/linux-4.8.1# curl https://bugs.freedesktop.org/attachment.cgi?id=127253 > patches/drm-memory-patches.patch && patch -p1 --dry-run < patches/drm-memory-patches.patch
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 4533 100 4533 0 0 3225 0 0:00:01 0:00:01 --:--:-- 3226
checking file drivers/gpu/drm/i915/intel_pm.c
Hunk #2 FAILED at 2994.
Hunk #3 FAILED at 3011.
Hunk #4 FAILED at 3561.
Hunk #5 FAILED at 3608.
4 out of 5 hunks FAILED

> Side note, we'd of course appreciate you reporting a separate bug
> on that *now* instead of waiting until 4.9 or 4.10 for it to
> hit you in mainline!

Ok, I'm trying to reproduce it in drm-intel-nightly 4.8.0-994-generic #201610112342. So far no luck, though (here's hoping it's fixed).

> Can you please confirm whether
> https://patchwork.freedesktop.org/patch/113642/ fixes the problem?

Is that patchset in drm-intel-nightly 4.8.0-994-generic #201610112342? ie am I testing the patchset at the same time as trying to reproduce the other flickering issue?
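When checking whether a posted patchset applies to a given tree, it can help to summarise the `patch --dry-run` output rather than read it by eye. The following is a small Python sketch, assuming the output has been captured as text; the helper name `failed_hunks` is hypothetical, and the pattern is based only on the "Hunk #N FAILED at L." lines quoted above.

```python
import re

# Pattern for the failure lines shown in the dry-run output quoted above,
# for example "Hunk #2 FAILED at 2994."
FAILED_HUNK_RE = re.compile(r"Hunk #(\d+) FAILED at (\d+)\.")


def failed_hunks(patch_output):
    """Return a list of (hunk_number, line_number) pairs for every failed hunk."""
    return [(int(hunk), int(line)) for hunk, line in FAILED_HUNK_RE.findall(patch_output)]


# Output copied from the dry run reported in this thread.
SAMPLE = """\
checking file drivers/gpu/drm/i915/intel_pm.c
Hunk #2 FAILED at 2994.
Hunk #3 FAILED at 3011.
Hunk #4 FAILED at 3561.
Hunk #5 FAILED at 3608.
4 out of 5 hunks FAILED
"""

if __name__ == "__main__":
    for hunk, line in failed_hunks(SAMPLE):
        print(f"hunk {hunk} did not apply near line {line}")
```

An empty result would suggest the patch applies to that file, although the summary line printed by `patch` itself remains the authoritative answer.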
Created attachment 127264 [details] [review]
Paulo's patch rebased vs 4.8.1 (a7fac751ddba)

(In reply to yann from comment #25)
> Created attachment 127264 [details] [review]
> Paulo's patch rebased vs 4.8.1 (a7fac751ddba)

this is to apply memory workaround for skl

https://patchwork.freedesktop.org/patch/113642/ is not yet in v4.8.1.

>> Created attachment 127264 [details] [review]
>> Paulo's patch rebased vs 4.8.1 (a7fac751ddba)
>
> this is to apply memory workaround for skl

Thanks, I'm testing 4.8.1 now with that patch applied and also drm-i915-gen9-fix-DDB-partitioning-for-multi-screen-cases.patch applied.

> https://patchwork.freedesktop.org/patch/113642/ is not yet in v4.8.1.

I guessed that... From the log in git in drm-intel-nightly, it looks like that commit was made on October 4th, so I guess it must already be in my October 11th drm-intel-nightly kernel. Btw, I ran that kernel all day without seeing any flickering issues, full-screen or otherwise.

(In reply to rockorequin from comment #27)
> >> Created attachment 127264 [details] [review]
> >> Paulo's patch rebased vs 4.8.1 (a7fac751ddba)
> >
> > this is to apply memory workaround for skl
>
> Thanks, I'm testing 4.8.1 now with that patch applied and also
> drm-i915-gen9-fix-DDB-partitioning-for-multi-screen-cases.patch applied.
>
> > https://patchwork.freedesktop.org/patch/113642/ is not yet in v4.8.1.
>
> I guessed that... From the log in git in drm-intel-nightly, it looks like
> that commit was made on October 4th, so I guess it must already be in my
> October 11th drm-intel-nightly kernel. Btw, I ran that kernel all day
> without seeing any flickering issues, full-screen or otherwise.

Thanks a lot for testing the patches!

Based on your comments, I can see that:
- Patch "fix DDB partitioning" improves the situation but doesn't completely solve the problem
- Patch "unconditionally apply memory WAs" helps solve the remaining issues.
Is that correct? If yes, then I suppose we'll be able to close the bug once the second patch lands on the tree. If not, I do have to point out that we have even more fixes on the mailing list, but then we should probably set up a separate branch with all the fixes applied so you'll only need to clone that branch instead of having to apply patches manually and solve conflicts.

> Based on your comments, I can see that:
> - Patch "fix DDB partitioning" improves the situation but
> doesn't completely solve the problem
> - Patch "unconditionally apply memory WAs" helps solving the
> remaining issues.
>
> Is that correct?

Yes, I think that should be correct, although I should probably test for longer because "fix DDB partitioning" reduces the occurrence of the flickering considerably so I should give it a chance to re-occur. But so far, so good. (FWIW, "fix DDB partitioning" also fixes https://bugs.freedesktop.org/show_bug.cgi?id=97596, so it's an important patch.)

I am still seeing some atomic update failure messages in the log with my patched 4.8.1. For example:

[ 955.923097] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe B (start=55693 end=55694) time 102 us, min 1073, max 1079, scanline start 1072, end 1079

Are they anything to be concerned about?

I just saw the eDP display flicker again, so I think these patches don't completely resolve the problem. The nearest relevant message in the syslog is:

Oct 14 16:04:07 xps15-9550 kernel: [46013.541604] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun

Are there other patches I could try against 4.8.1? The drm-intel-nightly kernel is behaving quite well.

(In reply to yann from comment #17)
> Please re-test with Paulo's patch to apply memory workarounds for skylake:
> https://patchwork.freedesktop.org/series/13548/

I've been testing the rebased patch on 4.8.1 now for a while.
I cannot confirm significant positive effects, however I haven't experienced any negative effects at all. The flickering might have become a little better, it's hard to tell. There are only very few screen flickers (once every ~30 minutes), so the machine is definitely usable. There still are the usual underruns:

[15792.820143] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe C FIFO underrun
[16070.293825] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
[18877.276327] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=224491 end=224492) time 471 us, min 1192, max 1199, scanline start 1189, end 1227
[22438.333764] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
[22692.923450] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe C FIFO underrun

Also, suspend/resume cycles work fine. I can also confirm that drm-intel-nightly feels quite good. Maybe there finally is hope for Skylake users (more than one year after the platform's release).

I have seen this issue with the following kernels:

* 4.8.0
* 4.8.2 built with the patches linked above
* 4.9-rc1 built from the drm-intel-nightly branch (commit 5b633f423e27af3a7f30d303e243f5a2e82917ae)

Subjectively, it feels like the flickering became more common with the memory workaround patches, but I've only been running them today, so that's a pretty small sample.

I meant to add that sometimes my display flickers without any corresponding messages from dmesg. For example, my display has flickered several times since the last boot, and the only messages from i915 are the messages from when it loaded. No FIFO underruns reported. I have seen those messages in dmesg in the past too, but they don't seem directly correlated to external display flicker at all.
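The "Atomic update failure" lines quoted throughout this thread carry several numeric fields. The Python sketch below pulls them apart so they can be compared across reports. The field names and the derived `overshoot` value (end scanline minus `max`) are illustrative assumptions rather than anything defined by the i915 driver, and reading `min`/`max` as the bounds of a scanline window is inferred from the message labels only.

```python
import re

# Field layout as it appears in the messages quoted in this report.
ATOMIC_RE = re.compile(
    r"Atomic update failure on pipe (?P<pipe>[A-C]) "
    r"\(start=(?P<start>\d+) end=(?P<end>\d+)\) "
    r"time (?P<time_us>\d+) us, min (?P<min>\d+), max (?P<max>\d+), "
    r"scanline start (?P<sl_start>\d+), end (?P<sl_end>\d+)"
)


def parse_atomic_failure(line):
    """Return the message fields as a dict, or None if the line does not match."""
    match = ATOMIC_RE.search(line)
    if match is None:
        return None
    fields = {
        key: (value if key == "pipe" else int(value))
        for key, value in match.groupdict().items()
    }
    # Hypothetical convenience metric: how far the end scanline lies past "max".
    fields["overshoot"] = fields["sl_end"] - fields["max"]
    return fields


# Line copied from the log quoted above.
SAMPLE = (
    "[18877.276327] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update "
    "failure on pipe A (start=224491 end=224492) time 471 us, min 1192, "
    "max 1199, scanline start 1189, end 1227"
)

if __name__ == "__main__":
    info = parse_atomic_failure(SAMPLE)
    print(info["pipe"], info["time_us"], info["overshoot"])
```

For the sample line, the update took 471 us and ended 28 scanlines past the reported `max`; whether that margin relates to visible flicker is not established by anything in this thread.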
I have been trying out the ubuntu 4.9-rc1 mainline kernel for the last day and a half, and so far it has been pretty solid - I haven't seen any flickering so far.

Bug exists on all Linux kernel versions from 4.8.0 to 4.8.3 (inclusive) on openSUSE 42.1, installed from here: http://download.opensuse.org/repositories/Kernel:/stable/standard/

Some suspicious lines from dmesg, running 4.8.3 --

[ 2.213187] BERT: Can't request iomem region <00000000dbfa6f98-00000000dbfa6fab>.
...
[ 369.631560] Corrupted low memory at ffff9b6ac0001000 (1000 phys) = 6000010000100000
[ 369.631568] Corrupted low memory at ffff9b6ac0001008 (1008 phys) = 100001000001c
[ 369.631571] Corrupted low memory at ffff9b6ac0001010 (1010 phys) = 100001000001c70
[ 369.631574] Corrupted low memory at ffff9b6ac0001018 (1018 phys) = 1000001c8000
...
[ 369.637365] Corrupted low memory at ffff9b6ac00050f8 (50f8 phys) = e000000000100000
[ 369.637366] Corrupted low memory at ffff9b6ac0005100 (5100 phys) = 1000008f
[ 369.637367] Corrupted low memory at ffff9b6ac0005108 (5108 phys) = 1000008ff0
[ 369.637368] Corrupted low memory at ffff9b6ac0005110 (5110 phys) = 560554504224
[ 369.637373] ------------[ cut here ]------------
[ 369.637376] WARNING: CPU: 0 PID: 2947 at ../arch/x86/kernel/check.c:141 check_corruption+0xa/0x40
[ 369.637376] Memory corruption detected in low memory
[ 369.637458] CPU: 0 PID: 2947 Comm: kworker/0:0 Not tainted 4.8.3-1.g94eb9fb-default #1
[ 369.637458] Hardware name: Dell Inc. Precision Tower 3420/08K0X7, BIOS 1.3.6 05/26/2016
[ 369.637460] Workqueue: events check_corruption
[ 369.637461] 0000000000000000 ffffffff883a3e62 ffff9b726a6f3dd0 0000000000000000
[ 369.637463] ffffffff8807ddde ffffffff88e2ab00 ffff9b726a6f3e20 ffff9b72ddc18e80
[ 369.637465] ffffd6a43fc02d00 0000000000000000 0ffffd6a43fc02d0 ffffffff8807de4f
[ 369.637467] Call Trace:
[ 369.637474] [<ffffffff8802eefe>] dump_trace+0x5e/0x310
[ 369.637477] [<ffffffff8802f2cb>] show_stack_log_lvl+0x11b/0x1a0
[ 369.637480] [<ffffffff88030001>] show_stack+0x21/0x40
[ 369.637482] [<ffffffff883a3e62>] dump_stack+0x5c/0x7a
[ 369.637488] [<ffffffff8807ddde>] __warn+0xbe/0xe0
[ 369.637492] [<ffffffff8807de4f>] warn_slowpath_fmt+0x4f/0x60
[ 369.637494] [<ffffffff880600ea>] check_corruption+0xa/0x40
[ 369.637497] [<ffffffff880966bd>] process_one_work+0x1ed/0x4d0
[ 369.637500] [<ffffffff880969e7>] worker_thread+0x47/0x4c0
[ 369.637502] [<ffffffff8809c59d>] kthread+0xbd/0xe0
[ 369.637505] [<ffffffff886d461f>] ret_from_fork+0x1f/0x40
[ 369.638807] DWARF2 unwinder stuck at ret_from_fork+0x1f/0x40
[ 369.638808] Leftover inexact backtrace:
[ 369.638810] [<ffffffff8809c4e0>] ? kthread_worker_fn+0x170/0x170
[ 369.638811] ---[ end trace 79d3bdfae01f66f9 ]---
[ 474.385619] [drm:gen8_irq_handler [i915]] *ERROR* CPU pipe B FIFO underrun

Maybe related: https://bugzilla.kernel.org/show_bug.cgi?id=177791

(In reply to Ted from comment #15)
> I also started experiencing this issue. I've seen most of the issues
> mentioned, plus the computer would sometimes freeze (screen is black and
> caps lock light does not toggle).
> After applying the patch, I haven't had any problems yet (freezing, black
> screen, flickering).
>
> $ lspci | grep VGA
> 00:02.0 VGA compatible controller: Intel Corporation HD Graphics 520 (rev 07)
> $ dmesg -l err -w
> [ 3221.487848] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update
> failure on pipe A (start=87034 end=87035) time 301 us, min 763, max 767,
> scanline start 753, end 768
> [ 4236.060368] tpm tpm0: A TPM error (325) occurred stopping the TPM
> [ 4236.322922] [drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU
> pipe B FIFO underrun
> . . .
> The pipe underruns happen generally when I plug/unplug a monitor and/or move
> a mouse across monitor boundaries.

I'm experiencing the hard freezes too, "hard" because the system does not respond even to REISUB, only to a long press of the Power button. I use a single display (laptop LCD), however the error messages in dmesg are similar. There are also screen glitches/flickering, but only if I enable the IOMMU (see the videos at https://bugzilla.kernel.org/show_bug.cgi?id=177791 ) and they appear only when I place the mouse cursor on two exact lines of the whole screen.

I thought that the freezes were related to the IOMMU, but I got them a few times with the IOMMU disabled. Nevertheless it seems that freezes appear much more often with the IOMMU enabled than disabled.

Most of the time the freezes appear when the system goes to powersave mode after some time of inactivity when I go away from the laptop. However I got a freeze one or two times when I was working on the laptop. My powersave settings in KDE are: dim screen after 5 mins, switch off screen after 10 mins, never suspend, never hibernate. Most of the time the freezes appear from 1 to (60?) minutes after the screen turns off - so from 11 minutes of inactivity. Freezes almost never occur (just one or two times total) before the screen turns off (in less than 10 minutes of inactivity). Sometimes the system does not freeze even after 30 minutes of inactivity.
Although the system does not respond to REISUB, sometimes (although very rarely) it is possible to switch from the X session to the terminal session by pressing Ctrl+Alt+F1 many times after the system freezes. Sometimes the terminal works fast without any problems; then I am able to log in and see that the load average stays below 1 and that no process is loading the CPU too much in 'top' or the I/O in 'iotop'. But if I try to switch back to the X session with Ctrl+Alt+F7, the system freezes completely and never responds to Ctrl+Alt+F* or REISUB.

Once, after I switched to the terminal session, an error log mentioning 'i915' was printed to the console, so I think the problem is related to the powersave of the Intel integrated GPU. Please see the photo here: http://robolab.it/iommu/call_trace.jpg This time I was not able to log in, as the system froze completely after I typed 'root'.

I use a Clevo P640RE with a Core i7-6700HQ CPU, openSuSE 13.2 with kernel 4.8.1 installed from this repo: http://download.opensuse.org/repositories/Kernel:/stable/standard/

Retyped the error message for search engines and terminal users. Judging by the many 'fb' (framebuffer?) lines, it seems that this error appeared because I switched to the terminal and is not related to the actual system freeze.

WARNING: CPU: 0 PID: 1993 at ../drivers/gpu/drm/i915/intel_display.c:13647 intel_atomic_commit_tail+0x1054/0x1060 [i915]
pipe A vblank wait timed out
Modules linked in: (...many modules here...)
CPU: 0 PID: 1993 Comm: Xorg Tainted: G W O 4.8.1-3.gf7183f5-default #1
Hardware name: CLEVO P64xRE/P64xRE powered by premamod.com, BIOS 1.05.07PM v1 07/29/2016
0000000000000000 ffffffffa03a3df2 ffff8aa9ed7f7918 0000000000000000
ffffffffa007ddde ffff8aa9f0d30000 ffff8aa9ed7f7968 0000000000000000
0000000000000000 0000000000000000 ffff8aa9f08e3000 ffffffffa007de4f
Call Trace:
[<ffffffffa002eefe>] dump_trace+0x5e/0x310
[<ffffffffa002f2cb>] show_stack_log_lvl+0x11b/0x1a0
[<ffffffffa0030001>] show_stack+0x21/0x40
[<ffffffffa03a3df2>] dump_stack+0x5c/0x7a
[<ffffffffa007ddde>] __warn+0xbe/0xe0
[<ffffffffa007de4f>] warn_slowpath_fmt+0x4f/0x60
[<ffffffffc04b2c64>] intel_atomic_commit_tail+0x1054/0x1060 [i915]
[<ffffffffc04b307c>] intel_atomic_commit+0x40c/0x510 [i915]
[<ffffffffc03ff4dc>] restore_fbdev_mode+0x14c/0x270 [drm_kms_helper]
[<ffffffffc0400f7e>] drm_fb_helper_restore_fbdev_mode_unlocked+0x2e/0x70 [drm_kms_helper]
[<ffffffffc0400fe9>] drm_fb_helper_set_par+0x29/0x50 [drm_kms_helper]
[<ffffffffc04cc0b6>] intel_fbdev_set_par+0x16/0x60 [i915]
[<ffffffffa0419e40>] fb_set_var+0x200/0x3e0
[<ffffffffa0410b28>] fbcon_blank+0x2b8/0x2f0
[<ffffffffa04a2517>] do_unblank_screen+0xc7/0x190
[<ffffffffa0498374>] complete_change_console+0x54/0xd0
[<ffffffffa0498aa1>] vt_ioctl+0x6b1/0x1230
[<ffffffffa048d44e>] tty_ioctl+0x33e/0xc20
[<ffffffffa022c2af>] do_vfs_ioctl+0x8f/0x5d0
[<ffffffffa022c864>] SyS_ioctl+0x74/0x80
[<ffffffffa06d43f6>] entry_SYSCALL_64_fastpath+0x1e/0xa8
DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x1e/0xa8

FWIW, I haven't seen any flickering in the last week and a bit using the 4.9-rc1 and -rc2 kernels.

I've been on 4.9-rc2 for 3 days now and the flickering is gone (even with multiple suspend/resume cycles in between). No more underruns either, just one single "Atomic update failure on pipe A" in the kernel log. So 4.9 will probably be the first kernel with proper Skylake graphics support.

Thanks for the follow-up, and patience, closing.
The 4.8 backport patches have been sent to stable maintainers, hopefully we can still make 4.8 work too. For reference, the backport http://lkml.kernel.org/r/1477510599-14843-1-git-send-email-lyude@redhat.com Closing resolved+fixed. Verified with 4.9-rc2 by reporter. |
Created attachment 125975 [details]
dmesg from drm-intel-fixes (177d91aa) with drm.debug=0xe

On Kernel 4.8-rc3 with the latest drm-intel-fixes patches I am getting random display flickering. I also applied the patch series "Finally fix watermarks". On Kernel 4.7 and earlier I do not have any flickering issues.

The flickering only happens on my HDMI display and only if the mouse cursor is present on the display. However it is unrelated to cursor movement. I also could not find any suspicious messages in dmesg which appear at the time of the flickering. There are the usual pipe underruns and atomic update failures in the kernel log. The flickering itself happens once every other minute.

Hardware: Lenovo Thinkpad L460 with latest BIOS, Ultra Dock, two external screens (HDMI+DVI)
CPU: i5-6200U
Kernel: 4.8-rc3 with patches from drm-intel-fixes
OS: Arch Linux
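The attached dmesg was captured with `drm.debug=0xe`. For readers unfamiliar with that module parameter, the Python sketch below decodes such a bitmask into category names. The bit assignments in the table reflect the DRM debug categories as they are understood to have been defined around the 4.8 kernel series; treat them as an assumption and check the kernel source for the version in use. The function name `decode_drm_debug` is hypothetical.

```python
# Assumed bit layout of the drm.debug module parameter (4.8-era kernels).
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
}


def decode_drm_debug(mask):
    """Return the names of the debug categories enabled in the given bitmask."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if mask & bit]


if __name__ == "__main__":
    # 0xe is the value used for the dmesg attached to this report.
    print(decode_drm_debug(0xE))
```

Under that assumed layout, `0xe` enables the DRIVER, KMS, and PRIME categories and leaves the CORE category off.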