I'm running latest Debian sid/buster on a Dell XPS 9370 laptop (Kabylake-R graphics).
Sometimes, when switching to a tty console after running an Xorg session, my laptop screen gets corrupted in the top left corner. This corruption disappears when switching back to the Xorg session or when rebooting the machine.
Created attachment 141586
picture of the corruption after switching to console
Created attachment 141587
another picture of the corruption after switching to console
Created attachment 141588
full screen image with corruption
Martin, can you attach the full dmesg from boot with the kernel parameters drm.debug=0x1e log_buf_len=4M?
This will help us in investigating the issue.
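For anyone unsure how to collect that log, here is a rough sketch. It assumes Debian's stock GRUB setup, where the kernel command line lives in GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub; the helper name add_debug_params is only illustrative and not part of any tool.

```shell
# Hypothetical helper: append the requested debug parameters to an
# existing kernel command line string and print the result.
add_debug_params() {
    printf '%s drm.debug=0x1e log_buf_len=4M\n' "$1"
}

# Example: extend a command line that currently contains only "quiet".
add_debug_params "quiet"

# Assumed follow-up steps on Debian (run as root, not executed here):
#   1. put the printed string into GRUB_CMDLINE_LINUX_DEFAULT
#      in /etc/default/grub
#   2. update-grub
#   3. reboot, then save the log with: dmesg > dmesg.txt
```

The parameters can also be typed once at the GRUB menu by editing the boot entry, which avoids touching any config file.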
Can you also verify whether the same issue occurs on the latest drm-tip:
Hi, I can confirm the same behavior on my machine (KBL based).
As with Martin, the issue was seen on Debian buster (with the latest kernel from the repo).
I will also try the suggested kernel.
Created attachment 141601
Attaching the requested log. I will also try rolling drm-tip back to somewhere around kernel version 4.13, in case this is a regression.
Debian is extremely unfriendly for compiling kernels :( At least for me.
So I decided to take a kernel from the stable branch:
Linux debian 4.9.0-8-amd64 #1 SMP Debian 4.9.110-3+deb9u4 (2018-08-21) x86_64 GNU/Linux
I can't reproduce the issue on it, so it would be great to bisect it. I will try to do this on intel-drm... on Ubuntu it was quite easy.
Providing my results here; maybe they will help somebody.
I couldn't build and boot any kernel from drm-intel.
Two kernels built successfully:
On both I got stuck at the startup screen ("booting kernel" or similar).
All other kernels between 4.10 and 4.16 returned this error:
Unsupported relocation type: R_X86_64_PLT32 (4)
make: *** [arch/x86/boot/compressed/Makefile:122: arch/x86/boot/compressed/vmlinux.relocs] Error 1
make: *** Waiting for unfinished jobs....
make: *** [arch/x86/boot/Makefile:112: arch/x86/boot/compressed/vmlinux] Error 2
From what I found, this issue can be solved by downgrading binutils, but that also requires downgrading gcc/g++, and appropriate versions don't exist in the repo (even the lowest, gcc-6, still requires the newest binutils).
Finally, after trying to install all the needed dependencies manually, I got stuck with:
>sudo apt-get -f install
>sudo: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.26' not found (required by sudo)
>sudo: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.26' not found (required by /usr/lib/sudo/libsudo_util.so.0)
Maybe somebody will have better luck((
Can someone who is seeing this try to revert 011f22eb545a35f972036bb6a245c95c2e7e15a0 (drm/i915: Do NOT skip the first 4k of stolen memory for pre-allocated buffers v2) ?
That will likely fix this. If that indeed fixes it then we should really only use the Video BIOS / GOP driver framebuffer when taking over the initial mode and, *if it starts within the first 4k*, use a new framebuffer for fbdev emulation instead of inheriting the BIOS / GOP driver framebuffer there.
This will allow us to keep the initial framebuffer for flickerfree boot, while selecting another framebuffer which honors the WaSkipStolenMemoryFirstPage:bdw+
workaround for fbcon, which should fix the fbcon corruption.
(In reply to Hans de Goede from comment #10)
> Can someone who is seeing this try to revert
> 011f22eb545a35f972036bb6a245c95c2e7e15a0 (drm/i915: Do NOT skip the first 4k
> of stolen memory for pre-allocated buffers v2) ?
> That will likely fix this. If that indeed fixes it then we should really
> only use the Video BIOS / GOP driver framebuffer when taking over the
> initial mode and, *if it starts within the first 4k*, use a new framebuffer
> for fbdev emulation instead of inheriting the BIOS / GOP driver framebuffer
> there.
> This will allow us to keep the initial framebuffer for flickerfree boot,
> while selecting another framebuffer which honors the
> WaSkipStolenMemoryFirstPage:bdw+
> workaround for fbcon, which should fix the fbcon corruption.
I'm able to reproduce this issue. I've just rebuilt the kernel with commit 011f22eb545a35f972036bb6a245c95c2e7e15a0 reverted, but it didn't help. The issue is still reproducible.
(In reply to vadym from comment #11)
> > This will allow us to keep the initial framebuffer for flickerfree boot,
> > while selecting another framebuffer which honors the
> > WaSkipStolenMemoryFirstPage:bdw+
> > workaround for fbcon, which should fix the fbcon corruption.
> I'm able to reproduce this issue. I've just rebuild kernel with
> 011f22eb545a35f972036bb6a245c95c2e7e15a0 patch reverted but it didn't help.
> Issue is still reproducible.
Thanks. I'm a bit surprised that reverting that commit does not fix things. Did you rebuild your initrd? I'm happy that I (I wrote that commit) did not cause this breakage, but I'm a bit surprised.
(In reply to Hans de Goede from comment #12)
> Thanks. I'm a bit surprised that reverting that commit does not fix things.
> Did you rebuild your initrd? I'm happy that I (I wrote that commit) did not
> cause this breakage, but I'm a bit surprised.
Yes, I generated a new initrd for my build. If I'm not mistaken, your patch landed in the 4.18 kernel, but I can reproduce this issue with my default Debian 4.17.0-3-amd64 kernel. Also, I'm not able to reproduce it on the 4.9 kernel, so I think this issue can be bisected between 4.9 and 4.17.
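A bisect over that range could look roughly like the sketch below. It assumes a clone of the mainline kernel tree that has the v4.9 and v4.17 tags; bisect_range is a made-up helper name. The Debian 4.9 and 4.17 packages are distribution builds, so the mainline tags are only an approximation of what was actually tested.

```shell
# Hypothetical helper: start a bisect between a known-good and a
# known-bad revision. Must be run from inside a kernel git tree.
bisect_range() {
    good=$1
    bad=$2
    git bisect start &&
    git bisect bad "$bad" &&
    git bisect good "$good"
}

# Not executed here. From inside the kernel tree one would run:
#   bisect_range v4.9 v4.17
# Then build and boot each commit git checks out, try the Xorg -> tty
# switch, and report the result:
#   git bisect good     # no corruption seen
#   git bisect bad      # corruption seen
# When git names the first bad commit, clean up with:
#   git bisect reset
```

Since the corruption only shows up sometimes, each step probably needs several attempts before it is marked good.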
The following patch fixes the issue for me: https://chromium-review.googlesource.com/c/chromiumos/third_party/kernel/+/877396/
That patch fixes a similar issue on ChromeOS.
I think this can be marked as a duplicate of Bug 106478.
If it helps, it seems I have the same problem: the same-looking corruption in the top left of the screen.
The screen looks fine at boot (console); then Kodi starts and the corruption appears after a few seconds. It remains visible even if I go back to the console.
The problems started when I upgraded the box from Ubuntu 18.04 LTS to 18.10 (kernel version is Linux server 4.18.0-10-generic #11-Ubuntu).
Hardware is an Intel(R) Core(TM) i3-7100 using the internal GPU, connected to a TV with an HDMI cable.
Please tell me if you need traces or more details.
Hey. As I found out, the ticket https://bugs.freedesktop.org/show_bug.cgi?id=108257 was closed as fixed. I think the current one is the same and should also be closed. Reporter, did you try drm-tip? How does it work for you?
(In reply to Denis from comment #16)
> hey. As I found out, https://bugs.freedesktop.org/show_bug.cgi?id=108257
> this ticket was closed as fixed. I think that current one is the same and
> also should be closed. Reporter, did you try to check drm-tip, how it works
> for you?
I'm now running the latest Debian testing kernel (4.18.0-3-amd64), and I still see the same corruption issue. On my machine it seems to be triggered after running a fullscreen OpenGL game.
I have not tried the drm-tip repo yet. I'll look into how to build and run it.
Well, something is broken. I get a GRUB "error: Out of memory" when loading the 4.20-rc5 kernel built from the drm-tip repo ...
Hm, this definitely isn't related to the current issue, but it still blocks us from checking :(
Btw, I recall problems with building on Debian using the defconfig command.
Could you try building with "make oldconfig" instead of "make defconfig"?
It should reuse your current working kernel config for the new kernel.
Update: keep in mind that building with your config will take longer than with "defconfig" (about 2-4 hours, depending on your PC).
Oh, and one last thing: a user in the related ticket mentioned exactly this commit
as working for him.
So if you didn't take exactly that one and just built "latest", maybe it's worth building exactly this one.
(In reply to Denis from comment #19)
> hm, this definitely doesn't relate to current issue, but still - blocks from
> checking :(
> btw - I recollecting problems with building on debian using defconfig
> Could you try build using command "make oldconfig" instead "make defconfig"?
> It should take your current "workable" kernel config for a new kernel.
Getting the latest 4.20-rc6 sources, running "make oldconfig" and answering "n" to all questions gives an installable kernel that produces the same "out of memory" error on boot as before.
I'll try cloning drm-tip, checking out commit "2f99c4889e4124f9cf50b745d037f432318c4bb4" and building that instead.
(In reply to Denis from comment #21)
> oh and the last thing - user in the related ticket mentioned exactly this
> commit
> as workable for him.
> So if you didn't take exactly it and just built "latest" - maybe it worse it
> to build exactly this one.
hmm, does it exist?
~/dev $ git clone git://anongit.freedesktop.org/drm-tip
Cloning into 'drm-tip'...
remote: Counting objects: 6396733, done.
remote: Compressing objects: 100% (957256/957256), done.
remote: Total 6396733 (delta 5394085), reused 6396675 (delta 5394047)
Receiving objects: 100% (6396733/6396733), 1.16 GiB | 3.95 MiB/s, done.
Resolving deltas: 100% (5394085/5394085), done.
Checking out files: 100% (62551/62551), done.
~/dev $ cd drm-tip/
~/dev/drm-tip $ git checkout 2f99c4889e4124f9cf50b745d037f432318c4bb4
fatal: reference is not a tree: 2f99c4889e4124f9cf50b745d037f432318c4bb4
Hm, I am not very familiar with the kernel fix process, but I think that if that patch changed the "UTC integration manifest", it is something general, meaning that from this date ("2018y-11m-30d-21h-47m-58s") onward the tree still includes the fixes, so "git checkout master" should be OK.
One last thing I forgot to mention when I wrote about "defconfig": you should take your current config and apply it during the build (according to your steps, you manually answered all the options, and disabling everything is certainly not the right way).
1. If I am not mistaken, your current config should be here:
/boot/config-X.X.X (where X.X.X is your current stable kernel version).
2. After running "make menuconfig", find "Load configuration file" or something similar in the menu that opens, and select your old config.
3. Save these changes (note that the config file should be renamed from config-X.X.X to .config).
4. Continue the setup. It shouldn't ask you about enabling or disabling anything; it should take everything from your old .config file.
If this doesn't help, I will try to install Debian later and compile the kernel as well, because I also reproduced this issue... but it won't be fast :(
Update regarding "Save these changes (note that config file should be renamed from config-X.X.X to .config)":
To be safe, don't do this on the original config file. Copy it somewhere first (it may be obvious, I know... but it should still be mentioned :) ).
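The same config reuse can also be done without the menu. A minimal sketch, assuming it is run from the top of the kernel source tree and that the running kernel's config sits at /boot/config-$(uname -r), as it does on Debian; seed_config is a hypothetical name, not an existing tool.

```shell
# Hypothetical helper: seed .config from an existing kernel config and
# let kconfig fill in defaults for any options that are new.
seed_config() {
    src=${1:-/boot/config-$(uname -r)}
    # Copy rather than move, so the original in /boot stays untouched.
    cp "$src" .config || return 1
    # olddefconfig keeps the existing answers and takes the default for
    # every new option, without prompting.
    make olddefconfig
}

# Not executed here. Usage from the kernel tree:
#   seed_config                       # use the running kernel's config
#   seed_config /path/to/saved-config # or an explicit file
```

Unlike "make oldconfig", this never asks questions, so there is no temptation to answer "n" to everything.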
I searched through the .config and found that it was set to build with debug symbols, which made the initrd image unreasonably big, causing the "out of memory" error.
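To check for this before building, something like the sketch below might help. It assumes the option in question is CONFIG_DEBUG_INFO (the comment above does not name it), and has_debug_info is a made-up helper.

```shell
# Hypothetical helper: succeed if the given kernel config enables
# debug info (assumed here to mean CONFIG_DEBUG_INFO=y).
has_debug_info() {
    grep -q '^CONFIG_DEBUG_INFO=y' "$1"
}

# Warn if the .config in the current directory has it switched on.
if [ -f .config ] && has_debug_info .config; then
    echo "CONFIG_DEBUG_INFO is enabled; modules will carry debug symbols"
fi

# Possible remedies from the kernel tree (not executed here):
#   scripts/config --disable DEBUG_INFO && make olddefconfig
# or keep the option and strip the modules at install time:
#   make INSTALL_MOD_STRIP=1 modules_install
```

Either remedy should shrink the modules that end up in the initrd, which is where the size problem described above came from.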
Soo? :) Was your attempt successful?)
Yes, I managed to build 4.20-rc6 from drm-tip.
I've tried to reproduce the screen corruption with 4.20-rc6 but haven't been able to so far.
Thanks a lot! That confirms that the fix landed within those patches. Closing the ticket. (Series with the fix: https://patchwork.freedesktop.org/series/51878/)
Please reopen if you get new information or reproduce it again.
The bug has returned in kernel 4.19.28:
$ uname -a
Linux waterhole 4.19.0-4-amd64 #1 SMP Debian 4.19.28-2 (2019-03-15) x86_64 GNU/Linux
See attached image.
Created attachment 143903
Console corruption (regression) with kernel 4.19.28
I'm sorry guys, I just realized I booted the wrong kernel. Sorry for the inconvenience.