Summary: | Intel GeminiLake corruption at top of screen caused by fbc
---|---
Product: | DRI
Reporter: | Daniel Drake <dan>
Component: | DRM/Intel
Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: | RESOLVED MOVED
QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: | normal
Priority: | medium
CC: | arkadiusz.hiler, intel-gfx-bugs, lui, przanoni, punx665
Version: | unspecified
Hardware: | Other
OS: | All
See Also: | https://bugs.freedesktop.org/show_bug.cgi?id=104890
Whiteboard: | Triaged, ReadyForDev
i915 platform: | GLK
i915 features: | display/FBC
Bug Depends on: |
Bug Blocks: | 111484

Description
Daniel Drake
2018-09-27 09:07:02 UTC
Created attachment 141758 [details]
lspci output on Asus E406MA
Created attachment 141759 [details]
glxinfo output on Asus E406MA
(In reply to Daniel Drake from comment #0)
> On Intel GeminiLake platforms, a horizontal line of display corruption
> frequently appears at the top of the screen.
>
> When the corruption persists on screen, if I take a screenshot with
> "DISPLAY=:0 import -window root out.png", the captured output is fine (it
> does not show the corruption)

Suggests that it may be a display engine issue (or a really weird resolve, but I'd bet on a conflict in stolen memory). A drm.debug=0xe may be helpful.

reminds me of the Zalgo bug.

Created attachment 141770 [details]
drm.debug=0xe log
When the corruption happens, the only lines that are logged are the drm_mode_addfb2 messages which occur every time the screen content changes (even in the no-corruption case).
Have exactly the same issue on my Lenovo Ideapad 330-15IGM with Intel N5000 inside under Xubuntu 18.04.

A shot in the dark, please try i915.enable_dc=0 and i915.disable_power_well=0, separately, one at a time.

I tried both options separately, the corruption is still easily reproduced in both cases. BTW we would be prepared to ship a sample product to Intel if that helps.

First 4kb of stolen memory being used? Aka possible regression from commit 011f22eb545a35f972036bb6a245c95c2e7e15a0.

At drivers/gpu/drm/i915/i915_gem_stolen.c, function i915_gem_init_stolen(), at the very end, on the drm_mm_init() call, can you please change the second argument from 0 to 4096? Also please in the stolen_usable_size argument right above this, subtract 4096 to compensate for that. Then please test this and report if the problem still happens.

Created attachment 142067 [details] [review]
suggested change

Thanks Paulo. I made this code change according to your suggestion however it does not seem to affect the issue, the corruption is still present.

Daniel, could you please try to disable FBC? Also, did you try an older kernel to see if this is a regression and possibly bisectable?

i915.enable_fbc=0 makes the issue go away

Re-enabling fbc, I also tried INTEL_DEBUG=norbc for mesa but the issue is still there. So this seems to be a bug with the i915 kernel driver framebuffer compression, I'm adjusting the bug accordingly.

Also tested Linux 4.13 which was the first kernel to support GeminiLake without requiring alpha_support. The bug is immediately reproducible there so it does not seem to be a regression. Again, booting that kernel with i915.enable_fbc=0 the issue goes away.

Daniel, Thanks for the detailed bug report/videos.

(In reply to Daniel Drake from comment #0)
> On Intel GeminiLake platforms, a horizontal line of display corruption
> frequently appears at the top of the screen.

Does this happen throughout the usage of the PC or only at gdm?

> Alternative reproducer, a bit harder: open terminal, run dmesg, maximize the
> terminal, then use two-finger scroll to quickly scroll up and down.

In this case, even though it is hard to reproduce the issue, corruption will stay forever? or disappears?

(In reply to Lakshmi from comment #14)
> Does this happen throughout the usage of the PC or only at gdm?

It happens throughout usage, but gdm is the easiest way to reproduce it on-demand that I've found.

> > Alternative reproducer, a bit harder: open terminal, run dmesg, maximize the
> > terminal, then use two-finger scroll to quickly scroll up and down.
>
> In this case, even though it is hard to reproduce the issue, corruption will
> stay forever? or disappears?

If you manage to stop scrolling right at the moment when the corruption is visible, then the corruption will persist until the next screen update. Otherwise, it will disappear after a moment.

Even though it's for a different SoC I checked on a few details found in https://www.x.org/docs/intel/SKL/rev01/intel-gfx-prm-osrc-skl-vol16-workarounds.pdf

0529: enabled the existing i915 codepath on GLK, no change
0562: FBC Watermark Disable is already set
0622: don't know how to check this
0851: don't know how to check this
0859: DISP_FBC_MEMORY_WAKE is already set
0873: added ILK_DPFC_NUKE_ON_ANY_MODIFICATION, no change
0883: ILK_DPFC_DISABLE_DUMMY0 is already set, DISP_FBC_MEMORY_WAKE is already set
0884: i915 implementation only affects PSR codepath but this platform doesn't support PSR

So no progress there, further suggestions very welcome...

Can confirm i915.enable_fbc=0 solved the issue in my system.

Random idea, does intel_iommu=igfx_off make a difference with fbc enabled?

(In reply to Jani Nikula from comment #18)
> Random idea, does intel_iommu=igfx_off make a difference with fbc enabled?

The visual corruption is still easily reproducible with that parameter.

Setting the priority to Medium based on WA and impact.
I have a similar problem with three motherboards.
MSI H310M PRO-VD LGA1151
MSI B360M Gaming Plus DDR4 Turbo M2 LGA1151
Gigabyte Z370M-D3H LGA1151-CL
The onboard motherboard video is what I am using.

That thread is located at:
https://forums.fedoraforum.org/showthread.php?320591-Fedora-29-onboard-video-memory-problem&goto=newpost

I have tried the above ideas separately
i915.enable_fbc=0, i915.enable_dc=0 and i915.disable_power_well=0
but problem still there.

I am happy to try any other suggestions or debugs. I did not read all the pointers that the above points to just the main thread so if I missed something let me know and I'll try that too.

(In reply to Luigi Cantoni from comment #21)
> I have a similar problem with three motherboards.
> MSI H310M PRO-VD LGA1151
> MSI B360M Gaming Plus DDR4 Turbo M2 LGA1151
> Gigabyte Z370M-D3H LGA1151-CL
> The onboard motherboard video is what I am using.
>
> That thread is located at:
> https://forums.fedoraforum.org/showthread.php?320591-Fedora-29-onboard-video-memory-problem&goto=newpost
>
> I have tried the above ideas separately
> i915.enable_fbc=0, i915.enable_dc=0 and i915.disable_power_well=0
> but problem still there.
>
> I am happy to try any other suggestions or debugs. I did not read all the
> pointers that the above points to just the main thread so if I missed
> something let me know and I'll try that too.

Is it GLK? Have you tried comment 18?

Hello everyone, just to share that I have the same issue using libreelec with a NUC7PJYH.

It seems everybody using a GLK have the issue:

https://forum.libreelec.tv/thread/12380-le-8-2-5-with-uhd-630-coffee-lake-gemini-lake-support-and-luks/?pageNo=10

(In reply to Geraud from comment #23)
> Hello everyone, just to share that I have the same issue using libreelec
> with a NUC7PJYH.
>
> It seems everybody using a GLK have the issue:
>
> https://forum.libreelec.tv/thread/12380-le-8-2-5-with-uhd-630-coffee-lake-gemini-lake-support-and-luks/?pageNo=10

Can you confirm if the issue goes away after disabling fbc i915.enable_fbc=0 ?

Yes I did the test yesterday adding kernel parameter or driver config file...no luck the issue is still there.

(In reply to Geraud from comment #25)
> Yes I did the test yesterday adding kernel parameter or driver config
> file...no luck the issue is still there.

Can you double check that the parameter change took effect with the following command:

> sudo cat /sys/module/i915/parameters/enable_fbc

If it says "1" then you should open a new bug report as any screen corruption is not caused by fbc in that case.

If it says "0" then something is wrong with the way you are setting the parameter.

It appears I am not using glk as that is not in my dmesg/journal files. Thus I think I am not loading it. I just checked out comment#18 and that made no difference, same problem.

I just tested again for i915.enable_fbc=0
Still there but using the check that Daniel suggested looks like it is going wrong.

[root@test tmp]# dmesg | grep -i "kernel command"
[ 0.116842] Kernel command line: BOOT_IMAGE=/vmlinuz-4.20.5-200.fc29.x86_64 root=/dev/sda3 ro resume=/dev/sda2 i915.enable_fbc=0 rhgb quiet LANG=en_AU.UTF-8
[root@test tmp]# cat /sys/module/i915/parameters/enable_fbc
0

I thought I was setting the parameter correctly but it appears I am not. Suggestion? I certainly hope that once set correctly it fixes it because then we will all be happier.

Sorry, I got it the wrong way round in my last comment. What I meant to write:
> sudo cat /sys/module/i915/parameters/enable_fbc
If it says "0" then you have set the parameter correctly and you should open a new bug report as any screen corruption is not caused by fbc in that case.
If it says "1" then something is wrong with the way you are setting the parameter.
So, Luigi, you are setting the parameter correctly and it is not solving the issue. That means the issue you are facing is not the one being tracked here.
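
The corrected reading of the parameter can be captured in a small helper. This is only a sketch: the function name `describe_enable_fbc` and its messages are made up for illustration, and the `-1` case reflects my understanding that the i915 module parameter defaults to a per-chip setting rather than anything stated in this thread. The file is normally readable by root only, which is why the thread uses `sudo cat`.

```sh
#!/bin/sh
# Sketch: interpret the i915 enable_fbc module parameter.
# describe_enable_fbc is a hypothetical helper name, not part of any tool.
# An optional argument overrides the sysfs path (useful for dry runs).
describe_enable_fbc() {
    param_file=${1:-/sys/module/i915/parameters/enable_fbc}

    # The file is typically root-readable only, and absent if i915 is not loaded.
    if [ ! -r "$param_file" ]; then
        echo "unknown (cannot read $param_file)"
        return 0
    fi

    value=$(cat "$param_file")
    case "$value" in
        0)  echo "disabled (parameter applied)" ;;
        1)  echo "enabled (FBC allowed by parameter)" ;;
        # Assumption: -1 means "use the per-chip default".
        -1) echo "per-chip default (FBC may still be enabled)" ;;
        *)  echo "unrecognized value: $value" ;;
    esac
}

describe_enable_fbc
```

If this reports "disabled" and the corruption is still visible, the problem is a different bug from the one tracked here. If it reports "enabled" after you tried to set `i915.enable_fbc=0`, the setting did not reach the driver.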
Guess what. Looks like my problem is X11 related. When I tell X11 that it has an intel graphics device then my problem appears to go away. Two machines tested, an MSI and a Gigabyte. I will be testing my third machine later (I have to wait until the users go to lunch). If I do not post anything more then it worked. If not I will provide more info.

My fix is just to create /etc/X11/xorg.conf.d/20-intel.conf and put

Section "Device"
Identifier "Intel Graphics"
Driver "intel"
EndSection

into it. Very simple and probably obvious in hindsight. As you suggest Daniel, not exactly this problem.

Now it gets confusing. It appears I was wrong, my fix is not a fix. It depends on what order you do things and also a bit on what is on the graphical screen as to if you see it or not. The good bit is I appear now to have a method to make it fail etc (and sort of work).

If I have
ctrl-alt-F1 with the graphical login screen
ctrl-alt-F2 is a logged in session
ctrl-alt-F3 is the character based login. This is the screen display with the problem visible on it.

Swap from F1<->F3 100% looks OK
Swap F1<->F2 100% OK (always was and never had an issue)
Swap F2<->F3 and my problem is there.
Do F2->F3 (problem there) do F3->F1->F3 problem not there
Or do F2->F1->F3 problem not there.
It fixes up the display if I go through the F1 (graphical login) screen.

I just re-tested my machines with my X11 conf file there and not there and the behaviour is as described above. The file makes no difference. I had not noticed that behaviour before as I normally go from F2-F3 and back again without going through F1.

Also something I have noticed is that when I swap out of F2 for a moment the screen goes silly in the top left with the graphical data and F3 does not clear it but F1 seems to clear it out. My guess is the swap screen code is writing the data out (maybe only as a temporary buffer) and F1 (as an initialisation code) clears that temporary area out, and that is what is happening and why F1 fixes it.
Might have been helpful to know that.

Hi All, I was not actually testing the Xorg version I now see. Using the login "gnome on Xorg" option I can see by the different display that I am using it and it makes no difference to the problem. The good thing is I am fairly sure this is just a display issue and other memory is not getting damaged so it should not cause anything really bad or strange to happen.

Some more testing: with /etc/gdm/custom.conf having these lines in the daemon section

[daemon]
# Uncomment the line below to force the login screen to use Xorg
#WaylandEnable=false
WaylandEnable=true
DefaultSession=gnome-xorg.desktop

and selecting from the login cog either "gnome classic" or "gnome on xorg" it continues to fail. If you choose "gnome" (which I had not done before) it appears to be OK. Certainly the ways I was using before to make it fail do not appear to fail.

Luigi, please file a separate bug report for the problems you are facing. Since the issue you see is not related to fbc, it should be separated from this issue. Thanks.

Separate bug now logged, thanks for all the help so far.
109610 xorg Driver/i i915 xorg display corruption on ctrl-alt-F3

(In reply to Daniel Drake from comment #26)
> (In reply to Geraud from comment #25)
> > Yes I did the test yesterday adding kernel parameter or driver config
> > file...no luck the issue is still there.
>
> Can you double check that the parameter change took effect with the
> following command:
>
> > sudo cat /sys/module/i915/parameters/enable_fbc
>
> If it says "1" then you should open a new bug report as any screen
> corruption is not caused by fbc in that case.
>
> If it says "0" then something is wrong with the way you are setting the
> parameter.

OK I did the check tonight: I put the drivers option using echo "options i915 enable_fbc=0" >> /storage/.config/modprobe.d/i915.conf. I restarted and I did cat /sys/module/i915/parameters/enable_fbc with root account...the result is "1"...
But still I'm a bit confused...if it is "1" shouldn't that mean that fbc is activated? Meaning the issue comes from fbc?

Daniel/Geraud, can you give your feedback with latest drmtip? I am not aware of any changes committed related to this specific bug. Still good to check with latest drmtip. (https://cgit.freedesktop.org/drm-tip).

Jai Hind

(In reply to Lakshmi from comment #36)
> Daniel/Geraud, can you give your feedback with latest drmtip? I am not aware
> of any changes committed related to this specific bug. Still good to check
> with latest drmtip.
> (https://cgit.freedesktop.org/drm-tip).

Problem is on my side I'm both kind of a noob...and I'm using latest version of libreelec (https://libreelec.tv/) that is not a "regular" distribution with proper package management (like debian...)...so I really don't have any idea how to update this package... :-s

Hello,

Just one question...as explained I don't know if I can help myself much more...but I guess I can share the issue on this forum to investigate deeper with the help of people who know how libreelec works and what can be tried: https://forum.libreelec.tv/thread/12380-le-8-2-5-with-uhd-630-coffee-lake-gemini-lake-support-and-luks/?pageNo=12

Hence, can somebody first confirm whether, when I run the command line:

cat /sys/module/i915/parameters/enable_fbc

and get the result "1", it means that fbc is activated or not? Using this information I can then confirm that it is linked to fbc or not. If fbc is activated, it means that libreelec is not taking into account the driver options I created...so I can have a look at how I can force it to use them. If not then I can log a new bug report I guess...but at least it will be clear if my issue can be related to this bug or not.

Hi Geraud-

If

$ cat /sys/module/i915/parameters/enable_fbc

gives you a 1, that means that the module parameter is set to allow FBC. However, this doesn't necessarily mean that FBC is active/enabled.
To check this, you'll need to either look at a file in debugfs, or boot with the "drm.debug=0xe" parameter and check your kernel logs.

To check the file in debugfs, do:

$ mount | grep debugfs

If this doesn't return results, then debugfs isn't mounted (or may not even be enabled in your kernel config). You can enable/mount debugfs if you choose; see https://www.kernel.org/doc/Documentation/filesystems/debugfs.txt for some examples.

If the mount command *did* return results, it probably looked something like:

$ mount | grep debugfs
debugfs on /sys/kernel/debug type debugfs (rw,nosuid,nodev,noexec,relatime)

Then you can do:

$ cat /sys/kernel/debug/dri/0/i915_fbc_status

which will give you the "this moment right now" status of FBC.

Alternatively, if you want to check via the kernel log, boot with the "drm.debug=0xe" kernel parameter, and then do:

$ dmesg | grep -i fbc

Which should give at least a few lines about FBC enabling and FBC status.

Hello James,

Thank you very much. I did the mount | grep debugfs and then cat /sys/kernel/debug/dri/0/i915_fbc_status

And the result is:

FBC enabled
Compressing: yes

This means that despite me creating the i915.conf file to set the driver parameters, it didn't take it into account. I will try to ask on the libreelec forum to see if somebody has a way to force the parameter. I don't want to add the option to the kernel boot line as last time it killed the distribution for some reason.

Good news is the bug is not

Sorry for my double reply...keyboard mistake...

I was saying the good news is it doesn't seem to be a new bug. It's the same fbc issue identified here. Thanks for the help guys. :-)

OK so after checking on the libreelec forum, the only way with libreelec to solve the issue is to add the option to one of these three files (depending on your installation): /flash/EFI/BOOT/syslinux.cfg, /flash/syslinux.cfg or /flash/extlinux.cfg

Just before "quiet".

This fix works fine but creating the i915.conf doesn't work with this specific distribution.
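
For anyone repeating the debugfs check described above, here is a minimal sketch that reduces it to one word. The helper name `fbc_live_status` is hypothetical, and the only output format it relies on is the "FBC enabled" line quoted in this thread; anything else in a readable status file is treated as inactive, which is an assumption rather than documented behaviour.

```sh
#!/bin/sh
# Sketch: report whether FBC is live right now, based on the debugfs
# status file mentioned in the thread. fbc_live_status is a hypothetical
# helper name. An optional argument overrides the default path.
fbc_live_status() {
    status_file=${1:-/sys/kernel/debug/dri/0/i915_fbc_status}

    # debugfs may be unmounted, disabled in the kernel, or root-only.
    if [ ! -r "$status_file" ]; then
        echo "hint: check 'mount | grep debugfs' and run as root" >&2
        echo "unknown"
        return 0
    fi

    # The thread shows "FBC enabled" as the first line when FBC is live.
    if grep -q 'FBC enabled' "$status_file"; then
        echo "active"
    else
        echo "inactive"
    fi
}

fbc_live_status
```

The module parameter only says whether FBC is allowed; this status file says whether it is running at the moment, which is what matters when confirming that a workaround took effect.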
So I can confirm that the bug we experienced with libreelec is linked to fbc...and the workaround is to disable the function until the bug is fixed.

Thanks a lot for your help guys. :-)

*** Bug 109950 has been marked as a duplicate of this bug. ***

Seems to be relevant: https://patchwork.freedesktop.org/series/58843/

I tested Linux 5.0, 5.1, next-20190411, and drm-tip-2019y-04m-12d-06h-24m-37s.

All those versions continue to be affected by the bug, trivially reproduced in the gdm login manager.

I also tested applying the above mentioned patch "drm/i915: FBC needs vblank before enable / disable" to all those kernel versions. They all reproduced the graphical corruption at the top of the screen exactly as before.

It's now been over 6 months since this bug was reported. Please let me know how we can help further.

Still exists on new kernels, currently running kernel version 5.0.7, my unit is an asus e203mah. AFAIK the current workaround (disabling fbc, i.e., enable_fbc=0) makes it disappear.

Could frame buffer compression at least be disabled by default on GLK until this issue can be looked at?

(In reply to Daniel Drake from comment #46)
> I tested Linux 5.0, 5.1, next-20190411, and
> drm-tip-2019y-04m-12d-06h-24m-37s.
>
> All those versions continue to be affected by the bug, trivially reproduced
> in the gdm login manager.
>
> I also tested applying the above mentioned patch "drm/i915: FBC needs vblank
> before enable / disable" to all those kernel versions. They all reproduced
> the graphical corruption at the top of the screen exactly as before.
>
> It's now been over 6 months since this bug was reported. Please let me know
> how we can help further.

@Daniel, Have you tried the workaround mentioned in Comment 47?
@James, any advice here?

(In reply to Lakshmi from comment #49)
> @Daniel, Have you tried the workaround mentioned in Comment 47?

Yes!
That workaround was found over 6 months ago (Comment 11) and has been verified to be effective by myself and several other people in the comments above.

Looks like there was a patch posted to disable fbc on GLK[1] but it appears that it is not going to be accepted upstream.

[1] https://lists.freedesktop.org/archives/intel-gfx/2019-April/196619.html

We need to get FBC working on GLK too, but in the mean time pushed

commit 1d25724b41fad7eeb2c3058a5c8190d6ece73e08
Author: Daniel Drake <drake@endlessm.com>
Date: Tue Apr 23 17:28:10 2019 +0800

drm/i915/fbc: disable framebuffer compression on GeminiLake

Should this bug stay open until FBC is working on GLK, or can we close it?

It seems this also happens to some IceLake platforms, a horizontal line of display corruption appears every time logging in via gdm. With s/IS_GEMINILAKE/IS_ICELAKE/ to the landed commit https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=396dd8143bdd94bd1c358a228a631c8c895a1126 , this problem is gone.

(In reply to You-Sheng Yang [:vicamo] from comment #54)
> It seems this also happens to some IceLake platforms, a horizontal line of
> display corruption appears every time logging in via gdm. With
> s/IS_GEMINILAKE/IS_ICELAKE/ to the landed commit
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=396dd8143bdd94bd1c358a228a631c8c895a1126
> , this problem is gone.

Can I close this bug?

I'd suggest leaving this bug open til fbc is fixed on Gemini Lake.
And given that it is reported to affect Icelake too that sounds like an even stronger reason to leave this open.

(In reply to You-Sheng Yang [:vicamo] from comment #54)
> It seems this also happens to some IceLake platforms, a horizontal line of
> display corruption appears every time logging in via gdm. With
> s/IS_GEMINILAKE/IS_ICELAKE/ to the landed commit
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=396dd8143bdd94bd1c358a228a631c8c895a1126
> , this problem is gone.

(In reply to Daniel Drake from comment #56)
> I'd suggest leaving this bug open til fbc is fixed on Gemini Lake.
> And given that it is reported to affect Icelake too that sounds like an even
> stronger reason to leave this open.

Currently this issue affects only GLK. If there is any issue on Icelake, I would recommend creating a new issue to investigate it separately.

Created a new issue for IceLake in https://bugs.freedesktop.org/show_bug.cgi?id=111484

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/intel/issues/162.