Bug 105902 - Closing and reopening laptop lid causes scanout corruption (regression since 4.15.12)
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Lakshmi
QA Contact: Intel GFX Bugs mailing list
 
Reported: 2018-04-05 08:39 UTC by Wolfgang Draxinger
Modified: 2018-10-04 17:48 UTC

i915 platform: GM45
i915 features: display/atomic


Attachments
Complete dmesg from boot, including a lid close→open and VT graphics→text→graphics sequence (157.13 KB, text/x-log)
2018-04-05 08:39 UTC, Wolfgang Draxinger

Description Wolfgang Draxinger 2018-04-05 08:39:01 UTC
Created attachment 138616
Complete dmesg from boot, including a lid close→open and VT graphics→text→graphics sequence

Hi,

ever since I upgraded my system to kernel version 4.15.12 I'm experiencing a kind of interesting problem with my laptop's graphics output.

The system is a Dell Latitude E6400 with Intel GM45 graphics. Kernel version is 4.15.12, X.org driver is xf86-video-intel-2.99.917.812, X.org server is xorg-server-1.19.6

Output of `lspci -v`:

00:02.0 VGA compatible controller: Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller (rev 07) (prog-if 00 [VGA controller])
    Subsystem: Dell Device 0233
    Flags: bus master, fast devsel, latency 0, IRQ 16
    Memory at f6c00000 (64-bit, non-prefetchable) [size=4M]
    Memory at e0000000 (64-bit, prefetchable) [size=256M]
    I/O ports at ef98 [size=8]
    [virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
    Capabilities: [90] MSI: Enable- Count=1/1 Maskable- 64bit-
    Capabilities: [d0] Power Management version 3
    Kernel driver in use: i915
    Kernel modules: i915

Problem description:

Whenever the laptop's lid is closed and reopened, the lower part of the graphics output flickers rapidly between (corrupted) scanout buffers, with the expected output appearing intermittently in between. When running X.org there is a distinct horizontal tear: above it the graphics output is in order, below it the output is corrupted. The vertical position of the tear coincides with and follows the top boundary of the cursor. Switching the VT from graphics mode to text and back to graphics restores the graphics output to a sane state.

However the graphics returns to a sane state only after switching back to graphics. While in fbcon "text" mode it remains corrupted.

This behavior happens consistently and reproducibly regardless of power management or ACPI daemons in user space. Even after putting the system into single-user mode and terminating all daemons (including udev and other hardware management stuff), the behavior can still be reproduced.

Interestingly enough, closing the lid makes a completely unrelated subsystem produce kernel messages: the PC-Card / PCMCIA / yenta_cardbus subsystem emits these messages when the laptop lid is closed and opened:

yenta_cardbus 0000:03:01.0: CardBus bridge to [bus 04]
yenta_cardbus 0000:03:01.0:   bridge window [io  0x5000-0x50ff]
yenta_cardbus 0000:03:01.0:   bridge window [io  0x5400-0x54ff]
yenta_cardbus 0000:03:01.0:   bridge window [mem 0xf0c00000-0xf0ffffff]
yenta_cardbus 0000:03:01.0:   bridge window [mem 0xf1000000-0xf13fffff]
yenta_cardbus 0000:03:01.0: CardBus bridge to [bus 04]
yenta_cardbus 0000:03:01.0:   bridge window [io  0x5000-0x50ff]
yenta_cardbus 0000:03:01.0:   bridge window [io  0x5400-0x54ff]
yenta_cardbus 0000:03:01.0:   bridge window [mem 0xf0c00000-0xf0ffffff]
yenta_cardbus 0000:03:01.0:   bridge window [mem 0xf1000000-0xf13fffff]

I'm really puzzled by these messages, I don't see how the lid is in any way related to this subsystem.

After that the graphics output is corrupted, but no further kernel messages are produced. However, after switching from graphics to text mode and back (and only after having switched back) the following messages are logged; the backtrace is logged exactly twice!

------------[ cut here ]------------
cursor A assertion failure (expected off, current on)
WARNING: CPU: 1 PID: 423 at drivers/gpu/drm/i915/intel_display.c:1247 assert_plane+0x90/0xa0 [i915]
Modules linked in: 8021q garp mrp stp llc ext2 joydev i915 dell_smbios_wmi dell_smm_hwmon dell_wmi i2c_algo_bit iTCO_wdt gpio_ich iTCO_vendor_support sparse_keymap wmi_bmof dell_rbtn dell_wmi_descriptor uvcvideo videobuf2_vmalloc videobuf2_memops drm_kms_helper dell_laptop dell_smbios_smm dell_smbios dcdbas videobuf2_v4l2 coretemp hwmon videobuf2_core input_leds psmouse evdev pcspkr mac_hid videodev yenta_socket usbkbd media usbmouse pcmcia_rsrc i2c_i801 drm pcmcia_core lpc_ich snd_hda_codec_hdmi intel_agp intel_gtt shpchp agpgart syscopyarea sysfillrect sysimgblt fb_sys_fops snd_hda_intel wmi thermal button battery video ac acpi_cpufreq kvm_intel kvm irqbypass snd_hda_codec_idt snd_hda_codec_generic snd_hda_codec snd_hda_core snd_hwdep snd_pcm ctr ccm arc4 iwldvm iwlwifi mac80211 cfg80211
 snd_seq snd_seq_device snd_timer snd soundcore vhost_vsock vmw_vsock_virtio_transport_common vsock vhost_net vhost tap uhid hci_vhci bluetooth ecdh_generic rfkill vfio_iommu_type1 vfio dm_mod uinput userio ppp_generic slhc tun loop crc32c_generic btrfs xor zstd_compress raid6_pq zstd_decompress xxhash cuse fuse ext4 crc16 mbcache jbd2 sd_mod sr_mod cdrom hid_generic usbhid hid uhci_hcd ahci serio_raw libahci sdhci_pci libata firewire_ohci ehci_pci sdhci ehci_hcd scsi_mod firewire_core mmc_core crc_itu_t usbcore
CPU: 1 PID: 423 Comm: Xorg Not tainted 4.15.12_1 #1
Hardware name: Dell Inc. Latitude E6400                  /0W620R, BIOS A25 06/04/2010
RIP: 0010:assert_plane+0x90/0xa0 [i915]
RSP: 0018:ffffb092c124f7a8 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff9add52e24400 RCX: 0000000000000035
RDX: 0000000000000001 RSI: 0000000000000002 RDI: 0000000000000292
RBP: 0000000000000000 R08: 000000535d8c721a R09: 0000000000000035
R10: ffff9add52e27000 R11: 0000000000000000 R12: ffff9add54c3bcb0
R13: ffff9add47bcdc00 R14: ffff9add47bcdc28 R15: ffff9add47bcdc30
FS:  00007f47b9c5d8c0(0000) GS:ffff9add5fd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fff90269fc0 CR3: 0000000215352000 CR4: 00000000000406e0
Call Trace:
 assert_planes_disabled.isra.59+0x48/0x60 [i915]
 intel_disable_pipe+0x52/0x180 [i915]
 i9xx_crtc_disable+0x77/0x430 [i915]
 ? intel_pre_plane_update+0xdf/0x150 [i915]
 intel_atomic_commit_tail+0x7b1/0xd10 [i915]
 intel_atomic_commit+0x266/0x2a0 [i915]
 restore_fbdev_mode_atomic+0x189/0x1f0 [drm_kms_helper]
 drm_fb_helper_restore_fbdev_mode_unlocked.part.25+0x23/0x70 [drm_kms_helper]
 drm_fb_helper_set_par+0x3e/0x70 [drm_kms_helper]
 intel_fbdev_set_par+0x16/0x60 [i915]
 fb_set_var+0x1d0/0x430
 fbcon_blank+0x200/0x340
 ? check_preempt_wakeup+0xe8/0x1b0
 do_unblank_screen+0xa3/0x180
 complete_change_console+0x54/0xd0
 vt_ioctl+0x719/0x11e0
 tty_ioctl+0xf3/0x8a0
 do_vfs_ioctl+0xa4/0x670
 SyS_ioctl+0x74/0x80
 do_syscall_64+0x67/0x100
 entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x7f47b7b266a7
RSP: 002b:00007ffeca93af78 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f47b7b266a7
RDX: 0000000000000001 RSI: 0000000000005605 RDI: 000000000000000b
RBP: 0000562d15892254 R08: 0000000000000000 R09: 0000000000000002
R10: 00007ffeca93af2c R11: 0000000000003246 R12: 0000000000000008
R13: 0000562d15892300 R14: 0000562d15892310 R15: 0000562d15892250
Code: ae 9d fd c0 84 c0 48 c7 c2 b1 9d fd c0 48 89 f1 48 c7 c7 a8 b2 fe c0 48 0f 44 ca 40 84 ed 48 0f 45 d6 48 8b 73 18 e8 30 52 12 cf <0f> 0b eb 89 66 90 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90
---[ end trace 90a6895611fa5a9b ]---
------------[ cut here ]------------
cursor A assertion failure (expected off, current on)
WARNING: CPU: 0 PID: 423 at drivers/gpu/drm/i915/intel_display.c:1247 assert_plane+0x90/0xa0 [i915]
Modules linked in: 8021q garp mrp stp llc ext2 joydev i915 dell_smbios_wmi dell_smm_hwmon dell_wmi i2c_algo_bit iTCO_wdt gpio_ich iTCO_vendor_support sparse_keymap wmi_bmof dell_rbtn dell_wmi_descriptor uvcvideo videobuf2_vmalloc videobuf2_memops drm_kms_helper dell_laptop dell_smbios_smm dell_smbios dcdbas videobuf2_v4l2 coretemp hwmon videobuf2_core input_leds psmouse evdev pcspkr mac_hid videodev yenta_socket usbkbd media usbmouse pcmcia_rsrc i2c_i801 drm pcmcia_core lpc_ich snd_hda_codec_hdmi intel_agp intel_gtt shpchp agpgart syscopyarea sysfillrect sysimgblt fb_sys_fops snd_hda_intel wmi thermal button battery video ac acpi_cpufreq kvm_intel kvm irqbypass snd_hda_codec_idt snd_hda_codec_generic snd_hda_codec snd_hda_core snd_hwdep snd_pcm ctr ccm arc4 iwldvm iwlwifi mac80211 cfg80211
 snd_seq snd_seq_device snd_timer snd soundcore vhost_vsock vmw_vsock_virtio_transport_common vsock vhost_net vhost tap uhid hci_vhci bluetooth ecdh_generic rfkill vfio_iommu_type1 vfio dm_mod uinput userio ppp_generic slhc tun loop crc32c_generic btrfs xor zstd_compress raid6_pq zstd_decompress xxhash cuse fuse ext4 crc16 mbcache jbd2 sd_mod sr_mod cdrom hid_generic usbhid hid uhci_hcd ahci serio_raw libahci sdhci_pci libata firewire_ohci ehci_pci sdhci ehci_hcd scsi_mod firewire_core mmc_core crc_itu_t usbcore
CPU: 0 PID: 423 Comm: Xorg Tainted: G        W        4.15.12_1 #1
Hardware name: Dell Inc. Latitude E6400                  /0W620R, BIOS A25 06/04/2010
RIP: 0010:assert_plane+0x90/0xa0 [i915]
RSP: 0018:ffffb092c124f738 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff9add52e24400 RCX: 0000000000000035
RDX: 0000000000000001 RSI: 0000000000000002 RDI: 0000000000000292
RBP: 0000000000000000 R08: 000000535e0f6067 R09: 0000000000000035
R10: ffffb092c124f638 R11: 0000000000000000 R12: ffff9add54c3bcb0
R13: 0000000000000000 R14: ffff9add556e2000 R15: ffff9add52e28000
FS:  00007f47b9c5d8c0(0000) GS:ffff9add5fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005638821db760 CR3: 0000000215352000 CR4: 00000000000406f0
Call Trace:
 assert_planes_disabled.isra.59+0x48/0x60 [i915]
 intel_enable_pipe+0x52/0x200 [i915]
 i9xx_crtc_enable+0x352/0x480 [i915]
 intel_update_crtc+0x39/0x90 [i915]
 intel_update_crtcs+0x47/0x60 [i915]
 intel_atomic_commit_tail+0x212/0xd10 [i915]
 intel_atomic_commit+0x266/0x2a0 [i915]
 restore_fbdev_mode_atomic+0x189/0x1f0 [drm_kms_helper]
 drm_fb_helper_restore_fbdev_mode_unlocked.part.25+0x23/0x70 [drm_kms_helper]
 drm_fb_helper_set_par+0x3e/0x70 [drm_kms_helper]
 intel_fbdev_set_par+0x16/0x60 [i915]
 fb_set_var+0x1d0/0x430
 fbcon_blank+0x200/0x340
 ? check_preempt_wakeup+0xe8/0x1b0
 do_unblank_screen+0xa3/0x180
 complete_change_console+0x54/0xd0
 vt_ioctl+0x719/0x11e0
 tty_ioctl+0xf3/0x8a0
 do_vfs_ioctl+0xa4/0x670
 SyS_ioctl+0x74/0x80
 do_syscall_64+0x67/0x100
 entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x7f47b7b266a7
RSP: 002b:00007ffeca93af78 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f47b7b266a7
RDX: 0000000000000001 RSI: 0000000000005605 RDI: 000000000000000b
RBP: 0000562d15892254 R08: 0000000000000000 R09: 0000000000000002
R10: 00007ffeca93af2c R11: 0000000000003246 R12: 0000000000000008
R13: 0000562d15892300 R14: 0000562d15892310 R15: 0000562d15892250
Code: ae 9d fd c0 84 c0 48 c7 c2 b1 9d fd c0 48 89 f1 48 c7 c7 a8 b2 fe c0 48 0f 44 ca 40 84 ed 48 0f 45 d6 48 8b 73 18 e8 30 52 12 cf <0f> 0b eb 89 66 90 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90
---[ end trace 90a6895611fa5a9c ]---
[drm:intel_cpu_fifo_underrun_irq_handler [i915]] *ERROR* CPU pipe A FIFO underrun
dell_wmi: Unknown key with type 0x0011 and code 0xffd1 pressed


So far I haven't tested whether this stretches over more than one CRTC, but I can check.


Cheers,
Wolfgang
Comment 1 Jani Saarinen 2018-04-05 08:52:37 UTC
Hi, are you able to see the problem with our latest tip: https://cgit.freedesktop.org/drm-tip? Please also add dmesg output with drm.debug=0xe.
Comment 2 Jani Saarinen 2018-04-05 08:56:32 UTC
In the email to intel-gfx it was advised to add the drm.debug=14 module parameter and attach dmesg from boot.
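
Note that drm.debug=0xe (comment 1) and drm.debug=14 (comment 2) are the same value written in hexadecimal and decimal. The small Python sketch below decodes such a value into debug categories. The bit-to-name table is my recollection of the drm.debug parameter description for kernels of this era and should be treated as an assumption; newer kernels define additional higher bits that are not listed here. `decode_drm_debug` is a hypothetical helper name.

```python
# Bit assignments as I recall them from the drm.debug module parameter
# description (assumption: verify against `modinfo drm` on your kernel).
DRM_DEBUG_BITS = {
    0x01: "CORE",
    0x02: "DRIVER",
    0x04: "KMS",
    0x08: "PRIME",
    0x10: "ATOMIC",
    0x20: "VBL",
}


def decode_drm_debug(value):
    """Return the names of the debug categories enabled by a drm.debug value."""
    return [name for bit, name in sorted(DRM_DEBUG_BITS.items()) if value & bit]


if __name__ == "__main__":
    # int(..., 0) accepts both "0xe" and "14" style strings.
    for text in ("0xe", "14"):
        print(text, "->", decode_drm_debug(int(text, 0)))
```

Under that table, both spellings enable the DRIVER, KMS and PRIME categories and leave the noisier CORE category off.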
Comment 3 Chris Wilson 2018-04-05 11:15:21 UTC
Fwiw, this was one of my favourite bugs from yesteryear (on ctg, thinkpad x200s) with the flickering starting exactly on the same line as the cursor; we had screwed up the WM calculations in conjunction with FBC. I recall we disabled FBC...

We are indeed using FBC again. So one thing you can do to confirm this WM / FBC interaction is to boot with i915.enable_fbc=0
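
For anyone repeating the i915.enable_fbc=0 test, it can help to confirm after boot that the parameter actually took effect. Loaded modules normally expose their parameters under /sys/module/<module>/parameters/, so a rough Python sketch could look like the following. `read_module_param` is a hypothetical helper; whether the enable_fbc file exists and is readable depends on the kernel version and on permissions (it may require root), which is why the function returns None instead of failing.

```python
from pathlib import Path


def read_module_param(module, param, sysfs_root="/sys/module"):
    """Return the current value of a module parameter as a string.

    Returns None if the module is not loaded, the parameter is not
    exported through sysfs, or the file is not readable by this user.
    """
    path = Path(sysfs_root) / module / "parameters" / param
    try:
        return path.read_text().strip()
    except (FileNotFoundError, NotADirectoryError, PermissionError):
        return None


if __name__ == "__main__":
    value = read_module_param("i915", "enable_fbc")
    if value is None:
        print("i915.enable_fbc not readable (module missing or insufficient permissions)")
    else:
        print("i915.enable_fbc =", value)
```

The `sysfs_root` argument only exists so the helper can be pointed at a test directory; on a real system the default should be what you want.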
Comment 4 Wolfgang Draxinger 2018-04-05 11:32:19 UTC
On 2018-04-05 10:56, bugzilla-daemon@freedesktop.org wrote:
> *Comment # 2 <https://bugs.freedesktop.org/show_bug.cgi?id=105902#c2> on 
> bug 105902 <https://bugs.freedesktop.org/show_bug.cgi?id=105902> from 
> Jani Saarinen <mailto:jani.saarinen@intel.com> *
> 
> On email to intel-gfx was advised to add drm.debug=14 module parameter, attach
> dmesg from boot.

As advised, I attached a full dmesg from boot, with drm.debug=14, to the
Bugzilla bug report. I'll try what happens with the tip version as soon
as I'm back at home and can test on my laptop.

Cheers,
Wolfgang
Comment 5 Ville Syrjala 2018-04-05 14:29:18 UTC
I think we've broken the LVDS lid notifier. We never make a copy of the atomic state and hence we never restore it either.

I would be curious to know if we could actually just nuke the lid notifier entirely:
git://github.com/vsyrjala/linux.git lvds_lid_notifier_nuke

The reason for the notifier was that the VBIOS supposedly craps on top of the hardware when the lid is closed, which then means that we have to restore the expected hardware state. If anyone has one of the affected machines (not sure what they were) I'd like to hear what happens without the lid notifier.

If the VBIOS is still up to its old tricks, I have also made a branch where I try to set various SWF bits. In theory those could make the VBIOS stop messing up our hardware:
git://github.com/vsyrjala/linux.git vbios_swf
Comment 6 Jani Saarinen 2018-04-20 19:17:35 UTC
Wolfgang, have you tried patches/branches proposed by Ville?
Comment 7 Ketsui 2018-04-27 01:27:51 UTC
This sounds awfully similar to what I reported [1], should I go ahead and test out Ville's branches?

[1] https://bugs.freedesktop.org/show_bug.cgi?id=105435
Comment 8 kitsunyan 2018-04-30 19:18:59 UTC
I tested Ville's patch from the lvds_lid_notifier_nuke branch and can confirm that the flickering is gone. Should I try vbios_swf?

Lenovo ThinkPad X200 with Intel GM45.
Comment 9 Jani Saarinen 2018-05-04 12:33:57 UTC
Ville, what are next steps here?
Comment 10 Ville Syrjala 2018-05-17 12:01:24 UTC
Would be nice if we could find one of the original machines that were the reason for adding the lid notifier, and check how they fare these days without the notifier. I see Thinkpad X41 Tablet, T43, and X61 mentioned in the original bug #21230.
Comment 11 Joonas Saarinen 2018-06-18 15:06:35 UTC
Fujitsu Siemens U9210 with Intel GM45 is currently broken as well.

Results of trying the suggested things:

- Booting with i915.enable_fbc=0 does not help.

- Using the lvds_lid_notifier_nuke branch makes the problem go away (there is no flickering rubbish after opening the lid).

- Using the vbios_swf branch was not helpful.
Comment 12 Ville Syrjala 2018-08-21 11:41:42 UTC
commit 05c72e77ccda89ff624108b1b59a0fc43843f343
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Tue Jul 17 20:42:14 2018 +0300

    drm/i915: Nuke the LVDS lid notifier
Comment 13 James Ausmus 2018-08-21 15:22:45 UTC
Wolfgang - can you verify that the issue is resolved for you on latest drm-tip?
Comment 14 Lakshmi 2018-09-07 14:42:49 UTC
Wolfgang, ping?
Comment 15 Lakshmi 2018-10-04 17:48:58 UTC
No feedback for more than a month.
I assume this issue has been fixed. Closing this bug.

