Bug 107573 - [GM45] black screen when laptop lid reopened: primary A assertion failure (expected on, current off)
Status: RESOLVED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: Triaged
Keywords:
Depends on:
Blocks:
 
Reported: 2018-08-15 02:26 UTC by Adric Blake
Modified: 2018-11-30 00:19 UTC
CC List: 2 users

See Also:
i915 platform: GM45
i915 features:


Attachments
drm.debug=0x1f output, set temporarily at runtime, around the time of the bug. (48.07 KB, application/gzip)
2018-08-15 02:26 UTC, Adric Blake
drm.debug=0x1e dmesg; actions taken: boot, close lid, open lid, dpms force on, close lid, open lid, result=blank screen with cursor visible (377.07 KB, text/plain)
2018-11-29 22:33 UTC, plantroon

Description Adric Blake 2018-08-15 02:26:52 UTC
Created attachment 141097
drm.debug=0x1f output, set temporarily at runtime, around the time of the bug.

With Linux kernel 4.18 (the bug is not present in 4.17.10-1-ARCH), reopening the laptop lid after closing it results in no output being displayed apart from the mouse cursor. Switching to the Linux console and back fixes it.

This has occurred consistently every time I opened the lid, totaling four times so far.

Possibly related in root cause to bug 105902, "Closing and repoening laptop lid causes scanout corruption," which suggests a broken/incomplete "lid notifier" in the comments--but the symptoms are slightly different. I should note that I did occasionally have that bug when I was running kernel 4.16/4.17.

Arch Linux x86_64, package list is:
linux 4.18.arch1-1
xorg-server 1.20.1-1
mesa 18.1.6-1
libdrm 2.4.93-1
xf86-video-intel 1:2.99.917+842+g3d395062-1

A single warning gets printed each time. An example of one of the warnings:

[30153.005247] arch_pc kernel: ------------[ cut here ]------------
[30153.005250] arch_pc kernel: primary A assertion failure (expected on, current off)
[30153.005346] arch_pc kernel: WARNING: CPU: 1 PID: 650 at drivers/gpu/drm/i915/intel_display.c:1274 assert_plane+0x87/0x90 [i915]
[30153.005347] arch_pc kernel: Modules linked in: fuse xt_tcpudp ipt_REJECT nf_reject_ipv4 xt_set iptable_filter bpfilter ip_set_hash_net ip_set nfnetlink arc4 b43 bcma mac80211 joydev uvcvideo mousedev snd_hda_codec_idt snd_hda_codec_generic videobuf2_vmalloc cfg80211 videobuf2_memops videobuf2_v4l2 snd_hda_intel i915 rng_core snd_hda_codec iTCO_wdt dell_wmi sparse_keymap ums_realtek uas videobuf2_common gpio_ich iTCO_vendor_support wmi_bmof videodev dell_laptop rfkill i2c_algo_bit drm_kms_helper media dell_smbios dcdbas dell_wmi_descriptor drm dell_smm_hwmon coretemp syscopyarea sysfillrect psmouse snd_hda_core snd_hwdep input_leds led_class sysimgblt pcspkr evdev intel_agp ssb mmc_core snd_pcm snd_timer fb_sys_fops mac_hid i2c_i801 intel_gtt pcmcia snd soundcore pcmcia_core agpgart lpc_ich wmi battery rtc_cmos ac
[30153.005411] arch_pc kernel:  pcc_cpufreq acpi_cpufreq sg crypto_user ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 fscrypto sr_mod cdrom sd_mod usb_storage serio_raw atkbd libps2 uhci_hcd ahci libahci libata ehci_pci scsi_mod ehci_hcd usbcore usb_common i8042 serio sky2
[30153.005441] arch_pc kernel: CPU: 1 PID: 650 Comm: Xorg Tainted: G        W         4.18.0-arch1-1-ARCH #1
[30153.005443] arch_pc kernel: Hardware name: Dell Inc. Inspiron 1545                   /0G848F, BIOS A14 12/07/2009
[30153.005479] arch_pc kernel: RIP: 0010:assert_plane+0x87/0x90 [i915]
[30153.005480] arch_pc kernel: Code: 8a 74 c0 84 c0 48 c7 c2 19 8a 74 c0 48 c7 c7 80 c3 75 c0 48 89 f1 48 0f 44 ca 40 84 ed 48 0f 45 d6 48 8b 73 18 e8 33 72 fe e7 <0f> 0b eb 92 0f 1f 44 00 00 66 66 66 66 90 c1 e6 0c 53 89 d3 48 8b 
[30153.005524] arch_pc kernel: RSP: 0018:ffffb73d01bd7b18 EFLAGS: 00010286
[30153.005526] arch_pc kernel: RAX: 0000000000000000 RBX: ffff9748964c2c00 RCX: 0000000000000001
[30153.005528] arch_pc kernel: RDX: 0000000080000001 RSI: ffffffffa94810ee RDI: 00000000ffffffff
[30153.005530] arch_pc kernel: RBP: 0000000000000001 R08: 0000000000000001 R09: 000000000000043c
[30153.005531] arch_pc kernel: R10: 0000000000000004 R11: 0000000000000000 R12: 0000000000000000
[30153.005533] arch_pc kernel: R13: ffff974808731000 R14: 0000000000000002 R15: ffff974896580328
[30153.005535] arch_pc kernel: FS:  00007f14da519e00(0000) GS:ffff97489fd00000(0000) knlGS:0000000000000000
[30153.005536] arch_pc kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[30153.005538] arch_pc kernel: CR2: 00007fca74619000 CR3: 0000000187d9e000 CR4: 00000000000406e0
[30153.005539] arch_pc kernel: Call Trace:
[30153.005582] arch_pc kernel:  intel_atomic_commit_tail+0x901/0xcf0 [i915]
[30153.005621] arch_pc kernel:  intel_atomic_commit+0x2a9/0x2e0 [i915]
[30153.005637] arch_pc kernel:  drm_atomic_helper_set_config+0x80/0x90 [drm_kms_helper]
[30153.005663] arch_pc kernel:  __drm_mode_set_config_internal+0x67/0x120 [drm]
[30153.005679] arch_pc kernel:  drm_mode_setcrtc+0x434/0x620 [drm]
[30153.005686] arch_pc kernel:  ? unix_stream_recvmsg+0x53/0x70
[30153.005688] arch_pc kernel:  ? unix_set_peek_off+0x50/0x50
[30153.005704] arch_pc kernel:  ? drm_mode_getcrtc+0x180/0x180 [drm]
[30153.005717] arch_pc kernel:  drm_ioctl_kernel+0xa7/0xf0 [drm]
[30153.005732] arch_pc kernel:  drm_ioctl+0x30e/0x3c0 [drm]
[30153.005748] arch_pc kernel:  ? drm_mode_getcrtc+0x180/0x180 [drm]
[30153.005753] arch_pc kernel:  do_vfs_ioctl+0xa4/0x620
[30153.005757] arch_pc kernel:  ? syscall_slow_exit_work+0x19b/0x1b0
[30153.005760] arch_pc kernel:  ksys_ioctl+0x60/0x90
[30153.005763] arch_pc kernel:  __x64_sys_ioctl+0x16/0x20
[30153.005765] arch_pc kernel:  do_syscall_64+0x5b/0x170
[30153.005769] arch_pc kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[30153.005772] arch_pc kernel: RIP: 0033:0x7f14dece779b
[30153.005773] arch_pc kernel: Code: 0f 1e fa 48 8b 05 c5 b6 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 95 b6 0c 00 f7 d8 64 89 01 48 
[30153.005816] arch_pc kernel: RSP: 002b:00007ffd29114e48 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
[30153.005818] arch_pc kernel: RAX: ffffffffffffffda RBX: 000055cc3553e1c0 RCX: 00007f14dece779b
[30153.005820] arch_pc kernel: RDX: 00007ffd29114f30 RSI: 00000000c06864a2 RDI: 000000000000000d
[30153.005821] arch_pc kernel: RBP: 00007ffd29114f30 R08: 000000000000003b R09: 0000000000000000
[30153.005823] arch_pc kernel: R10: 0000000000000001 R11: 0000000000003246 R12: 00000000c06864a2
[30153.005824] arch_pc kernel: R13: 000000000000000d R14: 000055cc3553ffe0 R15: 00007ffd29114f30
[30153.005828] arch_pc kernel: ---[ end trace 0a3a0d20ab9047fb ]---

I added a drm.debug=0x1f trace as an attachment.
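
Editor's note on the trace: in the user-space register dump, RSI (the second syscall argument) holds c06864a2, which is the ioctl request number. Below is a minimal sketch that decodes it, assuming the generic Linux ioctl number layout (8-bit number, 8-bit type, 14-bit size, 2-bit direction). The names `IoctlFields` and `decode_ioctl` are hypothetical helpers for illustration only.

```python
from typing import NamedTuple

# Generic Linux ioctl encoding (asm-generic/ioctl.h):
# bits 0-7 number, bits 8-15 type, bits 16-29 size, bits 30-31 direction.
_DIR_NAMES = {0: "none", 1: "write", 2: "read", 3: "read|write"}


class IoctlFields(NamedTuple):  # hypothetical helper type
    direction: str
    size: int
    type_char: str
    nr: int


def decode_ioctl(cmd: int) -> IoctlFields:
    """Split a 32-bit ioctl request number into its encoded fields."""
    nr = cmd & 0xFF
    type_char = chr((cmd >> 8) & 0xFF)
    size = (cmd >> 16) & 0x3FFF
    direction = _DIR_NAMES[(cmd >> 30) & 0x3]
    return IoctlFields(direction, size, type_char, nr)


if __name__ == "__main__":
    fields = decode_ioctl(0xC06864A2)  # RSI value from the trace above
    print(fields.direction, fields.size, fields.type_char, hex(fields.nr))
```

The type character 'd' is the one used by DRM ioctls, and number 0xa2 is the one drm.h assigns to DRM_IOCTL_MODE_SETCRTC, which agrees with drm_mode_setcrtc appearing in the call trace.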
Comment 1 Jani Saarinen 2018-08-15 06:19:16 UTC
Can you also verify if same issue is on latest drm-tip:
https://cgit.freedesktop.org/drm-tip and send dmesg with drm.debug=0x1e
log_buf_len=4M?

Please add all logs plain text not packed up.
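
Editor's note: drm.debug is a bitmask of debug categories, so 0x1e (requested here) and 0x1f (used in the original attachment) differ by a single bit. A minimal sketch of the decoding, assuming the category bit values defined in the kernel's DRM headers around this release; `DRM_DEBUG_CATEGORIES` and `decode_drm_debug` are hypothetical names, and the table lists only the low six bits.

```python
# Assumed category bits (DRM_UT_* in the kernel's DRM headers); higher bits omitted.
DRM_DEBUG_CATEGORIES = {
    0x01: "core",
    0x02: "driver",
    0x04: "kms",
    0x08: "prime",
    0x10: "atomic",
    0x20: "vbl",
}


def decode_drm_debug(mask: int) -> list[str]:
    """Return the names of the debug categories enabled in a drm.debug mask."""
    return [name for bit, name in sorted(DRM_DEBUG_CATEGORIES.items()) if mask & bit]


if __name__ == "__main__":
    for mask in (0x1E, 0x1F):
        print(hex(mask), decode_drm_debug(mask))
```

Under these assumptions, 0x1e enables driver, kms, prime and atomic messages, and 0x1f additionally enables core messages.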
Comment 2 Adric Blake 2018-08-25 20:59:57 UTC
I tested the drm-tip kernel on Aug 15, but I could not replicate the bug at hand. I bisected to what appears to be a merge into the 4.18(rc) drm-tip tree. Any commit during bisection that was based on a 4.17 kernel did not show the bug, though I can't determine whether the bug originated before (or because of) the 4.18 merge. Or perhaps Git is confusing me, which is also a possibility. But as an example, commit 05c72e77 ("drm/i915: Nuke the LVDS lid notifier") did not show the bug, even though it seems to be some ancestor of the buggy commit.

From what I can make out...
old buggy commit 294f96ae8aa5: Merge tag 'drm-misc-next-2018-07-18' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
bisected-to working commit ef8e0ff97ae8: Merge tag 'drm-intel-next-2018-07-19' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

That's the smallest change that I can confirm contains a fix and can determine at this point in time. I don't know when the bug got introduced.

No logs since the bug didn't appear, though I made them just in case.
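
Editor's note: the search described in this comment looks for the first commit where the bug is gone, which is a bisection with the usual "good" and "bad" roles swapped (git supports this via `git bisect start --term-old=... --term-new=...`). A minimal sketch of the idea, assuming a simplified linear history ordered oldest to newest; real kernel history is a graph of merges, which is part of why the result above is hard to interpret. `first_fixed` and `is_fixed` are hypothetical names.

```python
from typing import Callable, Sequence


def first_fixed(commits: Sequence[str], is_fixed: Callable[[str], bool]) -> str:
    """Find the first fixed commit in a linear history.

    Assumes commits[0] is known broken, commits[-1] is known fixed,
    and that once the fix lands every later commit stays fixed.
    """
    lo, hi = 0, len(commits) - 1  # lo: last known broken, hi: first known fixed
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_fixed(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


if __name__ == "__main__":
    history = [f"c{i}" for i in range(10)]  # placeholder commit names
    print(first_fixed(history, lambda c: int(c[1:]) >= 6))
```

Each call to `is_fixed` stands for one build, boot and lid-close test, so the number of kernels to test grows only logarithmically with the length of the range.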
Comment 3 Adric Blake 2018-08-25 21:00:49 UTC
The bug still exists in the 4.18.5 kernel, I forgot to add...
Comment 4 Radosław Szwichtenberg 2018-08-28 07:43:59 UTC
Thanks Adric!

Did you manage to reproduce the problem with latest drm-tip? If yes, please attach the logfiles that Jani asked for.
Comment 5 Ville Syrjala 2018-08-28 19:12:23 UTC
commit 05c72e77ccda89ff624108b1b59a0fc43843f343
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Tue Jul 17 20:42:14 2018 +0300

    drm/i915: Nuke the LVDS lid notifier

Going to assume that one fixed this. Please reopen if that isn't the case.
Comment 6 Lakshmi 2018-09-07 14:47:04 UTC
Adric, can you confirm if this issue is resolved for you based on the above commit changes?
Comment 7 Adric Blake 2018-09-08 13:09:09 UTC
Linux stable (4.18.6) doesn't work (bug is still there).
Linux from torvalds git (as of yesterday) works, so Linux 4.19 should be fixed.
Linux from drm-tip git (as of yesterday) also works.

commit 05c72e77ccda89ff624108b1b59a0fc43843f343
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Tue Jul 17 20:42:14 2018 +0300

    drm/i915: Nuke the LVDS lid notifier

appears to be applied to linux stable, so I don't think that commit was the one to fix the bug.

I'll reopen until 4.18.y stable is fixed (or until 4.19 comes out).
Comment 8 Ville Syrjala 2018-09-10 12:05:26 UTC
(In reply to Adric Blake from comment #7)
> Linux stable (4.18.6) doesn't work (bug is still there).

Because the fix isn't included there. Resolving back to fixed.
Comment 9 Lakshmi 2018-10-04 17:57:41 UTC
Adric, can you please verify this with latest kernel?
Comment 10 Lakshmi 2018-10-17 14:23:27 UTC
Adric, have you got any results from latest kernel?
Comment 11 plantroon 2018-11-29 22:33:10 UTC
Created attachment 142661
drm.debug=0x1e dmesg; actions taken: boot, close lid, open lid, dpms force on, close lid, open lid, result=blank screen with cursor visible

I am getting the same behavior as described in the original report.

system info: Debian Stretch 9.6 on Thinkpad X200, GPU: GMA4500
xserver-xorg-video-intel version: 2:2.99.917+git20161206-1
affected kernel: 4.9.0-8-amd64
last working kernel: 4.9.0-7-amd64
kernel 4.18.0-0.bpo.1 from unstable is also affected
Comment 12 Adric Blake 2018-11-29 22:48:46 UTC
As I have been made aware, the fix is not in the 4.18 kernel; it is in the 4.19 kernel. If you can get your hands on a 4.19 kernel, please test whether it works. If the bug still exists there, that would merit a new bug report.

For the record, the problem has not happened again on my GM45 system with kernels 4.19 onward.
Comment 13 plantroon 2018-11-29 23:20:32 UTC
I installed 4.19-0-trunk from unstable repository and the bug is gone. What are my chances of getting a fix backported to 4.9 stable releases? For now, I am sticking with the last working kernel from stable.
Comment 14 Adric Blake 2018-11-30 00:19:23 UTC
The chances are very slim, since the fix is not a small code change. If it could be backported easily, it would at least have made it into 4.18. The drm/i915 codebase changes rather rapidly, so across a version gap that large it is very difficult for fixes written for the latest kernel to apply cleanly to an old kernel like 4.9.

