Bug 107827

Summary: [Regression] [bisected] drm/atomic: "plane A assertion failure" after lid close/lid open on older Intel Mobile GME965/GLE960
Product: DRI Reporter: Karsten Hohmeier <dri>
Component: DRM/IntelAssignee: Maarten Lankhorst <bugs>
Status: CLOSED FIXED QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: major    
Priority: medium CC: bugs, intel-gfx-bugs
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard: Triaged
i915 platform: i915 features: display/atomic
Attachments:
Description Flags
crtc_state->plane_mask paranoia none

Description Karsten Hohmeier 2018-09-04 21:32:29 UTC
Hello maintainers,

Recently Kernel 4.17.17 trickeled down to me via Debian backports. After closing and opening the lid on my older HP Compaq 6720s laptop with Mobile GME965/GLE960 graphics I get garbled screen output and the following error:

systemd-logind[486]: Lid closed.
kernel: pci 0000:00:1e.0: PCI bridge to [bus 02]
systemd-logind[486]: Lid opened.
kernel: pci 0000:00:1e.0: PCI bridge to [bus 02]
kernel: ------------[ cut here ]------------
kernel: plane A assertion failure (expected on, current off)
kernel: WARNING: CPU: 0 PID: 726 at drivers/gpu/drm/i915/intel_display.c:1274 assert_plane+0x87/0x90 [i
kernel: Modules linked in: ctr ccm fuse cpufreq_userspace cpufreq_conservative cpufreq_powersave arc4 i
kernel:  e1000e usbcore fan thermal
kernel: CPU: 0 PID: 726 Comm: Xorg Not tainted 4.18.5 #1
kernel: Hardware name: Hewlett-Packard HP Compaq 6720s/30D8, BIOS 68MDU Ver. F.0D 11/04/2008
kernel: RIP: 0010:assert_plane+0x87/0x90 [i915]
kernel: Code: b5 af c0 84 c0 48 c7 c2 a1 b5 af c0 48 89 f1 48 c7 c7 20 e7 b0 c0 48 0f 44 ca 40 84 ed 48
kernel: RSP: 0018:ffffb38000b5bb08 EFLAGS: 00010282
kernel: RAX: 0000000000000000 RBX: ffff96e4f5ebf800 RCX: 0000000000000006
kernel: RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffff96e4ffc16730
kernel: RBP: 0000000000000001 R08: 00000000000787af R09: 000000000000030c
kernel: R10: ffffed2c84dfbf40 R11: ffffffffb89a2f4d R12: 0000000000000000
kernel: R13: ffff96e3f672b000 R14: 0000000000000002 R15: ffff96e4f8758328
kernel: FS:  00007f332ae69a40(0000) GS:ffff96e4ffc00000(0000) knlGS:0000000000000000
kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 00007f2a141a4db0 CR3: 0000000137c74000 CR4: 00000000000006f0
kernel: Call Trace:
kernel:  intel_atomic_commit_tail+0xa93/0xd70 [i915]
kernel:  intel_atomic_commit+0x2ad/0x2e0 [i915]
kernel:  drm_atomic_helper_set_config+0x66/0x90 [drm_kms_helper]
kernel:  __drm_mode_set_config_internal+0x67/0x120 [drm]
kernel:  drm_mode_setcrtc+0x476/0x600 [drm]
kernel:  ? drm_mode_getcrtc+0x180/0x180 [drm]
kernel:  drm_ioctl_kernel+0xaa/0xf0 [drm]
kernel:  ? ___sys_recvmsg+0x18b/0x230
kernel:  drm_ioctl+0x2e4/0x380 [drm]
kernel:  ? drm_mode_getcrtc+0x180/0x180 [drm]
kernel:  do_vfs_ioctl+0xa2/0x630
kernel:  ? __sys_recvmsg+0x60/0xa0
kernel:  ? __sys_recvmsg+0x8f/0xa0
kernel:  ksys_ioctl+0x70/0x80
kernel:  __x64_sys_ioctl+0x16/0x20
kernel:  do_syscall_64+0x55/0xf0
kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
kernel: RIP: 0033:0x7f332887edd7
kernel: Code: 00 00 00 48 8b 05 c1 80 2b 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84
kernel: RSP: 002b:00007fff129c0928 EFLAGS: 00003246 ORIG_RAX: 0000000000000010
kernel: RAX: ffffffffffffffda RBX: 000000000000000e RCX: 00007f332887edd7
kernel: RDX: 00007fff129c0960 RSI: 00000000c06864a2 RDI: 000000000000000e
kernel: RBP: 00007fff129c0960 R08: 0000000000000000 R09: 000055f90fb2e570
kernel: R10: 00007fff129c0b00 R11: 0000000000003246 R12: 00000000c06864a2
kernel: R13: 000000000000000e R14: 000055f90ea93770 R15: 000055f90e8a3af0
kernel: ---[ end trace 1f4e17b459c3e35d ]---


I bisected this to:


c81350c31d0d20661a0aa839b79182bcb0e7a45d is the first bad commit
commit c81350c31d0d20661a0aa839b79182bcb0e7a45d
Author: Satendra Singh Thakur <satendra.t@samsung.com>
Date:   Thu May 3 11:19:32 2018 +0530

    drm/atomic: Handling the case when setting old crtc for plane
    
    [ Upstream commit fc2a69f3903dfd97cd47f593e642b47918c949df ]
    
    In the func drm_atomic_set_crtc_for_plane, with the current code,
    if crtc of the plane_state and crtc passed as argument to the func
    are same, entire func will executed in vein.
    It will get state of crtc and clear and set the bits in plane_mask.
    All these steps are not required for same old crtc.
    Ideally, we should do nothing in this case, this patch handles the same,
    and causes the program to return without doing anything in such scenario.
    
    Signed-off-by: Satendra Singh Thakur <satendra.t@samsung.com>
    Cc: Madhur Verma <madhur.verma@samsung.com>
    Cc: Hemanshu Srivastava <hemanshu.s@samsung.com>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Link: https://patchwork.freedesktop.org/patch/msgid/1525326572-25854-1-git-send-email-satendra.t@samsung.com
    Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

:040000 040000 3417076ff5f24a0c378d744e5cade5dd4c6fb184 995e08ea110cdaa338636a08b1d3d50ea50ab398 M	drivers
---
 drivers/gpu/drm/drm_atomic.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -1091,7 +1091,9 @@ drm_atomic_set_crtc_for_plane(struct drm
 {
 	struct drm_plane *plane = plane_state->plane;
 	struct drm_crtc_state *crtc_state;
-
+	/* Nothing to do for same crtc*/
+	if (plane_state->crtc == crtc)
+		return 0;
 	if (plane_state->crtc) {
 		crtc_state = drm_atomic_get_crtc_state(plane_state->state,
 						       plane_state->crtc);



Above patch adds an early exit to drm_atomic_set_crtc_for_plane() (for no good reason?) and breaks my system.

Reverting this patch makes all newer kernels work again with my system (tested up to 4.18.5).

I would not mind if this patch got reverted upstream as well.

Kind Regards

Karsten Hohmeier
Comment 1 Chris Wilson 2018-09-04 21:41:37 UTC
Please do send the revert to dri-devel@lists.freedesktop.org, cc intel-gfx@lists.freedesktop.org citing this bugzilla for the regression. As the original patch purports to be a simple optimisation, we should revert until the root cause is found and the patch reinstated.

You will want to append:
Fixes: fc2a69f3903d ("drm/atomic: Handling the case when setting old crtc for plane")
Cc: Satendra Singh Thakur <satendra.t@samsung.com>
Cc: Madhur Verma <madhur.verma@samsung.com>
Cc: Hemanshu Srivastava <hemanshu.s@samsung.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: David Airlie <airlied@linux.ie>
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org>
Comment 2 Maarten Lankhorst 2018-09-05 06:41:43 UTC
Created attachment 141456 [details] [review]
crtc_state->plane_mask paranoia

Can you check where this patch gives a WARN?
Comment 3 Ville Syrjala 2018-09-05 10:09:45 UTC
Does commit 05c72e77ccda ("drm/i915: Nuke the LVDS lid notifier")
fix this?
Comment 4 Karsten Hohmeier 2018-09-05 11:45:02 UTC
(In reply to Ville Syrjala from comment #3)
> Does commit 05c72e77ccda ("drm/i915: Nuke the LVDS lid notifier")
> fix this?

I have only tested against the stable tree at git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git . Your patch is not in there yet.

Should I clone Linus'tree 4.19-rc2 which has it or should I go for your dev-tree at git://anongit.freedesktop.org/drm-intel ?
Comment 5 Ville Syrjala 2018-09-05 12:04:19 UTC
(In reply to Karsten Hohmeier from comment #4)
> (In reply to Ville Syrjala from comment #3)
> > Does commit 05c72e77ccda ("drm/i915: Nuke the LVDS lid notifier")
> > fix this?
> 
> I have only tested against the stable tree at
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git . Your patch
> is not in there yet.
> 
> Should I clone Linus'tree 4.19-rc2 which has it or should I go for your
> dev-tree at git://anongit.freedesktop.org/drm-intel ?

Doesn't really matter. Alternatively you could just cherry pick the commit from one of those trees to your stable branch (git add remote foo <url>; git fetch foo; git cherry-pick <sha>).
Comment 6 Karsten Hohmeier 2018-09-05 17:01:32 UTC
(In reply to Maarten Lankhorst from comment #2)
> Created attachment 141456 [details] [review] [review]
> crtc_state->plane_mask paranoia
> 
> Can you check where this patch gives a WARN?

I cherry-picked 62f77ad0969594ee428043523bf28329df191b39 "drm: Add drm_plane_mask()" on top of 4.18.5 to make your patch compile, but the result is not bootable and hangs on "fb: switching to inteldrmfb from VESA VGA".
The compiler also complained about crtc_state being used uninitialized, but that may have been intentional?
Comment 7 Karsten Hohmeier 2018-09-05 17:11:14 UTC
(In reply to Ville Syrjala from comment #5)
> (In reply to Karsten Hohmeier from comment #4)
> > (In reply to Ville Syrjala from comment #3)
> > > Does commit 05c72e77ccda ("drm/i915: Nuke the LVDS lid notifier")
> > > fix this?
> > 
> > I have only tested against the stable tree at
> > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git . Your patch
> > is not in there yet.
> > 
> > Should I clone Linus'tree 4.19-rc2 which has it or should I go for your
> > dev-tree at git://anongit.freedesktop.org/drm-intel ?
> 
> Doesn't really matter. Alternatively you could just cherry pick the commit
> from one of those trees to your stable branch (git add remote foo <url>; git
> fetch foo; git cherry-pick <sha>).

I cherry-picked commit 05c72e77ccda ("drm/i915: Nuke the LVDS lid notifier") on top of 4.18.5 as you suggested.

Yes, it works.

The screen corruption is gone and the logs are clean. The 82bcb0e7a45d ("drm/atomic: Handling the case when setting old crtc for plane") change was still in place.
Comment 8 Lakshmi 2018-09-07 12:51:09 UTC
Closing this issue as fixed.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.