Bug 107792

Summary: [CI][DRMTIP] igt@kms_frontbuffer_tracking@ - incomplete - BUG: unable to handle kernel NULL pointer dereference at 0000000000000080 in intel_psr_set_debugfs_mode
Product: DRI
Reporter: Martin Peres <martin.peres>
Component: DRM/Intel
Assignee: James Ausmus <james.ausmus>
Status: CLOSED FIXED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal
Priority: highest
CC: intel-gfx-bugs
Version: XOrg git
Hardware: Other
OS: All
Whiteboard: ReadyForDev
i915 platform: ICL
i915 features: display/PSR

Description Martin Peres 2018-09-03 08:22:18 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_98/fi-icl-u/igt@kms_frontbuffer_tracking@psr-1p-offscren-pri-shrfb-draw-mmap-cpu.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_99/fi-icl-u/igt@kms_frontbuffer_tracking@fbc-stridechange.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_99/fi-icl-u/igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-draw-mmap-cpu.html

The best pstore log is from run 99:

<1>[   42.493574] BUG: unable to handle kernel NULL pointer dereference at 0000000000000080
<6>[   42.493578] PGD 0 P4D 0 
<4>[   42.493583] Oops: 0000 [#1] PREEMPT SMP PTI
<4>[   42.493587] CPU: 5 PID: 1239 Comm: kms_frontbuffer Tainted: G     U  W         4.19.0-rc1-gb4059d5fdbfd-drmtip_99+ #1
<4>[   42.493592] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.2313.A01.1808012121 08/01/2018
<4>[   42.493599] RIP: 0010:__lock_acquire+0x47b/0x1b50
<4>[   42.493602] Code: 65 48 33 34 25 28 00 00 00 44 89 f0 0f 85 f8 0d 00 00 48 81 c4 b0 00 00 00 5b 41 5a 41 5c 41 5d 41 5e 41 5f 5d 49 8d 62 f8 c3 <49> 81 3b 80 40 82 98 41 be 00 00 00 00 44 0f 45 f0 83 fe 01 0f 87
<4>[   42.493606] RSP: 0018:ffffb8184138fb00 EFLAGS: 00010002
<4>[   42.493609] RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0000000000000000
<4>[   42.493613] RDX: 0000000000000046 RSI: 0000000000000000 RDI: 0000000000000000
<4>[   42.493617] RBP: ffffb8184138fbe0 R08: ffffffff97951638 R09: 0000000000000001
<4>[   42.493621] R10: 0000000000000000 R11: 0000000000000080 R12: ffffa3edda560040
<4>[   42.493623] R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000046
<4>[   42.493625] FS:  00007fad02675980(0000) GS:ffffa3edf0740000(0000) knlGS:0000000000000000
<4>[   42.493627] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[   42.493636] CR2: 0000000000000080 CR3: 000000049386c004 CR4: 0000000000760ee0
<4>[   42.493639] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[   42.493642] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4>[   42.493644] PKRU: 55555554
<4>[   42.493647] Call Trace:
<4>[   42.493656]  ? lock_acquire+0xa6/0x1c0
<4>[   42.493661]  ? drm_modeset_lock+0x3b/0x110
<4>[   42.493666]  ? lock_acquire+0xa6/0x1c0
<4>[   42.493669]  lock_acquire+0xa6/0x1c0
<4>[   42.493674]  ? wait_for_common+0x48/0x1f0
<4>[   42.493681]  _raw_spin_lock_irq+0x30/0x40
<4>[   42.493685]  ? wait_for_common+0x48/0x1f0
<4>[   42.493688]  wait_for_common+0x48/0x1f0
<4>[   42.493692]  ? ww_mutex_lock_interruptible+0x39/0xa0
<4>[   42.493697]  ? ww_mutex_lock_interruptible+0x39/0xa0
<4>[   42.493702]  wait_for_completion_interruptible+0x14/0x30
<4>[   42.493746]  intel_psr_set_debugfs_mode+0x99/0x240 [i915]
<4>[   42.493782]  i915_edp_psr_debug_set+0x87/0xe0 [i915]
<4>[   42.493789]  simple_attr_write+0xb0/0xd0
<4>[   42.493792]  full_proxy_write+0x51/0x80
<4>[   42.493796]  __vfs_write+0x31/0x180
<4>[   42.493799]  ? rcu_lockdep_current_cpu_online+0x8f/0xd0
<4>[   42.493802]  ? rcu_read_lock_sched_held+0x6f/0x80
<4>[   42.493804]  ? rcu_sync_lockdep_assert+0x29/0x50
<4>[   42.493807]  ? __sb_start_write+0x152/0x1f0
<4>[   42.493809]  ? __sb_start_write+0x168/0x1f0
<4>[   42.493811]  vfs_write+0xbd/0x1b0
<4>[   42.493813]  ksys_write+0x50/0xc0
<4>[   42.493816]  do_syscall_64+0x55/0x190
<4>[   42.493819]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[   42.493821] RIP: 0033:0x7fad01bd4281
<4>[   42.493824] Code: c3 0f 1f 84 00 00 00 00 00 48 8b 05 59 8d 20 00 c3 0f 1f 84 00 00 00 00 00 8b 05 8a d1 20 00 85 c0 75 16 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 57 f3 c3 0f 1f 44 00 00 41 54 55 49 89 d4 53
<4>[   42.493825] RSP: 002b:00007ffc57e45158 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
<4>[   42.493827] RAX: ffffffffffffffda RBX: 000055a3bd1bdf9d RCX: 00007fad01bd4281
<4>[   42.493829] RDX: 0000000000000003 RSI: 00007fad02279fc4 RDI: 0000000000000006
<4>[   42.493830] RBP: 00007ffc57e45180 R08: 0000000000000000 R09: 0000000000000000
<4>[   42.493833] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fad022785ac
<4>[   42.493834] R13: 000055a3bd1bdf8b R14: 000055a3bd1bdf79 R15: 000055a3bd1bdf79
<4>[   42.493837] Modules linked in: snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 ax88179_178a x86_pkg_temp_thermal usbnet coretemp mii crct10dif_pclmul snd_hda_intel crc32_pclmul ghash_clmulni_intel snd_hda_codec e1000e snd_hwdep snd_hda_core snd_pcm prime_numbers
<0>[   42.493865] Dumping ftrace buffer:
<0>[   42.493868] ---------------------------------
[...]

and

<14>[   39.678421] [IGT] kms_frontbuffer_tracking: starting subtest psr-1p-primscrn-spr-indfb-draw-mmap-cpu
<5>[   39.678585] Setting dangerous option enable_fbc - tainting kernel
<7>[   39.678755] [drm:i915_edp_psr_debug_set [i915]] Setting PSR debug to f
<7>[   39.678813] [drm:intel_psr_set_debugfs_mode [i915]] Invalid debug mask f
<7>[   39.678963] [drm:i915_edp_psr_debug_set [i915]] Setting PSR debug to 1
<1>[   39.678988] BUG: unable to handle kernel NULL pointer dereference at 0000000000000080
<6>[   39.678992] PGD 0 P4D 0 
<4>[   39.678998] Oops: 0000 [#1] PREEMPT SMP PTI
<4>[   39.679005] CPU: 0 PID: 1245 Comm: kms_frontbuffer Tainted: G     U  W         4.19.0-rc1-gb4059d5fdbfd-drmtip_99+ #1
<4>[   39.679008] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.2313.A01.1808012121 08/01/2018
<4>[   39.679015] RIP: 0010:__lock_acquire+0x47b/0x1b50
<4>[   39.679021] Code: 65 48 33 34 25 28 00 00 00 44 89 f0 0f 85 f8 0d 00 00 48 81 c4 b0 00 00 00 5b 41 5a 41 5c 41 5d 41 5e 41 5f 5d 49 8d 62 f8 c3 <49> 81 3b 80 40 82 bb 41 be 00 00 00 00 44 0f 45 f0 83 fe 01 0f 87
<4>[   39.679024] RSP: 0018:ffffb3c241343b00 EFLAGS: 00010002
<4>[   39.679030] RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0000000000000000
<4>[   39.679033] RDX: 0000000000000046 RSI: 0000000000000000 RDI: 0000000000000000
<4>[   39.679035] RBP: ffffb3c241343be0 R08: ffffffffba951638 R09: 0000000000000001
<4>[   39.679041] R10: 0000000000000000 R11: 0000000000000080 R12: ffffa084e0fec040
<4>[   39.679045] R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000046
<4>[   39.679048] FS:  00007ff8f5598980(0000) GS:ffffa084f0600000(0000) knlGS:0000000000000000
<4>[   39.679050] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[   39.679057] CR2: 0000000000000080 CR3: 0000000491364004 CR4: 0000000000760ef0
<4>[   39.679059] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[   39.679062] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4>[   39.679067] PKRU: 55555554
<4>[   39.679069] Call Trace:
<4>[   39.679074]  ? lock_acquire+0xa6/0x1c0
<4>[   39.679079]  ? drm_modeset_lock+0x3b/0x110
<4>[   39.679084]  ? lock_acquire+0xa6/0x1c0
<4>[   39.679087]  lock_acquire+0xa6/0x1c0
<4>[   39.679096]  ? wait_for_common+0x48/0x1f0
<4>[   39.679103]  _raw_spin_lock_irq+0x30/0x40
<4>[   39.679106]  ? wait_for_common+0x48/0x1f0
<4>[   39.679111]  wait_for_common+0x48/0x1f0
<4>[   39.679117]  ? ww_mutex_lock_interruptible+0x39/0xa0
<4>[   39.679121]  ? ww_mutex_lock_interruptible+0x39/0xa0
<4>[   39.679128]  wait_for_completion_interruptible+0x14/0x30
<4>[   39.679180]  intel_psr_set_debugfs_mode+0x99/0x240 [i915]
<4>[   39.679223]  i915_edp_psr_debug_set+0x87/0xe0 [i915]
<4>[   39.679244]  simple_attr_write+0xb0/0xd0
<4>[   39.679250]  full_proxy_write+0x51/0x80
<4>[   39.679254]  __vfs_write+0x31/0x180
<4>[   39.679257]  ? rcu_lockdep_current_cpu_online+0x8f/0xd0
<4>[   39.679260]  ? rcu_read_lock_sched_held+0x6f/0x80
<4>[   39.679262]  ? rcu_sync_lockdep_assert+0x29/0x50
<4>[   39.679264]  ? __sb_start_write+0x152/0x1f0
<4>[   39.679266]  ? __sb_start_write+0x168/0x1f0
<4>[   39.679268]  vfs_write+0xbd/0x1b0
<4>[   39.679271]  ksys_write+0x50/0xc0
<4>[   39.679274]  do_syscall_64+0x55/0x190
<4>[   39.679277]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[   39.679279] RIP: 0033:0x7ff8f4af7281
<4>[   39.679282] Code: c3 0f 1f 84 00 00 00 00 00 48 8b 05 59 8d 20 00 c3 0f 1f 84 00 00 00 00 00 8b 05 8a d1 20 00 85 c0 75 16 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 57 f3 c3 0f 1f 44 00 00 41 54 55 49 89 d4 53
<4>[   39.679283] RSP: 002b:00007ffe83366688 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
<4>[   39.679286] RAX: ffffffffffffffda RBX: 000055a880375f93 RCX: 00007ff8f4af7281
<4>[   39.679287] RDX: 0000000000000003 RSI: 00007ff8f519cfc4 RDI: 0000000000000006
<4>[   39.679288] RBP: 00007ffe833666b0 R08: 0000000000000000 R09: 0000000000000000
<4>[   39.679290] R10: 0000000000000000 R11: 0000000000000246 R12: 000055a880375f97
<4>[   39.679292] R13: 00007ff8f519b587 R14: 000055a880375f6a R15: 000055a880375f70
<4>[   39.679294] Modules linked in: snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 ax88179_178a usbnet mii x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core e1000e snd_pcm prime_numbers
<0>[   39.679311] Dumping ftrace buffer:
<0>[   39.679312] ---------------------------------
Comment 1 Martin Peres 2018-09-03 08:23:04 UTC
Bumping the priority because this crash stops the test run and reduces coverage on ICL.
Comment 2 Dhinakaran Pandiyan 2018-09-04 20:56:40 UTC
There is a patch from Chris to fix this:

drm/i915: Be defensive and don't assume PSR has any commit to sync against

If the previous modeset commit has completed and is no longer part of
the crtc state, skip waiting for it.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107792
Fixes: c44301fce614 ("drm/i915: Allow control of PSR at runtime through debugfs, v6")
Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst at linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi at intel.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan at intel.com>
Comment 3 Chris Wilson 2018-09-04 21:04:16 UTC
Actually pushed,

commit 9d3f8d2ff777b94993581bdfe5c595c619429624 (drm-intel/drm-intel-next-queued)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Sep 4 17:29:02 2018 +0100

    drm/i915: Be defensive and don't assume PSR has any commit to sync against
    
    If the previous modeset commit has completed and is no longer part of
    the crtc state, skip waiting for it.
    
    Ville pointed out that, in fact, the commit is never removed after a
    modeset so the only way we could see a NULL here should be if there was
    never a commit attached. Nevertheless, we have the evidence it can be
    NULL and it has been defended against elsewhere, for example commit
    93313538c153 ("drm/i915: Pass idle crtc_state to intel_dp_sink_crc").

That we also check for commit being NULL elsewhere was the clincher.
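
For readers following along, here is a minimal userspace sketch of the kind of guard the commit message describes: only wait on the commit's completion when a commit is actually attached to the crtc state. This is not the i915 patch itself. The struct layouts, `wait_stub()` (standing in for `wait_for_completion_interruptible()`), and `psr_sync_with_commit()` are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct completion {
	int done;
};

struct commit {
	struct completion hw_done;
};

struct crtc_state {
	struct commit *commit;	/* may be NULL if no commit was ever attached */
};

/* Counts how often the (stubbed) wait was actually entered. */
static int wait_calls;

/* Stub for wait_for_completion_interruptible(): 0 if done, -1 otherwise. */
static int wait_stub(struct completion *c)
{
	wait_calls++;
	return c->done ? 0 : -1;
}

/*
 * Sync against the previous commit, if there is one.
 * Returns 0 when there is nothing to wait for or the wait succeeded.
 */
static int psr_sync_with_commit(const struct crtc_state *state)
{
	struct commit *commit;

	assert(state != NULL);

	commit = state->commit;
	if (!commit)
		return 0;	/* be defensive: no commit, skip the wait */

	return wait_stub(&commit->hw_done);
}
```

Without the NULL check, taking the address of `hw_done` through a NULL `commit` gives a small non-zero pointer (NULL plus the member offset), which would be consistent with the fault address 0000000000000080 in the oops, although the trace alone does not prove that this is the exact field involved.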
Comment 4 Martin Peres 2018-09-07 15:29:16 UTC
(In reply to Chris Wilson from comment #3)
> Actually pushed,
> 
> commit 9d3f8d2ff777b94993581bdfe5c595c619429624
> (drm-intel/drm-intel-next-queued)
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Tue Sep 4 17:29:02 2018 +0100
> 
>     drm/i915: Be defensive and don't assume PSR has any commit to sync against
>     
>     If the previous modeset commit has completed and is no longer part of
>     the crtc state, skip waiting for it.
>     
>     Ville pointed out that, in fact, the commit is never removed after a
>     modeset so the only way we could see a NULL here should be if there was
>     never a commit attached. Nevertheless, we have the evidence it can be
>     NULL and it has been defended against elsewhere, for example commit
>     93313538c153 ("drm/i915: Pass idle crtc_state to intel_dp_sink_crc").
> 
> That we also check for commit being NULL elsewhere was the clincher.

Thanks!
