Bug 105959 - [CI] igt@kms_* - incomplete - softdog - backtrace in dmesg indicates: NULL deref in intel_psr_flush
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: high normal
Assignee: Jose Roberto de Souza
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Duplicates: 106068 106071
Depends on:
Blocks:
 
Reported: 2018-04-09 13:08 UTC by Marta Löfstedt
Modified: 2019-01-04 15:52 UTC (History)

See Also:
i915 platform: CFL, KBL, SKL
i915 features: display/PSR



Description Marta Löfstedt 2018-04-09 13:08:27 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_16/fi-cfl-s3/igt@kms_frontbuffer_tracking@psr-1p-pri-indfb-multidraw.html

From run.log:
running: igt/kms_frontbuffer_tracking/psr-1p-pri-indfb-multidraw

[68/75] skip: 8, pass: 59, fail: 1 |                            
owatch: TIMEOUT!

from dmesg:
<1>[  539.599239] BUG: unable to handle kernel NULL pointer dereference at 00000000000005e0
<1>[  539.599262] IP: intel_psr_flush+0x64/0x140 [i915]
<6>[  539.599263] PGD 0 P4D 0 
<4>[  539.599266] Oops: 0000 [#1] PREEMPT SMP PTI
<0>[  539.599268] Dumping ftrace buffer:
...
<4>[  539.630887] Modules linked in: vgem cdc_ether usbnet i915 r8152 x86_pkg_temp_thermal mii intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000e mei_me mei prime_numbers
<4>[  539.630898] CPU: 5 PID: 2267 Comm: kms_frontbuffer Tainted: G     U           4.16.0-rc7-g1be073153147-drmtip_16+ #1
<4>[  539.630899] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWR1.R00.X118.B19.1802080131 02/08/2018
<4>[  539.630919] RIP: 0010:intel_psr_flush+0x64/0x140 [i915]
<4>[  539.630920] RSP: 0018:ffffa479402c7b58 EFLAGS: 00010286
<4>[  539.630922] RAX: 0000000000000000 RBX: ffff97f4113e0000 RCX: 0000000000000001
<4>[  539.630923] RDX: 0000000080000001 RSI: ffffffffffffffff RDI: 0000000000000000
<4>[  539.630924] RBP: 0000000000000100 R08: 000000004e19dbcc R09: 0000000000000000
<4>[  539.630925] R10: ffffa479402c7b58 R11: ffff97f4113ea638 R12: ffff97f4113ea5d8
<4>[  539.630927] R13: 0000000000000100 R14: 0000000000000004 R15: ffffffff93e89250
<4>[  539.630928] FS:  00007f6b60eb8980(0000) GS:ffff97f41d340000(0000) knlGS:0000000000000000
<4>[  539.630929] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  539.630930] CR2: 00000000000005e0 CR3: 00000004494f6003 CR4: 00000000003606e0
<4>[  539.630932] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[  539.630933] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4>[  539.630934] Call Trace:
<4>[  539.630954]  intel_frontbuffer_flush+0x6a/0x80 [i915]
<4>[  539.630975]  intel_prepare_plane_fb+0x2c0/0x5a0 [i915]
<4>[  539.630979]  drm_atomic_helper_prepare_planes+0x4a/0xd0
<4>[  539.630999]  intel_atomic_commit+0xa9/0x320 [i915]
<4>[  539.631002]  set_property_atomic+0x100/0x130
<4>[  539.631005]  drm_mode_obj_set_property_ioctl+0xf1/0x1a0
<4>[  539.631008]  ? drm_mode_obj_find_prop_id+0x40/0x40
<4>[  539.631010]  drm_ioctl_kernel+0x7c/0xf0
<4>[  539.631012]  drm_ioctl+0x2e6/0x3a0
<4>[  539.631014]  ? drm_mode_obj_find_prop_id+0x40/0x40
<4>[  539.631019]  do_vfs_ioctl+0xa0/0x6c0
<4>[  539.631021]  ? __task_pid_nr_ns+0xbd/0x1d0
<4>[  539.631023]  SyS_ioctl+0x36/0x70
<4>[  539.631026]  ? do_syscall_64+0x19/0x1b0
<4>[  539.631048]  do_syscall_64+0x6b/0x1b0
<4>[  539.631051]  entry_SYSCALL_64_after_hwframe+0x42/0xb7
<4>[  539.631052] RIP: 0033:0x7f6b6012a5d7
<4>[  539.631053] RSP: 002b:00007ffe90456698 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
<4>[  539.631055] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f6b6012a5d7
<4>[  539.631056] RDX: 00007ffe904566d0 RSI: 00000000c01864ba RDI: 0000000000000003
<4>[  539.631058] RBP: 00007ffe904566d0 R08: 0000000000000001 R09: 0000000000000000
<4>[  539.631059] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c01864ba
<4>[  539.631060] R13: 0000000000000003 R14: 0000558e2337a401 R15: 0000558e2337a401
<4>[  539.631076] Code: 48 89 fb 31 f6 4c 89 e7 e8 aa 1b 35 d3 48 8b 83 60 a6 00 00 48 85 c0 0f 84 9f 00 00 00 48 8b 80 48 ff ff ff 48 c7 c6 ff ff ff ff <48> 63 90 e0 05 00 00 b8 01 00 00 00 8d 0c d5 00 00 00 00 48 d3 
<1>[  539.631123] RIP: intel_psr_flush+0x64/0x140 [i915] RSP: ffffa479402c7b58
<4>[  539.631124] CR2: 00000000000005e0
<4>[  539.631126] ---[ end trace 520ee23b7ec0823c ]---
<3>[  539.953232] BUG: sleeping function called from invalid context at ./include/linux/percpu-rwsem.h:34
<3>[  539.953237] in_atomic(): 0, irqs_disabled(): 1, pid: 2267, name: kms_frontbuffer
<4>[  539.953238] INFO: lockdep is turned off.
<4>[  539.953239] irq event stamp: 371690
<4>[  539.953244] hardirqs last  enabled at (371689): [<000000005bdd2146>] _raw_spin_unlock_irqrestore+0x4c/0x60
<4>[  539.953246] hardirqs last disabled at (371690): [<000000001e15f6aa>] error_entry+0x78/0xf0
<4>[  539.953248] softirqs last  enabled at (371304): [<00000000711d937a>] __do_softirq+0x32b/0x4e1
<4>[  539.953251] softirqs last disabled at (371285): [<000000009aaa05f8>] irq_exit+0xa4/0xb0
<4>[  539.953253] CPU: 5 PID: 2267 Comm: kms_frontbuffer Tainted: G     UD          4.16.0-rc7-g1be073153147-drmtip_16+ #1
<4>[  539.953255] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWR1.R00.X118.B19.1802080131 02/08/2018
<4>[  539.953256] Call Trace:
<4>[  539.953259]  dump_stack+0x67/0x95
<4>[  539.953262]  ___might_sleep+0x167/0x250
<4>[  539.953265]  exit_signals+0x2b/0x2d0
<4>[  539.953267]  do_exit+0xa3/0xd40
<4>[  539.953270]  ? SyS_ioctl+0x36/0x70
<4>[  539.953272]  ? do_syscall_64+0x19/0x1b0
<4>[  539.953275]  rewind_stack_do_exit+0x17/0x20
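
The oops above can be cross-checked from its Code: line alone. The bytes at the fault marker, `<48> 63 90 e0 05 00 00`, encode `movslq 0x5e0(%rax),%rdx`, and the register dump shows RAX = 0, so the faulting address is 0 + 0x5e0, which matches CR2. Below is a minimal Python sketch of that check. It assumes the faulting instruction has exactly this encoding (REX.W, opcode 63, ModRM 90, 32-bit displacement); the helper name `faulting_displacement` is hypothetical and this is not a general x86 decoder.

```python
def faulting_displacement(code_line):
    """Return the 32-bit displacement of the faulting instruction, or None.

    Only handles 'movslq disp32(%rax),%rdx', i.e. the byte pattern
    48 63 90 followed by a 4-byte little-endian displacement.
    """
    tokens = code_line.split()
    # The kernel marks the first byte of the faulting instruction with <..>.
    marker = next(
        (i for i, tok in enumerate(tokens)
         if tok.startswith("<") and tok.endswith(">")),
        None,
    )
    if marker is None:
        return None
    insn = [tok.strip("<>") for tok in tokens[marker:marker + 7]]
    if len(insn) < 7 or insn[:3] != ["48", "63", "90"]:
        return None  # not the encoding this sketch understands
    return int.from_bytes(bytes.fromhex("".join(insn[3:7])), "little")


# Tail of the Code: line from the description, and the reported CR2.
CODE = ("48 8b 80 48 ff ff ff 48 c7 c6 ff ff ff ff "
        "<48> 63 90 e0 05 00 00 b8 01 00 00 00")
CR2 = 0x5E0

disp = faulting_displacement(CODE)
print(hex(disp), "matches CR2:", disp == CR2)
```

With RAX = 0, a displacement equal to CR2 means the base pointer itself was NULL and the code was reading a field at that offset inside the structure it expected.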
Comment 1 Marta Löfstedt 2018-04-16 07:57:22 UTC
*** Bug 106068 has been marked as a duplicate of this bug. ***
Comment 2 Marta Löfstedt 2018-04-16 07:57:55 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_21/fi-kbl-r/igt@kms_frontbuffer_tracking@fbc-rgb101010-draw-blt.html

softdog, but pstore has:
<4>[  340.053536] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 asix btusb usbnet btrtl btbcm mii btintel bluetooth snd_hda_intel x86_pkg_temp_thermal intel_powerclamp snd_hda_codec coretemp crct10dif_pclmul crc32_pclmul snd_hwdep ghash_clmulni_intel snd_hda_core e1000e snd_pcm ecdh_generic mei_me mei prime_numbers pinctrl_sunrisepoint pinctrl_intel
<4>[  340.053560] CPU: 4 PID: 142 Comm: kworker/4:3 Tainted: G     U           4.16.0-rc7-ga0e39233b887-drmtip_21+ #1
<4>[  340.053562] Hardware name: Intel Corporation Kabylake Client platform/Kabylake R DDR4 RVP, BIOS KBLSE2R1.R00.X078.P02.1703030515 03/03/2017
<4>[  340.053584] Workqueue: events i915_clflush_work [i915]
<4>[  340.053612] RIP: 0010:intel_psr_flush+0x61/0x150 [i915]
<4>[  340.053614] RSP: 0018:ffffab8f00537dd0 EFLAGS: 00010286
<4>[  340.053616] RAX: 0000000000000000 RBX: ffff8d41a385a5b8 RCX: 0000000000000000
<4>[  340.053618] RDX: 0000000080000001 RSI: ffffffffffffffff RDI: 00000000ffffffff
<4>[  340.053619] RBP: 0000000000000001 R08: ffff8d41b1a7d948 R09: 00000000ad9315fd
<4>[  340.053621] R10: ffffab8f00537dd0 R11: ffff8d41b1a7d040 R12: ffff8d41a3850000
<4>[  340.053622] R13: 0000000000000001 R14: 0000000000000001 R15: ffff8d41a89e25c8
<4>[  340.053624] FS:  0000000000000000(0000) GS:ffff8d41bed00000(0000) knlGS:0000000000000000
<4>[  340.053625] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  340.053627] CR2: 00000000000005e0 CR3: 0000000251210006 CR4: 00000000003606e0
<4>[  340.053628] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[  340.053630] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4>[  340.053631] Call Trace:
<4>[  340.053661]  intel_frontbuffer_flush+0x6a/0x80 [i915]
<4>[  340.053684]  i915_clflush_work+0x73/0x1c0 [i915]
<4>[  340.053688]  process_one_work+0x21a/0x640
<4>[  340.053691]  worker_thread+0x48/0x3a0
<4>[  340.053695]  kthread+0xfb/0x130
<4>[  340.053697]  ? process_one_work+0x640/0x640
<4>[  340.053699]  ? _kthread_create_on_node+0x60/0x60
<4>[  340.053702]  ret_from_fork+0x3a/0x50
<4>[  340.053706] Code: b8 a5 00 00 89 f5 31 f6 48 89 df e8 aa 75 26 d0 49 8b 84 24 40 a6 00 00 48 85 c0 74 7d 48 8b 80 48 ff ff ff 48 c7 c6 ff ff ff ff <48> 63 90 e0 05 00 00 b8 01 00 00 00 8d 0c d5 00 00 00 00 48 d3 
<1>[  340.053771] RIP: intel_psr_flush+0x61/0x150 [i915] RSP: ffffab8f00537dd0
<4>[  340.053773] CR2: 00000000000005e0
<4>[  340.053775] ---[ end trace cc8235f3734af51b ]---
Comment 3 Marta Löfstedt 2018-04-18 06:24:10 UTC
*** Bug 106071 has been marked as a duplicate of this bug. ***
Comment 4 Marta Löfstedt 2018-04-18 06:55:10 UTC
Setting priority to highest due to CFL-U.
Comment 13 Martin Peres 2018-06-11 13:47:49 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_59/fi-cfl-s3/igt@kms_rotation_crc@sprite-rotation-180.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_60/fi-cfl-s3/igt@kms_rotation_crc@sprite-rotation-180.html

<0>[  366.252050] ---------------------------------
<4>[  366.252050] Modules linked in: i2c_dev vgem cdc_ether usbnet r8152 mii x86_pkg_temp_thermal intel_powerclamp i915 coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel e1000e mei_me mei prime_numbers
<4>[  366.252050] CPU: 2 PID: 86 Comm: kworker/2:1 Tainted: G     U  W         4.17.0-rc7-g104ae0f882a6-drmtip_60+ #1
<4>[  366.252050] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWR1.R00.X118.B19.1802080131 02/08/2018
<4>[  366.252050] Workqueue: events i915_clflush_work [i915]
<4>[  366.252050] RIP: 0010:intel_psr_flush+0x4f/0x130 [i915]
<4>[  366.252050] RSP: 0018:ffff9dc700397df0 EFLAGS: 00010282
<4>[  366.252050] RAX: 0000000000000000 RBX: ffff8dc4bc780000 RCX: 0000000000000001
<4>[  366.252050] RDX: 0000000080000001 RSI: 0000000000000001 RDI: 0000000000000000
<4>[  366.252050] RBP: 0000000000000002 R08: 00000000068cf6d4 R09: 0000000000000000
<4>[  366.252050] R10: ffff9dc700397df0 R11: ffff8dc4bc78a310 R12: ffff8dc4bc78a2b0
<4>[  366.252050] R13: 0000000000000002 R14: 0000000000000001 R15: 0000000000000000
<4>[  366.252050] FS:  0000000000000000(0000) GS:ffff8dc4cd280000(0000) knlGS:0000000000000000
<4>[  366.252050] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  366.252050] CR2: 0000000000000600 CR3: 0000000372210004 CR4: 00000000003606e0
<4>[  366.252050] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[  366.252050] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4>[  366.252050] Call Trace:
<4>[  366.252050]  intel_frontbuffer_flush+0x6a/0x80 [i915]
<4>[  366.252050]  i915_clflush_work+0x73/0x1b0 [i915]
<4>[  366.252050]  process_one_work+0x229/0x6a0
<4>[  366.252050]  worker_thread+0x35/0x380
<4>[  366.252050]  ? process_one_work+0x6a0/0x6a0
<4>[  366.252050]  kthread+0x119/0x130
<4>[  366.252050]  ? kthread_flush_work_fn+0x10/0x10
<4>[  366.252050]  ret_from_fork+0x3a/0x50
<4>[  366.252050] Code: 89 f5 48 89 fb 31 f6 4c 89 e7 e8 8d 36 53 e1 48 8b 83 38 a3 00 00 48 85 c0 0f 84 8b 00 00 00 48 8b 80 48 ff ff ff be 01 00 00 00 <48> 63 90 00 06 00 00 48 c7 c0 ff ff ff ff 8d 0c d5 00 00 00 00 
<1>[  366.252050] RIP: intel_psr_flush+0x4f/0x130 [i915] RSP: ffff9dc700397df0
<4>[  366.252050] CR2: 0000000000000600
<4>[  366.252050] ---[ end trace 51ca355488b86fa0 ]---
Comment 14 Martin Peres 2018-07-24 13:35:29 UTC
Also seen on SKL:

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_78/fi-skl-6700hq/igt@kms_busy@basic-modeset-b.html

<1>[   83.275386] BUG: unable to handle kernel NULL pointer dereference at 0000000000000600
<6>[   83.275394] PGD 0 P4D 0 
<4>[   83.275406] Oops: 0000 [#1] PREEMPT SMP PTI
<4>[   83.275414] CPU: 3 PID: 129 Comm: kworker/3:1 Tainted: G     U  W         4.18.0-rc3-g7bc1be8128e3-drmtip_78+ #1
<4>[   83.275419] Hardware name: TOSHIBA SATELLITE P50-C/06F4                            , BIOS 1.40 03/29/2016
<4>[   83.275487] Workqueue: events i915_clflush_work [i915]
<4>[   83.275580] RIP: 0010:intel_psr_flush+0x4f/0x120 [i915]
<4>[   83.275585] Code: f5 48 89 fb 31 f6 4c 89 e7 e8 7d 6a 62 f3 48 8b 83 68 a4 00 00 48 85 c0 0f 84 8b 00 00 00 48 8b 80 48 ff ff ff be 01 00 00 00 <48> 63 90 00 06 00 00 48 c7 c0 ff ff ff ff 8d 0c d5 00 00 00 00 48 
<4>[   83.275709] RSP: 0018:ffffa5f2003cfdf0 EFLAGS: 00010282
<4>[   83.275716] RAX: 0000000000000000 RBX: ffff959768e80000 RCX: 0000000000000001
<4>[   83.275721] RDX: 0000000080000001 RSI: 0000000000000001 RDI: 0000000000000000
<4>[   83.275727] RBP: 0000000000000100 R08: 0000000021bcca6d R09: 0000000000000000
<4>[   83.275732] R10: ffffa5f2003cfdf0 R11: ffff959768e8a440 R12: ffff959768e8a3e0
<4>[   83.275737] R13: 0000000000000100 R14: 0000000000000001 R15: 0000000000000000
<4>[   83.275743] FS:  0000000000000000(0000) GS:ffff959781cc0000(0000) knlGS:0000000000000000
<4>[   83.275748] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[   83.275754] CR2: 0000000000000600 CR3: 0000000147210004 CR4: 00000000003606e0
<4>[   83.275759] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[   83.275764] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<4>[   83.275768] Call Trace:
<4>[   83.275862]  intel_frontbuffer_flush+0x6a/0x80 [i915]
<4>[   83.275934]  i915_clflush_work+0x73/0x1b0 [i915]
<4>[   83.275945]  process_one_work+0x248/0x6c0
<4>[   83.275957]  worker_thread+0x37/0x380
<4>[   83.275965]  ? process_one_work+0x6c0/0x6c0
<4>[   83.275973]  kthread+0x119/0x130
<4>[   83.275981]  ? kthread_flush_work_fn+0x10/0x10
<4>[   83.275990]  ret_from_fork+0x3a/0x50
<4>[   83.276002] Modules linked in: snd_hda_codec_hdmi snd_hda_codec_generic i915 x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul snd_hda_intel ghash_clmulni_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm mei_me r8169 mei mii prime_numbers
<0>[   83.315412] ---------------------------------
<4>[   83.315414] CR2: 0000000000000600
<4>[   83.315417] ---[ end trace d699399803d6cb6f ]---
<4>[   83.657100] RIP: 0010:intel_psr_flush+0x4f/0x120 [i915]
<4>[   83.657105] Code: f5 48 89 fb 31 f6 4c 89 e7 e8 7d 6a 62 f3 48 8b 83 68 a4 00 00 48 85 c0 0f 84 8b 00 00 00 48 8b 80 48 ff ff ff be 01 00 00 00 <48> 63 90 00 06 00 00 48 c7 c0 ff ff ff ff 8d 0c d5 00 00 00 00 48 
<4>[   83.657154] RSP: 0018:ffffa5f2003cfdf0 EFLAGS: 00010282
<4>[   83.657157] RAX: 0000000000000000 RBX: ffff959768e80000 RCX: 0000000000000001
<4>[   83.657159] RDX: 0000000080000001 RSI: 0000000000000001 RDI: 0000000000000000
<4>[   83.657161] RBP: 0000000000000100 R08: 0000000021bcca6d R09: 0000000000000000
<4>[   83.657163] R10: ffffa5f2003cfdf0 R11: ffff959768e8a440 R12: ffff959768e8a3e0
<4>[   83.657165] R13: 0000000000000100 R14: 0000000000000001 R15: 0000000000000000
<4>[   83.657167] FS:  0000000000000000(0000) GS:ffff959781cc0000(0000) knlGS:0000000000000000
<4>[   83.657169] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[   83.657171] CR2: 0000000000000600 CR3: 0000000268d52006 CR4: 00000000003606e0
<4>[   83.657173] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>[   83.657175] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<3>[   83.657178] BUG: sleeping function called from invalid context at ./include/linux/percpu-rwsem.h:34
<3>[   83.657180] in_atomic(): 0, irqs_disabled(): 1, pid: 129, name: kworker/3:1
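
The dumps in this report come from different kernels and platforms, so the offsets and CR2 differ (0x5e0 on the 4.16 builds, 0x600 on 4.17 and 4.18), while the faulting function stays the same. Below is a minimal Python sketch that reduces a dmesg excerpt to a (function, module, CR2) signature and compares two of them. The helper names and the 4096-byte "NULL page" threshold are assumptions for illustration; this is not the filter the CI actually uses to match failures to bugs.

```python
import re

# "RIP: 0010:intel_psr_flush+0x64/0x140 [i915]" (segment prefix is optional)
RIP_RE = re.compile(
    r"RIP: (?:\d+:)?(?P<func>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(?P<module>\w+)\]"
)
CR2_RE = re.compile(r"CR2: (?P<addr>[0-9a-f]{16})")


def oops_signature(dmesg):
    """Return (function, module, fault address), or None if no oops is found."""
    rip = RIP_RE.search(dmesg)
    cr2 = CR2_RE.search(dmesg)
    if rip is None or cr2 is None:
        return None
    return rip.group("func"), rip.group("module"), int(cr2.group("addr"), 16)


def looks_like_same_bug(sig_a, sig_b, null_page=4096):
    """Same faulting function and module, both faults close to address zero."""
    if sig_a is None or sig_b is None:
        return False
    return (sig_a[:2] == sig_b[:2]
            and sig_a[2] < null_page
            and sig_b[2] < null_page)


# Two lines each from the CFL (description) and SKL (comment 14) dumps.
CFL = """<4>[  539.630919] RIP: 0010:intel_psr_flush+0x64/0x140 [i915]
<4>[  539.630930] CR2: 00000000000005e0 CR3: 00000004494f6003 CR4: 00000000003606e0"""
SKL = """<4>[   83.275580] RIP: 0010:intel_psr_flush+0x4f/0x120 [i915]
<4>[   83.275754] CR2: 0000000000000600 CR3: 0000000147210004 CR4: 00000000003606e0"""

print(oops_signature(CFL))
print(oops_signature(SKL))
print(looks_like_same_bug(oops_signature(CFL), oops_signature(SKL)))
```

Comparing on the function name rather than the exact offset keeps the match stable across kernel builds, at the cost of also matching unrelated NULL dereferences in the same function.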
Comment 15 James Ausmus 2018-10-26 00:16:34 UTC
Does not appear to be occurring any longer.
Comment 17 Jose Roberto de Souza 2018-12-03 18:38:40 UTC
Should be fixed with the commit that was just merged:
f0ad62a631e040ae4413286a4b46a90c5ce42d07 (drm/i915/psr: Get pipe id following atomic guidelines)
Comment 18 Francesco Balestrieri 2018-12-28 09:36:11 UTC
Some failures are still matched with this bug, but I'm not sure if they are really the same. E.g.:

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5345/shard-skl7/igt@kms_frontbuffer_tracking@fbc-suspend.html

Can someone verify?
Comment 19 Martin Peres 2019-01-04 15:52:54 UTC
(In reply to Francesco Balestrieri from comment #18)
> Some failures are still matched with this bug, but I'm not sure if they are
> really the same. E.g.:
> 
> https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5345/shard-skl7/igt@kms_frontbuffer_tracking@fbc-suspend.html
> 
> Can someone verify?

Yeah, this is the same as https://bugs.freedesktop.org/show_bug.cgi?id=107773. Closing!

