Bug 110382 - [CI][SHARDS] igt@runner@aborted - fail - general protection fault: 0000 [#1] PREEMPT SMP NOPTI, Workqueue: events azx_probe_work
Summary: [CI][SHARDS] igt@runner@aborted - fail - general protection fault: 0000 [#1] ...
Status: RESOLVED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel (show other bugs)
Version: XOrg git
Hardware: Other All
: high normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2019-04-10 14:18 UTC by Martin Peres
Modified: 2019-05-21 07:48 UTC (History)
1 user (show)

See Also:
i915 platform: ICL
i915 features: display/audio


Attachments
boot2.log (184.98 KB, text/x-log)
2019-05-07 09:46 UTC, Amadeusz Sławiński
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Martin Peres 2019-04-10 14:18:42 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5896/shard-iclb2/igt@runner@aborted.html

Aborting.
Previous test: nothing
Next test: gem_exec_basic (basic-render)

Kernel badly tainted (0x2a0) (check dmesg for details):
	(0x20) TAINT_BAD_PAGE: Bad page reference or an unexpected page flags.
	(0x80) TAINT_DIE: Kernel has died - BUG/OOPS.
	(0x200) TAINT_WARN: WARN_ON has happened.

<4>[   11.837706] general protection fault: 0000 [#1] PREEMPT SMP NOPTI
<4>[   11.837755] CPU: 4 PID: 190 Comm: kworker/4:3 Not tainted 5.1.0-rc4-CI-CI_DRM_5896+ #1
<4>[   11.837788] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.3087.A00.1902250334 02/25/2019
<4>[   11.837843] Workqueue: events azx_probe_work [snd_hda_intel]
<4>[   11.837869] RIP: 0010:kobject_uevent_env+0x38/0x810
<4>[   11.837890] Code: 53 49 89 d5 48 89 c3 48 83 ec 38 4c 8b 24 c5 c0 88 ef 81 0f 1f 44 00 00 4c 89 f0 eb 0d 48 8b 40 18 48 85 c0 0f 84 27 01 00 00 <4c> 8b 78 20 4d 85 ff 74 ea 41 f6 46 3c 10 0f 85 f9 00 00 00 49 8b
<4>[   11.837957] RSP: 0018:ffffc9000144bd18 EFLAGS: 00010296
<4>[   11.837980] RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000002 RCX: 0000000000000000
<4>[   11.838008] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 6b6b6b6b6b6b6b6b
<4>[   11.838036] RBP: ffff88848250b3f8 R08: 0000000085e480ea R09: 0000000000000001
<4>[   11.838063] R10: 0000000000000000 R11: ffff888499ee48b8 R12: ffffffff820df667
<4>[   11.839071] R13: 0000000000000000 R14: 6b6b6b6b6b6b6b6b R15: ffff88849507df00
<4>[   11.840027] FS:  0000000000000000(0000) GS:ffff88849ff00000(0000) knlGS:0000000000000000
<4>[   11.840992] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[   11.841951] CR2: 0000556b60348c78 CR3: 0000000005214004 CR4: 0000000000760ee0
<4>[   11.842915] PKRU: 55555554
<4>[   11.843882] Call Trace:
<4>[   11.844846]  ? add_widget_node+0x68/0xa0 [snd_hda_core]
<7>[   11.845330] [drm:i915_audio_component_get_eld [i915]] Not valid for port B
<4>[   11.845813]  widget_tree_create+0xf6/0x110 [snd_hda_core]
<7>[   11.846800] [drm:i915_audio_component_get_eld [i915]] Not valid for port B
<4>[   11.847717]  hda_widget_sysfs_init+0x1a/0x40 [snd_hda_core]
<4>[   11.847720]  snd_hdac_device_register+0x19/0x40 [snd_hda_core]
<7>[   11.848782] [drm:i915_audio_component_get_eld [i915]] Not valid for port B
<4>[   11.849720]  snd_hda_codec_configure+0x39/0x160 [snd_hda_codec]
<4>[   11.849724]  azx_codec_configure+0x2a/0x60 [snd_hda_codec]
<7>[   11.850981] [drm:i915_audio_component_get_eld [i915]] Not valid for port C
<4>[   11.851936]  azx_probe_work+0x42e/0x7e0 [snd_hda_intel]
<7>[   11.852973] [drm:i915_audio_component_get_eld [i915]] Not valid for port C
<4>[   11.853928]  process_one_work+0x245/0x610
<7>[   11.855224] [drm:i915_audio_component_get_eld [i915]] Not valid for port C
<4>[   11.856202]  worker_thread+0x37/0x380
<4>[   11.856204]  ? process_one_work+0x610/0x610
<7>[   11.857129] [drm:i915_audio_component_get_eld [i915]] Not valid for port D
<4>[   11.858091]  kthread+0x119/0x130
<4>[   11.858092]  ? kthread_park+0x80/0x80
<4>[   11.858095]  ret_from_fork+0x24/0x50
<7>[   11.858999] [drm:i915_audio_component_get_eld [i915]] Not valid for port D
<4>[   11.859975] Modules linked in: snd_hda_codec_hdmi(+) snd_hda_codec_realtek(+) snd_hda_codec_generic x86_pkg_temp_thermal coretemp mei_hdcp crct10dif_pclmul crc32_pclmul snd_hda_intel snd_hda_codec snd_hwdep ghash_clmulni_intel snd_hda_core cdc_ether usbnet e1000e snd_pcm mii ptp pps_core mei_me mei i915 prime_numbers
<7>[   11.860861] [drm:i915_audio_component_get_eld [i915]] Not valid for port D
<4>[   11.862139] ---[ end trace bf38e6abac39589f ]---
Comment 1 CI Bug Log 2019-04-10 14:19:09 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* ICL: igt@runner@aborted - fail - Previous test: nothing, Kernel badly tainted (0x2a0)
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5896/shard-iclb2/igt@runner@aborted.html
Comment 2 Jani Saarinen 2019-04-22 11:53:00 UTC
On shards there was BIOS update to BIOS ICLSFWR1.R00.3121.A00.1903190527 03/19/2019.
Comment 3 James Ausmus 2019-04-25 20:33:29 UTC
If we get rid of the noise from the dmesg, we get:

<4>[   11.837706] general protection fault: 0000 [#1] PREEMPT SMP NOPTI
<4>[   11.837755] CPU: 4 PID: 190 Comm: kworker/4:3 Not tainted 5.1.0-rc4-CI-CI_DRM_5896+ #1
<4>[   11.837788] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.3087.A00.1902250334 02/25/2019
<4>[   11.837843] Workqueue: events azx_probe_work [snd_hda_intel]
<4>[   11.837869] RIP: 0010:kobject_uevent_env+0x38/0x810
<4>[   11.837890] Code: 53 49 89 d5 48 89 c3 48 83 ec 38 4c 8b 24 c5 c0 88 ef 81 0f 1f 44 00 00 4c 89 f0 eb 0d 48 8b 40 18 48 85 c0 0f 84 27 01 00 00 <4c> 8b 78 20 4d 85 ff 74 ea 41 f6 46 3c 10 0f 85 f9 00 00 00 49 8b
<4>[   11.837957] RSP: 0018:ffffc9000144bd18 EFLAGS: 00010296
<4>[   11.837980] RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000002 RCX: 0000000000000000
<4>[   11.838008] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 6b6b6b6b6b6b6b6b
<4>[   11.838036] RBP: ffff88848250b3f8 R08: 0000000085e480ea R09: 0000000000000001
<4>[   11.838063] R10: 0000000000000000 R11: ffff888499ee48b8 R12: ffffffff820df667
<4>[   11.839071] R13: 0000000000000000 R14: 6b6b6b6b6b6b6b6b R15: ffff88849507df00
<4>[   11.840027] FS:  0000000000000000(0000) GS:ffff88849ff00000(0000) knlGS:0000000000000000
<4>[   11.840992] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[   11.841951] CR2: 0000556b60348c78 CR3: 0000000005214004 CR4: 0000000000760ee0
<4>[   11.842915] PKRU: 55555554
<4>[   11.843882] Call Trace:
<4>[   11.844846]  ? add_widget_node+0x68/0xa0 [snd_hda_core]
<4>[   11.845813]  widget_tree_create+0xf6/0x110 [snd_hda_core]
<4>[   11.847717]  hda_widget_sysfs_init+0x1a/0x40 [snd_hda_core]
<4>[   11.847720]  snd_hdac_device_register+0x19/0x40 [snd_hda_core]
<4>[   11.849720]  snd_hda_codec_configure+0x39/0x160 [snd_hda_codec]
<4>[   11.849724]  azx_codec_configure+0x2a/0x60 [snd_hda_codec]
<4>[   11.851936]  azx_probe_work+0x42e/0x7e0 [snd_hda_intel]
<4>[   11.853928]  process_one_work+0x245/0x610
<4>[   11.856202]  worker_thread+0x37/0x380
<4>[   11.856204]  ? process_one_work+0x610/0x610
<4>[   11.858091]  kthread+0x119/0x130
<4>[   11.858092]  ? kthread_park+0x80/0x80
<4>[   11.858095]  ret_from_fork+0x24/0x50
<4>[   11.859975] Modules linked in: snd_hda_codec_hdmi(+) snd_hda_codec_realtek(+) snd_hda_codec_generic x86_pkg_temp_thermal coretemp mei_hdcp crct10dif_pclmul crc32_pclmul snd_hda_intel snd_hda_codec snd_hwdep ghash_clmulni_intel snd_hda_core cdc_ether usbnet e1000e snd_pcm mii ptp pps_core mei_me mei i915 prime_numbers
<4>[   11.862139] ---[ end trace bf38e6abac39589f ]---

moved lines:

<7>[   11.845330] [drm:i915_audio_component_get_eld [i915]] Not valid for port B
<7>[   11.846800] [drm:i915_audio_component_get_eld [i915]] Not valid for port B
<7>[   11.848782] [drm:i915_audio_component_get_eld [i915]] Not valid for port B
<7>[   11.850981] [drm:i915_audio_component_get_eld [i915]] Not valid for port C
<7>[   11.852973] [drm:i915_audio_component_get_eld [i915]] Not valid for port C
<7>[   11.855224] [drm:i915_audio_component_get_eld [i915]] Not valid for port C
<7>[   11.857129] [drm:i915_audio_component_get_eld [i915]] Not valid for port D
<7>[   11.860861] [drm:i915_audio_component_get_eld [i915]] Not valid for port D
<7>[   11.858999] [drm:i915_audio_component_get_eld [i915]] Not valid for port D


Additional info:

<7>[   11.112111] [drm:drm_add_edid_modes] ELD: no CEA Extension found
<7>[   11.529366] [drm:drm_add_edid_modes] ELD: no CEA Extension found
<7>[   11.583388] [drm:drm_add_edid_modes] ELD: no CEA Extension found
<7>[   11.601007] [drm:drm_add_edid_modes] ELD: no CEA Extension found

-------------------------------------------------------------------


So, what appears to be happening is that there are no devices connected that advertise an ELD that is audio capable, so none of the DDI ports ever gets has_audio set, which means that:

 /* referred in audio callbacks */
        dev_priv->av_enc_map[pipe] = encoder;

from intel_audio_codec_enable (i915/intel_audio.c) never gets called, which means that the call to get_saved_enc always returns NULL, causing the


<7>[   11.845330] [drm:i915_audio_component_get_eld [i915]] Not valid for port B


lines to show up, which is i915_audio_component_get_eld returning -EINVAL, which appears to be making the audio driver blow up, as it's trying to grab ELDs for ports that don't have an audio-capable device connected.

i915 doesn't appear to be doing anything wrong here, I believe this is a bug for the audio driver
Comment 4 Chris Wilson 2019-04-25 20:37:40 UTC
(In reply to James Ausmus from comment #3)
> If we get rid of the noise from the dmesg, we get:
> 
> <4>[   11.837706] general protection fault: 0000 [#1] PREEMPT SMP NOPTI
> <4>[   11.837755] CPU: 4 PID: 190 Comm: kworker/4:3 Not tainted
> 5.1.0-rc4-CI-CI_DRM_5896+ #1
> <4>[   11.837788] Hardware name: Intel Corporation Ice Lake Client
> Platform/IceLake U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.3087.A00.1902250334
> 02/25/2019
> <4>[   11.837843] Workqueue: events azx_probe_work [snd_hda_intel]
> <4>[   11.837869] RIP: 0010:kobject_uevent_env+0x38/0x810
> <4>[   11.837890] Code: 53 49 89 d5 48 89 c3 48 83 ec 38 4c 8b 24 c5 c0 88
> ef 81 0f 1f 44 00 00 4c 89 f0 eb 0d 48 8b 40 18 48 85 c0 0f 84 27 01 00 00
> <4c> 8b 78 20 4d 85 ff 74 ea 41 f6 46 3c 10 0f 85 f9 00 00 00 49 8b
> <4>[   11.837957] RSP: 0018:ffffc9000144bd18 EFLAGS: 00010296
> <4>[   11.837980] RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000002 RCX:

0x6b6b6b6b6b6b6b6b == POISON_FREE == use-after-free in snd_hda.
Comment 5 CI Bug Log 2019-04-27 18:21:04 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* ICL - igt@runner@aborted - fail - Previous test: nothing\n.*\n.*\nKernel badly tainted \(0x280\)
  - https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_264/fi-icl-y/igt@runner@aborted.html
Comment 6 CI Bug Log 2019-04-27 18:21:36 UTC
A CI Bug Log filter associated to this bug has been updated:

{- ICL - igt@runner@aborted - fail - Previous test: nothing\n.*\n.*\nKernel badly tainted \(0x280\) -}
{+ ICL - igt@runner@aborted - fail - Previous test: nothing, Kernel badly tainted (0x280) +}


  No new failures caught with the new filter
Comment 7 Amadeusz Sławiński 2019-05-07 09:46:52 UTC
Created attachment 144186 [details]
boot2.log

From boot2.log, seems like problem with sysfs

<3>[   11.881334] BUG: sleeping function called from invalid context at kernel/sched/completion.c:99
<3>[   11.882350] =============================================================================
<3>[   11.882566] in_atomic(): 0, irqs_disabled(): 0, pid: 190, name: kworker/4:3
<3>[   11.883476] BUG kmalloc-32 (Tainted: G      D          ): Poison overwritten
<3>[   11.883477] -----------------------------------------------------------------------------

<3>[   11.883478] INFO: 0x00000000c581b909-0x00000000cf17bda8. First byte 0xe8 instead of 0x6b
<3>[   11.883484] INFO: Allocated in widget_tree_create+0x21/0x110 [snd_hda_core] age=55 cpu=4 pid=190
<3>[   11.883486] 	kmem_cache_alloc_trace+0x225/0x290
<3>[   11.883489] 	widget_tree_create+0x21/0x110 [snd_hda_core]
<3>[   11.883491] 	hda_widget_sysfs_init+0x1a/0x40 [snd_hda_core]
<3>[   11.883493] 	snd_hdac_device_register+0x19/0x40 [snd_hda_core]
<3>[   11.883496] 	snd_hda_codec_configure+0x39/0x160 [snd_hda_codec]
<3>[   11.883499] 	azx_codec_configure+0x2a/0x60 [snd_hda_codec]
<3>[   11.883501] 	azx_probe_work+0x42e/0x7e0 [snd_hda_intel]
<3>[   11.883503] 	process_one_work+0x245/0x610
<3>[   11.883504] 	worker_thread+0x37/0x380
<3>[   11.883505] 	kthread+0x119/0x130
<3>[   11.883508] 	ret_from_fork+0x24/0x50
<3>[   11.883510] INFO: Freed in hda_widget_sysfs_reinit+0x18c/0x210 [snd_hda_core] age=48 cpu=2 pid=348
<3>[   11.883513] 	snd_hdac_refresh_widgets+0x85/0xc0 [snd_hda_core]
<3>[   11.883516] 	snd_hda_codec_update_widgets+0xe/0x50 [snd_hda_codec]
<3>[   11.883518] 	intel_hsw_common_init+0x59/0xc0 [snd_hda_codec_hdmi]
<3>[   11.883520] 	hda_codec_driver_probe+0x88/0x150 [snd_hda_codec]
<3>[   11.883522] 	really_probe+0xf3/0x3e0
<3>[   11.883523] 	driver_probe_device+0x10a/0x120
<3>[   11.883524] 	device_driver_attach+0x4b/0x50
<3>[   11.883524] 	__driver_attach+0x97/0x130
<3>[   11.883527] 	bus_for_each_dev+0x74/0xc0
<3>[   11.883528] 	bus_add_driver+0x13f/0x210
<3>[   11.883529] 	driver_register+0x56/0xe0
<3>[   11.883530] 	do_one_initcall+0x58/0x2e0
<3>[   11.883532] 	do_init_module+0x56/0x1ea
<3>[   11.883533] 	load_module+0x2701/0x29e0
<3>[   11.883534] 	__se_sys_finit_module+0xd3/0xf0
<3>[   11.883534] 	do_syscall_64+0x55/0x190
<3>[   11.883536] INFO: Slab 0x00000000043b2366 objects=22 used=22 fp=0x          (null) flags=0x8000000000010200
<3>[   11.883536] INFO: Object 0x000000000c2cf882 @offset=1112 fp=0x00000000b006c361

<3>[   11.883538] Redzone 00000000f96d092c: bb bb bb bb bb bb bb bb                          ........
<3>[   11.883539] Object 000000000c2cf882: 6b 6b 6b 6b 6b 6b 6b 6b e8 d5 db 99 84 88 ff ff  kkkkkkkk........
<3>[   11.883539] Object 00000000d4d94bbe: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5  kkkkkkkkkkkkkkk.
<3>[   11.883540] Redzone 00000000be7a46be: bb bb bb bb bb bb bb bb                          ........
<3>[   11.883541] Padding 00000000a84a9f5a: 5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ

This looks like race between widget_tree_create() and hda_widget_sysfs_reinit():
looks like hda_widget_sysfs_reinit() freed codec->widgets structure while widget_tree_create() still operated on it.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.