Bug 110382 - [CI][SHARDS] igt@runner@aborted - fail - general protection fault: 0000 [#1] PREEMPT SMP NOPTI, Workqueue: events azx_probe_work
Summary: [CI][SHARDS] igt@runner@aborted - fail - general protection fault: 0000 [#1] ...
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel (show other bugs)
Version: XOrg git
Hardware: Other All
: high normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2019-04-10 14:18 UTC by Martin Peres
Modified: 2019-09-30 12:57 UTC (History)
1 user (show)

See Also:
i915 platform: ICL
i915 features: display/audio


Attachments
boot2.log (184.98 KB, text/x-log)
2019-05-07 09:46 UTC, Amadeusz Sławiński
no flags Details

Description Martin Peres 2019-04-10 14:18:42 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5896/shard-iclb2/igt@runner@aborted.html

Aborting.
Previous test: nothing
Next test: gem_exec_basic (basic-render)

Kernel badly tainted (0x2a0) (check dmesg for details):
	(0x20) TAINT_BAD_PAGE: Bad page reference or an unexpected page flags.
	(0x80) TAINT_DIE: Kernel has died - BUG/OOPS.
	(0x200) TAINT_WARN: WARN_ON has happened.

<4>[   11.837706] general protection fault: 0000 [#1] PREEMPT SMP NOPTI
<4>[   11.837755] CPU: 4 PID: 190 Comm: kworker/4:3 Not tainted 5.1.0-rc4-CI-CI_DRM_5896+ #1
<4>[   11.837788] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.3087.A00.1902250334 02/25/2019
<4>[   11.837843] Workqueue: events azx_probe_work [snd_hda_intel]
<4>[   11.837869] RIP: 0010:kobject_uevent_env+0x38/0x810
<4>[   11.837890] Code: 53 49 89 d5 48 89 c3 48 83 ec 38 4c 8b 24 c5 c0 88 ef 81 0f 1f 44 00 00 4c 89 f0 eb 0d 48 8b 40 18 48 85 c0 0f 84 27 01 00 00 <4c> 8b 78 20 4d 85 ff 74 ea 41 f6 46 3c 10 0f 85 f9 00 00 00 49 8b
<4>[   11.837957] RSP: 0018:ffffc9000144bd18 EFLAGS: 00010296
<4>[   11.837980] RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000002 RCX: 0000000000000000
<4>[   11.838008] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 6b6b6b6b6b6b6b6b
<4>[   11.838036] RBP: ffff88848250b3f8 R08: 0000000085e480ea R09: 0000000000000001
<4>[   11.838063] R10: 0000000000000000 R11: ffff888499ee48b8 R12: ffffffff820df667
<4>[   11.839071] R13: 0000000000000000 R14: 6b6b6b6b6b6b6b6b R15: ffff88849507df00
<4>[   11.840027] FS:  0000000000000000(0000) GS:ffff88849ff00000(0000) knlGS:0000000000000000
<4>[   11.840992] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[   11.841951] CR2: 0000556b60348c78 CR3: 0000000005214004 CR4: 0000000000760ee0
<4>[   11.842915] PKRU: 55555554
<4>[   11.843882] Call Trace:
<4>[   11.844846]  ? add_widget_node+0x68/0xa0 [snd_hda_core]
<7>[   11.845330] [drm:i915_audio_component_get_eld [i915]] Not valid for port B
<4>[   11.845813]  widget_tree_create+0xf6/0x110 [snd_hda_core]
<7>[   11.846800] [drm:i915_audio_component_get_eld [i915]] Not valid for port B
<4>[   11.847717]  hda_widget_sysfs_init+0x1a/0x40 [snd_hda_core]
<4>[   11.847720]  snd_hdac_device_register+0x19/0x40 [snd_hda_core]
<7>[   11.848782] [drm:i915_audio_component_get_eld [i915]] Not valid for port B
<4>[   11.849720]  snd_hda_codec_configure+0x39/0x160 [snd_hda_codec]
<4>[   11.849724]  azx_codec_configure+0x2a/0x60 [snd_hda_codec]
<7>[   11.850981] [drm:i915_audio_component_get_eld [i915]] Not valid for port C
<4>[   11.851936]  azx_probe_work+0x42e/0x7e0 [snd_hda_intel]
<7>[   11.852973] [drm:i915_audio_component_get_eld [i915]] Not valid for port C
<4>[   11.853928]  process_one_work+0x245/0x610
<7>[   11.855224] [drm:i915_audio_component_get_eld [i915]] Not valid for port C
<4>[   11.856202]  worker_thread+0x37/0x380
<4>[   11.856204]  ? process_one_work+0x610/0x610
<7>[   11.857129] [drm:i915_audio_component_get_eld [i915]] Not valid for port D
<4>[   11.858091]  kthread+0x119/0x130
<4>[   11.858092]  ? kthread_park+0x80/0x80
<4>[   11.858095]  ret_from_fork+0x24/0x50
<7>[   11.858999] [drm:i915_audio_component_get_eld [i915]] Not valid for port D
<4>[   11.859975] Modules linked in: snd_hda_codec_hdmi(+) snd_hda_codec_realtek(+) snd_hda_codec_generic x86_pkg_temp_thermal coretemp mei_hdcp crct10dif_pclmul crc32_pclmul snd_hda_intel snd_hda_codec snd_hwdep ghash_clmulni_intel snd_hda_core cdc_ether usbnet e1000e snd_pcm mii ptp pps_core mei_me mei i915 prime_numbers
<7>[   11.860861] [drm:i915_audio_component_get_eld [i915]] Not valid for port D
<4>[   11.862139] ---[ end trace bf38e6abac39589f ]---
Comment 1 CI Bug Log 2019-04-10 14:19:09 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* ICL: igt@runner@aborted - fail - Previous test: nothing, Kernel badly tainted (0x2a0)
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5896/shard-iclb2/igt@runner@aborted.html
Comment 2 Jani Saarinen 2019-04-22 11:53:00 UTC
On shards there was BIOS update to BIOS ICLSFWR1.R00.3121.A00.1903190527 03/19/2019.
Comment 3 James Ausmus 2019-04-25 20:33:29 UTC
If we get rid of the noise from the dmesg, we get:

<4>[   11.837706] general protection fault: 0000 [#1] PREEMPT SMP NOPTI
<4>[   11.837755] CPU: 4 PID: 190 Comm: kworker/4:3 Not tainted 5.1.0-rc4-CI-CI_DRM_5896+ #1
<4>[   11.837788] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.3087.A00.1902250334 02/25/2019
<4>[   11.837843] Workqueue: events azx_probe_work [snd_hda_intel]
<4>[   11.837869] RIP: 0010:kobject_uevent_env+0x38/0x810
<4>[   11.837890] Code: 53 49 89 d5 48 89 c3 48 83 ec 38 4c 8b 24 c5 c0 88 ef 81 0f 1f 44 00 00 4c 89 f0 eb 0d 48 8b 40 18 48 85 c0 0f 84 27 01 00 00 <4c> 8b 78 20 4d 85 ff 74 ea 41 f6 46 3c 10 0f 85 f9 00 00 00 49 8b
<4>[   11.837957] RSP: 0018:ffffc9000144bd18 EFLAGS: 00010296
<4>[   11.837980] RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000002 RCX: 0000000000000000
<4>[   11.838008] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 6b6b6b6b6b6b6b6b
<4>[   11.838036] RBP: ffff88848250b3f8 R08: 0000000085e480ea R09: 0000000000000001
<4>[   11.838063] R10: 0000000000000000 R11: ffff888499ee48b8 R12: ffffffff820df667
<4>[   11.839071] R13: 0000000000000000 R14: 6b6b6b6b6b6b6b6b R15: ffff88849507df00
<4>[   11.840027] FS:  0000000000000000(0000) GS:ffff88849ff00000(0000) knlGS:0000000000000000
<4>[   11.840992] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[   11.841951] CR2: 0000556b60348c78 CR3: 0000000005214004 CR4: 0000000000760ee0
<4>[   11.842915] PKRU: 55555554
<4>[   11.843882] Call Trace:
<4>[   11.844846]  ? add_widget_node+0x68/0xa0 [snd_hda_core]
<4>[   11.845813]  widget_tree_create+0xf6/0x110 [snd_hda_core]
<4>[   11.847717]  hda_widget_sysfs_init+0x1a/0x40 [snd_hda_core]
<4>[   11.847720]  snd_hdac_device_register+0x19/0x40 [snd_hda_core]
<4>[   11.849720]  snd_hda_codec_configure+0x39/0x160 [snd_hda_codec]
<4>[   11.849724]  azx_codec_configure+0x2a/0x60 [snd_hda_codec]
<4>[   11.851936]  azx_probe_work+0x42e/0x7e0 [snd_hda_intel]
<4>[   11.853928]  process_one_work+0x245/0x610
<4>[   11.856202]  worker_thread+0x37/0x380
<4>[   11.856204]  ? process_one_work+0x610/0x610
<4>[   11.858091]  kthread+0x119/0x130
<4>[   11.858092]  ? kthread_park+0x80/0x80
<4>[   11.858095]  ret_from_fork+0x24/0x50
<4>[   11.859975] Modules linked in: snd_hda_codec_hdmi(+) snd_hda_codec_realtek(+) snd_hda_codec_generic x86_pkg_temp_thermal coretemp mei_hdcp crct10dif_pclmul crc32_pclmul snd_hda_intel snd_hda_codec snd_hwdep ghash_clmulni_intel snd_hda_core cdc_ether usbnet e1000e snd_pcm mii ptp pps_core mei_me mei i915 prime_numbers
<4>[   11.862139] ---[ end trace bf38e6abac39589f ]---

moved lines:

<7>[   11.845330] [drm:i915_audio_component_get_eld [i915]] Not valid for port B
<7>[   11.846800] [drm:i915_audio_component_get_eld [i915]] Not valid for port B
<7>[   11.848782] [drm:i915_audio_component_get_eld [i915]] Not valid for port B
<7>[   11.850981] [drm:i915_audio_component_get_eld [i915]] Not valid for port C
<7>[   11.852973] [drm:i915_audio_component_get_eld [i915]] Not valid for port C
<7>[   11.855224] [drm:i915_audio_component_get_eld [i915]] Not valid for port C
<7>[   11.857129] [drm:i915_audio_component_get_eld [i915]] Not valid for port D
<7>[   11.860861] [drm:i915_audio_component_get_eld [i915]] Not valid for port D
<7>[   11.858999] [drm:i915_audio_component_get_eld [i915]] Not valid for port D


Additional info:

<7>[   11.112111] [drm:drm_add_edid_modes] ELD: no CEA Extension found
<7>[   11.529366] [drm:drm_add_edid_modes] ELD: no CEA Extension found
<7>[   11.583388] [drm:drm_add_edid_modes] ELD: no CEA Extension found
<7>[   11.601007] [drm:drm_add_edid_modes] ELD: no CEA Extension found

-------------------------------------------------------------------


So, what appears to be happening is that there are no devices connected that advertise an ELD that is audio capable, so none of the DDI ports ever gets has_audio set, which means that:

 /* referred in audio callbacks */
        dev_priv->av_enc_map[pipe] = encoder;

from intel_audio_codec_enable (i915/intel_audio.c) never gets called, which means that the call to get_saved_enc always returns NULL, causing the


<7>[   11.845330] [drm:i915_audio_component_get_eld [i915]] Not valid for port B


lines to show up, which is i915_audio_component_get_eld returning -EINVAL, which appears to be making the audio driver blow up, as it's trying to grab ELDs for ports that don't have an audio-capable device connected.

i915 doesn't appear to be doing anything wrong here, I believe this is a bug for the audio driver
Comment 4 Chris Wilson 2019-04-25 20:37:40 UTC
(In reply to James Ausmus from comment #3)
> If we get rid of the noise from the dmesg, we get:
> 
> <4>[   11.837706] general protection fault: 0000 [#1] PREEMPT SMP NOPTI
> <4>[   11.837755] CPU: 4 PID: 190 Comm: kworker/4:3 Not tainted
> 5.1.0-rc4-CI-CI_DRM_5896+ #1
> <4>[   11.837788] Hardware name: Intel Corporation Ice Lake Client
> Platform/IceLake U DDR4 SODIMM PD RVP, BIOS ICLSFWR1.R00.3087.A00.1902250334
> 02/25/2019
> <4>[   11.837843] Workqueue: events azx_probe_work [snd_hda_intel]
> <4>[   11.837869] RIP: 0010:kobject_uevent_env+0x38/0x810
> <4>[   11.837890] Code: 53 49 89 d5 48 89 c3 48 83 ec 38 4c 8b 24 c5 c0 88
> ef 81 0f 1f 44 00 00 4c 89 f0 eb 0d 48 8b 40 18 48 85 c0 0f 84 27 01 00 00
> <4c> 8b 78 20 4d 85 ff 74 ea 41 f6 46 3c 10 0f 85 f9 00 00 00 49 8b
> <4>[   11.837957] RSP: 0018:ffffc9000144bd18 EFLAGS: 00010296
> <4>[   11.837980] RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000002 RCX:

0x6b6b6b6b6b6b6b6b == POISON_FREE == use-after-free in snd_hda.
Comment 5 CI Bug Log 2019-04-27 18:21:04 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* ICL - igt@runner@aborted - fail - Previous test: nothing\n.*\n.*\nKernel badly tainted \(0x280\)
  - https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_264/fi-icl-y/igt@runner@aborted.html
Comment 6 CI Bug Log 2019-04-27 18:21:36 UTC
A CI Bug Log filter associated to this bug has been updated:

{- ICL - igt@runner@aborted - fail - Previous test: nothing\n.*\n.*\nKernel badly tainted \(0x280\) -}
{+ ICL - igt@runner@aborted - fail - Previous test: nothing, Kernel badly tainted (0x280) +}


  No new failures caught with the new filter
Comment 7 Amadeusz Sławiński 2019-05-07 09:46:52 UTC
Created attachment 144186 [details]
boot2.log

From boot2.log, seems like problem with sysfs

<3>[   11.881334] BUG: sleeping function called from invalid context at kernel/sched/completion.c:99
<3>[   11.882350] =============================================================================
<3>[   11.882566] in_atomic(): 0, irqs_disabled(): 0, pid: 190, name: kworker/4:3
<3>[   11.883476] BUG kmalloc-32 (Tainted: G      D          ): Poison overwritten
<3>[   11.883477] -----------------------------------------------------------------------------

<3>[   11.883478] INFO: 0x00000000c581b909-0x00000000cf17bda8. First byte 0xe8 instead of 0x6b
<3>[   11.883484] INFO: Allocated in widget_tree_create+0x21/0x110 [snd_hda_core] age=55 cpu=4 pid=190
<3>[   11.883486] 	kmem_cache_alloc_trace+0x225/0x290
<3>[   11.883489] 	widget_tree_create+0x21/0x110 [snd_hda_core]
<3>[   11.883491] 	hda_widget_sysfs_init+0x1a/0x40 [snd_hda_core]
<3>[   11.883493] 	snd_hdac_device_register+0x19/0x40 [snd_hda_core]
<3>[   11.883496] 	snd_hda_codec_configure+0x39/0x160 [snd_hda_codec]
<3>[   11.883499] 	azx_codec_configure+0x2a/0x60 [snd_hda_codec]
<3>[   11.883501] 	azx_probe_work+0x42e/0x7e0 [snd_hda_intel]
<3>[   11.883503] 	process_one_work+0x245/0x610
<3>[   11.883504] 	worker_thread+0x37/0x380
<3>[   11.883505] 	kthread+0x119/0x130
<3>[   11.883508] 	ret_from_fork+0x24/0x50
<3>[   11.883510] INFO: Freed in hda_widget_sysfs_reinit+0x18c/0x210 [snd_hda_core] age=48 cpu=2 pid=348
<3>[   11.883513] 	snd_hdac_refresh_widgets+0x85/0xc0 [snd_hda_core]
<3>[   11.883516] 	snd_hda_codec_update_widgets+0xe/0x50 [snd_hda_codec]
<3>[   11.883518] 	intel_hsw_common_init+0x59/0xc0 [snd_hda_codec_hdmi]
<3>[   11.883520] 	hda_codec_driver_probe+0x88/0x150 [snd_hda_codec]
<3>[   11.883522] 	really_probe+0xf3/0x3e0
<3>[   11.883523] 	driver_probe_device+0x10a/0x120
<3>[   11.883524] 	device_driver_attach+0x4b/0x50
<3>[   11.883524] 	__driver_attach+0x97/0x130
<3>[   11.883527] 	bus_for_each_dev+0x74/0xc0
<3>[   11.883528] 	bus_add_driver+0x13f/0x210
<3>[   11.883529] 	driver_register+0x56/0xe0
<3>[   11.883530] 	do_one_initcall+0x58/0x2e0
<3>[   11.883532] 	do_init_module+0x56/0x1ea
<3>[   11.883533] 	load_module+0x2701/0x29e0
<3>[   11.883534] 	__se_sys_finit_module+0xd3/0xf0
<3>[   11.883534] 	do_syscall_64+0x55/0x190
<3>[   11.883536] INFO: Slab 0x00000000043b2366 objects=22 used=22 fp=0x          (null) flags=0x8000000000010200
<3>[   11.883536] INFO: Object 0x000000000c2cf882 @offset=1112 fp=0x00000000b006c361

<3>[   11.883538] Redzone 00000000f96d092c: bb bb bb bb bb bb bb bb                          ........
<3>[   11.883539] Object 000000000c2cf882: 6b 6b 6b 6b 6b 6b 6b 6b e8 d5 db 99 84 88 ff ff  kkkkkkkk........
<3>[   11.883539] Object 00000000d4d94bbe: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b a5  kkkkkkkkkkkkkkk.
<3>[   11.883540] Redzone 00000000be7a46be: bb bb bb bb bb bb bb bb                          ........
<3>[   11.883541] Padding 00000000a84a9f5a: 5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ

This looks like race between widget_tree_create() and hda_widget_sysfs_reinit():
looks like hda_widget_sysfs_reinit() freed codec->widgets structure while widget_tree_create() still operated on it.
Comment 9 Lakshmi 2019-09-30 12:56:57 UTC
This issue happened twice on drmtuip runs 262 and 264 but not seen later. Current drmtip run is 380 Closing and archiving this issue.
Comment 10 CI Bug Log 2019-09-30 12:57:04 UTC
The CI Bug Log issue associated to this bug has been archived.

New failures matching the above filters will not be associated to this bug anymore.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.