Bug 94079

Summary: [BAT regression] snd-hda blows up in module reload since CI build 1040
Product: DRI
Reporter: Daniel Vetter <daniel>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED FIXED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: blocker
Priority: highest
CC: intel-gfx-bugs
Version: XOrg git
Hardware: Other
OS: All
Whiteboard:
i915 platform: HSW
i915 features: display/audio
Attachments:
  Description: Patch for latest takashi's branch
  Flags: none

Description Daniel Vetter 2016-02-10 16:27:26 UTC
Looking at the testcase history /archive/results/CI_IGT_test/igt@drv_module_reload_basic.html, most machines seem affected, though some only sporadically.

[  398.616887] general protection fault: 0000 [#1] PREEMPT SMP
[  398.616893] Modules linked in: snd_hda_codec_hdmi i915 snd_hda_intel(-) snd_hda_codec x86_pkg_temp_thermal intel_powerclamp coretemp snd_hwdep snd_hda_core crct10dif_pclmul lpc_ich snd_pcm mei_me mei crc32_pclmul ghash_clmulni_intel r8169 mii
[  398.616907] CPU: 0 PID: 6795 Comm: rmmod Tainted: G     U          4.5.0-rc3-gfxbench+ #1
[  398.616911] Hardware name: Gigabyte Technology Co., Ltd. H87M-D3H/H87M-D3H, BIOS F11 08/18/2015
[  398.616914] task: ffff8800d8a10000 ti: ffff8800c0ed4000 task.ti: ffff8800c0ed4000
[  398.616916] RIP: 0010:[<ffffffff816851a9>]  [<ffffffff816851a9>] snd_jack_report+0x29/0x110
[  398.616933] RSP: 0018:ffff8800c0ed79e0  EFLAGS: 00010246
[  398.616935] RAX: 6b6b6b6b6b6b6b6b RBX: 6b6b6b6b6b6b6b63 RCX: 0000000000000000
[  398.616937] RDX: 0000000000000000 RSI: 0000000000000014 RDI: ffff8800d8980410
[  398.616939] RBP: ffff8800c0ed7a00 R08: 0000000000000001 R09: 0000000000000000
[  398.616942] R10: 0000000000ffff0a R11: 0000000000000005 R12: ffff8800d8980410
[  398.616944] R13: 0000000000000014 R14: ffff88020fb82290 R15: ffff88020f8d2478
[  398.616947] FS:  00007f095f01f700(0000) GS:ffff88021e200000(0000) knlGS:0000000000000000
[  398.616950] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  398.616953] CR2: 0000559ee9a31338 CR3: 00000000d6659000 CR4: 00000000001406f0
[  398.616956] Stack:
[  398.616958]  ffff88020f8d2148 ffff88020d7c5e18 ffff88020d7c5cd0 ffff88020fb82290
[  398.616963]  ffff8800c0ed7a60 ffffffffa0024706 ffff8800c0ed7a20 ffffffff810df8b1
[  398.616968]  ffff880000000020 ffff8800d8980410 ffff88020d7c5eb8 0000000000000001
[  398.616973] Call Trace:
[  398.616978]  [<ffffffffa0024706>] hdmi_present_sense+0x136/0x390 [snd_hda_codec_hdmi]
[  398.616994]  [<ffffffff810df8b1>] ? rcu_read_lock_sched_held+0x81/0x90
[  398.616999]  [<ffffffffa019f1e0>] ? hda_call_codec_resume+0x110/0x110 [snd_hda_codec]
[  398.617004]  [<ffffffffa00249a8>] generic_hdmi_resume+0x48/0x60 [snd_hda_codec_hdmi]
[  398.617010]  [<ffffffffa019f185>] hda_call_codec_resume+0xb5/0x110 [snd_hda_codec]
[  398.617015]  [<ffffffffa019f210>] hda_codec_runtime_resume+0x30/0x50 [snd_hda_codec]
[  398.617021]  [<ffffffff8153daad>] __rpm_callback+0x2d/0x70
[  398.617024]  [<ffffffffa019f1e0>] ? hda_call_codec_resume+0x110/0x110 [snd_hda_codec]
[  398.617029]  [<ffffffff8153db0f>] rpm_callback+0x1f/0x80
[  398.617033]  [<ffffffffa019f1e0>] ? hda_call_codec_resume+0x110/0x110 [snd_hda_codec]
[  398.617047]  [<ffffffff8153f18f>] rpm_resume+0x4cf/0x7c0
[  398.617051]  [<ffffffff8153f4ca>] __pm_runtime_resume+0x4a/0x80
[  398.617055]  [<ffffffff81534557>] __device_release_driver+0x37/0x140
[  398.617059]  [<ffffffff81534680>] device_release_driver+0x20/0x30
[  398.617062]  [<ffffffff81533323>] bus_remove_device+0x113/0x190
[  398.617066]  [<ffffffff8152fb14>] device_del+0x134/0x250
[  398.617070]  [<ffffffffa00d81bc>] snd_hdac_device_unregister+0x1c/0x20 [snd_hda_core]
[  398.617076]  [<ffffffffa019dc78>] snd_hda_codec_dev_free+0x18/0x30 [snd_hda_codec]
[  398.617080]  [<ffffffff81682d14>] __snd_device_free+0x24/0x70
[  398.617084]  [<ffffffff81682f2b>] snd_device_free_all+0x2b/0x40
[  398.617087]  [<ffffffff8167da79>] release_card_device+0x19/0x70
[  398.617091]  [<ffffffff8152f18d>] device_release+0x2d/0x90
[  398.617095]  [<ffffffff813f8c1a>] kobject_release+0x7a/0x1a0
[  398.617098]  [<ffffffff813f8d67>] kobject_put+0x27/0x50
[  398.617101]  [<ffffffff8152f5a2>] put_device+0x12/0x20
[  398.617104]  [<ffffffff8167e074>] snd_card_free_when_closed+0x24/0x30
[  398.617107]  [<ffffffff8167e1a0>] snd_card_free+0x40/0x60
[  398.617111]  [<ffffffffa01c3b4c>] azx_remove+0x2c/0x30 [snd_hda_intel]
[  398.617115]  [<ffffffff8143daf4>] pci_device_remove+0x34/0xb0
[  398.617118]  [<ffffffff815345b5>] __device_release_driver+0x95/0x140
[  398.617122]  [<ffffffff81534756>] driver_detach+0xb6/0xc0
[  398.617125]  [<ffffffff81533683>] bus_remove_driver+0x53/0xd0
[  398.617128]  [<ffffffff81535177>] driver_unregister+0x27/0x50
[  398.617132]  [<ffffffff8143cb55>] pci_unregister_driver+0x25/0x70
[  398.617136]  [<ffffffffa01c5b8b>] azx_driver_exit+0x10/0x12 [snd_hda_intel]
[  398.617140]  [<ffffffff8110414f>] SyS_delete_module+0x18f/0x1f0
[  398.617144]  [<ffffffff817b97db>] entry_SYSCALL_64_fastpath+0x16/0x73
[  398.617147] Code: 00 00 48 85 ff 0f 84 fd 00 00 00 55 48 89 e5 41 56 41 55 41 54 53 41 89 f5 48 8b 47 08 49 89 fc 48 8d 58 f8 48 39 df 74 23 31 d2 <44> 85 6b 18 48 8b 33 49 8b 7c 24 18 0f 95 c2 e8 b3 fa ff ff 48
[  398.617176] RIP  [<ffffffff816851a9>] snd_jack_report+0x29/0x110
[  398.617180]  RSP <ffff8800c0ed79e0>
[  398.617185] ---[ end trace 96cae4ed51e405a5 ]---
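
A note on reading the register dump above: RAX holds the byte 0x6b repeated eight times, which matches the kernel's slab POISON_FREE pattern, and RBX is that value minus 8. That suggests snd_jack_report was walking a list inside memory that had already been freed, although the trace alone does not prove it. The following Python is a minimal sketch of how such registers could be picked out of an oops automatically; the function name `poisoned_registers` and the `slack` parameter are hypothetical helpers for illustration, not part of any kernel tooling.

```python
import re

# POISON_FREE is the byte the slab allocator writes over freed objects
# when slab poisoning is enabled (see include/linux/poison.h).
POISON_FREE = 0x6B

# Matches general-purpose register fields such as "RAX: 6b6b6b6b6b6b6b6b".
# Segment-prefixed fields like "RSP: 0018:..." and control registers like
# "CR2: ..." are deliberately not matched.
REG_RE = re.compile(r"\b(R[A-Z0-9]{1,2}):\s*([0-9a-f]{16})\b")


def poisoned_registers(oops_text, slack=0xFF):
    """Return {register: offset} for registers within `slack` bytes of the
    repeated POISON_FREE pattern. `slack` is an assumed tolerance that
    catches small pointer adjustments such as container_of() arithmetic."""
    pattern = int.from_bytes(bytes([POISON_FREE]) * 8, "big")
    hits = {}
    for name, value in REG_RE.findall(oops_text):
        offset = int(value, 16) - pattern
        if abs(offset) <= slack:
            hits[name] = offset
    return hits


# Two register lines copied from the trace above.
sample = """\
[  398.616935] RAX: 6b6b6b6b6b6b6b6b RBX: 6b6b6b6b6b6b6b63 RCX: 0000000000000000
[  398.616937] RDX: 0000000000000000 RSI: 0000000000000014 RDI: ffff8800d8980410
"""

print(poisoned_registers(sample))
```

On the two sample lines this reports RAX at offset 0 and RBX at offset -8 from the poison pattern; the other registers are ignored.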
Comment 1 Daniel Vetter 2016-02-10 16:28:03 UTC
Around the same time, a bunch of machines also started dying, but that might be an unrelated regression.
Comment 2 Gabriel Feceoru 2016-02-12 16:46:45 UTC
The bisect done on HSW led to:

commit 25e4abb33df3aafa7d1efba8f82f9178268efab1
Author: Libin Yang <libin.yang@linux.intel.com>
Date:   Tue Jan 12 11:13:27 2016 +0800

    ALSA: hda - hdmi jack created based on pcm
    
    Jack is created based on pcm.
    
    Apply the acomp jack rule to dyn_pcm_assign.
    For dyn_pcm_assign:
     Driver does not use hda_jack. It operates snd_jack directly.
     snd_jack pointer will be stored in spec->pcm.jack instead of
     the current spec->acomp_jack. When pcm is assigned to pin,
     jack will be assigned to pin automatically.
    For !dyn_pcm_assign:
     Driver continues using hda_jack for less impact on the old cases.
     Pcm is statically assigned to pin. So is jack. spec->pcm.jack
     saves the snd_jack pointer created in hda_jack.
    
    Signed-off-by: Libin Yang <libin.yang@linux.intel.com>
    Signed-off-by: Takashi Iwai <tiwai@suse.de>
Comment 3 Libin Yang 2016-02-17 08:44:43 UTC
I have made a patch that fixes the issue. I'm refining it now, and QA is helping to test it.
Comment 4 Libin Yang 2016-02-18 07:04:21 UTC
Could you please send us your test case so that we can run the test on our side?
Comment 5 Libin Yang 2016-02-19 08:27:31 UTC
Patch is merged into Takashi's branch.

tests/drv_module_reload_basic passed with this patch.

Does gfx QA need to run a test before Takashi's branch is merged, or should we merge Takashi's branch first and have gfx QA test afterwards?
Comment 6 Libin Yang 2016-02-19 08:29:16 UTC
Created attachment 121838
Patch for latest takashi's branch
Comment 7 Focus.Luo 2016-02-23 06:02:20 UTC
The patch has been merged into Takashi's branch (ALSA tree), but it has not yet been merged into the drm-intel-nightly tree.
Libin has also pinged Daniel Vetter to merge this patch from the ALSA tree.
Now we just need to wait for Daniel to merge the patch into the drm-intel-nightly tree.
Comment 8 Libin Yang 2016-02-23 06:12:40 UTC
Could you please help merge Takashi's branch and see whether it is OK now?
Comment 9 cprigent 2016-03-09 11:06:48 UTC
The test passes. I don't see any RIP or audio problems in kern.log.

Hardware: 
Motherboard: SawTooth Peak 
cpu model name : Intel(R) Core(TM) i7-4550U CPU @ 1.50GHz 
cpu model : 69 
cpu family : 6 
Graphic card: Haswell-ULT Integrated Graphics Controller (rev 09) 
Software:
Ubuntu 14.04.4 LTS 
Bios: HSWLPTU1.86C.0135.R01.1311020052
Libdrm: 2.4.64 
Kernel 4.5.0-rc6 drm-intel-nightly from git://anongit.freedesktop.org/drm-intel
  commit f9cadb616ff17d482312fba07db772b6604ce799
  Author: Imre Deak <imre.deak@intel.com>
  Date:   Tue Mar 1 19:17:18 2016 +0200
  drm-intel-nightly: 2016y-03m-01d-17h-16m-32s UTC integration manifest
Comment 10 cprigent 2016-03-09 11:06:59 UTC
So closing this bug.
