Summary: | [BAT IGT ALL] drv_module_reload_basic fail on all platforms | ||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Product: | DRI | Reporter: | Daniela Prodan <daniela.doras-prodan> | ||||||||||||||
Component: | DRM/Intel | Assignee: | Libin Yang <libin.yang> | ||||||||||||||
Status: | CLOSED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> | ||||||||||||||
Severity: | critical | ||||||||||||||||
Priority: | medium | CC: | intel-gfx-bugs, marius.c.vlad, tiwai | ||||||||||||||
Version: | DRI git | ||||||||||||||||
Hardware: | All | ||||||||||||||||
OS: | All | ||||||||||||||||
Whiteboard: | |||||||||||||||||
i915 platform: | ALL | i915 features: | |||||||||||||||
Attachments: |
|
Created attachment 122608 [details]
drm-intel-nightly commit
Thanks, it's just a spurious WARN_ON() for a leftover pointer at module exit, and nothing serious. The patch below fixes it. Created attachment 122609 [details] [review] Fix patch Same warning: [ 40.798598] ------------[ cut here ]------------ [ 40.798606] WARNING: CPU: 1 PID: 450 at sound/hda/hdac_i915.c:309 snd_hdac_i915_init+0x130/0x169 [snd_hda_core] [ 40.798608] Modules linked in: snd_hda_intel(+) i915 drm_kms_helper drm snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_codec intel_rapl snd_hda_core x86_pkg_temp_thermal dell_smm_hwmon intel_powerclamp tpm_tis tpm joydev coretemp snd_hwdep ghash_clmulni_intel serio_raw snd_pcm snd_timer lpc_ich mei_me mei processor battery evdev netconsole hid_generic usbhid hid xhci_pci xhci_hcd psmouse ehci_pci ehci_hcd r8169 i2c_algo_bit mii usbcore usb_common fan thermal video button [last unloaded: drm] [ 40.798624] CPU: 1 PID: 450 Comm: kworker/1:3 Tainted: G U W 4.6.0-rc1+ #89 [ 40.798625] Hardware name: Dell Inc. Inspiron 3847/088DT1 , BIOS A06 01/15/2015 [ 40.798628] Workqueue: events azx_probe_work [snd_hda_intel] [ 40.798629] 0000000000000286 00000000ffe976ea ffffffff814b8ca5 0000000000000000 [ 40.798630] 0000000000000000 ffffffff810f8ebe ffff88002f29f708 ffff8802058ae600 [ 40.798632] ffff88021fa96240 ffff880215968098 ffff880215968000 ffff88002f29f000 [ 40.798633] Call Trace: [ 40.798637] [<ffffffff814b8ca5>] ? dump_stack+0x5c/0x77 [ 40.798640] [<ffffffff810f8ebe>] ? __warn+0xbe/0xe0 [ 40.798644] [<ffffffffa03d49e0>] ? snd_hdac_i915_init+0x130/0x169 [snd_hda_core] [ 40.798646] [<ffffffff8112d127>] ? dequeue_task_fair+0x57/0x940 [ 40.798649] [<ffffffffa0116642>] ? azx_probe_continue+0x2e2/0x850 [snd_hda_intel] [ 40.798652] [<ffffffff8111170a>] ? process_one_work+0x15a/0x410 [ 40.798653] [<ffffffff81111a0d>] ? worker_thread+0x4d/0x480 [ 40.798654] [<ffffffff811119c0>] ? process_one_work+0x410/0x410 [ 40.798656] [<ffffffff8111783d>] ? kthread+0xcd/0xf0 [ 40.798659] [<ffffffff81791ca2>] ? ret_from_fork+0x22/0x40 [ 40.798661] [<ffffffff81117770>] ? kthread_create_on_node+0x190/0x190 [ 40.798662] ---[ end trace 368a576a7cb20cec ]--- Tried with: commit f6216961932e1c64251ddb6a6ba17e4f92596397 Author: Takashi Iwai <tiwai@suse.de> Date: Tue Mar 29 12:29:24 2016 +0200 ALSA: hda - Clear the leftover component assignment at snd_hdac_i915_exit() The commit [d745f5e7b8b2: ALSA: hda - Add the pin / port mapping on Intel ILK and VLV] introduced a WARN_ON() to check the pointer for avoiding the double initializations. But hdac_acomp pointer wasn't cleared at snd_hdac_i915_exit(), thus after reloading the HD-audio driver, it may result in the false positive warning. This patch makes sure to clear the leftover pointer at exit. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94736 Reported-by: Daniela Doras-prodan <daniela.doras-prodan@intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> commit 2702045fbd4188c1c26bd890bc43976fd10937ad Author: Jani Nikula <jani.nikula@intel.com> Date: Tue Mar 29 15:21:58 2016 +0300 drm-intel-nightly: 2016y-03m-29d-12h-21m-19s UTC integration manifest (In reply to Marius Vlad from comment #4) > Same warning: Which hardware? The behavior pretty much depends on the model. HSW, i5-4460, I'll try on other platforms as well. (In reply to Marius Vlad from comment #6) > HSW, i5-4460, I'll try on other platforms as well. The initial bug report is on all platforms from ILK to SKL. (ILK, SNB, IVB, BYT, HSW, BDW, SLK). Was curios and indeed d745f5e7b8b2961f68b0b9093a0f914a8a83c2ae seems to trigger those warns. Yes, the commit added WARN_ON() at probe, but forgot to clear it at exit. And today's sound.git tree got the support of i915 binding for ILK and later models. That's why you hit on all ILK+ machines (but some machines like ILK or IVY should show a different stack trace). But I wonder why you still get the warning with the latest fix. It works on my several machines. Could you give the simplest reproducer to trigger WARN_ON() even with the fix patch? From i-g-t: tests/drv_module_reload, or the following script: # we must kick away fbcon (but only fbcon) for vtcon in /sys/class/vtconsole/vtcon*/ ; do if grep "frame buffer device" $vtcon/name > /dev/null ; then echo unbinding $vtcon: `cat $vtcon/name` echo 0 > $vtcon/bind fi done # The sound driver uses our power well pkill alsactl rmmod snd_hda_intel &> /dev/null #ignore errors in ips - gen5 only rmmod intel_ips &> /dev/null rmmod i915 || return $IGT_EXIT_SKIP #ignore errors in intel-gtt, often built-in rmmod intel-gtt &> /dev/null # drm may be used by other devices (nouveau, radeon, udl, etc) rmmod drm_kms_helper &> /dev/null rmmod drm &> /dev/null if lsmod | grep i915 &> /dev/null ; then echo WARNING: i915.ko still loaded! return $IGT_EXIT_FAILURE else echo module successfully unloaded fi modprobe i915 $* if [ -f /sys/class/vtconsole/vtcon1/bind ]; then echo 1 > /sys/class/vtconsole/vtcon1/bind fi modprobe snd_hda_intel The test works fine on my HSW machine. Could you double-check whether it still happens with the patch in comment 3? I can reproduce here with http://paste.debian.net/422479/ It is a variant of tests/drv_module_reload_basic but w/o i-g-t dependencies. Might have more success with that. (In reply to Marius Vlad from comment #12) > I can reproduce here with http://paste.debian.net/422479/ > > It is a variant of tests/drv_module_reload_basic but w/o i-g-t dependencies. Hmm, the script still works here on my HSW machine. Is your kernel really with the patch applied? If so, please apply the debug patch below in addition, and give the whole kernel message after the reload test. Created attachment 122612 [details] [review] Debug patch Created attachment 122613 [details]
Debug messages for patch
Here's the output you requested. Unbinding doesn't happen at subsequent runs (more than once seems here to be the problem).
Thanks, now it's clear that it's the error path that didn't clear the pointer. The fix is below. At the next time, please give the whole kernel message from the beginning. Sometimes it leads to a misjudge without the proper information like this case. Created attachment 122618 [details] [review] Fix patch #2 BTW, the secondary error path was triggered basically by a kmod bug. That's why I couldn't reproduce it here. It's already fixed in the kmod upstream, as it seems. My bad won't happen again. Tested with both patches (_init and _exit) and I see no warnings now: commit 50bd9fb15ccf6f013131f334cd604d783c0c3186 Author: Takashi Iwai <tiwai@suse.de> Date: Tue Mar 29 18:48:07 2016 +0200 ALSA: hda - Fix yet another i915 pointer leftover in error path The hdac_acomp object in hdac_i915.c is left as assigned even after binding with i915 actually fails, and this leads to the WARN_ON() at the next load of the module. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94736 Signed-off-by: Takashi Iwai <tiwai@suse.de> commit f07183a706c41c8aabef9cd711e45a4ca367aeea Author: Takashi Iwai <tiwai@suse.de> Date: Tue Mar 29 12:29:24 2016 +0200 ALSA: hda - Clear the leftover component assignment at snd_hdac_i915_exit() The commit [d745f5e7b8b2: ALSA: hda - Add the pin / port mapping on Intel ILK and VLV] introduced a WARN_ON() to check the pointer for avoiding the double initializations. But hdac_acomp pointer wasn't cleared at snd_hdac_i915_exit(), thus after reloading the HD-audio driver, it may result in the false positive warning. This patch makes sure to clear the leftover pointer at exit. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94736 Reported-by: Daniela Doras-prodan <daniela.doras-prodan@intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> This is fixed in nightly (03c0f854e93263563f559d2bc8e47fb51adae697). Closing. |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.
Created attachment 122607 [details] dmesg log Due to sound patches, the drv_module_reload_basic is failing on all platforms: drm-intel-nightly: 2016y-03m-29d-08h-51m-45s UTC integration manifest [ 112.715554] ------------[ cut here ]------------ [ 112.715561] WARNING: CPU: 3 PID: 5306 at sound/hda/hdac_i915.c:309 snd_hdac_i915_init+0x12a/0x170 [snd_hda_core] [ 112.715562] Modules linked in: snd_hda_intel(+) i915 snd_hda_codec_hdmi x86_pkg_temp_thermal snd_hda_codec_realtek snd_hda_codec_generic intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm mei_me mei i2c_hid e1000e ptp pps_core sdhci_pci sdhci mmc_core [last unloaded: i915] [ 112.715577] CPU: 3 PID: 5306 Comm: kworker/3:18 Tainted: G U 4.6.0-rc1-gfxbench+ #1 [ 112.715578] Hardware name: /NUC6i5SYB, BIOS SYSKLi35.86A.0036.2016.0223.1140 02/23/2016 [ 112.715582] Workqueue: events azx_probe_work [snd_hda_intel] [ 112.715583] 0000000000000000 ffff880261047ca8 ffffffff81404d25 0000000000000000 [ 112.715586] 0000000000000000 ffff880261047ce8 ffffffff81079b7c 0000013500000000 [ 112.715588] ffff880036c633f8 00000000fffffff0 ffff880036c633f8 ffff88026441b490 [ 112.715590] Call Trace: [ 112.715594] [<ffffffff81404d25>] dump_stack+0x67/0x92 [ 112.715596] [<ffffffff81079b7c>] __warn+0xcc/0xf0 [ 112.715598] [<ffffffff81079c58>] warn_slowpath_null+0x18/0x20 [ 112.715601] [<ffffffffa00ef67a>] snd_hdac_i915_init+0x12a/0x170 [snd_hda_core] [ 112.715604] [<ffffffffa001f20c>] azx_probe_work+0x6c/0x870 [snd_hda_intel] [ 112.715606] [<ffffffff810cbc99>] ? __lock_is_held+0x49/0x70 [ 112.715609] [<ffffffff810962fb>] process_one_work+0x1cb/0x680 [ 112.715610] [<ffffffff81096276>] ? process_one_work+0x146/0x680 [ 112.715612] [<ffffffff810967f9>] worker_thread+0x49/0x490 [ 112.715614] [<ffffffff810967b0>] ? process_one_work+0x680/0x680 [ 112.715615] [<ffffffff8109ceda>] kthread+0xea/0x100 [ 112.715618] [<ffffffff817d0c87>] ? _raw_spin_unlock_irq+0x27/0x50 [ 112.715620] [<ffffffff817d1672>] ret_from_fork+0x22/0x40 [ 112.715621] [<ffffffff8109cdf0>] ? kthread_create_on_node+0x1f0/0x1f0 [ 112.715622] ---[ end trace d0e1df257615cbc1 ]--- I have attached dmesg and commits.log