Bug 70336

Summary: [HSW regression] module_reload doesn't work on Haswell
Product: DRI Reporter: fangxun <xunx.fang>
Component: DRM/Intel Assignee: Paulo Zanoni <przanoni>
Status: CLOSED FIXED QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: major    
Priority: high CC: huax.lu, intel-gfx-bugs, przanoni
Version: XOrg git   
Hardware: All   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:
Attachments:
Description Flags
dmesg none

Description fangxun 2013-10-10 08:42:35 UTC
System Environment:
--------------------------
Platform: Haswell
Kernel:    (drm-intel-fixes)d32270460fee83e22ee9e6b1bfd7b486263eeb1d

Bug detailed description:
-----------------------------
Running module_reload, the output shows:
ERROR: Module i915 is in use
WARNING: i915.ko still loaded!

I did an analysis:
 After running "echo 0 > /sys/class/vtconsole/vtcon1/bind", "lsmod" still showed 2 users of the i915 module. This only happens on Haswell, and only since kernel 3.12-rc2. I found that snd_hda_intel is what uses the i915 module: after unloading snd_hda_intel, module_reload worked.

[root@x-hsw24 ~]# lsmod
Module                  Size  Used by
snd_hda_codec_realtek    34049  1
snd_hda_codec_hdmi     27236  1
snd_hda_intel          25263  0
snd_hda_codec         100145  3 snd_hda_codec_realtek,snd_hda_codec_hdmi,snd_hda_intel
snd_hwdep               5094  1 snd_hda_codec
snd_seq                41921  0
snd_seq_device          4581  1 snd_seq
snd_pcm                66241  3 snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec
iTCO_wdt                4551  1
iTCO_vendor_support     1608  1 iTCO_wdt
snd_timer              15514  2 snd_seq,snd_pcm
snd                    50169  9 snd_hda_codec_realtek,snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec,snd_hwdep,snd_seq,snd_seq_device,snd_pcm,snd_timer
serio_raw               3929  0
pcspkr                  1699  0
microcode               7049  0
soundcore               4219  1 snd
i2c_i801                8374  0
snd_page_alloc          5930  2 snd_hda_intel,snd_pcm
lpc_ich                12608  0
mfd_core                2441  1 lpc_ich
acpi_cpufreq            6299  0
freq_table              2068  1 acpi_cpufreq
uinput                  6676  0
ipv6                  246159  43
i915                  531119  2
drm_kms_helper         22993  1 i915
drm                   199664  2 i915,drm_kms_helper
button                  4261  1 i915
video                  10625  1 i915
dm_mirror              11024  0
dm_region_hash          5831  1 dm_mirror
dm_log                  7220  2 dm_mirror,dm_region_hash
dm_mod                 66376  2 dm_mirror,dm_log
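
The observation in the analysis above (i915 has a use count of 2, yet nothing is named in its "Used by" column) can be checked mechanically. The following is a minimal Python sketch, not part of module_reload; the helper names parse_lsmod and unnamed_references are hypothetical, and the sample text is a shortened copy of the listing above. It assumes the usual three- or four-column lsmod layout.

```python
def parse_lsmod(text):
    """Parse lsmod-style output into {module: (use_count, [holders])}."""
    modules = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module  Size  Used by" header.
        if len(fields) < 3 or fields[0] == "Module":
            continue
        name, count = fields[0], fields[2]
        if not count.isdigit():
            continue
        # The optional fourth column is a comma-separated list of holders.
        holders = fields[3].split(",") if len(fields) > 3 else []
        modules[name] = (int(count), holders)
    return modules


def unnamed_references(modules, name):
    """Number of references on `name` not explained by a listed holder."""
    count, holders = modules[name]
    return count - len(holders)


# Shortened copy of the lsmod listing from the report.
SAMPLE = """\
Module                  Size  Used by
snd_hda_intel          25263  0
i915                  531119  2
drm_kms_helper         22993  1 i915
drm                   199664  2 i915,drm_kms_helper
"""

if __name__ == "__main__":
    mods = parse_lsmod(SAMPLE)
    for name in ("i915", "drm"):
        print(name, mods[name], unnamed_references(mods, name))
```

For i915 both references are unnamed, which matches the report: lsmod alone does not say who holds them, and snd_hda_intel was identified only by unloading it.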

Reproduce steps:
----------------------------
1. ./module_reload
Comment 1 Daniel Vetter 2013-10-10 09:22:49 UTC
Ok, I've fixed the testcase to properly fail when the module reloading doesn't work:

commit 8a9b275b96f1ea5637d21e4568647dcb7fed98f2
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Thu Oct 10 11:22:09 2013 +0200

    tests/module_reload: fail if the module didn't unload

Please confirm that we now no longer report a success here.

Also I think Paulo is working on a patch to make module reload work again on hsw.
Comment 2 lu hua 2013-10-23 05:45:30 UTC
It still exists.
Comment 3 Daniel Vetter 2013-10-27 19:21:35 UTC
(In reply to comment #2)
> It still exists.

Erhm, I didn't say that the bug is now fixed, but asked you to confirm that the test now properly fails (with a nonzero exit status) ...
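
For illustration, the behaviour being asked for (a nonzero exit status when the module is still loaded after the unload attempt) could look like the Python sketch below. This is not the code from the commit above; module_loaded and unload_exit_status are hypothetical helpers, and the input is assumed to be the contents of /proc/modules, where the module name is the first field of each line.

```python
import sys


def module_loaded(proc_modules_text, module):
    """True if `module` is the first field of any /proc/modules line."""
    return any(
        line.split()[0] == module
        for line in proc_modules_text.splitlines()
        if line.strip()
    )


def unload_exit_status(proc_modules_text, module="i915"):
    """Return 0 if the module is gone after the unload attempt, 1 otherwise."""
    if module_loaded(proc_modules_text, module):
        # Same warning text as in the original report.
        print(f"WARNING: {module}.ko still loaded!", file=sys.stderr)
        return 1
    return 0
```

A caller would read /proc/modules after the rmmod attempt and pass the returned value to sys.exit(), so that a test runner sees the failure instead of a success.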
Comment 4 Paulo Zanoni 2013-11-01 15:00:20 UTC
I just pushed a patch for this to intel-gpu-tools. With this patch, we should go back to hitting the other module_reload bugs we already have on Haswell.

commit bd0aa100ca438fa68cf07dc55ec6dbfe7391ba6c
Author: Paulo Zanoni <paulo.r.zanoni@intel.com>
    module_reload: remove snd_hda_intel

Closing bug. If it still happens, please reopen.
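
As a rough illustration of the ordering this fix implies (release the sound driver's reference before removing i915), here is a Python sketch that only builds the command list and executes nothing. It is not the module_reload script itself: reload_plan is a hypothetical helper, the vtcon1 path is taken from the description above, and the final modprobe step is an assumption about how the module is loaded again.

```python
def reload_plan(module="i915", blockers=("snd_hda_intel",), vtcon="vtcon1"):
    """Build the ordered shell commands for one unload/reload cycle."""
    # Unbind the console first, as in the original analysis.
    plan = [f"echo 0 > /sys/class/vtconsole/{vtcon}/bind"]
    # Modules holding a reference must go before the module itself.
    plan += [f"rmmod {blocker}" for blocker in blockers]
    plan.append(f"rmmod {module}")
    # Assumption: the module is loaded again with modprobe.
    plan.append(f"modprobe {module}")
    return plan


if __name__ == "__main__":
    for command in reload_plan():
        print(command)
```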
Comment 5 lu hua 2013-11-04 02:14:53 UTC
Tested on the latest -nightly kernel.

The first cycle produces a call trace:
output:
module successfully unloaded
module successfully loaded again

dmesg:
[   31.122255] ------------[ cut here ]------------
[   31.122373] WARNING: CPU: 2 PID: 3913 at fs/sysfs/file.c:498 sysfs_attr_ns+0x25/0x8c()
[   31.122545] sysfs: kobject \xffffffd0\xffffff85\xffffff85\x04\xffffff9e without dirent
[   31.122660] Modules linked in: netconsole configfs ipv6 dm_mod snd_hda_codec_realtek pcspkr snd_hda_codec_hdmi i2c_i801 iTCO_wdt iTCO_vendor_support lpc_ich mfd_core snd_hda_codec snd_hwdep snd_pcm snd_page_alloc snd_timer snd soundcore acpi_cpufreq i915(-) video button drm_kms_helper drm freq_table [last unloaded: snd_hda_intel]
[   31.124140] CPU: 2 PID: 3913 Comm: rmmod Not tainted 3.12.0-rc7_nightlytop_265a90_20131104_+ #2203
[   31.124317] Hardware name: Intel Corporation Shark Bay Client platform/Flathead Creek Crb, BIOS HSWLPTU1.86C.0131.R03.1307262359 07/26/2013
[   31.124498]  0000000000000000 0000000000000009 ffffffff8170464c ffff880243723cd8
[   31.124827]  ffffffff8103319e ffff880200000000 ffffffff8112f8b7 ffff880243723ca8
[   31.125153]  ffffffff81adc2b0 ffff88009e073610 ffff88009e0c7a00 ffff88009e13db40
[   31.125477] Call Trace:
[   31.125579]  [<ffffffff8170464c>] ? dump_stack+0x41/0x51
[   31.125693]  [<ffffffff8103319e>] ? warn_slowpath_common+0x73/0x8b
[   31.125809]  [<ffffffff8112f8b7>] ? sysfs_attr_ns+0x25/0x8c
[   31.125916]  [<ffffffff8103324e>] ? warn_slowpath_fmt+0x45/0x4a
[   31.126028]  [<ffffffff8112f8b7>] ? sysfs_attr_ns+0x25/0x8c
[   31.126142]  [<ffffffff8112f934>] ? sysfs_remove_file+0x16/0x32
[   31.126255]  [<ffffffff8136ab85>] ? device_del+0x114/0x17a
[   31.126368]  [<ffffffff8136abf4>] ? device_unregister+0x9/0x12
[   31.126487]  [<ffffffffa000befc>] ? drm_sysfs_connector_remove+0x7d/0x89 [drm]
[   31.126672]  [<ffffffffa00905fc>] ? intel_modeset_cleanup+0xb9/0xe2 [i915]
[   31.126796]  [<ffffffffa00628ca>] ? i915_driver_unload+0xb6/0x2a7 [i915]
[   31.126909]  [<ffffffffa000954e>] ? drm_dev_unregister+0x21/0xd0 [drm]
[   31.127025]  [<ffffffffa0009ba6>] ? drm_put_dev+0x48/0x51 [drm]
[   31.127137]  [<ffffffff812eb190>] ? pci_device_remove+0x24/0x48
[   31.127251]  [<ffffffff8136d34f>] ? __device_release_driver+0x68/0xc1
[   31.127364]  [<ffffffff8136da29>] ? driver_detach+0x6e/0x99
[   31.127476]  [<ffffffff8136d1ab>] ? bus_remove_driver+0x78/0xb9
[   31.127590]  [<ffffffff812eb2c5>] ? pci_unregister_driver+0x17/0x75
[   31.127703]  [<ffffffffa000b210>] ? drm_pci_exit+0x3b/0x72 [drm]
[   31.127817]  [<ffffffff81077d1d>] ? SyS_delete_module+0x1a3/0x219
[   31.127929]  [<ffffffff81709ef2>] ? page_fault+0x22/0x30
[   31.128039]  [<ffffffff8170ebe2>] ? system_call_fastpath+0x16/0x1b
[   31.128152] ---[ end trace 67e4404f3dd63a8a ]---


The second cycle segfaults.
output:
./module_reload: line 27: 18675 Segmentation fault      rmmod i915
WARNING: i915.ko still loaded!
Comment 6 lu hua 2013-11-04 02:16:42 UTC
Created attachment 88584 [details]
dmesg
Comment 7 Daniel Vetter 2013-11-04 07:05:55 UTC
That's a different bug, please file a new one.

Here we only track the module unload failure due to the sound module reference on i915.ko
Comment 8 lu hua 2013-11-04 08:27:21 UTC
The segfault happens on the -nightly and drm-next branches. It doesn't happen on the -queued and -fixes kernels.
Comment 9 Daniel Vetter 2013-11-04 09:02:07 UTC
(In reply to comment #8)
> The segfault happens on -nightly and drm-next branch. It doesn't happens on
> -queued and -fixes kernel.

Oh, then I have an idea. As soon as you've filed the bug I'll point you at the relevant patch.
Comment 10 Jari Tahvanainen 2016-09-30 13:07:04 UTC
Closing Verified+Fixed.
