Summary: | [PNV/ILK] igt/module_reload fails and causes many cases fail: modules.dep: no such file
---|---
Product: | DRI
Reporter: | lu hua <huax.lu>
Component: | DRM/Intel
Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: | CLOSED FIXED
QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: | normal
Priority: | medium
CC: | gordon.jin, xunx.fang, yangweix.shui, yi.sun
Version: | unspecified
Hardware: | All
OS: | Linux (All)
Whiteboard: |
i915 platform: |
i915 features: |
Attachments: |
Description
lu hua
2013-07-05 06:51:10 UTC
Please boot the system with drm debugging enabled, run the module reload testcase and then attach the complete dmesg.

Created attachment 82066 [details]
dmesg
(In reply to comment #0)
> output:
> module successfully unloaded
> FATAL: Could not load
> /lib/modules/3.10.0-rc7_drm-intel-fixes_446f8d_20130704_+/modules.dep: No
> such file or directory
> ./module_reload: line 39: /sys/class/vtconsole/vtcon1/bind: No such file or
> directory
> gem_create:487 failed, ret=-1, errno=9
> ./module_reload: line 42: 3759 Aborted (core dumped)
> $SOURCE_DIR/gem_exec_nop > /dev/null

According to dmesg the module unloads without issues, but then it never gets loaded. And the output pasted here indicates that something with your installed kernel image is seriously broken since it seems to be unable to read the modules.dep file. Everything else is just fallout from that. So I think there's something wrong either with the installed kernel or the module load tools on that machine, but the kernel itself and the igt test seem to work like they should.

It also happens on ILK,SNB,IVB. Run with clean boot.
output:
./module_reload
module successfully unloaded
ERROR: could not insert 'i915': Operation not permitted
./module_reload: line 39: /sys/class/vtconsole/vtcon1/bind: No such file or directory
gem_create:487 failed, ret=-1, errno=9
./module_reload: line 42: 3991 Aborted (core dumped) $SOURCE_DIR/gem_exec_nop > /dev/null

Also I run it manually:
echo 0 > /sys/class/vtconsole/vtcon1/bind
modprobe -r i915
modprobe -r drm
modprobe i915
ERROR: could not insert 'i915': Operation not permitted

Created attachment 82161 [details]
dmesg
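The manual sequence quoted above (unbind the console, unload i915 and drm, load i915 again) can be sketched as a small shell function. This is only an illustration, not the igt module_reload script: the function name `reload_i915` and the `RUN` and `VTCON` variables are hypothetical. By default the steps are only printed; the function stops at the first failing step, so the first error is reported instead of the fallout that follows it.

```shell
#!/bin/sh
# Sketch of the manual reload sequence from the comment above.
# RUN and VTCON are hypothetical knobs, not part of igt:
#   RUN unset  -> every step is only printed (dry run)
#   RUN=""     -> every step is executed (needs root)
reload_i915() {
    run=${RUN-echo}
    vtcon=${VTCON:-/sys/class/vtconsole/vtcon1/bind}

    # Unbind the framebuffer console so that i915 can be unloaded.
    $run sh -c "echo 0 > $vtcon" || { echo "unbinding vtcon failed" >&2; return 1; }
    # Unload the driver and then the drm core.
    $run modprobe -r i915 || { echo "unloading i915 failed" >&2; return 2; }
    $run modprobe -r drm || { echo "unloading drm failed" >&2; return 3; }
    # Load the driver again; this is the step that failed in the report.
    $run modprobe i915 || { echo "loading i915 failed" >&2; return 4; }
}
```

Calling `reload_i915` with `RUN` unset prints the four steps; running `RUN="" reload_i915` as root would execute them, and the return value (1 to 4) identifies the step that failed.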
Please reopen bugs when new data indicating that it's still broken (or broken on other machines) shows up. Can you please retest with latest -nightly? I've accidentally merged a broken patch which could have broken module reload. Also please check again for a last known-good configuration, since module-reloading really worked once ...

It still fails on latest -nightly branch.

The latest good configuration:
igt commit:bc388b54d4325669bfffef314c6f18349c239a1c and kernel commit(nightly branch):4f9e7cfb09aa3e2fc3b3bba635c6d0c558ce1b70 (merge 53d3b4d 22e407d)

It also fails on 53d3b4d and 22e407d.

(In reply to comment #7)
> It still fails on latest -nightly branch.
>
> The latest good configuration:
> igt commit:bc388b54d4325669bfffef314c6f18349c239a1c and kernel
> commit(nightly branch):4f9e7cfb09aa3e2fc3b3bba635c6d0c558ce1b70 (merge
> 53d3b4d 22e407d)
>
> It also fails on 53d3b4d and 22e407d.

Is this just for PNV or for all platforms? Please clarify.

Oops, I've just noticed that I've failed to push the fixed-up dinq/-nightly branches yesterday. Can you please retest. My apologies for the mess I've caused here.

(In reply to comment #8)
> (In reply to comment #7)
> > It still fails on latest -nightly branch.
> >
> > The latest good configuration:
> > igt commit:bc388b54d4325669bfffef314c6f18349c239a1c and kernel
> > commit(nightly branch):4f9e7cfb09aa3e2fc3b3bba635c6d0c558ce1b70 (merge
> > 53d3b4d 22e407d)
> >
> > It also fails on 53d3b4d and 22e407d.
>
> Is this just for PNV or for all platforms? Please clarify.

All platforms. PNV,ILK,SNB,IVB,HSW.

(In reply to comment #10)
> (In reply to comment #8)
> > (In reply to comment #7)
> > > It still fails on latest -nightly branch.
> > >
> > > The latest good configuration:
> > > igt commit:bc388b54d4325669bfffef314c6f18349c239a1c and kernel
> > > commit(nightly branch):4f9e7cfb09aa3e2fc3b3bba635c6d0c558ce1b70 (merge
> > > 53d3b4d 22e407d)
> > >
> > > It also fails on 53d3b4d and 22e407d.
> >
> > Is this just for PNV or for all platforms? Please clarify.
>
> All platforms. PNV,ILK,SNB,IVB,HSW.

dinq commit 34b9674c786c73e5472e8b98a729bcdde9197859 works for me on IVB

Ok, I've run module reload in an endless-loop and the thing indeed eventually crashed, with some neat fail in dmesg (this is on my gm45 laptop):

[11332.369834] ------------[ cut here ]------------
[11332.369900] WARNING: at include/linux/kref.h:47 kref_get+0x2d/0x36()
[11332.369962] Modules linked in: i915(+) drm_kms_helper drm cpufreq_userspace cpufreq_powersave cpufreq_conservative cpufreq_stats parport_pc ppdev lp parport bnep rfcomm bluetooth binfmt_misc uinput loop firewire_sbp2 fuse b43 bcma mac80211 snd_hda_codec_hdmi cfg80211 snd_hda_codec_idt snd_hda_intel snd_hda_codec rfkill rng_core i2c_algo_bit snd_hwdep snd_pcm snd_page_alloc snd_seq ssb pcmcia acpi_cpufreq snd_seq_device snd_timer hid_generic snd dell_wmi sparse_keymap yenta_socket pcmcia_rsrc mperf iTCO_wdt processor usbhid iTCO_vendor_support lpc_ich psmouse video hid i2c_i801 button battery ac pcmcia_core dell_laptop wmi dcdbas pcspkr serio_raw mfd_core evdev i2c_core soundcore ext4 crc16 mbcache jbd2 dm_mod sr_mod sd_mod cdrom crc_t10dif sdhci_pci sdhci firewire_ohci firewire_core mmc_core crc_itu_t ahci libahci libata scsi_mod uhci_hcd ehci_pci ehci_hcd usbcore usb_common thermal thermal_sys [last unloaded: drm]
[11332.374334] CPU: 0 PID: 28004 Comm: modprobe Not tainted 3.10.0-rc7+ #174
[11332.374399] Hardware name: Dell Inc. Latitude E6400 /0W620R, BIOS A14 05/11/2009
[11332.374475] 0000000000000000 ffff880210685728 ffffffff81397d34 ffff880210685760
[11332.374716] ffffffff810308d8 0000000000000000 ffff8802106857d0 ffff880211f865a8
[11332.374953] ffff88021212ca28 0000000000000000 ffff880210685770 ffffffff8103098f
[11332.375191] Call Trace:
[11332.375253] [<ffffffff81397d34>] dump_stack+0x19/0x1b
[11332.375316] [<ffffffff810308d8>] warn_slowpath_common+0x60/0x78
[11332.375378] [<ffffffff8103098f>] warn_slowpath_null+0x15/0x17
[11332.375441] [<ffffffff8138132a>] kref_get+0x2d/0x36
[11332.375504] [<ffffffff81381620>] klist_next+0x70/0xa6
[11332.375566] [<ffffffff811eb3b7>] ? pci_do_find_bus+0x4a/0x4a
[11332.375629] [<ffffffff8126f01a>] next_device+0x9/0x18
[11332.375690] [<ffffffff8126f1c4>] bus_find_device+0x52/0x8c
[11332.375751] [<ffffffff811eb59b>] pci_get_dev_by_id+0x59/0x7f
[11332.375812] [<ffffffff811eb6a4>] pci_get_class+0x48/0x4a
[11332.375913] [<ffffffffa069e2d5>] intel_dsm_detect+0x5f/0x150 [i915]
[11332.376006] [<ffffffffa069e3cf>] intel_register_dsm_handler+0x9/0xb [i915]
[11332.376118] [<ffffffffa064fbf4>] i915_driver_load+0xa42/0xcc1 [i915]
[11332.376209] [<ffffffffa033761f>] drm_get_pci_dev+0x161/0x274 [drm]
[11332.376293] [<ffffffffa064c65e>] i915_pci_probe+0x4e/0x58 [i915]
[11332.376358] [<ffffffff811eadbf>] local_pci_probe+0x39/0x61
[11332.376422] [<ffffffff811eaea6>] pci_device_probe+0xbf/0xe5
[11332.376483] [<ffffffff812709d7>] driver_probe_device+0x98/0x1c4
[11332.376545] [<ffffffff81270b97>] __driver_attach+0x5c/0x7e
[11332.376606] [<ffffffff81270b3b>] ? __device_attach+0x38/0x38
[11332.376668] [<ffffffff8126f0a1>] bus_for_each_dev+0x78/0x82
[11332.376729] [<ffffffff812704d0>] driver_attach+0x19/0x1b
[11332.376792] [<ffffffff812701a1>] bus_add_driver+0xf4/0x1fc
[11332.376854] [<ffffffff81271073>] driver_register+0x87/0xf8
[11332.376915] [<ffffffff811ea4e0>] __pci_register_driver+0x5b/0x5e
[11332.376980] [<ffffffffa0481000>] ? 0xffffffffa0480fff
[11332.377050] [<ffffffffa03377b8>] drm_pci_init+0x86/0xea [drm]
[11332.377113] [<ffffffffa0481000>] ? 0xffffffffa0480fff
[11332.377190] [<ffffffffa0481066>] i915_init+0x66/0x68 [i915]
[11332.377253] [<ffffffff81000263>] do_one_initcall+0x7b/0x10f
[11332.377316] [<ffffffff81081e4b>] load_module+0x1224/0x1dad
[11332.377377] [<ffffffff8107ea80>] ? store_uevent+0x35/0x35
[11332.377440] [<ffffffff810de1ac>] ? might_fault+0x3d/0x8b
[11332.377502] [<ffffffff81082a74>] SyS_init_module+0xa0/0xaf
[11332.377565] [<ffffffff813a2d92>] system_call_fastpath+0x16/0x1b
[11332.377626] ---[ end trace f64ad603e02a19b0 ]---

Two questions:
- Is the failure on your side 100% reproducible?
- Any backtraces in dmesg?

(In reply to comment #12)
> Ok, I've run module reload in an endless-loop and the thing indeed
> eventually crashed, with some neat fail in dmesg (this is on my gm45
> laptop):
>
>
> Two questions:
> - Is the failure on your side 100% reproducible?
> - Any backtraces in dmesg?

1. The failure is 100% reproducible.
2. No backtraces in dmesg.

(In reply to comment #13)
> 1. The failure is 100% reproducible.
> 2. No backtraces in dmesg.

You've mentioned that module_reload fails on pnv, ilk, snb, ivb, hsw. Is it 100% failure on all these platforms with no output on dmesg?

Also please double-check your local installation - after all module reloading _has_ worked before, so something broke it. If it's not a regression in the kernel, then it must be a regression somewhere in your kernel build/test setup.

Also the above list of platforms are almost all we still test. Are there any platforms where module reload still works?

Gordon, can you please ramp up the priority for investigating this? Afaik module reloading works, so I'm left with the conclusion that something (recently) broke with your -nightly setup. Flying blind like that makes me feel uneasy ... Maybe also drag Yi into the analysis.
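The two kinds of repeated runs that come up in this thread, looping until the first failure and counting failures over a fixed number of runs, could be sketched as below. The helper names `run_until_failure` and `count_failures` are made up for illustration, and the sketch assumes the test command reports failure through a nonzero exit status, which the thread does not confirm for module_reload.

```shell
#!/bin/sh
# run_until_failure MAX CMD [ARGS...]
# Repeats CMD up to MAX times and stops at the first nonzero exit status.
run_until_failure() {
    max=$1
    shift
    i=1
    while [ "$i" -le "$max" ]; do
        if ! "$@" >/dev/null 2>&1; then
            echo "failed on iteration $i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "no failure in $max iterations"
}

# count_failures TOTAL CMD [ARGS...]
# Runs CMD TOTAL times and prints a summary such as "2/5 FAIL".
count_failures() {
    total=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$total" ]; do
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures/$total FAIL"
}
```

For example, `count_failures 5 ./module_reload` would print a per-platform figure in the `n/5 FAIL` form, and `run_until_failure 1000 ./module_reload` approximates an endless loop with an upper bound.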
Test on latest -nightly branch(09fa6edd6bf1ee1aa2092a8f50d407338be888e8):
IVB: 0/5 FAIL
SNB: 2/5 FAIL
ILK: 5/5 FAIL
PNV: 5/5 FAIL

When module_reload fails on these platforms, is the output always like

module successfully unloaded
FATAL: Could not load /lib/modules/3.10.0-rc7_drm-intel-fixes_446f8d_20130704_+/modules.dep: No such file or directory
./module_reload: line 39: /sys/class/vtconsole/vtcon1/bind: No such file or directory
gem_create:487 failed, ret=-1, errno=9
./module_reload: line 42: 3759 Aborted (core dumped)

i.e. what you've pasted in comment #0? Or is there different output in some case/on some platforms?

Test on latest -nightly kernel:
On PNV and ILK output:
module successfully unloaded
FATAL: Could not load /lib/modules/3.10.0-rc7_nightlytop_a3ee4d_20130712_+/modules.dep: No such file or directory
./module_reload: line 39: /sys/class/vtconsole/vtcon1/bind: No such file or directory
gem_create:487 failed, ret=-1, errno=9
./module_reload: line 42: 3749 Aborted (core dumped) $SOURCE_DIR/gem_exec_nop > /dev/null
On SNB output:
module successfully unloaded
./module_reload: line 39: /sys/class/vtconsole/vtcon1/bind: No such file or directory
gem_create:487 failed, ret=-1, errno=9
./module_reload: line 42: 5668 Aborted (core dumped) $SOURCE_DIR/gem_exec_nop > /dev/null

(In reply to comment #17)
> Test on latest -nightly kernel:
> On PNV and ILK output:
> module successfully unloaded
> FATAL: Could not load
> /lib/modules/3.10.0-rc7_nightlytop_a3ee4d_20130712_+/modules.dep: No such
> file or directory
> ./module_reload: line 39: /sys/class/vtconsole/vtcon1/bind: No such file or
> directory
> gem_create:487 failed, ret=-1, errno=9
> ./module_reload: line 42: 3749 Aborted (core dumped)
> $SOURCE_DIR/gem_exec_nop > /dev/null

This really looks like an issue in userspace: "modules.dep: No such file or directory" means that modprobe can't properly load the module. Can you please check what might be going wrong on your system? Also have you recently upgraded anything or installed these systems newly which might explain the breakage?

> On SNB output:
> module successfully unloaded
> ./module_reload: line 39: /sys/class/vtconsole/vtcon1/bind: No such file or
> directory
> gem_create:487 failed, ret=-1, errno=9
> ./module_reload: line 42: 5668 Aborted (core dumped)
> $SOURCE_DIR/gem_exec_nop > /dev/null

This looks more like a kernel issue, at least it's possible that it's a kernel issue. Please file a new bug for this snb machine and attach debug dmesg for the module reload.

(In reply to comment #4)
> It also happens on ILK,SNB,IVB. Run with clean boot.
> output:
> ./module_reload
> module successfully unloaded
> ERROR: could not insert 'i915': Operation not permitted
> ./module_reload: line 39: /sys/class/vtconsole/vtcon1/bind: No such file or
> directory
> gem_create:487 failed, ret=-1, errno=9
> ./module_reload: line 42: 3991 Aborted (core dumped)
> $SOURCE_DIR/gem_exec_nop > /dev/null
>
> Also I run it manually:
> echo 0 > /sys/class/vtconsole/vtcon1/bind
> modprobe -r i915
> modprobe -r drm
> modprobe i915
> ERROR: could not insert 'i915': Operation not permitted

1. We found that FATAL: "Could not load /lib/modules/3.10.0-rc7_drm-intel-fixes_446f8d_20130704_+/modules.dep: No such file or directory ./module_reload: line 39: /sys/class/vtconsole/vtcon1/bind: No such file or directory" is not an issue.
2. "ERROR: could not insert 'i915': Operation not permitted" is an issue, it has been fixed now. So I close this bug.

Verified.Fixed.

That still leaves the issue on SNB, which looks like a different bug. Is module_reload working on SNB well again? If it's still broken like you show in comment #17 then please file a new bug report with the requested information.

Closing old verified.
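The userspace check requested in this thread (is modules.dep present for the kernel under test?) might look like the sketch below. The function name `has_modules_dep` and its optional arguments are hypothetical; the default path follows the `/lib/modules/<release>/modules.dep` layout shown in the pasted error. Note that the thread ultimately concluded that the modules.dep message was not the cause of the failure, so this only rules out one suspect.

```shell
#!/bin/sh
# has_modules_dep [RELEASE] [ROOT]
# Reports whether ROOT/RELEASE/modules.dep is readable.
# RELEASE defaults to the running kernel, ROOT to /lib/modules.
has_modules_dep() {
    release=${1:-$(uname -r)}
    root=${2:-/lib/modules}
    dep="$root/$release/modules.dep"
    if [ -r "$dep" ]; then
        echo "ok: $dep"
    else
        echo "missing: $dep"
        return 1
    fi
}
```

If the file turns out to be missing for the running kernel, running `depmod -a` as root is the usual way to regenerate it before retrying `modprobe i915`.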