Bug 66610 - [PNV/ILK] igt/module_reload fails and causes many cases fail: modules.dep: no such file
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: All Linux (All)
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
Reported: 2013-07-05 06:51 UTC by lu hua
Modified: 2017-10-06 14:45 UTC
CC List: 4 users


Attachments
dmesg (69.65 KB, text/plain)
2013-07-05 07:53 UTC, lu hua
dmesg (73.05 KB, text/plain)
2013-07-08 05:12 UTC, lu hua

Description lu hua 2013-07-05 06:51:10 UTC
System Environment:
--------------------------
Arch:           i386
Platform:       pineview
Kernel:	(drm-intel-fixes) 446f8d81ca2d9cefb614e87f2fabcc996a9e4e7e

Bug detailed description:
-------------------------
It fails on pineview with both the drm-intel-fixes and drm-intel-next-queued kernels. I couldn't find a good commit. The issue also occurs on (drm-intel-fixes) 8abbbaf6adb46157b6bd416f7616b555cc6a332f.
After running ./module_reload, the following cases also fail:
igt/gem_ctx_bad_destroy	
igt/gem_hangcheck_forcewake	
igt/gem_largeobject	
igt/gem_mmap_offset_exhaustion	
igt/gem_ring_sync_loop	
igt/gem_storedw_loop_bsd	
igt/gem_storedw_loop_render	
igt/gem_tiled_pread	
igt/gen3_mixed_blits	
igt/gen3_render_linear_blits	
igt/module_reload	
igt/prime_self_import/with_two_bos

output:
module successfully unloaded
FATAL: Could not load /lib/modules/3.10.0-rc7_drm-intel-fixes_446f8d_20130704_+/modules.dep: No such file or directory
./module_reload: line 39: /sys/class/vtconsole/vtcon1/bind: No such file or directory
gem_create:487 failed, ret=-1, errno=9
./module_reload: line 42:  3759 Aborted                 (core dumped) $SOURCE_DIR/gem_exec_nop > /dev/null


dmesg:
[  127.169426] Console: switching to colour VGA+ 80x25
[  127.175588] [drm:intel_crtc_cursor_set], cursor off
[  127.175597] [drm:intel_crtc_set_config], [CRTC:3] [NOFB]
[  127.175607] [drm:intel_modeset_stage_output_state], [CONNECTOR:5:LVDS-1] to [CRTC:4]
[  127.175613] [drm:intel_crtc_cursor_set], cursor off
[  127.175617] [drm:intel_crtc_set_config], [CRTC:4] [FB:14] #connectors=1 (x y) (0 0)
[  127.175625] [drm:intel_modeset_stage_output_state], [CONNECTOR:5:LVDS-1] to [CRTC:4]
[  127.183992] drm_kms_helper: drm: unregistered panic notifier
[  127.188110] [drm:i915_get_vblank_counter], trying to get vblank count for disabled pipe A
[  127.188953] [drm:intel_crtc_cursor_set], cursor off
[  127.188966] [drm:intel_crtc_cursor_set], cursor off
[  127.193990] [drm] Module unloaded


Reproduce steps:
----------------
1. ./module_reload
Comment 1 Daniel Vetter 2013-07-05 07:04:50 UTC
Please boot the system with drm debugging enabled, run the module reload testcase and then attach the complete dmesg.
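
One way to do this at runtime is sketched below. It assumes the drm module exposes its debug mask at /sys/module/drm/parameters/debug and that 0xe (driver, KMS and PRIME messages) is a sufficient mask; the function name enable_drm_debug and its arguments are hypothetical, introduced only for illustration. Booting with drm.debug=0xe on the kernel command line is an alternative that also covers messages logged while the driver loads.

```shell
# Set the DRM debug mask so the driver logs verbose messages to dmesg.
# $1: debug mask (default 0xe, an assumed value; adjust as needed)
# $2: parameter file (default is the drm sysfs parameter; overridable for testing)
enable_drm_debug() {
    mask="${1:-0xe}"
    param="${2:-/sys/module/drm/parameters/debug}"
    # The file is only writable by root and only exists while drm is loaded.
    [ -w "$param" ] || { echo "cannot write $param (need root, or drm not loaded)" >&2; return 1; }
    echo "$mask" > "$param"
}

# Typical use (as root), then capture the complete log for attaching:
#   enable_drm_debug && ./module_reload; dmesg > dmesg.txt
```

Because the parameter lives in the drm module, a value set this way may be lost when drm itself is unloaded and reloaded, which is why the boot parameter is often the safer choice for a module reload test.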
Comment 2 lu hua 2013-07-05 07:53:51 UTC
Created attachment 82066
dmesg
Comment 3 Daniel Vetter 2013-07-05 08:01:02 UTC
(In reply to comment #0)
> output:
> module successfully unloaded
> FATAL: Could not load
> /lib/modules/3.10.0-rc7_drm-intel-fixes_446f8d_20130704_+/modules.dep: No
> such file or directory
> ./module_reload: line 39: /sys/class/vtconsole/vtcon1/bind: No such file or
> directory
> gem_create:487 failed, ret=-1, errno=9
> ./module_reload: line 42:  3759 Aborted                 (core dumped)
> $SOURCE_DIR/gem_exec_nop > /dev/null

According to dmesg the module unloads without issues, but then it never gets loaded. And the output pasted here indicates that something with your installed kernel image is seriously broken since it seems to be unable to read the modules.dep file. Everything else is just fallout from that.

So I think there's something wrong either with the installed kernel or the module load tools on that machine, but the kernel itself and the igt test seem to work like they should.
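
A quick way to test this diagnosis is to check whether modules.dep exists for the kernel release in question. The sketch below assumes the default /lib/modules layout; check_modules_dep and its second argument are hypothetical names introduced for illustration, and running depmod is only a likely remedy, not one confirmed in this report.

```shell
# Check whether modules.dep exists for a given kernel release.
# $1: kernel release (defaults to the running kernel, from `uname -r`)
# $2: module root directory (defaults to /lib/modules; overridable for testing)
check_modules_dep() {
    release="${1:-$(uname -r)}"
    root="${2:-/lib/modules}"
    dep="$root/$release/modules.dep"
    if [ -f "$dep" ]; then
        echo "ok: $dep"
        return 0
    fi
    # depmod regenerates modules.dep from the installed modules.
    echo "missing: $dep (try: depmod -a $release)" >&2
    return 1
}

# Typical use on the affected machine:
#   check_modules_dep
```

If the file is missing, modprobe cannot resolve module dependencies, which would match the FATAL message quoted above.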
Comment 4 lu hua 2013-07-08 05:11:52 UTC
It also happens on ILK, SNB, and IVB. Tested after a clean boot.
output:
 ./module_reload
module successfully unloaded
ERROR: could not insert 'i915': Operation not permitted
./module_reload: line 39: /sys/class/vtconsole/vtcon1/bind: No such file or directory
gem_create:487 failed, ret=-1, errno=9
./module_reload: line 42:  3991 Aborted                 (core dumped) $SOURCE_DIR/gem_exec_nop > /dev/null

I also ran it manually:
echo 0 > /sys/class/vtconsole/vtcon1/bind
modprobe -r i915
modprobe -r drm
modprobe i915
ERROR: could not insert 'i915': Operation not permitted
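
The manual sequence above can be wrapped in a small function that stops at the first failing step, which makes it easier to see which command breaks. This is only a sketch: reload_i915, run, and the DRY_RUN switch are hypothetical names, and the commands are the ones listed in this comment rather than the contents of the module_reload script.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Unbind the console, unload i915 and drm, then load i915 again.
# $1: vtconsole bind file (defaults to the path used in this report)
reload_i915() {
    vtcon="${1:-/sys/class/vtconsole/vtcon1/bind}"
    # Skip the unbind when the file is absent, as it is after a failed reload.
    if [ -e "$vtcon" ]; then
        run sh -c "echo 0 > $vtcon" || return 1
    fi
    run modprobe -r i915 || return 1
    run modprobe -r drm || return 1
    # This is the step that fails with "Operation not permitted" in this report.
    run modprobe i915 || { echo "modprobe i915 failed" >&2; return 1; }
}

# Typical use (as root):
#   reload_i915
```

Running it with DRY_RUN=1 first shows the exact commands without touching the loaded modules.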
Comment 5 lu hua 2013-07-08 05:12:33 UTC
Created attachment 82161
dmesg
Comment 6 Daniel Vetter 2013-07-08 20:33:55 UTC
Please reopen bugs when new data shows up indicating that they are still broken (or broken on other machines).

Can you please retest with latest -nightly? I accidentally merged a broken patch which could have broken module reload. Also please check again for a last known-good configuration, since module reloading really did work once ...
Comment 7 lu hua 2013-07-09 08:11:06 UTC
It still fails on latest -nightly branch.

The latest good configuration:
igt commit: bc388b54d4325669bfffef314c6f18349c239a1c and kernel commit (nightly branch): 4f9e7cfb09aa3e2fc3b3bba635c6d0c558ce1b70 (merge 53d3b4d 22e407d)

It also fails on 53d3b4d and 22e407d.
Comment 8 Daniel Vetter 2013-07-09 08:49:58 UTC
(In reply to comment #7)
> It still fails on latest -nightly branch.
> 
> The latest good configuration:
> igt commit:bc388b54d4325669bfffef314c6f18349c239a1c and kernel
> commit(nightly branch):4f9e7cfb09aa3e2fc3b3bba635c6d0c558ce1b70 (merge
> 53d3b4d 22e407d)
> 
> It also fails on 53d3b4d and 22e407d.

Is this just for PNV or for all platforms? Please clarify.
Comment 9 Daniel Vetter 2013-07-09 08:54:41 UTC
Oops, I've just noticed that I failed to push the fixed-up dinq/-nightly branches yesterday. Can you please retest?

My apologies for the mess I've caused here.
Comment 10 lu hua 2013-07-09 08:59:18 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > It still fails on latest -nightly branch.
> > 
> > The latest good configuration:
> > igt commit:bc388b54d4325669bfffef314c6f18349c239a1c and kernel
> > commit(nightly branch):4f9e7cfb09aa3e2fc3b3bba635c6d0c558ce1b70 (merge
> > 53d3b4d 22e407d)
> > 
> > It also fails on 53d3b4d and 22e407d.
> 
> Is this just for PNV or for all platforms? Please clarify.

All platforms: PNV, ILK, SNB, IVB, HSW.
Comment 11 Mika Kuoppala 2013-07-09 12:06:32 UTC
(In reply to comment #10)
> (In reply to comment #8)
> > (In reply to comment #7)
> > > It still fails on latest -nightly branch.
> > > 
> > > The latest good configuration:
> > > igt commit:bc388b54d4325669bfffef314c6f18349c239a1c and kernel
> > > commit(nightly branch):4f9e7cfb09aa3e2fc3b3bba635c6d0c558ce1b70 (merge
> > > 53d3b4d 22e407d)
> > > 
> > > It also fails on 53d3b4d and 22e407d.
> > 
> > Is this just for PNV or for all platforms? Please clarify.
> 
> All platforms. PNV,ILK,SNB,IVB,HSW.

dinq commit 34b9674c786c73e5472e8b98a729bcdde9197859 works for me on IVB
Comment 12 Daniel Vetter 2013-07-09 12:52:46 UTC
OK, I ran module reload in an endless loop and it did eventually crash, with a neat failure in dmesg (this is on my gm45 laptop):

[11332.369834] ------------[ cut here ]------------
[11332.369900] WARNING: at include/linux/kref.h:47 kref_get+0x2d/0x36()
[11332.369962] Modules linked in: i915(+) drm_kms_helper drm cpufreq_userspace cpufreq_powersave cpufreq_conservative cpufreq_stats parport_pc ppdev lp parport bnep rfcomm bluetooth binfmt_misc uinput loop firewire_sbp2 fuse b43 bcma mac80211 snd_hda_codec_hdmi cfg80211 snd_hda_codec_idt snd_hda_intel snd_hda_codec rfkill rng_core i2c_algo_bit snd_hwdep snd_pcm snd_page_alloc snd_seq ssb pcmcia acpi_cpufreq snd_seq_device snd_timer hid_generic snd dell_wmi sparse_keymap yenta_socket pcmcia_rsrc mperf iTCO_wdt processor usbhid iTCO_vendor_support lpc_ich psmouse video hid i2c_i801 button battery ac pcmcia_core dell_laptop wmi dcdbas pcspkr serio_raw mfd_core evdev i2c_core soundcore ext4 crc16 mbcache jbd2 dm_mod sr_mod sd_mod cdrom crc_t10dif sdhci_pci sdhci firewire_ohci firewire_core mmc_core crc_itu_t ahci libahci libata scsi_mod uhci_hcd ehci_pci ehci_hcd usbcore usb_common thermal thermal_sys [last unloaded: drm]
[11332.374334] CPU: 0 PID: 28004 Comm: modprobe Not tainted 3.10.0-rc7+ #174
[11332.374399] Hardware name: Dell Inc. Latitude E6400                  /0W620R, BIOS A14 05/11/2009
[11332.374475]  0000000000000000 ffff880210685728 ffffffff81397d34 ffff880210685760
[11332.374716]  ffffffff810308d8 0000000000000000 ffff8802106857d0 ffff880211f865a8
[11332.374953]  ffff88021212ca28 0000000000000000 ffff880210685770 ffffffff8103098f
[11332.375191] Call Trace:
[11332.375253]  [<ffffffff81397d34>] dump_stack+0x19/0x1b
[11332.375316]  [<ffffffff810308d8>] warn_slowpath_common+0x60/0x78
[11332.375378]  [<ffffffff8103098f>] warn_slowpath_null+0x15/0x17
[11332.375441]  [<ffffffff8138132a>] kref_get+0x2d/0x36
[11332.375504]  [<ffffffff81381620>] klist_next+0x70/0xa6
[11332.375566]  [<ffffffff811eb3b7>] ? pci_do_find_bus+0x4a/0x4a
[11332.375629]  [<ffffffff8126f01a>] next_device+0x9/0x18
[11332.375690]  [<ffffffff8126f1c4>] bus_find_device+0x52/0x8c
[11332.375751]  [<ffffffff811eb59b>] pci_get_dev_by_id+0x59/0x7f
[11332.375812]  [<ffffffff811eb6a4>] pci_get_class+0x48/0x4a
[11332.375913]  [<ffffffffa069e2d5>] intel_dsm_detect+0x5f/0x150 [i915]
[11332.376006]  [<ffffffffa069e3cf>] intel_register_dsm_handler+0x9/0xb [i915]
[11332.376118]  [<ffffffffa064fbf4>] i915_driver_load+0xa42/0xcc1 [i915]
[11332.376209]  [<ffffffffa033761f>] drm_get_pci_dev+0x161/0x274 [drm]
[11332.376293]  [<ffffffffa064c65e>] i915_pci_probe+0x4e/0x58 [i915]
[11332.376358]  [<ffffffff811eadbf>] local_pci_probe+0x39/0x61
[11332.376422]  [<ffffffff811eaea6>] pci_device_probe+0xbf/0xe5
[11332.376483]  [<ffffffff812709d7>] driver_probe_device+0x98/0x1c4
[11332.376545]  [<ffffffff81270b97>] __driver_attach+0x5c/0x7e
[11332.376606]  [<ffffffff81270b3b>] ? __device_attach+0x38/0x38
[11332.376668]  [<ffffffff8126f0a1>] bus_for_each_dev+0x78/0x82
[11332.376729]  [<ffffffff812704d0>] driver_attach+0x19/0x1b
[11332.376792]  [<ffffffff812701a1>] bus_add_driver+0xf4/0x1fc
[11332.376854]  [<ffffffff81271073>] driver_register+0x87/0xf8
[11332.376915]  [<ffffffff811ea4e0>] __pci_register_driver+0x5b/0x5e
[11332.376980]  [<ffffffffa0481000>] ? 0xffffffffa0480fff
[11332.377050]  [<ffffffffa03377b8>] drm_pci_init+0x86/0xea [drm]
[11332.377113]  [<ffffffffa0481000>] ? 0xffffffffa0480fff
[11332.377190]  [<ffffffffa0481066>] i915_init+0x66/0x68 [i915]
[11332.377253]  [<ffffffff81000263>] do_one_initcall+0x7b/0x10f
[11332.377316]  [<ffffffff81081e4b>] load_module+0x1224/0x1dad
[11332.377377]  [<ffffffff8107ea80>] ? store_uevent+0x35/0x35
[11332.377440]  [<ffffffff810de1ac>] ? might_fault+0x3d/0x8b
[11332.377502]  [<ffffffff81082a74>] SyS_init_module+0xa0/0xaf
[11332.377565]  [<ffffffff813a2d92>] system_call_fastpath+0x16/0x1b
[11332.377626] ---[ end trace f64ad603e02a19b0 ]---

Two questions:
- Is the failure on your side 100% reproducible?
- Any backtraces in dmesg?
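
To answer the second question without reading the whole log by eye, a saved dmesg can be scanned for the "cut here" marker that opens a kernel warning such as the one above. This is a minimal sketch; count_kernel_traces is a hypothetical helper name, and it assumes each backtrace of interest starts with that marker.

```shell
# Count kernel warning/bug reports in a saved dmesg log.
# $1: path to a log file, e.g. one produced by `dmesg > dmesg.txt`
count_kernel_traces() {
    # grep -c prints 0 and exits non-zero when nothing matches, so ignore its status.
    grep -c -F '[ cut here ]' "$1" || true
}

# Typical use after a failing run:
#   dmesg > dmesg.txt && count_kernel_traces dmesg.txt
```

A count of 0 means no warning was logged, which is itself useful information for the report.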
Comment 13 lu hua 2013-07-10 06:09:23 UTC
(In reply to comment #12)
> Ok, I've run module reload in an endless-loop and the thing indeed
> eventually crashed, with some neat fail in dmesg (this  is on my gm45
> laptop):
> 

> 
> Two questions:
> - Is the failure on your side 100% reproducible?
> - Any backtraces in dmesg?


1. The failure is 100% reproducible.
2. No backtraces in dmesg.
Comment 14 Daniel Vetter 2013-07-10 06:42:32 UTC
(In reply to comment #13)
> 1. The failure is 100% reproducible.
> 2. No backtraces in dmesg.

You've mentioned that module_reload fails on pnv, ilk, snb, ivb, hsw. Is it 100% failure on all these platforms with no output on dmesg?

Also please double-check your local installation - after all module reloading _has_ worked before, so something broke it. If it's not a regression in the kernel, then it must be a regression somewhere in your kernel build/test setup.

Also, the above list covers almost all the platforms we still test. Are there any platforms where module reload still works?

Gordon, can you please ramp up the priority for investigating this? Afaik module reloading works, so I'm left with the conclusion that something (recently) broke with your -nightly setup. Flying blind like that makes me feel uneasy ... Maybe also drag Yi into the analysis.
Comment 15 lu hua 2013-07-11 08:05:02 UTC
Tested on the latest -nightly branch (09fa6edd6bf1ee1aa2092a8f50d407338be888e8):
IVB: 0/5 FAIL
SNB: 2/5 FAIL
ILK: 5/5 FAIL
PNV: 5/5 FAIL
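
Per-platform rates like these can be gathered with a short loop that runs the test a fixed number of times and counts the failing runs. The sketch below assumes the notation above means failed runs out of total runs; count_failures is a hypothetical helper, not part of igt.

```shell
# Run a command N times and report how many runs failed.
# $1: number of runs; remaining arguments: the command to run
count_failures() {
    runs="$1"; shift
    failed=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # A non-zero exit status counts as one failed run.
        "$@" > /dev/null 2>&1 || failed=$((failed + 1))
        i=$((i + 1))
    done
    echo "$failed/$runs FAIL"
}

# Typical use on each platform:
#   count_failures 5 ./module_reload
```

Since a failed reload can leave the driver unloaded, a reboot between runs may be needed for the runs to be independent.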
Comment 16 Daniel Vetter 2013-07-11 09:31:30 UTC
When module_reload fails on these platforms, is the output always like

module successfully unloaded
FATAL: Could not load /lib/modules/3.10.0-rc7_drm-intel-fixes_446f8d_20130704_+/modules.dep: No such file or directory
./module_reload: line 39: /sys/class/vtconsole/vtcon1/bind: No such file or directory
gem_create:487 failed, ret=-1, errno=9
./module_reload: line 42:  3759 Aborted                 (core dumped) 

i.e. what you've pasted in comment #0? Or is there different output in some case/on some platforms?
Comment 17 lu hua 2013-07-12 06:35:42 UTC
Tested on the latest -nightly kernel.
Output on PNV and ILK:
module successfully unloaded
FATAL: Could not load /lib/modules/3.10.0-rc7_nightlytop_a3ee4d_20130712_+/modules.dep: No such file or directory
./module_reload: line 39: /sys/class/vtconsole/vtcon1/bind: No such file or directory
gem_create:487 failed, ret=-1, errno=9
./module_reload: line 42:  3749 Aborted                 (core dumped) $SOURCE_DIR/gem_exec_nop > /dev/null

Output on SNB:
module successfully unloaded
./module_reload: line 39: /sys/class/vtconsole/vtcon1/bind: No such file or directory
gem_create:487 failed, ret=-1, errno=9
./module_reload: line 42:  5668 Aborted                 (core dumped) $SOURCE_DIR/gem_exec_nop > /dev/null
Comment 18 Daniel Vetter 2013-07-12 12:44:04 UTC
(In reply to comment #17)
> Test on latest -nightly kernel:
> On PNV and ILK output:
> module successfully unloaded
> FATAL: Could not load
> /lib/modules/3.10.0-rc7_nightlytop_a3ee4d_20130712_+/modules.dep: No such
> file or directory
> ./module_reload: line 39: /sys/class/vtconsole/vtcon1/bind: No such file or
> directory
> gem_create:487 failed, ret=-1, errno=9
> ./module_reload: line 42:  3749 Aborted                 (core dumped)
> $SOURCE_DIR/gem_exec_nop > /dev/null

This really looks like an issue in userspace: "modules.dep: No such file or directory" means that modprobe can't properly load the module. Can you please check what might be going wrong on your system? Also, have you recently upgraded anything or freshly installed these systems, which might explain the breakage?

> On SNB output:
> module successfully unloaded
> ./module_reload: line 39: /sys/class/vtconsole/vtcon1/bind: No such file or
> directory
> gem_create:487 failed, ret=-1, errno=9
> ./module_reload: line 42:  5668 Aborted                 (core dumped)
> $SOURCE_DIR/gem_exec_nop > /dev/null

This looks more like a kernel issue, at least it's possible that it's a kernel issue. Please file a new bug for this snb machine and attach debug dmesg for the module reload.
Comment 19 lu hua 2013-07-15 08:55:17 UTC
(In reply to comment #4)
> It also happens on ILK,SNB,IVB. Run with clean boot.
> output:
>  ./module_reload
> module successfully unloaded
> ERROR: could not insert 'i915': Operation not permitted
> ./module_reload: line 39: /sys/class/vtconsole/vtcon1/bind: No such file or
> directory
> gem_create:487 failed, ret=-1, errno=9
> ./module_reload: line 42:  3991 Aborted                 (core dumped)
> $SOURCE_DIR/gem_exec_nop > /dev/null
> 
> Also I run it manually:
> echo 0 > /sys/class/vtconsole/vtcon1/bind
> modprobe -r i915
> modprobe -r drm
> modprobe i915
> ERROR: could not insert 'i915': Operation not permitted


1. We found that the following output is not an issue:
FATAL: Could not load /lib/modules/3.10.0-rc7_drm-intel-fixes_446f8d_20130704_+/modules.dep: No such file or directory
./module_reload: line 39: /sys/class/vtconsole/vtcon1/bind: No such file or directory

2. "ERROR: could not insert 'i915': Operation not permitted" is an issue, and it has now been fixed.

So I am closing this bug.
Comment 20 lu hua 2013-07-15 08:58:11 UTC
Verified. Fixed.
Comment 21 Daniel Vetter 2013-07-15 10:07:47 UTC
That still leaves the issue on SNB, which looks like a different bug. Is module_reload working properly on SNB again? If it's still broken as shown in comment #17, then please file a new bug report with the requested information.
Comment 22 Elizabeth 2017-10-06 14:45:16 UTC
Closing old verified bug.
