System Environment:
--------------------------
Kernel: (drm-intel-nightly)24b0ac8d5fae57b2e99ac2babfc6e63f0a1d851d
Some additional commit info:
Merge: 8417942 c12aba5

Bug detailed description:
-------------------------
System hang while booting up. It happens on -queued kernel. It works well on -fixes kernel.

Bisect shows: 90eb871b012a30c2476444b7c5c0b37e7b44c05a is the first bad commit

commit 90eb871b012a30c2476444b7c5c0b37e7b44c05a
Author:     Imre Deak <imre.deak@intel.com>
AuthorDate: Mon Feb 18 19:28:04 2013 +0200
Commit:     Daniel Vetter <daniel.vetter@ffwll.ch>
CommitDate: Tue Mar 19 09:51:15 2013 +0100

    drm/i915: use for_each_sg_page for setting up the gtt ptes

    The existing gtt setup code is correct - and so doesn't need to be fixed to
    handle compact dma scatter lists similarly to the previous patches. Still,
    take the for_each_sg_page macro into use, to get somewhat simpler code.

    Signed-off-by: Imre Deak <imre.deak@intel.com>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
I can reproduce this, but not every time. I can also reproduce the same issue with earlier commits; I went back as far as:

0d4a42f6bd - Merge tag 'v3.9-rc3' into drm-intel-next-queued

I'll try to narrow it down further, or get a log somehow.
The failures I saw were somehow related to i915 being built as a module. With it built in, I can't reproduce the problem on -queued or -nightly. One thing I thought of is the drm race discussed at
http://lists.freedesktop.org/archives/intel-gfx/2013-March/025685.html

Could the reporter please attach their kconfig? Also, could they test whether it's reproducible with i915 built statically? And if possible, could they test whether one of the solutions posted above helps (needs patching xorg or the kernel)?

Thanks,
Imre
Also checking if the issue is 100% reproducible or not would help.
Created attachment 76890
kconfig

It fails 10 out of 10 runs. It also fails with i915 built in.
- Does the hang still happen with rc6 disabled?
- Can you reproduce occasional failures even before that bad commit, like Imre can?
I got it on my VLV, reverting the move to for_each_sg_page seems to work around the issue. Note I'm not starting X, but plymouth may be trying to start.

[ 72.772513] fb: conflicting fb hw usage inteldrmfb vs EFI VGA - removing generic driver
[ 72.772972] i915 0000:00:02.0: setting latency timer to 64
[ 72.775278] i915 0000:00:02.0: irq 104 for MSI/MSI-X
[ 72.775328] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
[ 72.775354] [drm] Driver supports precise vblank timestamp query.
[ 72.775612] vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
[ 72.880203] [drm] GMBUS [i915 gmbus vga] timed out, falling back to bit banging on pin 2
[ 72.972301] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 72.972343] IP: [<c22d9e1d>] __sg_page_iter_next+0x8d/0xd0
[ 72.972399] *pde = 00000000
[ 72.972430] Oops: 0000 [#1] SMP
[ 72.972456] Modules linked in: i915(O+) drm_kms_helper drm coretemp kvm_intel kvm aesni_intel ablk_helper cryptd lrw aes_i586 xts gf128mul hid_generic snd_hda_intel microcode snd_hda_codec snd_hwdep snd_pcm asix usbhid hid snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device mac_hid snd soundcore snd_page_alloc bnep rfcomm bluetooth parport_pc ppdev i2c_algo_bit video binfmt_misc lp parport nls_iso8859_1 [last unloaded: drm]
[ 72.972753] Pid: 1895, comm: insmod Tainted: G W IO 3.9.0-rc3-merge+ #14
[ 72.972794] EIP: 0060:[<c22d9e1d>] EFLAGS: 00010246 CPU: 0
[ 72.972825] EIP is at __sg_page_iter_next+0x8d/0xd0
[ 72.972852] EAX: f4de19d0 EBX: 00000000 ECX: f6fab940 EDX: 00000000
[ 72.972879] ESI: fa2001cc EDI: f5311000 EBP: f4de19b8 ESP: f4de19b4
[ 72.972906] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 72.972932] CR0: 8005003b CR2: 00000000 CR3: 3536c000 CR4: 001007d0
[ 72.972963] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 72.972991] DR6: ffff0ff0 DR7: 00000400
[ 72.973020] Process insmod (pid: 1895, ti=f4de0000 task=f4e18cc0 task.ti=f4de0000)
[ 72.973050] Stack:
[ 72.973069] f732e200 f4de19f0 fa16315b 00000000 fa2001cc f52a4000 00000000 00000000
[ 72.973138] f6fab940 00000000 00000001 00000001 f732e200 f52a4000 00000001 f4de1a08
[ 72.973202] fa163b07 00000000 f732e200 f52a4000 00000001 f4de1a68 fa15da8a 00001000
[ 72.973266] Call Trace:
[ 72.973406] [<fa16315b>] gen6_ggtt_insert_entries+0x4b/0x170 [i915]
[ 72.973547] [<fa163b07>] i915_gem_gtt_bind_object+0x37/0x50 [i915]
[ 72.973678] [<fa15da8a>] i915_gem_object_pin+0x41a/0x5b0 [i915]
[ 72.973716] [<c22d9c1d>] ? sg_init_table+0x1d/0x40
[ 72.973840] [<fa15dc85>] i915_gem_object_pin_to_display_plane+0x65/0xd0 [i915]
[ 72.973978] [<fa16494e>] ? i915_gem_object_create_stolen+0x15e/0x200 [i915]
[ 72.974123] [<fa171c12>] intel_pin_and_fence_fb_obj+0x52/0x120 [i915]
[ 72.974277] [<fa19531e>] intelfb_create+0xae/0x380 [i915]
[ 72.974365] [<fa0afc20>] ? drm_mode_create+0x40/0x70 [drm]
[ 72.974410] [<f9d8387f>] ? drm_setup_crtcs+0x40f/0x6c0 [drm_kms_helper]
[ 72.974457] [<f9d83dbe>] drm_fb_helper_initial_config+0x28e/0x470 [drm_kms_helper]
[ 72.974504] [<c213a29d>] ? __kmalloc+0x13d/0x170
[ 72.974639] [<fa17b4d1>] ? intel_modeset_setup_hw_state+0x581/0x8b0 [i915]
[ 72.974677] [<c2139f33>] ? kmem_cache_alloc_trace+0x103/0x110
[ 72.974722] [<f9d831e7>] ? drm_fb_helper_single_add_all_connectors+0x37/0xc0 [drm_kms_helper]
[ 72.974876] [<fa1956be>] intel_fbdev_initial_config+0x1e/0x20 [i915]
[ 72.975000] [<fa14e563>] i915_driver_load+0x7b3/0xca0 [i915]
[ 72.975114] [<fa14c040>] ? i915_switcheroo_set_state+0xa0/0xa0 [i915]
[ 72.975208] [<fa0ab52b>] drm_get_pci_dev+0x13b/0x270 [drm]
[ 72.975252] [<c202dbb8>] ? default_spin_lock_flags+0x8/0x10
[ 72.975368] [<fa14a3aa>] i915_pci_probe+0x3a/0x90 [i915]
[ 72.975409] [<c22f9bc9>] pci_device_probe+0x79/0xb0
[ 72.975451] [<c23ad9ac>] driver_probe_device+0x5c/0x1e0
[ 72.975485] [<c22f9b13>] ? pci_match_device+0xb3/0xc0
[ 72.975519] [<c23adbc1>] __driver_attach+0x91/0xa0
[ 72.975552] [<c23adb30>] ? driver_probe_device+0x1e0/0x1e0
[ 72.975586] [<c23ac1a2>] bus_for_each_dev+0x42/0x80
[ 72.975619] [<c23ad56e>] driver_attach+0x1e/0x20
[ 72.975652] [<c23adb30>] ? driver_probe_device+0x1e0/0x1e0
[ 72.975685] [<c23ad17c>] bus_add_driver+0xdc/0x240
[ 72.975720] [<c22f99b0>] ? pci_device_shutdown+0x50/0x50
[ 72.975757] [<c23ae1aa>] driver_register+0x6a/0x160
[ 72.975795] [<f9db2000>] ? 0xf9db1fff
[ 72.975830] [<c22f8d83>] __pci_register_driver+0x33/0x40
[ 72.975909] [<fa0ab75d>] drm_pci_init+0xfd/0x110 [drm]
[ 72.975944] [<f9db2000>] ? 0xf9db1fff
[ 72.976053] [<f9db205e>] i915_init+0x5e/0x60 [i915]
[ 72.976090] [<c2001222>] do_one_initcall+0x112/0x160
[ 72.976129] [<c2097802>] ? set_section_ro_nx+0x62/0x80
[ 72.976164] [<c209a8a0>] load_module+0x1b60/0x2430
[ 72.976203] [<c2041930>] ? __do_softirq+0x110/0x1d0
[ 72.976252] [<c209b1e8>] sys_init_module+0x78/0xb0
[ 72.976296] [<c25e278d>] sysenter_do_call+0x12/0x28
[ 72.976323] Code: 14 8b 52 14 f6 c2 01 75 55 83 68 0c 01 89 48 04 74 e4 85 c9 75 b0 eb de 8d b6 00 00 00 00 31 c0 c3 90 8d 74 26 00 8b 11 83 e2 fc <8b> 0a c1 e9 1a 8b 0c cd a0 fb a3 c2 83 e1 fc 29 ca c1 fa 05 01
[ 72.976607] EIP: [<c22d9e1d>] __sg_page_iter_next+0x8d/0xd0 SS:ESP 0068:f4de19b4
[ 72.976658] CR2: 0000000000000000
[ 72.976731] ---[ end trace bc34bd02746d5a3c ]---
Created attachment 76922
fix for invalid page in sg

The attached patch works around the problem Jesse reported. Though it's CONFIG_SPARSEMEM related and the kconfig attached to this bug doesn't have that, could you guys still give it a go? Otherwise I'm going to set up an SNB box with your kconfig and continue from there.
Now in dinq,

commit 1fbd6797df28cbacf7fb249fdc867e4d52079ec3
Author: Imre Deak <imre.deak@intel.com>
Date:   Fri Mar 22 23:10:44 2013 +0200

    drm/i915: set dummy page for stolen objects
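Note for readers: one plausible reading of the oops in __sg_page_iter_next, the attachment title ("fix for invalid page in sg") and the commit title above is that the scatterlist built for a stolen object carried no backing page, so a page-based iterator ended up dereferencing NULL. The sketch below is a userspace model of that reading only. It is not the i915 or lib/scatterlist code, and every type and function name in it (model_page, model_sg, model_count_pages, model_set_dummy_page) is hypothetical.

```c
/* Userspace model only. These types imitate, but are NOT, the kernel's
 * struct page and struct scatterlist. All names are hypothetical. */
#include <assert.h>
#include <stddef.h>

struct model_page {
    unsigned long flags;        /* stands in for per-page metadata */
};

struct model_sg {
    struct model_page *page;    /* NULL models an entry with no backing page */
    unsigned int npages;        /* pages covered by this entry */
};

/* Shared placeholder page, used only so page-based iteration has
 * something valid to read. */
static struct model_page dummy_page;

/* Walk the list the way a page iterator would: read each entry's page
 * metadata, then count its pages. Returns -1 at the point where a real
 * iterator would dereference a NULL page and crash. */
static long model_count_pages(const struct model_sg *sgl, size_t nents)
{
    long total = 0;

    for (size_t i = 0; i < nents; i++) {
        if (sgl[i].page == NULL)
            return -1;
        (void)sgl[i].page->flags;   /* safe: page is non-NULL here */
        total += sgl[i].npages;
    }
    return total;
}

/* Model of the workaround idea: point page-less entries at the dummy
 * page so the walk above no longer hits NULL. */
static void model_set_dummy_page(struct model_sg *sgl, size_t nents)
{
    assert(sgl != NULL || nents == 0);

    for (size_t i = 0; i < nents; i++)
        if (sgl[i].page == NULL)
            sgl[i].page = &dummy_page;
}
```

In this model, walking a list whose entries have page == NULL reports -1, and the same walk succeeds after model_set_dummy_page() has run. Whether the real patch works exactly this way should be checked against the commit itself.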
Fixed on -queued branch commit 4f3308b9754cb0a4467ccaca4f3ccee42d803620.
Verified. Fixed.
Closing verified+fixed.