Created attachment 80688: messages

The system sometimes fails to boot into the OS because i915 hangs during a warm or cold boot on the Haswell chipset. The failure rate is high, especially on Ultrabooks.
Kernel: 3.9.2
Please suggest how to debug this issue. The messages log is below; if you need any other information, please let me know. Thanks a lot!

[ 3.623726] i915 0000:00:02.0: setting latency timer to 64
[ 3.649096] [drm:i915_write32] *ERROR* Unknown unclaimed register before writing to c5100
[ 3.649246] i915 0000:00:02.0: irq 63 for MSI/MSI-X
[ 3.649252] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
[ 3.649253] [drm] Driver supports precise vblank timestamp query.
[ 3.649305] vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=io+mem,decodes=io+mem:owns=io+mem
[ 3.694679] fbcon: inteldrmfb (fb0) is primary device
[ 3.694769] Console: switching to colour frame buffer device 200x56
[ 3.694778] i915 0000:00:02.0: fb0: inteldrmfb frame buffer device
[ 3.694780] i915 0000:00:02.0: registered panic notifier
[ 3.708855] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
[ 3.708972] IP: [<ffffffffa06e1371>] i915_driver_load+0xe51/0xe90 [i915]
[ 3.709086] PGD 14c18e067 PUD 14f482067 PMD 0
[ 3.709176] Oops: 0000 [#1] SMP
[ 3.709231] Modules linked in: i915(+) bnep bluetooth iTCO_wdt iTCO_vendor_support coretemp crc32c_intel joydev ghash_clmulni_intel microcode pcspkr wl(PO) r8169 mii cfg80211 lib80211 lpc_ich mfd_core i2c_i801 snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_intel wmi snd_hda_codec battery ideapad_laptop rfkill snd_hwdep sparse_keymap video i2c_algo_bit drm_kms_helper snd_pcm snd_page_alloc snd_timer drm snd soundcore i2c_core mac_hid
[ 3.709910] CPU 1
[ 3.709953] Pid: 340, comm: modprobe Tainted: P IO 3.9.2-8.1.1.lp19.x86_64 #1 LENOVO SharkBay/INVALID
[ 3.710109] RIP: 0010:[<ffffffffa06e1371>] [<ffffffffa06e1371>] i915_driver_load+0xe51/0xe90 [i915]
[ 3.710230] RSP: 0018:ffff88014c9bd918 EFLAGS: 00010246
[ 3.710293] RAX: ffff88014c914c80 RBX: ffff88014c914800 RCX: ffff88014c914c80
[ 3.710376] RDX: ffff88014c914c78 RSI: ffff88014c0d6438 RDI: ffff88015a802700
[ 3.710457] RBP: ffff88014c9bdaa8 R08: 0000000000016fa0 R09: ffff88015f256fa0
[ 3.710538] R10: ffffea0005300a80 R11: ffffffff8141a370 R12: 0000000000000000
[ 3.710617] R13: 0000000010000000 R14: 0000000000000000 R15: ffff8801421dc000
[ 3.710713] FS: 00007f4a13e0f740(0000) GS:ffff88015f240000(0000) knlGS:0000000000000000
[ 3.710807] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3.710880] CR2: 0000000000000048 CR3: 000000014c02b000 CR4: 00000000001407e0
[ 3.710968] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3.711051] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 3.711141] Process modprobe (pid: 340, threadinfo ffff88014c9bc000, task ffff88014c88aec0)
[ 3.711245] Stack:
[ 3.711273] ffffffffa074f5f5 ffffffffa074f4cb ffffffffa074f4cb ffffffffa074f4cb
[ 3.711372] ffffffffa074f4cb ffffffffa074f5c1 ffffffffa074f4cb ffffffffa074f4cb
[ 3.711479] ffffffffa074f4cb ffffffffa074f4cb ffffffffa074f4cb ffffffffa074f4cb
[ 3.711582] Call Trace:
[ 3.711627] [<ffffffffa0047a96>] drm_get_pci_dev+0x186/0x2d0 [drm]
[ 3.711723] [<ffffffffa06dc33c>] i915_pci_probe+0x3c/0x90 [i915]
[ 3.711799] [<ffffffff813c6c3b>] local_pci_probe+0x4b/0x80
[ 3.711867] [<ffffffff813c6f51>] pci_device_probe+0x111/0x120
[ 3.711939] [<ffffffff8148700b>] driver_probe_device+0x8b/0x390
[ 3.712012] [<ffffffff814873bb>] __driver_attach+0xab/0xb0
[ 3.712082] [<ffffffff81487310>] ? driver_probe_device+0x390/0x390
[ 3.712162] [<ffffffff8148505d>] bus_for_each_dev+0x5d/0xa0
[ 3.712231] [<ffffffff8148696e>] driver_attach+0x1e/0x20
[ 3.712297] [<ffffffff8148650e>] bus_add_driver+0x11e/0x2a0
[ 3.712366] [<ffffffffa080c000>] ? 0xffffffffa080bfff
[ 3.712428] [<ffffffffa080c000>] ? 0xffffffffa080bfff
[ 3.712491] [<ffffffff81487a87>] driver_register+0x77/0x170
[ 3.712557] [<ffffffffa080c000>] ? 0xffffffffa080bfff
[ 3.712618] [<ffffffff813c5edc>] __pci_register_driver+0x4c/0x50
[ 3.712720] [<ffffffffa0047cfa>] drm_pci_init+0x11a/0x130 [drm]
[ 3.712803] [<ffffffffa080c000>] ? 0xffffffffa080bfff
[ 3.712887] [<ffffffffa080c066>] i915_init+0x66/0x68 [i915]
[ 3.712968] [<ffffffff8100215a>] do_one_initcall+0x12a/0x180
[ 3.713045] [<ffffffff810c433e>] load_module+0x1c1e/0x27b0
[ 3.713124] [<ffffffff813bac70>] ? ddebug_proc_open+0xc0/0xc0
[ 3.713205] [<ffffffff810c4fa7>] sys_init_module+0xd7/0x120
[ 3.713282] [<ffffffff816ced19>] system_call_fastpath+0x16/0x1b
[ 3.713361] Code: 80 1a 00 00 00 00 00 00 e9 01 f7 ff ff 48 c7 c6 00 f6 74 a0 48 c7 c7 50 c5 73 a0 41 bc fb ff ff ff e8 b4 3d 96 ff e9 f4 f8 ff ff <48> 8b 3c 25 48 00 00 00 48 85 ff 0f 84 ce fb ff ff e9 bd fb ff
[ 3.713781] RIP [<ffffffffa06e1371>] i915_driver_load+0xe51/0xe90 [i915]
[ 3.713877] RSP <ffff88014c9bd918>
[ 3.713920] CR2: 0000000000000048
[ 4.739363] [drm] Enabling RC6 states: RC6 on, RC6p off, RC6pp off
[ 5.215475] ---[ end trace 363ca17e08482316 ]---
[ 7.719131] fuse init (API version 7.21)
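The Code: line of the oops marks the first byte of the faulting instruction with angle brackets, so those bytes can be decoded by hand to see what the CPU was attempting. The following is a minimal Python sketch, not a general disassembler: it recognizes only the single encoding seen in this log (REX.W 8B with an absolute disp32 operand), it ignores sign extension of the displacement, and the names faulting_bytes and decode_abs_load are made up for illustration.

```python
# Registers selected by the 3-bit ModRM.reg field when no REX.R bit is set.
REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def faulting_bytes(code_line: str) -> bytes:
    """Return the bytes from the <xx> marker to the end of an oops Code: line."""
    tokens = code_line.split()
    for i, tok in enumerate(tokens):
        if tok.startswith("<") and tok.endswith(">"):
            rest = [tok[1:-1]] + tokens[i + 1:]
            return bytes(int(t, 16) for t in rest)
    raise ValueError("no <xx> marker found in Code: line")


def decode_abs_load(insn: bytes):
    """Decode `mov reg64, [disp32]` (48 8B /r, ModRM rm=100, SIB with no base/index).

    Returns (register_name, address), or None for any other encoding.
    """
    if len(insn) < 8 or insn[0] != 0x48 or insn[1] != 0x8B:
        return None
    modrm, sib = insn[2], insn[3]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0 or rm != 4:          # need a SIB byte with no displacement mode
        return None
    index, base = (sib >> 3) & 7, sib & 7
    if index != 4 or base != 5:      # index=none, base=none -> absolute disp32
        return None
    addr = int.from_bytes(insn[4:8], "little")
    return REGS[reg], addr


# Tail of the Code: line from the oops above.
code = "Code: e9 f4 f8 ff ff <48> 8b 3c 25 48 00 00 00 48 85 ff 0f 84 ce fb ff ff e9 bd fb ff"
reg, addr = decode_abs_load(faulting_bytes(code))
print(reg, hex(addr))  # prints: rdi 0x48
```

The decoded address 0x48 equals the CR2 value in the oops, which is consistent with a load from offset 0x48 through a NULL base pointer. The bytes that follow (48 85 ff, then 0f 84) appear to be a test of the loaded value and a conditional jump, but the sketch does not decode them.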
gdb drivers/gpu/drm/i915.ko
list *i915_driver_load+0xe51
(gdb) list *i915_driver_load+0xe51
0x5ac1 is in i915_driver_load (include/linux/pci.h:818).
813		return pci_bus_write_config_word(dev->bus, dev->devfn, where, val);
814	}
815	static inline int pci_write_config_dword(const struct pci_dev *dev, int where,
816						 u32 val)
817	{
818		return pci_bus_write_config_dword(dev->bus, dev->devfn, where, val);
819	}
820
821	int pcie_capability_read_word(struct pci_dev *dev, int pos, u16 *val);
822	int pcie_capability_read_dword(struct pci_dev *dev, int pos, u32 *val);
(gdb)
(gdb) list *i915_driver_load+0xe90
0x5b00 is in i915_driver_load (drivers/gpu/drm/i915/i915_dma.c:1163).
1158					     PCIBIOS_MIN_MEM,
1159					     0, pcibios_align_resource,
1160					     dev_priv->bridge_dev);
1161		if (ret) {
1162			DRM_DEBUG_DRIVER("failed bus alloc: %d\n", ret);
1163			dev_priv->mch_res.start = 0;
1164			return ret;
1165		}
1166
1167		if (INTEL_INFO(dev)->gen >= 4)
(gdb)
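For reading the two listings above, it helps to remember that the kernel prints a code location as symbol+offset/size, so in i915_driver_load+0xe51/0xe90 the first number is the offset of the fault and the second is the size of the function. A small Python sketch of that arithmetic follows; the helper name parse_symbol_ref is hypothetical, and the only inputs are the IP line from the oops and the 0x5ac1 address that gdb reported for +0xe51.

```python
import re

# Matches "symbol+0xOFFSET/0xSIZE" as printed in oops IP/RIP and call-trace lines.
SYM_RE = re.compile(
    r"(?P<sym>[A-Za-z_][\w.]*)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_symbol_ref(text: str):
    """Return (symbol, offset, size) from the first symbol+off/size in text."""
    m = SYM_RE.search(text)
    if m is None:
        raise ValueError("no symbol+offset/size reference found")
    return m["sym"], int(m["off"], 16), int(m["size"], 16)


ip_line = "IP: [<ffffffffa06e1371>] i915_driver_load+0xe51/0xe90 [i915]"
sym, off, size = parse_symbol_ref(ip_line)

fault_in_module = 0x5AC1             # address gdb printed for +0xe51
func_start = fault_in_module - off   # where the function starts inside i915.ko
func_end = func_start + size         # first byte past the function

print(sym, hex(func_start), hex(func_end))
# prints: i915_driver_load 0x4c70 0x5b00
```

Under that reading, 0x5b00 is the same address gdb printed for +0xe90, so the second listing points at the first byte past the end of the function rather than at the faulting instruction. The +0xe51 listing is the one that corresponds to the oops.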
Can you please retest with intel_iommu=igfx_off added to the kernel cmdline?
After adding intel_iommu=igfx_off to the kernel cmdline, it still fails. The error is the same as before.

Jul 27 15:28:26 localhost kernel: [ 3.785949] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jul 27 15:28:26 localhost kernel: [ 3.786038] Process udevd (pid: 129, threadinfo ffff88006e606000, task ffff88006dc14620)
Jul 27 15:28:26 localhost kernel: [ 3.786138] Stack:
Jul 27 15:28:26 localhost kernel: [ 3.786165] ffffffffa08dc8f5 ffffffffa08dc7cb ffffffffa08dc7cb ffffffffa08dc7cb
Jul 27 15:28:26 localhost kernel: [ 3.786266] ffffffffa08dc7cb ffffffffa08dc8c1 ffffffffa08dc7cb ffffffffa08dc7cb
Jul 27 15:28:26 localhost kernel: [ 3.786369] ffffffffa08dc7cb ffffffffa08dc7cb ffffffffa08dc7cb ffffffffa08dc7cb
Jul 27 15:28:26 localhost kernel: [ 3.786470] Call Trace:
Jul 27 15:28:26 localhost kernel: [ 3.786520] [<ffffffffa0124a96>] drm_get_pci_dev+0x186/0x2d0 [drm]
Jul 27 15:28:26 localhost kernel: [ 3.786616] [<ffffffffa086933c>] i915_pci_probe+0x3c/0x90 [i915]
Jul 27 15:28:26 localhost kernel: [ 3.786698] [<ffffffff813c6c3b>] local_pci_probe+0x4b/0x80
Jul 27 15:28:26 localhost kernel: [ 3.786772] [<ffffffff813c6f51>] pci_device_probe+0x111/0x120
Jul 27 15:28:26 localhost kernel: [ 3.786849] [<ffffffff8148700b>] driver_probe_device+0x8b/0x390
Jul 27 15:28:26 localhost kernel: [ 3.786925] [<ffffffff814873bb>] __driver_attach+0xab/0xb0
Jul 27 15:28:26 localhost kernel: [ 3.786997] [<ffffffff81487310>] ? driver_probe_device+0x390/0x390
Jul 27 15:28:26 localhost kernel: [ 3.787076] [<ffffffff8148505d>] bus_for_each_dev+0x5d/0xa0
Jul 27 15:28:26 localhost kernel: [ 3.787149] [<ffffffff8148696e>] driver_attach+0x1e/0x20
Jul 27 15:28:26 localhost kernel: [ 3.787218] [<ffffffff8148650e>] bus_add_driver+0x11e/0x2a0
Jul 27 15:28:26 localhost kernel: [ 3.787293] [<ffffffffa044a000>] ? 0xffffffffa0449fff
Jul 27 15:28:26 localhost kernel: [ 3.787360] [<ffffffffa044a000>] ? 0xffffffffa0449fff
Jul 27 15:28:26 localhost kernel: [ 3.787426] [<ffffffff81487a87>] driver_register+0x77/0x170
Jul 27 15:28:26 localhost kernel: [ 3.787499] [<ffffffffa044a000>] ? 0xffffffffa0449fff
Jul 27 15:28:26 localhost kernel: [ 3.787567] [<ffffffff813c5edc>] __pci_register_driver+0x4c/0x50
Jul 27 15:28:26 localhost kernel: [ 3.787654] [<ffffffffa0124cfa>] drm_pci_init+0x11a/0x130 [drm]
Jul 27 15:28:26 localhost kernel: [ 3.787733] [<ffffffffa044a000>] ? 0xffffffffa0449fff
Jul 27 15:28:26 localhost kernel: [ 3.787812] [<ffffffffa044a066>] i915_init+0x66/0x68 [i915]
Jul 27 15:28:26 localhost kernel: [ 3.787886] [<ffffffff8100215a>] do_one_initcall+0x12a/0x180
Jul 27 15:28:26 localhost kernel: [ 3.787962] [<ffffffff810c433e>] load_module+0x1c1e/0x27b0
Jul 27 15:28:26 localhost kernel: [ 3.788034] [<ffffffff813bac70>] ? ddebug_proc_open+0xc0/0xc0
Jul 27 15:28:26 localhost kernel: [ 3.788110] [<ffffffff810c4fa7>] sys_init_module+0xd7/0x120
Jul 27 15:28:26 localhost kernel: [ 3.788183] [<ffffffff816cf659>] system_call_fastpath+0x16/0x1b
Jul 27 15:28:26 localhost kernel: [ 3.788258] Code: 80 1a 00 00 00 00 00 00 e9 01 f7 ff ff 48 c7 c6 00 c9 8d a0 48 c7 c7 50 98 8c a0 41 bc fb ff ff ff e8 b4 3d 8b ff e9 f4 f8 ff ff <48> 8b 3c 25 48 00 00 00 48 85 ff 0f 84 ce fb ff ff e9 bd fb ff
Jul 27 15:28:26 localhost kernel: [ 3.788656] RIP [<ffffffffa086e371>] i915_driver_load+0xe51/0xe90 [i915]
Jul 27 15:28:26 localhost kernel: [ 3.788757] RSP <ffff88006e607918>
Jul 27 15:28:26 localhost kernel: [ 3.788802] CR2: 0000000000000048
Jul 27 15:28:27 localhost kdumpctl[275]: E: Dracut module "rpmversion" cannot be found.
Jul 27 15:28:27 localhost kdumpctl[275]: E: Dracut module "rpmversion" cannot be found.
Jul 27 15:28:27 localhost kdumpctl[275]: i18n
Jul 27 15:28:27 localhost kdumpctl[275]: convertfs
Jul 27 15:28:28 localhost kdumpctl[275]: kernel-modules
Jul 27 15:28:28 localhost kernel: [ 5.389082] ---[ end trace a23fd74953912f8f ]---
Jul 27 15:28:28 localhost kernel: [ 5.888161] [drm] Enabling RC6 states: RC6 off, RC6p off, RC6pp off
Jul 27 15:28:28 localhost udevd[106]: worker [129] terminated by signal 9 (Killed)
Does this problem also happen with newer Kernels? Could you please test 3.10 or 3.11-rc3?

The first error message while booting is this one:

[ 0.000000] ------------[ cut here ]------------
[ 0.000000] WARNING: at drivers/iommu/dmar.c:483 warn_invalid_dmar+0x86/0xa0()
[ 0.000000] Hardware name: SharkBay
[ 0.000000] Your BIOS is broken; DMAR reported at address 0!
[ 0.000000] BIOS vendor: LENOVO; Ver: 7CCN12WW; Product Version: INVALID
[ 0.000000] Modules linked in:
[ 0.000000] Pid: 0, comm: swapper Not tainted 3.9.2-8.1.1.lp19.x86_64 #1
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff8105ed6f>] warn_slowpath_common+0x7f/0xc0
[ 0.000000] [<ffffffff8105ee0f>] warn_slowpath_fmt_taint+0x3f/0x50
[ 0.000000] [<ffffffff81d19f51>] ? early_ioremap+0x13/0x15
[ 0.000000] [<ffffffff81d1160a>] ? __acpi_map_table+0x13/0x1a
[ 0.000000] [<ffffffff81594b26>] warn_invalid_dmar+0x86/0xa0
[ 0.000000] [<ffffffff81d42f89>] check_zero_address+0x57/0xf7
[ 0.000000] [<ffffffff81d43040>] detect_intel_iommu+0x17/0xb9
[ 0.000000] [<ffffffff81d0be50>] pci_iommu_alloc+0x4a/0x72
[ 0.000000] [<ffffffff81d19920>] mem_init+0x15/0x133
[ 0.000000] [<ffffffff81d03cc9>] start_kernel+0x1e3/0x3ff
[ 0.000000] [<ffffffff81d038e5>] ? repair_env_string+0x5e/0x5e
[ 0.000000] [<ffffffff81d035de>] x86_64_start_reservations+0x2a/0x2c
[ 0.000000] [<ffffffff81d036d1>] x86_64_start_kernel+0xf1/0x100
[ 0.000000] ---[ end trace 363ca17e08482314 ]---

I wonder if this is related to the gfx problems later.
After adjusting the boot sequence so that i915 is loaded later, the issue can no longer be reproduced.
The faulting location, [ 3.708972] IP: [<ffffffffa06e1371>] i915_driver_load+0xe51/0xe90 [i915], is very likely in the error path. You might need drm.debug=7 to see if we can catch whatever the driver load is racing against.
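Since two boot parameters have now been requested in this thread (intel_iommu=igfx_off and drm.debug=7), it may be worth confirming that both actually reached the kernel before comparing logs. The following is a rough Python sketch under simple assumptions: parameters are split on whitespace with no handling of quoted values, the helper names parse_cmdline and missing_params are invented here, and the sample command line is hypothetical rather than taken from the reporter's machine.

```python
def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def missing_params(cmdline: str, wanted: dict) -> list:
    """Return the wanted name=value pairs that are absent or set differently."""
    have = parse_cmdline(cmdline)
    return [f"{k}={v}" for k, v in wanted.items() if have.get(k) != v]


# Parameters requested in this thread.
WANTED = {"intel_iommu": "igfx_off", "drm.debug": "7"}

# Hypothetical command line; on a live system the string would come from
# reading /proc/cmdline instead.
sample = "ro quiet intel_iommu=igfx_off"
print(missing_params(sample, WANTED))  # prints: ['drm.debug=7']
```

An empty list means both parameters are present with the expected values.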
Created attachment 84082: /var/log/messages after adding drm.debug=7
Hi EvaWang, do you still see this issue with newer kernels? Thanks, Rodrigo.
Hi Rodrigo, we could not reproduce the issue on the newer 3.12 kernel. Thanks!
Gone. Not surprising since it was an impossible bug.