I'm using 4.11rc4 with Fedora 25 on an XPS 9350 hooked up to a TP15 dock with 2 external monitors. I've seen this error on 4.10 as well. Occasionally, about once or twice a day, I get the following bug. It crashes my Wayland session and I have to reboot. The underlying OS is still running, as I can log in from another machine.

Mar 31 10:47:03 nc6910p kernel: ---[ end trace 7191751e8c8925ea ]---
Mar 31 10:47:03 nc6910p kernel: CR2: 0000000000000018
Mar 31 10:47:03 nc6910p kernel: RIP: gen8_ppgtt_alloc_page_directories.isra.38+0x115/0x250 [i915] RSP: ffffbf0085d8b878
Mar 31 10:47:03 nc6910p kernel: Code: e6 48 8b 90 28 03 00 00 48 8b b8 e0 02 00 00 48 8b 52 08 48 83 ca 03 e8 4a d0 ff ff 48 8b 45 b0 48 8b 4d c8 48 8b 10 48 8b 45 d0 <4c> 89 24 ca 48 0f ab 08 0f 1f 44 00 00 e9 53 ff ff ff 65 8b 0
Mar 31 10:47:03 nc6910p kernel: R13: 0000000000000010 R14: 0000000000000000 R15: 0000000000000000
Mar 31 10:47:03 nc6910p kernel: R10: 0000000000000050 R11: 0000000000000246 R12: 00000000c0406469
Mar 31 10:47:03 nc6910p kernel: RBP: 00007ffc29523620 R08: 0000000000000000 R09: 0000000000000000
Mar 31 10:47:03 nc6910p kernel: RDX: 00007ffc29523620 RSI: 00000000c0406469 RDI: 0000000000000010
Mar 31 10:47:03 nc6910p kernel: RAX: ffffffffffffffda RBX: 00001e30454f3000 RCX: 00007fa9fd3ed787
Mar 31 10:47:03 nc6910p kernel: RSP: 002b:00007ffc295235d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Mar 31 10:47:03 nc6910p kernel: RIP: 0033:0x7fa9fd3ed787
Mar 31 10:47:03 nc6910p kernel: entry_SYSCALL64_slow_path+0x25/0x25
Mar 31 10:47:03 nc6910p kernel: do_syscall_64+0x67/0x180
Mar 31 10:47:03 nc6910p kernel: SyS_ioctl+0x79/0x90
Mar 31 10:47:03 nc6910p kernel: do_vfs_ioctl+0xa3/0x5f0
Mar 31 10:47:03 nc6910p kernel: ? i915_gem_execbuffer+0x310/0x310 [i915]
Mar 31 10:47:03 nc6910p kernel: ? seccomp_run_filters+0x52/0xc0
Mar 31 10:47:03 nc6910p kernel: drm_ioctl+0x209/0x4c0 [drm]
Mar 31 10:47:03 nc6910p kernel: i915_gem_execbuffer2+0xc5/0x240 [i915]
Mar 31 10:47:03 nc6910p kernel: i915_gem_do_execbuffer.isra.36+0x4ec/0x1650 [i915]
Mar 31 10:47:03 nc6910p kernel: i915_gem_execbuffer_reserve.isra.30+0x457/0x490 [i915]
Mar 31 10:47:03 nc6910p kernel: i915_gem_execbuffer_reserve_vma.isra.29+0x14d/0x1b0 [i915]
Mar 31 10:47:03 nc6910p kernel: __i915_vma_do_pin+0x3a3/0x460 [i915]
Mar 31 10:47:03 nc6910p kernel: i915_vma_bind+0x81/0x170 [i915]
Mar 31 10:47:03 nc6910p kernel: gen8_alloc_va_range+0x25b/0x410 [i915]
Mar 31 10:47:03 nc6910p kernel: ? add_hole+0xf0/0x110 [drm]
Mar 31 10:47:03 nc6910p kernel: ? pick_next_task_fair+0x398/0x550
Mar 31 10:47:03 nc6910p kernel: ? sched_clock+0x9/0x10
Mar 31 10:47:03 nc6910p kernel: gen8_alloc_va_range_3lvl+0xd4/0x920 [i915]
Mar 31 10:47:03 nc6910p kernel: Call Trace:
Mar 31 10:47:03 nc6910p kernel: CR2: 0000000000000018 CR3: 0000000220044000 CR4: 00000000003406e0
Mar 31 10:47:03 nc6910p kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 31 10:47:03 nc6910p kernel: FS: 00007faa03d64f80(0000) GS:ffff9ee4bed00000(0000) knlGS:0000000000000000
Mar 31 10:47:03 nc6910p kernel: R13: ffff9ee1637a4d10 R14: 00000000fffef000 R15: 0000000000008000
Mar 31 10:47:03 nc6910p kernel: R10: 0000000000000000 R11: 0000000000000001 R12: ffff9ee390806000
Mar 31 10:47:03 nc6910p kernel: RBP: ffffbf0085d8b8d0 R08: 0000000000000000 R09: 0000000000000000
Mar 31 10:47:03 nc6910p kernel: RDX: 0000000000000000 RSI: ffff9ee1d1a74000 RDI: ffff9ee4a73b8000
Mar 31 10:47:03 nc6910p kernel: RAX: ffff9ee43fc76dc0 RBX: 0000000000000003 RCX: 0000000000000003
Mar 31 10:47:03 nc6910p kernel: RSP: 0018:ffffbf0085d8b878 EFLAGS: 00010246
Mar 31 10:47:03 nc6910p kernel: RIP: 0010:gen8_ppgtt_alloc_page_directories.isra.38+0x115/0x250 [i915]
Mar 31 10:47:03 nc6910p kernel: task: ffff9ee26905cc00 task.stack: ffffbf0085d88000
Mar 31 10:47:03 nc6910p kernel: Hardware name: Dell Inc. XPS 13 9350/0H67KH, BIOS 1.4.14 02/08/2017
Mar 31 10:47:03 nc6910p kernel: CPU: 2 PID: 24710 Comm: chrome Tainted: G U OE 4.11.0-0.rc4.git0.2.local.fc27.x86_64 #1
Mar 31 10:47:03 nc6910p kernel: int3403_thermal intel_hid intel_lpss int340x_thermal_zone sparse_keymap int3400_thermal acpi_pad acpi_thermal_rel acpi_als kfifo_buf industrialio tpm_tis tpm_tis_core tpm nfsd auth_rpcgss nfs_acl l
Mar 31 10:47:03 nc6910p kernel: vfat fat dell_led snd_soc_skl snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core snd_soc_sst_match snd_soc_core snd_hda_codec_realtek snd_hda_codec_generic snd_compress snd_pcm_dmaeng
Mar 31 10:47:03 nc6910p kernel: Modules linked in: ccm arc4 mac80211 cfg80211 cdc_ether usbnet snd_usb_audio snd_usbmidi_lib snd_rawmidi r8152 mii rfcomm nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJE
Mar 31 10:47:03 nc6910p kernel: Oops: 0002 [#1] SMP
Mar 31 10:47:03 nc6910p kernel:
Mar 31 10:47:03 nc6910p kernel: PGD 0
Mar 31 10:47:03 nc6910p kernel: IP: gen8_ppgtt_alloc_page_directories.isra.38+0x115/0x250 [i915]
Mar 31 10:47:03 nc6910p kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000018

The crash happens when moving the mouse or opening a window, say in Chrome. It appears to be random other than the mouse/window/popup action.
commit e2b763caa6eb68ea56918ee6f79b40b82bdcf7c9
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Feb 15 08:43:48 2017 +0000

    drm/i915: Remove bitmap tracking for used-pdpes

Which is too big for stable and queued for 4.12.
So, because the fix is too big for "stable" kernels, kernel 4.10 and 4.11 are therefore unstable for any PC running Intel graphics? I've seen this happen under X, so it's not just Wayland sessions that crash with this bug. Is the fix in drm-intel-nightly?
I am running drm-tip, which does not generate this error; it generates another error that seems similar in that it deals with allocation. Devs thought it was a userspace issue with, say, gnome-shell, but I'm not so sure.
I am running Fedora 25 with 4.10.13 and Xorg crashes one or two times daily with the following message:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
IP: gen8_ppgtt_alloc_page_directories.isra.36+0x115/0x250 [i915]

I guess the fix didn't make it into 4.10.12?
Chris, I wonder how you guys find it "ok" to have two major kernel versions (4.10 and 4.11) lock up on users across the board with no intent to backport the fix? This is hitting *all* the laptops here in ozlabs since the distros have been updating to 4.10. We can't get more than about a day of uptime. This really needs a workaround of some sort in 4.10 and 4.11.