Bug 104773 - GPF in i915 call to gen8_ppgtt_alloc_pdp causes laptop to hang
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
Reported: 2018-01-24 19:56 UTC by Eric Blau
Modified: 2018-04-20 11:02 UTC

i915 features: GEM/PPGTT


Description Eric Blau 2018-01-24 19:56:26 UTC
System Architecture: x86_64
Kernel Version:      4.14.13-1-ARCH
Linux Distribution:  Arch Linux
Machine:             MacBook Pro 12,1
Display Connector:   Thunderbolt to DisplayPort

I was working on my laptop this morning, doing nothing in particular, when it hung and became unresponsive. It did not respond to mouse or key presses and required a hard power cycle. The exact trigger was clicking a tab in Chromium to switch tabs. Checking journalctl, I see the following general protection fault in i915:

Jan 24 10:32:48 eric-macbookpro kernel: general protection fault: 0000 [#1] PREEMPT SMP PTI
Jan 24 10:32:48 eric-macbookpro kernel: Modules linked in: brcmfmac brcmutil cfg80211 mmc_core facetimehd(O) videobuf2_dma_sg videobuf2_memops videobuf2_v4l2 videobuf2_core videodev media asix usbnet mii libphy tun rfcomm ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter xt_conntrack nf_nat nf_conntrack fuse libcrc32c crc32c_generic br_netfilter bridge stp llc cmac bnep nls_iso8859_1 nls_cp437 vfat fat uas msr iTCO_wdt iTCO_vendor_support snd_hda_codec_hdmi joydev thunderbolt snd_hda_codec_cirrus snd_hda_codec_generic sch_fq_codel sg crypto_user applesmc input_polldev intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel i915 kvm irqbypass intel_cstate intel_rapl_perf pcspkr btusb btrtl btbcm
Jan 24 10:32:48 eric-macbookpro kernel:  btintel bluetooth i2c_algo_bit intel_pch_thermal drm_kms_helper bcm5974 snd_hda_intel drm snd_hda_codec ecdh_generic intel_gtt evdev mousedev input_leds rfkill agpgart led_class snd_hda_core crc16 mei_me snd_hwdep syscopyarea mac_hid snd_pcm i2c_i801 sysfillrect lpc_ich sysimgblt mei snd_timer shpchp snd spi_pxa2xx_pci fb_sys_fops soundcore battery video acpi_als kfifo_buf sbs industrialio sbshc spi_pxa2xx_platform apple_bl ac button ip_tables x_tables zfs(PO) zunicode(PO) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) algif_skcipher af_alg hid_apple hid_generic usbhid hid dm_crypt dm_mod sd_mod usb_storage crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc ahci aesni_intel libahci aes_x86_64 crypto_simd glue_helper cryptd libata xhci_pci scsi_mod xhci_hcd usbcore usb_common
Jan 24 10:32:48 eric-macbookpro kernel:  [last unloaded: brcmutil]
Jan 24 10:32:48 eric-macbookpro kernel: CPU: 0 PID: 7374 Comm: chromium Tainted: P           O    4.14.13-1-ARCH #1
Jan 24 10:32:48 eric-macbookpro kernel: Hardware name: Apple Inc. MacBookPro12,1/Mac-E43C1C25D4880AD6, BIOS MBP121.88Z.0167.B33.1706181928 06/18/2017
Jan 24 10:32:48 eric-macbookpro kernel: task: ffff994f696c2c40 task.stack: ffffb1a789d4c000
Jan 24 10:32:48 eric-macbookpro kernel: RIP: 0010:gen8_ppgtt_set_pde.isra.40+0x48/0x70 [i915]
Jan 24 10:32:48 eric-macbookpro kernel: RSP: 0018:ffffb1a789d4f940 EFLAGS: 00010206
Jan 24 10:32:48 eric-macbookpro kernel: RAX: 81c1788cc4f68138 RBX: ffff994f54db8000 RCX: ffff994f696c2c40
Jan 24 10:32:48 eric-macbookpro kernel: RDX: 000000023bc73003 RSI: ffff994d598b6b80 RDI: ffff994f54db8000
Jan 24 10:32:48 eric-macbookpro kernel: RBP: ffff994d598b6b80 R08: 0000000000000000 R09: 0000000000000000
Jan 24 10:32:48 eric-macbookpro kernel: R10: ffffb1a789d4f550 R11: ffff994eaf3c3208 R12: 0000000000000027
Jan 24 10:32:48 eric-macbookpro kernel: R13: 0000000000005000 R14: 0000000004e8f000 R15: ffff994f54dba000
Jan 24 10:32:48 eric-macbookpro kernel: FS:  00007f585886aa00(0000) GS:ffff994faec00000(0000) knlGS:0000000000000000
Jan 24 10:32:48 eric-macbookpro kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 24 10:32:48 eric-macbookpro kernel: CR2: 00000000004ac8e8 CR3: 00000002552c8004 CR4: 00000000003606f0
Jan 24 10:32:48 eric-macbookpro kernel: Call Trace:
Jan 24 10:32:48 eric-macbookpro kernel:  gen8_ppgtt_alloc_pdp+0x178/0x320 [i915]
Jan 24 10:32:48 eric-macbookpro kernel:  gen8_ppgtt_alloc_4lvl+0x5f/0x150 [i915]
Jan 24 10:32:48 eric-macbookpro kernel:  ppgtt_bind_vma+0x30/0x70 [i915]
Jan 24 10:32:48 eric-macbookpro kernel:  i915_vma_bind+0x68/0xd0 [i915]
Jan 24 10:32:48 eric-macbookpro kernel:  __i915_vma_do_pin+0x2d6/0x3a0 [i915]
Jan 24 10:32:48 eric-macbookpro kernel:  eb_lookup_vmas+0x7a2/0xb50 [i915]
Jan 24 10:32:48 eric-macbookpro kernel:  i915_gem_do_execbuffer+0x4d7/0x10e0 [i915]
Jan 24 10:32:48 eric-macbookpro kernel:  ? sock_wfree+0x34/0x60
Jan 24 10:32:48 eric-macbookpro kernel:  ? unix_stream_read_generic+0x1f9/0x7e0
Jan 24 10:32:48 eric-macbookpro kernel:  ? import_iovec+0x37/0xd0
Jan 24 10:32:48 eric-macbookpro kernel:  ? i915_gem_execbuffer2+0x5d/0x390 [i915]
Jan 24 10:32:48 eric-macbookpro kernel:  i915_gem_execbuffer2+0x1b7/0x390 [i915]
Jan 24 10:32:48 eric-macbookpro kernel:  ? i915_gem_execbuffer+0x2d0/0x2d0 [i915]
Jan 24 10:32:48 eric-macbookpro kernel:  drm_ioctl_kernel+0x59/0xb0 [drm]
Jan 24 10:32:48 eric-macbookpro kernel:  drm_ioctl+0x2d5/0x370 [drm]
Jan 24 10:32:48 eric-macbookpro kernel:  ? i915_gem_execbuffer+0x2d0/0x2d0 [i915]
Jan 24 10:32:48 eric-macbookpro kernel:  ? __seccomp_filter+0x3b/0x260
Jan 24 10:32:48 eric-macbookpro kernel:  do_vfs_ioctl+0xa1/0x610
Jan 24 10:32:48 eric-macbookpro kernel:  ? syscall_trace_enter+0xdb/0x2b0
Jan 24 10:32:48 eric-macbookpro kernel:  SyS_ioctl+0x74/0x80
Jan 24 10:32:48 eric-macbookpro kernel:  do_syscall_64+0x55/0x110
Jan 24 10:32:48 eric-macbookpro kernel:  entry_SYSCALL64_slow_path+0x25/0x25
Jan 24 10:32:48 eric-macbookpro kernel: RIP: 0033:0x7f584fa82d27
Jan 24 10:32:48 eric-macbookpro kernel: RSP: 002b:00007ffee14a7828 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Jan 24 10:32:48 eric-macbookpro kernel: RAX: ffffffffffffffda RBX: 000003b0126a1030 RCX: 00007f584fa82d27
Jan 24 10:32:48 eric-macbookpro kernel: RDX: 00007ffee14a7870 RSI: 0000000040406469 RDI: 0000000000000080
Jan 24 10:32:48 eric-macbookpro kernel: RBP: 00007ffee14a7870 R08: 0000000000000002 R09: 0000000000000077
Jan 24 10:32:48 eric-macbookpro kernel: R10: 00007f5839f2b780 R11: 0000000000000246 R12: 0000000040406469
Jan 24 10:32:48 eric-macbookpro kernel: R13: 0000000000000080 R14: 00007f5842b00040 R15: 0000000000000000
Jan 24 10:32:48 eric-macbookpro kernel: Code: 01 00 83 81 58 0a 00 00 01 48 2b 05 13 9d fd c9 48 c1 f8 06 48 c1 e0 0c 48 8d 04 d0 48 8b 56 08 48 03 05 0c 9d fd c9 48 83 ca 03 <48> 89 10 83 a9 58 0a 00 00 01 65 ff 0d 37 03 fb 3e 74 02 f3 c3 
Jan 24 10:32:48 eric-macbookpro kernel: RIP: gen8_ppgtt_set_pde.isra.40+0x48/0x70 [i915] RSP: ffffb1a789d4f940
Jan 24 10:32:48 eric-macbookpro kernel: ---[ end trace 927b3fb3beeae4b1 ]---


Please advise if any additional information is required.
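
For readers of this report, a few fields of the oops above can be decoded mechanically. The sketch below is a reader's interpretation, not part of the original report. It assumes 48-bit virtual addresses and the generic Linux ioctl number layout. The helper names `is_canonical` and `decode_ioctl` are hypothetical. The bytes `<48> 89 10` at the faulting RIP encode a 64-bit store of RDX to the address in RAX, so the check of RAX is the interesting one.

```python
# Register values copied from the oops above; the interpretation is a sketch.
RAX = 0x81C1788CC4F68138        # destination address of the faulting store
RDX = 0x000000023BC73003        # value being stored (the new directory entry)
KERNEL_RSP = 0xFFFFB1A789D4F940 # kernel stack pointer, for comparison
IOCTL_CMD = 0x40406469          # RSI at syscall entry (ioctl request number)


def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """True if bits 63..(va_bits - 1) are all equal (x86-64 canonical form)."""
    top = addr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


def decode_ioctl(cmd: int) -> dict:
    """Split an ioctl number into the fields of the generic Linux layout."""
    return {
        "nr": cmd & 0xFF,                # command number within the driver
        "type": chr((cmd >> 8) & 0xFF),  # driver "magic" character
        "size": (cmd >> 16) & 0x3FFF,    # size of the argument struct in bytes
        "dir": (cmd >> 30) & 0x3,        # 1 = write (userspace to kernel)
    }


if __name__ == "__main__":
    print("RAX canonical:", is_canonical(RAX))
    print("RSP canonical:", is_canonical(KERNEL_RSP))
    print("entry flag bits:", hex(RDX & 0xFFF))
    print("ioctl fields:", decode_ioctl(IOCTL_CMD))
```

RAX is not a canonical address, which is consistent with a general protection fault rather than a page fault on the store. The ioctl fields ('d', 0x69, 64-byte argument) match the `i915_gem_execbuffer2` entry in the call trace, assuming the usual DRM command numbering.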
Comment 1 Elizabeth 2018-01-30 17:14:21 UTC
Hello Eric,
Did you get a crash log or dmesg output from the event? Is this reproducible? If so, could you capture a dmesg with debug information?
Comment 2 Eric Blau 2018-01-30 18:00:10 UTC
My laptop was completely frozen, so I could not get any further output, and I have not been able to reproduce the problem. I captured what I could from journalctl after the fact.

Are there additional debugging options I can turn on that would be captured in journalctl if this happens again?

I'm now running with kernel version 4.14.15-1-ARCH from Arch Linux and it seems much more stable so far.
Comment 3 Chris Wilson 2018-02-01 07:26:03 UTC
commit b715a2f0c7714a399e7f8e951cc8dea9cd4eeb4b
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Jan 31 21:44:39 2018 +0000

    drm/i915/ppgtt: Pin page directories before allocation
    
    Commit e2b763caa6eb ("drm/i915: Remove bitmap tracking for used-pdpes")
    believed that because it did not insert its freshly allocated page
    directory into the pd tree, it was safe from the shrinker. I failed to
    heed the lesson learnt from commit dd19674bacba ("drm/i915: Remove bitmap
    tracking for used-ptes") that we need to pin all the levels in the tree
    before hitting the shrinker or else the shrinker may free an upper layer
    as we proceed to allocate the tree. Thus leaving dangling pointers
    everywhere and a GPF should we hit direct reclaim at just the wrong
    moment.
    
    CPU: 0 PID: 7374 Comm: chromium Tainted: P           O    4.14.13-1-ARCH #1
    Hardware name: Apple Inc. MacBookPro12,1/Mac-E43C1C25D4880AD6, BIOS MBP121.88Z.0167.B33.1706181928 06/18/2017
    task: ffff994f696c2c40 task.stack: ffffb1a789d4c000
    RIP: 0010:gen8_ppgtt_set_pde.isra.40+0x48/0x70 [i915]
    RSP: 0018:ffffb1a789d4f940 EFLAGS: 00010206
    RAX: 81c1788cc4f68138 RBX: ffff994f54db8000 RCX: ffff994f696c2c40
    RDX: 000000023bc73003 RSI: ffff994d598b6b80 RDI: ffff994f54db8000
    RBP: ffff994d598b6b80 R08: 0000000000000000 R09: 0000000000000000
    R10: ffffb1a789d4f550 R11: ffff994eaf3c3208 R12: 0000000000000027
    R13: 0000000000005000 R14: 0000000004e8f000 R15: ffff994f54dba000
    FS:  00007f585886aa00(0000) GS:ffff994faec00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000000004ac8e8 CR3: 00000002552c8004 CR4: 00000000003606f0
    Call Trace:
     gen8_ppgtt_alloc_pdp+0x178/0x320 [i915]
     gen8_ppgtt_alloc_4lvl+0x5f/0x150 [i915]
     ppgtt_bind_vma+0x30/0x70 [i915]
     i915_vma_bind+0x68/0xd0 [i915]
     __i915_vma_do_pin+0x2d6/0x3a0 [i915]
     eb_lookup_vmas+0x7a2/0xb50 [i915]
     i915_gem_do_execbuffer+0x4d7/0x10e0 [i915]
     ? sock_wfree+0x34/0x60
     ? unix_stream_read_generic+0x1f9/0x7e0
     ? import_iovec+0x37/0xd0
     ? i915_gem_execbuffer2+0x5d/0x390 [i915]
     i915_gem_execbuffer2+0x1b7/0x390 [i915]
     ? i915_gem_execbuffer+0x2d0/0x2d0 [i915]
     drm_ioctl_kernel+0x59/0xb0 [drm]
     drm_ioctl+0x2d5/0x370 [drm]
     ? i915_gem_execbuffer+0x2d0/0x2d0 [i915]
     ? __seccomp_filter+0x3b/0x260
     do_vfs_ioctl+0xa1/0x610
     ? syscall_trace_enter+0xdb/0x2b0
     SyS_ioctl+0x74/0x80
     do_syscall_64+0x55/0x110
     entry_SYSCALL64_slow_path+0x25/0x25
    RIP: 0033:0x7f584fa82d27
    RSP: 002b:00007ffee14a7828 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
    RAX: ffffffffffffffda RBX: 000003b0126a1030 RCX: 00007f584fa82d27
    RDX: 00007ffee14a7870 RSI: 0000000040406469 RDI: 0000000000000080
    RBP: 00007ffee14a7870 R08: 0000000000000002 R09: 0000000000000077
    R10: 00007f5839f2b780 R11: 0000000000000246 R12: 0000000040406469
    R13: 0000000000000080 R14: 00007f5842b00040 R15: 0000000000000000
    Code: 01 00 83 81 58 0a 00 00 01 48 2b 05 13 9d fd c9 48 c1 f8 06 48 c1 e0 0c 48 8d 04 d0 48 8b 56 08 48 03 05 0c 9d fd c9 48 83 ca 03 <48> 89 10 83 a9 58 0a 00 00 01 65 ff 0d 37 03 fb 3e 74 02 f3 c3
    RIP: gen8_ppgtt_set_pde.isra.40+0x48/0x70 [i915] RSP: ffffb1a789d4f940
    
    Reported-by: Eric Blau <eblau@eblau.com>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104773
    Fixes: e2b763caa6eb ("drm/i915: Remove bitmap tracking for used-pdpes")
    References: dd19674bacba ("drm/i915: Remove bitmap tracking for used-ptes")
    Testcase: igt/drv_selftest/live_gtt (igt_ppgtt_shrink_boom)
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Matthew Auld <matthew.auld@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20180131214440.7141-1-chris@chris-wilson.co.uk
    Reviewed-by: Matthew Auld <matthew.auld@intel.com>
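
The race described in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions, not the i915 code: the class and method names (`Node`, `ToyPageTables`, `shrink`, `map_page`) are hypothetical, the "shrinker" frees any node whose use count is zero, and every allocation is assumed to enter reclaim first, which is the worst case the commit message refers to.

```python
class Node:
    """One level of a toy page-table tree (hypothetical, not an i915 type)."""

    def __init__(self, name):
        self.name = name
        self.used = 0       # entries in use; 0 means the shrinker may free it
        self.freed = False
        self.entries = {}


class ToyPageTables:
    def __init__(self, pin_before_alloc):
        self.pin_before_alloc = pin_before_alloc
        self.nodes = []

    def shrink(self):
        """Stand-in for direct reclaim: free every node that nobody holds."""
        for node in self.nodes:
            if node.used == 0:
                node.freed = True

    def alloc_node(self, name):
        """Assume the worst case: each allocation runs the shrinker first."""
        self.shrink()
        node = Node(name)
        self.nodes.append(node)
        return node

    def set_entry(self, parent, index, child):
        """Write a child pointer into its parent, as set_pde does in the trace."""
        if parent.freed:
            raise RuntimeError(f"write into freed {parent.name}")
        parent.entries[index] = child
        parent.used += 1

    def map_page(self):
        pd = self.alloc_node("page directory")
        if self.pin_before_alloc:
            pd.used += 1    # temporary pin held across the next allocation
        try:
            pt = self.alloc_node("page table")
        finally:
            if self.pin_before_alloc:
                pd.used -= 1
        self.set_entry(pd, 0, pt)
        return pd


if __name__ == "__main__":
    for pin in (True, False):
        try:
            ToyPageTables(pin_before_alloc=pin).map_page()
            print(f"pin={pin}: mapping succeeded")
        except RuntimeError as err:
            print(f"pin={pin}: {err}")
```

Without the pin, the freshly allocated upper level still has a use count of zero when the lower level is allocated, so the model's shrinker frees it and the later write lands in freed memory. With the pin, the upper level survives the allocation.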
Comment 4 Jani Saarinen 2018-04-20 11:02:28 UTC
Closing; please re-open if this still occurs.

