Bug 111063

Summary: [CI][SHARDS]igt@gem_busy@close-race - dmesg-fail - general protection fault: 0000 [#1] PREEMPT SMP NOPTI
Product: DRI
Reporter: Lakshmi <lakshminarayana.vudum>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: RESOLVED FIXED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal
Priority: medium
CC: intel-gfx-bugs
Version: DRI git
Hardware: Other
OS: All
Whiteboard: ReadyForDev
i915 platform: BXT, HSW, ICL, KBL, SKL
i915 features: GEM/Other

Description Lakshmi 2019-07-04 11:58:28 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6403/shard-apl3/igt@gem_busy@close-race.html

<6> [502.819949] Console: switching to colour dummy device 80x25
<6> [502.820091] [IGT] gem_busy: executing
<5> [502.864784] Setting dangerous option reset - tainting kernel
<6> [502.879332] [IGT] gem_busy: starting subtest close-race
<7> [502.881217] [drm:vgem_gem_dumb_create [vgem]] Created object of size 1
<7> [502.913907] [drm:vgem_gem_dumb_create [vgem]] Created object of size 1
<7> [502.950901] [drm:vgem_gem_dumb_create [vgem]] Created object of size 1
<6> [502.992446] gem_busy (3105): drop_caches: 4
<4> [509.505766] general protection fault: 0000 [#1] PREEMPT SMP NOPTI
<4> [509.505791] CPU: 2 PID: 3109 Comm: gem_busy Tainted: G     U            5.2.0-rc7-CI-CI_DRM_6403+ #1
<4> [509.505811] Hardware name:  /NUC6CAYB, BIOS AYAPLCEL.86A.0049.2018.0508.1356 05/08/2018
<4> [509.505933] RIP: 0010:i915_gem_busy_ioctl+0x136/0x5d0 [i915]
<4> [509.505950] Code: 01 44 89 75 c4 4c 8d 64 c3 20 4c 89 7d c8 4d 89 ee 49 8b 1e e8 0b 4a fe e0 85 c0 74 0d 80 3d 04 16 21 00 00 0f 84 40 01 00 00 <48> 81 7b 08 40 ac 27 a0 74 7b 31 c0 48 8b 4d d0 49 83 c6 08 0b 41
<4> [509.505987] RSP: 0018:ffffc90000347ce8 EFLAGS: 00010202
<4> [509.506002] RAX: 0000000000000001 RBX: 6b6b6b6b6b6b6b6b RCX: 00000000ffffffff
<4> [509.506018] RDX: ffff8882763f0040 RSI: 00000000ffffffff RDI: 0000000000000246
<4> [509.506033] RBP: ffffc90000347d28 R08: 0000000000000000 R09: 0000000000000001
<4> [509.506049] R10: 0000000000000000 R11: ffff888265632060 R12: ffff8885b10e5608
<4> [509.506065] R13: ffff888255b2fab0 R14: ffff888255b2fab0 R15: ffff88826d5cc140
<4> [509.506082] FS:  00007f8f5a0bde40(0000) GS:ffff888277b00000(0000) knlGS:0000000000000000
<4> [509.506100] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [509.506113] CR2: 00007f8f5a0f1000 CR3: 000000023c6ec000 CR4: 00000000003406e0
<4> [509.506129] Call Trace:
<4> [509.506272]  ? mock_engine_free+0x80/0x80 [i915]
<4> [509.506291]  drm_ioctl_kernel+0x83/0xf0
<4> [509.506307]  drm_ioctl+0x2f3/0x3b0
<4> [509.506407]  ? mock_engine_free+0x80/0x80 [i915]
<4> [509.506425]  ? update_curr+0x21b/0x410
<4> [509.506439]  ? pick_next_entity+0x6b/0x110
<4> [509.506456]  do_vfs_ioctl+0xa0/0x6e0
<4> [509.506469]  ? lockdep_hardirqs_on+0xe3/0x1b0
<4> [509.506484]  ? _raw_spin_unlock_irq+0x2f/0x50
<4> [509.506497]  ? __schedule+0x71e/0x840
<4> [509.506511]  ksys_ioctl+0x35/0x60
<4> [509.506524]  __x64_sys_ioctl+0x11/0x20
<4> [509.506536]  do_syscall_64+0x55/0x1c0
<4> [509.506549]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [509.506564] RIP: 0033:0x7f8f593425d7
<4> [509.506577] Code: b3 66 90 48 8b 05 b1 48 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 81 48 2d 00 f7 d8 64 89 01 48
<4> [509.506616] RSP: 002b:00007ffd7b4cabd8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
<4> [509.506636] RAX: ffffffffffffffda RBX: 00007f8f5a0f1000 RCX: 00007f8f593425d7
<4> [509.506653] RDX: 00007ffd7b4cae88 RSI: 00000000c0086457 RDI: 0000000000000005
<4> [509.506670] RBP: 00007ffd7b4cae88 R08: 00007f8f5a0bde40 R09: 00007f8f598372b0
<4> [509.506687] R10: 0000000000000056 R11: 0000000000000246 R12: 00000000c0086457
<4> [509.506704] R13: 0000000000000005 R14: 0000000000000002 R15: 0000000000000002
<4> [509.506728] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic mei_hdcp x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel i915 lpc_ich r8169 realtek snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm mei_me pinctrl_broxton pinctrl_intel mei prime_numbers
<0> [509.506810] Dumping ftrace buffer:

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6403/shard-iclb5/igt@gem_busy@close-race.html

<6> [26.351073] Console: switching to colour dummy device 80x25
<6> [26.351120] [IGT] gem_busy: executing
<5> [26.355275] Setting dangerous option reset - tainting kernel
<6> [26.359975] [IGT] gem_busy: starting subtest close-race
<6> [26.368971] [drm] Initialized vgem 1.0.0 20120112 for vgem on minor 1
<7> [26.369415] [drm:vgem_gem_dumb_create [vgem]] Created object of size 1
<7> [26.397168] [drm:vgem_gem_dumb_create [vgem]] Created object of size 1
<7> [26.431513] [drm:vgem_gem_dumb_create [vgem]] Created object of size 1
<6> [26.461613] gem_busy (1159): drop_caches: 4
<7> [28.883764] [drm:edp_panel_vdd_off_sync [i915]] Turning eDP port A VDD off
<7> [28.883990] [drm:edp_panel_vdd_off_sync [i915]] PP_STATUS: 0x80000008 PP_CONTROL: 0x00000067
<7> [28.884038] [drm:intel_power_well_disable [i915]] disabling DC off
<7> [28.884087] [drm:skl_enable_dc6 [i915]] Enabling DC6
<7> [28.884123] [drm:gen9_set_dc_state [i915]] Setting DC state from 00 to 02
<4> [39.878676] irq event stamp: 15730018
<4> [39.878680] general protection fault: 0000 [#1] PREEMPT SMP NOPTI
<4> [39.878684] CPU: 1 PID: 1170 Comm: gem_busy Tainted: G     U            5.2.0-rc7-CI-CI_DRM_6403+ #1
<4> [39.878698] hardirqs last  enabled at (15730017): [<ffffffff8123245d>] __slab_alloc.isra.28.constprop.34+0x4d/0x70
<4> [39.878703] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3183.A00.1905020411 05/02/2019
<4> [39.878720] hardirqs last disabled at (15730018): [<ffffffff81232429>] __slab_alloc.isra.28.constprop.34+0x19/0x70
<4> [39.878816] RIP: 0010:i915_gem_busy_ioctl+0x136/0x5d0 [i915]
<4> [39.878819] Code: 01 44 89 75 c4 4c 8d 64 c3 20 4c 89 7d c8 4d 89 ee 49 8b 1e e8 0b fa f2 e0 85 c0 74 0d 80 3d 04 16 21 00 00 0f 84 40 01 00 00 <48> 81 7b 08 40 fc 32 a0 74 7b 31 c0 48 8b 4d d0 49 83 c6 08 0b 41
<4> [39.878838] softirqs last  enabled at (15729884): [<ffffffff81c0033a>] __do_softirq+0x33a/0x4b9
<4> [39.878842] softirqs last disabled at (15729877): [<ffffffff810b6499>] irq_exit+0xa9/0xc0
<4> [39.878850] RSP: 0000:ffffc900006afce8 EFLAGS: 00010202
<4> [39.878923] RAX: 0000000000000001 RBX: 6b6b6b6b6b6b6b6b RCX: 00000000ffffffff
<4> [39.878938] RDX: ffff88848de20040 RSI: 00000000ffffffff RDI: 0000000000000246
<4> [39.878953] RBP: ffffc900006afd28 R08: 0000000000000000 R09: 0000000000000001
<4> [39.878969] R10: 0000000000000000 R11: ffff8884952ff010 R12: ffff8887eb055928
<4> [39.878985] R13: ffff88848fa9fdd0 R14: ffff88848fa9fdd0 R15: ffff88848ddc54c0
<4> [39.879000] FS:  00007fc8e7d61240(0000) GS:ffff88849fc80000(0000) knlGS:0000000000000000
<4> [39.879016] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [39.879029] CR2: 00007f2148051000 CR3: 000000048de24001 CR4: 0000000000760ee0
<4> [39.879044] PKRU: 55555554
<4> [39.879052] Call Trace:
<4> [39.879141]  ? mock_engine_free+0x80/0x80 [i915]
<4> [39.879157]  drm_ioctl_kernel+0x83/0xf0
<4> [39.879171]  drm_ioctl+0x2f3/0x3b0
<4> [39.879248]  ? mock_engine_free+0x80/0x80 [i915]
<4> [39.879264]  ? finish_task_switch+0x95/0x260
<4> [39.879277]  ? finish_task_switch+0x68/0x260
<4> [39.879290]  ? __switch_to+0x1b2/0x460
<4> [39.879302]  ? trace_hardirqs_on_thunk+0x1a/0x1c
<4> [39.879315]  ? lockdep_hardirqs_on+0xe3/0x1b0
<4> [39.879327]  ? trace_hardirqs_on_thunk+0x1a/0x1c
<4> [39.879343]  do_vfs_ioctl+0xa0/0x6e0
<4> [39.879354]  ? retint_kernel+0x2b/0x2b
<4> [39.879368]  ksys_ioctl+0x35/0x60
<4> [39.879380]  __x64_sys_ioctl+0x11/0x20
<4> [39.879391]  do_syscall_64+0x55/0x1c0
<4> [39.879403]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4> [39.879416] RIP: 0033:0x7fc8e6fe55d7
<4> [39.879426] Code: b3 66 90 48 8b 05 b1 48 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 81 48 2d 00 f7 d8 64 89 01 48
<4> [39.879460] RSP: 002b:00007ffffab6a6e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
<4> [39.879478] RAX: ffffffffffffffda RBX: 00007fc8e7d94000 RCX: 00007fc8e6fe55d7
<4> [39.879493] RDX: 00007ffffab6a998 RSI: 00000000c0086457 RDI: 0000000000000005
<4> [39.879508] RBP: 00007ffffab6a998 R08: 00007fc8e7d61240 R09: 00007fc8e74da2b0
<4> [39.879523] R10: 0000000000000056 R11: 0000000000000246 R12: 00000000c0086457
<4> [39.879538] R13: 0000000000000005 R14: 000000000000000d R15: 0000000000000006
<4> [39.879559] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal coretemp cdc_ether e1000e usbnet mei_hdcp mii crct10dif_pclmul snd_hda_intel crc32_pclmul snd_hda_codec ptp snd_hwdep ghash_clmulni_intel pps_core snd_hda_core snd_pcm mei_me mei prime_numbers
<0> [39.879629] Dumping ftrace buffer:
<0> [39.879640] ---------------------------------
<0> [39.879758] kms_atom-1153    7.... 24480265us : i915_gem_wait_for_idle: flags=3 (locked), timeout=9223372036854775807 (forever), awake?=no
<0> [39.879881] kms_atom-1153    7.... 24480266us : i915_gem_wait_for_idle: flags=3 (locked), timeout=9223372036854775807 (forever), awake?=no
<0> [39.880000] kms_atom-1153    7.... 25086750us : i915_gem_wait_for_idle: flags=3 (locked), timeout=9223372036854775807 (forever), awake?=no
<0> [39.880117] kms_atom-1153    7.... 25086752us : i915_gem_wait_for_idle: flags=3 (locked), timeout=9223372036854775807 (forever), awake?=no
<0> [39.880226] CPU:4 [LOST 458 EVENTS]
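
In both traces above RBX holds 6b6b6b6b6b6b6b6b, which is the kernel's POISON_FREE byte (0x6b) repeated. That pattern normally shows up only when slab poisoning is enabled and the memory has already been freed. The faulting instruction bytes (48 81 7b 08 ...) decode to a compare against memory at offset 8 from RBX, so the fault is consistent with dereferencing a pointer read from freed memory. The helper below is a minimal sketch for spotting this in a dmesg excerpt. The function names and the SAMPLE excerpt layout are illustrative assumptions, not part of any CI tooling.

```python
import re

POISON_FREE = 0x6B  # value of POISON_FREE in the kernel's include/linux/poison.h

# Matches "RBX: 6b6b6b6b6b6b6b6b"-style fields. RIP and RSP carry a segment
# prefix ("0018:...") in these logs, so they do not match and are skipped.
REG_RE = re.compile(r"\b(R[A-Z0-9]{2}): ([0-9a-f]{16})\b")


def poisoned_registers(dmesg_text):
    """Return names of registers whose value is POISON_FREE in every byte."""
    poison = int.from_bytes(bytes([POISON_FREE]) * 8, "big")
    return [name for name, value in REG_RE.findall(dmesg_text)
            if int(value, 16) == poison]


def is_canonical(addr, va_bits=48):
    """True if bits 63..(va_bits - 1) of addr are all equal (x86-64 canonical form)."""
    top = addr >> (va_bits - 1)
    return top == 0 or top == (1 << (64 - va_bits + 1)) - 1


# Three register lines copied from the first trace in this report.
SAMPLE = """\
<4> [509.505987] RSP: 0018:ffffc90000347ce8 EFLAGS: 00010202
<4> [509.506002] RAX: 0000000000000001 RBX: 6b6b6b6b6b6b6b6b RCX: 00000000ffffffff
<4> [509.506049] R10: 0000000000000000 R11: ffff888265632060 R12: ffff8885b10e5608
"""

if __name__ == "__main__":
    print(poisoned_registers(SAMPLE))
    # The faulting access is at RBX + 8; a non-canonical address raises #GP
    # rather than a page fault.
    print(is_canonical(0x6B6B6B6B6B6B6B6B + 8))
```

A non-canonical address explains why the kernel reports a general protection fault here instead of the more familiar "unable to handle page fault" oops.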
Comment 2 CI Bug Log 2019-07-04 12:02:39 UTC
A CI Bug Log filter associated to this bug has been updated:

{- APL ICL: igt@gem_busy@close-race - dmesg-fail - general protection fault: 0000 [#1] PREEMPT SMP NOPTI -}
{+ HSW APL SKL KBL ICL: igt@gem_busy@close-race - dmesg-fail - general protection fault: 0000 [#1] PREEMPT SMP NOPTI +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6403/shard-kbl3/igt@gem_busy@close-race.html
  * https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6404/shard-skl1/igt@gem_busy@close-race.html
  * https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5081/shard-hsw5/igt@gem_busy@close-race.html
Comment 3 CI Bug Log 2019-07-04 12:02:58 UTC
A CI Bug Log filter associated to this bug has been updated:

{- HSW APL SKL KBL ICL: igt@gem_busy@close-race - dmesg-fail - general protection fault: 0000 [#1] PREEMPT SMP NOPTI -}
{+ HSW APL SKL KBL ICL: igt@gem_busy@close-race - dmesg-fail - general protection fault: 0000 [#1] PREEMPT SMP (NOPTI|PTI) +}


  No new failures caught with the new filter
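
The effect of widening the filter from NOPTI to (NOPTI|PTI) can be sketched as below. This assumes the filter text is applied as a regular expression to dmesg lines and that the literal "[#1]" is escaped. The PTI sample line is hypothetical, as is the `caught` helper.

```python
import re

# Assumed regex renderings of the two filter strings shown above.
OLD_FILTER = re.compile(r"general protection fault: 0000 \[#1\] PREEMPT SMP NOPTI")
NEW_FILTER = re.compile(r"general protection fault: 0000 \[#1\] PREEMPT SMP (NOPTI|PTI)")

# Line taken from the first trace in this report.
NOPTI_LINE = "<4> [509.505766] general protection fault: 0000 [#1] PREEMPT SMP NOPTI"
# Hypothetical line: the same fault on a machine running with page table isolation.
PTI_LINE = "<4> [100.000000] general protection fault: 0000 [#1] PREEMPT SMP PTI"


def caught(filter_re, line):
    """True if the filter matches anywhere in the dmesg line."""
    return filter_re.search(line) is not None


if __name__ == "__main__":
    for name, line in (("NOPTI", NOPTI_LINE), ("PTI", PTI_LINE)):
        print(name, caught(OLD_FILTER, line), caught(NEW_FILTER, line))
```

Under these assumptions the old filter misses the PTI variant, while the new one catches both.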
Comment 6 Chris Wilson 2019-07-04 14:41:07 UTC
commit 0c159ffef628fa94d0f4f9128e7f2b6f2b5e86ef (HEAD -> drm-intel-next-queued, drm-intel/for-linux-next, drm-intel/drm-intel-next-queued)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Jul 3 19:06:01 2019 +0100

    drm/i915/gem: Defer obj->base.resv fini until RCU callback
    
    Since reservation_object_fini() does an immediate free, rather than
    kfree_rcu as normal, we have to delay the release until after the RCU
    grace period has elapsed (i.e. from the rcu cleanup callback) so that we
    can rely on the RCU protected access to the fences while the object is a
    zombie.
    
    i915_gem_busy_ioctl relies on having an RCU barrier to protect the
    reservation in order to avoid having to take a reference and strong
    memory barriers.
    
    v2: Order is important; only release after putting the pages!
    
    Fixes: c03467ba40f7 ("drm/i915/gem: Free pages before rcu-freeing the object")
    Testcase: igt/gem_busy/close-race
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Matthew Auld <matthew.auld@intel.com>
    Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20190703180601.10950-1-chris@chris-wilson.co.uk
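
The ordering problem described in the commit message can be illustrated with a toy, single-threaded model. This is not kernel code and not the patch itself. `Resv`, `GraceDomain` and `busy_query_racing_with_close` are hypothetical names, and the grace period is simplified to "no reader is currently inside a read section", which is stricter than real RCU.

```python
class Resv:
    """Hypothetical stand-in for an object's reservation (its fence list)."""

    def __init__(self, fences):
        self.fences = list(fences)
        self.freed = False

    def fini(self):
        # Models an immediate free: the fence list is gone once this runs.
        self.fences = None
        self.freed = True


class GraceDomain:
    """Toy grace-period tracker: callbacks run only when no reader is active.

    Real RCU waits only for readers that began before the callback was
    queued; waiting for every reader is a simplification.
    """

    def __init__(self):
        self.readers = 0
        self.pending = []

    def read_lock(self):
        self.readers += 1

    def read_unlock(self):
        self.readers -= 1
        if self.readers == 0:
            self._run_pending()

    def call_after_grace(self, callback):
        self.pending.append(callback)
        if self.readers == 0:
            self._run_pending()

    def _run_pending(self):
        callbacks, self.pending = self.pending, []
        for callback in callbacks:
            callback()


def busy_query_racing_with_close(defer_fini):
    """Return (fences seen by the reader, whether resv was freed afterwards)."""
    domain = GraceDomain()
    resv = Resv(["fence0"])

    domain.read_lock()                      # busy query enters its read section
    if defer_fini:
        domain.call_after_grace(resv.fini)  # close: release from the callback
    else:
        resv.fini()                         # close: immediate release
    seen = resv.fences                      # reader inspects the fences
    domain.read_unlock()                    # deferred callback may now run
    return seen, resv.freed


if __name__ == "__main__":
    print(busy_query_racing_with_close(defer_fini=False))
    print(busy_query_racing_with_close(defer_fini=True))
```

With the immediate release the reader finds the fences already gone, which is the toy analogue of reading poisoned memory. With the deferred release the reader still sees them, and the object is freed only after the reader has left.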
Comment 7 CI Bug Log 2019-07-05 06:24:48 UTC
A CI Bug Log filter associated to this bug has been updated:

{- HSW APL SKL KBL ICL: igt@gem_busy@close-race - dmesg-fail - general protection fault: 0000 [#1] PREEMPT SMP (NOPTI|PTI) -}
{+ HSW SNB APL SKL KBL ICL: igt@gem_busy@close-race - dmesg-fail - general protection fault: 0000 [#1] PREEMPT SMP (NOPTI|PTI) +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5082/shard-snb2/igt@gem_busy@close-race.html
Comment 8 CI Bug Log 2019-07-05 06:25:26 UTC
A CI Bug Log filter associated to this bug has been updated:

{- HSW APL KBL ICL: igt@runner@aborted - fail - Previous test: gem_busy (close-race) -}
{+ HSW SNB APL KBL ICL: igt@runner@aborted - fail - Previous test: gem_busy (close-race) +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5082/shard-snb2/igt@runner@aborted.html
Comment 9 CI Bug Log 2019-07-08 08:38:31 UTC
A CI Bug Log filter associated to this bug has been updated:

{- HSW SNB APL SKL KBL ICL: igt@gem_busy@close-race - dmesg-fail - general protection fault: 0000 [#1] PREEMPT SMP (NOPTI|PTI) -}
{+ HSW SNB APL SKL KBL CML CFL ICL: igt@gem_busy@close-race - dmesg-fail - general protection fault: 0000 [#1] PREEMPT SMP (NOPTI|PTI) +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_321/fi-cfl-8109u/igt@gem_busy@close-race.html
  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_321/fi-cfl-guc/igt@gem_busy@close-race.html
  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_321/fi-cml-u/igt@gem_busy@close-race.html
  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_321/fi-cml-u2/igt@gem_busy@close-race.html
Comment 10 CI Bug Log 2019-07-08 08:39:51 UTC
A CI Bug Log filter associated to this bug has been updated:

{- HSW SNB APL KBL ICL: igt@runner@aborted - fail - Previous test: gem_busy (close-race) -}
{+ HSW SNB APL KBL ICL: igt@runner@aborted - fail - Previous test: gem_busy (close-race) +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_321/fi-cfl-8109u/igt@runner@aborted.html
  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_321/fi-cfl-guc/igt@runner@aborted.html
  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_321/fi-cml-u/igt@runner@aborted.html
  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_321/fi-cml-u2/igt@runner@aborted.html
