Bug 112155

Summary: [CI][DRMTIP]igt@gem_exec_reuse@baggage - incomplete - INFO: task java:\d+ blocked for more than d+ seconds.
Product: DRI Reporter: Lakshmi <lakshminarayana.vudum>
Component: DRM/IntelAssignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: RESOLVED FIXED QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: minor    
Priority: low CC: intel-gfx-bugs
Version: DRI git   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: I965G i915 features: GEM/Other

Description Lakshmi 2019-10-28 15:45:33 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_392/fi-bwr-2160/igt@gem_exec_reuse@baggage.html
<6> [54.497961] Console: switching to colour dummy device 80x25
<6> [54.498138] [IGT] gem_exec_reuse: executing
<5> [54.510115] Setting dangerous option reset - tainting kernel
<6> [54.520546] gem_exec_reuse (940): drop_caches: 4
<6> [59.457799] [IGT] gem_exec_reuse: starting subtest baggage
<3> [185.783792] INFO: task java:773 blocked for more than 61 seconds.
<3> [185.783910]       Tainted: G     U  W         5.4.0-rc4-g87aff128f9ba-drmtip_392+ #1
<3> [185.783986] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<6> [185.784062] java            D13752   773    743 0x00000000
<4> [185.784078] Call Trace:
<4> [185.784099]  ? __schedule+0x2ef/0x7f0
<4> [185.784114]  ? rwsem_down_read_slowpath+0x28d/0x600
<4> [185.784125]  schedule+0x34/0xc0
<4> [185.784133]  rwsem_down_read_slowpath+0x28d/0x600
<4> [185.784161]  ? down_read+0x85/0x150
<4> [185.784168]  down_read+0x85/0x150
<4> [185.784177]  __do_page_fault+0x388/0x4f0
<4> [185.784195]  page_fault+0x34/0x40
<4> [185.784203] RIP: 0033:0x7fbbef2687de
<4> [185.784217] Code: Bad RIP value.
<4> [185.784223] RSP: 002b:00007fbbd8ffec20 EFLAGS: 00010202
<4> [185.784231] RAX: 00007fbbf06b4600 RBX: 0000000000000023 RCX: 0000000000000018
<4> [185.784237] RDX: 0000000000000001 RSI: 0000000000017974 RDI: 00007ffe82d9a080
<4> [185.784242] RBP: 00007fbbd8ffec30 R08: 00007ffe82d9a090 R09: 00007fbbd8ffebd0
<4> [185.784247] R10: 0000000000000069 R11: 000000001a4a3a53 R12: 00007fbbef8326f2
<4> [185.784252] R13: 00007fbbef837f28 R14: 00007fbbe8096800 R15: 00007fbbe8007a00
<3> [185.784281] INFO: task java:778 blocked for more than 61 seconds.
<3> [185.784343]       Tainted: G     U  W         5.4.0-rc4-g87aff128f9ba-drmtip_392+ #1
<3> [185.784417] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<6> [185.784493] java            D12872   778    743 0x00000000
<4> [185.784506] Call Trace:
<4> [185.784520]  ? __schedule+0x2ef/0x7f0
<4> [185.784534]  ? rwsem_down_read_slowpath+0x28d/0x600
<4> [185.784545]  schedule+0x34/0xc0
<4> [185.784553]  rwsem_down_read_slowpath+0x28d/0x600
<4> [185.784581]  ? down_read+0x85/0x150
<4> [185.784587]  down_read+0x85/0x150
<4> [185.784595]  __do_page_fault+0x388/0x4f0
<4> [185.784612]  page_fault+0x34/0x40
<4> [185.784619] RIP: 0033:0x7fbbf000843d
<4> [185.784628] Code: Bad RIP value.
<4> [185.784634] RSP: 002b:00007fbbd86ea148 EFLAGS: 00010202
<4> [185.784641] RAX: 00007fbbd95a2ff0 RBX: 00000000f63cccb0 RCX: 0000000000040000
<4> [185.784647] RDX: 0000000000000080 RSI: 00007fbbc8704ed0 RDI: 00007fbbd95a3080
<4> [185.784652] RBP: 00007fbbd86ea270 R08: 00007fbbd95a3070 R09: 0000000000000000
<4> [185.784657] R10: 0000000000000000 R11: 00000000ffffffff R12: 0000000000000001
<4> [185.784662] R13: 00007fbbd95a2850 R14: 0000000000000000 R15: 00007fbbd95a2ec0
<3> [185.784690] INFO: task java:780 blocked for more than 61 seconds.
<3> [185.784834]       Tainted: G     U  W         5.4.0-rc4-g87aff128f9ba-drmtip_392+ #1
<3> [185.784909] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<6> [185.784985] java            D13824   780    743 0x00000000
<4> [185.784999] Call Trace:
<4> [185.785015]  ? __schedule+0x2ef/0x7f0
<4> [185.785029]  ? rwsem_down_read_slowpath+0x28d/0x600
<4> [185.785040]  schedule+0x34/0xc0
<4> [185.785048]  rwsem_down_read_slowpath+0x28d/0x600
<4> [185.785076]  ? down_read+0x85/0x150
<4> [185.785083]  down_read+0x85/0x150
<4> [185.785091]  __do_page_fault+0x388/0x4f0
<4> [185.785109]  page_fault+0x34/0x40
<4> [185.785116] RIP: 0033:0x7fbbef208ede
<4> [185.785126] Code: Bad RIP value.
<4> [185.785132] RSP: 002b:00007fbbd84e8d60 EFLAGS: 00010202
<4> [185.785139] RAX: 000000000029d1e0 RBX: 00007fbbf06b5a10 RCX: 0000000000000002
<4> [185.785145] RDX: 00007fbbeeba3310 RSI: 00007fbbef8385ec RDI: 00007fbbe80110a0
<4> [185.785150] RBP: 00007fbbd84e8d70 R08: 00007ffe82d9a090 R09: 0000000000000032
<4> [185.785155] R10: 0000000000000074 R11: 00000000a8521547 R12: 0000000000000008
<4> [185.785160] R13: 00007fbbe801dc10 R14: 0000000000000002 R15: 0000000000000001
<3> [185.785191] INFO: task java:788 blocked for more than 61 seconds.
<3> [185.785252]       Tainted: G     U  W         5.4.0-rc4-g87aff128f9ba-drmtip_392+ #1
<3> [185.785326] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<6> [185.785402] java            D13584   788    743 0x00000000
<4> [185.785416] Call Trace:
<4> [185.785431]  ? __schedule+0x2ef/0x7f0
<4> [185.785445]  ? rwsem_down_read_slowpath+0x28d/0x600
<4> [185.785456]  schedule+0x34/0xc0
<4> [185.785464]  rwsem_down_read_slowpath+0x28d/0x600
<4> [185.785492]  ? down_read+0x85/0x150
<4> [185.785499]  down_read+0x85/0x150
<4> [185.785507]  __do_page_fault+0x388/0x4f0
<4> [185.785524]  page_fault+0x34/0x40
<4> [185.785531] RIP: 0033:0x7fbbeff2e0dd
<4> [185.785542] Code: Bad RIP value.
<4> [185.785547] RSP: 002b:00007fbbb84d7750 EFLAGS: 00010202
<4> [185.785554] RAX: 0000000000014731 RBX: 0000000000000b21 RCX: 00007fbb9800bdb0
<4> [185.785560] RDX: 0000000000000b25 RSI: 0000000000000004 RDI: 00007fbb9800c8d0
<4> [185.785565] RBP: ffffffffffffffb0 R08: 0000000000000077 R09: 0000000000000000
<4> [185.785570] R10: 00007fbb98000e30 R11: 0000000000000000 R12: 00000000000000b0
<4> [185.785575] R13: 00007fbb98000080 R14: 00007fbb98000020 R15: 0000000000000000
<3> [185.785605] INFO: task java:796 blocked for more than 61 seconds.
<3> [185.785667]       Tainted: G     U  W         5.4.0-rc4-g87aff128f9ba-drmtip_392+ #1
<3> [185.785785] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<6> [185.785863] java            D13304   796    743 0x00000000
<4> [185.785876] Call Trace:
<4> [185.785892]  ? __schedule+0x2ef/0x7f0
<4> [185.785907]  ? rwsem_down_read_slowpath+0x28d/0x600
<4> [185.785917]  schedule+0x34/0xc0
<4> [185.785925]  rwsem_down_read_slowpath+0x28d/0x600
<4> [185.785953]  ? down_read+0x85/0x150
<4> [185.785959]  down_read+0x85/0x150
<4> [185.785967]  __do_page_fault+0x388/0x4f0
<4> [185.785984]  page_fault+0x34/0x40
<4> [185.785991] RIP: 0033:0x7fbbeff2e0dd
<4> [185.786001] Code: Bad RIP value.
<4> [185.786006] RSP: 002b:00007fbb9f9b5fc0 EFLAGS: 00010206
<4> [185.786014] RAX: 0000000000012db1 RBX: 0000000000007ff1 RCX: 00007fbbc0006260
<4> [185.786019] RDX: 0000000000007ff5 RSI: 0000000000000004 RDI: 00007fbbc000e250
<4> [185.786025] RBP: 0000000000000010 R08: 00007fbbc00006b0 R09: 0000000000000000
<4> [185.786030] R10: 00007fbbc0000c90 R11: 0000000000000002 R12: 00000000000007fd
<4> [185.786035] R13: 00007fbbc0000080 R14: 00007fbbc0000020 R15: 0000000000000000
<4> [185.786068] 
Showing all locks held in the system:
<4> [185.786081] 1 lock held by khungtaskd/22:
<4> [185.786086]  #0: ffffffffaa4481e0 (rcu_read_lock){....}, at: debug_show_all_locks+0xe/0x1a0
<4> [185.786109] 3 locks held by kswapd0/132:
<4> [185.786114]  #0: ffffffffaa464ac0 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x0/0x30
<4> [185.786129]  #1: ffffffffaa459e68 (shrinker_rwsem){++++}, at: shrink_slab+0x46/0x2c0
<4> [185.786143]  #2: ffff8f1b2bbd93c0 (&vm->mutex){+.+.}, at: i915_vma_unbind+0x21/0x50 [i915]
<4> [185.786369] 4 locks held by systemd-resolve/250:
<4> [185.786374]  #0: ffff8f1b3c3eed30 (&ei->i_mmap_sem){++++}, at: ext4_filemap_fault+0x22/0x39
<4> [185.786390]  #1: ffffffffaa464ac0 (fs_reclaim){+.+.}, at: fs_reclaim_acquire.part.117+0x0/0x30
<4> [185.786403]  #2: ffffffffaa459e68 (shrinker_rwsem){++++}, at: shrink_slab+0x46/0x2c0
<4> [185.786416]  #3: ffff8f1b2bbd93c0 (&vm->mutex){+.+.}, at: i915_vma_unbind+0x21/0x50 [i915]
<4> [185.786580] 1 lock held by in:imklog/307:
<4> [185.786591] 1 lock held by java/773:
<4> [185.786595]  #0: ffff8f1b371288d8 (&mm->mmap_sem#2){++++}, at: __do_page_fault+0x388/0x4f0
<4> [185.786612] 4 locks held by java/777:
<4> [185.786617]  #0: ffff8f1b371288d8 (&mm->mmap_sem#2){++++}, at: __do_page_fault+0x131/0x4f0
<4> [185.786630]  #1: ffffffffaa464ac0 (fs_reclaim){+.+.}, at: fs_reclaim_acquire.part.117+0x0/0x30
<4> [185.786642]  #2: ffffffffaa459e68 (shrinker_rwsem){++++}, at: shrink_slab+0x46/0x2c0
<4> [185.786655]  #3: ffff8f1b2bbd93c0 (&vm->mutex){+.+.}, at: i915_vma_unbind+0x21/0x50 [i915]
<4> [185.786855] 1 lock held by java/778:
<4> [185.786861]  #0: ffff8f1b371288d8 (&mm->mmap_sem#2){++++}, at: __do_page_fault+0x388/0x4f0
<4> [185.786883] 1 lock held by java/780:
<4> [185.786888]  #0: ffff8f1b371288d8 (&mm->mmap_sem#2){++++}, at: __do_page_fault+0x388/0x4f0
<4> [185.786908] 1 lock held by java/784:
<4> [185.786913]  #0: ffff8f1b371288d8 (&mm->mmap_sem#2){++++}, at: do_mprotect_pkey+0xaa/0x2f0
<4> [185.786933] 1 lock held by java/785:
<4> [185.786939]  #0: ffff8f1b371288d8 (&mm->mmap_sem#2){++++}, at: do_mprotect_pkey+0xaa/0x2f0
<4> [185.786957] 1 lock held by java/787:
<4> [185.786962]  #0: ffff8f1b371288d8 (&mm->mmap_sem#2){++++}, at: __x64_sys_brk+0x4a/0x240
<4> [185.786979] 1 lock held by java/788:
<4> [185.786985]  #0: ffff8f1b371288d8 (&mm->mmap_sem#2){++++}, at: __do_page_fault+0x388/0x4f0
<4> [185.787005] 1 lock held by java/792:
<4> [185.787010]  #0: ffff8f1b371288d8 (&mm->mmap_sem#2){++++}, at: __do_page_fault+0x388/0x4f0
<4> [185.787027] 1 lock held by java/793:
<4> [185.787032]  #0: ffff8f1b371288d8 (&mm->mmap_sem#2){++++}, at: do_mprotect_pkey+0xaa/0x2f0
<4> [185.787051] 1 lock held by java/796:
<4> [185.787056]  #0: ffff8f1b371288d8 (&mm->mmap_sem#2){++++}, at: __do_page_fault+0x388/0x4f0
<4> [185.787074] 1 lock held by java/797:
<4> [185.787080]  #0: ffff8f1b371288d8 (&mm->mmap_sem#2){++++}, at: __do_page_fault+0x388/0x4f0
<4> [185.787097] 1 lock held by java/798:
<4> [185.787103]  #0: ffff8f1b371288d8 (&mm->mmap_sem#2){++++}, at: __do_page_fault+0x388/0x4f0
<4> [185.787120] 1 lock held by java/799:
<4> [185.787126]  #0: ffff8f1b371288d8 (&mm->mmap_sem#2){++++}, at: __do_page_fault+0x388/0x4f0
<4> [185.787144] 1 lock held by java/800:
<4> [185.787150]  #0: ffff8f1b371288d8 (&mm->mmap_sem#2){++++}, at: vm_mmap_pgoff+0x6a/0xf0
<4> [185.787169] 1 lock held by java/805:
<4> [185.787175]  #0: ffff8f1b371288d8 (&mm->mmap_sem#2){++++}, at: do_mprotect_pkey+0xaa/0x2f0
<4> [185.787193] 1 lock held by java/944:
<4> [185.787198]  #0: ffff8f1b371288d8 (&mm->mmap_sem#2){++++}, at: __do_page_fault+0x388/0x4f0
<4> [185.787212] 1 lock held by java/945:
<4> [185.787217]  #0: ffff8f1b371288d8 (&mm->mmap_sem#2){++++}, at: __do_page_fault+0x388/0x4f0
<4> [185.787231] 5 locks held by dmesg/915:
<4> [185.787236]  #0: ffff8f1b2a58b920 (&f->f_pos_lock){+.+.}, at: __fdget_pos+0x3a/0x50
<4> [185.787250]  #1: ffff8f1b3a494408 (sb_writers#3){.+.+}, at: vfs_write+0x1a4/0x1d0
<4> [185.787263]  #2: ffff8f1b30748c38 (&sb->s_type->i_mutex_key#12){+.+.}, at: ext4_file_write_iter+0x65/0x390
<4> [185.787277]  #3: ffff8f1b3a485758 (jbd2_handle){++++}, at: start_this_handle+0x11a/0x4d0
<4> [185.787292]  #4: ffff8f1b30748a58 (&ei->i_data_sem){++++}, at: ext4_da_get_block_prep+0x1e8/0x3b0
<4> [185.787306] 3 locks held by igt_runner/932:
<4> [185.787311]  #0: ffff8f1b3a494408 (sb_writers#3){.+.+}, at: vfs_write+0x1a4/0x1d0
<4> [185.787323]  #1: ffff8f1b34bf5ca8 (&sb->s_type->i_mutex_key#12){+.+.}, at: ext4_file_write_iter+0x65/0x390
<4> [185.787337]  #2: ffff8f1b3a485758 (jbd2_handle){++++}, at: start_this_handle+0x11a/0x4d0
<4> [185.787351] 7 locks held by gem_exec_reuse/940:
<4> [185.787355]  #0: ffff8f1b2bbd00c8 (&dev->struct_mutex){+.+.}, at: i915_gem_do_execbuffer+0x5f0/0x2410 [i915]
<4> [185.787509]  #1: ffff8f1b2d352c70 (&kernel#2){+.+.}, at: i915_request_create+0x16/0x1b0 [i915]
<4> [185.787656]  #2: ffff989c401cfc60 (reservation_ww_class_acquire){+.+.}, at: i915_gem_execbuffer2_ioctl+0x11b/0x460 [i915]
<4> [185.787841]  #3: ffff8f1b30e05f90 (reservation_ww_class_mutex){+.+.}, at: i915_gem_do_execbuffer+0x1911/0x2410 [i915]
<4> [185.787986]  #4: ffffffffaa464ac0 (fs_reclaim){+.+.}, at: fs_reclaim_acquire.part.117+0x0/0x30
<4> [185.788000]  #5: ffffffffaa459e68 (shrinker_rwsem){++++}, at: shrink_slab+0x46/0x2c0
<4> [185.788013]  #6: ffff8f1b2bbd93c0 (&vm->mutex){+.+.}, at: i915_vma_unbind+0x21/0x50 [i915]
<4> [185.788169] 
<4> [185.788174] =============================================
Comment 1 CI Bug Log 2019-10-28 15:47:12 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* BWR: igt@gem_exec_reuse@baggage - incomplete - INFO: task java:\d+ blocked for more than d+ seconds.
  - https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_392/fi-bwr-2160/igt@gem_exec_reuse@baggage.html
Comment 2 Chris Wilson 2019-10-28 15:49:17 UTC
Looks like shrinker inversion onto ext4:&ei->i_mmap_semm.
Comment 3 Francesco Balestrieri 2019-11-11 11:06:13 UTC
Looks like a one-off but let's keep watching.
Comment 4 Chris Wilson 2019-11-20 17:26:41 UTC
commit a6edbca74b305adc165e67065d7ee766006e6a48
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Nov 20 16:55:13 2019 +0000

    drm/i915/gt: Close race between engine_park and intel_gt_retire_requests

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.