Bug 112365 - [CI][DRMTIP]igt@runner@aborted - fail - TAINT_WARN: WARN_ON has happened.
Summary: [CI][DRMTIP]igt@runner@aborted - fail - TAINT_WARN: WARN_ON has happened.
Status: RESOLVED DUPLICATE of bug 109385
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: not set
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
 
Reported: 2019-11-21 08:39 UTC by Lakshmi
Modified: 2019-11-21 09:27 UTC

i915 platform: BSW/CHT
i915 features: GEM/Other


Description Lakshmi 2019-11-21 08:39:56 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_404/fi-bsw-nick/igt@runner@aborted.html
Aborting.
Previous test: nothing
Next test: kms_draw_crc (draw-method-rgb565-render-xtiled)

Kernel badly tainted (0x200) (check dmesg for details):
	(0x200) TAINT_WARN: WARN_ON has happened.
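
For readers unfamiliar with the taint value, the mask printed by the runner can be decoded bit by bit. The following is only a minimal sketch: the helper name decode_taint and the wording of the descriptions are illustrative, and the bit table is assumed to follow the kernel's tainted-kernels documentation, which can differ between kernel versions.

```python
# Illustrative decoder for the kernel taint bitmask (e.g. /proc/sys/kernel/tainted).
# Assumption: bit assignments follow Documentation/admin-guide/tainted-kernels.rst;
# check the documentation for the kernel version under test.
TAINT_FLAGS = {
    0: ("P", "proprietary module was loaded"),
    1: ("F", "module was force loaded"),
    2: ("S", "kernel running on an out-of-specification system"),
    3: ("R", "module was force unloaded"),
    4: ("M", "processor reported a Machine Check Exception"),
    5: ("B", "bad page referenced or unexpected page flags"),
    6: ("U", "taint requested by a userspace application"),
    7: ("D", "kernel died recently (OOPS or BUG)"),
    8: ("A", "ACPI table overridden by user"),
    9: ("W", "kernel issued a warning (WARN_ON)"),
    10: ("C", "staging driver was loaded"),
    11: ("I", "workaround for a platform firmware bug applied"),
    12: ("O", "externally built (out-of-tree) module was loaded"),
    13: ("E", "unsigned module was loaded"),
    14: ("L", "soft lockup occurred"),
    15: ("K", "kernel has been live patched"),
    16: ("X", "auxiliary taint, defined and used by distributions"),
    17: ("T", "kernel built with the struct randomization plugin"),
}


def decode_taint(mask):
    """Return a list of (letter, description) for every bit set in mask."""
    decoded = []
    for bit in range(mask.bit_length()):
        if mask & (1 << bit):
            # Bits missing from the table are reported rather than dropped.
            decoded.append(TAINT_FLAGS.get(bit, ("?", f"unknown bit {bit}")))
    return decoded


if __name__ == "__main__":
    for letter, description in decode_taint(0x200):
        print(f"{letter}: {description}")
```

Here 0x200 has only bit 9 set, which corresponds to the TAINT_WARN line reported above.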

Boot log
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_404/fi-bsw-nick/boot27.txt
<4>[   11.919831] ======================================================
<4>[   11.919840] WARNING: possible circular locking dependency detected
<4>[   11.919850] 5.4.0-rc7-g74911fbf004d-drmtip_404+ #1 Not tainted
<4>[   11.919857] ------------------------------------------------------
<4>[   11.919866] modprobe/273 is trying to acquire lock:
<4>[   11.919874] ffffffff9d43be50 (cpu_hotplug_lock.rw_sem){++++}, at: stop_machine+0x12/0x30
<4>[   11.919893] 
                  but task is already holding lock:
<4>[   11.919902] ffff9c7a9bd993c0 (&vm->mutex){+.+.}, at: i915_vma_pin+0xf3/0xfc0 [i915]
<4>[   11.920085] 
                  which lock already depends on the new lock.

<4>[   11.920095] 
                  the existing dependency chain (in reverse order) is:
<4>[   11.920105] 
                  -> #2 (&vm->mutex){+.+.}:
<4>[   11.920239]        i915_gem_shrinker_taints_mutex+0xa2/0xd0 [i915]
<4>[   11.920373]        i915_address_space_init+0xa9/0x160 [i915]
<4>[   11.920507]        i915_ggtt_init_hw+0x47/0x130 [i915]
<4>[   11.920624]        i915_driver_probe+0xc51/0x15b0 [i915]
<4>[   11.920743]        i915_pci_probe+0x43/0x1c0 [i915]
<4>[   11.920753]        pci_device_probe+0x9e/0x120
<4>[   11.920763]        really_probe+0xea/0x420
<4>[   11.920771]        driver_probe_device+0x10b/0x120
<4>[   11.920779]        device_driver_attach+0x4a/0x50
<4>[   11.920787]        __driver_attach+0x97/0x130
<4>[   11.920795]        bus_for_each_dev+0x74/0xc0
<4>[   11.920803]        bus_add_driver+0x142/0x220
<4>[   11.920810]        driver_register+0x56/0xf0
<4>[   11.920819]        do_one_initcall+0x58/0x2ff
<4>[   11.920827]        do_init_module+0x56/0x1f8
<4>[   11.920834]        load_module+0x243e/0x29f0
<4>[   11.920842]        __do_sys_finit_module+0xe9/0x110
<4>[   11.920849]        do_syscall_64+0x4f/0x210
<4>[   11.920859]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[   11.920866] 
                  -> #1 (fs_reclaim){+.+.}:
<4>[   11.920878]        fs_reclaim_acquire.part.117+0x24/0x30
<4>[   11.920888]        kmem_cache_alloc_trace+0x2a/0x2c0
<4>[   11.920896]        intel_cpuc_prepare+0x37/0x1a0
<4>[   11.920905]        cpuhp_invoke_callback+0x9b/0x9d0
<4>[   11.920913]        _cpu_up+0xa2/0x140
<4>[   11.920920]        do_cpu_up+0x61/0xa0
<4>[   11.920929]        smp_init+0x57/0x96
<4>[   11.920937]        kernel_init_freeable+0xac/0x1c7
<4>[   11.920946]        kernel_init+0x5/0x100
<4>[   11.920953]        ret_from_fork+0x3a/0x50
<4>[   11.920959] 
                  -> #0 (cpu_hotplug_lock.rw_sem){++++}:
<4>[   11.920971]        __lock_acquire+0x1328/0x15d0
<4>[   11.920979]        lock_acquire+0xa7/0x1c0
<4>[   11.920986]        cpus_read_lock+0x34/0xd0
<4>[   11.920994]        stop_machine+0x12/0x30
<4>[   11.921126]        bxt_vtd_ggtt_insert_entries__BKL+0x36/0x50 [i915]
<4>[   11.921259]        aliasing_gtt_bind_vma+0xec/0x190 [i915]
<4>[   11.921393]        i915_vma_bind+0x21d/0x460 [i915]
<4>[   11.921527]        i915_vma_pin+0x1f6/0xfc0 [i915]
<4>[   11.921650]        intel_gt_init+0x67/0x100 [i915]
<4>[   11.921783]        i915_gem_init+0x121/0x8d0 [i915]
<4>[   11.921900]        i915_driver_probe+0xb9d/0x15b0 [i915]
<4>[   11.922019]        i915_pci_probe+0x43/0x1c0 [i915]
<4>[   11.922027]        pci_device_probe+0x9e/0x120
<4>[   11.922035]        really_probe+0xea/0x420
<4>[   11.922043]        driver_probe_device+0x10b/0x120
<4>[   11.922051]        device_driver_attach+0x4a/0x50
<4>[   11.922059]        __driver_attach+0x97/0x130
<4>[   11.922067]        bus_for_each_dev+0x74/0xc0
<4>[   11.922075]        bus_add_driver+0x142/0x220
<4>[   11.922082]        driver_register+0x56/0xf0
<4>[   11.922089]        do_one_initcall+0x58/0x2ff
<4>[   11.922097]        do_init_module+0x56/0x1f8
<4>[   11.922104]        load_module+0x243e/0x29f0
<4>[   11.922111]        __do_sys_finit_module+0xe9/0x110
<4>[   11.922119]        do_syscall_64+0x4f/0x210
<4>[   11.922127]        entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[   11.922134] 
                  other info that might help us debug this:

<4>[   11.922145] Chain exists of:
                    cpu_hotplug_lock.rw_sem --> fs_reclaim --> &vm->mutex
Comment 1 CI Bug Log 2019-11-21 08:51:48 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* BSW: igt@runner@aborted - TAINT_WARN: WARN_ON has happened.
  - http://gfx-ci.fi.intel.com/tree/drm-tip/CUSTOM_mupuf-1568045297/fi-bsw-cyan/igt@runner@aborted.html
  - http://gfx-ci.fi.intel.com/tree/drm-tip/CUSTOM_mupuf-1568045297/fi-bsw-kefka/igt@runner@aborted.html
  - http://gfx-ci.fi.intel.com/tree/drm-tip/CUSTOM_mupuf-1568047078/fi-bsw-cyan/igt@runner@aborted.html
  - http://gfx-ci.fi.intel.com/tree/drm-tip/CUSTOM_mupuf-1568047078/fi-bsw-kefka/igt@runner@aborted.html
  - http://gfx-ci.fi.intel.com/tree/drm-tip/CUSTOM_mupuf-1568047078/fi-bsw-n3050/igt@runner@aborted.html
  - http://gfx-ci.fi.intel.com/tree/drm-tip/CUSTOM_mupuf-1568963895/fi-bsw-cyan/igt@runner@aborted.html
  - http://gfx-ci.fi.intel.com/tree/drm-tip/CUSTOM_mupuf-1568963895/fi-bsw-kefka/igt@runner@aborted.html
  - http://gfx-ci.fi.intel.com/tree/drm-tip/CUSTOM_mupuf-1568963895/fi-bsw-n3050/igt@runner@aborted.html
  - http://gfx-ci.fi.intel.com/tree/drm-tip/CUSTOM_mupuf-1568966012/fi-bsw-cyan/igt@runner@aborted.html
  - http://gfx-ci.fi.intel.com/tree/drm-tip/CUSTOM_mupuf-1568966012/fi-bsw-kefka/igt@runner@aborted.html
  - http://gfx-ci.fi.intel.com/tree/drm-tip/CUSTOM_mupuf-1568966012/fi-bsw-n3050/igt@runner@aborted.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14548/fi-bsw-kefka/igt@runner@aborted.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14548/fi-bsw-n3050/igt@runner@aborted.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14737/fi-bsw-kefka/igt@runner@aborted.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14737/fi-bsw-n3050/igt@runner@aborted.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14796/fi-bsw-kefka/igt@runner@aborted.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14824/fi-bsw-kefka/igt@runner@aborted.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_404/fi-bsw-nick/igt@runner@aborted.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14845/fi-bsw-kefka/igt@runner@aborted.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_5202/fi-bsw-kefka/igt@runner@aborted.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_5202/fi-bsw-n3050/igt@runner@aborted.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14968/fi-bsw-kefka/igt@runner@aborted.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15069/fi-bsw-kefka/igt@runner@aborted.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15069/fi-bsw-n3050/igt@runner@aborted.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_5244/fi-bsw-kefka/igt@runner@aborted.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_5244/fi-bsw-n3050/igt@runner@aborted.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15356/fi-bsw-kefka/igt@runner@aborted.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_15356/fi-bsw-n3050/igt@runner@aborted.html
Comment 2 Chris Wilson 2019-11-21 09:27:07 UTC
Same old story with others allocating under cpuhp polluting stop_machine(). We need stop_machine() to avoid bsw reading/writing into random memory, and we need to do so from within an allocating mutex at present. However, the inversion reported here is only during cpuhp and not possible for our runtime, so as low priority as ever.
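
As a rough illustration of what the splat in the description reports, the lock ordering can be modelled as a directed graph in which an edge "A -> B" means "B was acquired while A was held". This is a simplified sketch and not how lockdep is implemented; the function name find_cycle and the shortened lock names are assumptions made for the example. The two existing edges come from the "Chain exists of" line, and the new edge is the stop_machine() call made while holding &vm->mutex.

```python
# Toy model of a lock-order graph: an edge (held, wanted) records that
# `wanted` was acquired while `held` was already taken.
def find_cycle(edges):
    """Return one cycle as a list of lock names, or None if the order is acyclic."""
    graph = {}
    for held, wanted in edges:
        graph.setdefault(held, []).append(wanted)
        graph.setdefault(wanted, [])

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {node: WHITE for node in graph}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in graph[node]:
            if colour[nxt] == GREY:
                # Back edge to a lock on the current path: ordering is circular.
                return path[path.index(nxt):] + [nxt]
            if colour[nxt] == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        colour[node] = BLACK
        return None

    for node in graph:
        if colour[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Dependencies already recorded (#1 and #2 in the report).
existing = [
    ("cpu_hotplug_lock", "fs_reclaim"),  # allocation during the cpuhp callback
    ("fs_reclaim", "vm->mutex"),         # vm->mutex tainted against the shrinker
]
# New acquisition (#0): stop_machine() takes cpu_hotplug_lock under vm->mutex.
new_edge = ("vm->mutex", "cpu_hotplug_lock")

if __name__ == "__main__":
    print(find_cycle(existing))
    print(" -> ".join(find_cycle(existing + [new_edge])))
```

The model only shows why the ordering is flagged as circular. It says nothing about whether the paths can actually race, which is the point made in the comment above: the cpuhp edge is only taken during CPU bring-up.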

*** This bug has been marked as a duplicate of bug 109385 ***

