Bug 109831 - [CI][BAT] igt@i915_module_load@reload-with-fault-injection - incomplete - WARN_ON(!intel_irqs_enabled(dev_priv))
Summary: [CI][BAT] igt@i915_module_load@reload-with-fault-injection - incomplete - WAR...
Status: RESOLVED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel (show other bugs)
Version: DRI git
Hardware: Other All
: high normal
Assignee: Madhumitha Tolakanahalli Pradeep
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-03-05 11:02 UTC by Lakshmi
Modified: 2019-07-03 20:24 UTC (History)
1 user (show)

See Also:
i915 platform: KBL
i915 features: display/DP MST


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Lakshmi 2019-03-05 11:02:56 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5703/fi-kbl-7560u/igt@i915_module_load@reload-with-fault-injection.html

<4> [342.218287] CPU: 3 PID: 3536 Comm: i915_module_loa Tainted: G     U            5.0.0-CI-CI_DRM_5703+ #1
<4> [342.218368]  drm_fb_helper_fini.part.18+0xb3/0x100
<4> [342.218407]  intel_fbdev_fini+0x20/0x80 [i915]
<4> [342.218441]  intel_modeset_cleanup+0x9a/0x140 [i915]
<4> [342.218461]  i915_driver_unload+0xa7/0x110 [i915]
<4> [342.218481]  i915_pci_remove+0x19/0x30 [i915]
<4> [342.218539]  i915_exit+0x16/0x1c [i915]
<4> [342.594335] WARN_ON(!intel_irqs_enabled(dev_priv))
<4> [342.594362] WARNING: CPU: 1 PID: 627 at drivers/gpu/drm/i915/i915_irq.c:625 bdw_update_pipe_irq+0x108/0x120 [i915]
<4> [342.594455] Workqueue: events_long drm_dp_mst_link_probe_work
<4> [342.594596]  drm_vblank_enable+0x7c/0x120
<4> [342.594600]  drm_vblank_get+0x86/0xa0
<4> [342.594647]  intel_pipe_update_start+0xee/0x3f0 [i915]
<4> [342.594699]  intel_update_crtc+0xc7/0x3d0 [i915]
<4> [342.594777]  intel_atomic_commit_tail+0x207/0x12a0 [i915]
<4> [342.594834]  intel_atomic_commit+0x244/0x330 [i915]
<4> [342.594847]  drm_fb_helper_restore_fbdev_mode_unlocked+0x42/0x90
<4> [342.594852]  drm_fb_helper_set_par+0x24/0x50
<4> [342.594857]  drm_fb_helper_hotplug_event.part.24+0x9d/0xb0
<4> [342.594862]  drm_kms_helper_hotplug_event+0x21/0x30
<4> [342.594867]  drm_dp_send_link_address+0x16c/0x1d0
<4> [342.594872]  drm_dp_check_and_send_link_address+0x82/0xc0
<4> [342.594877]  drm_dp_mst_link_probe_work+0x54/0x70
<4> [342.595009] WARNING: CPU: 1 PID: 627 at drivers/gpu/drm/i915/i915_irq.c:625 bdw_update_pipe_irq+0x108/0x120 [i915]
<3> [353.098218] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:47:pipe A] flip_done timed out
<3> [363.338218] [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [CRTC:47:pipe A] flip_done timed out
<3> [373.578217] [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [PLANE:30:plane 1A] flip_done timed out
<3> [373.630266] [drm:intel_pipe_update_start [i915]] *ERROR* PSR idle timed out 0x40010006, atomic update may fail
<3> [373.633251] [drm:intel_pipe_update_start [i915]] *ERROR* Potential atomic update failure on pipe A
<3> [373.633302] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=83 end=84) time 8 us, min 1772, max 1799, scanline start 1799, end 1799
<3> [383.818217] [drm:drm_atomic_helper_wait_for_flip_done] *ERROR* [CRTC:47:pipe A] flip_done timed out
<4> [383.818346] WARNING: CPU: 3 PID: 3536 at drivers/gpu/drm/i915/intel_fbc.c:1155 intel_fbc_global_disable+0x5f/0x70 [i915]
<4> [383.818392] CPU: 3 PID: 3536 Comm: i915_module_loa Tainted: G     U  W         5.0.0-CI-CI_DRM_5703+ #1
<4> [383.818434] RIP: 0010:intel_fbc_global_disable+0x5f/0x70 [i915]
<4> [383.818531]  intel_modeset_cleanup+0xa7/0x140 [i915]
<4> [383.818554]  i915_driver_unload+0xa7/0x110 [i915]
<4> [383.818577]  i915_pci_remove+0x19/0x30 [i915]
<4> [383.818638]  i915_exit+0x16/0x1c [i915]
<4> [383.818776] WARNING: CPU: 3 PID: 3536 at drivers/gpu/drm/i915/intel_fbc.c:1155 intel_fbc_global_disable+0x5f/0x70 [i915]
<4> [383.830816] WARNING: CPU: 0 PID: 3536 at drivers/gpu/drm/drm_mode_config.c:456 drm_mode_config_cleanup+0x283/0x2d0
<4> [383.830859] CPU: 0 PID: 3536 Comm: i915_module_loa Tainted: G     U  W         5.0.0-CI-CI_DRM_5703+ #1
<4> [383.830872] RIP: 0010:drm_mode_config_cleanup+0x283/0x2d0
<4> [383.830973]  intel_modeset_cleanup+0xc3/0x140 [i915]
<4> [383.830994]  i915_driver_unload+0xa7/0x110 [i915]
<4> [383.831015]  i915_pci_remove+0x19/0x30 [i915]
<4> [383.831073]  i915_exit+0x16/0x1c [i915]
<4> [383.831183] WARNING: CPU: 0 PID: 3536 at drivers/gpu/drm/drm_mode_config.c:456 drm_mode_config_cleanup+0x283/0x2d0
<3> [383.831222] [drm:drm_mode_config_cleanup] *ERROR* connector eDP-1 leaked!
<4> [383.861445] WARNING: CPU: 2 PID: 3536 at drivers/gpu/drm/i915/intel_runtime_pm.c:4509 intel_runtime_pm_cleanup+0x2a/0x40 [i915]
<4> [383.861496] CPU: 2 PID: 3536 Comm: i915_module_loa Tainted: G     U  W         5.0.0-CI-CI_DRM_5703+ #1
<4> [383.861535] RIP: 0010:intel_runtime_pm_cleanup+0x2a/0x40 [i915]
<4> [383.861628]  i915_pci_remove+0x19/0x30 [i915]
<4> [383.861691]  i915_exit+0x16/0x1c [i915]
<4> [383.861830] WARNING: CPU: 2 PID: 3536 at drivers/gpu/drm/i915/intel_runtime_pm.c:4509 intel_runtime_pm_cleanup+0x2a/0x40 [i915]
<4> [383.871419] WARNING: CPU: 3 PID: 3536 at drivers/gpu/drm/drm_vblank.c:413 drm_vblank_cleanup+0x3b/0x70
<4> [383.871478] CPU: 3 PID: 3536 Comm: i915_module_loa Tainted: G     U  W         5.0.0-CI-CI_DRM_5703+ #1
<4> [383.871496] RIP: 0010:drm_vblank_cleanup+0x3b/0x70
<4> [383.871621]  drm_dev_fini+0x9/0xd0
<4> [383.871661]  i915_driver_release+0x19/0x30 [i915]
<4> [383.871695]  i915_pci_remove+0x21/0x30 [i915]
<4> [383.871780]  i915_exit+0x16/0x1c [i915]
<4> [383.871948] WARNING: CPU: 3 PID: 3536 at drivers/gpu/drm/drm_vblank.c:413 drm_vblank_cleanup+0x3b/0x70
<4> [384.929283] RIP: 0010:drm_setup_crtcs+0x13c/0xc90
<4> [384.929366]  ? drm_fb_helper_is_bound.isra.12+0x71/0xa0
<4> [384.929371]  drm_fb_helper_hotplug_event.part.24+0x84/0xb0
<4> [384.929377]  drm_kms_helper_hotplug_event+0x21/0x30
<0> [384.929487] CPU:1 [LOST 640719 EVENTS]
Comment 1 CI Bug Log 2019-03-05 11:05:08 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* KBL: igt@i915_module_load@reload-with-fault-injection - incomplete - WARN_ON(!intel_irqs_enabled(dev_priv)) 
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5703/fi-kbl-7560u/igt@i915_module_load@reload-with-fault-injection.html
Comment 2 Chris Wilson 2019-03-05 11:08:29 UTC
It's the unflushed worker:

<4>[  342.594335] WARN_ON(!intel_irqs_enabled(dev_priv))
<4>[  342.594362] WARNING: CPU: 1 PID: 627 at drivers/gpu/drm/i915/i915_irq.c:625 bdw_update_pipe_irq+0x108/0x120 [i915]
<4>[  342.594372] Modules linked in: i915(-) vgem cdc_ether r8152 snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic mei_hdcp x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul snd_hda_codec ghash_clmulni_intel snd_hwdep snd_hda_core mcs7830 usbnet mii snd_pcm thunderbolt mei_me mei prime_numbers i2c_hid btusb btrtl btbcm btintel bluetooth ecdh_generic pinctrl_sunrisepoint pinctrl_intel [last unloaded: i915]
<4>[  342.594425] CPU: 1 PID: 627 Comm: kworker/1:3 Tainted: G     U  W         5.0.0-CI-CI_DRM_5703+ #1
<4>[  342.594447] Hardware name: Dell Inc. XPS 13 9360/0823VW, BIOS 2.3.1 10/03/2017
<4>[  342.594455] Workqueue: events_long drm_dp_mst_link_probe_work
<4>[  342.594475] RIP: 0010:bdw_update_pipe_irq+0x108/0x120 [i915]
<4>[  342.594480] Code: a0 e8 7c 64 b4 e0 0f 0b 41 80 bc 24 e5 af 00 00 00 0f 85 41 ff ff ff 48 c7 c6 60 4a 70 a0 48 c7 c7 7a 1f 6f a0 e8 58 64 b4 e0 <0f> 0b 5b 5d 41 5c 41 5d c3 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00
<4>[  342.594492] RSP: 0018:ffffc900005239d0 EFLAGS: 00010082
<4>[  342.594496] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
<4>[  342.594501] RDX: 0000000000000002 RSI: 0000000000000001 RDI: ffffffff81123e20
<4>[  342.594524] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
<4>[  342.594529] R10: ffffc90000523970 R11: ffffffff82249a38 R12: ffff8881d5100000
<4>[  342.594535] R13: 0000000000000001 R14: ffff8881d51002b8 R15: ffff888271e3efd0
<4>[  342.594540] FS:  0000000000000000(0000) GS:ffff888276280000(0000) knlGS:0000000000000000
<4>[  342.594546] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  342.594551] CR2: 00005570b1d0a850 CR3: 0000000257b36004 CR4: 00000000003606e0
<4>[  342.594571] Call Trace:
<4>[  342.594590]  gen8_enable_vblank+0x33/0x60 [i915]
<4>[  342.594596]  drm_vblank_enable+0x7c/0x120
<4>[  342.594600]  drm_vblank_get+0x86/0xa0
<4>[  342.594647]  intel_pipe_update_start+0xee/0x3f0 [i915]
<4>[  342.594654]  ? wait_woken+0xa0/0xa0
<4>[  342.594699]  intel_update_crtc+0xc7/0x3d0 [i915]
<4>[  342.594729]  skl_update_crtcs+0x266/0x2b0 [i915]
<4>[  342.594777]  intel_atomic_commit_tail+0x207/0x12a0 [i915]
<4>[  342.594784]  ? flush_workqueue+0x236/0x530
<4>[  342.594788]  ? lockdep_hardirqs_on+0xe0/0x1b0
<4>[  342.594834]  intel_atomic_commit+0x244/0x330 [i915]
<4>[  342.594840]  restore_fbdev_mode_atomic+0x1da/0x1f0
<4>[  342.594847]  drm_fb_helper_restore_fbdev_mode_unlocked+0x42/0x90
<4>[  342.594852]  drm_fb_helper_set_par+0x24/0x50
<4>[  342.594857]  drm_fb_helper_hotplug_event.part.24+0x9d/0xb0
<4>[  342.594862]  drm_kms_helper_hotplug_event+0x21/0x30
<4>[  342.594867]  drm_dp_send_link_address+0x16c/0x1d0
<4>[  342.594872]  drm_dp_check_and_send_link_address+0x82/0xc0
<4>[  342.594877]  drm_dp_mst_link_probe_work+0x54/0x70
<4>[  342.594899]  process_one_work+0x245/0x610
<4>[  342.594905]  worker_thread+0x37/0x380
<4>[  342.594909]  ? process_one_work+0x610/0x610
<4>[  342.594913]  kthread+0x119/0x130
<4>[  342.594917]  ? kthread_park+0x80/0x80
<4>[  342.594922]  ret_from_fork+0x3a/0x50
<4>[  342.594927] irq event stamp: 1410618
<4>[  342.594932] hardirqs last  enabled at (1410617): [<ffffffff8197b80c>] _raw_spin_unlock_irqrestore+0x4c/0x60
<4>[  342.594955] hardirqs last disabled at (1410618): [<ffffffff8197445a>] __schedule+0xaa/0xb40
<4>[  342.594979] softirqs last  enabled at (1410284): [<ffffffff81c0033a>] __do_softirq+0x33a/0x4b9
<4>[  342.594987] softirqs last disabled at (1410277): [<ffffffff810b50f1>] irq_exit+0xd1/0xe0
<4>[  342.595009] WARNING: CPU: 1 PID: 627 at drivers/gpu/drm/i915/i915_irq.c:625 bdw_update_pipe_irq+0x108/0x120 [i915]

that appears to be the culprit.
Comment 3 Manasi 2019-06-06 19:52:25 UTC
This failure is seen on KBL machines and last seen 1 day ago.
Failure is in the igt@i915_module_load@reload-with-fault-injection test.
This test injects a failure at various points during the driver load to test whether the unload-load(reload) can happen correctly.

Failure analysis:
Looking through the dmesg logs, the failure occurs at the failure injection checkpoint 9. This injects failure at a point when the worker threads are already created.
Possible Cause:
When we unload the driver due to the failure, the driver is not cleaning up/killing the worker threads correctly. Due to this drm_dp_mst_link_probe_work() continues to do the modeset and expects the vblank irqs in crtc_enable() while the IRqs are already disabled due to the unload causing the warnings about IRqs not enabled.

Possible Fix:
This could be fixed by properly handling the worker thread cleanup/kill in intel_modeset_cleanup() or earlier.

Manasi
Comment 4 Chris Wilson 2019-07-03 20:24:39 UTC
commit 3b295cb1a411d9c82bbfaa66bc17a8508716ed07
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Jun 3 14:58:57 2019 +0100

    drm: Flush output polling on shutdown
    
    We need to mark the output polling as disabled to prevent concurrent
    irqs from queuing new work as shutdown the probe -- causing that work to
    execute after we have freed the structs:


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.