Bug 104695

Summary: [BAT] [SKL GUC only] igt@pm_rpm@basic-[pci-d3-state | rte] -dmesg-warn - WARN_ON(!intel_irqs_enabled(dev_priv))
Product: DRI Reporter: Marta Löfstedt <marta.lofstedt>
Component: DRM/IntelAssignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED FIXED QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal    
Priority: medium CC: intel-gfx-bugs, john.a.spotswood
Version: DRI git   
Hardware: Other   
OS: All   
Whiteboard: ReadyForDev
i915 platform: SKL i915 features: firmware/guc

Description Marta Löfstedt 2018-01-19 07:44:39 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3648/fi-skl-guc/igt@pm_rpm@basic-pci-d3-state.html
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_3648/fi-skl-guc/igt@pm_rpm@basic-rte.html

[  351.778210] WARN_ON(!intel_irqs_enabled(dev_priv))
[  351.778235] WARNING: CPU: 0 PID: 4066 at drivers/gpu/drm/i915/i915_irq.c:353 gen6_unmask_pm_irq+0x23/0x30 [i915]
[  351.778237] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul snd_hda_intel crc32_pclmul snd_hda_codec e1000e ghash_clmulni_intel snd_hwdep snd_hda_core mei_me ptp snd_pcm mei pps_core prime_numbers
[  351.778252] CPU: 0 PID: 4066 Comm: pm_rpm Tainted: G     U           4.15.0-rc8-CI-CI_DRM_3648+ #1
[  351.778254] Hardware name: System manufacturer System Product Name/Z170 PRO GAMING, BIOS 3402 04/26/2017
[  351.778268] RIP: 0010:gen6_unmask_pm_irq+0x23/0x30 [i915]
[  351.778270] RSP: 0018:ffffc90002a0b988 EFLAGS: 00010096
[  351.778272] RAX: 0000000000000026 RBX: ffff88021f100000 RCX: 0000000000000002
[  351.778273] RDX: 0000000080000002 RSI: 0000000000000001 RDI: ffffffff810eaaa8
[  351.778274] RBP: ffff88021f1046a0 R08: 0000000000000000 R09: 0000000000000000
[  351.778275] R10: 0000000000000000 R11: ffffffff810eaa74 R12: ffffffffa02b4980
[  351.778276] R13: 0000000000000000 R14: 0000000000000004 R15: ffffffff814cb560
[  351.778277] FS:  00007fbb5c6aca40(0000) GS:ffff880236c00000(0000) knlGS:0000000000000000
[  351.778279] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  351.778280] CR2: 00007f3582bcc530 CR3: 00000001a33a0005 CR4: 00000000003606f0
[  351.778281] Call Trace:
[  351.778296]  gen9_enable_guc_interrupts+0x59/0x90 [i915]
[  351.778318]  intel_guc_resume+0xdd/0x150 [i915]
[  351.778333]  intel_runtime_resume+0x6c/0x210 [i915]
[  351.778335]  pci_pm_runtime_resume+0x6e/0x90
[  351.778338]  __rpm_callback+0xb1/0x1e0
[  351.778340]  ? pci_restore_standard_config+0x40/0x40
[  351.778343]  rpm_callback+0x1a/0x70
[  351.778344]  ? pci_restore_standard_config+0x40/0x40
[  351.778346]  rpm_resume+0x4d7/0x770
[  351.778349]  __pm_runtime_resume+0x42/0x80
[  351.778366]  intel_runtime_pm_get+0x1c/0xa0 [i915]
[  351.778388]  intel_pin_and_fence_fb_obj+0x83/0x160 [i915]
[  351.778411]  intel_prepare_plane_fb+0x1bf/0x570 [i915]
[  351.778414]  drm_atomic_helper_prepare_planes+0x47/0xd0
[  351.778435]  intel_atomic_commit+0xa9/0x330 [i915]
[  351.778438]  drm_atomic_helper_set_config+0x7b/0x90
[  351.778440]  __drm_mode_set_config_internal+0x5c/0x110
[  351.778442]  drm_mode_setcrtc+0x457/0x5d0
[  351.778445]  ? lock_acquire+0xaf/0x200
[  351.778449]  ? drm_mode_getcrtc+0x160/0x160
[  351.778451]  drm_ioctl_kernel+0x60/0xa0
[  351.778453]  drm_ioctl+0x290/0x330
[  351.778455]  ? drm_mode_getcrtc+0x160/0x160
[  351.778458]  ? task_work_run+0x39/0xb0
[  351.778461]  do_vfs_ioctl+0x8a/0x670
[  351.778464]  ? get_task_pid+0x150/0x150
[  351.778466]  SyS_ioctl+0x36/0x70
[  351.778469]  entry_SYSCALL_64_fastpath+0x22/0x8f
[  351.778471] RIP: 0033:0x7fbb5a586ef7
[  351.778472] RSP: 002b:00007ffd72d1d8f8 EFLAGS: 00000246
[  351.778475] Code: 0f 1f 84 00 00 00 00 00 80 bf c5 7c 00 00 00 74 07 89 f2 e9 90 d4 ff ff 48 c7 c6 50 d8 2d a0 48 c7 c7 be db 2c a0 e8 5d 4c ee e0 <0f> ff c3 66 2e 0f 1f 84 00 00 00 00 00 8b 05 4a 41 0f e2 55 89 
[  351.778510] ---[ end trace fc2dba5621bf5af9 ]---
[  351.805206] ------------[ cut here ]------------
[  351.805208] WARN_ON(!intel_irqs_enabled(dev_priv))
[  351.805249] WARNING: CPU: 1 PID: 4066 at drivers/gpu/drm/i915/i915_irq.c:353 gen6_unmask_pm_irq+0x23/0x30 [i915]
[  351.805252] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic i915 x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul snd_hda_intel crc32_pclmul snd_hda_codec e1000e ghash_clmulni_intel snd_hwdep snd_hda_core mei_me ptp snd_pcm mei pps_core prime_numbers
[  351.805279] CPU: 1 PID: 4066 Comm: pm_rpm Tainted: G     U  W        4.15.0-rc8-CI-CI_DRM_3648+ #1
[  351.805282] Hardware name: System manufacturer System Product Name/Z170 PRO GAMING, BIOS 3402 04/26/2017
[  351.805307] RIP: 0010:gen6_unmask_pm_irq+0x23/0x30 [i915]
[  351.805310] RSP: 0018:ffffc90002a0b9a8 EFLAGS: 00010092
[  351.805313] RAX: 0000000000000026 RBX: ffff88021f100000 RCX: 0000000000000002
[  351.805315] RDX: 0000000080000002 RSI: 0000000000000001 RDI: ffffffff810eaaa8
[  351.805317] RBP: ffff88021f1046a0 R08: 0000000000000000 R09: 0000000000000000
[  351.805319] R10: 0000000000000000 R11: ffffffff810eaa74 R12: ffffffffa02b4980
[  351.805321] R13: 0000000000000000 R14: 0000000000000004 R15: ffffffff814cb560
[  351.805324] FS:  00007fbb5c6aca40(0000) GS:ffff880236c40000(0000) knlGS:0000000000000000
[  351.805326] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  351.805328] CR2: 00007f317fbea1f0 CR3: 00000001a33a0006 CR4: 00000000003606e0
[  351.805330] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  351.805332] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  351.805334] Call Trace:
[  351.805361]  gen9_enable_guc_interrupts+0x59/0x90 [i915]
[  351.805403]  intel_guc_resume+0xdd/0x150 [i915]
[  351.805429]  intel_runtime_resume+0x6c/0x210 [i915]
[  351.805433]  pci_pm_runtime_resume+0x6e/0x90
[  351.805438]  __rpm_callback+0xb1/0x1e0
[  351.805442]  ? pci_restore_standard_config+0x40/0x40
[  351.805446]  rpm_callback+0x1a/0x70
[  351.805448]  ? pci_restore_standard_config+0x40/0x40
[  351.805452]  rpm_resume+0x4d7/0x770
[  351.805455]  ? wait_woken+0x90/0x90
[  351.805460]  __pm_runtime_resume+0x42/0x80
[  351.805490]  intel_runtime_pm_get+0x1c/0xa0 [i915]
[  351.805519]  intel_display_power_get+0x16/0x40 [i915]
[  351.805560]  intel_atomic_commit_tail+0xb52/0xce0 [i915]
[  351.805601]  intel_atomic_commit+0x256/0x330 [i915]
[  351.805606]  drm_atomic_helper_set_config+0x7b/0x90
[  351.805610]  __drm_mode_set_config_internal+0x5c/0x110
[  351.805613]  drm_mode_setcrtc+0x457/0x5d0
[  351.805619]  ? lock_acquire+0xaf/0x200
[  351.805626]  ? drm_mode_getcrtc+0x160/0x160
[  351.805630]  drm_ioctl_kernel+0x60/0xa0
[  351.805634]  drm_ioctl+0x290/0x330
[  351.805638]  ? drm_mode_getcrtc+0x160/0x160
[  351.805643]  ? task_work_run+0x39/0xb0
[  351.805648]  do_vfs_ioctl+0x8a/0x670
[  351.805652]  ? get_task_pid+0x150/0x150
[  351.805657]  SyS_ioctl+0x36/0x70
[  351.805662]  entry_SYSCALL_64_fastpath+0x22/0x8f
[  351.805664] RIP: 0033:0x7fbb5a586ef7
[  351.805666] RSP: 002b:00007ffd72d1d8f8 EFLAGS: 00000246
[  351.805671] Code: 0f 1f 84 00 00 00 00 00 80 bf c5 7c 00 00 00 74 07 89 f2 e9 90 d4 ff ff 48 c7 c6 50 d8 2d a0 48 c7 c7 be db 2c a0 e8 5d 4c ee e0 <0f> ff c3 66 2e 0f 1f 84 00 00 00 00 00 8b 05 4a 41 0f e2 55 89 
[  351.805737] ---[ end trace fc2dba5621bf5afa ]---
Comment 1 Marta Löfstedt 2018-01-19 07:45:45 UTC
Also, see bug 104693 and bug 104694
Comment 2 Chris Wilson 2018-01-19 21:23:21 UTC
commit bd724318b682587ad2f989ab8e0f7b3d4486ced5
Author: Michal Wajdeczko <michal.wajdeczko@intel.com>
Date:   Fri Jan 19 12:49:26 2018 +0000

    drm/i915/guc: Keep GuC log disabled by default
    
    It looks that GuC log functionality is not fully functional yet and
    causes issues when enabled by auto(-1) modparam on debug builds.
    For example, but not limited to:
    
    [   30.062893] ======================================================
    [   30.062894] WARNING: possible circular locking dependency detected
    [   30.062895] 4.15.0-rc8-CI-CI_DRM_3648+ #1 Tainted: G     U
    [   30.062896] ------------------------------------------------------
    [   30.062897] debugfs_test/1268 is trying to acquire lock:
    [   30.062898]  (&dev->struct_mutex){+.+.}, at: [<00000000e4213449>] i915_mutex_lock_interruptible+0x47/0x130 [i915]
    [   30.062921]
                   but task is already holding lock:
    [   30.062921]  (&mm->mmap_sem){++++}, at: [<00000000dd7adc93>] __do_page_fault+0x106/0x560
    [   30.062924]
                   which lock already depends on the new lock.
    
    References: 0ed87953532652 ("drm/i915/guc: Redefine guc_log_level modparam values")
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104693
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104694
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104695
    Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
    Cc: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Jani Saarinen <jani.saarinen@intel.com>
    Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
    Cc: Marta Lofstedt <marta.lofstedt@intel.com>
    Cc: Michal Winiarski <michal.winiarski@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20180119124926.29844-1-michal.wajdeczko@intel.com
    Reviewed-by: Michal Winiarski <michal.winiarski@intel.com>
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Comment 3 Marta Löfstedt 2018-01-22 09:17:51 UTC
This fix is hiding the issue since CI_DRM_3662. I can't close the bug, since I know that the issue is still there. However, I will archive from cibuglog.
Comment 4 Sagar Kamble 2018-01-25 00:26:36 UTC
Hi Marta,

Fix has been merged for this today
https://patchwork.freedesktop.org/patch/200405/

CI BAT also confirmed resolution with GuC log enabled by default.
But change to enable GuC log by default is gated by https://bugs.freedesktop.org/show_bug.cgi?id=104694 and will be addressed next.

Can we mark this bug as CLOSED now.

Thanks,
Sagar

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.