Bug 109383

Summary: [CI][BAT] igt@*suspend* - dmesg-warn - i915->runtime_pm.wakeref_count=1 on cleanup
Product: DRI
Reporter: Martin Peres <martin.peres>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: RESOLVED FIXED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal
Priority: medium
CC: intel-gfx-bugs, sudeep.dutt
Version: XOrg git
Hardware: Other
OS: All
Whiteboard: ReadyForDev
i915 platform: ILK, KBL
i915 features: power/runtime PM

Description Martin Peres 2019-01-18 13:07:22 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5446/fi-ilk-m540/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html

<4> [273.397923] ------------[ cut here ]------------
<4> [273.397930] i915->runtime_pm.wakeref_count=1 on cleanup
<4> [273.398025] WARNING: CPU: 3 PID: 2427 at drivers/gpu/drm/i915/intel_runtime_pm.c:4476 intel_runtime_pm_cleanup+0x2a/0x40 [i915]
<4> [273.398027] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_generic i915 coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm mei_me intel_ips lpc_ich e1000e mei prime_numbers
<4> [273.398043] CPU: 3 PID: 2427 Comm: kworker/u16:58 Tainted: G     U            5.0.0-rc2-CI-CI_DRM_5446+ #1
<4> [273.398045] Hardware name: Hewlett-Packard HP EliteBook 8440p/172A, BIOS 68CCU Ver. F.24 09/13/2013
<4> [273.398050] Workqueue: events_unbound async_run_entry_fn
<4> [273.398084] RIP: 0010:intel_runtime_pm_cleanup+0x2a/0x40 [i915]
<4> [273.398086] Code: 53 be 01 00 00 00 48 89 fb f0 0f c1 b7 88 ac 00 00 85 f6 75 09 48 89 df 5b e9 72 dc ff ff 48 c7 c7 20 b4 2d a0 e8 d6 e1 f3 e0 <0f> 0b 48 89 df 5b e9 5b dc ff ff 90 66 2e 0f 1f 84 00 00 00 00 00
<4> [273.398088] RSP: 0018:ffffc90000c83d88 EFLAGS: 00010286
<4> [273.398090] RAX: 0000000000000000 RBX: ffff88811a780000 RCX: 0000000000000000
<4> [273.398092] RDX: 0000000000000000 RSI: ffffffff8212efea RDI: 00000000ffffffff
<4> [273.398094] RBP: 0000000000000000 R08: 00000000e0d7d399 R09: 0000000000000000
<4> [273.398096] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8881326f1158
<4> [273.398097] R13: ffffffff82109396 R14: 0000000000000000 R15: 0000000000000000
<4> [273.398100] FS:  0000000000000000(0000) GS:ffff888133cc0000(0000) knlGS:0000000000000000
<4> [273.398101] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [273.398103] CR2: 00007fecdb110000 CR3: 0000000002214000 CR4: 00000000000006e0
<4> [273.398105] Call Trace:
<4> [273.398137]  i915_drm_suspend_late+0xad/0x120 [i915]
<4> [273.398145]  ? pci_pm_poweroff_late+0x30/0x30
<4> [273.398151]  dpm_run_callback+0x64/0x280
<4> [273.398156]  __device_suspend_late+0xad/0x140
<4> [273.398160]  async_suspend_late+0x15/0x90
<4> [273.398163]  async_run_entry_fn+0x34/0x160
<4> [273.398168]  process_one_work+0x245/0x610
<4> [273.398175]  worker_thread+0x37/0x380
<4> [273.398179]  ? process_one_work+0x610/0x610
<4> [273.398182]  kthread+0x119/0x130
<4> [273.398185]  ? kthread_park+0x80/0x80
<4> [273.398192]  ret_from_fork+0x3a/0x50
<4> [273.398201] irq event stamp: 2718
<4> [273.398206] hardirqs last  enabled at (2717): [<ffffffff81123ca2>] vprintk_emit+0x302/0x320
<4> [273.398209] hardirqs last disabled at (2718): [<ffffffff810019b0>] trace_hardirqs_off_thunk+0x1a/0x1c
<4> [273.398214] softirqs last  enabled at (502): [<ffffffff81c0033a>] __do_softirq+0x33a/0x4b9
<4> [273.398218] softirqs last disabled at (493): [<ffffffff810b5081>] irq_exit+0xd1/0xe0
<4> [273.398250] WARNING: CPU: 3 PID: 2427 at drivers/gpu/drm/i915/intel_runtime_pm.c:4476 intel_runtime_pm_cleanup+0x2a/0x40 [i915]
<4> [273.398252] ---[ end trace 9eace47a4dfcc290 ]---
Comment 1 CI Bug Log 2019-01-18 13:07:53 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* ILK: suspend tests - dmesg-warn - i915->runtime_pm.wakeref_count=1 on cleanup
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5446/fi-ilk-m540/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html

* ILK: igt@runner@aborted - fail - Previous test: *(suspend|s3)*
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5446/fi-ilk-m540/igt@runner@aborted.html
Comment 2 Martin Peres 2019-01-18 13:08:58 UTC
This likely got introduced by the fix for https://bugs.freedesktop.org/show_bug.cgi?id=107588 .
Comment 3 Chris Wilson 2019-01-18 13:11:36 UTC
At first glance this looks like a spurious warning: the code assumed it was running from a serial context and probably wasn't (i.e. the wakeref_count wasn't stable).
Comment 4 CI Bug Log 2019-01-24 13:22:01 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* KBL: igt@pm_rpm@module-reload - dmesg-fail - i915->runtime_pm.wakeref_count=1 on cleanup
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5466/fi-kbl-7500u/igt@pm_rpm@module-reload.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5449_214/fi-kbl-7500u/igt@pm_rpm@module-reload.html
Comment 5 CI Bug Log 2019-01-24 13:24:43 UTC
A CI Bug Log filter associated to this bug has been updated:

{- ILK: igt@runner@aborted - fail - Previous test: *(suspend|s3)* -}
{+ ILK KBL: igt@runner@aborted - fail - Previous test: *(suspend|s3|module-reload)* +}

New failures caught by the filter:

* https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5466/fi-kbl-7500u/igt@runner@aborted.html
* https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_5449_214/fi-kbl-7500u/igt@runner@aborted.html
Comment 6 CI Bug Log 2019-01-28 13:55:26 UTC
A CI Bug Log filter associated to this bug has been updated:

{- ILK: suspend tests - dmesg-warn - i915->runtime_pm.wakeref_count=1 on cleanup -}
{+ ILK APL: suspend tests - dmesg-warn - i915->runtime_pm.wakeref_count=1 on cleanup +}

New failures caught by the filter:

* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_200/fi-apl-guc/igt@kms_pipe_crc_basic@suspend-read-crc-pipe-a.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_200/fi-apl-guc/igt@i915_suspend@sysfs-reader.html
Comment 7 CI Bug Log 2019-01-28 14:48:29 UTC
A CI Bug Log filter associated to this bug has been updated:

{- ILK KBL: igt@runner@aborted - fail - Previous test: *(suspend|s3|module-reload)* -}
{+ ILK KBL APL: igt@runner@aborted - fail - Previous test: *(suspend|s3|module-reload|sysfs-reader)* +}

New failures caught by the filter:

* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_200/fi-apl-guc/igt@runner@aborted.html
* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_200/fi-apl-guc/igt@runner@aborted.html
Comment 8 CI Bug Log 2019-02-08 10:10:58 UTC
A CI Bug Log filter associated to this bug has been updated:

{- ILK APL: suspend tests - dmesg-warn - i915->runtime_pm.wakeref_count=1 on cleanup -}
{+ ILK APL: suspend tests - dmesg-warn - i915->runtime_pm.wakeref_count=\d on cleanup +}

New failures caught by the filter:

* https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_204/fi-apl-guc/igt@kms_cursor_crc@cursor-64x64-suspend.html
Comment 9 Lakshmi 2019-03-29 14:15:18 UTC
Steve, any updates on this bug?
Comment 10 Don Hiatt 2019-04-08 20:59:22 UTC
I'm unable to reproduce this bug and the last sighting was on 2019-02-08.
If I understand Chris' Comment #3, this warning might be just the result of a race condition and so has negligible customer impact. 

I'm dropping the severity to medium and will continue to monitor. Please let me know if you disagree.

-don
Comment 11 Chris Wilson 2019-05-15 21:22:01 UTC
commit 4547c255f4420e20c6cda2ee4172ae68b323e695
Author: Imre Deak <imre.deak@intel.com>
Date:   Thu May 9 20:34:36 2019 +0300

    drm/i915: Add support for tracking wakerefs w/o power-on guarantee
    
    It's useful to track runtime PM refs that don't guarantee a device
    power-on state to the rest of the driver. One such case is holding a
    reference that will be put asynchronously, during which normal users
    without their own reference shouldn't access the HW. A follow-up patch
    will add support for disabling display power domains asynchronously
    which needs this.
    
    For this we can split wakeref_count into a low half-word tracking
    all references (raw-wakerefs) and a high half-word tracking
    references guaranteeing a power-on state (wakelocks).
    
    Follow-up patches will make use of the API added here.
    
    While at it add the missing docbook header for the unchecked
    display-power and runtime_pm put functions.
    
    No functional changes, except for printing leaked raw-wakerefs
    and wakelocks separately in intel_runtime_pm_cleanup().
    
    v2:
    - Track raw wakerefs/wakelocks in the low/high half-word of
      wakeref_count, instead of adding a new counter. (Chris)
    v3:
    - Add a struct_member(T, m) helper instead of open-coding it. (Chris)
    - Checkpatch indentation formatting fix.
    
    Cc: Chris Wilson <chris@chris-wilson.co.uk>
    Signed-off-by: Imre Deak <imre.deak@intel.com>
    Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
    Link: https://patchwork.freedesktop.org/patch/msgid/20190509173446.31095-2-imre.deak@intel.com
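
For readers following the commit message above, here is a minimal userspace sketch of the half-word split it describes: the low 16 bits of one atomic counter hold all references (raw-wakerefs) and the high 16 bits hold the references that guarantee power-on (wakelocks). Every name in it (`rpm_counter`, `rpm_get`, `rpm_get_raw`, `WAKELOCK_BIAS`, `rpm_cleanup_report`) is hypothetical and chosen for illustration; this is not the i915 code, and it assumes a wakelock also counts as a raw reference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical names for illustration only; not the i915 definitions. */
#define WAKELOCK_SHIFT   16
#define WAKELOCK_BIAS    (1u << WAKELOCK_SHIFT)
#define RAW_WAKEREF_MASK (WAKELOCK_BIAS - 1)

struct rpm_counter {
    /* low half-word: all references; high half-word: wakelocks */
    atomic_uint wakeref_count;
};

static inline unsigned int raw_wakerefs(unsigned int count)
{
    return count & RAW_WAKEREF_MASK;
}

static inline unsigned int wakelocks(unsigned int count)
{
    return count >> WAKELOCK_SHIFT;
}

/* Raw reference: tracked, but gives no power-on guarantee. */
static inline void rpm_get_raw(struct rpm_counter *c)
{
    atomic_fetch_add(&c->wakeref_count, 1);
}

static inline void rpm_put_raw(struct rpm_counter *c)
{
    atomic_fetch_sub(&c->wakeref_count, 1);
}

/* Wakelock: assumed here to bump both half-words in one atomic add. */
static inline void rpm_get(struct rpm_counter *c)
{
    atomic_fetch_add(&c->wakeref_count, WAKELOCK_BIAS + 1);
}

static inline void rpm_put(struct rpm_counter *c)
{
    atomic_fetch_sub(&c->wakeref_count, WAKELOCK_BIAS + 1);
}

/*
 * Cleanup-time check: format the two leak counts separately into buf.
 * Returns 0 when nothing is leaked, otherwise the snprintf() result.
 */
static inline int rpm_cleanup_report(struct rpm_counter *c, char *buf, size_t len)
{
    unsigned int count = atomic_load(&c->wakeref_count);

    if (count == 0) {
        if (len > 0)
            buf[0] = '\0';
        return 0;
    }
    return snprintf(buf, len, "leaked raw-wakerefs=%u wakelocks=%u on cleanup",
                    raw_wakerefs(count), wakelocks(count));
}
```

Packing both counts into one word lets a single atomic operation update them together, so a reader never sees a wakelock without its matching raw reference. The report is still only meaningful if nothing else is taking or dropping references while the cleanup check runs, which is the concern raised in Comment 3.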
