Bug 110829

Summary: [CI][BAT] igt@i915_pm_rpm@basic-pci-d3-state - fail - Failed assertion: wait_for_suspended()
Product: DRI
Reporter: Martin Peres <martin.peres>
Component: DRM/Intel
Assignee: Lakshmi <lakshminarayana.vudum>
Status: RESOLVED MOVED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: normal
Priority: medium
CC: intel-gfx-bugs, jon.ewins
Version: XOrg git
Hardware: Other
OS: All
Whiteboard: ReadyForDev
i915 platform: KBL
i915 features: display/Other

Description Martin Peres 2019-06-03 13:30:37 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5028/fi-kbl-guc/igt@i915_pm_rpm@basic-pci-d3-state.html

Starting subtest: basic-pci-d3-state
(i915_pm_rpm:2780) CRITICAL: Test assertion failure function pci_d3_state_subtest, file ../tests/i915/i915_pm_rpm.c:1430:
(i915_pm_rpm:2780) CRITICAL: Failed assertion: wait_for_suspended()
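For readers unfamiliar with the assertion, here is a minimal sketch of the kind of check a helper like wait_for_suspended() performs: poll the device's runtime PM status until it reads "suspended" or a timeout expires. This is not the IGT implementation (which is written in C). The function names, the timeout values, and the PCI address in the usage note are assumptions for illustration. The sysfs attribute `power/runtime_status` is the standard kernel runtime PM status file.

```python
import time
from typing import Callable


def wait_for_status(read_status: Callable[[], str],
                    wanted: str = "suspended",
                    timeout_s: float = 10.0,
                    poll_s: float = 0.1) -> bool:
    """Poll read_status() until it returns `wanted` or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if read_status().strip() == wanted:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


def sysfs_reader(pci_addr: str) -> Callable[[], str]:
    """Build a reader for a PCI device's runtime PM status file (hypothetical helper)."""
    path = f"/sys/bus/pci/devices/{pci_addr}/power/runtime_status"

    def read() -> str:
        with open(path) as f:
            return f.read()

    return read
```

On a test machine this could be called as `wait_for_status(sysfs_reader("0000:00:02.0"))`, where the address is an assumed location for the integrated GPU. A device that keeps getting woken up (for example by an interrupt) would never report "suspended" within the timeout, which is how the failure above would show up.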
Comment 1 Martin Peres 2019-06-03 13:31:48 UTC
Putting the bug on GUC because it was recently updated and, so far, the only machine with this new issue is fi-kbl-guc.
Comment 2 CI Bug Log 2019-06-03 13:32:01 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* GUC: igt@i915_pm_rpm@basic-pci-d3-state - fail - Failed assertion: wait_for_suspended()
  - https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5028/fi-kbl-guc/igt@i915_pm_rpm@basic-pci-d3-state.html
Comment 3 CI Bug Log 2019-06-18 04:25:45 UTC
A CI Bug Log filter associated to this bug has been updated:

{- GUC: igt@i915_pm_rpm@basic-pci-d3-state - fail - Failed assertion: wait_for_suspended() -}
{+ GUC: igt@i915_pm_rpm@basic-pci-d3-state|igt@i915_pm_rpm@system-suspend-devices - fail - Failed assertion: wait_for_suspended() +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_306/fi-kbl-guc/igt@i915_pm_rpm@system-suspend-devices.html
Comment 4 CI Bug Log 2019-06-18 04:53:22 UTC
A CI Bug Log filter associated to this bug has been updated:

{- GUC: igt@i915_pm_rpm@basic-pci-d3-state|igt@i915_pm_rpm@system-suspend-devices - fail - Failed assertion: wait_for_suspended() -}
{+ GUC: igt@i915_pm_rpm@* - fail - Failed assertion: wait_for_suspended() +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_308/fi-kbl-guc/igt@i915_pm_rpm@pm-caching.html
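The effect of broadening the filter can be illustrated with a short sketch. It assumes the trailing `*` behaves like a shell-style wildcard. How CI Bug Log actually evaluates its filter expressions is not stated in this thread, so the matching function here is an assumption for illustration only.

```python
from fnmatch import fnmatchcase

# Broadened filter pattern from the comment above.
FILTER = "igt@i915_pm_rpm@*"

# Test names reported as failing so far in this bug.
TESTS = [
    "igt@i915_pm_rpm@basic-pci-d3-state",
    "igt@i915_pm_rpm@system-suspend-devices",
    "igt@i915_pm_rpm@pm-caching",
]


def matches(pattern: str, test_name: str) -> bool:
    """Case-sensitive glob match (assumed semantics, not CI Bug Log's own code)."""
    return fnmatchcase(test_name, pattern)


covered = [t for t in TESTS if matches(FILTER, t)]
```

Under that assumption, every i915_pm_rpm subtest is caught by the single pattern, so new subtests failing on the same assertion no longer require a filter update.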
Comment 5 CI Bug Log 2019-06-25 09:37:10 UTC
A CI Bug Log filter associated to this bug has been updated:

{- GUC: igt@i915_pm_rpm@* - fail - Failed assertion: wait_for_suspended() -}
{+ GUC: igt@i915_pm_rpm@* - fail - Failed assertion: wait_for_suspended() +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_312/fi-kbl-guc/igt@i915_pm_rpm@debugfs-forcewake-user.html
Comment 6 CI Bug Log 2019-06-28 14:07:14 UTC
A CI Bug Log filter associated to this bug has been updated:

{- GUC: igt@i915_pm_rpm@* - fail - Failed assertion: wait_for_suspended() -}
{+ GUC: igt@i915_pm_rpm@* - fail - Failed assertion: wait_for_suspended() +}

New failures caught by the filter:

  * https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_316/fi-kbl-guc/igt@i915_pm_rpm@gem-idle.html
Comment 7 Jon Ewins 2019-08-19 20:18:11 UTC
Looks like a display related issue, possibly on the target board. 
intel_hdmi_detect() is called by the drm subsystem when the device is suspended, and then an HDMI interrupt wakes it back up. There are several HDMI-related logs in dmesg about invalid settings.
Comment 8 Martin Peres 2019-08-21 13:40:16 UTC
(In reply to Jon Ewins from comment #7)
> Looks like a display related issue, possibly on the target board. 
> intel_hdmi_detect() is called by the drm subsystem when the device is
> suspended, and then an HDMI interrupt wakes it back up. There are several
> HDMI-related logs in dmesg about invalid settings.

Thanks for looking into this. However, could you follow the normal bug assessment process so that a priority can be set? Right now, the only thing you have done is say that it is unlikely to be due to the GuC.
Comment 9 Don Hiatt 2019-08-21 16:19:53 UTC
(In reply to Martin Peres from comment #8)
> (In reply to Jon Ewins from comment #7)
> > Looks like a display related issue, possibly on the target board. 
> > intel_hdmi_detect() is called by the drm subsystem when the device is
> > suspended, and then an HDMI interrupt wakes it back up. There are several
> > HDMI-related logs in dmesg about invalid settings.
> 
> Thanks for looking into this. However, could you follow the normal bug
> assessment process so that a priority can be set? Right now, the only thing
> you have done is say that it is unlikely to be due to the GuC.

Hey Martin/Jon, I'll go through the bug assessment process for this. Thanks.
Comment 10 Don Hiatt 2019-08-21 16:56:45 UTC
Issue: http://gfx-ci.fi.intel.com/cibuglog-ng/issue/1496
Comment 11 Don Hiatt 2019-08-21 17:14:26 UTC
Several 'igt@i915_pm_rpm' tests ('tests/i915/i915_pm_rpm.c') have been failing on fi-kbl-guc, starting with IGT_5028 (2 months, 2 weeks old) and continuing through the present drmtip_345 (3 days, 20 hours old).

All of the tests fail in wait_for_suspended(). As Jon pointed out, there are lots of HDMI errors in the logs, and since this appears to happen on only one machine ('fi-kbl-guc'), that machine should be looked into. Either way, this looks to be a display issue.

This appears to be 100% reproducible on 'fi-kbl-guc'. Setting priority to medium, as the issue appears isolated to a single machine.
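The single-machine observation can be cross-checked against the result URLs posted earlier in this bug. The sketch below assumes the CI result URLs follow the layout `tree/<branch>/<build>/<machine>/<test>.html`, which is inferred from the links in this thread rather than from any documented scheme. The helper name is hypothetical.

```python
from urllib.parse import urlparse

# Result URLs quoted in the comments of this bug.
URLS = [
    "https://intel-gfx-ci.01.org/tree/drm-tip/IGT_5028/fi-kbl-guc/igt@i915_pm_rpm@basic-pci-d3-state.html",
    "https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_306/fi-kbl-guc/igt@i915_pm_rpm@system-suspend-devices.html",
    "https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_308/fi-kbl-guc/igt@i915_pm_rpm@pm-caching.html",
    "https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_312/fi-kbl-guc/igt@i915_pm_rpm@debugfs-forcewake-user.html",
    "https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_316/fi-kbl-guc/igt@i915_pm_rpm@gem-idle.html",
]


def parse_result_url(url: str) -> tuple:
    """Split a result URL into (build, machine, test), assuming the layout above."""
    parts = urlparse(url).path.strip("/").split("/")
    build, machine, page = parts[-3:]
    test = page[:-len(".html")] if page.endswith(".html") else page
    return build, machine, test


machines = {parse_result_url(u)[1] for u in URLS}
builds = [parse_result_url(u)[0] for u in URLS]
```

Here `machines` contains only 'fi-kbl-guc', which agrees with the assessment that the failures are confined to one host across several builds.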
Comment 12 Jon Ewins 2019-09-16 21:30:51 UTC
Efforts are underway to run the tests with GuC usage disabled, to determine whether this is unrelated to GuC and instead specific to the display configuration of this device.
Comment 13 Don Hiatt 2019-09-20 21:56:28 UTC
I created some patches to IGT to only run the failing test and to i915 to force GuC off. I had some issues learning how to get the patches to run together. I finally got it working but then the host fi-kbl-guc skipped the test. I submitted the patches to trybot again and I'm waiting for the results.
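The comment does not say how the i915 patch forces GuC off. As a rough aid for anyone verifying the result on the machine, the sketch below decodes the value of the i915 `enable_guc` module parameter, assuming the bitmask meaning given in the parameter description for kernels of this period (-1 = auto, 0 = disabled, bit 0 = GuC submission, bit 1 = HuC load). The function name is hypothetical, and the bit layout should be checked against the kernel actually under test.

```python
def decode_enable_guc(value: int) -> dict:
    """Interpret an i915 enable_guc value (assumed bit layout, see note above)."""
    if value < 0:
        # Negative means the driver picks a platform default.
        return {"auto": True, "guc_submission": None, "huc_load": None}
    return {
        "auto": False,
        "guc_submission": bool(value & 1),  # bit 0: GuC submission
        "huc_load": bool(value & 2),        # bit 1: HuC firmware load
    }
```

On a running system the value would typically be read from `/sys/module/i915/parameters/enable_guc` (usually root only). A result with both flags false would confirm that the run really had GuC disabled.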
Comment 14 Don Hiatt 2019-09-24 23:34:27 UTC
There really seems to be an issue with 'fi-kbl-guc', as every trybot run excludes this machine. Given that, and the HDMI issues, this really seems to be a machine/display-specific issue.

Lakshmina: Can you please assign this to the display group as I'm not sure who that would be.
Comment 15 Martin Peres 2019-12-02 16:41:05 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/intel/issues/704.
