Bug 110584 - [CI][SHARDS] igt@i915_pm_sseu@full-enable - Fail - Failed assertion: stat->info.eu_total <= stat->hw.eu_total
Status: NEW
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: low normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Duplicates: 110593
Depends on:
Blocks:
 
Reported: 2019-05-02 08:05 UTC by Lakshmi
Modified: 2019-07-25 12:48 UTC

See Also:
i915 platform: SKL
i915 features: power/Other


Description Lakshmi 2019-05-02 08:05:47 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6014/shard-skl8/igt@i915_pm_sseu@full-enable.html

Starting subtest: full-enable
(i915_pm_sseu:2360) CRITICAL: Test assertion failure function check_full_enable, file ../tests/i915/i915_pm_sseu.c:344:
(i915_pm_sseu:2360) CRITICAL: Failed assertion: stat->info.eu_total <= stat->hw.eu_total
(i915_pm_sseu:2360) CRITICAL: error: 24 > 0
Subtest full-enable failed.
**** DEBUG ****
: no
(i915_pm_sseu:2360) DEBUG:   Has Subslice Power Gating: no
(i915_pm_sseu:2360) DEBUG:   Has EU Power Gating: yes
(i915_pm_sseu:2360) DEBUG: SSEU Device Status
(i915_pm_sseu:2360) DEBUG:   Enabled Slice Mask: 0001
(i915_pm_sseu:2360) DEBUG:   Enabled Slice Total: 1
(i915_pm_sseu:2360) DEBUG:   Enabled Subslice Total: 3
(i915_pm_sseu:2360) DEBUG:   Enabled Slice0 subslices: 3
(i915_pm_sseu:2360) DEBUG:   Enabled EU Total: 0
(i915_pm_sseu:2360) DEBUG:   Enabled EU Per Subslice: 0
(i915_pm_sseu:2360) CRITICAL: Test assertion failure function check_full_enable, file ../tests/i915/i915_pm_sseu.c:344:
(i915_pm_sseu:2360) CRITICAL: Failed assertion: stat->info.eu_total <= stat->hw.eu_total
(i915_pm_sseu:2360) CRITICAL: error: 24 > 0
(i915_pm_sseu:2360) igt_core-INFO: Stack trace:
(i915_pm_sseu:2360) igt_core-INFO:   #0 ../lib/igt_core.c:1476 __igt_fail_assert()
(i915_pm_sseu:2360) igt_core-INFO:   #1 ../tests/i915/i915_pm_sseu.c:345 check_full_enable()
(i915_pm_sseu:2360) igt_core-INFO:   #2 ../tests/i915/i915_pm_sseu.c:379 full_enable()
(i915_pm_sseu:2360) igt_core-INFO:   #3 ../tests/i915/i915_pm_sseu.c:397 __real_main388()
(i915_pm_sseu:2360) igt_core-INFO:   #4 ../tests/i915/i915_pm_sseu.c:388 main()
(i915_pm_sseu:2360) igt_core-INFO:   #5 ../csu/libc-start.c:344 __libc_start_main()
(i915_pm_sseu:2360) igt_core-INFO:   #6 [_start+0x2a]
****  END  ****
Subtest full-enable: FAIL (0.297s)
Comment 1 CI Bug Log 2019-05-02 08:07:53 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* SKL: igt@i915_pm_sseu@full-enable - Fail - Failed assertion: stat->info.eu_total <= stat->hw.eu_total
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6014/shard-skl8/igt@i915_pm_sseu@full-enable.html
Comment 2 Arek Hiler 2019-07-23 13:51:58 UTC
pm_sseu@full-enable makes sure that all the slices/subslices/EUs are "fully enabled" (i.e. not power gated) while graphics is busy.

To keep the execution units busy, the test submits a media_spin workload. After waiting 2 ms for the (supposedly) 10 ms workload to be fully spun up, we read the current state from the i915_sseu_status debugfs file.
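
The comparison that failed can be illustrated with a minimal Python sketch. This is not the IGT C code: parse_sseu_status and check_full_enable here are hypothetical helpers written for illustration. The sample text is the "SSEU Device Status" debug output quoted in the description with the log prefix stripped, and 24 is the left-hand value from the "error: 24 > 0" line.

```python
import re

# Debug output quoted in the description, with the
# "(i915_pm_sseu:2360) DEBUG:" prefix removed.
SAMPLE_STATUS = """\
SSEU Device Status
  Enabled Slice Mask: 0001
  Enabled Slice Total: 1
  Enabled Subslice Total: 3
  Enabled Slice0 subslices: 3
  Enabled EU Total: 0
  Enabled EU Per Subslice: 0
"""


def parse_sseu_status(text):
    """Return a dict of the 'Enabled <name>: <value>' fields found in text."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"\s*Enabled (.+?):\s*(\S+)\s*$", line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


def check_full_enable(info_eu_total, status):
    """Mirror the failed comparison: info.eu_total <= hw.eu_total."""
    hw_eu_total = int(status["EU Total"])
    return info_eu_total <= hw_eu_total


status = parse_sseu_status(SAMPLE_STATUS)
# 24 is taken from "error: 24 > 0" in the log; it is not parsed from the dump.
print(check_full_enable(24, status))  # prints False
```

With the values from this run the check returns False, matching the reported failure: the status dump shows 3 enabled subslices but 0 enabled EUs.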

We have seen this failure once, on shard-skl8, over 2 months ago. media_spin was working correctly: both the asserts on function exit and the verification of the number of 'spins' performed were successful.

My guess is that the GPU managed to spin down because the batch had already ended: the number of spins is established experimentally and is prone to scheduling noise. Either that, or we read some trash from debugfs. The volatility is not that bad given that it has happened only once so far.
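
The first guess can be put as a toy timing model in Python. It is only a sketch under loud assumptions: eu_total_at_sample is a hypothetical function, it assumes the EUs are power gated the instant the batch ends (a simplification), and the 1.5 ms duration is an invented value standing in for a batch shortened by scheduling noise.

```python
def eu_total_at_sample(actual_workload_ms, sample_delay_ms, full_eu_total):
    """EU count a reader would see at sample time.

    Assumes (simplification) that EUs are gated off as soon as the batch ends.
    """
    busy = sample_delay_ms < actual_workload_ms
    return full_eu_total if busy else 0


# Intended case: the 10 ms workload is still running at the 2 ms sample point.
print(eu_total_at_sample(10.0, 2.0, 24))  # prints 24
# Hypothetical case: the batch finished before the 2 ms sample point.
print(eu_total_at_sample(1.5, 2.0, 24))   # prints 0
```

Under this model, a read of 0 enabled EUs is what the test would observe whenever the batch ends before the status is sampled.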

Because the impact is negligible and this is a cosmic-ray-flipping-a-bit class of event, I am lowering the priority to low.
Comment 3 Arek Hiler 2019-07-24 08:40:14 UTC
*** Bug 110593 has been marked as a duplicate of this bug. ***
Comment 4 Arek Hiler 2019-07-25 12:48:38 UTC
This test should be rewritten as a self-test. Rationale by Lionel can be found here: https://bugs.freedesktop.org/show_bug.cgi?id=103484

