Bug 85684 - [BYT/BSW regression] igt/drv_hangman/kms_flip/gem_reset_stats sporadically causes *ERROR* Timed out: waiting for Render to ack
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: All Linux (All)
Importance: high normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
Duplicates: 85358
Reported: 2014-10-31 06:33 UTC by lu hua
Modified: 2017-08-15 06:54 UTC (History)

Attachments
Wait old forcewake ack to clear on vlv (1.60 KB, patch)
2014-11-04 09:24 UTC, Mika Kuoppala
dmesg(gem_reset_stats) (124.25 KB, text/plain)
2014-11-11 03:27 UTC, lu hua

Description lu hua 2014-10-31 06:33:07 UTC
==System Environment==
--------------------------
Regression: not sure
Non-working platforms: BYT

==kernel==
--------------------------
drm-intel-nightly/bd21cf795cf5dc278f1451d0f7a597cb1d13c2ba

==Bug detailed description==
Ran 2 cycles; the error occurred once.

output:
root@x-byt06:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./drv_hangman --run-subtest error-state-capture-bsd
IGT-Version: 1.8-gab5f7ea (x86_64) (Linux: 3.18.0-rc2_drm-intel-nightly_bd21cf_20141030+ x86_64)
Subtest error-state-capture-bsd: SUCCESS (12.504s)
root@x-byt06:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# dmesg -r|egrep "<[1-4]>"|grep drm
root@x-byt06:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./drv_hangman --run-subtest error-state-capture-bsd
IGT-Version: 1.8-gab5f7ea (x86_64) (Linux: 3.18.0-rc2_drm-intel-nightly_bd21cf_20141030+ x86_64)
Subtest error-state-capture-bsd: SUCCESS (12.057s)
root@x-byt06:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# dmesg -r|egrep "<[1-4]>"|grep drm
<3>[19432.618274] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.

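The check used in the transcript above, dmesg -r piped through egrep "<[1-4]>" and grep drm, keeps raw kernel log lines tagged with priority 1 through 4 (alert, critical, error, warning) that also mention drm. A minimal Python sketch of the same filter, for applying it to a saved log; the function name filter_drm_errors is an illustrative assumption and not part of IGT:

```python
import re
from typing import List

# Mirrors: dmesg -r | egrep "<[1-4]>" | grep drm
PRIORITY_1_TO_4 = re.compile(r"<[1-4]>")


def filter_drm_errors(raw_dmesg: str) -> List[str]:
    """Return raw dmesg lines that contain a <1>..<4> tag and the text 'drm'."""
    return [
        line
        for line in raw_dmesg.splitlines()
        if PRIORITY_1_TO_4.search(line) and "drm" in line
    ]


# Sample lines in the raw format that appears later in this report.
sample = "\n".join(
    [
        "<6>[  108.951070] [drm] stuck on blitter ring",
        "<3>[  108.958906] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.",
        "<5>[  108.960564] drm/i915: Resetting chip after gpu hang",
    ]
)
for line in filter_drm_errors(sample):
    print(line)
```

Like the shell pipeline, this matches the priority tag anywhere in the line rather than only at the start.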
dmesg:
[19408.408488] drv_hangman: executing
[19408.408671] drv_hangman: starting subtest error-state-capture-bsd
[19414.574594] [drm] stuck on bsd ring
[19414.582570] [drm] GPU HANG: ecode 1:0xfffffffe, in drv_hangman [9123], reason: Ring hung, action: reset
[19414.582574] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[19414.582576] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[19414.582578] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[19414.582581] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[19414.582583] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[19414.585604] [drm] Simulated gpu hang, resetting stop_rings
[19414.585610] drm/i915: Resetting chip after gpu hang
[19426.892311] drv_hangman: executing
[19426.892526] drv_hangman: starting subtest error-state-capture-bsd
[19432.610176] [drm] stuck on bsd ring
[19432.618152] [drm] GPU HANG: ecode 1:0xfffffffe, in drv_hangman [9131], reason: Ring hung, action: reset
[19432.618274] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
[19432.620510] [drm] Simulated gpu hang, resetting stop_rings
[19432.620516] drm/i915: Resetting chip after gpu hang
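In the dmesg excerpt above, the forcewake timeout is logged roughly 0.12 ms after the second "GPU HANG" line. The following sketch pairs each timeout line with the most recent GPU HANG line before it and reports the gap. The function name and regular expressions are assumptions written against the line formats shown in this report, with or without the raw priority prefix:

```python
import re
from typing import List, Optional, Tuple

# Optional "<n>" priority prefix, then "[ seconds.micros]" and the message.
LINE = re.compile(r"^(?:<\d>)?\[\s*(\d+\.\d+)\]\s+(.*)$")
HANG = re.compile(r"GPU HANG: ecode [^,]+, in (\S+) \[\d+\]")
ERROR_TEXT = "Timed out: waiting for Render to ack"


def pair_errors_with_hangs(dmesg: str) -> List[Tuple[str, float]]:
    """For each timeout line, return (hanging process, seconds since its GPU HANG line)."""
    pairs: List[Tuple[str, float]] = []
    last_hang: Optional[Tuple[str, float]] = None  # (process name, timestamp)
    for raw in dmesg.splitlines():
        match = LINE.match(raw.strip())
        if not match:
            continue
        stamp, text = float(match.group(1)), match.group(2)
        hang = HANG.search(text)
        if hang:
            last_hang = (hang.group(1), stamp)
        elif ERROR_TEXT in text and last_hang is not None:
            pairs.append((last_hang[0], round(stamp - last_hang[1], 6)))
    return pairs


excerpt = """\
[19414.582570] [drm] GPU HANG: ecode 1:0xfffffffe, in drv_hangman [9123], reason: Ring hung, action: reset
[19414.585610] drm/i915: Resetting chip after gpu hang
[19432.618152] [drm] GPU HANG: ecode 1:0xfffffffe, in drv_hangman [9131], reason: Ring hung, action: reset
[19432.618274] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
"""
for process, gap in pair_errors_with_hangs(excerpt):
    print(process, gap)
```

This only correlates log lines by order and timestamp; it does not show which code path issued the forcewake request.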


==Reproduce steps==
---------------------------- 
1. ./drv_hangman --run-subtest error-state-capture-bsd
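
Because the error is sporadic (one hit in two cycles here), a repeat loop gives a clearer picture than a single run. A rough sketch, assuming it is started from the IGT tests directory with permission to read dmesg; the helper names are hypothetical, and only the drv_hangman command line and the error text come from this report:

```python
import subprocess
from typing import Callable, Set

ERROR_TEXT = "Timed out: waiting for Render to ack"


def read_raw_dmesg() -> str:
    """Return the raw kernel log; reading it may require root."""
    result = subprocess.run(
        ["dmesg", "-r"], capture_output=True, text=True, check=True
    )
    return result.stdout


def run_subtest() -> None:
    """Run the subtest from the reproduce steps (cwd must be the IGT tests directory)."""
    subprocess.run(
        ["./drv_hangman", "--run-subtest", "error-state-capture-bsd"], check=True
    )


def error_lines(log: str) -> Set[str]:
    """Collect the distinct timeout lines present in a log snapshot."""
    return {line for line in log.splitlines() if ERROR_TEXT in line}


def count_failing_cycles(
    cycles: int,
    run_once: Callable[[], None] = run_subtest,
    read_log: Callable[[], str] = read_raw_dmesg,
) -> int:
    """Run the test `cycles` times and count runs that add a new timeout line."""
    seen = error_lines(read_log())
    failing = 0
    for _ in range(cycles):
        run_once()
        current = error_lines(read_log())
        if current - seen:  # a timeout line that was not there before this run
            failing += 1
        seen |= current
    return failing
```

Calling count_failing_cycles(10) would run ten cycles. New lines are detected by comparing log snapshots, which relies on the timestamps making each line unique.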
Comment 1 lu hua 2014-10-31 06:36:04 UTC
gem_reset_stats also has this issue.

@test: Intel_gpu_tools/igt_gem_reset_stats_reset-count-blt
returncode: 0
info: @@@Returncode: 0

test case start at: Thu Oct 30 12:32:10 2014
test case end at:   Thu Oct 30 12:32:20 2014

Errors:


Dmesg:
<6>[  108.951070] [drm] stuck on blitter ring
<6>[  108.958779] [drm] GPU HANG: ecode 2:0xe77ffff3, in gem_reset_stats [5973], reason: Ring hung, action: reset
<6>[  108.958787] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
<6>[  108.958789] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
<6>[  108.958792] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
<6>[  108.958794] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
<6>[  108.958797] [drm] GPU crash dump saved to /sys/class/drm/card0/error
<3>[  108.958906] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
<6>[  108.960558] [drm] Simulated gpu hang, resetting stop_rings
<5>[  108.960564] drm/i915: Resetting chip after gpu hang
Comment 2 lu hua 2014-11-03 02:21:27 UTC
Some kms_flip subtests also have this issue.
@test: Intel_gpu_tools/igt_kms_flip_vblank-vs-hang-interruptible
returncode: 0
info: @@@Returncode: 0

test case start at: Sun Nov  2 17:03:29 2014
test case end at:   Sun Nov  2 17:04:15 2014

Errors:


Dmesg:
<6>[   82.897180] [drm] stuck on render ring
<6>[   82.905149] [drm] GPU HANG: ecode 0:0x97f4fffe, in kms_flip [4654], reason: Ring hung, action: reset
<6>[   82.905153] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
<6>[   82.905155] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
<6>[   82.905157] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
<6>[   82.905160] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
<6>[   82.905162] [drm] GPU crash dump saved to /sys/class/drm/card0/error
<6>[   82.906904] [drm] Simulated gpu hang, resetting stop_rings
<5>[   82.906910] drm/i915: Resetting chip after gpu hang
<6>[   91.914805] [drm] stuck on render ring
<6>[   91.922612] [drm] GPU HANG: ecode 0:0x97f4fffe, in kms_flip [4654], reason: Ring hung, action: reset
<3>[   91.922791] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
<6>[   91.924479] [drm] Simulated gpu hang, resetting stop_rings
<5>[   91.924486] drm/i915: Resetting chip after gpu hang
<6>[  100.936360] [drm] stuck on render ring
<6>[  100.944273] [drm] GPU HANG: ecode 0:0x97f4fffe, in kms_flip [4654], reason: Ring hung, action: reset
<6>[  100.946063] [drm] Simulated gpu hang, resetting stop_rings
<5>[  100.946069] drm/i915: Resetting chip after gpu hang
<6>[  108.947934] [drm] stuck on render ring
<6>[  108.955718] [drm] GPU HANG: ecode 0:0x97f4fffe, in kms_flip [4654], reason: Ring hung, action: reset
<6>[  108.957675] [drm] Simulated gpu hang, resetting stop_rings
<5>[  108.957680] drm/i915: Resetting chip after gpu hang
Comment 3 lu hua 2014-11-03 06:43:45 UTC
It also impacts BSW.
[root@x-bsw01 tests]# ./drv_hangman --run-subtest error-state-basic
IGT-Version: 1.8-gab5f7ea (x86_64) (Linux: 3.18.0-rc2_drm-intel-nightly_203b34_20141103+ x86_64)
Subtest error-state-basic: SUCCESS (15.659s)
[root@x-bsw01 tests]# dmesg -r|egrep "<[1-4]>"|grep drm
<3>[  235.778310] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
Comment 4 Mika Kuoppala 2014-11-03 14:49:43 UTC
Please 'cat /sys/module/i915/parameters/enable_ppgtt' and paste the result.

Does this happen if you use i915_enable.ppgtt=1?
Comment 5 lu hua 2014-11-04 06:10:30 UTC
# cat /sys/module/i915/parameters/enable_ppgtt
0

Add i915_enable.ppgtt=1, it still happens.

igt_gem_workarounds_reset also has this error.
Dmesg:
<6>[ 8678.817582] [drm] stuck on render ring
<6>[ 8678.824135] [drm] GPU HANG: ecode 0:0x97f4fffe, in gem_workarounds [20809], reason: Ring hung, action: reset
<6>[ 8678.824143] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
<6>[ 8678.824146] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
<6>[ 8678.824148] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
<6>[ 8678.824151] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
<6>[ 8678.824153] [drm] GPU crash dump saved to /sys/class/drm/card0/error
<3>[ 8678.824242] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
<6>[ 8678.826343] [drm] Simulated gpu hang, resetting stop_rings
<5>[ 8678.826347] drm/i915: Resetting chip after gpu hang
Comment 6 lu hua 2014-11-04 07:41:05 UTC
(In reply to lu hua from comment #5)
> # cat /sys/module/i915/parameters/enable_ppgtt
> 0
> 
> Add i915_enable.ppgtt=1, it still happens.
> 

i915.enable_ppgtt=1
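
For scripted runs, the value requested in comment 4 can be read straight from sysfs. A small sketch; the function name and the sysfs_root argument are illustrative assumptions, while the path layout is the one used by the cat command above:

```python
from pathlib import Path


def read_module_param(module: str, param: str, sysfs_root: str = "/sys/module") -> str:
    """Return the current value of a loaded module's parameter, without the newline."""
    return (Path(sysfs_root) / module / "parameters" / param).read_text().strip()


# Example, on a machine with i915 loaded:
#   read_module_param("i915", "enable_ppgtt")
```

Recording this value next to each test result makes it easier to tell which runs used i915.enable_ppgtt=1.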
Comment 7 Mika Kuoppala 2014-11-04 09:24:41 UTC
Created attachment 108880
Wait old forcewake ack to clear on vlv
Comment 8 Mika Kuoppala 2014-11-04 14:39:57 UTC
*** Bug 85358 has been marked as a duplicate of this bug. ***
Comment 9 lu hua 2014-11-05 06:35:09 UTC
(In reply to Mika Kuoppala from comment #7)
> Created attachment 108880
> Wait old forcewake ack to clear on vlv

Tested this patch for 10 cycles; the error did not occur.
Comment 10 Daniel Vetter 2014-11-05 12:40:58 UTC
Apparently a regression due to

> >  commit 5cb13c07dae73380d8b3ddc792740487b8742938
> >  Author: Deepak S <deepak.s@linux.intel.com>
> >  Date:   Thu Sep 18 18:51:50 2014 +0530
> >
> >     drm/i915/vlv: Remove check for Old Ack during forcewake


Marking it as such.
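
This thread does not show the patch contents, so the following is only a toy model built from the commit and attachment titles: before requesting forcewake, wait for the previous ack to clear, then set the request and wait for the new ack. FakeForcewake and its lag behaviour are invented for illustration and make no claim about real VLV hardware or about the i915 code; only the second error string is taken from the logs above.

```python
from typing import Callable, List


class FakeForcewake:
    """Toy register pair: the ack bit follows the request bit after `lag` polls."""

    def __init__(self, lag: int = 3) -> None:
        self.request = 0
        self.ack = 0
        self.lag = lag
        self._pending = 0

    def write_request(self, value: int) -> None:
        self.request = value
        self._pending = self.lag

    def read_ack(self) -> int:
        # Each poll advances the fake hardware by one step.
        if self.ack != self.request:
            if self._pending > 0:
                self._pending -= 1
            else:
                self.ack = self.request
        return self.ack


def wait_for(condition: Callable[[], bool], max_polls: int) -> bool:
    """Poll `condition` up to `max_polls` times; True if it became true in time."""
    for _ in range(max_polls):
        if condition():
            return True
    return False


def force_wake_get(hw: FakeForcewake, max_polls: int = 50) -> List[str]:
    """Request wake; return the error messages produced (empty list on success)."""
    errors: List[str] = []
    # Step 1 (the "old ack" check): let the ack from a previous release clear.
    if not wait_for(lambda: hw.read_ack() == 0, max_polls):
        errors.append("old ack did not clear")  # hypothetical message
    # Step 2: set the request and wait for the new ack.
    hw.write_request(1)
    if not wait_for(lambda: hw.read_ack() == 1, max_polls):
        errors.append("Timed out: waiting for Render to ack.")
    return errors


def force_wake_put(hw: FakeForcewake) -> None:
    """Release the wake request; the ack clears only after the hardware lag."""
    hw.write_request(0)
```

In this model a get issued right after a put first sees the stale ack from the earlier request, which is the situation step 1 is meant to cover. The model does not reproduce the timeout reported here, since that depends on hardware behaviour not described in this thread.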
Comment 11 lu hua 2014-11-10 07:08:31 UTC
Tested 10 cycles on the latest -nightly kernel; it works well.
root@x-bytm02:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./drv_hangman --run-subtest error-state-capture-bsd
IGT-Version: 1.8-gc049c39 (x86_64) (Linux: 3.18.0-rc3_drm-intel-nightly_b921a5_20141110+ x86_64)
Subtest error-state-capture-bsd: SUCCESS (13.560s)
root@x-bytm02:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# dmesg -r|egrep "<[1-4]>"|grep drm
Comment 12 lu hua 2014-11-11 03:27:55 UTC
Created attachment 109259
dmesg(gem_reset_stats)

Tested all gem_reset_stats subtests on BYT; the error still occurs.

root@x-byt06:/GFX/Test/Infrastructure/infrastructure# dmesg -r|egrep "<[1-4]>"|grep drm
<3>[  162.062426] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
<3>[  237.209028] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
<3>[  249.232659] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
<3>[  505.732108] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.

Intel_gpu_tools/igt_gem_reset_stats_ban-blt                            DMESG_WARN  reboot
Intel_gpu_tools/igt_gem_reset_stats_ban-ctx-blt                        NSPT        reboot
Intel_gpu_tools/igt_gem_reset_stats_ban-ctx-bsd                        NSPT        reboot
Intel_gpu_tools/igt_gem_reset_stats_ban-ctx-render                     PASS
Intel_gpu_tools/igt_gem_reset_stats_ban-ctx-vebox                      NSPT        reboot
Intel_gpu_tools/igt_gem_reset_stats_ban-render                         PASS
Intel_gpu_tools/igt_gem_reset_stats_ban-vebox                          NSPT        reboot
Intel_gpu_tools/igt_gem_reset_stats_close-pending-blt                  DMESG_WARN  reboot
Intel_gpu_tools/igt_gem_reset_stats_close-pending-bsd                  DMESG_WARN  reboot
Intel_gpu_tools/igt_gem_reset_stats_close-pending-ctx-blt              NSPT        reboot
Intel_gpu_tools/igt_gem_reset_stats_close-pending-ctx-bsd              NSPT        reboot
Intel_gpu_tools/igt_gem_reset_stats_close-pending-ctx-render           PASS
Intel_gpu_tools/igt_gem_reset_stats_close-pending-ctx-vebox            NSPT        reboot
Intel_gpu_tools/igt_gem_reset_stats_close-pending-fork-blt             PASS
Intel_gpu_tools/igt_gem_reset_stats_close-pending-fork-bsd             PASS
Intel_gpu_tools/igt_gem_reset_stats_close-pending-fork-render          PASS
Intel_gpu_tools/igt_gem_reset_stats_close-pending-fork-reverse-blt     PASS
Intel_gpu_tools/igt_gem_reset_stats_close-pending-fork-reverse-bsd     PASS
Intel_gpu_tools/igt_gem_reset_stats_close-pending-fork-reverse-render  PASS
Intel_gpu_tools/igt_gem_reset_stats_close-pending-fork-reverse-vebox   NSPT        reboot
Intel_gpu_tools/igt_gem_reset_stats_close-pending-fork-vebox           NSPT        reboot
Intel_gpu_tools/igt_gem_reset_stats_close-pending-render               PASS
Intel_gpu_tools/igt_gem_reset_stats_close-pending-vebox                NSPT        reboot
Intel_gpu_tools/igt_gem_reset_stats_params                             PASS
Intel_gpu_tools/igt_gem_reset_stats_params-ctx-blt                     NSPT        reboot
Intel_gpu_tools/igt_gem_reset_stats_params-ctx-bsd                     NSPT        reboot
Intel_gpu_tools/igt_gem_reset_stats_params-ctx-render                  PASS
Intel_gpu_tools/igt_gem_reset_stats_params-ctx-vebox                   NSPT        reboot
Intel_gpu_tools/igt_gem_reset_stats_reset-count-blt                    PASS
Intel_gpu_tools/igt_gem_reset_stats_reset-count-bsd                    PASS
Intel_gpu_tools/igt_gem_reset_stats_reset-count-ctx-blt                NSPT        reboot
Intel_gpu_tools/igt_gem_reset_stats_reset-count-ctx-bsd                NSPT        reboot
Intel_gpu_tools/igt_gem_reset_stats_reset-count-ctx-render             PASS
Intel_gpu_tools/igt_gem_reset_stats_reset-count-ctx-vebox              NSPT        reboot
Intel_gpu_tools/igt_gem_reset_stats_reset-count-render                 PASS
Intel_gpu_tools/igt_gem_reset_stats_reset-count-vebox                  NSPT        reboot
Intel_gpu_tools/igt_gem_reset_stats_reset-stats-blt                    DMESG_WARN  reboot
Intel_gpu_tools/igt_gem_reset_stats_reset-stats-bsd                    PASS
Intel_gpu_tools/igt_gem_reset_stats_reset-stats-ctx-blt                NSPT        reboot
Intel_gpu_tools/igt_gem_reset_stats_reset-stats-ctx-bsd                NSPT        reboot
Intel_gpu_tools/igt_gem_reset_stats_reset-stats-ctx-render             PASS
Intel_gpu_tools/igt_gem_reset_stats_reset-stats-ctx-vebox              NSPT        reboot
Intel_gpu_tools/igt_gem_reset_stats_reset-stats-render                 PASS
Intel_gpu_tools/igt_gem_reset_stats_reset-stats-vebox                  NSPT        reboot
Intel_gpu_tools/igt_gem_reset_stats_unrelated-ctx-blt                  NSPT        reboot
Intel_gpu_tools/igt_gem_reset_stats_unrelated-ctx-bsd                  NSPT        reboot
Intel_gpu_tools/igt_gem_reset_stats_unrelated-ctx-render               PASS
Intel_gpu_tools/igt_gem_reset_stats_unrelated-ctx-vebox                NSPT        reboot
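
A listing like the one above is easier to scan when grouped by result. A short sketch, assuming each line is whitespace-separated as "test name, status, optional action"; the function name is hypothetical, and the statuses are taken as opaque strings exactly as the harness prints them:

```python
from typing import Dict, List


def group_by_status(report: str) -> Dict[str, List[str]]:
    """Map each status string to the list of test names that reported it."""
    groups: Dict[str, List[str]] = {}
    for line in report.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        name, status = fields[0], fields[1]
        groups.setdefault(status, []).append(name)
    return groups


sample_report = """\
Intel_gpu_tools/igt_gem_reset_stats_ban-blt    DMESG_WARN         reboot
Intel_gpu_tools/igt_gem_reset_stats_ban-ctx-blt    NSPT   reboot
Intel_gpu_tools/igt_gem_reset_stats_ban-ctx-render    PASS
"""
for status, names in sorted(group_by_status(sample_report).items()):
    print(status, len(names))
```

Filtering the result for DMESG_WARN gives the subtests worth rerunning individually when chasing the forcewake timeout.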
Comment 13 lu hua 2014-11-11 05:27:46 UTC
Ran the full drv_hangman test; the error still occurs.
root@x-byt06:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./drv_hangman
IGT-Version: 1.8-gc049c39 (x86_64) (Linux: 3.18.0-rc3_drm-intel-nightly_f56717_20141105+ x86_64)
Subtest error-state-debugfs-entry: SUCCESS (0.000s)
Subtest error-state-sysfs-entry: SUCCESS (0.000s)
Subtest ring-stop-sysfs-entry: SUCCESS (0.001s)
Subtest error-state-basic: SUCCESS (6.057s)
Subtest error-state-capture-render: SUCCESS (7.766s)
Subtest error-state-capture-bsd: SUCCESS (12.350s)
Subtest error-state-capture-blt: SUCCESS (19.028s)
Test requirement not met in function gem_require_ring, file ioctl_wrappers.c:881:
Test requirement: gem_has_vebox(fd)
Subtest error-state-capture-vebox: SKIP (0.000s)
root@x-byt06:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# dmesg -r|egrep "<[1-4]>"|grep drm
<3>[ 6621.654382] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
Comment 14 Daniel Vetter 2014-11-18 09:14:27 UTC
Offending commit has been reverted:

commit 9500986159d9017f9277f9e42a1f9b13b5e0e666
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date:   Wed Nov 5 17:30:52 2014 +0200

    Revert "drm/i915/vlv: Remove check for Old Ack during forcewake"
Comment 15 Guo Jinxian 2014-11-19 06:16:32 UTC
Verified on latest -nightly (3cb89f9eef2888c848248bf45d6dd0d67c594586)

root@x-byt06:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./drv_hangman --run-subtest error-state-capture-bsd
IGT-Version: 1.8-gaa63fc7 (x86_64) (Linux: 3.18.0-rc5_kcloud_3cb89f_20141119+ x86_64)
Subtest error-state-capture-bsd: SUCCESS (12.330s)
root@x-byt06:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# dmesg -r|egrep "<[1-4]>"|grep drm
root@x-byt06:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests#
Comment 16 Jari Tahvanainen 2017-08-15 06:54:16 UTC
Closing old verified and fixed bug.

