Bug 84892

Summary: [SNB/BYT/SKL]igt/gem_concurrent_blit/gem_evict_everything sporadically causes *ERROR* timed out waiting for Punit
Product: DRI Reporter: lu hua <huax.lu>
Component: DRM/Intel Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: CLOSED FIXED QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: major    
Priority: highest CC: intel-gfx-bugs, jinxianx.guo
Version: unspecified   
Hardware: All   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:

Description lu hua 2014-10-11 03:42:52 UTC
==System Environment==
--------------------------
Regression: not sure, unstable

Non-working platforms: BYT

==kernel==
--------------------------
drm-intel-nightly/ea4bec8e96ea8b33b49a7892c1c7f20041a56da6

==Bug detailed description==
-----------------------------
The tests sporadically cause "*ERROR* timed out waiting for Punit" during automated testing. I can't reproduce it manually.
The following subcases have this issue:
case                                             fail rate(in recent 10 days)
cpu-bcs-early-read-interruptible                   1/10
cpu-bcs-overwrite-source-forked                    2/10
cpu-rcs-gpu-read-after-write-forked                1/10
cpu-rcs-overwrite-source-interruptible             1/10
gtt-bcs-early-read-interruptible                   1/10
gttX-bcs-overwrite-source-interruptible            1/10
gtt-rcs-early-read-interruptible                   1/10  
gtt-rcs-overwrite-source-forked                    1/10
gttX-bcs-overwrite-source-interruptible            1/10
prw-bcs-gpu-read-after-write-interruptible         1/10
gttX-bcs-gpu-read-after-write-interruptible        1/10

run log:
@test: Intel_gpu_tools/igt_gem_concurrent_blit_gtt-rcs-overwrite-source-forked
info: @@@Returncode: 0

test case start at: Wed Oct  1 00:46:50 2014
test case end at:   Wed Oct  1 00:47:07 2014

Errors:


Dmesg:
<3>[ 1230.838096] [drm:vlv_set_rps_idle] *ERROR* timed out waiting for Punit
<3>[ 1231.595680] [drm:vlv_set_rps_idle] *ERROR* timed out waiting for Punit


Output:
             command   pid dev master a   uid      magic
Test Environment check: Succeeded.
[1/1] dmesg-warn: 1 Running Test(s): 0
[1/1] dmesg-warn: 1 Running Test(s): 1
Thank you for running Piglit!
Results have been written to /GFX/Test/Piglit/piglit/t
{
    "results_version": 1,
    "name": "t",
    "options": {
        "profile": [
            "tests/igt.py"
        ],
        "dmesg": false,
        "execute": true,
        "log_level": "quiet",
        "platform": "mixed_glx_egl",
        "test_suffix": "",
        "valgrind": false,
        "sync": false,
        "filter": [
            "igt/gem_concurrent_blit/gtt-rcs-overwrite-source-forked$"
        ],
        "concurrent": "some",
        "test_count": 0,
        "exclude_tests": [],
        "exclude_filter": []
    },
    "lspci": "00:00.0 Host bridge: Intel Corporation ValleyView SSA-CUnit\n00:02.0 VGA compatible controller: Intel Corporation ValleyView Gen7\n00:13.0 SATA controller: Intel Corporation ValleyView 6-Port SATA AHCI Controller\n00:14.0 USB controller: Intel Corporation ValleyView USB xHCI Host Controller\n00:1a.0 Encryption controller: Intel Corporation ValleyView SEC\n00:1b.0 Audio device: Intel Corporation ValleyView High Definition Audio Controller\n00:1c.0 PCI bridge: Intel Corporation ValleyView PCI Express Root Port\n00:1c.1 PCI bridge: Intel Corporation ValleyView PCI Express Root Port\n00:1c.2 PCI bridge: Intel Corporation ValleyView PCI Express Root Port\n00:1c.3 PCI bridge: Intel Corporation ValleyView PCI Express Root Port\n00:1f.0 ISA bridge: Intel Corporation ValleyView Power Control Unit\n00:1f.3 SMBus: Intel Corporation ValleyView SMBus Controller\n01:00.0 Ethernet controller: Intel Corporation 82572EI Gigabit Ethernet Controller (Copper) (rev 06)\n",
    "uname": "Linux x-byt06 3.17.0-rc7_drm-intel-nightly_54faa9_20141002+ #142 SMP Thu Oct 2 11:25:44 CST 2014 x86_64 x86_64 x86_64 GNU/Linux\n",
    "tests": {
        "igt/gem_concurrent_blit/gtt-rcs-overwrite-source-forked": {
            "dmesg": "[ 1230.838096] [drm:vlv_set_rps_idle] *ERROR* timed out waiting for Punit\n[ 1231.595680] [drm:vlv_set_rps_idle] *ERROR* timed out waiting for Punit",
            "returncode": 0,
            "err": "",
            "environment": "PIGLIT_SOURCE_DIR=\"/GFX/Test/Piglit/piglit\" PIGLIT_PLATFORM=\"mixed_glx_egl\"",
            "command": "/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests/gem_concurrent_blit --run-subtest gtt-rcs-overwrite-source-forked",
            "result": "dmesg-warn",
            "time": 11.689888000488281,
            "out": "IGT-Version: 1.8-g5782eca (x86_64) (Linux: 3.17.0-rc7_drm-intel-nightly_54faa9_20141002+ x86_64)\nusing 2x512 buffers, each 1MiB\nSubtest gtt-rcs-overwrite-source-forked: SUCCESS (11.563s)\n"
        }
    },
    "time_elapsed": 11.854626178741455
}
returncode: 0
result: dmesg-warn
summary: Intel_gpu_tools/igt_gem_concurrent_blit_gtt-rcs-overwrite-source-forked    DMESG_WARN    reboot
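
A piglit result file like the one above can be scanned mechanically for tests flagged as dmesg-warn. The following is a minimal Python sketch, assuming only the layout visible in this report (a top-level "tests" object keyed by test name, each entry carrying "result" and "dmesg" fields). The helper name dmesg_warnings is hypothetical and is not part of piglit.

```python
import json


def dmesg_warnings(results_text):
    """Return {test name: [dmesg lines]} for tests whose result is "dmesg-warn".

    Assumes the layout seen in this report: a top-level "tests" object
    whose values carry "result" and "dmesg" fields.
    """
    results = json.loads(results_text)
    flagged = {}
    for name, entry in results.get("tests", {}).items():
        if entry.get("result") == "dmesg-warn":
            flagged[name] = entry.get("dmesg", "").splitlines()
    return flagged


if __name__ == "__main__":
    # Sample data copied from the run log in this report.
    sample = json.dumps({
        "tests": {
            "igt/gem_concurrent_blit/gtt-rcs-overwrite-source-forked": {
                "result": "dmesg-warn",
                "dmesg": "[ 1230.838096] [drm:vlv_set_rps_idle] "
                         "*ERROR* timed out waiting for Punit",
            }
        }
    })
    for test, lines in dmesg_warnings(sample).items():
        print(test, len(lines))
```

Running this over the results of each of the 10 cycles would give a per-subcase count comparable to the fail-rate table in the description.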

==Reproduce steps==
---------------------------- 
1. Run all igt cases for 10 cycles.
Comment 1 lu hua 2014-10-11 03:45:16 UTC
gem_evict_everything subcases also have this error

run log:
@test: Intel_gpu_tools/igt_gem_evict_everything_forked-swapping-multifd-interruptible
info: @@@Returncode: 0

test case start at: Thu Oct  9 02:34:47 2014
test case end at:   Thu Oct  9 02:35:21 2014

Errors:


Dmesg:
<3>[ 1978.117344] [drm:vlv_set_rps_idle] *ERROR* timed out waiting for Punit


Output:
             command   pid dev master a   uid      magic
Test Environment check: Succeeded.
[1/1] dmesg-warn: 1 Running Test(s): 0
[1/1] dmesg-warn: 1 Running Test(s): 1
Thank you for running Piglit!
Results have been written to /GFX/Test/Piglit/piglit/t
{
    "results_version": 1,
    "time_elapsed": 30.338670015335083,
    "tests": {
        "igt/gem_evict_everything/forked-swapping-multifd-interruptible": {
            "dmesg": "[ 1978.117344] [drm:vlv_set_rps_idle] *ERROR* timed out waiting for Punit",
            "returncode": 0,
            "err": "",
            "environment": "PIGLIT_SOURCE_DIR=\"/GFX/Test/Piglit/piglit\" PIGLIT_PLATFORM=\"mixed_glx_egl\"",
            "command": "/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests/gem_evict_everything --run-subtest forked-swapping-multifd-interruptible",
            "result": "dmesg-warn",
            "time": 30.12121319770813,
            "out": "IGT-Version: 1.8-gb7d80d1 (x86_64) (Linux: 3.17.0_drm-intel-nightly_ea4bec_20141010+ x86_64)\nSubtest forked-swapping-multifd-interruptible: SUCCESS (29.879s)\n"
        }
    },
    "name": "t",
    "options": {
        "profile": [
            "tests/igt.py"
        ],
        "dmesg": false,
        "execute": true,
        "platform": "mixed_glx_egl",
        "valgrind": false,
        "sync": false,
        "filter": [
            "igt/gem_evict_everything/forked-swapping-multifd-interruptible$"
        ],
        "concurrent": "some",
        "exclude_tests": [],
        "env": {
            "lspci": "00:00.0 Host bridge: Intel Corporation ValleyView SSA-CUnit\n00:02.0 VGA compatible controller: Intel Corporation ValleyView Gen7\n00:13.0 SATA controller: Intel Corporation ValleyView 6-Port SATA AHCI Controller\n00:14.0 USB controller: Intel Corporation ValleyView USB xHCI Host Controller\n00:1a.0 Encryption controller: Intel Corporation ValleyView SEC\n00:1b.0 Audio device: Intel Corporation ValleyView High Definition Audio Controller\n00:1c.0 PCI bridge: Intel Corporation ValleyView PCI Express Root Port\n00:1c.1 PCI bridge: Intel Corporation ValleyView PCI Express Root Port\n00:1c.2 PCI bridge: Intel Corporation ValleyView PCI Express Root Port\n00:1c.3 PCI bridge: Intel Corporation ValleyView PCI Express Root Port\n00:1f.0 ISA bridge: Intel Corporation ValleyView Power Control Unit\n00:1f.3 SMBus: Intel Corporation ValleyView SMBus Controller\n01:00.0 Ethernet controller: Intel Corporation 82572EI Gigabit Ethernet Controller (Copper) (rev 06)\n",
            "uname": "Linux x-byt06 3.17.0_drm-intel-nightly_ea4bec_20141010+ #358 SMP Fri Oct 10 11:24:22 CST 2014 x86_64 x86_64 x86_64 GNU/Linux\n"
        },
        "exclude_filter": []
    }
}
returncode: 0
result: dmesg-warn
summary: Intel_gpu_tools/igt_gem_evict_everything_forked-swapping-multifd-interruptible    DMESG_WARN     reboot
Comment 2 Guo Jinxian 2014-10-16 05:57:05 UTC
I hit another dmesg error while running tests on BSW:


[root@x-bsw01 tests]# ./gem_concurrent_blit --run-subtest gpuX-bcs-overwrite-source-forked
IGT-Version: 1.8-ga0b5c6d (x86_64) (Linux: 3.17.0_drm-intel-nightly_2ea23c_20141014+ x86_64)
using 2x512 buffers, each 1MiB
Subtest gpuX-bcs-overwrite-source-forked: SUCCESS (9.268s)
[root@x-bsw01 tests]# dmesg -r|egrep "<[1-4]>"|grep drm
<3>[ 2460.492212] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
<3>[ 2463.683662] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
<3>[ 2463.710842] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
<3>[ 2463.718482] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
<3>[ 2466.467510] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
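
The dmesg -r | egrep "<[1-4]>" | grep drm filter used above can also be written as a script, which is convenient when the check runs inside a test harness. This is a rough Python equivalent, assuming raw dmesg lines that begin with a <N> priority prefix as in the output above. Unlike the egrep pattern, it only looks for the priority at the start of the line. The helper name filter_drm_errors is hypothetical.

```python
import re

# Raw dmesg lines (dmesg -r) start with a "<N>" priority prefix.
PRIORITY_RE = re.compile(r"^<([0-9]+)>")


def filter_drm_errors(lines, max_priority=4):
    """Keep lines with priority 1..max_priority that mention "drm"."""
    kept = []
    for line in lines:
        match = PRIORITY_RE.match(line)
        if not match:
            # No priority prefix: not a raw dmesg line, skip it.
            continue
        priority = int(match.group(1))
        if 1 <= priority <= max_priority and "drm" in line:
            kept.append(line)
    return kept
```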
Comment 3 lu hua 2014-10-21 03:08:42 UTC
igt/gem_ring_sync_copy/sync-render-blitter-write-write also has this issue.

@test: Intel_gpu_tools/igt_gem_ring_sync_copy_sync-render-blitter-write-write
info: @@@Returncode: 0

test case start at: Mon Oct 20 13:09:51 2014
test case end at:   Mon Oct 20 13:10:38 2014

Errors:


Dmesg:
<3>[  748.059922] [drm:gen6_rps_idle [i915]] *ERROR* timed out waiting for Punit
Comment 4 lu hua 2014-11-03 02:23:52 UTC
gem_render_linear_blits also has this issue.
@test: Intel_gpu_tools/igt_gem_render_linear_blits
info: @@@Returncode: 0

test case start at: Sun Nov  2 11:16:52 2014
test case end at:   Sun Nov  2 11:24:38 2014

Errors:


Dmesg:
<3>[ 6264.239492] [drm:gen6_rps_idle [i915]] *ERROR* timed out waiting for Punit


Output:
             command   pid dev master a   uid      magic
Test Environment check: Succeeded.
[1/1] dmesg-warn: 1 Running Test(s): 0
[1/1] dmesg-warn: 1 Running Test(s): 1
Comment 5 lu hua 2014-11-13 05:29:07 UTC
gem_render_tiled_blits also has this error on BYT.
@test: Intel_gpu_tools/igt_gem_render_tiled_blits
info: @@@Returncode: 0

test case start at: Wed Nov 12 09:27:34 2014
test case end at:   Wed Nov 12 09:34:15 2014

Errors:


Dmesg:
<3>[ 4473.725127] [drm:gen6_rps_idle [i915]] *ERROR* timed out waiting for Punit
Comment 6 Imre Deak 2014-11-19 15:19:38 UTC
Could you give a try to the following:
http://lists.freedesktop.org/archives/intel-gfx/2014-November/055832.html
Comment 7 Guo Jinxian 2014-11-20 06:43:23 UTC
(In reply to Imre Deak from comment #6)
> Could you give a try to the following:
> http://lists.freedesktop.org/archives/intel-gfx/2014-November/055832.html

With this patch, I ran gem_concurrent_blit 10 times and could not reproduce this issue.
Comment 8 wendy.wang 2014-12-17 03:21:48 UTC
Because of this bug, QA disabled 131 IGT cases.
Comment 9 fangxun 2015-02-06 04:19:03 UTC
It also happens on SKL.
igt/gem_concurrent_blit/cpu-rcs-gpu-read-after-write-forked
igt/gem_concurrent_blit/gpu-rcs-gpu-read-after-write
igt/gem_concurrent_blit/gpu-rcs-gpu-read-after-write-forked
igt/gem_concurrent_blit/gpu-rcs-gpu-read-after-write-interruptible
igt/gem_concurrent_blit/gpu-rcs-overwrite-source-interruptible
igt/gem_concurrent_blit/gtt-rcs-early-read-forked
igt/gem_concurrent_blit/gtt-rcs-early-read-interruptible
igt/gem_concurrent_blit/gtt-rcs-gpu-read-after-write-forked
igt/gem_concurrent_blit/gtt-rcs-gpu-read-after-write-interruptible
igt/gem_concurrent_blit/prw-rcs-gpu-read-after-write-interruptible
igt/gem_concurrent_blit/prw-rcs-overwrite-source-forked
igt/gem_concurrent_blit/prw-rcs-overwrite-source-interruptible
igt/gem_render_linear_blits
igt/gem_render_tiled_blits
Comment 10 Ding Heng 2015-03-11 08:16:54 UTC
(In reply to fangxun from comment #9)
> It also happens on SKL.
> igt/gem_concurrent_blit/cpu-rcs-gpu-read-after-write-forked
> igt/gem_concurrent_blit/gpu-rcs-gpu-read-after-write
> igt/gem_concurrent_blit/gpu-rcs-gpu-read-after-write-forked
> igt/gem_concurrent_blit/gpu-rcs-gpu-read-after-write-interruptible
> igt/gem_concurrent_blit/gpu-rcs-overwrite-source-interruptible
> igt/gem_concurrent_blit/gtt-rcs-early-read-forked
> igt/gem_concurrent_blit/gtt-rcs-early-read-interruptible
> igt/gem_concurrent_blit/gtt-rcs-gpu-read-after-write-forked
> igt/gem_concurrent_blit/gtt-rcs-gpu-read-after-write-interruptible
> igt/gem_concurrent_blit/prw-rcs-gpu-read-after-write-interruptible
> igt/gem_concurrent_blit/prw-rcs-overwrite-source-forked
> igt/gem_concurrent_blit/prw-rcs-overwrite-source-interruptible
> igt/gem_render_linear_blits
> igt/gem_render_tiled_blits

Adding SNB to this issue.
Comment 11 wendy.wang 2015-03-31 01:53:33 UTC
@Imre Deak 

On 2014-11-20, QA tested your patch and the result was positive. Do you plan to merge it? Thanks.
Comment 12 Daniel Vetter 2015-04-02 08:42:59 UTC
(In reply to wendy.wang from comment #11)
> @Imre Deak 
> 
> On 2014-11-20, QA tested your patch, the result shows positive, will you
> plan to merge up your patch? Thanks.

It was merged a long time ago:

commit 2837ac40698d0931727b957a40c8c8ea27c3bcb2
Author: Imre Deak <imre.deak@intel.com>
Date:   Wed Nov 19 16:25:38 2014 +0200

    drm/i915: vlv: increase timeout when setting idle GPU freq

But this will only work for byt, so you need to file a new bug for snb/skl.
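
For readers unfamiliar with the pattern behind a "timed out waiting for ..." message: a poll-until-ready loop gives up once a deadline passes, so a slow but otherwise healthy response shows up as a sporadic error. Going only by its title, the commit above lengthens such a deadline. The following Python sketch illustrates the general pattern only. It is not the kernel code from the commit, and wait_for with its parameters is a hypothetical helper.

```python
import time


def wait_for(condition, timeout_s, poll_interval_s=0.001,
             clock=time.monotonic, sleep=time.sleep):
    """Poll condition() until it returns True or timeout_s elapses.

    Returns True if the condition was met in time, False on timeout.
    clock and sleep are injectable so the loop can be tested without
    real delays.
    """
    deadline = clock() + timeout_s
    while True:
        if condition():
            return True
        if clock() >= deadline:
            # The caller would log the "timed out" error at this point.
            return False
        sleep(poll_interval_s)
```

With a response that arrives after 50 ms, a 10 ms timeout reports a failure while a 100 ms timeout succeeds, which is consistent with an error that appears only under heavy load.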
Comment 13 ye.tian 2015-04-03 10:15:01 UTC
Tested gem_concurrent_blit on SNB and BYT with the latest nightly kernel and latest igt. All subcases pass.
I will test the gem_evict_everything case next week.
Comment 14 ye.tian 2015-04-07 08:52:12 UTC
Tested gem_evict_everything on BYT with the latest nightly kernel and latest igt.

output (successful subtests that took less than 600s are omitted):
--------------------
root@x-bytm02:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./gem_evict_everything
IGT-Version: 1.10-g43a1f64 (x86_64) (Linux: 4.0.0-rc6_drm-intel-nightly_333cf6_20150403+ x86_64)
Subtest forked-mempressure-normal: SUCCESS (790.305s)
Subtest forked-mempressure-interruptible: SUCCESS (809.774s)
Subtest forked-multifd-mempressure-normal: SUCCESS (789.705s)
Subtest forked-multifd-mempressure-interruptible: SUCCESS (809.812s)
Test requirement not met in function intel_require_memory, file intel_os.c:244:
Test requirement: !(total <= required)
Estimated that we need 6442455040 bytes for the test, but only have 3911188480 bytes available (RAM)
Subtest major-normal: SKIP (0.003s)

Test requirement not met in function intel_require_memory, file intel_os.c:244:
Test requirement: !(total <= required)
Estimated that we need 6442455040 bytes for the test, but only have 3909091328 bytes available (RAM)
Subtest major-interruptible: SKIP (0.004s)
Subtest mlocked-hang: SUCCESS (1245.226s)
Test requirement not met in function intel_require_memory, file intel_os.c:244:
Test requirement: !(total <= required)
Estimated that we need 6442455040 bytes for the test, but only have 3908042752 bytes available (RAM)
Subtest major-hang: SKIP (0.004s)
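
The trimming applied to the output above (dropping subtests that succeeded in under 600s) can be reproduced with a short script. This is a minimal Python sketch, assuming the "Subtest NAME: STATUS (TIMEs)" line format shown above. The helper names parse_subtests and slow_or_failed are hypothetical and not part of igt.

```python
import re

# Matches lines such as "Subtest mlocked-hang: SUCCESS (1245.226s)".
SUBTEST_RE = re.compile(r"^Subtest (\S+): (\w+) \(([0-9.]+)s\)$")


def parse_subtests(output):
    """Return a list of (name, status, seconds) tuples from igt output."""
    results = []
    for line in output.splitlines():
        match = SUBTEST_RE.match(line.strip())
        if match:
            name, status, seconds = match.groups()
            results.append((name, status, float(seconds)))
    return results


def slow_or_failed(results, threshold_s=600.0):
    """Drop subtests that succeeded in under threshold_s seconds."""
    return [r for r in results
            if not (r[1] == "SUCCESS" and r[2] < threshold_s)]
```

Applied to the full gem_evict_everything output, this keeps the long-running SUCCESS lines and every SKIP line, matching what is listed in this comment.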
Comment 15 lu hua 2015-04-08 06:59:11 UTC
Verified. Fixed.
Comment 16 Jari Tahvanainen 2017-08-14 08:35:15 UTC
Moving old bug from Verified to Closed.
