==System Environment==
--------------------------
Regression: not sure
Non-working platforms: BSW

==kernel==
--------------------------
drm-intel-nightly/34d267c2ba9c0845432baf959a2c4deed87f3ee4

==Bug detailed description==
-----------------------------
When running the automated tests, the kernel sporadically logs "*ERROR* Timed out: waiting for Render to ack." I am unable to reproduce it manually. Over multiple cycles it shows up in different subcases.

log:
@test: Intel_gpu_tools/igt_gem_concurrent_blit_gpu-bcs-early-read-forked
info: @@@Returncode: 0
test case start at: Sat Jan 6 23:19:38 2001
test case end at: Sat Jan 6 23:19:51 2001
Errors:
Dmesg:
<3>[ 895.532394] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
Output:
command pid dev master a uid magic
Test Environment check: Succeeded.
[1/1] dmesg-warn: 1 |
[1/1] dmesg-warn: 1 /
Thank you for running Piglit!
Results have been written to /GFX/Test/Piglit/piglit/t
{
  "results_version": 2,
  "uname": "Linux x-bsw01 3.18.0_drm-intel-nightly_34d267_20141209+ #2416 SMP Tue Dec 9 11:25:02 CST 2014 x86_64 x86_64 x86_64 GNU/Linux\n",
  "time_elapsed": 7.836741924285889,
  "tests": {
    "igt/gem_concurrent_blit/gpu-bcs-early-read-forked": {
      "dmesg": "[ 895.532394] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.",
      "returncode": 0,
      "err": "",
      "environment": "PIGLIT_SOURCE_DIR=\"/GFX/Test/Piglit/piglit\" PIGLIT_PLATFORM=\"mixed_glx_egl\"",
      "command": "/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests/gem_concurrent_blit --run-subtest gpu-bcs-early-read-forked",
      "result": "dmesg-warn",
      "time": 7.61755907535553,
      "out": "IGT-Version: 1.8-gf333981 (x86_64) (Linux: 3.18.0_drm-intel-nightly_34d267_20141209+ x86_64)\nusing 2x512 buffers, each 1MiB\nSubtest gpu-bcs-early-read-forked: SUCCESS (7.422s)\n"
    }
  },
  "name": "t",
  "lspci": "00:00.0 Host bridge: Intel Corporation Device 2280 (rev 15)\n00:02.0 VGA compatible controller: Intel Corporation Device 22b0 (rev 15)\n00:03.0 Multimedia controller: Intel Corporation Device 22b8 (rev 15)\n00:0b.0 Signal processing controller: Intel Corporation Device 22dc (rev 15)\n00:13.0 SATA controller: Intel Corporation Device 22a3 (rev 15)\n00:14.0 USB controller: Intel Corporation Device 22b5 (rev 15)\n00:1a.0 Encryption controller: Intel Corporation Device 2298 (rev 15)\n00:1b.0 Audio device: Intel Corporation Device 2284 (rev 15)\n00:1c.0 PCI bridge: Intel Corporation Device 22c8 (rev 15)\n00:1c.1 PCI bridge: Intel Corporation Device 22ca (rev 15)\n00:1f.0 ISA bridge: Intel Corporation Device 229c (rev 15)\n00:1f.3 SMBus: Intel Corporation Device 2292 (rev 15)\n02:00.0 Network controller: Intel Corporation Wireless 7265 (rev 2b)\n",
  "options": {
    "profile": [ "tests/igt.py" ],
    "dmesg": false,
    "execute": true,
    "log_level": "quiet",
    "concurrent": "some",
    "valgrind": false,
    "sync": false,
    "filter": [ "igt/gem_concurrent_blit/gpu-bcs-early-read-forked$" ],
    "platform": "mixed_glx_egl",
    "exclude_tests": [],
    "env": {
      "PIGLIT_SOURCE_DIR": "/GFX/Test/Piglit/piglit",
      "PIGLIT_PLATFORM": "mixed_glx_egl"
    },
    "exclude_filter": []
  }
}
returncode: 0
result: dmesg-warn
summary: Intel_gpu_tools/igt_gem_concurrent_blit_gpu-bcs-early-read-forked DMESG_WARN

Reproduce steps:
-------------------------
1. Run all igt cases.
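
Since the error moves between subcases from cycle to cycle, it may help to pull the affected tests out of each Piglit results file automatically. The following is a minimal sketch, assuming the results JSON has the layout shown in the dump above (a top-level "tests" object whose entries carry "result" and "dmesg" fields). The function name dmesg_warn_tests and the second, passing entry in the sample are hypothetical and only there for illustration.

```python
import json


def dmesg_warn_tests(results):
    """Map each test whose result is 'dmesg-warn' to its captured dmesg text."""
    return {
        name: entry.get("dmesg", "")
        for name, entry in results.get("tests", {}).items()
        if entry.get("result") == "dmesg-warn"
    }


# Trimmed sample: the first entry is taken from the report, the second
# ("igt/hypothetical/passing-subtest") is made up to show a non-matching test.
SAMPLE = json.loads("""
{
  "results_version": 2,
  "tests": {
    "igt/gem_concurrent_blit/gpu-bcs-early-read-forked": {
      "dmesg": "[ 895.532394] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.",
      "returncode": 0,
      "result": "dmesg-warn"
    },
    "igt/hypothetical/passing-subtest": {
      "dmesg": "",
      "returncode": 0,
      "result": "pass"
    }
  }
}
""")

for test_name, dmesg_text in dmesg_warn_tests(SAMPLE).items():
    print(test_name, "->", dmesg_text)
```

Running the same function over the results of several cycles and comparing the returned test names would show which subcases are hit in which cycle.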
BYT also has this error. During automated testing, drv_hangman and gem_reset_stats hit this error as well. I ran them for more than 5 cycles and was unable to reproduce it.

@test: Intel_gpu_tools/igt_drv_hangman_error-state-capture-bsd
returncode: 0
info: @@@Returncode: 0
test case start at: Sat Dec 20 00:13:01 2014
test case end at: Sat Dec 20 00:13:21 2014
Errors:
Dmesg:
<6>[ 96.817503] [drm] stuck on bsd ring
<6>[ 96.824051] [drm] GPU HANG: ecode 7:1:0xfffffffe, in drv_hangman [5367], reason: Ring hung, action: reset
<6>[ 96.824060] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
<6>[ 96.824063] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
<6>[ 96.824065] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
<6>[ 96.824068] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
<6>[ 96.824070] [drm] GPU crash dump saved to /sys/class/drm/card0/error
<3>[ 96.824152] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
<6>[ 96.826195] [drm] Simulated gpu hang, resetting stop_rings
<5>[ 96.826200] drm/i915: Resetting chip after gpu hang

@test: Intel_gpu_tools/igt_gem_reset_stats_ban-bsd
returncode: 0
info: @@@Returncode: 0
test case start at: Sat Dec 20 05:10:18 2014
test case end at: Sat Dec 20 05:10:35 2014
Errors:
Dmesg:
<6>[ 754.357515] [drm] stuck on bsd ring
<6>[ 754.364622] [drm] GPU HANG: ecode 7:1:0x277fffff, in gem_reset_stats [17204], reason: Ring hung, action: reset
<6>[ 754.364638] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
<6>[ 754.364641] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
<6>[ 754.364643] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
<6>[ 754.364646] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
<6>[ 754.364648] [drm] GPU crash dump saved to /sys/class/drm/card0/error
<3>[ 754.364730] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
<6>[ 754.367345] [drm] Simulated gpu hang, resetting stop_rings
<5>[ 754.367350] drm/i915: Resetting chip after gpu hang
<6>[ 760.362407] [drm] stuck on bsd ring
<6>[ 760.368876] [drm] GPU HANG: ecode 7:1:0x277fffff, in gem_reset_stats [17204], reason: Ring hung, action: reset
<6>[ 760.371375] [drm] Simulated gpu hang, resetting stop_rings
<5>[ 760.371380] drm/i915: Resetting chip after gpu hang
Created attachment 111355
dmesg(kms_flip)

Many kms_flip subcases also have this issue. When running ./kms_flip --run-subtest flip-vs-panning-vs-hang, the error shows up in 1 of 2 runs.

root@x-bsw01:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./kms_flip --run-subtest flip-vs-panning-vs-hang
IGT-Version: 1.9-geb799b2 (x86_64) (Linux: 3.18.0-rc7_drm-intel-next-queued_140fd3_20141226+ x86_64)
Using monotonic timestamps
Beginning flip-vs-panning-vs-hang on crtc 8, connector 29
1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
....
flip-vs-panning-vs-hang on crtc 8, connector 29: PASSED
Beginning flip-vs-panning-vs-hang on crtc 13, connector 29
1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
....
flip-vs-panning-vs-hang on crtc 13, connector 29: PASSED
Subtest flip-vs-panning-vs-hang: SUCCESS (49.886s)
root@x-bsw01:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# dmesg -r | egrep "<[1-4]>" |grep drm
<3>[ 250.789840] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
<3>[ 268.790830] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
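
The dmesg -r | egrep "<[1-4]>" | grep drm pipeline used in this comment keeps only raw-format kernel messages at priority 1 to 4 that mention drm. For anyone post-processing saved logs, here is a rough Python equivalent. It is a sketch, not an exact reimplementation: it assumes the "<N>[ seconds] message" raw format seen in this report and, unlike egrep, only accepts the priority tag at the start of the line. The name drm_errors is hypothetical.

```python
import re

# "<3>[ 250.789840] message": priority 1-4, timestamp in seconds, then the text.
RAW_LINE = re.compile(r"^<([1-4])>\[\s*(\d+\.\d+)\]\s+(.*)$")


def drm_errors(raw_dmesg):
    """Return (priority, timestamp, message) for priority 1-4 lines mentioning drm."""
    hits = []
    for line in raw_dmesg.splitlines():
        match = RAW_LINE.match(line.strip())
        if match and "drm" in match.group(3):
            hits.append((int(match.group(1)), float(match.group(2)), match.group(3)))
    return hits


# Three lines copied from the drv_hangman log in this bug.
SAMPLE_LOG = """\
<6>[ 96.824070] [drm] GPU crash dump saved to /sys/class/drm/card0/error
<3>[ 96.824152] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
<5>[ 96.826200] drm/i915: Resetting chip after gpu hang
"""

for priority, timestamp, message in drm_errors(SAMPLE_LOG):
    print(priority, timestamp, message)
```

Only the priority 3 line survives the filter; the priority 5 and 6 lines from the simulated hang are dropped, which matches what the shell pipeline shows.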
Tested on the latest -nightly kernel; it still has this issue.

root@x-bsw08:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./kms_flip --run-subtest flip-vs-modeset-vs-hang
IGT-Version: 1.9-g5fb26d1 (x86_64) (Linux: 3.19.0-rc3_drm-intel-nightly_0056b6_20150109+ x86_64)
Using monotonic timestamps
Beginning flip-vs-modeset-vs-hang on crtc 19, connector 40
1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
...
flip-vs-modeset-vs-hang on crtc 19, connector 40: PASSED
Beginning flip-vs-modeset-vs-hang on crtc 24, connector 40
1920x1080 60 1920 1966 1996 2080 1080 1082 1086 1112 0xa 0x48 138780
...
flip-vs-modeset-vs-hang on crtc 24, connector 40: PASSED
Subtest flip-vs-modeset-vs-hang: SUCCESS (36.625s)
root@x-bsw08:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# dmesg -r|egrep "<[1-4]>"|grep drm
<3>[ 512.803434] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
Tested gem_reloc_vs_gpu on BYT; it also has this error (the test failure itself is tracked in bug 88358).

gem_reloc_vs_gpu/forked-faulting-reloc-hang
igt/gem_reloc_vs_gpu/forked-faulting-reloc-thrash-inactive-hang
igt/gem_reloc_vs_gpu/forked-faulting-reloc-thrashing-hang
igt/gem_reloc_vs_gpu/forked-hang
igt/gem_reloc_vs_gpu/forked-interruptible-faulting-reloc-thrashing-hang
igt/gem_reloc_vs_gpu/forked-thrash-inactive-hang
igt/gem_reloc_vs_gpu/forked-thrashing-hang

root@x-bytm02:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./gem_reloc_vs_gpu --run-subtest forked-faulting-reloc-hang
IGT-Version: 1.9-g5fb26d1 (x86_64) (Linux: 3.19.0-rc4_drm-intel-nightly_95cce4_20150115+ x86_64)
Test assertion failure function do_test, file gem_reloc_vs_gpu.c:240:
Test assertion failure function do_test, file gem_reloc_vs_gpu.c:240:
Failed assertion: test == 0xdeadbeef
Failed assertion: test == 0xdeadbeef
mismatch in buffer 0: 0x00000000 instead of 0xdeadbeef
mismatch in buffer 0: 0x00000000 instead of 0xdeadbeef
Test assertion failure function do_test, file gem_reloc_vs_gpu.c:240:
Failed assertion: test == 0xdeadbeef
mismatch in buffer 0: 0x00000000 instead of 0xdeadbeef
Test assertion failure function do_test, file gem_reloc_vs_gpu.c:240:
Failed assertion: test == 0xdeadbeef
mismatch in buffer 0: 0x00000000 instead of 0xdeadbeef
Test assertion failure function do_test, file gem_reloc_vs_gpu.c:240:
Failed assertion: test == 0xdeadbeef
mismatch in buffer 0: 0x00000000 instead of 0xdeadbeef
Test assertion failure function do_test, file gem_reloc_vs_gpu.c:240:
Failed assertion: test == 0xdeadbeef
mismatch in buffer 0: 0x00000000 instead of 0xdeadbeef
Test assertion failure function do_test, file gem_reloc_vs_gpu.c:240:
Failed assertion: test == 0xdeadbeef
mismatch in buffer 0: 0x00000000 instead of 0xdeadbeef
Test assertion failure function do_test, file gem_reloc_vs_gpu.c:240:
Failed assertion: test == 0xdeadbeef
mismatch in buffer 0: 0x00000000 instead of 0xdeadbeef
Test assertion failure function do_test, file gem_reloc_vs_gpu.c:240:
Failed assertion: test == 0xdeadbeef
mismatch in buffer 0: 0x00000000 instead of 0xdeadbeef
Test assertion failure function do_test, file gem_reloc_vs_gpu.c:240:
Failed assertion: test == 0xdeadbeef
mismatch in buffer 0: 0x00000000 instead of 0xdeadbeef
Test assertion failure function do_test, file gem_reloc_vs_gpu.c:240:
Failed assertion: test == 0xdeadbeef
mismatch in buffer 0: 0x00000000 instead of 0xdeadbeef
Test assertion failure function do_test, file gem_reloc_vs_gpu.c:240:
Failed assertion: test == 0xdeadbeef
mismatch in buffer 0: 0x00000000 instead of 0xdeadbeef
Test assertion failure function do_test, file gem_reloc_vs_gpu.c:240:
Failed assertion: test == 0xdeadbeef
mismatch in buffer 0: 0x00000000 instead of 0xdeadbeef
Test assertion failure function do_test, file gem_reloc_vs_gpu.c:240:
Failed assertion: test == 0xdeadbeef
mismatch in buffer 0: 0x00000000 instead of 0xdeadbeef
Test assertion failure function do_test, file gem_reloc_vs_gpu.c:240:
Failed assertion: test == 0xdeadbeef
mismatch in buffer 0: 0x00000000 instead of 0xdeadbeef
Test assertion failure function do_test, file gem_reloc_vs_gpu.c:240:
Failed assertion: test == 0xdeadbeef
mismatch in buffer 0: 0x00000000 instead of 0xdeadbeef
child 12 failed with exit status 99
Subtest forked-faulting-reloc-hang: FAIL (104.062s)
root@x-bytm02:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# dmesg -r|egrep "<[1-4]>"|grep drm
<3>[ 7186.608528] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
<3>[ 7198.618182] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
<3>[ 7204.622584] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
<3>[ 7210.619451] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
<3>[ 7246.660833] [drm:__vlv_force_wake_get [i915]] *ERROR* Timed out: waiting for Render to ack.
Does it also affect BDW? What happens with ppgtt disabled? Why is it critical? What is customer impact here?
(In reply to Rodrigo Vivi from comment #5)
> Does it also affect BDW?

I ran ./kms_flip --run-subtest flip-vs-panning-vs-hang for 5 cycles on BDW; it doesn't have this error.

> What happens with ppgtt disabled?

I will give it a try.

> Why is it critical? What is customer impact here?

More than 200 cases have this error on BYT/BSW. gem_reset_stats has 53 subcases, kms_flip has 80+ subcases, and gem_concurrent_blit has 108 subcases. The results are unstable, which interferes with result checking and prts bisect. We want to focus on real regressions and on failures in new cases, so we have disabled these cases. It would be valuable if these unstable issues could be fixed soon. Do you agree?
I tested on the latest -nightly kernel. Running ./kms_flip --run-subtest flip-vs-panning-vs-hang and ./gem_reloc_vs_gpu --run-subtest forked-faulting-reloc-hang, I am unable to reproduce the error. Because it fails only sporadically, I will double-check. If it is fixed, I will close this bug.
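
Because the error is sporadic, a single clean run says little, so re-checking means repeating the subtest and counting how many runs leave the message in dmesg. The sketch below is one way to do that under stated assumptions: it must run as root from the tests directory, it uses the util-linux dmesg options --clear and -r, and the helper names run_and_collect and count_hits are hypothetical.

```python
import subprocess

NEEDLE = "Timed out: waiting for Render to ack"


def run_and_collect(cmd):
    """Clear the kernel log, run cmd, and return the raw dmesg written meanwhile.

    Assumes root privileges (needed for `dmesg --clear`).
    """
    subprocess.run(["dmesg", "--clear"], check=True)
    subprocess.run(cmd, check=False)  # the subtest may fail; we still want dmesg
    done = subprocess.run(["dmesg", "-r"], check=True,
                          capture_output=True, text=True)
    return done.stdout


def count_hits(run_once, cycles, needle=NEEDLE):
    """Call run_once() `cycles` times; count runs whose output contains needle."""
    return sum(1 for _ in range(cycles) if needle in run_once())
```

For example, count_hits(lambda: run_and_collect(["./kms_flip", "--run-subtest", "flip-vs-panning-vs-hang"]), 10) would report how many of 10 cycles produced the force-wake timeout. Passing the runner in as a callable keeps the counting logic usable with saved logs as well.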
I ran these cases on BYT and BSW, and this error no longer appears. Closing. kms_flip still shows "*ERROR* The master control interrupt lied (PM)!" on BSW; that is tracked in bug 87347.
Verified. Fixed.
Closing old verified+fixed.