Bug 81858 - [BYT/BDW/BSW Regression]igt/pm_rps some subcases fail and cause " [drm:i915_context_is_banned] *ERROR* gpu hanging too fast, banning!"
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: Other All
Importance: highest normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
 
Reported: 2014-07-29 02:50 UTC by Guo Jinxian
Modified: 2017-02-10 08:49 UTC

Attachments
dmesg (4.17 KB, text/plain)
2014-07-29 02:50 UTC, Guo Jinxian
dmesg (123.31 KB, text/plain)
2014-09-04 06:56 UTC, Guo Jinxian

Description Guo Jinxian 2014-07-29 02:50:25 UTC
Created attachment 103621
dmesg

==System Environment==
--------------------------
Regression: Yes. 
Good commit on -next-queued: 91565c85b66db820f01894a971d39aaef60c4325 (this commit hits a different bug, bug 80704)

Non-working platforms: BSW

==kernel==
--------------------------
origin/drm-intel-nightly: e967a525207bd40ab446e2f809907039f88e66f3(fails)
    drm-intel-nightly: 2014y-07m-25d-23h-02m-06s integration manifest
origin/drm-intel-next-queued: eff9b57c1a91ccf309d57500ab6a365ba7be5712(fails)
    drm/i915: Update DRIVER_DATE to 20140725
origin/drm-intel-fixes: f4be89cecea437aaddd7700d05c6bdb5678041f7 (this commit hits a different bug, bug 80704)
    drm/i915: Fix crash when failing to parse MIPI VBT

==Bug detailed description==
The igt/pm_rps subcases min-max-config-loaded and reset fail and cause "[drm:i915_context_is_banned] *ERROR* gpu hanging too fast, banning!"

Output:
[root@x-bsw01 tests]# ./pm_rps --run-subtest reset
IGT-Version: 1.7-ge48c495 (x86_64) (Linux: 3.16.0-rc4_drm-intel-next-queued_eff9b5_20140728+ x86_64)

Test assertion failure function intel_batchbuffer_flush_on_ring, file intel_batchbuffer.c:172:
Failed assertion: (drm_intel_bo_mrb_exec(batch->bo, used, ((void *)0), 0, 0, ring)) == 0
Last errno: 5, Input/output error
Subtest reset: FAIL
Test assertion failure function matchit, file pm_rps.c:128:
Failed assertion: freqs1[CUR] == freqs2[CUR]
error: 160 == 320
Subtest reset: FAIL
Test assertion failure function load_helper_stop, file pm_rps.c:258:
Failed assertion: igt_wait_helper(&lh.igt_proc) == 0
Last errno: 10, No child processes
pm_rps: igt_core.c:714: igt_fail: Assertion `!test_with_subtests || in_fixture' failed.
Aborted (core dumped)
[root@x-bsw01 tests]# dmesg -r|egrep "<[1-4]>"|grep drm
<3>[  126.760022] [drm:i915_context_is_banned] *ERROR* gpu hanging too fast, banning!


==Reproduce steps==
---------------------------- 
1. ./pm_rps --run-subtest reset
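
The `dmesg -r|egrep "<[1-4]>"|grep drm` check used in the output above keeps only raw-format kernel messages that contain a priority prefix between 1 and 4 and that mention drm. A minimal Python sketch of the same filter, for illustration only; the function name `drm_errors` is hypothetical, and the first sample line is made up to show a message that gets filtered out:

```python
import re

# Raw dmesg output (dmesg -r) carries a "<N>" priority prefix on each line.
PRIORITY_1_TO_4 = re.compile(r"<[1-4]>")

def drm_errors(raw_dmesg_lines):
    """Mimic `egrep "<[1-4]>" | grep drm`: keep lines matching both patterns."""
    return [
        line for line in raw_dmesg_lines
        if PRIORITY_1_TO_4.search(line) and "drm" in line
    ]

sample = [
    # Hypothetical priority-6 line, not taken from the attached dmesg.
    "<6>[    0.000000] hypothetical informational message mentioning drm",
    # Line quoted in this report.
    "<3>[  126.760022] [drm:i915_context_is_banned] *ERROR* gpu hanging too fast, banning!",
]

for line in drm_errors(sample):
    print(line)
```

Only the priority-3 line survives the filter, which matches what the shell pipeline printed on BSW.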
Comment 1 Guo Jinxian 2014-08-07 06:43:28 UTC
The test still fails on BYT when running igt/pm_rps/basic-api on the latest -nightly (5a299a5a794999ddcc44578c0cfd58da83bac62b), but there is no dmesg error.

root@x-bytm02:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./pm_rps --run-subtest basic-api
IGT-Version: 1.7-gac31f19 (x86_64) (Linux: 3.16.0_drm-intel-nightly_5a299a_20140807+ x86_64)
Test assertion failure function checkit, file pm_rps.c:117:
Failed assertion: freqs[MIN] <= freqs[CUR]
error: 500 <= 187
Subtest basic-api: FAIL
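
The failed assertion above says the current frequency (187) was below the configured minimum (500). A rough Python sketch of that kind of ordering check follows. Only the MIN <= CUR relation comes from the output in this report; the upper bound against MAX is an assumption, and the function name and the MAX value in the example are illustrative rather than taken from pm_rps.c:

```python
def check_freqs(freq_min, freq_cur, freq_max):
    """Return a list of violated ordering constraints (empty if consistent)."""
    problems = []
    # This relation is the one reported in the failure output.
    if not freq_min <= freq_cur:
        problems.append(f"MIN <= CUR violated: {freq_min} <= {freq_cur}")
    # Assumption: the current frequency should also not exceed the maximum.
    if not freq_cur <= freq_max:
        problems.append(f"CUR <= MAX violated: {freq_cur} <= {freq_max}")
    return problems

# MIN=500 and CUR=187 come from the report; MAX=1000 is a made-up placeholder.
print(check_freqs(500, 187, 1000))
```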
Comment 2 Guo Jinxian 2014-09-04 06:56:41 UTC
Created attachment 105721
dmesg

Test failed on BDW too.

root@x-bdw05:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./pm_rps --run-subtest blocking
IGT-Version: 1.7-gf473a55 (x86_64) (Linux: 3.17.0-rc2_drm-intel-nightly_4144c9_20140904+ x86_64)
Test assertion failure function emit_store_dword_imm, file pm_rps.c:193:
Failed assertion: batch->ptr == batch->end
Subtest blocking: FAIL
Test assertion failure function load_helper_stop, file pm_rps.c:254:
Failed assertion: igt_wait_helper(&lh.igt_proc) == 0
Subtest blocking: FAIL

root@x-bdw05:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# time ./pm_rps --run-subtest reset
IGT-Version: 1.7-gf473a55 (x86_64) (Linux: 3.17.0-rc2_drm-intel-nightly_4144c9_20140904+ x86_64)
Test assertion failure function emit_store_dword_imm, file pm_rps.c:193:
Failed assertion: batch->ptr == batch->end
Subtest reset: FAIL

^CTest assertion failure function load_helper_stop, file pm_rps.c:254:
Failed assertion: igt_wait_helper(&lh.igt_proc) == 0
Last errno: 10, No child processes
Subtest reset: FAIL
pm_rps: igt_core.c:891: fork_helper_exit_handler: Assertion `helper_process_count == 0' failed.
Aborted

real    17m39.073s
user    0m0.015s
sys     0m0.039s
root@x-bdw05:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# dmesg -r|egrep "<[1-4]>"|grep drm
Comment 3 Jani Nikula 2014-09-09 15:03:51 UTC
Please bisect.
Comment 4 Guo Jinxian 2014-09-12 03:13:53 UTC
This failure can also be reproduced by running the tests below:
igt/gem_partial_pwrite_pread/reads-snoop
igt/gem_partial_pwrite_pread/write-snoop
igt/gem_partial_pwrite_pread/writes-after-reads-snoop
igt/gem_pread_after_blit/interruptible-snoop
igt/gem_pread_after_blit/normal-snoop

root@x-bytm02:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./gem_partial_pwrite_pread --run-subtest reads-snoop
IGT-Version: 1.8-g107151c (x86_64) (Linux: 3.17.0-rc4_drm-intel-nightly_72faa6_20140911+ x86_64)
checking partial reads
Test assertion failure function intel_batchbuffer_flush_on_ring, file intel_batchbuffer.c:180:
Failed assertion: (drm_intel_gem_bo_context_exec(batch->bo, ctx, used, ring)) == 0
Last errno: 22, Invalid argument
Subtest reads-snoop: FAIL (0.248s)
root@x-bytm02:/GFX/Test/Intel_gpu_tools/intel-gpu-tools/tests# ./gem_pread_after_blit --run-subtest interruptible-snoop
IGT-Version: 1.8-g107151c (x86_64) (Linux: 3.17.0-rc4_drm-intel-nightly_72faa6_20140911+ x86_64)
Test assertion failure function intel_batchbuffer_flush_on_ring, file intel_batchbuffer.c:180:
Failed assertion: (drm_intel_gem_bo_context_exec(batch->bo, ctx, used, ring)) == 0
Last errno: 22, Invalid argument
Subtest interruptible-snoop: FAIL (0.002s)
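
The failures in this report carry different errno values: 5 in the original BSW run, 10 from load_helper_stop, and 22 in the snoop tests above. A small sketch for mapping those numbers to symbolic names with Python's standard `errno` module, assuming the usual Linux numbering:

```python
import errno
import os

# errno values quoted in the test output of this report.
REPORTED_ERRNOS = (5, 10, 22)

def describe(code):
    """Return (symbolic name, message) for an errno value."""
    return errno.errorcode.get(code, "unknown"), os.strerror(code)

for code in REPORTED_ERRNOS:
    name, message = describe(code)
    print(code, name, message)
```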
Comment 5 Chris Wilson 2014-09-12 05:47:45 UTC
(In reply to comment #2)

commit 3a1751ef34c32c5d288a328d855bec49ad0eaf9f
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Sep 12 06:46:28 2014 +0100

    igt/pm_rps: Fix the batch count for emitting the flush
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81858#c2
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
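
The `batch->ptr == batch->end` assertion from comment #2 fires when the number of dwords written into a batch differs from the number declared up front, which is what the commit title suggests was wrong in the test. A toy Python model of that bookkeeping, not the actual igt code; the class and method names are hypothetical:

```python
class Batch:
    """Toy batch buffer that checks declared versus emitted dword counts."""

    def __init__(self):
        self.dwords = []
        self.end = None  # position the batch must reach after the declared dwords

    def begin(self, count):
        # Declare how many dwords the following command will emit.
        self.end = len(self.dwords) + count

    def out(self, dword):
        # Emit one dword, moving the write position forward by one.
        self.dwords.append(dword)

    def advance(self):
        # Counterpart of the failed check: the write position must sit
        # exactly at the declared end.
        if len(self.dwords) != self.end:
            raise AssertionError(
                f"declared end {self.end}, wrote up to {len(self.dwords)}"
            )

batch = Batch()
batch.begin(2)
batch.out(0x1)
batch.out(0x2)
batch.advance()      # passes: two declared, two emitted

batch.begin(2)
batch.out(0x3)
try:
    batch.advance()  # fails: two declared, one emitted
except AssertionError as exc:
    print("mismatch:", exc)
```

Under this model, fixing the declared count (or the number of dwords emitted) is enough to make the check pass again.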
Comment 6 Chris Wilson 2014-09-12 05:48:19 UTC
(In reply to comment #4)

That's full-ppgtt fallout: http://patchwork.freedesktop.org/patch/33361/
Comment 7 Guo Jinxian 2014-09-16 06:48:53 UTC
(In reply to comment #6)
> That's full-ppgtt fallout: http://patchwork.freedesktop.org/patch/33361/

The failure can no longer be reproduced on the latest -nightly (43df30da20447e2856b2761215ff274886a9f931) or with this patch applied. However, another error occurs, shown below, which is tracked by Bug 83915.

[root@x-bsw01 tests]# ./pm_rps --run-subtest reset
IGT-Version: 1.8-g137877f (x86_64) (Linux: 3.17.0-rc5_kcloud_b5ed37_20140916+ x86_64)
Test assertion failure function idle_check, file pm_rps.c:408:
Failed assertion: freqs[CUR] == freqs[MIN]
error: 480 != 160
Subtest reset: FAIL (65.632s)
[root@x-bsw01 tests]# dmesg -r|egrep "<[1-4]>"|grep drm
Comment 8 Jari Tahvanainen 2017-02-10 08:49:14 UTC
Closing (>2 years) old Verified+Fixed.

