Bug 80892 - [BSW]igt/gem_reloc_vs_gpu some subcases cost long time to execute
Summary: [BSW]igt/gem_reloc_vs_gpu some subcases cost long time to execute
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: Other All
Importance: high normal
Assignee: Thomas Wood
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-07-04 06:05 UTC by Guo Jinxian
Modified: 2017-10-06 14:37 UTC

See Also:
i915 platform:
i915 features:


Attachments
dmesg (124.51 KB, text/plain)
2014-07-04 06:05 UTC, Guo Jinxian

Description Guo Jinxian 2014-07-04 06:05:55 UTC
Created attachment 102235
dmesg

==System Environment==
--------------------------
Regression: No.
This is the first time the test has been run.

Non-working platforms: BSW

==kernel==
--------------------------
origin/drm-intel-nightly: eb638c7fabe97a9df752aeb2f59a9463ce4aed8e(fails)
    drm-intel-nightly: 2014y-07m-03d-14h-50m-16s integration manifest
origin/drm-intel-next-queued: 5e59f7175f96550ede91f58d267d2b551cb6fbba(fails)
    drm/i915: Try harder to get FBC  
origin/drm-intel-fixes: 5549d25f642a7e6cfb8744d0031a9da404f696d6(fails)
    drm/i915: Drop early VLV WA to fix Voltage not getting dropped to Vmin

==Bug detailed description==
Some igt/gem_reloc_vs_gpu subtests take a long time to execute.

Output:
IGT-Version: 1.7-g67e29a3 (x86_64) (Linux: 3.16.0-rc2_drm-intel-nightly_eb638c_20140704+ x86_64)
Test assertion failure function do_test, file gem_reloc_vs_gpu.c:260:
Last errno: 0, Success
Failed assertion: test == 0xdeadbeef
mismatch in buffer 1: 0x00000000 instead of 0xdeadbeef
Subtest faulting-reloc: FAIL
Test assertion failure function gem_quiescent_gpu, file drmtest.c:163:
Last errno: 5, Input/output error
Failed assertion: drmIoctl((fd), ((((1U) << (((0+8)+8)+14)) | ((('d')) << (0+8)) | (((0x40 + 0x29)) << 0) | ((((sizeof(struct drm_i915_gem_execbuffer2)))) << ((0+8)+8)))), (&execbuf)) == 0
^CTest assertion failure function gem_quiescent_gpu, file drmtest.c:149:
Last errno: 5, Input/output error
Failed assertion: drmIoctl((fd), ((((1U) << (((0+8)+8)+14)) | ((('d')) << (0+8)) | (((0x40 + 0x29)) << 0) | ((((sizeof(struct drm_i915_gem_execbuffer2)))) << ((0+8)+8)))), (&execbuf)) == 0
gem_reloc_vs_gpu: igt_core.c:651: igt_fail: Assertion `!test_with_subtests || in_fixture' failed.
Aborted (core dumped)

real    11m38.099s
user    0m0.015s
sys     0m0.124s
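
Note on the failed assertions above: the long drmIoctl expression appears to be the preprocessor expansion of the Linux ioctl request macro for the i915 execbuffer2 call (type 'd', number 0x40 + 0x29, write direction), and errno 5 is EIO. The sketch below recomputes the request number from the shift amounts visible in the log. The helper name `ioc` is hypothetical, and the 64-byte struct size is an assumption that the report itself does not state.

```python
# Field layout taken from the shift amounts visible in the assertion text:
# nr at bit 0, type at bit 8, size at bit 16, direction at bit 30.
NR_SHIFT = 0
TYPE_SHIFT = NR_SHIFT + 8
SIZE_SHIFT = TYPE_SHIFT + 8
DIR_SHIFT = SIZE_SHIFT + 14

IOC_WRITE = 1  # the (1U) direction value in the assertion


def ioc(direction, type_char, nr, size):
    """Pack an ioctl request number the same way the expanded macro does."""
    return (
        (direction << DIR_SHIFT)
        | (ord(type_char) << TYPE_SHIFT)
        | (nr << NR_SHIFT)
        | (size << SIZE_SHIFT)
    )


# ASSUMPTION: sizeof(struct drm_i915_gem_execbuffer2) is 64 bytes; the report
# does not state the size.
EXECBUFFER2_SIZE = 64

request = ioc(IOC_WRITE, "d", 0x40 + 0x29, EXECBUFFER2_SIZE)
print(hex(request))
```

Under that size assumption the value is 0x40406469, which can be matched against strace output to confirm that the failing call is the execbuffer2 submission rather than some other ioctl.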


==Reproduce steps==
---------------------------- 
1. time ./gem_reloc_vs_gpu --run-subtest faulting-reloc
Comment 1 Chris Wilson 2014-07-04 06:34:09 UTC
You need to attach the error state as well.
Comment 2 Guo Jinxian 2014-07-04 08:12:27 UTC
(In reply to comment #1)
> You need to attach the error state as well.

The error state file is too big to attach to Bugzilla, so I have sent it to you by email. Thanks.
Comment 3 Chris Wilson 2014-07-04 08:17:30 UTC
blt command stream:
  HEAD: 0x0000085c
  TAIL: 0x00000880
  CTL: 0x0001f001
  HWS: 0x0005a000
  ACTHD: 0x00000000 0189c004
  IPEIR: 0x00000008
  IPEHR: 0xdeadbeef
  INSTDONE: 0xfffffff7
  BBADDR: 0x00000000 0189c001
  BB_STATE: 0x00000020
  INSTPS: 0x00000000
  INSTPM: 0x00000000
  FADDR: 0x00000000 0189c200
  RC PSMI: 0x00000010
  FAULT_REG: 0x00000000
  SYNC_0: 0x00000000 [last synced 0x00000000]
  SYNC_1: 0x00000000 [last synced 0x00000000]
  SYNC_2: 0x00000000 [last synced 0x00000000]
  GFX_MODE: 0x00000200
  PDP0: 0x00000000028aa000
  PDP1: 0x00000000028ab000
  PDP2: 0x0000000000000000
  PDP3: 0x0000000000000000
  seqno: 0xfffff023
  waiting: yes
  ring->head: 0x00000000
  ring->tail: 0x00000880
  hangcheck: hung [40]

Test is working as designed to catch races. :|
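
For readers who want to inspect a dump like the one above mechanically, here is a minimal sketch that parses the "NAME: value" lines into integers, reports the distance between HEAD and TAIL, and checks whether IPEHR equals the 0xdeadbeef marker that the test asserts on. The function name `parse_registers` is hypothetical, and the sample is only an excerpt of the dump in this comment; this is not an existing IGT or kernel tool.

```python
def parse_registers(dump):
    """Parse 'NAME: 0x...' lines into a dict of integers.

    Two-word values such as '0x00000000 0189c004' are joined into a single
    number (upper word first). Lines without a hex value are skipped.
    """
    regs = {}
    for line in dump.splitlines():
        name, sep, value = line.strip().partition(":")
        if not sep:
            continue
        # Drop trailing annotations such as '[last synced 0x00000000]'.
        words = value.split("[")[0].split()
        if not words or not words[0].startswith("0x"):
            continue
        digits = "".join(w[2:] if w.startswith("0x") else w for w in words)
        try:
            regs[name.strip()] = int(digits, 16)
        except ValueError:
            continue
    return regs


SAMPLE = """\
  HEAD: 0x0000085c
  TAIL: 0x00000880
  ACTHD: 0x00000000 0189c004
  IPEHR: 0xdeadbeef
  BBADDR: 0x00000000 0189c001
  hangcheck: hung [40]
"""

regs = parse_registers(SAMPLE)
pending = regs["TAIL"] - regs["HEAD"]
print(f"bytes between HEAD and TAIL: {pending}")
print(f"IPEHR equals test marker: {regs['IPEHR'] == 0xDEADBEEF}")
print(f"ACTHD: {regs['ACTHD']:#x}")
```

Joining the two-word values keeps 64-bit addresses such as ACTHD and BBADDR comparable as single integers.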
Comment 4 Jesse Barnes 2014-08-19 20:02:46 UTC
We might need two bugs for this one, one for the actual race the test found, and another to make the test report it better when detected and avoid the timeout logic, which is 10min iirc.

Assigning to Thomas to triage.

Chris, any ideas on the actual race?
Comment 5 Chris Wilson 2014-08-20 10:07:40 UTC
Choose between the interrupt generation being incorrect, a missing flush from the gpu or a missing CS invalidate.
Comment 6 Guo Jinxian 2014-10-20 05:40:19 UTC
The Input/output error can be reproduced on PNV when running the gem_cpu_reloc and gem_cs_prefetch tests on the latest -fixes (bfe01a5ba2490f299e1d2d5508cbbbadd897bbe9).


[root@x-pnv1 tests]# ./gem_cpu_reloc
IGT-Version: 1.8-g303fe74 (i686) (Linux: 3.17.0_drm-intel-fixes_bfe01a_20141020+ i686)
gem_cpu_reloc:  65%Test assertion failure function exec, file gem_cpu_reloc.c:120:
Failed assertion: (drmIoctl(fd, (((1U) << (((0+8)+8)+14)) | ((('d')) << (0+8)) | (((0x40 + 0x29)) << 0) | ((((sizeof(struct drm_i915_gem_execbuffer2)))) << ((0+8)+8))), &execbuf)) == 0
Last errno: 5, Input/output error
Test assertion failure function gem_quiescent_gpu, file drmtest.c:164:
Failed assertion: drmIoctl((fd), ((((1U) << (((0+8)+8)+14)) | ((('d')) << (0+8)) | (((0x40 + 0x29)) << 0) | ((((sizeof(struct drm_i915_gem_execbuffer2)))) << ((0+8)+8)))), (&execbuf)) == 0
Last errno: 5, Input/output error
[root@x-pnv1 tests]# echo $?
99
[root@x-pnv1 tests]# ./gem_cs_prefetch
IGT-Version: 1.8-g303fe74 (i686) (Linux: 3.17.0_drm-intel-fixes_bfe01a_20141020+ i686)
Test assertion failure function gem_quiescent_gpu, file drmtest.c:164:
Failed assertion: drmIoctl((fd), ((((1U) << (((0+8)+8)+14)) | ((('d')) << (0+8)) | (((0x40 + 0x29)) << 0) | ((((sizeof(struct drm_i915_gem_execbuffer2)))) << ((0+8)+8)))), (&execbuf)) == 0
Last errno: 5, Input/output error
Comment 7 Chris Wilson 2014-10-20 06:37:23 UTC
(In reply to Guo Jinxian from comment #6)
> The Input/output error can be reproduced on PNV when running the
> gem_cpu_reloc and gem_cs_prefetch tests on the latest
> -fixes (bfe01a5ba2490f299e1d2d5508cbbbadd897bbe9).

That seems odd. If only we had some form of debug message and error state we could use to inspect the failure!

Please open a new bug report for this issue as the cause on pnv is likely very different to a bsw failure.
Comment 8 Jani Nikula 2015-01-29 13:55:14 UTC
(In reply to Chris Wilson from comment #7)
> (In reply to Guo Jinxian from comment #6)
> > The Input/output error can be reproduced on PNV when running the
> > gem_cpu_reloc and gem_cs_prefetch tests on the latest
> > -fixes (bfe01a5ba2490f299e1d2d5508cbbbadd897bbe9).
> 
> That seems odd. If only we had some form of debug message and error state we
> could use to inspect the failure!
> 
> Please open a new bug report for this issue as the cause on pnv is likely
> very different to a bsw failure.

Has this been done?
Comment 9 Ding Heng 2015-01-30 03:19:19 UTC
(In reply to Jani Nikula from comment #8)
> (In reply to Chris Wilson from comment #7)
> > (In reply to Guo Jinxian from comment #6)
> > > The Input/output error can be reproduced on PNV when running the
> > > gem_cpu_reloc and gem_cs_prefetch tests on the latest
> > > -fixes (bfe01a5ba2490f299e1d2d5508cbbbadd897bbe9).
> > 
> > That seems odd. If only we had some form of debug message and error state we
> > could use to inspect the failure!
> > 
> > Please open a new bug report for this issue as the cause on pnv is likely
> > very different to a bsw failure.
> 
> Has this been done?

Please refer to Bug 85672.
Comment 10 Ding Heng 2015-01-30 03:23:32 UTC
This issue cannot be reproduced on BSW; changing the state to verified.
Comment 11 Elizabeth 2017-10-06 14:37:24 UTC
Closing old verified.

