Bug 107260 - [CI][SHARDS] igt@* - fail - gem exec bo fails with error -5 on gen9
Summary: [CI][SHARDS] igt@* - fail - gem exec bo fails with error -5 on gen9
Status: CLOSED WORKSFORME
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Depends on:
Blocks:
 
Reported: 2018-07-17 11:11 UTC by Martin Peres
Modified: 2018-11-01 18:07 UTC

See Also:
i915 platform: BXT, CFL, GLK, KBL, SKL
i915 features: GEM/Other



Description Martin Peres 2018-07-17 11:11:04 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4489/shard-glk6/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-onoff.html

https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4489/shard-apl2/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-onoff.html

(kms_frontbuffer_tracking:1418) intel_batchbuffer-CRITICAL: Test assertion failure function intel_batchbuffer_flush_on_ring, file ../lib/intel_batchbuffer.c:239:
(kms_frontbuffer_tracking:1418) intel_batchbuffer-CRITICAL: Failed assertion: (drm_intel_gem_bo_context_exec(batch->bo, ctx, used, ring)) == 0
(kms_frontbuffer_tracking:1418) intel_batchbuffer-CRITICAL: Last errno: 5, Input/output error
Subtest fbc-1p-primscrn-spr-indfb-onoff failed.


https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4489/shard-apl4/igt@kms_vblank@pipe-c-query-idle-hang.html

(kms_vblank:1224) ioctl_wrappers-CRITICAL: Test assertion failure function gem_execbuf_wr, file ../lib/ioctl_wrappers.c:635:
(kms_vblank:1224) ioctl_wrappers-CRITICAL: Failed assertion: __gem_execbuf_wr(fd, execbuf) == 0
(kms_vblank:1224) ioctl_wrappers-CRITICAL: error: -5 != 0
Subtest pipe-C-query-idle-hang failed.


https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4489/shard-apl4/igt@kms_rotation_crc@bad-pixel-format.html

(kms_rotation_crc:1270) ioctl_wrappers-CRITICAL: Test assertion failure function gem_execbuf, file ../lib/ioctl_wrappers.c:605:
(kms_rotation_crc:1270) ioctl_wrappers-CRITICAL: Failed assertion: __gem_execbuf(fd, execbuf) == 0
(kms_rotation_crc:1270) ioctl_wrappers-CRITICAL: error: -5 != 0
Subtest bad-pixel-format failed.
Comment 1 Martin Peres 2018-07-17 11:12:50 UTC
This may have been fixed 7 runs ago (CI_DRM_4497). Will investigate when I am done filing!
Comment 2 Chris Wilson 2018-07-17 11:15:14 UTC
The IGT bug (failing to check if there is a GPU before using it) is still there though. That still affects skl-iommu...
Comment 3 Martin Peres 2018-07-17 11:32:46 UTC
(In reply to Chris Wilson from comment #2)
> The IGT bug (failing to check if there is a GPU before using it) is still
> there though. That still affects skl-iommu...

I guess I should de-dup them then :) Is that the only reason to get a -5?
Comment 4 Chris Wilson 2018-07-17 11:39:53 UTC
-5 (-EIO) comes from the driver being wedged following an earlier failed reset. In this case that was driver bug 107259, but for skl-iommu we have a hw feature to contend with. (In essence, iommu will be disabled for skl again in the near future, hopefully before 4.18.)
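
To make the meaning of the -5 in the logs above concrete, here is a minimal sketch, assuming only the "0 or negative errno" return convention that the log lines show ("error: -5 != 0", "Last errno: 5, Input/output error"). The names submit_status and classify_submit_result are hypothetical and chosen for illustration; they are not part of IGT or the i915 driver.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical names for illustration; none of these are IGT or i915 API. */
enum submit_status {
    SUBMIT_OK,          /* the submission was accepted */
    SUBMIT_GPU_WEDGED,  /* -EIO (-5): reported when the driver is wedged */
    SUBMIT_OTHER_ERROR  /* any other negative errno */
};

/* Classify a return value that follows the "0 or negative errno"
 * convention seen in the logs (e.g. "error: -5 != 0"). */
enum submit_status classify_submit_result(int ret)
{
    /* Callers are expected to pass 0 or a negative errno value. */
    assert(ret <= 0);

    if (ret == 0)
        return SUBMIT_OK;
    if (ret == -EIO)
        return SUBMIT_GPU_WEDGED;
    return SUBMIT_OTHER_ERROR;
}
```

A caller could use the SUBMIT_GPU_WEDGED case to tell "the device was already unusable" apart from an error caused by the test's own arguments, such as -EINVAL.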
Comment 5 Chris Wilson 2018-07-28 15:57:00 UTC
Until the next time we have a wedged device.
Comment 6 Francesco Balestrieri 2018-08-07 08:02:49 UTC
Closing based on the comments above.
Comment 7 Martin Peres 2018-09-10 09:37:46 UTC
(In reply to Chris Wilson from comment #5)
> Until the next time we have a wedged device.

Yeah... It's back... What should we do with this bug?

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_107/fi-cfl-guc/igt@kms_plane_scaling@pipe-c-scaler-with-pixel-format.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_107/fi-cfl-guc/igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-draw-mmap-gtt.html

(kms_frontbuffer_tracking:1750) intel_batchbuffer-CRITICAL: Test assertion failure function intel_batchbuffer_flush_on_ring, file ../lib/intel_batchbuffer.c:239:
(kms_frontbuffer_tracking:1750) intel_batchbuffer-CRITICAL: Failed assertion: (drm_intel_gem_bo_context_exec(batch->bo, ctx, used, ring)) == 0
(kms_frontbuffer_tracking:1750) intel_batchbuffer-CRITICAL: Last errno: 5, Input/output error
Subtest fbc-1p-primscrn-cur-indfb-onoff failed.
Comment 8 Martin Peres 2018-09-13 07:57:59 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_109/fi-kbl-guc/igt@pm_rpm@gem-execbuf-stress-extra-wait.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_109/fi-kbl-guc/igt@pm_rpm@gem-execbuf-stress.html

(pm_rpm:2032) ioctl_wrappers-CRITICAL: Test assertion failure function gem_execbuf, file ../lib/ioctl_wrappers.c:605:
(pm_rpm:2032) ioctl_wrappers-CRITICAL: Failed assertion: __gem_execbuf(fd, execbuf) == 0
(pm_rpm:2032) ioctl_wrappers-CRITICAL: error: -5 != 0
Subtest gem-execbuf-stress failed.
Comment 9 Chris Wilson 2018-09-13 08:08:35 UTC
(In reply to Martin Peres from comment #8)
> https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_109/fi-kbl-guc/
> igt@pm_rpm@gem-execbuf-stress-extra-wait.html
> 
> https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_109/fi-kbl-guc/
> igt@pm_rpm@gem-execbuf-stress.html
> 
> (pm_rpm:2032) ioctl_wrappers-CRITICAL: Test assertion failure function
> gem_execbuf, file ../lib/ioctl_wrappers.c:605:
> (pm_rpm:2032) ioctl_wrappers-CRITICAL: Failed assertion: __gem_execbuf(fd,
> execbuf) == 0
> (pm_rpm:2032) ioctl_wrappers-CRITICAL: error: -5 != 0
> Subtest gem-execbuf-stress failed.

These are definitely not the same problem. This is not the device being wedged beforehand, but the GuC exploding mid-test.
Comment 10 Chris Wilson 2018-09-13 08:15:07 UTC
(In reply to Martin Peres from comment #7)
> (In reply to Chris Wilson from comment #5)
> > Until the next time we have a wedged device.
> 
> Yeah... It's back... What should we do with this bug?
> 
> https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_107/fi-cfl-guc/
> igt@kms_plane_scaling@pipe-c-scaler-with-pixel-format.html
> 
> https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_107/fi-cfl-guc/
> igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-draw-mmap-gtt.html

The same as we do every time: teach the tests to skip if they don't have the resources required to run. With kms_frontbuffer_tracking, it is messy to get igt_require_gem() into the right subset.
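
The "skip if the resources are missing" approach can be sketched as follows. This is a self-contained illustration under stated assumptions, not IGT code: run_guarded, step_fn, test_result and the four stub functions are hypothetical stand-ins, and the sketch does not call igt_require_gem() itself. It only shows the shape of the idea: probe the precondition first, report a skip if the probe fails, and count a failure only when the test body itself fails.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical result type; IGT has its own exit-status handling. */
enum test_result { TEST_PASS, TEST_FAIL, TEST_SKIP };

/* Both callbacks return 0 on success or a negative errno. */
typedef int (*step_fn)(void);

/* Run body only if the precondition probe succeeds;
 * otherwise report a skip instead of a failure. */
enum test_result run_guarded(step_fn probe, step_fn body)
{
    assert(probe != NULL && body != NULL);

    if (probe() != 0)
        return TEST_SKIP;
    return body() == 0 ? TEST_PASS : TEST_FAIL;
}

/* Stand-ins for a real precondition probe and a real test body. */
int probe_gpu_ok(void)      { return 0; }
int probe_gpu_wedged(void)  { return -EIO; }
int body_ok(void)           { return 0; }
int body_submit_fails(void) { return -EIO; }
```

With these stubs, a device that is already wedged before the test starts yields TEST_SKIP, while a submission that fails after a successful probe still yields TEST_FAIL. The second case is the one that should stay visible as a real failure, as with the GuC problem in comment 9.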
Comment 11 Martin Peres 2018-09-13 08:38:01 UTC
(In reply to Chris Wilson from comment #9)
> (In reply to Martin Peres from comment #8)
> > https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_109/fi-kbl-guc/
> > igt@pm_rpm@gem-execbuf-stress-extra-wait.html
> > 
> > https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_109/fi-kbl-guc/
> > igt@pm_rpm@gem-execbuf-stress.html
> > 
> > (pm_rpm:2032) ioctl_wrappers-CRITICAL: Test assertion failure function
> > gem_execbuf, file ../lib/ioctl_wrappers.c:605:
> > (pm_rpm:2032) ioctl_wrappers-CRITICAL: Failed assertion: __gem_execbuf(fd,
> > execbuf) == 0
> > (pm_rpm:2032) ioctl_wrappers-CRITICAL: error: -5 != 0
> > Subtest gem-execbuf-stress failed.
> 
> These are definitely not the same problem. This is not the device being
> wedged beforehand, but the GuC exploding mid-test.

Thanks, bug filed here: https://bugs.freedesktop.org/show_bug.cgi?id=107917
Comment 12 Martin Peres 2018-11-01 18:07:53 UTC
This was reproducible on every single run, but has not been seen since drmtip_116 (1 month, 1 week / 19 runs ago). Closing!

