Bug 106685 - [CI] igt@* - fail/dmesg-fail - *ERROR* GuC firmware xfer error -110
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: XOrg git
Hardware: Other All
Importance: medium normal
Assignee: Talha Nassar
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Duplicates: 106652 107837
Depends on:
Blocks:
 
Reported: 2018-05-28 12:06 UTC by Martin Peres
Modified: 2018-11-01 17:09 UTC
CC List: 1 user

See Also:
i915 platform: ALL
i915 features: firmware/guc, GEM/Other


Description Martin Peres 2018-05-28 12:06:41 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_39/fi-skl-guc/igt@gem_eio@in-flight-internal-immediate.html

(gem_eio:1575) igt_gt-CRITICAL: Test assertion failure function igt_force_gpu_reset, file ../lib/igt_gt.c:415:
(gem_eio:1575) igt_gt-CRITICAL: Failed assertion: !wedged
Subtest in-flight-internal-immediate failed.

[  143.582521] i915 0000:00:02.0: Resetting chip for Manually set wedged engine mask = ffffffffffffffff
[  143.590315] Setting dangerous option reset - tainting kernel
[  143.590812] Setting dangerous option reset - tainting kernel
[  143.592094] i915 0000:00:02.0: Resetting chip for Manually set wedged engine mask = ffffffffffffffff
[  143.609933] Setting dangerous option reset - tainting kernel
[  143.610147] i915 0000:00:02.0: Resetting chip for Manually set wedged engine mask = ffffffffffffffff
[  143.711748] [drm:guc_fw_xfer [i915]] *ERROR* GuC firmware xfer error -110
[  143.711877] [drm] GuC: Failed to load firmware i915/skl_guc_ver9_33.bin (error -110)
[  143.711881] i915 0000:00:02.0: GuC initialization failed -110
[  143.711928] [drm:i915_gem_init_hw [i915]] *ERROR* Enabling uc failed (-110)
[  143.711971] [drm:i915_reset [i915]] *ERROR* Failed to initialise HW following reset (-110)
[  143.731654] Setting dangerous option reset - tainting kernel
[  143.732070] i915 0000:00:02.0: Resetting chip for Manually set wedged engine mask = ffffffffffffffff


https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_52/fi-skl-guc/igt@gem_eio@in-flight-contexts-1us.html

(gem_eio:1724) igt_gt-CRITICAL: Test assertion failure function igt_force_gpu_reset, file ../lib/igt_gt.c:415:
(gem_eio:1724) igt_gt-CRITICAL: Failed assertion: !wedged
Subtest in-flight-contexts-1us failed.

[  188.657525] i915 0000:00:02.0: Resetting chip for Manually set wedged engine mask = ffffffffffffffff
[  188.667105] Setting dangerous option reset - tainting kernel
[  188.679423] Setting dangerous option reset - tainting kernel
[  188.696702] i915 0000:00:02.0: Resetting chip for Manually set wedged engine mask = ffffffffffffffff
[  188.724036] Setting dangerous option reset - tainting kernel
[  188.724360] i915 0000:00:02.0: Resetting chip for Manually set wedged engine mask = ffffffffffffffff
[  188.744481] Setting dangerous option reset - tainting kernel
[  188.759961] i915 0000:00:02.0: Resetting chip for Manually set wedged engine mask = ffffffffffffffff
[  188.780695] Setting dangerous option reset - tainting kernel
[  188.780912] i915 0000:00:02.0: Resetting chip for Manually set wedged engine mask = ffffffffffffffff
[  188.885918] [drm:guc_fw_xfer [i915]] *ERROR* GuC firmware xfer error -110
[  188.886052] [drm] GuC: Failed to load firmware i915/skl_guc_ver9_33.bin (error -110)
[  188.886054] i915 0000:00:02.0: GuC initialization failed -110
[  188.886088] [drm:i915_gem_init_hw [i915]] *ERROR* Enabling uc failed (-110)
[  188.886119] [drm:i915_reset [i915]] *ERROR* Failed to initialise HW following reset (-110)
[  188.908648] Setting dangerous option reset - tainting kernel
[  188.908915] i915 0000:00:02.0: Resetting chip for Manually set wedged engine mask = ffffffffffffffff
[  189.010293] [drm:guc_fw_xfer [i915]] *ERROR* GuC firmware xfer error -110
[  189.010335] [drm] GuC: Failed to load firmware i915/skl_guc_ver9_33.bin (error -110)
[  189.010337] i915 0000:00:02.0: GuC initialization failed -110
[  189.010371] [drm:i915_gem_init_hw [i915]] *ERROR* Enabling uc failed (-110)
[  189.010402] [drm:i915_reset [i915]] *ERROR* Failed to initialise HW following reset (-110)
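
For triage, the signature shared by these runs can be picked out of a dmesg capture with a short script. This is only a minimal sketch: it assumes Linux errno numbering (where 110 is ETIMEDOUT), and the function name and regular expression are illustrative, not part of IGT or the CI tooling.

```python
import re

# Linux errno value for ETIMEDOUT; spelled out here so the sketch
# does not depend on the platform it runs on (an assumption of this example).
ETIMEDOUT = 110

# Matches the driver message quoted in this report, e.g.
# "[  143.711748] [drm:guc_fw_xfer [i915]] *ERROR* GuC firmware xfer error -110"
XFER_ERROR = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[drm:guc_fw_xfer \[i915\]\] "
    r"\*ERROR\* GuC firmware xfer error -(?P<err>\d+)"
)


def find_xfer_timeouts(dmesg_text):
    """Return the timestamps (seconds since boot) of GuC fw xfer timeouts."""
    hits = []
    for line in dmesg_text.splitlines():
        match = XFER_ERROR.search(line)
        # Only count the timeout case; other error codes are ignored.
        if match and int(match.group("err")) == ETIMEDOUT:
            hits.append(float(match.group("ts")))
    return hits
```

Feeding it the text of a saved dmesg log returns one timestamp per failed transfer, which makes it easy to see whether a run hit the timeout once or repeatedly.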
Comment 1 Martin Peres 2018-06-13 09:14:55 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_61/fi-skl-guc/igt@kms_vblank@pipe-b-query-forked-busy-hang.html

[   43.753421] i915 0000:00:02.0: Resetting chip for hang on rcs0
[   43.869070] [drm:guc_fw_xfer [i915]] *ERROR* GuC firmware xfer error -110
[   43.869135] [drm] GuC: Failed to load firmware i915/skl_guc_ver9_33.bin (error -110)
[   43.869137] i915 0000:00:02.0: GuC initialization failed -110
[   43.869158] [drm:i915_gem_init_hw [i915]] *ERROR* Enabling uc failed (-110)
[   43.869174] [drm:i915_reset [i915]] *ERROR* Failed to initialise HW following reset (-110)
Comment 2 Chris Wilson 2018-07-20 10:50:50 UTC
*** Bug 106652 has been marked as a duplicate of this bug. ***
Comment 3 Chris Wilson 2018-08-31 14:10:08 UTC
*** Bug 107710 has been marked as a duplicate of this bug. ***
Comment 4 Chris Wilson 2018-09-06 19:17:39 UTC
*** Bug 107837 has been marked as a duplicate of this bug. ***
Comment 5 Chris Wilson 2018-09-21 16:26:17 UTC
And guc submission has been dropped from CI.
Comment 6 Martin Peres 2018-10-12 14:21:22 UTC
(In reply to Chris Wilson from comment #5)
> And guc submission has been dropped from CI.

Indeed, but the bug still remains even without it:

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_123/fi-skl-guc/igt@kms_flip@2x-flip-vs-panning-vs-hang.html

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_122/fi-skl-guc/igt@gem_eio@in-flight-contexts-1us.html

[...]

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_125/fi-skl-guc/igt@gem_eio@reset-stress.html

<3> [136.597804] [drm:guc_fw_xfer [i915]] *ERROR* GuC firmware xfer error -110
<3> [136.598129] [drm:i915_gem_init_hw [i915]] *ERROR* Enabling uc failed (-110)
<3> [136.598290] [drm:i915_reset [i915]] *ERROR* Failed to initialise HW following reset (-110)
Comment 7 Martin Peres 2018-10-15 10:03:25 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4977/fi-apl-guc/igt@pm_rpm@module-reload.html

<3> [363.630599] [drm:intel_guc_send_mmio [i915]] *ERROR* MMIO: GuC action 0x502 failed with error -110 0x502
<3> [363.973595] [drm:intel_guc_send_mmio [i915]] *ERROR* MMIO: GuC action 0x501 failed with error -110 0x501
Comment 8 Chris Wilson 2018-10-23 16:14:20 UTC
Error message declared superfluous:

commit fbffc5a3b877adc0c5334f3f6ff628ffb7e70d5e (HEAD -> drm-intel-next-queued)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Thu Oct 18 20:55:36 2018 +0100

    drm/i915/guc: Propagate the fw xfer timeout
    
    Propagate the timeout on transferring the fw back to the caller where it
    may act upon it, usually by restarting the xfer before failing.
    
    v2: Simplify the wait to only wait upon the guc signaling completion,
    with an assertion that the fw xfer must have completed for it to be
    ready!
    
    Testcase: igt/drv_selftest/live_hangcheck
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
    Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
    Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20181018195536.11522-1-chris@chris-wilson.co.uk
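
The idea described in the commit message (hand the timeout back to the caller, which may restart the transfer before giving up) can be sketched as below. This is a plain illustration of the pattern and not the i915 code: `upload_with_retry`, the `xfer` callable, and the default of three attempts are all hypothetical, and 110 is assumed to be the Linux value of ETIMEDOUT.

```python
ETIMEDOUT = 110  # Linux errno value, matching the -110 seen in the logs


def upload_with_retry(xfer, attempts=3):
    """Call xfer() until it stops timing out or the attempts run out.

    xfer is a hypothetical callable that returns 0 on success or a
    negative errno on failure, mirroring the kernel convention.
    """
    err = -ETIMEDOUT
    for _ in range(attempts):
        err = xfer()
        if err != -ETIMEDOUT:
            break  # success, or an error that a retry will not fix
    # The last result is returned as-is so the caller sees the timeout
    # only after every attempt has failed.
    return err
```

With this shape, a single slow transfer no longer needs to be logged as an error by the transfer routine itself; only the caller decides when the failure is final.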
Comment 9 Martin Peres 2018-11-01 17:09:54 UTC
(In reply to Chris Wilson from comment #8)
> Error message declared superfluous:
> 
> commit fbffc5a3b877adc0c5334f3f6ff628ffb7e70d5e (HEAD -> drm-intel-next-queued)
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Thu Oct 18 20:55:36 2018 +0100
> 
>     drm/i915/guc: Propagate the fw xfer timeout
>     
>     Propagate the timeout on transferring the fw back to the caller where it
>     may act upon it, usually by restarting the xfer before failing.
>     
>     v2: Simplify the wait to only wait upon the guc signaling completion,
>     with an assertion that the fw xfer must have completed for it to be
>     ready!
>     
>     Testcase: igt/drv_selftest/live_hangcheck
>     Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>     Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
>     Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>     Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>     Link: https://patchwork.freedesktop.org/patch/msgid/20181018195536.11522-1-chris@chris-wilson.co.uk

Thanks! Seems to have done the trick!

