Bug 110554 - [CI][DRMTIP][guc] igt@drm_import_export@flink - timeout - GPU HANG
Status: RESOLVED WONTFIX
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: Other All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard: ReadyForDev
Keywords:
Duplicates: 110743
Depends on:
Blocks:
 
Reported: 2019-04-30 07:07 UTC by Lakshmi
Modified: 2019-05-28 11:17 UTC

See Also:
i915 platform: BXT
i915 features: firmware/guc


Description Lakshmi 2019-04-30 07:07:53 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_269/fi-apl-guc/igt@drm_import_export@flink.html

<6> [239.656456] Console: switching to colour dummy device 80x25
<6> [239.656614] [IGT] drm_import_export: executing
<6> [239.704270] [IGT] drm_import_export: starting subtest flink
<6> [532.542666] kworker/dying (7) used greatest stack depth: 11264 bytes left
<5> [933.054772] Setting dangerous option reset - tainting kernel
<7> [963.579379] hangcheck rcs0
<7> [963.579386] hangcheck 	Awake? 1
<7> [963.579391] hangcheck 	Hangcheck a4eb31e2:315287d9 [6017 ms]
<7> [963.579394] hangcheck 	Reset count: 0 (global 0)
<7> [963.579398] hangcheck 	Requests:
<7> [963.579406] hangcheck 		first   9:3a  prio=-8186 @ 8001ms: [i915]
<7> [963.579412] hangcheck 		last    9:3a  prio=-8186 @ 8001ms: [i915]
<7> [963.579426] hangcheck 	RING_START: 0x00861000
<7> [963.579433] hangcheck 	RING_HEAD:  0x00001380
<7> [963.579439] hangcheck 	RING_TAIL:  0x00001380
<7> [963.579449] hangcheck 	RING_CTL:   0x00003000
<7> [963.579458] hangcheck 	RING_MODE:  0x00000200 [idle]
<7> [963.579465] hangcheck 	RING_IMR: fffffefe
<7> [963.579477] hangcheck 	ACTHD:  0x00000000_00001380
<7> [963.579489] hangcheck 	BBADDR: 0x00000000_0001be84
<7> [963.579501] hangcheck 	DMA_FADDR: 0x00000000_00000000
<7> [963.579508] hangcheck 	IPEIR: 0x00000000
<7> [963.579514] hangcheck 	IPEHR: 0x00000000
<7> [963.579524] hangcheck 	Execlist status: 0x00000301 00000000, entries 6
<7> [963.579528] hangcheck 	Execlist CSB read 5, write 3, tasklet queued? no (enabled)
<7> [963.579532] hangcheck 	Execlist CSB[0]: 0x00000001, context: 0
<7> [963.579536] hangcheck 	Execlist CSB[1]: 0x00000018, context: 2092455
<7> [963.579540] hangcheck 	Execlist CSB[2]: 0x00000001, context: 0
<7> [963.579544] hangcheck 	Execlist CSB[3]: 0x00000018, context: 2092455
<7> [963.579551] hangcheck 		ELSP[0] count=1, ring:{start:007ee000, hwsp:fede7000, seqno:00000038}, rq:  9:3a  prio=-8186 @ 8001ms: [i915]
<7> [963.579555] hangcheck 		ELSP[1] idle
<7> [963.579559] hangcheck 		HW active? 0x1
<7> [963.579567] hangcheck 		E  9:3a  prio=-8186 @ 8001ms: [i915]
<7> [963.579571] hangcheck HWSP:
<7> [963.579577] hangcheck [0000] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [963.579580] hangcheck *
<7> [963.579586] hangcheck [0040] 00000001 00000000 00000018 001feda7 00000001 00000000 00000018 001feda7
<7> [963.579591] hangcheck [0060] 00000001 00000000 00000018 001feda7 00000000 00000000 00000000 00000003
<7> [963.579595] hangcheck [0080] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [963.579599] hangcheck *
<7> [963.579604] hangcheck [00c0] 00000000 00000000 00000000 00000000 a4eb31e2 00000000 00000000 00000000
<7> [963.579608] hangcheck [00e0] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<7> [963.579612] hangcheck *
<7> [963.579618] hangcheck Idle? no
<6> [963.591046] i915 0000:00:02.0: GPU HANG: ecode 9:0:0x00000000, hang on rcs0
<7> [963.592592] [drm:i915_reset_device [i915]] resetting chip
<5> [963.593845] i915 0000:00:02.0: Resetting chip for hang on rcs0
<7> [963.595619] [drm:intel_uc_fw_upload [i915]] HuC fw load i915/bxt_huc_ver01_8_2893.bin
<7> [963.595697] [drm:intel_uc_fw_upload [i915]] HuC fw load PENDING
<7> [963.596664] [drm:huc_fw_xfer [i915]] HuC DMA transfer wait over with ret 0
<7> [963.596749] [drm:intel_uc_fw_upload [i915]] HuC fw load SUCCESS
<6> [963.596753] [drm] HuC: Loaded firmware i915/bxt_huc_ver01_8_2893.bin (version 1.8)
<7> [963.596830] [drm:intel_guc_init_params [i915]] param[ 0] = 0x816040
<7> [963.596904] [drm:intel_guc_init_params [i915]] param[ 1] = 0x0
<7> [963.596978] [drm:intel_guc_init_params [i915]] param[ 2] = 0x5f5e100
<7> [963.597053] [drm:intel_guc_init_params [i915]] param[ 3] = 0x0
<7> [963.597127] [drm:intel_guc_init_params [i915]] param[ 4] = 0x7f7fd3
<7> [963.597246] [drm:intel_guc_init_params [i915]] param[ 5] = 0x0
<7> [963.597324] [drm:intel_guc_init_params [i915]] param[ 6] = 0x8
<7> [963.597401] [drm:intel_guc_init_params [i915]] param[ 7] = 0x3
<7> [963.597476] [drm:intel_guc_init_params [i915]] param[ 8] = 0x405203
<7> [963.597554] [drm:intel_guc_init_params [i915]] param[ 9] = 0x0
<7> [963.597630] [drm:intel_guc_init_params [i915]] param[10] = 0x0
<7> [963.597706] [drm:intel_guc_init_params [i915]] param[11] = 0x0
<7> [963.597780] [drm:intel_guc_init_params [i915]] param[12] = 0x0
<7> [963.597854] [drm:intel_guc_init_params [i915]] param[13] = 0x0
<7> [963.597952] [drm:intel_uc_fw_upload [i915]] GuC fw load i915/bxt_guc_ver9_29.bin
<7> [963.598026] [drm:intel_uc_fw_upload [i915]] GuC fw load PENDING
<7> [963.608246] [drm:guc_fw_xfer [i915]] GuC status 0x8002f0ec
<7> [963.608331] [drm:intel_uc_fw_upload [i915]] GuC fw load SUCCESS
<6> [963.608336] [drm] GuC: Loaded firmware i915/bxt_guc_ver9_29.bin (version 9.29)
<7> [963.608846] [drm:__guc_client_enable [i915]] Host engines 0x47 => GuC engines used 0xf
<7> [963.609832] [drm:__guc_client_enable [i915]] Host engines 0x47 => GuC engines used 0xf
<6> [963.609995] i915 0000:00:02.0: GuC firmware version 9.29
<6> [963.610000] i915 0000:00:02.0: GuC submission enabled
<6> [963.610004] i915 0000:00:02.0: HuC enabled
Comment 1 CI Bug Log 2019-04-30 07:09:49 UTC
The CI Bug Log issue associated with this bug has been updated.

### New filters associated

* APL: igt@drm_import_export@flink - timeout - GPU HANG
  - https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_269/fi-apl-guc/igt@drm_import_export@flink.html
Comment 2 Chris Wilson 2019-05-23 15:26:31 UTC
*** Bug 110743 has been marked as a duplicate of this bug. ***
Comment 3 Chris Wilson 2019-05-28 11:17:08 UTC
commit a2904ade3dc28cf1a1b7deded41f4369f75e664c
Author: Michal Wajdeczko <michal.wajdeczko@intel.com>
Date:   Mon May 27 18:35:58 2019 +0000

    drm/i915/guc: Don't allow GuC submission
    
    Due to the upcoming changes to the GuC ABI interface, we must
    disable GuC submission mode until final ABI will be available
    on all GuC firmwares.
    
    To avoid regressions on systems configured to run with no longer
    supported configuration "enable_guc=3" or "enable_guc=1" clear
    GuC submission bit.
    
    v2: force switch to non-GuC submission mode
    v3: use GEM_BUG_ON (Joonas)
    
    Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
    Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Cc: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
    Cc: John Spotswood <john.a.spotswood@intel.com>
    Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
    Cc: Tony Ye <tony.ye@intel.com>
    Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
    Cc: Jeff Mcgee <jeff.mcgee@intel.com>
    Cc: Antonio Argenziano <antonio.argenziano@intel.com>
    Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>
    Cc: Martin Peres <martin.peres@linux.intel.com>
    Acked-by: Martin Peres <martin.peres@linux.intel.com>
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Link: https://patchwork.freedesktop.org/patch/msgid/20190527183613.17076-3-michal.wajdeczko@intel.com
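
For readers unfamiliar with the "enable_guc" values named in the commit message, the following is a minimal sketch of what "clear GuC submission bit" amounts to. It assumes the bitmask layout documented for the i915 module parameter at the time (bit 0 selects GuC submission, bit 1 selects HuC loading). The helper name `sanitize_enable_guc` is hypothetical and is not the function the patch touches; the sketch also ignores the -1 "auto" value.

```c
#include <stdint.h>

/* Assumed bit layout of the i915 "enable_guc" module parameter:
 * bit 0 requests GuC submission, bit 1 requests HuC firmware loading. */
#define ENABLE_GUC_SUBMISSION (UINT32_C(1) << 0)
#define ENABLE_GUC_LOAD_HUC   (UINT32_C(1) << 1)

/* Hypothetical helper (not the kernel's code): drop the submission
 * request and keep every other requested bit unchanged. */
static inline uint32_t sanitize_enable_guc(uint32_t requested)
{
    return requested & ~ENABLE_GUC_SUBMISSION;
}
```

Under these assumptions, a request of 3 (submission plus HuC) becomes 2 (HuC only), and a request of 1 (submission only) becomes 0, which matches the commit's stated intent of falling back to non-GuC submission.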

