Bug 111562

Summary: [CI][BAT] igt@gem* - fail / dmesg-fail - Failed assertion: !"GPU hung"
Product: DRI
Reporter: Martin Peres <martin.peres>
Component: DRM/Intel
Assignee: Intel GFX Bugs mailing list <intel-gfx-bugs>
Status: RESOLVED FIXED
QA Contact: Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: not set
Priority: highest
CC: intel-gfx-bugs
Version: XOrg git
Hardware: Other
OS: All
Whiteboard:
i915 platform: TGL
i915 features: GEM/Other

Description Martin Peres 2019-09-05 07:37:11 UTC
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6838/fi-tgl-u/igt@gem_close_race@basic-threads.html

Starting subtest: basic-threads
(gem_close_race:1040) igt_aux-CRITICAL: Test assertion failure function sig_abort, file ../lib/igt_aux.c:502:
(gem_close_race:1040) igt_aux-CRITICAL: Failed assertion: !"GPU hung"
Comment 1 CI Bug Log 2019-09-05 07:37:37 UTC
The CI Bug Log issue associated to this bug has been updated.

### New filters associated

* TGL: random tests - fail / dmesg-fail - Failed assertion: !"GPU hung"
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14277/fi-tgl-u/igt@gem_close_race@basic-process.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14277/fi-tgl-u/igt@gem_close_race@basic-threads.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14277/fi-tgl-u/igt@gem_ctx_create@basic-files.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14277/fi-tgl-u/igt@gem_exec_basic@basic-all.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14277/fi-tgl-u/igt@gem_exec_create@basic.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14277/fi-tgl-u/igt@gem_exec_fence@basic-await-default.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14277/fi-tgl-u/igt@gem_exec_fence@nb-await-default.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14277/fi-tgl-u/igt@gem_exec_gttfill@basic.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14277/fi-tgl-u/igt@gem_exec_parallel@basic.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14277/fi-tgl-u/igt@gem_exec_store@basic-all.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14277/fi-tgl-u/igt@gem_exec_suspend@basic.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14277/fi-tgl-u/igt@gem_exec_suspend@basic-s3.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14277/fi-tgl-u/igt@gem_exec_suspend@basic-s4-devices.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14277/fi-tgl-u/igt@gem_sync@basic-all.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14277/fi-tgl-u/igt@gem_sync@basic-each.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14277/fi-tgl-u/igt@gem_sync@basic-many-each.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14277/fi-tgl-u/igt@gem_sync@basic-store-all.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14277/fi-tgl-u/igt@gem_sync@basic-store-each.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6838/fi-tgl-u/igt@gem_close_race@basic-process.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6838/fi-tgl-u/igt@gem_close_race@basic-threads.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6838/fi-tgl-u/igt@gem_ctx_create@basic-files.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6838/fi-tgl-u/igt@gem_exec_basic@basic-all.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6838/fi-tgl-u/igt@gem_exec_create@basic.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6838/fi-tgl-u/igt@gem_exec_fence@basic-await-default.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6838/fi-tgl-u/igt@gem_exec_fence@nb-await-default.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6838/fi-tgl-u/igt@gem_exec_gttfill@basic.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6838/fi-tgl-u/igt@gem_exec_parallel@basic.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6838/fi-tgl-u/igt@gem_exec_store@basic-all.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6838/fi-tgl-u/igt@gem_exec_suspend@basic.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6838/fi-tgl-u/igt@gem_exec_suspend@basic-s3.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6838/fi-tgl-u/igt@gem_exec_suspend@basic-s4-devices.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6838/fi-tgl-u/igt@gem_sync@basic-all.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6838/fi-tgl-u/igt@gem_sync@basic-each.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6838/fi-tgl-u/igt@gem_sync@basic-many-each.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6838/fi-tgl-u/igt@gem_sync@basic-store-all.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6838/fi-tgl-u/igt@gem_sync@basic-store-each.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14279/fi-tgl-u/igt@gem_close_race@basic-process.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14279/fi-tgl-u/igt@gem_close_race@basic-threads.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14279/fi-tgl-u/igt@gem_ctx_create@basic-files.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14279/fi-tgl-u/igt@gem_exec_basic@basic-all.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14279/fi-tgl-u/igt@gem_exec_create@basic.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14279/fi-tgl-u/igt@gem_exec_fence@basic-await-default.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14279/fi-tgl-u/igt@gem_exec_fence@nb-await-default.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14279/fi-tgl-u/igt@gem_exec_gttfill@basic.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14279/fi-tgl-u/igt@gem_exec_parallel@basic.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14279/fi-tgl-u/igt@gem_exec_store@basic-all.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14279/fi-tgl-u/igt@gem_exec_suspend@basic.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14279/fi-tgl-u/igt@gem_exec_suspend@basic-s3.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14279/fi-tgl-u/igt@gem_exec_suspend@basic-s4-devices.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14279/fi-tgl-u/igt@gem_sync@basic-all.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14279/fi-tgl-u/igt@gem_sync@basic-each.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14279/fi-tgl-u/igt@gem_sync@basic-many-each.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14279/fi-tgl-u/igt@gem_sync@basic-store-all.html
  - https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_14279/fi-tgl-u/igt@gem_sync@basic-store-each.html
Comment 2 Chris Wilson 2019-09-05 11:17:38 UTC
*** Bug 111560 has been marked as a duplicate of this bug. ***
Comment 3 Chris Wilson 2019-09-05 11:18:01 UTC
*** Bug 111563 has been marked as a duplicate of this bug. ***
Comment 4 Chris Wilson 2019-09-05 11:18:20 UTC
*** Bug 111564 has been marked as a duplicate of this bug. ***
Comment 5 Chris Wilson 2019-09-05 11:19:31 UTC
The failure signature is from context restore; lo and behold, we missed the patch to update the context image for gen12. Hopefully that will land soon and we can move on to the next error.
Comment 6 Chris Wilson 2019-09-05 17:44:35 UTC
https://patchwork.freedesktop.org/series/66276/
Comment 7 Chris Wilson 2019-09-06 17:21:40 UTC
commit 5bf05dc58d65b215437df3013163a7eea78d5d4c (HEAD -> drm-intel-next-queued, drm-intel/drm-intel-next-queued)
Author: Michel Thierry <michel.thierry@intel.com>
Date:   Fri Sep 6 15:23:14 2019 +0300

    drm/i915/tgl: Register state context definition for Gen12
    
    Gen12 has subtle changes in the reg state context offsets (some fields
    are gone, some are in a different location), compared to previous Gens.
    
    The simplest approach seems to be keeping Gen12 (and future platform)
    changes apart from the previous gens, while keeping the registers that
    are contiguous in functions we can reuse.
    
    v2: alias, virtual engine, rpcs, prune unused regs
    v3: use engine base (Daniele), take ctx_bb for all
    
    Bspec: 46255
    Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
    Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
    Cc: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Cc: José Roberto de Souza <jose.souza@intel.com>
    Signed-off-by: Michel Thierry <michel.thierry@intel.com>
    Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
    Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
    Tested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
    [ickle: Tweaked the GEM_WARN_ON after settling on a compromise with
    Daniele]
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Link: https://patchwork.freedesktop.org/patch/msgid/20190906122314.2146-2-mika.kuoppala@linux.intel.com
Comment 8 Martin Peres 2019-09-09 08:11:56 UTC
(In reply to Chris Wilson from comment #7)

Thanks, it definitely fixed the issue. I will create a new bug for all the other instances of GPU hangs.
Comment 9 CI Bug Log 2019-10-16 10:37:33 UTC
The CI Bug Log issue associated to this bug has been archived.

New failures matching the above filters will not be associated to this bug anymore.
