| Summary: | [BAT][ELK] The gpu failed to reset when executing igt@gem_exec_fence@await-hang-default in CI | | |
|---|---|---|---|
| Product: | DRI | Reporter: | Martin Peres <martin.peres> |
| Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
| Status: | CLOSED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
| Severity: | critical | | |
| Priority: | highest | CC: | intel-gfx-bugs, jani.saarinen |
| Version: | XOrg git | | |
| Hardware: | Other | | |
| OS: | All | | |
| Whiteboard: | PatchMerged | | |
| i915 platform: | G45 | i915 features: | GPU hang |
Description
Martin Peres
2017-05-05 07:21:00 UTC
Adding tag to the "Whiteboard" field: ReadyForDev. The bug is still active.
* Status is correct
* Platform is included
* Feature is included
* Priority and Severity correctly set

This occurred again. This time, it also generated a lot of other dmesg-fails:
- https://intel-gfx-ci.01.org/CI/CI_DRM_2598/fi-elk-e7500/igt@gem_mmap_gtt@basic.html
- https://intel-gfx-ci.01.org/CI/CI_DRM_2598/fi-elk-e7500/igt@gem_mmap@basic-small-bo.html
- https://intel-gfx-ci.01.org/CI/CI_DRM_2598/fi-elk-e7500/igt@gem_flink_basic@bad-open.html
- https://intel-gfx-ci.01.org/CI/CI_DRM_2598/fi-elk-e7500/igt@gem_flink_basic@double-flink.html
- https://intel-gfx-ci.01.org/CI/CI_DRM_2598/fi-elk-e7500/igt@gem_flink_basic@basic.html
- https://intel-gfx-ci.01.org/CI/CI_DRM_2598/fi-elk-e7500/igt@gem_flink_basic@flink-lifetime.html
- https://intel-gfx-ci.01.org/CI/CI_DRM_2598/fi-elk-e7500/igt@gem_flink_basic@bad-flink.html
- https://intel-gfx-ci.01.org/CI/CI_DRM_2598/fi-elk-e7500/igt@gem_mmap@basic.html

ELK reset is not working as expected. Sometimes resetting the GPU results in the whole GPU getting stuck (~5% chance). No workaround found yet. Neither retrying the reset nor going D3 -> D0 helps to revive it.

Does it fail when doing the media reset or the render reset? There are a couple of interesting-looking w/a names in the database: WaMediaResetBeforeFullReset and WaMediaResetMainRingCleanup.

WaMediaResetBeforeFullReset might just mean that we should change the order in which we do the resets. Not sure what WaMediaResetMainRingCleanup might be about. Does it mean that we should do a media reset when cleaning up the ring, or that we should do some kind of ring cleanup when we do a media reset?

Failed even with media reset before render. I will try next with stopping the rings in reverse order prior to the reset. That is the best I can imagine WaMediaResetMainRingCleanup to be.

(In reply to Ville Syrjala from comment #4)
> Does it fail when doing the media reset or the render reset?

It fails when doing the render reset. The reset never completes and the GPU seems to be stuck, as the ring inits fail when trying to restart.

commit 2c80353f3cd0cd4b28b17d55226e5914d2c0d5e1
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date: Fri May 19 12:13:40 2017 +0300

    drm/i915/g4x: Improve gpu reset reliability

*** Bug 100943 has been marked as a duplicate of this bug. ***

*** Bug 100999 has been marked as a duplicate of this bug. ***
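
For readers following the ordering discussion above, here is a minimal sketch of the sequence that was proposed as an experiment: stop the rings in reverse order, then reset the media domain before the render domain. It is a plain Python model, not driver code. The ring names, the step labels `stop_ring` and `reset_domain`, and the function `plan_reset` are all hypothetical and chosen for illustration only. It does not claim to reflect what the commit referenced above actually changed.

```python
from typing import List, Tuple

# Hypothetical ring list, in the order the rings would normally be initialised.
RINGS = ["render", "media"]


def plan_reset(rings: List[str], media_first: bool = True) -> List[Tuple[str, str]]:
    """Return the ordered steps for one reset attempt (illustrative model only)."""
    steps: List[Tuple[str, str]] = []

    # Stop the rings in reverse order before touching any reset domain.
    for ring in reversed(rings):
        steps.append(("stop_ring", ring))

    # Order of the domain resets: media before render is the ordering
    # suggested by the name WaMediaResetBeforeFullReset in the thread.
    domains = ["media", "render"] if media_first else ["render", "media"]
    for domain in domains:
        steps.append(("reset_domain", domain))

    return steps


if __name__ == "__main__":
    for action, target in plan_reset(RINGS):
        print(action, target)
```

Keeping the plan as data makes it easy to compare the two orderings side by side; per the comments above, media-before-render alone was not enough to avoid the stuck render reset.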