Summary: | [KBL] igt@gem_ctx_switch@ blt-interruptible GuC 9.14 triggers GEM_BUG_ON(!i915_gem_request_completed(request)); | ||
---|---|---|---|
Product: | DRI | Reporter: | Mika Kuoppala <mika.kuoppala> |
Component: | DRM/Intel | Assignee: | John Spotswood <john.a.spotswood> |
Status: | CLOSED WORKSFORME | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Severity: | normal | ||
Priority: | low | CC: | intel-gfx-bugs |
Version: | DRI git | ||
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | ReadyForDev | ||
i915 platform: | KBL | i915 features: | GEM/Other |
Description
Mika Kuoppala
2017-06-29 08:55:11 UTC
This is 99.9999% likely not a driver bug. This was with guc submission enabled, kernel params: drm.debug=0xe 3 i915.enable_guc_loading=1 i915.enable_guc_submission=1 Without guc, ~2000 runs without failures.

Changing priority to low, since this is an IGT non-basic test and the failure is very sporadic (<20%). Thanks.

I am looking into this issue. Is it possible for someone to post a binary blob of the GuC log for the run when this issue is exhibited?

I am still looking into this issue; however, I have been context switching with other work. Also, I did not see a reply regarding the posting of a binary blob of the GuC log for the run when this issue is exhibited. Is that possible?

I seem to hit a lot of DRM_ERROR_RATELIMITED("no sub-buffer to capture logs\n"); while logging. Could you provide the desired log level and a method of reading the log buffers effectively? Or a way to switch to overwrite mode so that we get the latest stuff on error?

So far, I have been unable to reproduce this issue. I have executed the test as specified in the initial comment:

    while ./gem_ctx_switch --r blt-interruptible ; do date ; done

The loop has been running for over 24 hours, and I have not seen a failure. I am running on a KBL system; however, the device differs slightly: 00:02.0 8086:5917 (rev 07). I am running with the 9.14 version of the GuC FW and the latest kernel version from drm-intel-next-queued. It's unclear from this entry what kernel version exhibited the failure, but I could try a version based on the date when this issue was submitted. I am also seeing the dmesg error:

    [drm:guc_read_update_log_buffer] *ERROR* no sub-buffer to capture logs

I need to investigate that further.

Hello, I am still unable to reproduce this issue. I did try an older kernel version (4.12.0-rc5+), but I am not seeing any failures. Are there any configuration details that I should be aware of? Can you post any relevant configuration information, or maybe post the entire dmesg dump for the failing case?
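A minimal sketch of the reproduction loop quoted above, extended to count runs and stop at the first failure, so that figures such as "~2000 runs without failures" can be reported directly. The function name `repro_loop` and the run limit are illustrative assumptions, not part of IGT; the test invocation in the usage comment is copied from the comment above.

```shell
#!/bin/sh
# Hypothetical helper: run a command repeatedly, stopping at the first
# failure or after MAX_RUNS successful runs.
# Usage: repro_loop MAX_RUNS command [args...]
# e.g.:  repro_loop 2000 ./gem_ctx_switch --r blt-interruptible
repro_loop() {
    max_runs=$1
    shift
    runs=0
    while [ "$runs" -lt "$max_runs" ]; do
        runs=$((runs + 1))
        # "$@" is the command under test; a non-zero exit counts as a failure.
        if ! "$@"; then
            echo "failed on run $runs"
            return 1
        fi
    done
    echo "no failure in $runs runs"
    return 0
}
```

The function only reports the run count; saving the dmesg output or the GuC log at the point of failure is left to the caller.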
I am also unable to reproduce this with latest drm-tip: commit 63e85ec6f910933a46b5a50a2a077b6860ed4815, ~500 runs without failure.

If neither of us can reproduce this issue, do you have any suggestions on how we should proceed?

Hello, if this is not reproducible anymore, can this issue be closed? Thank you.

I think it should be closed. Unless Mika objects, I can close it.

To actually reconstruct the original bug where the guc was executing contexts out of order, you have to disable trickle feeding the guc. If you bump the queue depth up to, say, 128, that should restore behaviour similar to the original case.

First of all, sorry about the spam. This is a mass update for our bugs. Sorry if you find this annoying, but we are trying to understand whether this bug is still valid. If the bug investigation is still in progress, please ignore this, and I apologize! If you think this bug is no longer valid, please comment on the bug that it can be closed. If you haven't tested with our latest pre-upstream tree (drm-tip), can you do that as well to see whether the issue is still valid there, and if you cannot see the issue there, please comment on the bug.

Closing