Summary: | [CI][DRMTIP] igt@i915_suspend@debugfs-reader - dmesg-warn - Unexpected send: action=0x30 | ||
---|---|---|---|
Product: | DRI | Reporter: | Martin Peres <martin.peres> |
Component: | DRM/Intel | Assignee: | Robert M. Fosha <robert.m.fosha> |
Status: | RESOLVED MOVED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
Severity: | normal | ||
Priority: | medium | CC: | intel-gfx-bugs, jon.ewins |
Version: | DRI git | ||
Hardware: | Other | ||
OS: | All | ||
Whiteboard: | |||
i915 platform: | KBL | i915 features: | firmware/guc |
Description
Martin Peres
2019-02-03 12:41:38 UTC
*** Bug 109537 has been marked as a duplicate of this bug. ***

The guc is in a reset state (from an earlier failure leaving the driver wedged) and not able to respond. Either this should not be a warn, or the logging should be more intelligent about when guc is disabled.

Worth a shot to see if this is fixed by

commit 07c100b187332101220baf7446b4f09296d7c59b
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Thu Feb 21 16:38:33 2019 +0000

    drm/i915/guc: Flush the residual log capture irq on disabling

    As we disable the log capture events, flush any residual interrupt
    before we flush and disable the worker.

    v2: Mika pointed out that it wasn't the worker re-queueing itself, but a
    rogue irq.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109716
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20190221163833.21393-1-chris@chris-wilson.co.uk

(In reply to Chris Wilson from comment #3)
> Worth a shot to see if this is fixed by
>
> commit 07c100b187332101220baf7446b4f09296d7c59b
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date: Thu Feb 21 16:38:33 2019 +0000
>
> drm/i915/guc: Flush the residual log capture irq on disabling
>
> As we disable the log capture events, flush any residual interrupt
> before we flush and disable the worker.
>
> v2: Mika pointed out that it wasn't the worker re-queueing itself, but a
> rogue irq.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109716
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
> Link: https://patchwork.freedesktop.org/patch/msgid/20190221163833.21393-1-chris@chris-wilson.co.uk

Seems like it might have reduced the reproduction rate, but unfortunately, this is still happening:

https://intel-gfx-ci.01.org/tree/drm-tip/drmtip_234/fi-kbl-guc/igt@i915_suspend@debugfs-reader.html

The issue with the guc logging awareness of GuC state will still need to be fixed for when guc relay is in use, but as a first step the intention is to stop this issue occurring in BAT tests.

The relay log is a developer only feature and an extension of the standard GuC logging mechanism. Its testing should not be part of the BAT set. It is being implicitly enabled by existing tests, which provide no consumer for the logs.

The issue is that IGT tests that cycle through reading the debugfs entries will call the open file op for the guc_log_relay control file and this currently causes the file to be both created and the logging to be started. These tests have no consumer for the logs which go on to overflow, run out of sub-buffers or, as in this specific case, misbehave in suspend and reset handling.

A proposed change, that will follow as RFC patches, is to separate the creation and logging start, updating the guc_log_relay_write function, which currently just flushes the log, to also support starting the logging based on the value written.

An additional new test to actually explicitly test this relay log developer feature will follow separately.

Same issue as 111148 and 111165.

Proposed fix tested and passing. Patches in prep for posting.

RFC patch sent out for review: drm/i915/guc: Enable guc logging on guc log relay write.

While creating and testing the patch also noticed that the IGT tool intel_guc_logger needs to be updated for the new relay implementation. Will create another patch to update the IGT tool.

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/intel/issues/225.
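
For readers following the thread, the proposed split between creating the relay and starting the logging can be illustrated with a small toy model. This is only a sketch and not the i915 code: the class `RelayLog`, the functions `relay_open_current`, `relay_open_proposed`, `relay_write_proposed` and `debugfs_reader` are hypothetical names, and the mapping of written values (1 starts logging, anything else flushes) is an assumption, since the thread only says that logging starts "based on the value written".

```python
class RelayLog:
    """Toy model of the relay log state (hypothetical, not the i915 driver)."""

    def __init__(self):
        self.channel_created = False
        self.logging_started = False
        self.flush_count = 0

    def create_channel(self):
        self.channel_created = True

    def start_logging(self):
        if not self.channel_created:
            raise RuntimeError("relay channel must exist before logging starts")
        self.logging_started = True

    def flush(self):
        self.flush_count += 1


def relay_open_current(log):
    """Behaviour described in the report: open creates the file and starts logging."""
    log.create_channel()
    log.start_logging()


def relay_open_proposed(log):
    """Proposed behaviour: open only creates the relay, logging stays off."""
    log.create_channel()


def relay_write_proposed(log, value):
    """Proposed behaviour: the written value selects start or flush.

    Assumption: 1 starts logging, any other value flushes.
    """
    if int(value) == 1:
        log.start_logging()
    else:
        log.flush()


def debugfs_reader(open_op):
    """Mimic a test that opens the control file but never consumes the logs."""
    log = RelayLog()
    open_op(log)
    return log


if __name__ == "__main__":
    # With the current behaviour, merely opening the file starts logging.
    print(debugfs_reader(relay_open_current).logging_started)   # True
    # With the proposed split, opening alone leaves logging off.
    print(debugfs_reader(relay_open_proposed).logging_started)  # False
```

In this model a test that only opens the control file no longer starts a log stream that nobody consumes; a developer tool would have to write to the file explicitly to start it.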