| Summary: | [HSW] GPU hang on HSW Celeron when doing 16 VA-API decodes and compositing | | |
|---|---|---|---|
| Product: | DRI | Reporter: | Simon Farnsworth <simon> |
| Component: | DRM/Intel | Assignee: | Simon Farnsworth <simon> |
| Status: | CLOSED INVALID | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> |
| Severity: | normal | | |
| Priority: | medium | CC: | intel-gfx-bugs, przanoni |
| Version: | unspecified | | |
| Hardware: | Other | | |
| OS: | All | | |
| Whiteboard: | | | |
| i915 platform: | HSW | i915 features: | GPU hang |
Description
Simon Farnsworth
2014-10-22 11:30:54 UTC
Created attachment 108231 [details]
Error state collected during hang
Whilst this doesn't seem to be ppgtt related at first glance, you want to use i915.enable_ppgtt=1 with that kernel to prevent an eventual hang. It also looks to be a different bug than bug 83677.

On second thoughts, the symptom is slightly different, but it is still dying inside a context restore, but on a different command MEDIA_VFE_STATE instead of 3DSTATE_VF.

Try:

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915
index 670edfd..4f4de1c 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -507,6 +507,10 @@ mi_set_context(struct i915_gem_request *rq,
 	if (IS_GEN6(rq->i915))
 		rq->pending_flush |= I915_INVALIDATE_CACHES;
 
+	ret = i915_request_emit_flush(rq, I915_COMMAND_BARRIER);
+	if (ret)
+		return ret;
+
 	len = 3;
 	switch (INTEL_INFO(rq->i915)->gen) {
 	case 8:

The patch I'm testing is:

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 841056c..0386721 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -434,6 +434,7 @@ mi_set_context(struct i915_gem_request *rq,
 {
 	struct intel_ringbuffer *ring;
 	int len;
+	int ret;
 
 	/* w/a: If Flush TLB Invalidation Mode is enabled, driver must do a TLB
 	 * invalidation prior to MI_SET_CONTEXT. On GEN6 we don't set the value
@@ -443,6 +444,10 @@ mi_set_context(struct i915_gem_request *rq,
 	if (IS_GEN6(rq->i915))
 		rq->pending_flush |= I915_INVALIDATE_CACHES;
 
+	ret = i915_request_emit_flush(rq, I915_COMMAND_BARRIER);
+	if (ret)
+		return ret;
+
 	len = 3;
 	switch (INTEL_INFO(rq->i915)->gen) {
 	case 8:

I've also set i915.enable_ppgtt=1 on the kernel command line. I'll let you know what I find.

Created attachment 108234 [details]
Error state after patch from comment #3 is applied

No luck with that patch - it appears to simply move the deckchairs around again. New error state attached.
I note from the OSRC PRMs that your patch, strictly speaking, asks the GPU to do something that's claimed as not supported - you set DW1 bit 20 (CS stall), but not one of the 5 bits the OSRC PRMs claim you must also set (at least one of DW1 bits 12, 0, 1, 13 or 15:14 must be set).

Hi Simon, any chance you can update the status of this bug?

Maybe it was just a side-effect of the ctx restore bug after all! I can wish. At any rate, we have Ben's patch to try and Ben has been tackling further HSW GT1 issues in mesa.

I need to free up some time to investigate this again, with Chris's context switch fix in place.

(In reply to Simon Farnsworth from comment #7)
> I need to free up some time to investigate this again, with Chris's context
> switch fix in place.

After a little over a year, have you had time to check this, or should we just close the bug...?

(In reply to Jani Nikula from comment #8)
> (In reply to Simon Farnsworth from comment #7)
> > I need to free up some time to investigate this again, with Chris's context
> > switch fix in place.
>
> After a little over a year, have you had time to check this, or should we
> just close the bug...?

I didn't have the chance to investigate again before I left ONELAN and thus lost access to the hardware. I've marked the bug as INVALID, since I can't help further.

(In reply to Simon Farnsworth from comment #9)
> I didn't have the chance to investigate again before I left ONELAN and thus
> lost access to the hardware. I've marked the bug as INVALID, since I can't
> help further.

Thanks for the follow-up, Simon!
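
For reference, the PIPE_CONTROL constraint quoted from the OSRC PRMs earlier in this thread (if DW1 bit 20, CS stall, is set, then at least one of DW1 bits 12, 0, 1, 13 or 15:14 must also be set) can be sketched as a small standalone check. This is only an illustration of the rule as stated in that comment: the macro and function names below are hypothetical and are not taken from the i915 driver, and only the bit positions come from the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; only the bit positions are from the thread. */
#define EX_PC_CS_STALL (UINT32_C(1) << 20) /* DW1 bit 20: CS stall */

/* DW1 bits 0, 1, 12, 13 and the two-bit field 15:14. */
#define EX_PC_COMPANION_MASK                      \
	((UINT32_C(1) << 0)  | (UINT32_C(1) << 1)  | \
	 (UINT32_C(1) << 12) | (UINT32_C(1) << 13) | \
	 (UINT32_C(3) << 14))

/*
 * Returns true if dw1 satisfies the rule as quoted in the thread:
 * a CS stall must be accompanied by at least one companion bit.
 * A DW1 without CS stall is not restricted by this particular rule.
 */
static bool ex_pipe_control_dw1_ok(uint32_t dw1)
{
	if (!(dw1 & EX_PC_CS_STALL))
		return true;

	return (dw1 & EX_PC_COMPANION_MASK) != 0;
}
```

Under this reading, a DW1 value with only bit 20 set, which is what the comment says the tested patch emits, would fail the check, while bit 20 combined with any of the listed bits would pass.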