During deployment of a new Haswell device to Intel's Mesa CI, we found that many tests fail for Linux 4.13. Using the stock 4.9 kernel, the same tests pass. About 100 tests fail on the newer public kernel, eg: ES31-CTS.functional.texture.gather.offset_dynamic.min_required_offset.2d.depth32f.size_npot.compare_greater.mirrored_repeat_clamp_to_edge ES31-CTS.functional.texture.gather.offset_dynamic.min_required_offset.2d.depth32f.size_npot.compare_less.repeat_mirrored_repeat Standard Output Using a framebuffer object with renderbuffer with format GL_RGBA8 and size (64, 64) Note: texture level's size is (17, 23) Texture base level is 0 s and t wrap modes are GL_MIRRORED_REPEAT and GL_CLAMP_TO_EDGE, respectively Minification and magnification filter modes are GL_NEAREST and GL_NEAREST, respectively (note that they should have no effect on gather result) Using texture swizzle [default swizzle state] Using texture compare func GL_GREATER Texture coordinates run from (-0.3, -0.4) to (1.5, 1.6) About 100 analogous Vulkan CTS tests fail as well.
In the unexpected bisect of the day: b7048ea12fbb2724ee0cd30752d4fac43cab0651 is the first bad commit commit b7048ea12fbb2724ee0cd30752d4fac43cab0651 Author: Ville Syrjälä <ville.syrjala@linux.intel.com> Date: Wed Mar 15 16:31:58 2017 +0200 drm/i915: Do .init_clock_gating() earlier to avoid it clobbering watermarks Currently ILK-BDW explicitly disable LP1+ watermarks from their .init_clock_gating() hooks. Unfortunately that hook gets called way too late since by that time we've already initialized all the watermark state tracking which then gets out of sync with the hardware state. We may eventually want to consider killing off the explicit LP1+ disable from .init_clock_gating(). In the meantime however, we can avoid the problem by reordering the init sequence such that intel_modeset_init_hw()->intel_init_clock_gating() gets called prior to the hardware state takeover. I suppose prior to the two stage watermark programming we were magically saved by something that forced the watermarks to be reprogrammed fully after .init_clock_gating() got called. But now that no longer happens. Note that the diff might look a bit odd as it kills off one call of intel_update_cdclk(), but that's fine because intel_modeset_init_hw() does the exact same thing. Previously we just did it twice. Actually even this new init sequence is pretty bogus as .init_clock_gating() really should be called before any gem hardware init since it can configure various clock gating workarounds and whatnot that affect the GT side as well. Also intel_modeset_init() really should get split up into better defined init stages. Another "fun" detail is that intel_modeset_gem_init() is where RPS/RC6 gets configured. Why that is done from the display code is beyond me. I've decided to leave all this be for now, and just try to fix the init sequence enough for watermarks to work. Cc: stable@vger.kernel.org Cc: Gabriele Mazzotta <gabriele.mzt@gmail.com> Cc: David Purton <dcpurton@marshwiggle.net> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reported-by: Gabriele Mazzotta <gabriele.mzt@gmail.com> Reported-by: David Purton <dcpurton@marshwiggle.net> Tested-by: Gabriele Mazzotta <gabriele.mzt@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96645 Fixes: ed4a6a7ca853 ("drm/i915: Add two-stage ILK-style watermark programming (v11)") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170220140443.30891-1-ville.syrjala@linux.intel.com Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20170315143158.31780-1-ville.syrjala@linux.intel.com (cherry picked from commit 5be6e33400992d3450e1c8234a5af353e1560580) Signed-off-by: Jani Nikula <jani.nikula@intel.com> So it looks like some of those registers are now stored in the power context, thus loading them before rc6 is held to wake up GT means the settings are lost. Quick fix: diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 91bc4ab..f1e03e6 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -1244,6 +1244,52 @@ static int init_render_ring(struct intel_engine_cs *engine) if (IS_GEN(dev_priv, 6, 7)) I915_WRITE(INSTPM, _MASKED_BIT_ENABLE(INSTPM_FORCE_ORDERING)); + + if (IS_HASWELL(dev_priv)) { + /* L3 caching of data atomics doesn't work -- disable it. */ + I915_WRITE(HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE); + I915_WRITE(HSW_ROW_CHICKEN3, + _MASKED_BIT_ENABLE(HSW_ROW_CHICKEN3_L3_GLOBAL_ATOMICS_DISABLE)); + + /* This is required by WaCatErrorRejectionIssue:hsw */ + I915_WRITE(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG, + I915_READ(GEN7_SQ_CHICKEN_MBCUNIT_CONFIG) | + GEN7_SQ_CHICKEN_MBCUNIT_SQINTMOB); + + /* WaVSRefCountFullforceMissDisable:hsw */ + I915_WRITE(GEN7_FF_THREAD_MODE, + I915_READ(GEN7_FF_THREAD_MODE) & ~GEN7_FF_VS_REF_CNT_FFME); + + /* WaDisable_RenderCache_OperationalFlush:hsw */ + I915_WRITE(CACHE_MODE_0_GEN7, _MASKED_BIT_DISABLE(RC_OP_FLUSH_ENABLE)); + + /* enable HiZ Raw Stall Optimization */ + I915_WRITE(CACHE_MODE_0_GEN7, + _MASKED_BIT_DISABLE(HIZ_RAW_STALL_OPT_DISABLE)); + + /* WaDisable4x2SubspanOptimization:hsw */ + I915_WRITE(CACHE_MODE_1, + _MASKED_BIT_ENABLE(PIXEL_SUBSPAN_COLLECT_OPT_DISABLE)); + + /* + * BSpec recommends 8x4 when MSAA is used, + * however in practice 16x4 seems fastest. + * + * Note that PS/WM thread counts depend on the WIZ hashing + * disable bit, which we don't touch here, but it's good + * to keep in mind (see 3DSTATE_PS and 3DSTATE_WM). + */ + I915_WRITE(GEN7_GT_MODE, + _MASKED_FIELD(GEN6_WIZ_HASHING_MASK, GEN6_WIZ_HASHING_16x4)); + + /* WaSampleCChickenBitEnable:hsw */ + I915_WRITE(HALF_SLICE_CHICKEN3, + _MASKED_BIT_ENABLE(HSW_SAMPLE_C_PERFORMANCE)); + + /* WaSwitchSolVfFArbitrationPriority:hsw */ + I915_WRITE(GAM_ECOCHK, I915_READ(GAM_ECOCHK) | HSW_ECOCHK_ARB_PRIO_SOL); + } + if (INTEL_INFO(dev_priv)->gen >= 6) I915_WRITE_IMR(engine, ~engine->irq_keep_mask);
commit f72b84c677d61f201b869223a8d6e389c7bb7d3d Author: Ville Syrjälä <ville.syrjala@linux.intel.com> Date: Wed Nov 8 15:35:55 2017 +0200 drm/i915: Move init_clock_gating() back to where it was Apparently setting up a bunch of GT registers before we've properly initialized the rest of the GT hardware leads to these setting being lost. So looks like I broke HSW with commit b7048ea12fbb ("drm/i915: Do .init_clock_gating() earlier to avoid it clobbering watermarks") by doing init_clock_gating() too early. This should actually affect other platforms as well, but apparently not to such a great degree. What I was ultimately after in that commit was to move the ilk_init_lp_watermarks() call earlier. So let's undo the damage and move init_clock_gating() back to where it was, and call ilk_init_lp_watermarks() just before the watermark state readout. This highlights how fragile and messed up our init order really is. I wonder why we even initialize the display before gem. The opposite order would make much more sense to me... v2: Keep WaRsPkgCStateDisplayPMReq:hsw early as it really must be done before all planes might get disabled. Cc: stable@vger.kernel.org Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mark Janes <mark.a.janes@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reported-by: Mark Janes <mark.a.janes@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103549 Fixes: b7048ea12fbb ("drm/i915: Do .init_clock_gating() earlier to avoid it Cc: stable@vger.kernel.org Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mark Janes <mark.a.janes@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reported-by: Mark Janes <mark.a.janes@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103549 Fixes: b7048ea12fbb ("drm/i915: Do .init_clock_gating() earlier to avoid it clobbering watermarks") References: https://lists.freedesktop.org/archives/intel-gfx/2017-November/145432.html Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171108133555.14091-1-ville.syrjala@linux.intel.com Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.