When running a list of all kms tests I am getting the following dmesg:

"[drm:bdw_set_cdclk [i915]] *ERROR* Switching to FCLK failed"

The failure rate is about 1/5 runs and the affected tests so far are:
igt@kms_cursor_crc@cursor-256x256-suspend
igt@kms_cursor_crc@cursor-128x128-suspend
igt@kms_flip@vblank-vs-suspend
igt@kms_flip@vblank-vs-dpms-suspend-interruptible
igt@kms_flip@flip-vs-dpms-interruptible

I have not seen this on drm-tip 4.12-0.rc7+, but I now get it starting from ~4.13-0-rc3+.
Now also reproduced with drm-tip@5118c7fa9 igt@eda8cc9f, in 1/5 runs of the kms-all testlist.
Also reproduced 170818 on:
igt@kms_flip@vblank-vs-dpms-suspend
igt@kms_flipv@flip-vs-modeset-interruptible
in 1/25 runs.
I have also gotten this once: [drm:gen8_irq_handler [i915]] *ERROR* The master control interrupt lied (DE PIPE)!
Ran 10 times with an increased timeout:

+++ b/drivers/gpu/drm/i915/intel_cdclk.c
@@ -673,7 +673,7 @@ static void bdw_set_cdclk(struct drm_i915_private *dev_priv,
 	I915_WRITE(LCPLL_CTL, val);
 	if (wait_for_us(I915_READ(LCPLL_CTL) &
-			LCPLL_CD_SOURCE_FCLK_DONE, 1))
+			LCPLL_CD_SOURCE_FCLK_DONE, 1000))
 		DRM_ERROR("Switching to FCLK failed\n");

With this change the issue could not be reproduced.
Created attachment 133739 [details] [review] patch used for printing time taken by waiting

I have done some runs while spinning a list of all igt@kms_cursor_crc subtests, printing the time spent in "wait_for_us". The attached patch was used for the printing. I increased the timeout to 10us and hit the issue (i.e. waited longer than 1us) 28 times during 1247 invocations of bdw_set_cdclk.
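For readers without access to the attachment, the kind of measurement described above can be sketched in userspace C. This is not the attached patch and not the kernel's wait_for_us; fake_reg, fake_reg_done, now_ns and poll_timed are hypothetical names, and the register is simulated by a counter that reports "done" only after a fixed number of reads. The sketch polls a condition against a time budget and reports how long the wait took.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for a status register: the "done" bit reads back
 * as set only after reads_until_done earlier reads have returned "not done". */
struct fake_reg {
	unsigned int reads;
	unsigned int reads_until_done;
};

static bool fake_reg_done(void *ctx)
{
	struct fake_reg *reg = ctx;

	return ++reg->reads > reg->reads_until_done;
}

/* Monotonic clock in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Busy-poll cond(ctx) until it returns true or timeout_ns has elapsed.
 * The condition is always checked at least once. Returns 0 on success and
 * -1 on timeout; in both cases *elapsed_ns receives the time spent waiting.
 */
static int poll_timed(bool (*cond)(void *), void *ctx,
		      uint64_t timeout_ns, uint64_t *elapsed_ns)
{
	uint64_t start = now_ns();

	assert(cond != NULL && elapsed_ns != NULL);

	for (;;) {
		bool done = cond(ctx);
		uint64_t waited = now_ns() - start;

		*elapsed_ns = waited;
		if (done)
			return 0;
		if (waited >= timeout_ns)
			return -1;
	}
}
```

Calling poll_timed with a larger budget and logging *elapsed_ns on every call shows how often a wait would have exceeded a smaller budget, which is the comparison made in this comment between the 10us timeout and the original 1us one.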
Created attachment 133741 [details] dmesg snippet with logs where timeout is less than 1us
Created attachment 133742 [details] dmesg snippet with logs where timeout is more than 1us
Created attachment 133743 [details] times waiting for fclk update to stick
Created attachment 133744 [details] times waiting for fclk update to stick with more debug prints
I have been able to reproduce the issue with more logs, but the reproduction rate appears significantly lower: 2/306 runs.
I have seen no difference between the dmesg snippets where we waited < 1 us and those where we waited more than 1 us.

The average waited time without logs: 263 ns
The average waited time with logs: 263 ns
(In reply to Marta Löfstedt from comment #10)
> I have been able to reproduce the issue with more logs but the reproduction
> rate appear significantly lower 2/306 runs.
> I seen no difference in the dmesg snippets where we have waited < 1 us or
> above 1 us.
>
> The average waited time without logs: 263 ns
> The average waited time with logs: 263 ns

Sorry, bad copy-paste:
The average waited time with logs: 259 ns
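A minimal sketch of how such figures can be derived from a list of logged wait times: the mean wait and the number of waits that exceeded a given budget. The names wait_stats and summarize_waits are hypothetical, and the sample values in the usage note are made up for illustration; they are not taken from the attached logs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Summary of a set of wait-time samples. */
struct wait_stats {
	size_t count;        /* number of samples */
	size_t over_budget;  /* samples strictly above the budget */
	double mean_ns;      /* arithmetic mean, 0.0 when there are no samples */
};

/* Compute the mean wait and count the waits longer than budget_ns. */
static struct wait_stats summarize_waits(const uint64_t *samples_ns, size_t n,
					 uint64_t budget_ns)
{
	struct wait_stats stats = { n, 0, 0.0 };
	uint64_t total = 0;
	size_t i;

	assert(samples_ns != NULL || n == 0);

	for (i = 0; i < n; i++) {
		total += samples_ns[i];
		if (samples_ns[i] > budget_ns)
			stats.over_budget++;
	}
	if (n > 0)
		stats.mean_ns = (double)total / (double)n;
	return stats;
}
```

For example, the made-up samples 200, 300, 1500 and 250 ns with a 1000 ns budget give a mean of 562.5 ns with one wait over budget. A mean well below the budget, as reported in this thread, does not rule out occasional individual waits above it.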
Created attachment 133746 [details] [review] suggestion for fix Added patch for a suggested fix, but I need to research a bit more before sending to list.
Reproduced in 1/10 runs with the same list as above, with:

+++ b/drivers/gpu/drm/i915/intel_cdclk.c
@@ -665,6 +665,7 @@ static void bdw_set_cdclk(struct drm_i915_private *dev_priv,
 		return;
 	}
+	pm_qos_update_request(&dev_priv->pm_qos, 0);
 	val = I915_READ(LCPLL_CTL);
 	val |= LCPLL_CD_SOURCE_FCLK;
 	I915_WRITE(LCPLL_CTL, val);
@@ -672,7 +673,8 @@ static void bdw_set_cdclk(struct drm_i915_private *dev_priv,
 	if (wait_for_us(I915_READ(LCPLL_CTL) &
 			LCPLL_CD_SOURCE_FCLK_DONE, 1))
 		DRM_ERROR("Switching to FCLK failed\n");
-
+	pm_qos_update_request(&dev_priv->pm_qos, PM_QOS_DEFAULT_VALUE);
Tested this, which doesn't work either:

+++ b/drivers/gpu/drm/i915/intel_cdclk.c
@@ -669,10 +669,13 @@ static void bdw_set_cdclk(struct drm_i915_private *dev_priv,
 	val |= LCPLL_CD_SOURCE_FCLK;
 	I915_WRITE(LCPLL_CTL, val);
+	preempt_disable();
 	if (wait_for_us(I915_READ(LCPLL_CTL) &
 			LCPLL_CD_SOURCE_FCLK_DONE, 1))
 		DRM_ERROR("Switching to FCLK failed\n");
+	preempt_enable();
Sent the workaround with increased poll time: https://patchwork.freedesktop.org/patch/175688/
Although I believe 10us would be enough, after discussing with Imre I decided to go for 100us.
My workaround has been merged.

commit 3164888a40469c102b5d6d1b756c7646e7eb19e7
Author: Marta Lofstedt <marta.lofstedt@intel.com>
Date:   Fri Sep 8 16:28:29 2017 +0300

    drm/i915: Increase poll time for BDW FCLK_DONE
This is now tested OK.