Summary: | [HSW Bisected] Fail to resume from S4, causing system hang (once out of 5 test circles) | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | DRI | Reporter: | yaoming <ming.yao> | ||||||
Component: | DRM/Intel | Assignee: | Daniel Vetter <daniel> | ||||||
Status: | CLOSED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> | ||||||
Severity: | major | ||||||||
Priority: | highest | CC: | intel-gfx-bugs, ming.yao, przanoni | ||||||
Version: | unspecified | ||||||||
Hardware: | x86-64 (AMD64) | ||||||||
OS: | All | ||||||||
See Also: | https://bugs.freedesktop.org/show_bug.cgi?id=65496 | ||||||||
Whiteboard: | |||||||||
i915 platform: | i915 features: | ||||||||
Attachments: |
|
Description
yaoming
2014-08-08 08:34:50 UTC
Created attachment 104271 [details]
Dmesg from netconsole
*** Bug 82339 has been marked as a duplicate of this bug. *** Try "echo +60 > /sys/class/rtc/rtc0/wakealarm" instead (In reply to comment #3) > Try "echo +60 > /sys/class/rtc/rtc0/wakealarm" instead It also fails to resume from S4, causing system hang. Duplicate of bug #65496? ==Bisect results== ---------------------------- Bisect shows: 8ee661b505613ef2747b350ca2871a31b3781bee is the first bad commit commit 8ee661b505613ef2747b350ca2871a31b3781bee Author: Daniel Vetter <daniel.vetter@ffwll.ch> AuthorDate: Wed Mar 26 20:10:09 2014 +0100 Commit: Dave Airlie <airlied@redhat.com> CommitDate: Fri Mar 28 12:33:50 2014 +1000 drm/i915: Undo gtt scratch pte unmapping again It apparently blows up on some machines. This functionally reverts commit 828c79087cec61eaf4c76bb32c222fbe35ac3930 Author: Ben Widawsky <benjamin.widawsky@intel.com> Date: Wed Oct 16 09:21:30 2013 -0700 drm/i915: Disable GGTT PTEs on GEN6+ suspend Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64841 Reported-and-Tested-by: Brad Jackson <bjackson0971@gmail.com> Cc: stable@vger.kernel.org Cc: Takashi Iwai <tiwai@suse.de> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Todd Previte <tprevite@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com> Perhaps diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index f3fd448505f1..968e21300849 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -1316,6 +1316,16 @@ void i915_check_and_clear_faults(struct drm_device *dev) POSTING_READ(RING_FAULT_REG(&dev_priv->ring[RCS])); } +static void i915_ggtt_flush(struct drm_device *dev) +{ + if (INTEL_INFO(dev)->gen < 6) { + intel_gtt_chipset_flush(); + } else { + I915_WRITE(GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN); + POSTING_READ(GFX_FLSH_CNTL_GEN6); + } +} + void i915_gem_suspend_gtt_mappings(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev->dev_private; @@ -1332,6 +1342,8 @@ void i915_gem_suspend_gtt_mappings(struct drm_device *dev) dev_priv->gtt.base.start, dev_priv->gtt.base.total, true); + + i915_ggtt_flush(dev); } void i915_gem_restore_gtt_mappings(struct drm_device *dev) @@ -1384,7 +1396,7 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev) gen6_write_pdes(container_of(vm, struct i915_hw_ppgtt, base)); } - i915_gem_chipset_flush(dev); + i915_ggtt_flush(dev); } int i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj) (In reply to comment #7) > Perhaps > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c > b/drivers/gpu/drm/i915/i915_gem_gtt.c > index f3fd448505f1..968e21300849 100644 > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > @@ -1316,6 +1316,16 @@ void i915_check_and_clear_faults(struct drm_device > *dev) > POSTING_READ(RING_FAULT_REG(&dev_priv->ring[RCS])); > } > > +static void i915_ggtt_flush(struct drm_device *dev) > +{ > + if (INTEL_INFO(dev)->gen < 6) { > + intel_gtt_chipset_flush(); > + } else { > + I915_WRITE(GFX_FLSH_CNTL_GEN6, GFX_FLSH_CNTL_EN); > + POSTING_READ(GFX_FLSH_CNTL_GEN6); > + } > +} > + > void i915_gem_suspend_gtt_mappings(struct drm_device *dev) > { > struct drm_i915_private *dev_priv = dev->dev_private; > @@ -1332,6 +1342,8 @@ void i915_gem_suspend_gtt_mappings(struct drm_device > *dev) > dev_priv->gtt.base.start, > dev_priv->gtt.base.total, > true); > + > + i915_ggtt_flush(dev); > } > > void i915_gem_restore_gtt_mappings(struct drm_device *dev) > @@ -1384,7 +1396,7 @@ void i915_gem_restore_gtt_mappings(struct drm_device > *dev) > gen6_write_pdes(container_of(vm, struct i915_hw_ppgtt, > base)); > } > > - i915_gem_chipset_flush(dev); > + i915_ggtt_flush(dev); > } > > int i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj) Add patch failed, so I modified i915_gem_gtt.c manually. But also failed to make when build kernel with kcloud. Try http://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=requests&id=30e95d38220df18fafb1918027714ef6ecb8d259 instead then. (In reply to comment #9) > Try > http://cgit.freedesktop.org/~ickle/linux-2.6/commit/ > ?h=requests&id=30e95d38220df18fafb1918027714ef6ecb8d259 instead then. With this patch, no system hang appears. But each time I test S4(20 circles), it would be interrupted on circle 2 or circle 3, just like reboot. It is the same result on the latest drm-intel-nightly(43df30da20447e2856b2761215ff274886a9f931). My reproduce step: for ((i=0; i<20; i++)) do echo 0 > /sys/class/rtc/rtc0/wakealarm echo +45 > /sys/class/rtc/rtc0/wakealarm echo disk > /sys/power/state done Created attachment 106305 [details]
new dmesg from netconsole
Could you try the following branch: https://github.com/ideak/linux/commits/suspend-fix Also could you get a serial console log when the hang happens (preferably from an RS232 port, or if that's not available via a USB->serial adapter)? See Documentation/serial-console.txt to set things up. Also make sure you have enabled the following kconfig options: CONFIG_LOCKUP_DETECTOR=y CONFIG_HARDLOCKUP_DETECTOR=y CONFIG_DETECT_HUNG_TASK=y and pass the 'debug', 'no_console_suspend' kernel parameters. (In reply to comment #12) > Could you try the following branch: > https://github.com/ideak/linux/commits/suspend-fix > > Also could you get a serial console log when the hang happens (preferably > from an RS232 port, or if that's not available via a USB->serial adapter)? > See Documentation/serial-console.txt to set things up. > > Also make sure you have enabled the following kconfig options: > CONFIG_LOCKUP_DETECTOR=y > CONFIG_HARDLOCKUP_DETECTOR=y > CONFIG_DETECT_HUNG_TASK=y > > and pass the 'debug', 'no_console_suspend' kernel parameters. I tested the branch you gave, still caused system hang. I am sorry I can't get log from a serial console, because the test machine has no RS232 port available. At this point, it would also be good to double check that the baseline kernel (i.e. without i915.ko loaded) is still reliable. (In reply to comment #14) > At this point, it would also be good to double check that the baseline > kernel (i.e. without i915.ko loaded) is still reliable. I test on the latest drm-intel-nightly(3e5e5913a80f593aa8f67784220eac2658034c6c) without i915.ko loaded, the result is exactly the same as Comment 10. (In reply to comment #15) > (In reply to comment #14) > > At this point, it would also be good to double check that the baseline > > kernel (i.e. without i915.ko loaded) is still reliable. > > I test on the latest > drm-intel-nightly(3e5e5913a80f593aa8f67784220eac2658034c6c) without i915.ko > loaded, the result is exactly the same as Comment 10. Ok, so it seems that the patch is good enough to prevent i915.ko from being the worst offender on the system. (In reply to comment #16) > Ok, so it seems that the patch is good enough to prevent i915.ko from being > the worst offender on the system. Pushed to drm-intel-fixes as commit f36c680a2774f459c1a290e9c2e84bed66a80a78 Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Thu Sep 25 10:13:12 2014 +0100 drm/i915: Flush the PTEs after updating them before suspend Please verify. (In reply to comment #17) > (In reply to comment #16) > > Ok, so it seems that the patch is good enough to prevent i915.ko from being > > the worst offender on the system. > > Pushed to drm-intel-fixes as > > commit f36c680a2774f459c1a290e9c2e84bed66a80a78 > Author: Chris Wilson <chris@chris-wilson.co.uk> > Date: Thu Sep 25 10:13:12 2014 +0100 > > drm/i915: Flush the PTEs after updating them before suspend > > Please verify. Verified. Without i915.ko on this commit, the result is exactly the same as Comment 10. Closing verified+fixed. |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.