Summary: | [BDW bisected] igt/pm_pc8 subcases cause system hang | |
---|---|---|---
Product: | DRI | Reporter: | Guo Jinxian <jinxianx.guo>
Component: | DRM/Intel | Assignee: | Imre Deak <imre.deak>
Status: | CLOSED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs>
Severity: | critical | |
Priority: | high | CC: | huax.lu, intel-gfx-bugs, wendy.wang
Version: | unspecified | |
Hardware: | Other | |
OS: | All | |
Whiteboard: | | |
i915 platform: | | i915 features: |
We will bisect it later. Thanks.

fc1744ff7ba63cabf858c55217382104e9dd94ed is the first bad commit
commit fc1744ff7ba63cabf858c55217382104e9dd94ed
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Thu Apr 10 09:01:40 2014 +0200

    Revert "drm/i915: fix infinite loop at gen6_update_ring_freq"

    This reverts commit 4b28a1f3ef55a3b0b68dbab1fe6dbaf18e186710.

    This patch duct-tapes over some issue in the current bdw rps patches
    which must wait with enabling rc6/rps until the very first batch has
    been submitted by userspace. But those patches aren't merged yet, and
    for upstream we need to have an in-kernel emission of the very first
    batch.

    I shouldn't have merged this patch so let's revert it again.

    Also Imre noticed that even when rps is set up normally there's a small
    window (due to the 1s delay of the async rps init work) where we could
    runtime suspend already and blow up all over the place. Imre has a
    proper fix to block runtime pm until the rps init work has successfully
    completed.

    Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
    Cc: Imre Deak <imre.deak@intel.com>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>

:040000 040000 c7c7c9b7e4dc136a3ac650c847e65f6913e83ba4 6e22952e0db97d29757b13b0a1dc6a4392d86f95 M drivers

Reverted this commit, the case passed. Thanks.

I think Imre has a patch to prevent runtime pm until the delayed rps work has completed. That should address this.

(In reply to comment #3)
> I think Imre has a patch to prevent runtime pm until the delayed rps work
> has completed. That should address this.

I haven't looked at this bug closely yet, but note that RC6/RPS is not enabled on BDW on current -nightly; I'm not sure whether that is an oversight or on purpose. At least intel_disable_gt_powersave() is broken on BDW, since it'll try to disable RC6/RPS when it wasn't enabled in the first place. I posted a fix for this issue:

http://lists.freedesktop.org/archives/intel-gfx/2014-April/043695.html

As Daniel mentioned, in the same patchset there is also a patch to disable RPM until RC6/RPS is set up, but I'm not sure how that can make a difference on BDW, since we never enabled RC6/RPS there.

(In reply to comment #4)
> (In reply to comment #3)
> > I think Imre has a patch to prevent runtime pm until the delayed rps work
> > has completed. That should address this.
>
> I haven't checked yet this bug closer, but note that RC6/RPS is not enabled
> on BDW on current -nightly, I'm not sure if it's by overlook or on purpose.
> At least intel_disable_gt_powersave() is broken on BDW, since it'll try to
> disable RC6/RPS when it wasn't enabled in the first place. I posted a fix
> for this issue:
>
> http://lists.freedesktop.org/archives/intel-gfx/2014-April/043695.html
>
> As Daniel mentioned, in the same patchset there is also a patch to disable
> RPM until RC6/RPS is setup, but I'm not sure how that can make a difference
> on BDW, since we never enabled RC6/RPS there.

So I think the likely reason for the failure is that we don't enable RC6 but we do enable RPM (which depends on RC6). I suggest we fix this for now by correctly reporting that RC6 is disabled and keeping RPM disabled based on that. I cherry-picked the necessary patches from my VLV RPM branch for this and added a new one that reports the correct RC6 status for BDW. It's an obvious fix as it only keeps RPM disabled on BDW, but I still suggest it until BDW RC6 support is fixed. Daniel, if it's ok I can submit these patches separately from the rest of the VLV RPM stuff.

Please try the following branch:
https://github.com/ideak/linux/commits/bdw-rc6-rpm-fix
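The dependency described in this comment can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the code from the bdw-rc6-rpm-fix branch: the struct `gpu_model` and the functions `model_rc6_enabled`, `model_runtime_pm_allowed` and `model_disable_gt_powersave` are hypothetical names that do not exist in i915.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state discussed in this comment; not i915 code. */
struct gpu_model {
	bool is_broadwell;	/* platform under test */
	bool rc6_requested;	/* what the user or platform default asks for */
	bool rc6_active;	/* whether RC6/RPS was actually set up */
};

/* Report RC6 as disabled on BDW, since it is never enabled there for now. */
static bool model_rc6_enabled(const struct gpu_model *gpu)
{
	assert(gpu);
	if (gpu->is_broadwell)
		return false;
	return gpu->rc6_requested;
}

/* Runtime PM depends on RC6, so keep it off whenever RC6 is reported off. */
static bool model_runtime_pm_allowed(const struct gpu_model *gpu)
{
	return model_rc6_enabled(gpu);
}

/* Only tear down RC6/RPS if it was set up in the first place. */
static bool model_disable_gt_powersave(struct gpu_model *gpu)
{
	assert(gpu);
	if (!gpu->rc6_active)
		return false;	/* nothing to disable */
	gpu->rc6_active = false;
	return true;
}
```

In this model a Broadwell device never gets runtime PM, and the disable path becomes a no-op when RC6/RPS was never set up, which mirrors the two problems raised above.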
>
> Please try the following branch:
> https://github.com/ideak/linux/commits/bdw-rc6-rpm-fix
Applying this patch failed, so I tested the patch below instead:
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 75c1c76..b1b5fd8 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -3262,6 +3262,10 @@ int intel_enable_rc6(const struct drm_device *dev)
 	if (INTEL_INFO(dev)->gen < 5)
 		return 0;
 
+	/* Disable RC6 on Broadwell for now */
+	if (IS_BROADWELL(dev))
+		return 0;
+
 	/* Respect the kernel parameter if it is set */
 	if (i915.enable_rc6 >= 0)
 		return i915.enable_rc6;
The hang still exists.
output:
IGT-Version: 1.6-g78e4c2b (x86_64) (Linux: 3.15.0-rc2_prts_78d88b_20140425 x86_64)
Runtime PM support: 1
PC8 residency support: 1
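To make the order of checks in the tested patch easier to follow, here is a minimal standalone sketch of the early part of intel_enable_rc6() as it appears in the diff context. The function name `sketch_enable_rc6` and the `platform_default` parameter are assumptions made for illustration; the rest of the real function is not shown in this report.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the early checks of intel_enable_rc6().
 * The real function continues with per-platform defaults that are not
 * shown in this report, so they are passed in as platform_default.
 */
static int sketch_enable_rc6(int gen, bool is_broadwell, int enable_rc6_param,
			     int platform_default)
{
	assert(gen > 0);

	if (gen < 5)
		return 0;		/* same early return as in the diff */

	if (is_broadwell)
		return 0;		/* added check: RC6 off on Broadwell */

	if (enable_rc6_param >= 0)
		return enable_rc6_param;	/* respect the module parameter */

	return platform_default;	/* placeholder for the elided remainder */
}
```

Because the Broadwell check comes before the module parameter check, this sketch returns 0 on BDW even when the parameter is set to a positive value.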
(In reply to comment #6)
> > Please try the following branch:
> > https://github.com/ideak/linux/commits/bdw-rc6-rpm-fix
>
> Apply this patch fail, test the patch as below:

Please test the whole branch. You can get it with:
$ git clone -b bdw-rc6-rpm-fix git://github.com/ideak/linux

Also, please apply the following igt patch too; it should just make pm_pc8 skip if runtime PM is disabled:

diff --git a/tests/pm_pc8.c b/tests/pm_pc8.c
index 010af44..9a95326 100644
--- a/tests/pm_pc8.c
+++ b/tests/pm_pc8.c
@@ -769,7 +769,7 @@ static void setup_environment(void)
 	printf("Runtime PM support: %d\n", has_runtime_pm);
 	printf("PC8 residency support: %d\n", has_pc8);
 
-	igt_require(has_runtime_pm || has_pc8);
+	igt_require(has_runtime_pm);
 }
 
 static void teardown_environment(void)
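The effect of the igt change can be checked with a tiny standalone sketch. The helper names `old_requirement_met`, `new_requirement_met` and `check_runtime_pm_off_case` are hypothetical; only the two boolean expressions come from the patch itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Requirement before the igt patch: either feature is enough to run. */
static bool old_requirement_met(bool has_runtime_pm, bool has_pc8)
{
	return has_runtime_pm || has_pc8;
}

/* Requirement after the igt patch: runtime PM support is mandatory. */
static bool new_requirement_met(bool has_runtime_pm, bool has_pc8)
{
	(void)has_pc8;	/* PC8 residency alone no longer qualifies */
	return has_runtime_pm;
}

/* Runtime PM support 0, PC8 residency support 1. */
static void check_runtime_pm_off_case(void)
{
	assert(old_requirement_met(false, true));	/* old check: subtests run */
	assert(!new_requirement_met(false, true));	/* new check: subtests skip */
}
```

With runtime PM reported as 0 and PC8 residency as 1, the old condition would still let the subtests run, while the new one makes them skip.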
>
> Please test the whole branch. You can get it with:
> $ git clone -b bdw-rc6-rpm-fix git://github.com/ideak/linux
>
After 2 hours the download is at 8%; I need more time to test this branch.
# tsocks git clone -b bdw-rc6-rpm-fix git://github.com/ideak/linux
Cloning into 'linux'...
remote: Counting objects: 3549127, done.
remote: Compressing objects: 100% (574405/574405), done.
Receiving objects: 8% (306166/3549127), 117.50 MiB | 22 KiB/s
(In reply to comment #8)
> > Please test the whole branch. You can get it with:
> > $ git clone -b bdw-rc6-rpm-fix git://github.com/ideak/linux
>
> 2 hours, download 8%, need more time to test this branch.
> # tsocks git clone -b bdw-rc6-rpm-fix git://github.com/ideak/linux
> Cloning into 'linux'...
> remote: Counting objects: 3549127, done.
> remote: Compressing objects: 100% (574405/574405), done.
> Receiving objects: 8% (306166/3549127), 117.50 MiB | 22 KiB/s

You can speed it up using a local copy of the kernel as a reference:
$ git clone --reference <path-to-kernel> -b bdw-rc6-rpm-fix git://github.com/ideak/linux

Tested on branch https://github.com/ideak/linux/commits/bdw-rc6-rpm-fix. It works well.
# ./pm_pc8
IGT-Version: 1.6-ga595a40 (x86_64) (Linux: 3.14.0-rc7_prts_dcb99f_20140328 x86_64)
Runtime PM support: 0
PC8 residency support: 1
Test requirement not met in function setup_environment, file pm_pc8.c:772:
Last errno: 5, Input/output error
Test requirement: (!(has_runtime_pm))
Subtest rte: SKIP
Subtest drm-resources-equal: SKIP
Subtest pci-d3-state: SKIP
Subtest modeset-lpsp: SKIP
Subtest modeset-non-lpsp: SKIP
Subtest gem-mmap-cpu: SKIP
Subtest gem-mmap-gtt: SKIP
Subtest gem-pread: SKIP
Subtest gem-execbuf: SKIP
Subtest gem-idle: SKIP
Subtest reg-read-ioctl: SKIP
Subtest i2c: SKIP
Subtest pc8-residency: SKIP
Subtest debugfs-read: SKIP
Subtest debugfs-forcewake-user: SKIP
Subtest sysfs-read: SKIP
Subtest modeset-lpsp-stress: SKIP
Subtest modeset-non-lpsp-stress: SKIP
Subtest modeset-lpsp-stress-no-wait: SKIP
Subtest modeset-non-lpsp-stress-no-wait: SKIP
Subtest modeset-pc8-residency-stress: SKIP
Subtest modeset-stress-extra-wait: SKIP
Subtest gem-execbuf-stress: SKIP
Subtest gem-execbuf-stress-pc8: SKIP
Subtest gem-execbuf-stress-extra-wait: SKIP

The fix is merged to -nightly, closing.

Tested on latest -nightly (30c8c9cd8bc88d6ae70f09d403e725b51e0bd7dd); all results are SKIP. Verified.
./pm_pc8
IGT-Version: 1.6-g4bd9fe6 (x86_64) (Linux: 3.15.0-rc3_drm-intel-nightly_30c8c9_20140507+ x86_64)
Runtime PM support: 0
PC8 residency support: 1
Test requirement not met in function setup_environment, file pm_pc8.c:784:
Last errno: 5, Input/output error
Test requirement: (!(has_runtime_pm))
Subtest rte: SKIP
Subtest drm-resources-equal: SKIP
Subtest pci-d3-state: SKIP
Subtest modeset-lpsp: SKIP
Subtest modeset-non-lpsp: SKIP
Subtest dpms-lpsp: SKIP
Subtest dpms-non-lpsp: SKIP
Subtest gem-mmap-cpu: SKIP
Subtest gem-mmap-gtt: SKIP
Subtest gem-pread: SKIP
Subtest gem-execbuf: SKIP
Subtest gem-idle: SKIP
Subtest reg-read-ioctl: SKIP
Subtest i2c: SKIP
Subtest pc8-residency: SKIP
Subtest debugfs-read: SKIP
Subtest debugfs-forcewake-user: SKIP
Subtest sysfs-read: SKIP
Subtest modeset-lpsp-stress: SKIP
Subtest modeset-non-lpsp-stress: SKIP
Subtest modeset-lpsp-stress-no-wait: SKIP
Subtest modeset-non-lpsp-stress-no-wait: SKIP
Subtest modeset-pc8-residency-stress: SKIP
Subtest modeset-stress-extra-wait: SKIP
Subtest gem-execbuf-stress: SKIP
Subtest gem-execbuf-stress-pc8: SKIP
Subtest gem-execbuf-stress-extra-wait: SKIP

Closing old verified.
Created attachment 97500: dmesg

System Environment:
--------------------------
Platform: BDW
kernel: (drm-intel-nightly)

Bug detailed description:
----------------------------
igt/pm_pc8 subcases (like debugfs-read, gem-execbuf) cause a system hang on latest -nightly (1e771b84e47085ef9b6efea1321e7cb5a8b2c065).
It's a regression.

output:
IGT-Version: 1.6-g43c2ed7 (x86_64) (Linux: 3.14.0_drm-intel-nightly_1e771b_20140417+ x86_64)
Runtime PM support: 1
PC8 residency support: 1

Reproduce steps:
----------------------------
1. ./pm_pc8 --run-subtest debugfs-read