while true; do
    echo 0 > /sys/class/rtc/rtc0/wakealarm
    echo `date '+%s' -d '+ 3 seconds'` > /sys/class/rtc/rtc0/wakealarm
    dmesg|grep "SLP" |tail -1
    # sleep 1 #with and without commenting this line
    echo freeze >/sys/power/state
done

We see suspend-to-idle entry failing with the warning message below during a suspend stress test. The script above is the step to reproduce the issue. The failure is sporadic, with an occurrence rate of 20-30%.

2018-02-23T03:12:24.347693-08:00 DEBUG kernel: [ 479.756025] PM: suspend-to-idle
2018-02-23T03:12:24.347710-08:00 WARNING kernel: [ 480.756211] CPU did not enter SLP S0 for suspend-to-idle.
2018-02-23T03:12:24.348021-08:00 DEBUG kernel: [ 480.756576] PM: resume from suspend-to-idle
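To put a number on the failure rate, the kernel log from a stress run can be summarized after the fact. The following is only a minimal sketch: `count_s2idle_failures` is a hypothetical helper name, and it assumes the log contains the "PM: suspend-to-idle" entry line and the "CPU did not enter SLP S0" warning exactly as quoted in this report.

```shell
# Hypothetical helper: summarize suspend-to-idle attempts from kernel log
# text supplied on stdin. The match strings are taken from the log lines
# quoted in this report and may differ on other kernels.
count_s2idle_failures() {
    log=$(cat)
    # "PM: suspend-to-idle" marks an entry attempt; the resume line reads
    # "PM: resume from suspend-to-idle" and does not match this string.
    attempts=$(printf '%s\n' "$log" | grep -cF 'PM: suspend-to-idle' || true)
    # The warning is logged when SLP S0 was not reached.
    failures=$(printf '%s\n' "$log" | grep -cF 'CPU did not enter SLP S0' || true)
    echo "attempts=$attempts failures=$failures"
}
```

Usage would be something like `dmesg | count_s2idle_failures` once the loop has run for a while.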
To reproduce on Ubuntu, use the script below together with the attached patch, which reports the RC6 and DC6 counter status.

while true; do
    echo 0 > /sys/class/rtc/rtc0/wakealarm
    echo `date '+%s' -d '+ 3 seconds'` > /sys/class/rtc/rtc0/wakealarm
    dmesg|grep "DC6" |tail -1
    # sleep 1 #with and without commenting this line
    echo freeze >/sys/power/state
done

In the failure case, neither the DC6 nor the RC6 residency counter increments. Example:

2018-02-23T03:12:24.348124-08:00 INFO kernel: [ 480.779917] i915 0000:00:02.0: Abhijeet: PM residency counters DC5=(0001bc43->0001bc45) DC6=(0001bbbb->0001bbbb) RC6=(15b1c57c->15b1c57c)

In the stress test above, RC6 is getting disabled, which leads to the S0ix failure. With the changes below, the system is able to enter RC6. Our analysis is that resume was called, so RC6 was disabled, and the system then tried to enter suspend again, at which point RC6 had not been enabled from i915_gem_do_execbuffer.

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 5fdd2414ca31..cebf0fb67f81 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1843,6 +1843,7 @@ static int i915_pm_suspend(struct device *kdev)
 {
 	struct pci_dev *pdev = to_pci_dev(kdev);
 	struct drm_device *dev = pci_get_drvdata(pdev);
+	struct drm_i915_private *dev_priv = to_i915(dev);
 
 	if (!dev) {
 		dev_err(kdev, "DRM not initialized, aborting suspend.\n");
@@ -1852,13 +1853,28 @@ static int i915_pm_suspend(struct device *kdev)
 	if (dev->switch_power_state == DRM_SWITCH_POWER_OFF)
 		return 0;
 
+	printk(KERN_ERR "Abhijeet RC6 state = 0x%08x\n", (I915_READ(GEN6_RC_CONTROL) & GEN6_RC_CTL_HW_ENABLE));
+	intel_enable_gt_powersave(dev_priv);
+
+
 	return i915_drm_suspend(dev);
 }
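For anyone scripting the stress test, the residency check described above can be automated per iteration. This is a rough sketch, not part of the attached patch: `check_residency` is a hypothetical helper, and it assumes the `NAME=(before->after)` hex format printed by the debug patch in this comment, which is not an upstream log format.

```shell
# Hypothetical helper: given one "PM residency counters" log line, report
# whether the DC6 and RC6 counters moved across the suspend attempt.
# Assumes the NAME=(before->after) format from the debug patch above.
check_residency() {
    line=$1
    for name in DC6 RC6; do
        # Extract the before/after hex values for this counter.
        pair=$(printf '%s\n' "$line" |
            sed -n "s/.*$name=(\([0-9a-fA-F]*\)->\([0-9a-fA-F]*\)).*/\1 \2/p")
        if [ -z "$pair" ]; then
            echo "$name missing"
            continue
        fi
        before=${pair% *}
        after=${pair#* }
        # An unchanged counter means no residency was accumulated.
        if [ "$before" = "$after" ]; then
            echo "$name stalled"
        else
            echo "$name advanced"
        fi
    done
}
```

Inside the loop it could be called as `check_residency "$(dmesg | grep "DC6" | tail -1)"`; an iteration where both counters report "stalled" corresponds to the failure case in the example line.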
The explanation does not match the current code base. Please test against upstream.
First of all, sorry about the spam. This is a mass update for our bugs. Sorry if you find this annoying, but we are trying to understand whether this bug is still valid. If the bug investigation is still in progress, please ignore this, and I apologize! If you think the bug is no longer valid, please comment on the bug so that it can be closed. If you haven't tested with our latest pre-upstream tree (drm-tip), please do that as well to see whether the issue is still valid there; if you cannot see the issue there, please comment on the bug.
Just for the record, a newer version of the patch in comment #1 has been merged there:

https://chromium-review.googlesource.com/q/Ia399473bc20773c0bc3
https://chromium.googlesource.com/chromiumos/third_party/kernel/+log/chromeos-4.4/drivers/gpu/drm/i915/i915_drv.c

CHROMIUM: drm/i915: Configure GPU PM in ->suspend() if not configured

The patch implements workaround for a scenario, where the GPU Power Management, if not configured prior to platform suspend entry, will block SoC S0ix entry in suspend-to-idle...
The current codebase has changed. There is no need to load the context to enable rc6 :)

[ 153.475078] Abhijeet i'm enabling rc6
[ 153.475085] Abhijeet addr = 0000A090 value= 00000000
[ 153.475094] Abhijeet addr = 0000A090 value= 88040000
[ 153.475100] CPU: 2 PID: 3057 Comm: kworker/u8:67 Tainted: G W 4.16.0-rc5-31709-g0e5bad01e4a9-dirty #17
[ 153.475103] Hardware name: HP Soraka/Soraka, BIOS Google_Soraka.10086.0.0 10/30/2017
[ 153.475110] Workqueue: events_unbound async_run_entry_fn
[ 153.475114] Call Trace:
[ 153.475122] dump_stack+0x4f/0x81
[ 153.475127] intel_enable_gt_powersave+0x1057/0x1941
[ 153.475133] ? __pm_runtime_resume+0x5f/0x8a
[ 153.475138] i915_request_alloc+0xc8/0x40e
[ 153.475143] i915_gem_switch_to_kernel_context+0xbf/0x144
[ 153.475150] i915_gem_resume+0x70/0xc9
[ 153.475155] ? pci_pm_suspend+0x1ac/0x1ac
[ 153.475160] i915_drm_resume+0x75/0x10f
[ 153.475165] ? pci_pm_suspend+0x1ac/0x1ac
[ 153.475168] i915_pm_resume+0x1e/0x22
[ 153.475172] dpm_run_callback+0x45/0x80
[ 153.475177] device_resume+0x1f1/0x25c
[ 153.475181] async_resume+0x1c/0x42
[ 153.475187] async_run_entry_fn+0x3f/0xd2
[ 153.475193] process_one_work+0x18d/0x2de
[ 153.475194] call usb2+ returned 0 after 218 usecs
[ 153.475202] worker_thread+0x194/0x329
[ 153.475208] ? worker_clr_flags+0x52/0x52
[ 153.475213] kthread+0xf1/0x101
[ 153.475215] call 00:02+ returned 0 after 220 usecs
[ 153.475218] calling 00:03+ @ 2742, parent: pnp0
[ 153.475220] call 00:03+ returned 0 after 0 usecs
[ 153.475223] calling 00:04+ @ 2742, parent: pnp0
[ 153.475225] calling 2-2+ @ 3020, parent: usb2
[ 153.475230] ? worker_clr_flags+0x52/0x52
[ 153.475235] call 00:04+ returned 0 after 0 usecs
[ 153.475237] ? rcu_read_unlock_sched_notrace+0x4d/0x4d
[ 153.475242] ret_from_fork+0x35/0x40

Closing the bug since it is no longer applicable.
Thanks for the feedback.