Bug 83914 - [HSW]: power usage rises a lot after resume from suspend to ram
Summary: [HSW]: power usage rises a lot after resume from suspend to ram
Status: CLOSED INVALID
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: DRI git
Hardware: x86-64 (AMD64) All
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-09-16 06:34 UTC by Arkadiusz Miskiewicz
Modified: 2017-07-24 22:51 UTC

See Also:
i915 platform:
i915 features:


Attachments
dmesg (102.38 KB, text/plain)
2014-09-16 06:34 UTC, Arkadiusz Miskiewicz

Description Arkadiusz Miskiewicz 2014-09-16 06:34:07 UTC
Created attachment 106351
dmesg

My Dell XPS 15 (9530 model, late 2013) has this CPU:

vendor_id       : GenuineIntel
cpu family      : 6
model           : 60
model name      : Intel(R) Core(TM) i7-4702HQ CPU @ 2.20GHz
stepping        : 3
microcode       : 0x1a
cpu MHz         : 2899.960
cache size      : 6144 KB

After resume from suspend to RAM, power usage rises a lot, by about 7-10W (from 26-30W to 36-40W).

It happens with minimal userspace, so I don't blame userspace. powertop estimates that 17W is used by "CPU core" (I assume intel gpu hides under this, too). "perf top" doesn't show any obvious suspects.

When the problem happens, the register dump changes as follows:
--- power-ok	2014-09-15 22:30:20.615917605 +0200
+++ power-bad	2014-09-15 22:31:43.383356688 +0200
@@ -130,12 +130,12 @@
                  TRANS_VSYNC_A: 0x00000000 (1 start, 1 end)
             TRANS_VSYNCSHIFT_A: 0x00000000
                     TRANSACONF: 0x00000000 (disable, inactive, progressive)
-                  FDI_RXA_MISC: 0x0a200090 (FDI Delay 144)
+                  FDI_RXA_MISC: 0x00200080 (FDI Delay 128)
                FDI_RXA_TUSIZE1: 0x7e000000
                    FDI_RXA_IIR: 0x00000000
                    FDI_RXA_IMR: 0x00000fff
               BLC_PWM_CPU_CTL2: 0xe0000000 (enable 1, pipe EDP, blinking 0, granularity 128)
-               BLC_PWM_CPU_CTL: 0x149906e9 (cycle 1769, freq 5273)
+               BLC_PWM_CPU_CTL: 0x00000cc7 (cycle 3271, freq 0)
              BLC_PWM2_CPU_CTL2: 0x60000000 (enable 0, pipe EDP, blinking 0, granularity 128)
               BLC_PWM2_CPU_CTL: 0x00000000 (cycle 0, freq 0)
                   BLC_MISC_CTL: 0x00000000 (PWM1-PCH PWM2-CPU)
@@ -149,7 +149,7 @@
                 PCH_PP_DIVISOR: 0x00186906
                    PIXCLK_GATE: 0x00000000
                         SDEISR: 0x00800000 (port d:1, port c:0, port b:0, crt:0)
-            RC6_RESIDENCY_TIME: 0x771f52d8
+            RC6_RESIDENCY_TIME: 0x02ee80e1
                  FENCE START 0: 0x02cc0001
                    FENCE END 0: 0x042b8063
                  FENCE START 1: 0x00000000
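
The decoded BLC_PWM_CPU_CTL fields in the diff above can be checked by hand. A minimal sketch, assuming (as the decoded values in the dump suggest) that the upper 16 bits hold the frequency field and the lower 16 bits the duty cycle; the function name decode_blc_pwm is made up for illustration:

```shell
# Split a 32-bit BLC_PWM_CPU_CTL value into its two 16-bit fields.
# Assumption: freq = bits 31..16, cycle = bits 15..0 (inferred from the dump).
decode_blc_pwm() {
    value=$(( $1 ))
    freq=$(( (value >> 16) & 0xffff ))
    cycle=$(( value & 0xffff ))
    printf 'cycle %d, freq %d\n' "$cycle" "$freq"
}

decode_blc_pwm 0x149906e9   # power-ok:  cycle 1769, freq 5273
decode_blc_pwm 0x00000cc7   # power-bad: cycle 3271, freq 0
```

In the power-bad dump the frequency field reads 0, whereas it was 5273 in the power-ok dump.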


This happens for me under kernels 3.17.0-rc5 and 3.14.18. I am going to test other versions soon.

Note that there was a similar problem for SNB, but I am not sure whether the two are related:
Bug 54089 - [SNB regression] Power consumption goes postal after resume
Comment 1 Chris Wilson 2014-09-16 06:44:01 UTC
(In reply to comment #0) 
> It happens with minimal userspace, so I don't blame userspace. powertop
> estimates that 17W is used by "CPU core" (I assume intel gpu hides under
> this, too). "perf top" doesn't show any obvious suspects.

The GPU itself is measured separately, but that only really measures the EU power, not the display or the influence of the GPU on the L3 cache ring. (I would expect the ring power to be tallied under the CPU, so it is entirely possible the GPU is causing it to run at maximum frequency and we end up blaming the CPU.)

Best hope is that this is an easily identifiable kernel regression. *fingers crossed*
Comment 2 Arkadiusz Miskiewicz 2014-09-17 10:33:09 UTC
Looks like the GPU is innocent.

This machine has 8 logical CPUs (4 physical cores plus 4 HT siblings). I was booting with maxcpus=4 (due to Dell XPS 15 bugs like coil whine).

My current theory is that after resume from RAM these 4 HT cores are eating power. Why? Because without the maxcpus option everything is fine and the issue does not occur at all.

It could be called a kernel bug that these additional cores are not quieted again after resume, but well...
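
To see which CPUs the maxcpus option left offline, the per-CPU sysfs "online" files can be read. A minimal sketch; the function name list_offline_cpus and its optional directory argument are assumptions added so it can be tried against a fake tree:

```shell
# Print the name of every CPU whose sysfs "online" file contains 0.
# $1: CPU sysfs directory (optional, defaults to the real one).
list_offline_cpus() {
    root=${1:-/sys/devices/system/cpu}
    for f in "$root"/cpu[0-9]*/online; do
        [ -r "$f" ] || continue          # skip CPUs without an online file
        if [ "$(cat "$f")" = "0" ]; then
            basename "$(dirname "$f")"   # e.g. cpu4
        fi
    done
}
```

Calling list_offline_cpus with no argument reads the live system. Note that cpu0 often has no "online" file, so it is skipped.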
Comment 3 Arkadiusz Miskiewicz 2014-09-17 11:23:46 UTC
... and a workaround that returns power usage to normal when using maxcpus=X:

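# Run as root: briefly online, then offline again, every CPU that is currently offline.
for cpu in $(grep -l 0 /sys/devices/system/cpu/cpu*/online); do
    echo 1 > "$cpu"   # bring the CPU online
    sleep 1
    echo 0 > "$cpu"   # take it offline again
done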
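
The loop above has to be rerun after every resume. A hedged sketch of automating that, assuming a systemd-based system where executables in /usr/lib/systemd/system-sleep/ are called with "pre" or "post" as the first argument; the function names and the optional third argument (a sysfs directory override for dry runs) are made up for illustration:

```shell
#!/bin/sh
# Toggle every offline CPU online and back offline once (needs root on real sysfs).
# $1: CPU sysfs directory (optional, defaults to the real one).
cycle_offline_cpus() {
    root=${1:-/sys/devices/system/cpu}
    for f in "$root"/cpu[0-9]*/online; do
        [ -w "$f" ] || continue
        [ "$(cat "$f")" = "0" ] || continue
        echo 1 > "$f"
        sleep 1
        echo 0 > "$f"
        echo "cycled $(basename "$(dirname "$f")")"
    done
}

# Act only after resume ("post"); do nothing before suspend ("pre").
on_sleep_event() {
    if [ "${1:-}" = "post" ]; then
        cycle_offline_cpus "${3:-}"
    fi
}

on_sleep_event "$@"
```

This is untested against the hardware in this report; it only repeats the manual workaround at resume time.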
Comment 4 Jani Nikula 2014-09-17 11:59:25 UTC
Arkadiusz, thanks for following up with this. I'd like to ask you to do one more thing, though: please file a bug on https://bugzilla.kernel.org/ or report it to the linux-kernel mailing list. Thanks.
Comment 5 Arkadiusz Miskiewicz 2014-09-17 12:26:49 UTC
https://bugzilla.kernel.org/show_bug.cgi?id=84741

