Bug 105712 - intel-gpu-overlay is showing insane power consumption amounts
Summary: intel-gpu-overlay is showing insane power consumption amounts
Alias: None
Product: DRI
Classification: Unclassified
Component: IGT
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Default DRI bug account
QA Contact:
Depends on:
Reported: 2018-03-23 14:25 UTC by leozinho29_eu
Modified: 2018-10-11 19:52 UTC
CC List: 0 users

See Also:
i915 platform:
i915 features:

Attachment: Screenshot showing intel-gpu-overlay in Xnest (28.12 KB, image/png)
2018-03-23 14:25 UTC, leozinho29_eu
Attachment: Use setlocale("C") around strtod (1.42 KB, patch)
2018-03-27 14:33 UTC, Chris Wilson

Description leozinho29_eu 2018-03-23 14:25:21 UTC
Created attachment 138312 [details]
Screenshot showing intel-gpu-overlay in Xnest

The program intel-gpu-overlay from intel-gpu-tools is showing implausibly high power consumption values, as high as 60 GW (60000000000000 mW). Comparing against versions that previously behaved correctly (intel-gpu-tools 1.18 shows sensible values), it looks as if digits that belong after the decimal separator are being folded into the integer part, so that 60000000000000 mW should really be 6000,0000000000 mW, i.e. 6 W (which is roughly the maximum I have seen so far).

This is happening both with intel-gpu-tools 1.22 and with the latest git (c2ee9077).

-- chipset: Intel Core i3-6100U
-- system architecture: 64-bit
-- xf86-video-intel: 2:2.99.917+git20171229-1
-- xserver: 2:1.19.6-1ubuntu3
-- mesa: 18.1.0-devel (git-903e9952fb)
-- libdrm: 2.4.91-2
-- kernel: 4.16.0-rc6
-- Linux distribution: Xubuntu 18.04 (Development branch)
-- Machine or mobo model: Lenovo Ideapad 310-14ISK 80UG
-- Display connector: eDP and VGA
Comment 1 Chris Wilson 2018-03-23 14:31:53 UTC
Please try drm-tip as the interface intel-gpu-overlay uses has finally been upstreamed, hopefully it's just that.
Comment 2 leozinho29_eu 2018-03-23 18:39:30 UTC
I suppose drm-tip is this: https://cgit.freedesktop.org/drm-tip

I have tried to use that kernel but intel-gpu-overlay is still showing the super high values. Both older versions of intel-gpu-overlay and turbostat show sane values.
Comment 3 Chris Wilson 2018-03-23 20:56:46 UTC
Ok, try changing intel-gpu-tools:

diff --git a/overlay/power.c b/overlay/power.c
index 9ac90fde..e02edec8 100644
--- a/overlay/power.c
+++ b/overlay/power.c
@@ -116,7 +116,8 @@ int power_init(struct power *power)
        memset(power, 0, sizeof(*power));
-       power->fd = igt_perf_open(rapl_type_id(), rapl_gpu_power());
+       power->fd = -1;
+       //power->fd = igt_perf_open(rapl_type_id(), rapl_gpu_power());
        if (power->fd >= 0) {
                power->rapl_scale = rapl_gpu_power_scale();
Comment 4 leozinho29_eu 2018-03-23 21:09:14 UTC
With this change it is showing the values as expected, 350 mW instead of 3500000000000 mW.
Comment 5 Chris Wilson 2018-03-23 21:18:54 UTC
Add a couple of printfs,

diff --git a/overlay/power.c b/overlay/power.c
index 9ac90fde..e6ac728a 100644
--- a/overlay/power.c
+++ b/overlay/power.c
@@ -117,8 +117,12 @@ int power_init(struct power *power)
        memset(power, 0, sizeof(*power));
        power->fd = igt_perf_open(rapl_type_id(), rapl_gpu_power());
+       fprintf(stderr, "rapl_type_id()=%"PRIx64", rapl_gpu_power()=%"PRIx64"\n",
+               rapl_type_id(), rapl_gpu_power());
        if (power->fd >= 0) {
                power->rapl_scale = rapl_gpu_power_scale();
+               fprintf(stderr, "rapl_gpu_power_scale()=%f\n",
+                       rapl_gpu_power_scale());
                if (power->rapl_scale != NAN) {
                        power->rapl_scale *= 1e3; /* from nano to micro */

and run with -f (so that it doesn't detach and we can see the output).
Comment 6 leozinho29_eu 2018-03-23 22:30:14 UTC
The lines

        fprintf(stderr, "rapl_type_id()=%"PRIx64", rapl_gpu_power()=%"PRIx64"\n",
                rapl_type_id(), rapl_gpu_power());

made the overlay fail to build. I changed it to the following (%lx is not ideal, but PRIx64 broke the build):

	fprintf(stderr, "rapl_type_id()=%lx\n",rapl_type_id());
	fprintf(stderr, "rapl_gpu_power()=%lx\n",rapl_gpu_power());

which should result in the intended output. It was:

Comment 7 Chris Wilson 2018-03-24 02:08:10 UTC
And for completeness:  cat /sys/devices/power/events/energy-gpu.scale
Comment 8 leozinho29_eu 2018-03-24 02:55:47 UTC
$ sudo cat /sys/devices/power/events/energy-gpu.scale
Comment 9 Chris Wilson 2018-03-27 14:33:12 UTC
Created attachment 138378 [details] [review]
Use setlocale("C") around strtod

Please try the attached patch.
Comment 10 leozinho29_eu 2018-03-27 18:30:04 UTC
With this patch, the power consumption is shown correctly.
Comment 11 Chris Wilson 2018-03-27 20:00:35 UTC
Thank you for the bug report and the invaluable testing.

commit 3fa0b027304ec28cd24b314349d3731b55dfcc0a (HEAD, upstream/master)
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Mar 27 15:20:51 2018 +0100

    overlay: Call setlocale around strtod

    strtod() is locale-dependent. The decimal conversion depends on the
    radix character ('.' for some of us like myself), which varies by
    locale. As the kernel reports its values using the "C" locale, we need
    to switch to that when parsing; and switch back before reporting to
    the user.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105712
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
    Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Comment 12 Lakshmi 2018-10-11 19:52:40 UTC
Closing this bug, as it was resolved/fixed.
