Bug 108712 - Kernel code no longer counts EU correctly for i5-5250U (HD Graphics 6000 BroadWell U-Processor GT3)
Status: CLOSED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Intel
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Intel GFX Bugs mailing list
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-11-12 01:59 UTC by gordon.lack
Modified: 2019-08-28 00:54 UTC

See Also:
i915 platform: BDW
i915 features:


Attachments
Attached information files, as noted in Description/Comment. (331.77 KB, application/gzip)
2018-11-12 01:59 UTC, gordon.lack
tools/intel_reg: Add Gen8 GT fuse registers (778 bytes, patch)
2018-11-12 10:57 UTC, Lionel Landwerlin
Extended register dump. (26.58 KB, text/plain)
2018-11-12 11:38 UTC, gordon.lack
drm/i915: fix broadwell EU computation (1.07 KB, patch)
2018-11-12 12:40 UTC, Lionel Landwerlin
debug file-system info relating to the EUs found (2.76 KB, text/plain)
2018-11-13 19:27 UTC, gordon.lack

Description gordon.lack 2018-11-12 01:59:53 UTC
Created attachment 142439 [details]
Attached information files, as noted in Description/Comment.

The Ubuntu Bionic kernels (4.15.x) report an i5-5250U processor (with HD Graphics 6000 BroadWell U-Processor GT3 GPU) as having 48 EUs.

The Cosmic ones (4.18.x) report it as having only 24. This is a result of only 4 EUs per sub-slice being counted (it should have 8). Unsurprisingly, this results in a significant loss of compute power when software uses this value to determine how to run.
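
As a quick check of the arithmetic, here is a minimal sketch assuming a uniformly populated part with 2 enabled slices and 3 sub-slices per slice (the topology this GT3 reports). The function name is illustrative only, not anything from the kernel.

```python
def eu_total(slices: int, subslices_per_slice: int, eus_per_subslice: int) -> int:
    """Total EU count for a GPU where every enabled sub-slice has the same EU count."""
    return slices * subslices_per_slice * eus_per_subslice


# 8 EUs per sub-slice, as counted by the 4.15.x kernels
print(eu_total(2, 3, 8))
# 4 EUs per sub-slice, as counted by the 4.18.x kernels
print(eu_total(2, 3, 4))
```

With 8 EUs per sub-slice this gives 48, and with 4 it gives 24, matching the two kernel reports.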

The code that determines how many EUs there are is in intel_device_info.c, and this was changed in the 4.17 kernel series by this commit:

https://github.com/torvalds/linux/commit/8cc7669355136f8952779e6f60053c1284d59c4d

I've attached the intel_reg dump info for the graphics set-up and also 4.15.x and 4.18.x clinfo and the contents of the debug file-system (/sys/kernel/debug/dri/0).
The clinfo output contains a debug statement, added to the beignet code on the system, which reports the value it gets from the register-reading kernel code (beignet then sets it to 48 regardless). This was originally reported as a beignet bug (https://bugs.launchpad.net/ubuntu/+source/beignet/+bug/1800752).

Attachment files are:
   intel_reg_dump.txt
   4.15.0-38-generic.clinfo
   4.15.0-38-generic.dri-debug
   4.18.0-10-generic.clinfo
   4.18.0-10-generic.dri-debug
Comment 1 Lakshmi 2018-11-12 09:09:07 UTC
Tvrtko/Lionel, any comments here?
Comment 2 Lionel Landwerlin 2018-11-12 10:57:13 UTC
Not really sure where the problem lies...
Can you redump the registers with the attached patch? (Unfortunately we're missing the fuse registers in IGT.)
Comment 3 Lionel Landwerlin 2018-11-12 10:57:38 UTC
Created attachment 142441 [details] [review]
tools/intel_reg: Add Gen8 GT fuse registers
Comment 4 gordon.lack 2018-11-12 11:38:51 UTC
Created attachment 142442 [details]
Extended register dump.

Extended register dump attached.
For quick reference, this is the diff versus the original.

--- intel_reg_dump.txt-ORIG     2018-11-12 00:50:59.182273594 +0000
+++ intel_reg_dump.txt  2018-11-12 11:32:54.640801914 +0000
@@ -512,6 +512,10 @@
                             BLT_IMR (0x000220a8): 0xfeffffff
                        PRIVATE_PAT1 (0x000040e0): 0x000a0907
                        PRIVATE_PAT2 (0x000040e4): 0x3b2b1b0b
+                              FUSE2 (0x00009120): 0x46000280
+                        EU_DISABLE0 (0x00009134): 0x00000000
+                        EU_DISABLE1 (0x00009138): 0xffff0000
+                        EU_DISABLE2 (0x0000913c): 0x91a03eff
                      AUD_TCA_CONFIG (0x00065000): 0x10090000
                      AUD_TCB_CONFIG (0x00065100): 0x0070fa60
                      AUD_TCC_CONFIG (0x00065200): 0x0070fa60
@@ -542,9 +546,9 @@
  AUD_TCB_PIN_PIPE_CONN_ENTRY_LENGTH (0x000651a8): 0x00000003
  AUD_TCC_PIN_PIPE_CONN_ENTRY_LENGTH (0x000652a8): 0x00000003
              AUD_PIPE_CONN_SEL_CTRL (0x000650ac): 0x00030303
-            AUD_TCA_DIP_ELD_CTRL_ST (0x000650b4): 0x00005541
-            AUD_TCB_DIP_ELD_CTRL_ST (0x000651b4): 0x00005421
-            AUD_TCC_DIP_ELD_CTRL_ST (0x000652b4): 0x00005421
+            AUD_TCA_DIP_ELD_CTRL_ST (0x000650b4): 0x00005583
+            AUD_TCB_DIP_ELD_CTRL_ST (0x000651b4): 0x00005463
+            AUD_TCC_DIP_ELD_CTRL_ST (0x000652b4): 0x00005463
                  AUD_PIN_ELD_CP_VLD (0x000650c0): 0x00000000
                AUD_HDMI_FIFO_STATUS (0x000650d4): 0x00000000
                            AUD_ICOI (0x00065f00): 0x00000000
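
For readers who want to interpret the four new register values, the following is a rough sketch of how they might be decoded. The bit positions and packing (slice enable at FUSE2 bits 27:25, sub-slice disable at bits 23:21, 24 EU-disable bits per slice packed across EU_DISABLE0..2) are assumptions modelled on the i915 driver's Gen8 definitions and should be checked against i915_reg.h. The function and constant names are hypothetical.

```python
# Assumed Gen8 field layout (modelled on the i915 driver; verify against i915_reg.h).
S_ENA_SHIFT = 25           # FUSE2 bits 27:25, one enable bit per slice
SS_DIS_SHIFT = 21          # FUSE2 bits 23:21, one disable bit per sub-slice
MAX_SUBSLICES = 3
MAX_EUS_PER_SUBSLICE = 8


def decode_bdw_fuses(fuse2, dis0, dis1, dis2):
    """Return (slice_mask, subslice_mask, eu_disable, eu_total)."""
    slice_mask = (fuse2 >> S_ENA_SHIFT) & 0x7
    subslice_mask = ~(fuse2 >> SS_DIS_SHIFT) & 0x7
    # Each slice owns 24 disable bits, packed back to back across the registers.
    eu_disable = [
        dis0 & 0xFFFFFF,
        (dis0 >> 24) | ((dis1 & 0xFFFF) << 8),
        (dis1 >> 16) | ((dis2 & 0xFF) << 16),
    ]
    eu_total = 0
    for s, disabled in enumerate(eu_disable):
        if not slice_mask & (1 << s):
            continue  # slice not enabled in FUSE2
        for ss in range(MAX_SUBSLICES):
            if not subslice_mask & (1 << ss):
                continue  # sub-slice fused off
            ss_disabled = (disabled >> (ss * MAX_EUS_PER_SUBSLICE)) & 0xFF
            eu_total += MAX_EUS_PER_SUBSLICE - bin(ss_disabled).count("1")
    return slice_mask, subslice_mask, eu_disable, eu_total


# Values taken from the register dump diff above.
slice_mask, subslice_mask, eu_disable, eu_total = decode_bdw_fuses(
    0x46000280, 0x00000000, 0xFFFF0000, 0x91A03EFF
)
print(hex(slice_mask), hex(subslice_mask), [hex(v) for v in eu_disable], eu_total)
```

Under those assumptions the dump decodes to slice mask 0x3, sub-slice mask 0x7, no EUs disabled in slices 0 and 1, and every EU-disable bit set for slice 2, which adds up to 48 EUs across the two enabled slices.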
Comment 5 Lionel Landwerlin 2018-11-12 12:11:50 UTC
The content of the registers seems to be consistent with all of the EUs of slice2 being fused off. That's pretty odd considering that slice2 is not enabled...
I don't really know what's going on.

As far as I can tell, the commit you pointed to wouldn't have changed this issue and you should see 24EUs even before that.
Comment 6 Lionel Landwerlin 2018-11-12 12:25:07 UTC
Trying to repro locally with the values from the dump.
Comment 7 gordon.lack 2018-11-12 12:27:05 UTC
The commit I mentioned completely changed how the EUs were calculated (as far as I can tell). It's the only change I can see within the 4.15 -> 4.18 time frame that affects broadwell_sseu_info_init().

At 4.15 i915_sseu_status (in the debugfs file-system) reported:

SSEU Device Info
  Available Slice Mask: 0003
  Available Slice Total: 2
  Available Subslice Total: 6
  Available Subslice Mask: 0007
  Available Subslice Per Slice: 3
  Available EU Total: 48
  Available EU Per Subslice: 8
  Has Pooled EU: no
  Has Slice Power Gating: yes
  Has Subslice Power Gating: no
  Has EU Power Gating: no
SSEU Device Status
  Enabled Slice Mask: 0000
  Enabled Slice Total: 0
  Enabled Subslice Total: 0
  Enabled Subslice Mask: 0000
  Enabled Subslice Per Slice: 0
  Enabled EU Total: 0   
  Enabled EU Per Subslice: 0

whereas at 4.18 this becomes:
SSEU Device Info   
  Available Slice Mask: 0003
  Available Slice Total: 2
  Available Subslice Total: 6
  Available Slice0 subslices: 3
  Available Slice1 subslices: 3
  Available EU Total: 24
  Available EU Per Subslice: 4 
  Has Pooled EU: no
  Has Slice Power Gating: yes 
  Has Subslice Power Gating: no
  Has EU Power Gating: no
SSEU Device Status
  Enabled Slice Mask: 0003
  Enabled Slice Total: 2
  Enabled Subslice Total: 6
  Enabled Slice0 subslices: 3
  Enabled Slice1 subslices: 3
  Enabled EU Total: 24
  Enabled EU Per Subslice: 4
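
To compare such dumps between kernels without reading them by eye, something like the following sketch could be used. It assumes the indentation-based layout shown above (unindented section titles, indented "Key: value" lines). The function name is hypothetical, and the usual location of the file (/sys/kernel/debug/dri/0/i915_sseu_status) is assumed from the attachment description.

```python
def parse_sseu_status(text: str) -> dict:
    """Map (section, key) -> value string from i915_sseu_status-style text."""
    result = {}
    section = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if not line.startswith(" "):
            section = line.strip()  # e.g. "SSEU Device Info"
            continue
        key, _, value = line.strip().partition(":")
        result[(section, key.strip())] = value.strip()
    return result


sample = """SSEU Device Info
  Available Slice Mask: 0003
  Available EU Total: 24
  Available EU Per Subslice: 4
SSEU Device Status
  Enabled EU Total: 24
"""
info = parse_sseu_status(sample)
print(int(info[("SSEU Device Info", "Available EU Total")]))
```

On a live system the text would come from reading the debugfs file as root rather than from the inline sample.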
Comment 8 Lionel Landwerlin 2018-11-12 12:40:22 UTC
Created attachment 142443 [details] [review]
drm/i915: fix broadwell EU computation

You're right!
Comment 9 gordon.lack 2018-11-12 17:49:48 UTC
Presumably this will now be updated in the latest kernel?
Comment 10 Lionel Landwerlin 2018-11-12 20:08:07 UTC
(In reply to gordon.lack from comment #9)
> Presumably this will now be updated in the latest kernel?

That's right, pushed to the intel branch:

commit 63ac3328f0d1d37f286e397b14d9596ed09d7ca5
Author: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Date:   Mon Nov 12 12:39:31 2018 +0000

    drm/i915: fix broadwell EU computation
    
    subslice_mask is an array indexed by slice, not subslice.


Not sure how long it takes to propagate to stable but it should get there eventually.

Thanks for the report.
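
The class of mistake named in the commit message can be illustrated with a small sketch. This is not the kernel code, and the masks below are made-up values chosen so the two loops disagree; it is not meant to reproduce the exact count of 24 seen in this report. It only shows how indexing a per-slice array with the sub-slice index silently yields a different count.

```python
# Hypothetical per-slice sub-slice masks: one entry per slice, one bit per sub-slice.
subslice_mask = [0b111, 0b101, 0b011]
MAX_SLICES = 3
MAX_SUBSLICES = 3


def count_subslices_correct(masks):
    """Index the array by slice, then test the sub-slice bit."""
    return sum(
        1
        for s in range(MAX_SLICES)
        for ss in range(MAX_SUBSLICES)
        if masks[s] & (1 << ss)
    )


def count_subslices_buggy(masks):
    """Same loop, but the array is wrongly indexed by sub-slice."""
    return sum(
        1
        for s in range(MAX_SLICES)
        for ss in range(MAX_SUBSLICES)
        if masks[ss] & (1 << ss)
    )


print(count_subslices_correct(subslice_mask), count_subslices_buggy(subslice_mask))
```

With these illustrative masks the correct loop finds 7 enabled sub-slices while the mis-indexed one finds 3, with no error raised in either case.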
Comment 11 Lakshmi 2018-11-13 09:17:46 UTC
Gordon, can you verify that this works for you with the latest drm-tip? I can close the bug if it works as expected.
(https://cgit.freedesktop.org/drm-tip)
Comment 12 gordon.lack 2018-11-13 11:50:15 UTC
May take a day or two for that....
Comment 13 gordon.lack 2018-11-13 19:09:05 UTC
I've just tested it and it does fix it.
Details to follow...
Comment 14 gordon.lack 2018-11-13 19:27:46 UTC
Created attachment 142455 [details]
debug file-system info relating to the EUs found

I've built the Ubuntu kernel 4.18.0-10-generic with the patch applied.
I copied the i915.ko module file out to the /lib/modules location.
Rebooted and found no change!
Then remembered that this module was probably in the initrd file, so created a new one and rebooted with that.
This now showed things to be OK.

I ran a simple test program I had to query I915_PARAM_EU_TOTAL using ioctl() and that now returns 48 (not 24).
The debug file-system data (relevant files attached) also now show 48 EUs.

So this has fixed the problem.

Thanks...
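
The test program itself is not attached, but a query of this kind might look roughly like the sketch below. The device path, the parameter number (34) and the ioctl encoding are assumptions taken from the i915 uAPI header and the x86 Linux ioctl convention; check them against i915_drm.h on the target system. All names defined here are illustrative.

```python
import ctypes
import fcntl
import os

I915_PARAM_EU_TOTAL = 34   # assumed value from i915_drm.h
DRM_COMMAND_BASE = 0x40
DRM_I915_GETPARAM = 0x05


class DrmI915Getparam(ctypes.Structure):
    """Mirror of struct drm_i915_getparam: { int param; int *value; }."""
    _fields_ = [("param", ctypes.c_int32),
                ("value", ctypes.POINTER(ctypes.c_int))]


def _iowr(type_char, nr, size):
    # Linux _IOWR encoding on x86: direction | size | type | number
    return (3 << 30) | (size << 16) | (ord(type_char) << 8) | nr


DRM_IOCTL_I915_GETPARAM = _iowr(
    "d", DRM_COMMAND_BASE + DRM_I915_GETPARAM, ctypes.sizeof(DrmI915Getparam)
)


def query_eu_total(path="/dev/dri/card0"):
    """Return the EU total reported by the kernel, or None if the query fails."""
    value = ctypes.c_int(0)
    gp = DrmI915Getparam(I915_PARAM_EU_TOTAL, ctypes.pointer(value))
    try:
        fd = os.open(path, os.O_RDWR)
    except OSError:
        return None  # no such device, or no permission
    try:
        fcntl.ioctl(fd, DRM_IOCTL_I915_GETPARAM, gp)
    except OSError:
        return None  # not an i915 device, or parameter unsupported
    finally:
        os.close(fd)
    return value.value


print(query_eu_total())
```

On a machine without an accessible i915 device this prints None instead of raising.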
Comment 15 Lionel Landwerlin 2018-11-14 00:01:22 UTC
(In reply to gordon.lack from comment #14)
> Created attachment 142455 [details]
> debug file-system info relating to the EUs found
> 
> I've built the Ubuntu kernel 4.18.0-10-generic with the patch applied.
> I copied the i915.ko module file out to the /lib/modules location.
> Rebooted and found no change!
> Then remembered that this module was probably in the initrd file, so created
> a new one and rebooted with that.
> This now showed things to be OK.
> 
> I ran a simple test program I had to query I915_PARAM_EU_TOTAL using ioctl()
> and that now returns 48 (not 24).
> The debug file-system data (relevant files attached) also now show 48 EUs.
> 
> So this has fixed the problem.
> 
> Thanks...

Thanks for spending the time to verify!
Comment 16 Dmitry Rogozhkin 2019-08-27 17:05:47 UTC
Hi, the Intel OpenCL NEO driver got a bug report (https://github.com/intel/compute-runtime/issues/200) which seems to describe this issue. I checked whether Lionel's patch was applied to the kernel.org LTS kernels and it seems it was not:
1. 4.20 has the patch: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/?h=linux-4.20.y&qt=grep&q=drm%2Fi915%3A+fix+broadwell+EU+computation
2. 4.19 does not: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/log/?h=linux-4.19.y&qt=grep&q=drm%2Fi915%3A+fix+broadwell+EU+computation

The patch has a "Fixes" tag in the commit message; was it missed by KH in cherry-picks?

Lionel, can you please also check whether the patch is needed for 4.14 LTS? (I assume it is definitely needed for 4.19.)
Comment 17 gordon.lack 2019-08-28 00:54:56 UTC
FWIW, 5.0 (disco) is OK as well.

I know that as I'm using it with the standard beignet package, not the patched one that was needed to "force" 48 EUs to be seen.

