Created attachment 140989 [details]
dEQP Test execution log on lars, it fails on lars

On Chrome images, we are seeing a dEQP test case fail on skylake lars.

Steps to reproduce on a lars chromium test image:

  cd /usr/local/deqp/modules/gles31/
  /usr/local/deqp/modules/gles31/deqp-gles31 --deqp-case=dEQP-GLES31.functional.geometry_shading.layered.render_with_default_layer_3d --deqp-surface-type=pbuffer --deqp-gl-config-name=rgba8888d24s8ms0 --deqp-log-images=disable --deqp-watchdog=disable --deqp-surface-width=256 --deqp-surface-height=256 --deqp-log-filename=/tmp/dEQP-GLES31.log

Output:

  Test case 'dEQP-GLES31.functional.geometry_shading.layered.render_with_default_layer_3d'..
  Vertex shader compile time = 5.480000 ms
  Fragment shader compile time = 0.444000 ms
  Geometry shader compile time = 0.789000 ms
  Link time = 2.192000 ms
  Vertex shader compile time = 0.465000 ms
  Fragment shader compile time = 0.773000 ms
  Link time = 1.593000 ms
  Test case duration in microseconds = 21011 us
    Fail (Detected invalid layer content)

  DONE!

  Test run totals:
    Passed:        0/1 (0.0%)
    Failed:        1/1 (100.0%)
    Not supported: 0/1 (0.0%)
    Warnings:      0/1 (0.0%)

The same test passes on skylake chell.
Created attachment 140990 [details] Failing Lars system details, cache details
On further debugging, we found that the failure varies by SKU. The dEQP test always fails on Celeron SKUs:

  (paine)  Intel(R) Celeron(R) 3205U @ 1.50GHz
  (yuna)   Intel(R) Celeron(R) CPU 3215U @ 1.70GHz
  (gandof) Intel(R) Celeron(R) CPU 3215U @ 1.70GHz
  (lulu)   Intel(R) Celeron(R) 3205U @ 1.50GHz
  (fizz)   Intel(R) Celeron(R) CPU 3865U @ 1.80GHz

It passes on the SKUs below:

  Intel(R) Core(TM) i3-5005U
  Intel(R) Core(TM) i5-5300U
Tested this on skylake lars, on which the dEQP test FAILS:

  Linux localhost 3.18.0-17579-gfb99dee6e87ae #1 SMP PREEMPT Sat May 12 00:43:34 PDT 2018 x86_64 Intel(R) Celeron(R) CPU 3855U @ 1.60GHz GenuineIntel GNU/Linux

On skylake chell, the dEQP test PASSES:

  Linux localhost 3.18.0-18114-g73dce1f1fb79 #1 SMP PREEMPT Wed Aug 1 13:31:27 IST 2018 x86_64 Intel(R) Core(TM) m3-6Y30 CPU @ 0.90GHz GenuineIntel GNU/Linux

While debugging, I found the following differences between these systems:

                  Mesa Version   Cache size (L3)   CPU(s)   Threads per Core
  skylake lars    18.2.0         2048KB            2        1
  skylake chell   18.2.0         4096KB            4        2

I do not see any issue with the kernel. Attaching the debug data files for future reference.
Created attachment 140992 [details] dEQP Test execution log on chell, it passes on chell
Created attachment 140993 [details] Passing chell system details, cache, RAM details
Common reasons why test results may differ within a GPU generation:

* Mesa has a bug in URB allocation because the two SKUs have different URB sizes.
* The L3 cache may be coherent between the GPU and CPU on one chipset, but incoherent on the other chipset.
* One device has 1 DIMM of RAM, the other device has 2 DIMMs. In this case, the hardware will swizzle the tiling patterns differently. Specifically, X and Y tiles with 1 DIMM differ from X and Y tiles with 2 DIMMs. This rarely causes bugs, but it's worth considering.
* The SKU requires a hardware workaround (maybe documented, maybe not) that Mesa has not yet implemented.

I checked the DIMM configuration: it is the same in the passing and failing cases, so that is eliminated as well.

There is a difference in cache size between the passing and failing cases. I suspect that either Mesa has a bug in URB allocation or this particular SKU requires a hardware workaround that still needs to be implemented in Mesa. I would like the Mesa team's insight into this bug.

Please let me know if any other debug input is required.
Thanks, could you add the pciid of the 2 devices? (lspci -nn)
Created attachment 140995 [details] [review] intel: sklgt1: limit URB size to 128kb Maybe you can give this patch a try. I'm trying to get some information about how to tell the exact size of the URB on SKL GT1.
Thanks Lionel, will try the patch.

On failing lars, pci id = 0x1906

localhost ~ # lspci -nn
00:00.0 Host bridge [0600]: Intel Corporation Sky Lake Host Bridge/DRAM Registers [8086:1904] (rev 08)
00:02.0 VGA compatible controller [0300]: Intel Corporation Device [8086:1906] (rev 07)
00:04.0 Signal processing controller [1180]: Intel Corporation Device [8086:1903] (rev 08)
00:14.0 USB controller [0c03]: Intel Corporation Device [8086:9d2f] (rev 21)
00:14.2 Signal processing controller [1180]: Intel Corporation Device [8086:9d31] (rev 21)
00:15.0 Signal processing controller [1180]: Intel Corporation Device [8086:9d60] (rev 21)
00:15.1 Signal processing controller [1180]: Intel Corporation Device [8086:9d61] (rev 21)
00:19.0 Signal processing controller [1180]: Intel Corporation Device [8086:9d66] (rev 21)
00:19.2 Signal processing controller [1180]: Intel Corporation Device [8086:9d64] (rev 21)
00:1c.0 PCI bridge [0604]: Intel Corporation Device [8086:9d10] (rev f1)
00:1e.0 Signal processing controller [1180]: Intel Corporation Device [8086:9d27] (rev 21)
00:1e.4 SD Host controller [0805]: Intel Corporation Device [8086:9d2b] (rev 21)
00:1f.0 ISA bridge [0601]: Intel Corporation Device [8086:9d43] (rev 21)
00:1f.2 Memory controller [0580]: Intel Corporation Device [8086:9d21] (rev 21)
00:1f.3 Multimedia audio controller [0401]: Intel Corporation Device [8086:9d70] (rev 21)
00:1f.4 SMBus [0c05]: Intel Corporation Device [8086:9d23] (rev 21)
00:1f.5 Non-VGA unclassified device [0000]: Intel Corporation Device [8086:9d24] (rev 21)
01:00.0 Network controller [0280]: Intel Corporation Wireless 7265 [8086:095a] (rev 59)

On passing skylake chell, pci id = 0x191e

localhost ~ # lspci -nn
00:00.0 Host bridge [0600]: Intel Corporation Sky Lake Host Bridge/DRAM Registers [8086:190c] (rev 08)
00:02.0 VGA compatible controller [0300]: Intel Corporation Sky Lake Integrated Graphics [8086:191e] (rev 07)
00:04.0 Signal processing controller [1180]: Intel Corporation Device [8086:1903] (rev 08)
00:14.0 USB controller [0c03]: Intel Corporation Device [8086:9d2f] (rev 21)
00:14.2 Signal processing controller [1180]: Intel Corporation Device [8086:9d31] (rev 21)
00:15.0 Signal processing controller [1180]: Intel Corporation Device [8086:9d60] (rev 21)
00:15.1 Signal processing controller [1180]: Intel Corporation Device [8086:9d61] (rev 21)
00:19.0 Signal processing controller [1180]: Intel Corporation Device [8086:9d66] (rev 21)
00:19.2 Signal processing controller [1180]: Intel Corporation Device [8086:9d64] (rev 21)
00:1c.0 PCI bridge [0604]: Intel Corporation Device [8086:9d10] (rev f1)
00:1e.0 Signal processing controller [1180]: Intel Corporation Device [8086:9d27] (rev 21)
00:1e.4 SD Host controller [0805]: Intel Corporation Device [8086:9d2b] (rev 21)
00:1f.0 ISA bridge [0601]: Intel Corporation Device [8086:9d46] (rev 21)
00:1f.2 Memory controller [0580]: Intel Corporation Device [8086:9d21] (rev 21)
00:1f.3 Multimedia audio controller [0401]: Intel Corporation Device [8086:9d70] (rev 21)
00:1f.4 SMBus [0c05]: Intel Corporation Device [8086:9d23] (rev 21)
00:1f.5 Non-VGA unclassified device [0000]: Intel Corporation Device [8086:9d24] (rev 21)
01:00.0 Network controller [0280]: Intel Corporation Wireless 7265 [8086:095a] (rev 59)
Alright, you can forget about that patch then, your devices are SKL GT2.
Comment on attachment 140995 [details] [review] intel: sklgt1: limit URB size to 128kb Review of attachment 140995 [details] [review]: ----------------------------------------------------------------- This probably won't help the devices on this issue :(
You're running a kernel that is a bit old, but I know ChromeOS backports the i915 driver.

Could you look into /sys/kernel/debug/dri/0/i915_gpu_info and see if you have a few lines like these:

slice0: 3 subslice(s) (0x7):
        subslice0: 8 EUs (0xff)
        subslice1: 8 EUs (0xff)
        subslice2: 8 EUs (0xff)
        subslice3: 0 EUs (0x0)
slice1: 0 subslice(s) (0x0):
        subslice0: 0 EUs (0x0)
        subslice1: 0 EUs (0x0)
        subslice2: 0 EUs (0x0)
        subslice3: 0 EUs (0x0)
slice2: 0 subslice(s) (0x0):
        subslice0: 0 EUs (0x0)
        subslice1: 0 EUs (0x0)
        subslice2: 0 EUs (0x0)
        subslice3: 0 EUs (0x0)

This could help figure out some of the fusing information that could impact programming.

Thanks!
Lionel,

i915_gpu_info is not present in /sys/kernel/debug/dri/0/*

Do we need any patch to enable this sysfs path?

Also, while debugging I found this commit, which might help.

commit c1e38ad37042b0ec261eb0ba5631b7ff0ee7a9da
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date:   Thu Sep 10 16:59:12 2015 -0700

    i965/skl: Use larger URB size where available.

    All SKL SKUs except the lowest one which has half the L3 size actually have 384K
    of URB per slice.

    For once, I can explain how this mistake was made and how it was missed in
    review...

    Historically when we enable a platform and put the production sizes, you can
    simply look at the "smallest" SKU and see what its URB size is (and we assumed
    it was the 1 slice variant). Since on newer platforms the URB sizes are scaled
    automatically by HW, this was sufficient. On SKL, this is a bit different as the
    lowest SKU actually has half of the L3 fused off. GT2 is the 1 slice (not GT1)
    variant and it has 384K.

    There are no Jenkins tests fixed (or regressions) and we don't expect any fixes
    here because you can always run with less URB size.

    Thanks to Sarah for bringing this to my attention.

    Cc: Sarah Sharp <sarah.a.sharp@intel.com>
    Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
    Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c b/src/mesa/drivers/dri/i965/brw_device_info.c
index 7ad3a2fb7b40..a6a3bb670cae 100644
--- a/src/mesa/drivers/dri/i965/brw_device_info.c
+++ b/src/mesa/drivers/dri/i965/brw_device_info.c
@@ -314,7 +314,7 @@ static const struct brw_device_info brw_device_info_chv = {
    .max_wm_threads = 64 * 6,                         \
    .max_cs_threads = 56,                             \
    .urb = {                                          \
-      .size = 192,                                   \
+      .size = 384,                                   \
       .min_vs_entries = 64,                          \
       .max_vs_entries = 1856,                        \
       .max_hs_entries = 672,                         \
@@ -324,6 +324,7 @@ static const struct brw_device_info brw_device_info_chv = {
 static const struct brw_device_info brw_device_info_skl_gt1 = {
    GEN9_FEATURES, .gt = 1,
+   .urb.size = 192,
 };
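For readers unfamiliar with the pattern in the diff above: the shared feature macro sets a default URB size, and a later designated initializer in the GT1 entry overrides only that one field. The following is a minimal sketch in C of that pattern. The struct, macro, and variable names (device_info, EXAMPLE_GEN9_FEATURES, example_skl_gt1, example_skl_gt2) are hypothetical, cut-down stand-ins and not Mesa's actual definitions; only the values 384, 192, 64 and 1856 are taken from the diff.

```c
#include <assert.h>

/* Hypothetical, cut-down device-info struct. Only the fields needed to
 * show the override are kept; this is not Mesa's brw_device_info. */
struct urb_info {
   unsigned size;            /* URB size in KB */
   unsigned min_vs_entries;
   unsigned max_vs_entries;
};

struct device_info {
   int gt;
   struct urb_info urb;
};

/* Shared defaults, in the spirit of GEN9_FEATURES in the diff above. */
#define EXAMPLE_GEN9_FEATURES        \
   .urb = {                          \
      .size = 384,                   \
      .min_vs_entries = 64,          \
      .max_vs_entries = 1856,        \
   }

/* GT2 keeps the shared default of 384. */
static const struct device_info example_skl_gt2 = {
   EXAMPLE_GEN9_FEATURES, .gt = 2,
};

/* GT1 overrides a single field. A later designated initializer replaces
 * the earlier value for .urb.size; the other .urb fields keep the values
 * from the macro. Compilers may warn about the override
 * (for example GCC with -Woverride-init). */
static const struct device_info example_skl_gt1 = {
   EXAMPLE_GEN9_FEATURES, .gt = 1,
   .urb.size = 192,
};

/* Usage sketch: check that only the size differs between the entries. */
static void
example_check_urb_override(void)
{
   assert(example_skl_gt2.urb.size == 384);
   assert(example_skl_gt1.urb.size == 192);
   assert(example_skl_gt1.urb.min_vs_entries == example_skl_gt2.urb.min_vs_entries);
}
```

The design keeps every per-SKU entry short: only the fields that deviate from the generation-wide defaults are spelled out, which is also why a wrong default silently applies to every SKU that does not override it.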
(In reply to gaurav.k.singh from comment #13)
> Lionel,
>
> i915_gpu_info is not present in /sys/kernel/debug/dri/0/*
>
> DO we need any patch to enable this sysfs path.
>

You might need to mount debugfs.

Regarding the patch you mentioned, this is consistent with the hardware documentation, so I can't think of anything (unless the docs are wrong).
Feel free to reduce the URB size to 192 and see if this fixes the bug.
(In reply to Lionel Landwerlin from comment #14)
> (In reply to gaurav.k.singh from comment #13)
> > Lionel,
> >
> > i915_gpu_info is not present in /sys/kernel/debug/dri/0/*
> >
> > DO we need any patch to enable this sysfs path.
> >
>
> You might need to mount debugfs.

Debugfs is already mounted, but i915_gpu_info is not there in /sys/kernel/debug/dri/0/

>
> Regarding the patch you mentioned, this is consistent with the hardware
> documentation, so I can't think of anything (unless the docs are wrong).
> Feel free to reduce the URB size to 192 and see if this fixes the bug.

Yes, I am trying. Will let you know.
This test is reliable on Mesa CI, for all platforms.

How often does it fail on the lars system? Does it pass when you run the test by itself (without the rest of the dEQP suite)?

You mention single-channel RAM. Does lars have single-channel RAM in the test configuration? This bug was found for that configuration:

https://bugs.freedesktop.org/show_bug.cgi?id=105064
(In reply to Mark Janes from comment #16)
> this test is reliable on Mesa CI, for all platforms.

This test is failing on Celeron SKUs:

  (paine)  Intel(R) Celeron(R) 3205U @ 1.50GHz
  (yuna)   Intel(R) Celeron(R) CPU 3215U @ 1.70GHz
  (gandof) Intel(R) Celeron(R) CPU 3215U @ 1.70GHz
  (lulu)   Intel(R) Celeron(R) 3205U @ 1.50GHz
  (fizz)   Intel(R) Celeron(R) CPU 3865U @ 1.80GHz

>
> How often does it fail on the lars system? Does it pass when you run the
> test by itself (without the rest of the dEQP suite)?

I am running this test by itself on skylake lars (x86_64 Intel(R) Celeron(R) CPU 3855U @ 1.60GHz), using only the command below on the target:

  cd /usr/local/deqp/modules/gles31/
  /usr/local/deqp/modules/gles31/deqp-gles31 --deqp-case=dEQP-GLES31.functional.geometry_shading.layered.render_with_default_layer_3d --deqp-surface-type=pbuffer --deqp-gl-config-name=rgba8888d24s8ms0 --deqp-log-images=disable --deqp-watchdog=disable --deqp-surface-width=256 --deqp-surface-height=256 --deqp-log-filename=/tmp/dEQP-GLES31.log

>
> You mention single channel ram. Does lars have single channel ram in the
> test configuration? This bug was found for that configuration:
>
> https://bugs.freedesktop.org/show_bug.cgi?id=105064
Lionel,

The skylake lars system I am using is GT1 only.

This is based on the info below:

localhost /sys/kernel/debug/dri/0 # cat i915_sseu_status
SSEU Device Info
  Available Slice Total: 1
  Available Subslice Total: 2
  Available Subslice Per Slice: 2
  Available EU Total: 12
  Available EU Per Subslice: 6
  Has Slice Power Gating: no
  Has Subslice Power Gating: no
  Has EU Power Gating: yes
SSEU Device Status
  Enabled Slice Total: 1
  Enabled Subslice Total: 2
  Enabled Subslice Per Slice: 2
  Enabled EU Total: 16
  Enabled EU Per Subslice: 8

Confirmed with the Mesa code below as well, which defines the device info for skl_gt1:

static const struct gen_device_info gen_device_info_skl_gt1 = {
   GEN9_FEATURES, .gt = 1,
   .is_skylake = true,
   .num_slices = 1,
   .num_subslices = { 2, },
   .num_eu_per_subslice = 6,
   .l3_banks = 2,
   .urb.size = 192,
};
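The GT1 identification above amounts to comparing the "Available" topology from i915_sseu_status with the skl_gt1 table entry: 1 slice × 2 subslices × 6 EUs per subslice = 12 EUs. A minimal sketch in C of that comparison follows. The struct and function names (sseu_topology, sseu_total_eus, sseu_topology_equal) are hypothetical and not part of Mesa or the kernel; the numbers are copied from the output and struct quoted in this comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified topology description. This is not Mesa's
 * struct gen_device_info, only the three fields compared here. */
struct sseu_topology {
   unsigned num_slices;
   unsigned num_subslices_per_slice;
   unsigned num_eu_per_subslice;
};

/* Total EU count implied by a uniform topology. */
static unsigned
sseu_total_eus(const struct sseu_topology *t)
{
   assert(t != NULL);
   return t->num_slices * t->num_subslices_per_slice * t->num_eu_per_subslice;
}

/* True when two topologies match field by field. */
static bool
sseu_topology_equal(const struct sseu_topology *a,
                    const struct sseu_topology *b)
{
   assert(a != NULL && b != NULL);
   return a->num_slices == b->num_slices &&
          a->num_subslices_per_slice == b->num_subslices_per_slice &&
          a->num_eu_per_subslice == b->num_eu_per_subslice;
}

/* "Available" values reported by i915_sseu_status on the failing lars. */
static const struct sseu_topology lars_available = { 1, 2, 6 };

/* Values from the gen_device_info_skl_gt1 entry quoted above. */
static const struct sseu_topology mesa_skl_gt1 = { 1, 2, 6 };

/* Usage sketch: the kernel-reported topology matches the GT1 entry. */
static void
example_check_lars_is_gt1(void)
{
   assert(sseu_total_eus(&lars_available) == 12);
   assert(sseu_topology_equal(&lars_available, &mesa_skl_gt1));
}
```

Note that the sketch uses the "Available" block only. The "Enabled" block in the same output reports 8 EUs per subslice and 16 EUs total, which does not match the "Available" values; the thread does not explain that difference.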
(In reply to gaurav.k.singh from comment #18)
> Lionel,
>
> The skylake lars system I am using is GT1 only.
>
> The reason why is from the below info:
>
> localhost /sys/kernel/debug/dri/0 # cat i915_sseu_status
> SSEU Device Info
>   Available Slice Total: 1
>   Available Subslice Total: 2
>   Available Subslice Per Slice: 2
>   Available EU Total: 12
>   Available EU Per Subslice: 6
>   Has Slice Power Gating: no
>   Has Subslice Power Gating: no
>   Has EU Power Gating: yes
> SSEU Device Status
>   Enabled Slice Total: 1
>   Enabled Subslice Total: 2
>   Enabled Subslice Per Slice: 2
>   Enabled EU Total: 16
>   Enabled EU Per Subslice: 8
>
> Confirmed with below mesa code as well which defined device info for skl_gt1:
>
> static const struct gen_device_info gen_device_info_skl_gt1 = {
>    GEN9_FEATURES, .gt = 1,
>    .is_skylake = true,
>    .num_slices = 1,
>    .num_subslices = { 2, },
>    .num_eu_per_subslice = 6,
>    .l3_banks = 2,
>    .urb.size = 192,
> };

Oh thanks.
Then trying .urb.size = 128 would be interesting as it seems to be the lower bound for GT1 systems.
Based on this report, I think the i965 Mesa team needs a SKL GT1 device to add to our CI.
(In reply to Mark Janes from comment #20) > Based on this report, I think the i965 Mesa team needs a SKL GT1 device to > add to our CI. I think a lars system is making its way to Rafael. Not sure if on lease or permanently.
Lionel,

Changing the urb.size to 128 for SKL_GT1 did not help. The test fails again.

--- a/src/intel/dev/gen_device_info.c
+++ b/src/intel/dev/gen_device_info.c
@@ -602,7 +602,7 @@ static const struct gen_device_info gen_device_info_skl_gt1 = {
    .num_subslices = { 2, },
    .num_eu_per_subslice = 6,
    .l3_banks = 2,
-   .urb.size = 192,
+   .urb.size = 128,
 };

Please try at your end as well.

Also, have you received the lars board from Patrick?
Can you give me the repos & git commits you're using to run this test?
I think we need the version of kernel, mesa & CTS.

I'm running master of the opengl-cts at https://github.com/KhronosGroup/VK-GL-CTS

commit 44a2a863202dede460d83cc6b9ecdb632ccd0f7c
Merge: e8cc637f8 59145955f
Author: Alexander Galazin <alexander.galazin@arm.com>
Date:   Thu Jul 26 16:31:49 2018 +0200

    Merge vk-gl-cts/opengl-es-cts-3.2.5 into vk-gl-cts/master

    Change-Id: I46b6332876af27cea732a67613759229f7028f7a

Using Mesa on the 18.2 branch:

commit 43208511981aff918fc779f66708818aef9eca81
Author: Ray Strode <rstrode@redhat.com>
Date:   Thu Aug 16 16:37:25 2018 -0400

    gallium/winsys/kms: don't unmap what wasn't mapped

And the kernel from Fedora 28:

Linux lars-fedora 4.17.14-202.fc28.x86_64 #1 SMP Wed Aug 15 12:29:25 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

This particular test is passing.
Okay, I think we received the wrong Lars system...

You reported the error with 0x1906 (SKL GT1), but the system we received for debugging this is 0x1916 (SKL GT2).
That's probably why it works.
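Since the GT1/GT2 mix-up in this thread came from reading PCI device IDs, here is a minimal sketch in C of a lookup covering only the three IDs mentioned in this bug. The enum, table, and function names are hypothetical and the table is deliberately incomplete; it is not Mesa's PCI ID list, and any ID not discussed here returns "unknown".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification used only for this example. */
enum skl_gt {
   SKL_GT_UNKNOWN = 0,
   SKL_GT1 = 1,
   SKL_GT2 = 2,
};

struct pci_id_entry {
   uint16_t device_id;
   enum skl_gt gt;
};

/* Only the device IDs that appear in this bug report. */
static const struct pci_id_entry thread_pci_ids[] = {
   { 0x1906, SKL_GT1 }, /* failing lars (Celeron 3855U) */
   { 0x1916, SKL_GT2 }, /* lars unit received for debugging */
   { 0x191e, SKL_GT2 }, /* passing chell (Core m3-6Y30) */
};

/* Linear lookup; returns SKL_GT_UNKNOWN for IDs not in the table. */
static enum skl_gt
skl_gt_from_pci_id(uint16_t device_id)
{
   const size_t count = sizeof(thread_pci_ids) / sizeof(thread_pci_ids[0]);
   for (size_t i = 0; i < count; i++) {
      if (thread_pci_ids[i].device_id == device_id)
         return thread_pci_ids[i].gt;
   }
   return SKL_GT_UNKNOWN;
}

/* Usage sketch: the reported system and the debug system differ. */
static void
example_check_pci_ids(void)
{
   assert(skl_gt_from_pci_id(0x1906) == SKL_GT1);
   assert(skl_gt_from_pci_id(0x1916) == SKL_GT2);
   assert(skl_gt_from_pci_id(0x1906) != skl_gt_from_pci_id(0x1916));
}
```

The device ID is the second half of the [8086:xxxx] pair on the "VGA compatible controller" line of lspci -nn, so checking it before debugging confirms that the unit on the bench matches the unit in the report.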
Created attachment 141351 [details] [review] intel: limit urb size for SKL GT1 This makes the test pass for me. I'm looking for some comments from my colleagues because I don't know if the values from BXT will work for everything.
Created attachment 141352 [details] [review] intel: limit urb size for SKL GT1 Sorry trivial compile issue when moving the patch from lars to my laptop :(
Created attachment 141475 [details] [review] intel: limit urb size for SKL GT1 Another tentative fix which is probably closer to the final one. Unfortunately I need to loop back with the hardware people to verify all the new settings in there. Maybe keep the first patch for now if it works for you until we get all the settings confirmed. Thanks!
Hi Lionel and Gaurav, Any update on this? Is it safe to apply the first patch <https://lists.freedesktop.org/archives/mesa-dev/2018-August/203974.html> to the Chrome OS stable branches? Does the patch actually fix the test? Or should we wait for the second patch in comment 27?
I applied Lionel's fix from mesa-dev 2018-08-30 to the Chrome OS R70 stable branch. The dEQP failure was a potential release blocker, and the release deadline was today, so I chose the safer-looking of Lionel's two fixes.

https://chromium-review.googlesource.com/c/chromiumos/third_party/mesa/+/1271945

For what it's worth, I tested the patch on SKU lars_intel_skylake_celeron_3855u_4Gb (a GT1 machine). Before the patch, the test failed 128 of 128 runs. After the patch, it passed 1024 of 1024 runs.
(In reply to Chad Versace from comment #29)
> I applied Lionel's fix from mesa-dev 2018-08-30 to the Chrome OS R70 stable
> branch. The dEQP failure was a potential release blocker, and the release
> deadline was today, so I chose the safer-looking of Lionel's two fixes.
>
> https://chromium-review.googlesource.com/c/chromiumos/third_party/mesa/+/1271945
>
> For what it's worth, I tested the patch on SKU
> lars_intel_skylake_celeron_3855u_4Gb (a GT1 machine). Before patch, the test
> failed 128 of 128 runs. After patch, it passed 1024 of 1024 runs.

Hey Chad,

Yeah, the bug is 100% reproducible.
I've started a discussion with HW engineers about this issue but it's rather slow going...
Either of the patches will do for now.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1746.