drm-tip kernel has caused ~15% drop in SynMark v7.0 Multithread 3D use-case between these commits: kernel git://anongit.freedesktop.org/drm-tip at 560582ef03d1f6bf9010232fcc6613188a00d198 2017-07-17_16-27-42 drm-tip: 2017y-07m-17d-16h-27m-16s UTC integration manifest kernel git://anongit.freedesktop.org/drm-tip at 10de1e17faaab452782e5a1baffd1b30a639a261 2017-07-18_10-09-14 drm-tip: 2017y-07m-18d-10h-08m-42s UTC integration manifest (This test has quite a large variance, but 15% drop in this period is very clear from the longer term trend.) This drop is BXT specific and visible on 2 different BXT (18 EU) J4205 devices. Multithread test has 4 work threads doing low and hi resolution reflections / environment map rendering, with main thread rendering the final HDR image with tone mapping and bloom. This requires enough draw calls & syching that this test is (mostly) CPU bound. Note: The less CPU bound version (HdrBloom) of this test, which has only 1 work thread instead of 4, and which uses higher resolution & details for that, doesn't show any (clear) performance regression.
I don't have those commit ids available, but that date does coincide with the rc1 merge. Is that true? Just checking the merge base would be useful to narrow it down between i915 vs the world.
Ok, between v4.12 and drm-tip I see a drop from 6 to 4.5 on my bxt. Should poke Martin about ezbench bisection...
First bisection gives: commit 3fed382b46baac83703130fe4cd3d9147f427fb9 Author: Rik van Riel <riel@redhat.com> Date: Fri Jun 23 12:55:29 2017 -0400 sched/numa: Implement NUMA node level wake_affine() Since select_idle_sibling() can place a task anywhere on a socket, comparing loads between individual CPU cores makes no real sense for deciding whether to do an affine wakeup across sockets, either. Instead, compare the load between the sockets in a similar way the load balancer and the numa balancing code do. Signed-off-by: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: jhladky@redhat.com Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/20170623165530.22514-4-riel@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
Created attachment 133763 [details] [review] Fix wake_affine regression Playing around to restore the old code; gives us the spread across cores required to make OglMultithread happier.
Ok, found a version by PeterZ: 20170801121912.fnykqlq3r5jcbtn2@hirez.programming.kicks-ass.net
Gave it a cursory test and it appears to have fixed this particular regression, so pushed to topic/core-for-CI for wider testing: commit 367a3aa35e7bed10a74bed139f6728916f5d1508 Author: Peter Zijlstra <peterz@infradead.org> Date: Tue Aug 1 23:43:07 2017 +0200 sched/fair: Fix wake_affine() for !NUMA_BALANCING It should only affect !llc platforms, but the logic is slightly different than the original prior to commit 3fed382b46baac83703130fe4cd3d9147f427fb9 Author: Rik van Riel <riel@redhat.com> Date: Fri Jun 23 12:55:29 2017 -0400 sched/numa: Implement NUMA node level wake_affine() so testing definitely required.
Part 1 upstream: commit 90001d67be2fa2acbe3510d1f64fa6533efa30ef Author: Peter Zijlstra <peterz@infradead.org> Date: Mon Jul 31 17:50:05 2017 +0200 sched/fair: Fix wake_affine() for !NUMA_BALANCING Needs another fix which is pending.
Commit-ID: a731ebe6f17bff9e7ca12ef227f9da4d5bdf8425 Gitweb: http://git.kernel.org/tip/a731ebe6f17bff9e7ca12ef227f9da4d5bdf8425 Author: Peter Zijlstra <peterz@infradead.org> AuthorDate: Wed, 6 Sep 2017 12:51:31 +0200 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Thu, 7 Sep 2017 09:29:31 +0200 sched/fair: Fix wake_affine_llc() balancing rules Chris Wilson reported that the SMT balance rules got the +1 on the wrong side, resulting in a bias towards the current LLC; which the load-balancer would then try and undo. Reported-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Fixes: 90001d67be2f ("sched/fair: Fix wake_affine() for !NUMA_BALANCING") Link: http://lkml.kernel.org/r/20170906105131.gqjmaextmn3u6tj2@hirez.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
There have been *two*, equally large BXT improvements to SynMark Multithread performance. First one, which already more than fixed the earlier regression was between: 650ffdaee4 at 2017-08-24 14:30:13 UTC 4823f19b97 at 2017-08-25 15:55:28 UTC and the second was between: 587d276952 at 2017-09-06 17:54:04 UTC 2300dd9c44 at 2017-09-07 17:00:01 UTC I assume latter to be the above referred commit. That actually improved Multithread performance slightly also on several other platforms. While I can now verify this, I'm wondering a bit about the first commit... Thanks!
Closing old verified.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.