Summary: | [BXT]15% performance drop in SynMark v7.0 Multithread use-case | ||||||
---|---|---|---|---|---|---|---|
Product: | DRI | Reporter: | Eero Tamminen <eero.t.tamminen> | ||||
Component: | DRM/Intel | Assignee: | Intel GFX Bugs mailing list <intel-gfx-bugs> | ||||
Status: | CLOSED FIXED | QA Contact: | Intel GFX Bugs mailing list <intel-gfx-bugs> | ||||
Severity: | normal | ||||||
Priority: | medium | CC: | intel-gfx-bugs | ||||
Version: | DRI git | ||||||
Hardware: | Other | ||||||
OS: | All | ||||||
Whiteboard: | |||||||
i915 platform: | BXT | i915 features: | |||||
Attachments: |
|
Description
Eero Tamminen
2017-08-21 12:46:13 UTC
I don't have those commit ids available, but that date does coincide with the rc1 merge. Is that true? Just checking the merge base would be useful to narrow it down between i915 vs the world. Ok, between v4.12 and drm-tip I see a drop from 6 to 4.5 on my bxt. Should poke Martin about ezbench bisection... First bisection gives: commit 3fed382b46baac83703130fe4cd3d9147f427fb9 Author: Rik van Riel <riel@redhat.com> Date: Fri Jun 23 12:55:29 2017 -0400 sched/numa: Implement NUMA node level wake_affine() Since select_idle_sibling() can place a task anywhere on a socket, comparing loads between individual CPU cores makes no real sense for deciding whether to do an affine wakeup across sockets, either. Instead, compare the load between the sockets in a similar way the load balancer and the numa balancing code do. Signed-off-by: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: jhladky@redhat.com Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/20170623165530.22514-4-riel@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Created attachment 133763 [details] [review] Fix wake_affine regression Playing around to restore the old code; gives us the spread across cores required to make OglMultithread happier. Ok, found a version by PeterZ: 20170801121912.fnykqlq3r5jcbtn2@hirez.programming.kicks-ass.net Gave it a cursory test and it appears to have fixed this particular regression, so pushed to topic/core-for-CI for wider testing: commit 367a3aa35e7bed10a74bed139f6728916f5d1508 Author: Peter Zijlstra <peterz@infradead.org> Date: Tue Aug 1 23:43:07 2017 +0200 sched/fair: Fix wake_affine() for !NUMA_BALANCING It should only affect !llc platforms, but the logic is slightly different than the original prior to commit 3fed382b46baac83703130fe4cd3d9147f427fb9 Author: Rik van Riel <riel@redhat.com> Date: Fri Jun 23 12:55:29 2017 -0400 sched/numa: Implement NUMA node level wake_affine() so testing definitely required. Part 1 upstream: commit 90001d67be2fa2acbe3510d1f64fa6533efa30ef Author: Peter Zijlstra <peterz@infradead.org> Date: Mon Jul 31 17:50:05 2017 +0200 sched/fair: Fix wake_affine() for !NUMA_BALANCING Needs another fix which is pending. Commit-ID: a731ebe6f17bff9e7ca12ef227f9da4d5bdf8425 Gitweb: http://git.kernel.org/tip/a731ebe6f17bff9e7ca12ef227f9da4d5bdf8425 Author: Peter Zijlstra <peterz@infradead.org> AuthorDate: Wed, 6 Sep 2017 12:51:31 +0200 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Thu, 7 Sep 2017 09:29:31 +0200 sched/fair: Fix wake_affine_llc() balancing rules Chris Wilson reported that the SMT balance rules got the +1 on the wrong side, resulting in a bias towards the current LLC; which the load-balancer would then try and undo. Reported-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Fixes: 90001d67be2f ("sched/fair: Fix wake_affine() for !NUMA_BALANCING") Link: http://lkml.kernel.org/r/20170906105131.gqjmaextmn3u6tj2@hirez.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org> There have been *two*, equally large BXT improvements to SynMark Multithread performance. First one, which already more than fixed the earlier regression was between: 650ffdaee4 at 2017-08-24 14:30:13 UTC 4823f19b97 at 2017-08-25 15:55:28 UTC and the second was between: 587d276952 at 2017-09-06 17:54:04 UTC 2300dd9c44 at 2017-09-07 17:00:01 UTC I assume latter to be the above referred commit. That actually improved Multithread performance slightly also on several other platforms. While I can now verify this, I'm wondering a bit about the first commit... Thanks! Closing old verified. |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.