System Environment:
--------------------------
Platform: Piketon
Libdrm: (master)2.4.24-7-gfd3ed34a2070fca3804baf54ece40d0bc2666226
Mesa: (master)8752824f27c979986ae855667337e89637b005fb
Xserver: (master)xorg-server-1.10.0-125-g03f45df93469f6aef391e97007b9614e0770cc4c
Xf86_video_intel: (master)2.14.901-22-g79e7f4ca3b5f035af6f473b5a53c3fe7d1361089
Cairo: (master)90156f6ab7b94e9e576e34f6a2d8cb809242f8d0
Libva: (master)b7ff2141aeb2adbf5743fed7910a62d971c15013
Kernel: (drm-intel-next) f0c860246472248a534656d6cdbed5a36d1feb2e

Bug detailed description:
--------------------------------------
On gnome-desktop with compiz enabled, 2D performance drops by about 40-50% on Piketon. x11perf results (before -> after the regression):

x11perf -aa10text: 979k -> 427k
x11perf -rgb10text: 699k -> 412k

In particular:
1. It only occurs on Piketon. It works fine on Pineview and Huronriver.
2. It is a kernel regression. Bisect shows d4aeee776017b6da6dcd12f453cd82a3c951a0dc is the first bad commit.

commit d4aeee776017b6da6dcd12f453cd82a3c951a0dc
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Mon Mar 14 15:11:24 2011 +0000

    drm/i915: Disable pagefaults along execbuffer relocation fast path

    Along the fast path for relocation handling, we attempt to copy directly from the user data structures whilst holding our mutex. This causes lockdep to warn about circular lock dependencies if we need to pagefault the user pages.

    [Since when handling a page fault on a mmapped bo, we need to acquire the struct mutex whilst already holding the mm semaphore, it is then verboten to acquire the mm semaphore when already holding the struct mutex. The likelihood of the user passing in the relocations contained in a GTT mmaped bo is low, but conceivable for extreme pathology.]

    In order to force the mm to return EFAULT rather than handle the pagefault, we therefore need to disable pagefaults across the relocation fast path.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: stable@kernel.org
    Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Reproduce steps:
--------------------------------------
1. gnome-session (with compiz)
2. x11perf -aa10text
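For reference, the size of the drop can be worked out from the two x11perf figures quoted in the report. This is only a small sketch: it assumes the first number of each pair is the result before the regression and the second is the result after it, and the helper name `drop_percent` is made up for illustration.

```python
def drop_percent(before, after):
    """Relative drop from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# x11perf rates from the report, in thousands (979k -> 427k, 699k -> 412k).
results = {
    "aa10text": (979, 427),
    "rgb10text": (699, 412),
}

for test, (before, after) in results.items():
    print(f"{test}: {drop_percent(before, after):.1f}% drop")
```

By this arithmetic the rgb10text drop is about 41% and the aa10text drop about 56%, so aa10text sits slightly above the "40-50%" range given in the description.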
Lacking a piketon, I can't investigate immediately. In the meantime, can you please gather a couple of cpu profiles using:

$ sudo perf record -f -g -a x11perf -aa10text -d :0

for the bad commit and without. As the fix is a correctness/stability issue we can't simply revert it, but must instead understand why it impacts performance so and thence take mitigating steps.
Created attachment 44980
The perf.data for the bad commit f0c8602464
Created attachment 44981
The perf.data for the good commit

The "good" build is the bad commit f0c860 with the first bad commit d4aeee reverted (git revert).
*grin* And now can you do "perf report -i perf.data.[good|bad] | head -150" :) You will need the sources for each kernel build available (iirc).
perf does report a big spike in kernel time, so it should provide some useful information once we marry it to some symbols. head -150 might not be enough. Try head -1500 instead.
Created attachment 44983
perf report output for perf.data.bad, piped through head -1500
Created attachment 44985
perf report output for perf.data.good, piped through head -1500
Bah, bad symbol data. But my interpretation is that it is indeed just due to hitting a page-fault and having to fall back to the relocation slow-path - note the appearance of vmalloc/rb_next in the profile. I have curbed the number of relocations required by the ddx, but I know where I can find many more...
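To make the suspected mechanism concrete, here is a toy model of the fast-path/slow-path split described in the commit message. It is not the i915 code: the real implementation is kernel C, and every name below (`UserPage`, `copy_atomic`, `copy_may_fault`, `relocate`) is hypothetical. It only shows the control flow: a copy done with pagefaults disabled fails with EFAULT on a non-resident page, and the caller then falls back to a path that is allowed to fault.

```python
from dataclasses import dataclass


class Efault(Exception):
    """Stands in for -EFAULT from a copy done with pagefaults disabled."""


@dataclass
class UserPage:
    """Hypothetical stand-in for a user page holding relocation entries."""
    relocs: list
    resident: bool = True


def copy_atomic(page):
    # Pagefaults disabled: a non-resident page cannot be faulted in, so fail.
    if not page.resident:
        raise Efault()
    return list(page.relocs)


def copy_may_fault(page):
    # Pagefaults allowed: the fault handler brings the page in first.
    page.resident = True
    return list(page.relocs)


def relocate(pages, stats):
    try:
        # Fast path: copy straight from the user pages, faults disabled.
        entries = [r for p in pages for r in copy_atomic(p)]
        stats["fast"] += 1
    except Efault:
        # Slow path: redo the whole copy with faults allowed.
        entries = [r for p in pages for r in copy_may_fault(p)]
        stats["slow"] += 1
    return entries


stats = {"fast": 0, "slow": 0}
pages = [UserPage([1, 2]), UserPage([3], resident=False)]
first = relocate(pages, stats)   # one page is not resident: slow path
second = relocate(pages, stats)  # pages are resident now: fast path
print(first, second, stats)
```

The model does not capture cost. In the hypothesis above, the expense comes from the extra work the slow path does (hence vmalloc showing up in the profile), so the more often the fast path hits EFAULT, the larger the slowdown.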
(In reply to comment #8)
> Bah, bad symbol data.

Do you mean you can't open my attachment (id=44983), or something else?
Just that the symbols perf used for the i915.ko addresses are garbage. However, I can guess what they should have been judging by the delta elsewhere.
Created attachment 44988
Retire requests before disabling pagefaults

This is not the ultimate patch, but it may help reduce the number of unnecessary fallbacks.
Created attachment 44989
An ugly hack

This should recover the performance, but is meant only as an absolute last resort.
If it is not clear, please try the first patch first by itself. The second patch is just in case the first is not enough and to prove that the relocation on an active bo is the cause.
*** Bug 35761 has been marked as a duplicate of this bug. ***
Sorry, but I think the bug still exists. Tested on commit f0c8602; x11perf -aa10text results, compared with the good commit (979k):

no patch: 426k
patch (44988) only: 432k
both (44988, 44989): 391k
I just noticed that the bisected patch is on the -backport branch, so I'd make this P1.
No, it is not a P1 bug. The P1 bug is the OOPS that this fixes.
Pushed what should be a couple of mitigating patches to xf86-video-intel. Please compare.
(In reply to comment #18)
> Pushed what should be a couple of mitigating patches to xf86-video-intel.
> Please compare.

With xf86-video-intel 2.14.902-28-g47462f65e90e49e5ffd48c77c4f95255b9573f83, 2D performance is:

rgb10text: 1060k
aa10text: 1380k
(In reply to comment #18)
> Pushed what should be a couple of mitigating patches to xf86-video-intel.
> Please compare.

Comparing commit 5f31025cce with commit 47462f6 in xf86-video-intel:

rgb10text: 522k -> 1060k
aa10text: 552k -> 1380k
2D performance regressed again after we reverted commit 59ed6b05db (git revert) in xf86-video-intel.
Not unexpected, but still :(
We wait for sna.
Fixed in SNA.
Verified it.
Closing old verified.