Created attachment 28410
xorg.0.log

System Environment:
----------------------
Platform: G41
Arch: x86_64
OSD: Fedora release 9 (Sulphur)
Libdrm: (master)5a73f066ba149816cc0fc2de4b97ec4714cf8ebc
Mesa: (master)03607708b0499816291f0fb0d1c331fbf034f0ba
Xserver: (master)a85523dc50f392a33a1c00302a0946828bc9249d
Xf86_video_intel: (master)50e2a6734de43a135aa91cd6e6fb5147e15ce315
Kernel: (drm-intel-next)0c2e39525b3b53a97a0202c5f35058147e53977e

Bug Description:
---------------------
Testing with cairo-perf on G41, I found a regression with swfdec-fill-rate-2xaa.trace and swfdec-fill-rate-4xaa.trace. They may be the same issue. The performance I get with the newest code of 20090729 is 3 times slower than with our Q2 release code. The regression is caused by the xserver: if I change only the xserver from master to the 1.6 branch, it performs much better.

With the code of 20090729:

swfdec-fill-rate-2xaa.trace
[ # ]  backend  test                   min(s)   median(s)  stddev.  count
[  0]  image    swfdec-fill-rate-2xaa  36.754   36.769     0.02%    5/6
[  0]  xlib     swfdec-fill-rate-2xaa  184.982  194.616    2.23%    6/6

swfdec-fill-rate-4xaa.trace
[ # ]  backend  test                   min(s)   median(s)  stddev.  count
[  0]  image    swfdec-fill-rate-4xaa  135.698  135.709    0.01%    6/6
[  0]  xlib     swfdec-fill-rate-4xaa  743.517  744.167    0.05%    6/6

With only the xserver changed to server-1.6 branch (606f6dba16d42e3546a82a386d5a01087467b511):

swfdec-fill-rate-2xaa.trace.KMS
[ # ]  backend  test                   min(s)   median(s)  stddev.  count
[  0]  image    swfdec-fill-rate-2xaa  36.642   36.643     0.00%    4/6
[  0]  xlib     swfdec-fill-rate-2xaa  51.859   51.883     0.06%    5/6

swfdec-fill-rate-4xaa.trace.KMS
[ # ]  backend  test                   min(s)   median(s)  stddev.  count
[  0]  image    swfdec-fill-rate-4xaa  135.807  135.948    0.30%    6/6
[  0]  xlib     swfdec-fill-rate-4xaa  199.334  199.804    0.12%    6/6

Reproduce Steps:
---------------------
1. xinit&
2. cairo-perf-trace swfdec-fill-rate-2xaa.trace (or swfdec-fill-rate-4xaa.trace)
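The size of the slowdown can be read directly off the cairo-perf-trace rows in this report. As a minimal sketch, the following parses result rows of that shape and compares the median times of the two X server builds. The names `ROW`, `parse_rows` and `median_ratio` are illustrative helpers, not part of cairo-perf, and the regular expression only assumes the row layout quoted in this comment.

```python
import re

# One result row of cairo-perf-trace output as quoted in this report, e.g.
# "[ 0] xlib swfdec-fill-rate-2xaa 184.982 194.616 2.23% 6/6".
# The header row "[ # ] backend test ..." does not match, since "#" is not a digit.
ROW = re.compile(
    r"\[\s*\d+\]\s+(\S+)\s+(\S+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)%\s+(\d+)/(\d+)"
)

def parse_rows(text):
    """Return {(backend, test): median_seconds} for every result row in text."""
    return {(m.group(1), m.group(2)): float(m.group(4)) for m in ROW.finditer(text)}

def median_ratio(slow, fast, key):
    """How many times slower `slow` is than `fast` for one (backend, test) key."""
    return slow[key] / fast[key]

# Rows copied from the report: xserver master ...
MASTER = """
[ 0] image swfdec-fill-rate-2xaa 36.754 36.769 0.02% 5/6
[ 0] xlib swfdec-fill-rate-2xaa 184.982 194.616 2.23% 6/6
[ 0] image swfdec-fill-rate-4xaa 135.698 135.709 0.01% 6/6
[ 0] xlib swfdec-fill-rate-4xaa 743.517 744.167 0.05% 6/6
"""

# ... and xserver server-1.6 branch.
SERVER_1_6 = """
[ 0] image swfdec-fill-rate-2xaa 36.642 36.643 0.00% 4/6
[ 0] xlib swfdec-fill-rate-2xaa 51.859 51.883 0.06% 5/6
[ 0] image swfdec-fill-rate-4xaa 135.807 135.948 0.30% 6/6
[ 0] xlib swfdec-fill-rate-4xaa 199.334 199.804 0.12% 6/6
"""

master = parse_rows(MASTER)
branch = parse_rows(SERVER_1_6)
for key in sorted(master):
    print(key, f"{median_ratio(master, branch, key):.2f}x")
```

On these numbers the image backend is unchanged between the two servers, while both xlib traces come out a little under 4 times slower on master, which is consistent with the two traces showing the same issue.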
(In reply to comment #0)
> System Environment:
> ----------------------
> Platform: G41
> Arch: x86_64
> OSD: Fedora release 9 (Sulphur)
> Libdrm: (master)5a73f066ba149816cc0fc2de4b97ec4714cf8ebc
> Mesa: (master)03607708b0499816291f0fb0d1c331fbf034f0ba
> Xserver: (master)a85523dc50f392a33a1c00302a0946828bc9249d
> Xf86_video_intel: (master)50e2a6734de43a135aa91cd6e6fb5147e15ce315
> Kernel: (drm-intel-next)0c2e39525b3b53a97a0202c5f35058147e53977e

Thanks for the bug report.

The details above showing the versions at which the regression first appeared are much appreciated. Thanks!

What's missing is the previously tested versions at which things were last seen to be working. From a separate report, I believe these are the working versions:

Last known versions without regression
--------------------------------------
Libdrm: (master)30449829c0347dc7dbe29acb13e49e2f2cb72ae9
Mesa: (master)506bacb8e40b0a170a4b620113506925d2333735
Xserver: (master)b1c3dc6ae226db178420e3b5f297b94afc87c94c
Xf86_video_intel: (master)50e2a6734de43a135aa91cd6e6fb5147e15ce315
Kernel_unstable: (drm-intel-next)2a2430f4542467502d39660bfd66b0004fd8d6a9

Let me know if I didn't get those right.

-Carl
(In reply to comment #1)
> (In reply to comment #0)
> > System Environment:
> > ----------------------
> > Platform: G41
> > Arch: x86_64
> > OSD: Fedora release 9 (Sulphur)
> > Libdrm: (master)5a73f066ba149816cc0fc2de4b97ec4714cf8ebc
> > Mesa: (master)03607708b0499816291f0fb0d1c331fbf034f0ba
> > Xserver: (master)a85523dc50f392a33a1c00302a0946828bc9249d
> > Xf86_video_intel: (master)50e2a6734de43a135aa91cd6e6fb5147e15ce315
> > Kernel: (drm-intel-next)0c2e39525b3b53a97a0202c5f35058147e53977e
> Thanks for the bug report.
> The details above showing the versions at which the regression
> first appeared are much appreciated. Thanks!
> What's missing is the previously tested versions at which things were
> last seen to be working. From a separate report, I believe these are
> the working versions:
> Last known versions without regression
> --------------------------------------
> Libdrm: (master)30449829c0347dc7dbe29acb13e49e2f2cb72ae9
> Mesa: (master)506bacb8e40b0a170a4b620113506925d2333735
> Xserver: (master)b1c3dc6ae226db178420e3b5f297b94afc87c94c
> Xf86_video_intel: (master)50e2a6734de43a135aa91cd6e6fb5147e15ce315
> Kernel_unstable: (drm-intel-next)2a2430f4542467502d39660bfd66b0004fd8d6a9
> Let me know if I didn't get those right.
> -Carl

No. Carl, maybe you can pay more attention to my bug description. :)

I first found this regression with the code of 20090729 compared with our Q2 release. I then found that it is caused by the xserver: if I change only 20090729's xserver from master to server-1.6-branch, it works well. You can just take the commits of 20090729 and compare the server on the master branch with the server on server-1.6-branch.

The code of 20090729 is:
Libdrm: (master)5a73f066ba149816cc0fc2de4b97ec4714cf8ebc
Mesa: (master)03607708b0499816291f0fb0d1c331fbf034f0ba
Xserver: (master)a85523dc50f392a33a1c00302a0946828bc9249d (bad)
Xf86_video_intel: (master)50e2a6734de43a135aa91cd6e6fb5147e15ce315
Kernel: (drm-intel-next)2a2430f4542467502d39660bfd66b0004fd8d6a9

Xserver: (server-1.6-branch) 606f6dba16d42e3546a82a386d5a01087467b511 (good)
(In reply to comment #2)
> No. Carl, maybe you can pay more attention to my bug description. :)

Yes, clearly I need to do that.

To help me avoid mistakes in the future, it would still be helpful to have both "before" and "after" git commit IDs for any regressions identified. Thanks!

-Carl
Here are the results of my attempt to reproduce this:

System environment
------------------
Platform: GM965 (Lenovo Thinkpad x200s)
Arch: x86
OSD: Debian unstable
xf86-video-intel: master: b8c5c996e888485c3a16d645c8490592534a7882
cairo: master: 56c9b2de7a2b93b2e0c59cf98326d8c0d4d508ba
cairo-traces: master: b889dfc97c585d737b1b6ab139c0dbcd1ef01cf4

I tested with cairo-perf-trace. I first trimmed down the test cases of interest with:

./csi-trace --trim=10 < full/swfdec-fill-rate-2xaa.trace > benchmark/swfdec-fill-rate-2xaa.trace
./csi-trace --trim=10 < full/swfdec-fill-rate-4xaa.trace > benchmark/swfdec-fill-rate-4xaa.trace

That makes each take only about 10 seconds on the image backend, which just makes it faster to go through many runs quickly.

I then tested two X server versions (master and 1.6) with the following results:

xserver master (b8c5c996e888485c3a16d645c8490592534a7882)
---------------------------------------------------------
$ CAIRO_TEST_TARGET="image,xlib" ./cairo-perf-trace -i 3 ./cairo-traces/benchmark/swfdec-fill-rate-2xaa.trace ./cairo-traces/benchmark/swfdec-fill-rate-4xaa.trace
[ # ]  backend  test                   min(s)  median(s)  stddev.  count
[  0]  image    swfdec-fill-rate-2xaa  18.682  18.683     0.13%    3/3
[  1]  image    swfdec-fill-rate-4xaa  19.388  19.396     0.02%    3/3
[  0]  xlib     swfdec-fill-rate-2xaa  33.758  34.072     2.91%    3/3
[  1]  xlib     swfdec-fill-rate-4xaa  37.228  37.324     0.28%    3/3

xserver 1.6 (606f6dba16d42e3546a82a386d5a01087467b511)
------------------------------------------------------
$ CAIRO_TEST_TARGET="image,xlib" ./cairo-perf-trace -i 3 ./cairo-traces/benchmark/swfdec-fill-rate-2xaa.trace ./cairo-traces/benchmark/swfdec-fill-rate-4xaa.trace
[ # ]  backend  test                   min(s)  median(s)  stddev.  count
[  0]  image    swfdec-fill-rate-2xaa  18.569  18.639     0.19%    3/3
[  1]  image    swfdec-fill-rate-4xaa  19.232  19.238     0.03%    3/3
[  0]  xlib     swfdec-fill-rate-2xaa  20.165  20.168     0.44%    3/3
[  1]  xlib     swfdec-fill-rate-4xaa  24.654  24.816     0.43%    3/3

So on this system I have reproduced a slowdown with the current master X server (though not quite as dramatic as the 4x of the original bug report). The smaller slowdown could be due to a different CPU speed, the trimming, etc.

I'll bisect the xserver next to identify a commit introducing the performance regression.

-Carl
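The bisection mentioned above is a binary search over the commits between a known-good and a known-bad revision. A minimal sketch of the idea follows, assuming a linear history in which one commit introduces the slowdown and every later commit stays slow. `first_bad` and the `is_bad` callback are illustrative names and the commit labels are made up. In practice `git bisect` drives the search, and the check would be an X server build followed by a cairo-perf-trace run.

```python
def first_bad(commits, is_bad):
    """Find the first bad commit by binary search.

    commits: list ordered oldest to newest, where commits[0] is known good
             and commits[-1] is known bad.
    is_bad:  callable returning True if the given commit shows the regression.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] is good, commits[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return commits[hi]


# Hypothetical history of eight commits in which "c5" introduces the slowdown.
history = ["c0", "c1", "c2", "c3", "c4", "c5", "c6", "c7"]
tested = []

def slow(commit):
    tested.append(commit)  # record which commits had to be built and timed
    return history.index(commit) >= 5

culprit = first_bad(history, slow)
print(culprit, "found after testing", tested)
```

The number of builds grows with the logarithm of the range, which is why trimming the traces to make each run short matters less than keeping the good and bad endpoints reliable.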
(In reply to comment #3)
> (In reply to comment #2)
> > No. Carl, maybe you can pay more attention to my bug description. :)
> Yes, clearly I need to do that.
> To help me avoid mistakes in the future, it would still be helpful to have both
> "before" and "after" git commit IDs for any regressions identified. Thanks!
> -Carl

OK. I will list both "before" and "after" git commit IDs. :)
I bisected this change through the X server and found that the commit causing the performance regression was simply the commit changing the version number of the X server.

The issue is that cairo queries the X server and changes its behavior depending on the X server version. So in one sense, this isn't a driver bug at all, since it's the application that is actually doing something different.

But Chris Wilson took a different approach and said, "But still, with the new X server cairo is doing what it *should* have been doing all along (and simply wasn't doing to avoid X server bugs). So why is it actually slower?"

Chris then answered this with the following commit:

commit 57fc09cef28bad2e3e8455b93ef2927118f8a3a3
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Sun Sep 20 01:02:39 2009 +0100

    Avoid fallbacks for a1 src/mask

    Carl Worth did the hard work in identifying that the regression in cairo
    between X.org 1.6 and 1.7 was caused by cairo sending an a1 mask to the
    server in 1.7 whereas in 1.6 cairo used local fallbacks (as the source was
    using RepeatPad, which triggers cairo's 'buggy_pad_reflect' fallback for
    X.org 1.6). This was causing the driver to do a fallback to handle the a1
    mask instead, which due to the GPU pipeline stall is much more expensive
    than the equivalent fallback in cairo.

    Reference:
    cairo's performance downgrades 4X with server master than server-1.6.
    https://bugs.freedesktop.org/show_bug.cgi?id=23184

    The fix is a relatively simple extension of the current
    uxa_picture_from_pixman_image() to use CompositePicture() instead of
    CopyArea() when we need to convert to a new format.

    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
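The mechanism described in the commit message can be summarised as a small decision model. This is an illustrative sketch only: every function and constant name is hypothetical, the version check is simplified to a (major, minor) tuple, and none of it is cairo's or the driver's actual code. It only shows why a change of the reported server version alone moved the work from a cheap client-side fallback to an expensive driver fallback, and what the fix changes.

```python
# Hypothetical labels for the three outcomes discussed in this bug.
LOCAL_FALLBACK = "cairo renders locally (client-side fallback)"
DRIVER_FALLBACK = "driver falls back on the a1 mask (GPU pipeline stall)"
NO_FALLBACK = "driver converts the a1 picture and avoids the fallback"

def render_path(server_version, source_uses_repeat_pad, driver_handles_a1):
    """Model which side ends up doing the expensive work for one composite.

    server_version:          (major, minor) reported by the X server.
    source_uses_repeat_pad:  True if the source picture uses RepeatPad.
    driver_handles_a1:       True once the driver includes the fix above.
    """
    # Assumption taken from the commit message: for X.org 1.6 cairo applies
    # its 'buggy_pad_reflect' workaround and does not send such sources.
    buggy_pad_reflect = server_version < (1, 7)
    if source_uses_repeat_pad and buggy_pad_reflect:
        return LOCAL_FALLBACK
    # Otherwise cairo sends the a1 mask to the server.
    if driver_handles_a1:
        return NO_FALLBACK
    return DRIVER_FALLBACK

print(render_path((1, 6), True, False))  # server-1.6 branch, unfixed driver
print(render_path((1, 7), True, False))  # master, unfixed driver: the regression
print(render_path((1, 7), True, True))   # master, driver with the fix
```

In this model the middle case is the only one that stalls the GPU, which matches the bisection landing on the version-number commit rather than on any rendering change in the server.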
With the newest code, the xlib performance of swfdec-fill-rate-2xaa.trace and swfdec-fill-rate-4xaa.trace improves by an amazing 10X~15X, from about 5X slower than the image backend to 2X faster than the image backend now. So verified.

The data with 0917's code:

swfdec-fill-rate-2xaa.trace
[ # ]  backend  test                   min(s)   median(s)  stddev.  count
[  0]  image    swfdec-fill-rate-2xaa  26.087   26.137     0.12%    6/6
[  0]  xlib     swfdec-fill-rate-2xaa  125.475  125.515    0.03%    6/6

swfdec-fill-rate-4xaa.trace
[ # ]  backend  test                   min(s)   median(s)  stddev.  count
[  0]  image    swfdec-fill-rate-4xaa  93.333   93.498     0.08%    6/6
[  0]  xlib     swfdec-fill-rate-4xaa  501.693  501.919    0.04%    6/6

The data with 0924's code:

swfdec-fill-rate-2xaa.trace
[ # ]  backend  test                   min(s)   median(s)  stddev.  count
[  0]  image    swfdec-fill-rate-2xaa  26.122   26.129     0.08%    5/6
[  0]  xlib     swfdec-fill-rate-2xaa  10.566   10.687     0.68%    6/6

swfdec-fill-rate-4xaa.trace
[ # ]  backend  test                   min(s)   median(s)  stddev.  count
[  0]  image    swfdec-fill-rate-4xaa  93.288   93.325     0.02%    5/6
[  0]  xlib     swfdec-fill-rate-4xaa  34.080   34.162     0.16%    5/6