Kernel: (drm-intel-next) 2d7b8366ae4a9ec2183c30e432a4a9a495c82bcd
Bug detailed description:
On our Pineview platform, 2D performance drops by 40%~55% when compiz is enabled. I tested with x11perf:
x11perf -aa10text: 1070k(no compiz) 460k(with compiz)
x11perf -rgb10text: 699k(no compiz) 450k(with compiz)
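As a rough cross-check of the two results above, the relative slowdown can be worked out directly. This is only a minimal sketch: the helper name relative_drop is made up for illustration, and the rates are simply the x11perf figures quoted above.

```python
def relative_drop(baseline, composited):
    """Fraction of throughput lost when the compositor is enabled."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - composited) / baseline


# Rates as reported by x11perf above: (no compiz, with compiz).
results = {
    "aa10text": (1_070_000, 460_000),
    "rgb10text": (699_000, 450_000),
}

for test, (no_compiz, with_compiz) in results.items():
    drop = relative_drop(no_compiz, with_compiz)
    print(f"{test}: {drop:.1%} slower with compiz")
# aa10text: 57.0% slower with compiz
# rgb10text: 35.6% slower with compiz
```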
2. x11perf -aa10text
450k, you should be happy! ;-)
More seriously, rgb10text should be well over 1Mglyphs/s on PNV on bare X.
Time to learn perf. :)
If you don't already have the tool installed (it should be available in something like a perf or linux-tools package), then go into the kernel source directory and run cd tools/perf && make.
2. sudo perf record -f -g -a x11perf -rgb10text
I think symbol resolution is at report time, so then do sudo perf report > rgb10text.txt and attach that file. Thanks.
Created attachment 39438 [details]
The log file captured by perf while testing x11perf.
I collected some data with perf while running x11perf, once on the GNOME desktop without compiz and once with compiz. The performance data is:
x11perf -aa10text: 1020k (without compiz) 446k (with compiz)
The log captured with perf is in the attachment.
Ok, that doesn't show the processor hotspots I have seen in the past when tuning the glyph performance; the no-compiz profile is in line with what I see here.
Similarly, with compiz there are no true hotspots, so the throughput drop is purely due to the extra rendering latency incurred through the compositor round-trip. There may be some room for improving the batching between the compositor and X, but the only way to truly eliminate the compositor latency is by moving to Wayland [viz. a compositing X server].
I'm currently seeing 800k/1400k aa10text on PineView with and without mutter respectively.
Considering the overheads, I'm lowering my acceptance threshold for mutter/compiz to 40% of raw speed. (There's the damage computation, plus the smaller batches and extra copies which all add up).
The only way to rectify this is to integrate the compositor with X, a story I have heard before.
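For reference, the 40% acceptance threshold mentioned above can be checked against the numbers quoted in this report. This is just a sketch: the names ACCEPTANCE_RATIO and meets_threshold are hypothetical, and the rates are the aa10text figures from the comments above.

```python
ACCEPTANCE_RATIO = 0.40  # composited speed must be at least 40% of raw speed


def meets_threshold(raw, composited, ratio=ACCEPTANCE_RATIO):
    """True if the composited rate is at least `ratio` of the raw rate."""
    return composited >= ratio * raw


# aa10text rates from this report: (raw, composited).
samples = {
    "PineView, mutter": (1_400_000, 800_000),
    "PineView, compiz (reporter)": (1_020_000, 446_000),
}

for name, (raw, composited) in samples.items():
    verdict = "ok" if meets_threshold(raw, composited) else "too slow"
    print(f"{name}: {composited / raw:.0%} of raw speed, {verdict}")
```

Both samples come out above the 40% mark (roughly 57% and 44% of raw speed respectively).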