Setup: * Ubuntu 16.04 * kernel & X server built from Git * Modifiers support enabled (requires git version , or latest X release, which isn't yet in distros) There's a large performance drop visible in many onscreen tests when using graphics stack supporting modifiers. Performance drops on all GEN9+ platforms. Drop is largest in SynMark Fill* tests on eDRAM machines where the drop is 50%. On GT2 machines they drop ~30% and on BXT 20%. This drop is visible both with Unity desktop and when using just plain X server. Smaller performance drop is visible in most of the other SynMark tests. I bisected the regression to following commit: ------------------------------------------------- commit a26693493570a9d0f0fba1be617e01ee7bfff4db Author: Jason Ekstrand <jason.ekstrand@intel.com> AuthorDate: Wed Jun 6 10:24:01 2018 -0700 Commit: Jason Ekstrand <jason.ekstrand@intel.com> CommitDate: Thu Jun 7 11:23:34 2018 -0700 i965/screen: Return false for unsupported formats in query_modifiers Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> ------------------------------------------------- Based on the size of the regressions, and GeomPoint improving with the change (only test which regresses with RBC), I'd say that above commit disables E2E RBC for these tests. Any idea why? Rendering looked fine before this change. On GEN9+ GT2 machines, there's also 3-4% performance drop visible in GfxBench onscreen ALU2 & Tessellation tests from this commit, and maybe 1% in GfxBench Manhattan 3.0, but those regressions are slightly less than half of what the E2E RBC improvement was. Interestingly, this same commit also improved SynMark VSInstancing performance noticeably on GEN7 & GEN8 (which don't support RBC).
SynMark uses EGL with GL tests, whereas most other benchmarks use GLX -> maybe this issue has something to do with EGL?
It took me a while to set up, but I did't see a drop in synmark OglFillPixel on Debian testing for the bisected commit. For SKLGT2, I'm getting a score around 33.
(In reply to Mark Janes from comment #2) > It took me a while to set up, but I did't see a drop in synmark OglFillPixel > on Debian testing for the bisected commit. For SKLGT2, I'm getting a score > around 33. If you were running it on FullHD resolution, you you don't have working E2E RBC setup (and your memory is 2x 2100Mhz DDR4?). You should get 45-50 FPS with E2E RBC. Maybe you didn't enable modifier support in your X server config: -------------------------------- Section "ServerFlags" Option "Debug" "dmabuf_capable" EndSection -------------------------------- ?
And naturally X server & Mesa XCB etc dependencies need to be from April or later, otherwise they're missing things required for X / modifier support being built in. When I was testing modifier support in spring with X & Wayland, I needed to build following from git: - git://anongit.freedesktop.org/xorg/proto/xorgproto - git://anongit.freedesktop.org/xcb/proto - git://anongit.freedesktop.org/xcb/libxcb - git://anongit.freedesktop.org/mesa/drm - git://anongit.freedesktop.org/mesa/mesa - https://github.com/anholt/libepoxy (commit 516b4fb8d0bce) - git://anongit.freedesktop.org/xorg/xserver - git://anongit.freedesktop.org/xorg/app/xkbcomp - git://anongit.freedesktop.org/xorg/driver/xf86-input-libinput After the build, libxcb header needs to match: grep "XCB_DRI3_M[AI][JN]OR_VERSION" $(find $PREFIX -name dri3.h) And X server build: grep \ -e GBM_BO_WITH_MODIFIERS \ -e GLAMOR_HAS_DRM_ATOMIC \ -e GLAMOR_HAS_DRM_MODIFIERS \ -e GLAMOR_HAS_DRM_NAME_FROM_FD_2 \ -e GLAMOR_HAS_EGL_QUERY_DMABUF \ -e GLAMOR_HAS_GBM \ -e GLAMOR_HAS_GBM_LINEAR \ include/dix-config.h Otherwise you haven't built a working modifier supporting setup. (Jason can tell more.)
I must not have a new enough X on debian testing. I added the xorg config with no useful result. I don't doubt Eero's results, I just can't easily confirm them to get traction on this bug. Jason, do you have a properly configured system where you can measure performance for the synmark fill tests?
Jason, Mark, any news on this bug ? Notice that I've extenced the Release Candidates period for 18.2 in order to get room for fixing blockers from bug 107457. If you don't think this can be solved in a reasonable period of time we should target for 18.3 and warn in the release notes about the known reduction in performance. Please, let me know as soon as possible.
I haven't had a system configured for this stuff for quite some time now. I think I still have the scripts so maybe I can cook something up.
Eero is spot on, I can confirm this. OglFillPixel on my Skull Canyon (SKL GT4e) improves from ~46 FPS to ~93 FPS with the following change on top of master (partial revert of the cited commit): diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c index eaf5a3b9feb..19ddf2b4072 100644 --- a/src/mesa/drivers/dri/i965/intel_screen.c +++ b/src/mesa/drivers/dri/i965/intel_screen.c @@ -1333,9 +1333,6 @@ intel_query_dma_buf_modifiers(__DRIscreen *_screen, int fourcc, int max, if (f == NULL) return false; - if (!intel_image_format_is_supported(&screen->devinfo, f)) - return false; - for (i = 0; i < ARRAY_SIZE(supported_modifiers); i++) { uint64_t modifier = supported_modifiers[i].modifier; if (!modifier_is_supported(&screen->devinfo, f, 0, modifier))
FWIW, it's easy to reproduce this on Archlinux - the xserver, xcb, etc. in their [testing] repo is new enough that you don't need to build any of that. I only swapped in my Mesa, and added the xorg.conf.d change Eero suggested in comment #3.
Mesa's DRI3 loader is calling intel_query_dma_buf_modifiers with __DRI_IMAGE_FOURCC_SARGB8888 every time, which Jason's patch rejects, leading to an empty modifier list every time.
FillPixel is pretty optimal case for E2E RBC. It does layered write to onscreen buffer, with a well compressible moving gradient with its performance being fully dictated by the bandwidth used by those writes.
Well, I know how to make the current bug go away but I'm not sure if it would introduce additional bugs. :-( The problem is that we outright reject SRGB formats because there are no corresponding FOURCC formats and so you can't really create a truely SRGB modified surface. However, we need to support it somehow because most apps ask for sRGB and really ought to get it. I'll have to give this a think.
18.2.0-rc5 is tomorrow and by Jason comment it seems solving this bug won't happen soon. Will it be OK with you to go ahead making 18.2.0-rc5 the final for 18.2.0 (unless some other regression happens) and have this eventually fixed in a 18.2.x fix release ?
Jason is looking at this bug today. 50% is a pretty big perf regression, and end-to-end render buffer compression is a big feature for i965. We'd like to get it fixed before release.
I've got three patches on the mailing list. I'm hoping Daniel Stone gets a chance to look at and review them tomorrow and we should be able to merge soon.
https://patchwork.freedesktop.org/series/48842/
(In reply to Mark Janes from comment #14) > Jason is looking at this bug today. 50% is a pretty big perf regression, > and end-to-end render buffer compression is a big feature for i965. We'd > like to get it fixed before release. I will extended yet another week and have a -rc6 before final release. Thanks for the feedback and the big effort to make this regression disappear before 18.2.0.
Fixed by the following commit in master: commit d9cf4308ceea3762c1aab48f9c48e12a72162b5a (public/master) Author: Jason Ekstrand <jason.ekstrand@intel.com> Date: Tue Aug 28 15:25:23 2018 -0500 i965/screen: Allow modifiers on sRGB formats This effectively reverts a26693493570a9d0f0fba1be617e01ee7bfff4db which was a misguided attempt at protecting intel_query_dma_buf_modifiers from invalid formats. Unfortunately, in some internal EGL cases, we can get an SRGB format validly in this function. Rejecting such formats caused us to not allow CCS in some cases where we should have been allowing it. This regressed the performance of some SynMark tests as well as GfxBench ALU2, Tessellation and Manhattan 3.0 tests There's some question of whether or not we really should be using SRGB "fourcc" formats that aren't actually in drm_foucc.h but there's not much harm in allowing them through here. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107223 Fixes: a26693493570 "i965/screen: Return false for unsupported..." Tested-By: Eero Tamminen <eero.t.tamminen@intel.com> Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Thanks! Verified with SynMark tests. (GfxBench ALU2 perf is currently hidden behind bug 107772, and Tessalation behind bug 107706 though.)
FYI: this series seems to have fixed also earlier perf regression (from around May) that was visible on GEN6-GEN8 in (unity/compiz) composited windowed 3D benchmarks, but only with modifier enabled gfxstack.
(In reply to Eero Tamminen from comment #20) > FYI: this series seems to have fixed also earlier perf regression (from > around May) that was visible on GEN6-GEN8 in (unity/compiz) composited > windowed 3D benchmarks, but only with modifier enabled gfxstack. Btw. while it improved a lot of onscreen tests on older GENs too, it dropped perf on older GENs in SynMark ZBuffer & VSInstancing tests which draw a large amount of geometry on top of each other. Apparently caching changed so that (depth?) reads are now more expensive on those older (6-8) GENs?
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.