Summary: | Performance regression in mpv caused by enabling SENDS for surface writes | ||
---|---|---|---|
Product: | Mesa | Reporter: | Nicolas Frattaroli <fdbugs> |
Component: | Drivers/DRI/i965 | Assignee: | Intel 3D Bugs Mailing List <intel-3d-bugs> |
Status: | RESOLVED DUPLICATE | QA Contact: | Intel 3D Bugs Mailing List <intel-3d-bugs> |
Severity: | normal | ||
Priority: | medium | CC: | denys.kostin, jason, kevin.rogovin |
Version: | 19.0 | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
See Also: | https://bugs.freedesktop.org/show_bug.cgi?id=110412 | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: |
compute shader_dumped
itel_debug=cs with bad commit itel_debug=cs without commit itel_debug=cs previous commit |
Description
Nicolas Frattaroli
2019-04-06 03:06:23 UTC
Hi Nicolas, thanks for the report. Could you please clarify how you measured the performance degradation? In my case I could only enable the FPS counter, and I saw these values:
mesa-18.1.5 (system) - 43.* FPS
mesa-19.0.0 (from git, release) - 36.* FPS
Video in use: http://cdn.clipcanvas.com/sample/clipcanvas_14348_ProResHQ_720p50.mov
My monitor is a Dell (system resolution 1920x1080).
To enable the FPS counter I removed --no-config and added this to the mpv.conf file:
>osd-msg1="FPS: ${estimated-display-fps}"
Path of the config file:
>~/.config/mpv/mpv.conf
To summarize, I think this FPS drop could count as a degradation, but I want to clarify what you meant.

Oh, btw, the mpv version is:
den@den-HP-ZBook-14u-G4:~/repositories/mesa$ mpv --version
mpv git-2019-02-24-5370069 Copyright © 2000-2018 mpv/MPlayer/mplayer2 projects
built on Sun Mar 10 00:58:59 UTC 2019
ffmpeg library versions:
libavutil 56.26.100
libavcodec 58.47.102
libavformat 58.26.101
libswscale 5.4.100
libavfilter 7.48.100
libswresample 3.4.100
ffmpeg version: git-2019-02-27-4571c7c
Taken from this PPA: https://launchpad.net/~mc3man/+archive/ubuntu/mpv-tests

You can see the dropped frames in mpv's terminal status line, or by hitting I (capital i) to show the stats overlay, which will show you dropped frames. On 0.18.x I get next to no dropped frames, on 0.19.x I get many dropped frames. --video-sync=display-resample also seems to play a role, as on X11 it will make mpv drive the display loop at the monitor's refresh rate, which amplifies the effect as it will essentially attempt to redraw the same frame or render a new frame 60 times a second.

Are you talking about exactly the "Dropped" frames printed after the video finishes? Here are my values:
mesa-18.1.5 - Dropped: 118
mesa-19.0.0 - Dropped: 158
mesa-18.3.2 - Dropped: 125
Apparently the number increased, but I didn't see "0" on the old Mesa versions either. I am running all tests under the same conditions (2 browsers, YouTube, and a lot of other apps open).
Tomorrow I will try to run them on as clean a machine as possible, without extra processes and apps. And maybe I will try to find a fresher PPA with mpv (I couldn't build it from source because of missing ffmpeg dependencies, and I was asked to build those from source). Or maybe on Manjaro, where all the needed packages should be installed.
BTW my GPU is the same as yours (KBL, HD 620).
kernel 4.20.6
Ubuntu 18.04 with Unity (on X)

In 18.3, I get 7 dropped frames with the test clip you linked
mpv --no-config --profile=gpu-hq --scale=ewa_lanczossharp --video-sync=display-resample --fullscreen clipcanvas_14348_ProResHQ_720p50.mov
and 1 dropped frame in 18.3 if I reencode it to H.264 to take the ProRes decoder out of the equation
ffmpeg -i clipcanvas_14348_ProResHQ_720p50.mov -c:v libx264 -preset veryslow -profile:v high422 -level 4.2 -crf 19 -c:a libopus testclip.mkv
mpv --no-config --profile=gpu-hq --scale=ewa_lanczossharp --video-sync=display-resample --fullscreen testclip.mkv
Maybe your GPU is stuck in a lower power state.

Hi again. If drops are the main indicator of the degradation for now, I think the information below may help. In my case (compared with yours, Nicolas), I had about 50-60 drops on old Mesa versions and 120+ drops on Mesa 19+. So I decided to bisect between them.
Providing the full bisect log with the drops on each commit:
den@den-HP-ZBook-14u-G4:~/repositories/mesa$ git bisect log
git bisect start
good: [190a79f462710f04d67eaefe498ef6ae5b7f5b1a] docs: add release notes for 18.3.3
git bisect good 190a79f462710f04d67eaefe498ef6ae5b7f5b1a
Dropped: 52
bad: [5925a5725831b22a92f4597388d1081126d8bc91] docs: Add release notes for 19.0.0
git bisect bad 5925a5725831b22a92f4597388d1081126d8bc91
Dropped: 130
good: [1f41104b9bab50652050bf4524f2b9f371f7ca9b] meson: don't install translation files
git bisect good 1f41104b9bab50652050bf4524f2b9f371f7ca9b
Dropped: 52
good: [e890aaabed777e7c7736a519e94aef648081bd1d] travis: meson: add unwind handling
git bisect good e890aaabed777e7c7736a519e94aef648081bd1d
Dropped: 62
good: [5486c9d526f393eff4b189e0e0a44eafeedf4407] freedreno/a6xx: Turn on texture tiling by default
git bisect good 5486c9d526f393eff4b189e0e0a44eafeedf4407
Dropped: 64
good: [41a0acd6a149ec9f47ea527ad08a2b29bf1ee6b2] Switch imx to kmsro and remove the imx winsys
git bisect good 41a0acd6a149ec9f47ea527ad08a2b29bf1ee6b2
Dropped: 52
bad: [fb3485bc9248a12f47b07b593f0a81d58cbb3155] gallium/u_threaded: fix EXPLICIT_FLUSH for flush offsets > 0
git bisect bad fb3485bc9248a12f47b07b593f0a81d58cbb3155
Dropped: 111
bad: [82365595e9b4d947f1bdeec2b2eff1cdb226de5a] automake: Add float64.glsl to dist tarball
git bisect bad 82365595e9b4d947f1bdeec2b2eff1cdb226de5a
Dropped: 112
good: [7f1cf046cd1fb8a3af0e24b622179e4adb398764] intel/fs: Add a generic SEND opcode
git bisect good 7f1cf046cd1fb8a3af0e24b622179e4adb398764
Dropped: 51
good: [014edff0d20d52191570a4cb125c37b63955d664] intel/fs: Add interference between SENDS sources
git bisect good 014edff0d20d52191570a4cb125c37b63955d664
Dropped: 51
bad: [bcefa0f1cb99229b6dc241ff50b2c88da1dad950] freedreno: fix invalidate logic
git bisect bad bcefa0f1cb99229b6dc241ff50b2c88da1dad950
Dropped: 115
bad: [820dfcea431e4f96f25e6b340edd9cd1e449158b] egl/wayland-drm: Only announce formats via wl_drm which the driver supports.
git bisect bad 820dfcea431e4f96f25e6b340edd9cd1e449158b
Dropped: 117
bad: [a34b0d68bbf8571e4d858cf4e1176766a50364de] egl/wayland: Allow client->server format conversion for PRIME offload. (v2)
git bisect bad a34b0d68bbf8571e4d858cf4e1176766a50364de
Dropped: 118
bad: [a920979d4f30a48a23f8ff375ce05fa8a947dd96] intel/fs: Use split sends for surface writes on gen9+
git bisect bad a920979d4f30a48a23f8ff375ce05fa8a947dd96
Dropped: 117
first bad commit: [a920979d4f30a48a23f8ff375ce05fa8a947dd96] intel/fs: Use split sends for surface writes on gen9+
____
commit a920979d4f30a48a23f8ff375ce05fa8a947dd96
Author: Jason Ekstrand <jason.ekstrand@intel.com>
Date: Fri Nov 16 10:46:27 2018 -0600

intel/fs: Use split sends for surface writes on gen9+

Surface reads don't need them because they just have the one address payload. With surface writes, on the other hand, we can put the address and the data in the different halves and avoid building the payload all together. The decrease in register pressure and added freedom in register allocation resulting from this change reduces spilling enough to improve the performance of one customer benchmark by about 2x.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>

Created attachment 143912 [details]
compute shader_dumped
Jason, could you please take a look at it? In the attachments you will find the compute shader that might be causing the regression, and 2 results of its compilation (with the bad commit and without it).
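The bisect above was judged by reading mpv's "Dropped:" count by hand at each step. A minimal sketch of how that check could be automated for `git bisect run`, assuming the captured mpv terminal output contains "Dropped: N" as quoted in this thread. The function names and the threshold of 90 are hypothetical; the threshold is only picked to sit between the good runs (51-64 drops) and the bad runs (111-130 drops) in the log above.

```python
import re

# Hypothetical cut-off between "good" and "bad" runs. In the bisect log above,
# good commits dropped 51-64 frames and bad commits dropped 111-130.
DROP_THRESHOLD = 90


def last_dropped_count(output):
    """Return the last 'Dropped: N' value in captured mpv output, or None."""
    matches = re.findall(r"Dropped:\s*(\d+)", output)
    return int(matches[-1]) if matches else None


def bisect_exit_code(output, threshold=DROP_THRESHOLD):
    """Map one captured mpv run to an exit code understood by `git bisect run`.

    0 marks the commit good, 1 marks it bad, 125 asks git to skip the commit.
    """
    dropped = last_dropped_count(output)
    if dropped is None:
        return 125  # no counter found, so this run cannot be judged
    return 1 if dropped > threshold else 0
```

A wrapper script that rebuilds Mesa, plays the test clip, and exits with `bisect_exit_code(captured_output)` could then be passed to `git bisect run`. Dropped-frame counts are noisy on a busy desktop, so repeating the run for each commit and taking the median would be a safer variant.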
Created attachment 143913 [details]
itel_debug=cs with bad commit
Created attachment 143914 [details]
itel_debug=cs without commit
Created attachment 143915 [details]
itel_debug=cs previous commit
Re-uploaded the log file for the commit right before the "bad" one.
git-014edff0d2
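The attached dumps for the commits before and after a920979d4f30 can also be compared mechanically. A minimal sketch, assuming both dumps have been saved as plain text; it makes no assumption about the dump format and only counts how many lines differ. The function name `dump_delta` is hypothetical.

```python
import difflib


def dump_delta(before, after):
    """Count lines removed from `before` and lines added in `after`."""
    a, b = before.splitlines(), after.splitlines()
    removed = added = 0
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed += i2 - i1  # lines present only in the old dump
        if tag in ("replace", "insert"):
            added += j2 - j1    # lines present only in the new dump
    return removed, added
```

A large delta in a shader whose source did not change suggests that instruction order or register assignment moved, which is a reason to read the two dumps side by side.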
It's pretty clear what's going on here. The change in a920979d4f30 caused RA to either succeed or fail differently with respect to scheduling, so the scheduling algorithm changed and the new scheduling is utterly horrible compared to the old one. In other words, our scheduler sucks. Unfortunately, this isn't news.... We've got a new scheduler in the works (in theory) which will hopefully degrade more gracefully. In the meantime for this particular bug, one could look into why it's failing (or succeeding; I don't know) to register allocate with a920979d4f30 and maybe try to improve it. It's entirely possible, however, that what *should* be an improvement in RA is causing worse performance due to the terrible scheduler.

More specifically, it's post-RA scheduling that's blowing up. The shader register allocates on the first try in both cases. RA must now be creating a more restrictive allocation which prevents post-RA scheduling from being able to schedule nicely and/or gives more freedom and post-RA scheduling makes a hash of things.

*** Bug 110412 has been marked as a duplicate of this bug. ***

*** This bug has been marked as a duplicate of bug 109517 ***
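The remark above that a more restrictive allocation can limit post-RA scheduling can be illustrated with a toy model. This is a sketch under stated assumptions and not Mesa's register allocator or scheduler: instructions are hypothetical `(dst, srcs)` pairs, and the dependency check is conservative (it ignores the fact that an intervening write can kill a dependency). When two independent chains reuse one physical register, anti- and output dependencies appear, and those pin the instruction order.

```python
from itertools import combinations


def ordering_constraints(program):
    """Return the (i, j) pairs, i < j, that a scheduler must keep in order."""
    constrained = set()
    for (i, (dst_i, srcs_i)), (j, (dst_j, srcs_j)) in combinations(enumerate(program), 2):
        raw = dst_i in srcs_j   # true dependency: j reads what i wrote
        war = dst_j in srcs_i   # anti-dependency: j overwrites what i reads
        waw = dst_i == dst_j    # output dependency: both write the same register
        if raw or war or waw:
            constrained.add((i, j))
    return constrained


def rename(program, mapping):
    """Apply a virtual-to-physical register mapping to every instruction."""
    return [(mapping[dst], tuple(mapping[s] for s in srcs)) for dst, srcs in program]


# Two independent load -> multiply chains on virtual registers.
virtual = [("v0", ()), ("v1", ("v0",)), ("v2", ()), ("v3", ("v2",))]

# Hypothetical allocations: one register per value, and one that reuses r0.
loose = rename(virtual, {"v0": "r0", "v1": "r1", "v2": "r2", "v3": "r3"})
tight = rename(virtual, {"v0": "r0", "v1": "r1", "v2": "r0", "v3": "r2"})

print(len(ordering_constraints(loose)))  # 2
print(len(ordering_constraints(tight)))  # 5
```

With the loose mapping the second load can be hoisted above the first multiply to hide latency. With the tight mapping the pair (1, 2) becomes an anti-dependency on r0, so a post-RA scheduler has to keep the original order.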