ChromeOS uses glCopyTexSubImage2D to copy an immutable RGBA8 texture to a regular RGB texture, and it is very slow:
https://cs.chromium.org/chromium/src/gpu/command_buffer/service/gles2_cmd_apply_framebuffer_attachment_cmaa_intel.cc?sq=package:chromium&rcl=1477653611&l=552

Copying with glDrawArrays instead is significantly faster. ApplyFramebufferAttachmentCMAAINTELResourceManager::CopyTexture() in the following patch shows how to use glDrawArrays instead of glCopyTexSubImage2D:
https://codereview.chromium.org/2460973002/diff/1/gpu/command_buffer/service/gles2_cmd_apply_framebuffer_attachment_cmaa_intel.cc

I measured FPS on WebGL using the Chrome browser on various platforms. Intel Mesa is always slower when using glCopyTexSubImage2D. However, Qualcomm Adreno is faster when using glCopyTexSubImage2D.

Test site: http://webglsamples.org/aquarium/aquarium.html

                                                  glDrawArrays   glCopyTexSubImage2D
4k fish on Chromebook Pixel 2015 (Broadwell):     22 FPS         32.6 FPS
4k fish on Ubuntu and Haswell:                    23 FPS         30.9 FPS
500 fish on Android OnePlus One (Adreno 330):     25 FPS         22.5 FPS

It looks like an Intel Mesa bug. In theory there is no reason for glCopyTexSubImage2D to be slower. glCopyTexSubImage2D is used very frequently, so I think this issue is quite severe.
The numbers you've put in seem to show that glCopyTexSubImage2D is faster. Did you invert the column labels?
Oh, yes, I made a mistake. I inverted the column labels. The following is correct:

                                                  glCopyTexSubImage2D   glDrawArrays
4k fish on Chromebook Pixel 2015 (Broadwell):     22 FPS                32.6 FPS
4k fish on Ubuntu and Haswell:                    23 FPS                30.9 FPS
500 fish on Android OnePlus One (Adreno 330):     25 FPS                22.5 FPS
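To make the size of the gap easier to compare across platforms, here is a small sketch that turns the corrected FPS numbers into a percent change when switching from glCopyTexSubImage2D to glDrawArrays. The FPS values are copied from the table above; the platform labels and the helper name relative_change are only illustrative.

```python
# FPS values copied from the corrected table:
# (platform label, glCopyTexSubImage2D FPS, glDrawArrays FPS)
results = [
    ("Chromebook Pixel 2015 (Broadwell), 4k fish", 22.0, 32.6),
    ("Ubuntu on Haswell, 4k fish", 23.0, 30.9),
    ("OnePlus One (Adreno 330), 500 fish", 25.0, 22.5),
]


def relative_change(copy_fps, draw_fps):
    """Percent change in FPS going from glCopyTexSubImage2D to glDrawArrays."""
    return (draw_fps - copy_fps) / copy_fps * 100.0


# Map each platform label to its percent change.
changes = {name: relative_change(copy_fps, draw_fps)
           for name, copy_fps, draw_fps in results}

for name, pct in changes.items():
    print(f"{name}: {pct:+.1f}%")
```

A positive value means glDrawArrays was faster on that platform. On the two Intel systems the change is positive, while on the Adreno 330 it is negative, which matches the description in the original report.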
The next step would be to determine what method i965 is using for CopyTexSubImage:

1. BLORP (should be fast)
2. BLT (should be slow)
3. CPU maps (should be slow)

Putting a breakpoint in intelCopyTexSubImage and stepping through it should make it pretty clear which is happening, and if we're falling off the fast path, why.
Here's the intel_gpu_top data.

When using glDrawArrays:

render busy:  48%: █████████▋    render space: 67/16384

task   percent busy
CS:    49%: █████████▉    vert fetch: 650289482 (650202512/sec)
GAM:   43%: ████████▋     prim fetch: 216764414 (216735424/sec)
VS:    11%: ██▎           VS invocations: 636105016 (636047896/sec)
SVG:   10%: ██            GS invocations: 0 (0/sec)
VF:    10%: ██            GS prims: 0 (0/sec)
GAFS:   9%: █▉            CL invocations: 216759538 (216730978/sec)
CL:     9%: █▉            CL prims: 97832823 (97804263/sec)
SF:     5%: █             PS invocations: 18278204160 (-430295856/sec)
DS:     4%: ▉             PS depth pass: 4741623475 (77865535/sec)
SOL:    3%: ▋
GS:     3%: ▋
SDE:    2%: ▌
TDG:    2%: ▌
TE:     1%: ▎
HS:     1%: ▎
GAFM:   1%: ▎
RS:     1%: ▎
VFE:    0%:
TSG:    0%:
URBM:   0%:

When using glCopyTexSubImage2D:

render busy:  53%: ██████████▋   render space: 38/16384

task   percent busy
CS:    55%: ███████████   vert fetch: 421770029 (15081065/sec)
GAM:   49%: █████████▉    prim fetch: 140591035 (5026989/sec)
VS:    14%: ██▉           VS invocations: 411982523 (14758688/sec)
SVG:   14%: ██▉           GS invocations: 0 (0/sec)
GAFS:  12%: ██▌           GS prims: 0 (0/sec)
VF:    12%: ██▌           CL invocations: 140587183 (5027260/sec)
CL:    11%: ██▎           CL prims: 63411022 (2244392/sec)
SF:     7%: █▌            PS invocations: 9766779712 (286387776/sec)
DS:     6%: █▎            PS depth pass: 2582256091 (72883919/sec)
SOL:    5%: █
GS:     5%: █
SDE:    3%: ▋
TDG:    3%: ▋
HS:     2%: ▌
TE:     2%: ▌
GAFM:   1%: ▎
TSG:    0%:
RS:     0%:
VFE:    0%:
URBM:   0%:
This is cool. Can you also please get the data that Ken requested?
(In reply to Kenneth Graunke from comment #3)
> The next step would be to determine what method i965 is using for
> CopyTexSubImage:
>
> 1. BLORP (should be fast)
> 2. BLT (should be slow)
> 3. CPU maps (should be slow)
>
> Putting a breakpoint in intelCopyTexSubImage and stepping through it should
> make it pretty clear which is happening, and if we're falling off the fast
> path, why.

Dongseong, have you had a chance to follow up on this? This information is necessary for us to make progress on this issue.
Hi, I'm on vacation for 3 more weeks. When I'm back in the office, I'll do it as my first task. Sorry for the delay.

If someone wants to try to reproduce it, here are the instructions.

1. Build Chromium on Linux (Ubuntu or Debian is easiest):
   https://chromium.googlesource.com/chromium/src/+/master/docs/linux_build_instructions.md

2. Apply this patch:

   +++ b/gpu/command_buffer/service/feature_info.cc
   @@ -954,8 +954,7 @@ void FeatureInfo::InitializeFeatures() {
      if (extensions.Contains("GL_INTEL_framebuffer_CMAA")) {
        feature_flags_.chromium_screen_space_antialiasing = true;
        AddExtensionString("GL_CHROMIUM_screen_space_antialiasing");
   -  } else if (!workarounds_.disable_framebuffer_cmaa &&
   -             (gl_version_info_->IsAtLeastGLES(3, 1) ||
   +  } else if ( (gl_version_info_->IsAtLeastGLES(3, 1) ||
                  (gl_version_info_->IsAtLeastGL(3, 0) &&
                   extensions.Contains("GL_ARB_shading_language_420pack") &&
                   extensions.Contains("GL_ARB_texture_gather") &&

3. Run any WebGL site:
   $ ./out/Release/chrome http://webglsamples.org/aquarium/aquarium.html

4. Set breakpoints at the following two locations:
   https://cs.chromium.org/chromium/src/gpu/command_buffer/service/gles2_cmd_apply_framebuffer_attachment_cmaa_intel.cc?q=gles2_cmd_ap&sq=package:chromium&l=250
   https://cs.chromium.org/chromium/src/gpu/command_buffer/service/gles2_cmd_copy_texture_chromium.cc?sq=package:chromium&rcl=1478244283&l=311
Ian, are we still waiting on the info needed here, or can we close this bug?
Compiling Chromium from source with patches is rather painful...if possible, we would like to avoid that. Dongseong, if you still want us to look at this, can you provide an apitrace which uses glCopyTexSubImage2D? Install apitrace, then run "apitrace trace chromium ...". It should create a "chromium.trace" file. Then I can answer the question I had in comment 3...
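Before attaching a trace, it may be worth checking that it actually contains the calls in question. The sketch below counts GL calls by name in the text produced by "apitrace dump chromium.trace". It assumes each dumped call starts with a call number followed by the function name and an opening parenthesis, optionally with an "@<thread id>" token in between; the sample lines are hypothetical, not taken from a real Chromium trace. Note that this only confirms the calls are present; it does not show which path i965 takes for them.

```python
import re
from collections import Counter

# Assumed shape of one dumped call: "<call number> <function>(<args>) ...",
# optionally with an "@<thread id>" token after the call number.
CALL_RE = re.compile(r"^\s*\d+\s+(?:@\d+\s+)?(\w+)\(")


def count_calls(dump_lines):
    """Count how often each function name appears in apitrace dump text."""
    counts = Counter()
    for line in dump_lines:
        match = CALL_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines, for illustration only.
sample = [
    "101 glBindTexture(target = GL_TEXTURE_2D, texture = 7)",
    "102 glCopyTexSubImage2D(target = GL_TEXTURE_2D, level = 0, xoffset = 0, "
    "yoffset = 0, x = 0, y = 0, width = 1024, height = 768)",
    "103 glDrawArrays(mode = GL_TRIANGLES, first = 0, count = 6)",
    "104 glCopyTexSubImage2D(target = GL_TEXTURE_2D, level = 0, xoffset = 0, "
    "yoffset = 0, x = 0, y = 0, width = 1024, height = 768)",
]

counts = count_calls(sample)
print(counts["glCopyTexSubImage2D"], counts["glDrawArrays"])
```

To run it on a real trace, the same function can be fed sys.stdin, with the output of "apitrace dump chromium.trace" piped into the script.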
Dongseong, can we close this bug?
No, sorry for the delay. I'll provide a stack trace soon.
INVALID is not a great representation but the best category I can see from the list. We haven't had a reporter update in over four months so per the mesa bug guidelines, we are closing. If this is still a bug, feel free to reopen with the proper documentation.