Bug 98815

Summary: [SKL/BDW GT2] large perf regression in TessMark
Product: Mesa
Reporter: Eero Tamminen <eero.t.tamminen>
Component: Drivers/DRI/i965
Assignee: Kenneth Graunke <kenneth>
Status: VERIFIED FIXED
QA Contact: Intel 3D Bugs Mailing List <intel-3d-bugs>
Severity: normal    
Priority: medium    
Version: git   
Hardware: Other   
OS: All   

Description Eero Tamminen 2016-11-22 12:24:49 UTC
Setup:
- SKL i5 GT2 (i5-6600K + 2x 2400 MHz DDR4)
- Ubuntu 16.04
- Mesa from Git
- GpuTest v0.7: http://www.geeks3d.com/gputest/download/


Test-case:
MESA_GL_VERSION_OVERRIDE=4.0 ./GpuTest /test=tess_x64 /width=1366 /height=768 /benchmark

(The test case fails with Mesa unless the GL version is overridden to some 4.x version.)


The following commit drops TessMark x64 performance considerably:
-------------------------------------------------------
commit 6d416bcd846a49414f210cd761789156c37a7b3e
Author:     Kenneth Graunke <kenneth@whitecape.org>
AuthorDate: Tue Nov 15 01:03:13 2016 -0800
Commit:     Kenneth Graunke <kenneth@whitecape.org>
CommitDate: Sat Nov 19 11:40:00 2016 -0800

    i965: Use arrays in Gen7+ URB code.
    
    So much of this code was cut and pasted per stage.  We can accomplish
    much of it by looping over shader stages.
    
    Improves performance of OglBatch7 (version 6) by 1.50783% +/- 0.287049%
    (n = 71) at 1024x768 on Cherryview.
    
    Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
-------------------------------------------------------

On SKL GT2, the drop is ~17% in FullHD fullscreen and grows as the window size decreases.  With 1/2 FullHD window, the drop is ~27%.

The drop is also visible on BDW GT2, but it is clearly smaller.  I don't see a drop on BDW GT3, BSW, or on eDRAM machines (HSW GT3e, SKL GT3e, SKL GT4e).
Comment 1 Eero Tamminen 2016-11-22 12:49:43 UTC
(In reply to Eero Tamminen from comment #0)
> With 1/2 FullHD window, the drop is ~27%.

Correction: with 1366x768 (HalfHD) the drop is ~23%; with 1024x576 it is ~27%.
Comment 2 Kenneth Graunke 2016-11-23 20:29:43 UTC
Thank you for catching this!  I made a mistake in dividing out the remaining space, causing us to underallocate VS/HS/DS and waste the rest of the URB.

Patch to come as soon as Jenkins finishes testing it.
Comment 4 Kenneth Graunke 2016-11-24 03:06:58 UTC
Fixed by:

commit 5da84a7e120d1df848531c6e7eb60340ac4dc43c
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Wed Nov 23 12:24:22 2016 -0800

    i965: Fix a mistake from porting the URB allocation code to arrays.
    
    Commit 6d416bcd846a49414f210cd761789156c37a7b3e (i965: Use arrays in
    Gen7+ URB code.) introduced a regression which caused us to fail to
    allocate all of our URB space.
    
       -         total_wants -= ds_wants;
       +         total_wants -= additional;
    
    The new line should have been total_wants -= wants[i].
    
    Fixes a large performance regression in TessMark.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98815
    Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
    Reviewed-by: Matt Turner <mattst88@gmail.com>
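
To illustrate the mistake described in the commit message above, here is a minimal, self-contained C sketch of handing out leftover space in proportion to what each stage wants. It is not the Mesa source: the stage enum, the `distribute_extra` function, the integer rounding, and the `buggy` switch are all hypothetical, and letting the last stage receive whatever is left over is an assumption made for the illustration. Only the two update statements, `total_wants -= additional` and `total_wants -= wants[i]`, come from the commit message.

```c
#include <assert.h>

/* Hypothetical stage list for this sketch (not the Mesa enum). */
enum { STAGE_VS, STAGE_HS, STAGE_DS, STAGE_GS, STAGE_COUNT };

/*
 * Hand out `remaining_space` units in proportion to `wants`.
 * The first three stages get a proportional share; whatever is left
 * afterwards goes to the last stage (an assumption of this sketch).
 * `buggy` selects the update from the regressing commit.
 * Returns the number of units that ended up as leftovers.
 */
static unsigned
distribute_extra(const unsigned wants[STAGE_COUNT],
                 unsigned extra[STAGE_COUNT],
                 unsigned remaining_space, int buggy)
{
    unsigned total_wants = 0;

    for (int i = 0; i < STAGE_COUNT; i++) {
        total_wants += wants[i];
        extra[i] = 0;
    }

    /* Never hand out more than was asked for in total. */
    if (remaining_space > total_wants)
        remaining_space = total_wants;

    for (int i = STAGE_VS; total_wants > 0 && i <= STAGE_DS; i++) {
        /* Share of the remaining space, rounded to nearest. */
        unsigned additional =
            (wants[i] * remaining_space + total_wants / 2) / total_wants;

        assert(additional <= remaining_space);
        extra[i] += additional;
        remaining_space -= additional;

        if (buggy)
            total_wants -= additional; /* the regression */
        else
            total_wants -= wants[i];   /* the intended update */
    }

    /* Leftovers land on the last stage, even if it wanted nothing. */
    extra[STAGE_GS] += remaining_space;
    return remaining_space;
}
```

Under this sketch, with wants of {100, 100, 100, 0} and 150 units to hand out, the intended update gives VS/HS/DS 50 units each and leaves nothing over. The buggy update gives them 50, 40, and 29 and leaves 31 units to a stage that asked for nothing, which matches the "underallocate VS/HS/DS and waste the rest" description in comment 2.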
Comment 5 Eero Tamminen 2016-11-25 12:26:18 UTC
Thanks, verified!

TessMark perf on SKL GT2 is back where it was, as is the perf of the SynMark terrain tessellation tests (I don't have data for BDW GT2 yet).
