Bug 102260

Summary: 15% perf drop in DrvRes with "i965/miptree: Use num_samples of 1 instead of 0 for single-sampled"
Product: Mesa Reporter: Eero Tamminen <eero.t.tamminen>
Component: Drivers/DRI/i965Assignee: Intel 3D Bugs Mailing List <intel-3d-bugs>
Status: VERIFIED FIXED QA Contact: Intel 3D Bugs Mailing List <intel-3d-bugs>
Severity: normal    
Priority: medium    
Version: git   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:

Description Eero Tamminen 2017-08-16 13:37:28 UTC
Bisecting on SKL GT2 revealed following commit:
----------------------------------------------------
commit 76e2f390f9863a356d1419982dec705260d67eff
Author:     Topi Pohjolainen <topi.pohjolainen@intel.com>
AuthorDate: Wed Jul 19 09:25:19 2017 +0300
Commit:     Topi Pohjolainen <topi.pohjolainen@intel.com>
CommitDate: Thu Jul 20 11:32:21 2017 +0300

    i965/miptree: Use num_samples of 1 instead of 0 for single-sampled
----------------------------------------------------

To be responsible for large performance drop in SynMark v7 DrvRes test.

Drop is largest (~15%) on SKL & KBL GT2, on most platforms it's 3-5%, on SKL GT4e there's no measurable impact.

(Test is run in FullHD fullscreen.)

From the commit message it seems that it should be just refactoring + potential bug fix i.e. this large perf impact is unexpected.  Other tests beside DrvRes weren't noticeably impacted.
Comment 1 Jason Ekstrand 2017-08-19 00:41:46 UTC
I've verified that the stated commit does, indeed, cause the regression though I have no idea why yet.  It should have been an entirely innocuous change.
Comment 2 Jason Ekstrand 2017-08-19 18:09:04 UTC
Patch on the list:

https://patchwork.freedesktop.org/patch/172628/
Comment 3 Jason Ekstrand 2017-08-19 22:40:48 UTC
This is fixed by the following commit:

commit f24cf82d6db290a88abfff0669d2c5e2aa463901
Author: Jason Ekstrand <jason.ekstrand@intel.com>
Date:   Sat Aug 19 11:03:38 2017 -0700

    i965/tex: Don't pass samples to miptree_create_for_teximage
    
    In 76e2f390f9863a35, when Topi switched num_samples from 0 to 1 for
    single-sampled, he accidentally switched the last parameter in the call
    to miptree_create_for_teximage from 0 to 1 thinking it was num_samples
    when it was actually layout_flags.  Switching from 0 to 1 added the
    MIPTREE_LAYOUT_ACCELERATED_UPLOAD flag which causes us to allocate a
    busy BO instead of an idle one.  This caused the subsequent CPU upload
    to consistently stall.  The end result was a 15% performance drop in the
    SynMark v7 DrvRes microbenchmark.  This restores the old behavior and
    fixes the performance regression.
    
    Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
    Fixes: 76e2f390f9863a356d1419982dec705260d67eff
    Bugzilla: https://bugs.freedesktop.org/102260
    Cc: mesa-stable@lists.freedesktop.org
Comment 4 Eero Tamminen 2017-08-21 07:52:20 UTC
Verified. DrvRes perf is now same, and in some cases slightly better, than it was before the drop.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.