Bug 85999 - [BDW Bisected]GpuTest_v0.5_triangle/GLbenchmark v2.5.1 EgyptTestStandard/SynMark2 performance are reduced ~10%-60%
Status: CLOSED NOTABUG
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/intel
Version: unspecified
Hardware: Other All
Importance: medium major
Assignee: Chris Wilson
QA Contact: Intel GFX Bugs mailing list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-11-07 12:19 UTC by wendy.wang
Modified: 2014-11-13 01:16 UTC (History)

Description wendy.wang 2014-11-07 12:19:27 UTC
Environment:
-----------------------------------
Platform:           BDW
Libdrm:             (master)libdrm-2.4.58-4-g00847fa48b83a85b0cb882594a12ed1511f780db
Mesa:               (master)cd745d46ce7ee9adc95c903670dd11cf3443e7a1
Xserver:            (master)xorg-server-1.16.99.901-3-g63bb5c5ef16edf652179770294dcca4fc07dc992
Xf86_video_intel:   (master)2.99.916-132-g0532a3313ad9c76a6e1d28e8a1c2ea495583fead
Cairo:              (master)adbeb3d53c6c6e8ddcc63988200da4c5c9627717
Libva:              (master)ccd93de5a707e92a629cccd595757c8d436fa3cc
Libva_intel_driver: (master)24cba20a119c96556ae4dc9a90043896ea70e567
Kernel:             (drm-intel-nightly)0642a51748ff46a205cf2a3fc45ba18cc92c9bda

Bug detailed description:
---------------------------------------------
This is an Xf86_video_intel regression.
GpuTest_v0.5_triangle_fullscreen shows the largest FPS reduction: -60%
SynMark2_v6.0.0_OglBatch0: -33%
SynMark2_v6.0.0_OglBatch1: -33%
SynMark2_v6.0.0_OglBatch2: -32%
SynMark2_v6.0.0_OglBatch3: -30%
SynMark2_v6.0.0_OglBatch4: -34%


According to the bisect result, the first bad commit is:

commit 0532a3313ad9c76a6e1d28e8a1c2ea495583fead
Author:     Chris Wilson <chris@chris-wilson.co.uk>
AuthorDate: Wed Nov 5 20:11:54 2014 +0000
Commit:     Chris Wilson <chris@chris-wilson.co.uk>
CommitDate: Wed Nov 5 21:06:57 2014 +0000

    sna/gen8: Clear instancing enabled bit between batches 

    gen8 sets the instancing bit relative to the vertex element, but we were 
    clearing it for the vertex buffer. As the maximum number of vertex 
    elements is fixed, just clear them all when emitting our header. Note 
    that VF_SGVS is not sufficient by itself to disable all side-effects of 
    instancing. 

    Thanks to Kenneth Graunke for pointing out the change from vertex buffer 
    to vertex element of the instancing enable bit. 

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84958
    Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>


All of the following cases show reduced performance:
Lightsmark v2008
Warsow v1.0
GpuTest_v0.5_triangle_fullscreen
GpuTest_v0.5_triangle_Windowed_640*480
SynMark2_v6.0.0_OglBatch0
SynMark2_v6.0.0_OglBatch1
SynMark2_v6.0.0_OglBatch2
SynMark2_v6.0.0_OglBatch3
SynMark2_v6.0.0_OglBatch4
SynMark2_v6.0.0_OglBatch5
SynMark2_v6.0.0_OglGeomPoint
SynMark2_v6.0.0_OglGeomTriList
SynMark2_v6.0.0_OglGeomTriStrip
SynMark2_v6.0.0_OglShMapVsm
SynMark2_v6.0.0_OglTexFilterTri
SynMark2_v6.0.0_OglTexFilterAniso
SynMark2_v6.0.0_OglVSDiffuse1
SynMark2_v6.0.0_OglVSDiffuse8
SynMark2_v6.0.0_OglVSTangent
Unigine-Valley v1.0
GLbenchmark v2.5.1 EgyptTestStandard


Reproduce steps:
---------------------------------------------
1. xinit &
2. ./GpuTest_0.5 triangle_fullscreen

Log messages will be uploaded later.
Comment 1 Chris Wilson 2014-11-07 12:20:50 UTC
Previous FPS values would have been skipping the copy operation and so these bandwidth tests would have been incorrect.
Comment 2 wendy.wang 2014-11-07 12:29:55 UTC
(In reply to Chris Wilson from comment #1)
> Previous FPS values would have been skipping the copy operation and so these
> bandwidth tests would have been incorrect.

Chris, excuse me, I do not understand your comment. Could you please explain in more detail why this is not a bug? Thanks.
Comment 3 Chris Wilson 2014-11-07 12:36:38 UTC
There was a bug in the ddx that was causing the GPU to not evaluate the blits for SwapBuffers(). This gave inflated results for swap/bandwidth bound benchmarks like Triangles. 0532a3313ad9c76a6e1d28e8a1c2ea495583fead fixes the copies so that they actually are performed by the GPU and corrects the benchmark results.
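
To illustrate the arithmetic behind this explanation, here is a minimal sketch. It assumes a simplified model in which every frame costs a render time plus a SwapBuffers() copy time, and in which the buggy path skipped the copy entirely. The function names and the timings are hypothetical; they are not taken from the driver or from measured data.

```python
def fps(render_ms, copy_ms=0.0):
    """Frames per second for a frame that costs render_ms + copy_ms milliseconds."""
    return 1000.0 / (render_ms + copy_ms)


def relative_change(before, after):
    """Fractional change from before to after (negative means slower)."""
    return (after - before) / before


# Hypothetical per-frame timings for a swap-bound test such as a single triangle.
render_ms = 0.4  # assumed time to draw the frame
copy_ms = 0.6    # assumed time for the SwapBuffers() blit

inflated = fps(render_ms)            # copy skipped: the old, incorrect figure
corrected = fps(render_ms, copy_ms)  # copy performed: the figure after the fix
change = relative_change(inflated, corrected)

print(f"{inflated:.0f} FPS -> {corrected:.0f} FPS ({change:+.0%})")
```

With these assumed numbers the copy accounts for 60% of the frame time, so the reported FPS falls by the same fraction once the copy is actually executed.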
Comment 4 wendy.wang 2014-11-07 12:53:06 UTC
(In reply to Chris Wilson from comment #3)
> There was a bug in the ddx that was causing the GPU to not evaluate the
> blits for SwapBuffers(). This gave inflated results for swap/bandwidth bound
> benchmarks like Triangles. 0532a3313ad9c76a6e1d28e8a1c2ea495583fead fixes
> the copies so that they actually are performed by the GPU and corrects the
> benchmark results.

Thanks a lot, Chris. So that means our previous, higher performance values were wrong, right?
Performance data decreased for many cases this time:

Lightsmark v2008
Warsow v1.0
GpuTest_v0.5_triangle_fullscreen
GpuTest_v0.5_triangle_Windowed_640*480
SynMark2_v6.0.0_OglBatch0
SynMark2_v6.0.0_OglBatch1
SynMark2_v6.0.0_OglBatch2
SynMark2_v6.0.0_OglBatch3
SynMark2_v6.0.0_OglBatch4
SynMark2_v6.0.0_OglBatch5
SynMark2_v6.0.0_OglGeomPoint
SynMark2_v6.0.0_OglGeomTriList
SynMark2_v6.0.0_OglGeomTriStrip
SynMark2_v6.0.0_OglShMapVsm
SynMark2_v6.0.0_OglTexFilterTri
SynMark2_v6.0.0_OglTexFilterAniso
SynMark2_v6.0.0_OglVSDiffuse1
SynMark2_v6.0.0_OglVSDiffuse8
SynMark2_v6.0.0_OglVSTangent
Unigine-Valley v1.0
GLbenchmark v2.5.1 EgyptTestStandard
Comment 5 Chris Wilson 2014-11-07 12:57:31 UTC
Right, it is going to affect almost everything that calls SwapBuffers or CopyRegion. It is more or less the same as employing a swap evasion technique for benchmarks.

Hopefully, the game benchmarks are the least affected (or anything running at around refresh rate).
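
A rough sketch of why workloads near refresh rate should suffer least, assuming a fixed per-frame copy cost that was previously skipped: if FPS goes from 1/render to 1/(render + copy), the fractional loss is copy / (render + copy). The function name and all timings below are hypothetical, and a real copy cost would vary with window size.

```python
def fps_drop(render_ms, copy_ms):
    """Fractional FPS loss once a previously skipped copy of copy_ms is paid per frame."""
    return copy_ms / (render_ms + copy_ms)


copy_ms = 0.6  # assumed fixed cost of the SwapBuffers() copy, in milliseconds

# Assumed render times: a trivial swap-bound test versus a game near 60 Hz.
workloads = [("triangle-like test", 0.4), ("game near refresh rate", 16.0)]

for name, render_ms in workloads:
    print(f"{name}: {fps_drop(render_ms, copy_ms):.1%} lower FPS")
```

Under this model the same copy cost dominates a sub-millisecond frame but is only a few percent of a frame that already takes about 16 ms.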
Comment 6 wendy.wang 2014-11-07 13:02:13 UTC
(In reply to Chris Wilson from comment #5)
> Right, it is going to affect almost everything that calls SwapBuffers or
> CopyRegion. It is more or less the same as employing a swap evasion
> technique for benchmarks.
> 
> Hopefully, the game benchmarks are the least affected (or anything running
> at around refresh rate).

Chris, thanks for your kind explanation!

