Bug 98549

Summary: [HSW] Extremely low x11perf putimage score with glamor
Product: Mesa
Reporter: Clemens Eisserer <linuxhippy>
Component: Drivers/DRI/i965
Assignee: Intel 3D Bugs Mailing List <intel-3d-bugs>
Status: RESOLVED MOVED
QA Contact: Intel 3D Bugs Mailing List <intel-3d-bugs>
Severity: normal    
Priority: medium    
Version: 12.0   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:

Description Clemens Eisserer 2016-11-02 13:42:18 UTC
When using glamor on top of the i965 driver (Haswell GPU), the x11perf -putimage10 score is extremely low, while -shmput10 is quite fast:

      20000 trep @   1.2639 msec (   791.0/sec): PutImage 10x10 square
   14000000 trep @   0.0019 msec (537000.0/sec): ShmPutImage 10x10 square

A netbook based on AMD's Mullins chip (low-power Jaguar cores) is fast for both operations:

     800000 trep @   0.0344 msec ( 29000.0/sec): PutImage 10x10 square
     800000 trep @   0.0356 msec ( 28100.0/sec): ShmPutImage 10x10 square
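
To put a number on the gap, the result lines above can be parsed and compared. This is only a minimal Python sketch, assuming the x11perf line format shown in this report; the helper names `parse_x11perf_line` and `shm_to_put_ratio` are made up for illustration and are not part of x11perf.

```python
import re

# Matches x11perf result lines as they appear in this report, e.g.
#   "20000 trep @   1.2639 msec (   791.0/sec): PutImage 10x10 square"
# Assumption: other x11perf versions may format lines differently.
LINE_RE = re.compile(
    r"^\s*(\d+)\s+(?:reps|trep)\s+@\s+([\d.]+)\s+msec"
    r"\s+\(\s*([\d.]+)/sec\):\s+(.+?)\s*$"
)


def parse_x11perf_line(line):
    """Return a dict for one result line, or None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    count, msec, rate, name = m.groups()
    return {"count": int(count), "msec": float(msec),
            "rate": float(rate), "name": name}


def shm_to_put_ratio(lines):
    """How many times faster ShmPutImage is than PutImage (10x10 case)."""
    rates = {}
    for line in lines:
        parsed = parse_x11perf_line(line)
        if parsed is not None:
            rates[parsed["name"]] = parsed["rate"]
    return rates["ShmPutImage 10x10 square"] / rates["PutImage 10x10 square"]


# Result lines copied from the description above.
haswell = [
    "      20000 trep @   1.2639 msec (   791.0/sec): PutImage 10x10 square",
    "   14000000 trep @   0.0019 msec (537000.0/sec): ShmPutImage 10x10 square",
]
mullins = [
    "     800000 trep @   0.0344 msec ( 29000.0/sec): PutImage 10x10 square",
    "     800000 trep @   0.0356 msec ( 28100.0/sec): ShmPutImage 10x10 square",
]

haswell_ratio = shm_to_put_ratio(haswell)
mullins_ratio = shm_to_put_ratio(mullins)
print(f"Haswell: ShmPutImage is {haswell_ratio:.0f}x faster than PutImage")
print(f"Mullins: ShmPutImage/PutImage rate ratio is {mullins_ratio:.2f}")
```

With the numbers above, the two paths differ by a factor of several hundred on Haswell, while on Mullins they are within a few percent of each other.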

This issue slows down Java's antialiased rendering a lot (it doesn't use trapezoids for AA geometry, instead it uploads 32x32 coverage masks using XPutImage and later uses this data as mask for XRenderComposite). 

As more and more distributions switch to glamor directly or indirectly (XWayland or X+modesetting) it would be great to see this fixed.


Laptop:

  Vendor: Intel Open Source Technology Center (0x8086)
    Device: Mesa DRI Intel(R) Haswell Mobile  (0xa16)
    Version: 12.0.3
    Accelerated: yes
    Video memory: 1536MB
    Unified memory: yes
    Preferred profile: core (0x1)
    Max core profile version: 3.3
    Max compat profile version: 3.0
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 3.0
Comment 1 Clemens Eisserer 2017-05-14 20:20:48 UTC
So this is not limited to Haswell: on an Arrandale laptop I get the following (90% of the cycles are spent in clflush_object):

       1200 reps @   6.1682 msec (   162.0/sec): PutImage XY 10x10 square
Comment 2 Marina Chernish 2018-11-16 15:07:04 UTC
Hi Clemens,

I've checked x11perf -putimage10 on my Haswell with Mesa versions 12.0.3, 10.6.9 and 12.1.0, and the score looks good. Here are the outputs:

OpenGL ES profile version string: OpenGL ES 3.1 Mesa 12.1.0-devel (git-3ef8d42)
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.10

user@HRK1-LHP-F49171:~$ x11perf -putimage10
x11perf - X11 performance program, version 1.2
The X.Org Foundation server version 11906000 on :0
from HRK1-LHP-F49171
Fri Nov 16 15:18:23 2018

Sync time adjustment is 0.0136 msecs.

1600000 reps @ 0.0034 msec (290000.0/sec): PutImage 10x10 square
1600000 reps @ 0.0034 msec (290000.0/sec): PutImage 10x10 square
1600000 reps @ 0.0035 msec (287000.0/sec): PutImage 10x10 square
1600000 reps @ 0.0035 msec (286000.0/sec): PutImage 10x10 square
1600000 reps @ 0.0035 msec (288000.0/sec): PutImage 10x10 square
8000000 trep @ 0.0035 msec (288000.0/sec): PutImage 10x10 square
//--------------------------------------------------------------

OpenGL ES profile version string: OpenGL ES 3.0 Mesa 12.0.3 (git-d79b2e7)
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.00

user@HRK1-LHP-F49171:~/Work/mesa$ x11perf -putimage10
x11perf - X11 performance program, version 1.2
The X.Org Foundation server version 11906000 on :0
from HRK1-LHP-F49171
Fri Nov 16 15:44:35 2018

Sync time adjustment is 0.0129 msecs.

1600000 reps @ 0.0034 msec (296000.0/sec): PutImage 10x10 square
1600000 reps @ 0.0034 msec (295000.0/sec): PutImage 10x10 square
1600000 reps @ 0.0034 msec (296000.0/sec): PutImage 10x10 square
1600000 reps @ 0.0034 msec (294000.0/sec): PutImage 10x10 square
1600000 reps @ 0.0034 msec (295000.0/sec): PutImage 10x10 square
8000000 trep @ 0.0034 msec (295000.0/sec): PutImage 10x10 square
//---------------------------------------------------------------

OpenGL ES profile version string: OpenGL ES 3.0 Mesa 10.6.9 (git-ab9aacc)
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.00

user@HRK1-LHP-F49171:~/Work/mesa$ x11perf -putimage10
x11perf - X11 performance program, version 1.2
The X.Org Foundation server version 11906000 on :0
from HRK1-LHP-F49171
Fri Nov 16 16:03:50 2018

Sync time adjustment is 0.0123 msecs.

1600000 reps @ 0.0033 msec (299000.0/sec): PutImage 10x10 square
1600000 reps @ 0.0033 msec (299000.0/sec): PutImage 10x10 square
1600000 reps @ 0.0034 msec (298000.0/sec): PutImage 10x10 square
1600000 reps @ 0.0033 msec (300000.0/sec): PutImage 10x10 square
1600000 reps @ 0.0033 msec (302000.0/sec): PutImage 10x10 square
8000000 trep @ 0.0033 msec (299000.0/sec): PutImage 10x10 square

I've set the following configuration in /usr/share/X11/xorg.conf.d/20-intel.conf:
Section "Device"
    Identifier "Intel"
    Driver "intel"
    Option "TearFree" "true"
    Option "DRI" "3"
    Option "AccelMethod" "glamor"
EndSection

Used environment: 
Haswell: CPU: Intel Core i5-4300M; GPU: Intel® HD Graphics 4600
Ubuntu 16.04; kernel  4.18.16-041816-generic;
Comment 3 GitLab Migration User 2019-09-25 18:58:59 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1547.
