Bug 80705 - [IVB/HSW/BYT-M/BDW] GpuTest_v0.5_triangle performance slower by 10%~50% in fullscreen than in windowed & composited mode
Status: NEW
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i965
Version: git
Hardware: All Linux (All)
Importance: medium normal
Assignee: Ian Romanick
QA Contact: Intel 3D Bugs Mailing List
Reported: 2014-06-30 08:11 UTC by zhoujian
Modified: 2015-08-17 17:35 UTC

Attachments
Equivalent test to the one the bug was reported too (1.71 KB, text/plain)
2015-08-08 15:51 UTC, Alejandro Piñeiro (freenode IRC: apinheiro)

Description zhoujian 2014-06-30 08:11:34 UTC
Platform: BYT-M/IVB/HSW/BDW
Libdrm: (master)libdrm-2.4.54-17-ge8c3c1358ecaf4e90f7d43762357ae6f8e2022b6
Mesa: (master)c58486516f2ec8341f92554e28fd84c10d835a45
Xserver: (master)xorg-server-1.15.99.902-121-g2f5cf9ff9a0f713b7e038636484c77f113a5f10a
Xf86_video_intel: (master)2.99.912-164-gc5c7dd24a55f04322d5eec10dc4352d8a8e92b1e
Cairo: (master)f574fec8d2d1f83525fd7e4dbb266b6e5091627d
Libva: (master)c61d8c6ce9ffc27320e9e177c1e1123d5f1b5014
Libva_intel_driver: (master)745340dd013399f64507de73401ab3adb712dad5
Kernel:	(drm-intel-nightly) git-1087d4

Bug detailed description:
----------------------------------------------
GpuTest_v0.5_triangle is 10%~50% slower in fullscreen than in windowed & composited mode on BYT-M/IVB/HSW/BDW. The problem exists both on gnome-session and on raw X.

Reproduce steps:
---------------------------------------------
1.   xinit&
2.   ./GpuTest /test=triangle /width=1920 /height=1080 /fullscreen /benchmark /no_scorebox
Comment 1 Eero Tamminen 2014-06-30 13:07:07 UTC
According to Mengmeng, this issue happens only with DRI3, not DRI2.

Can you check with intel_gpu_top (on BYT, where it works fine) what the "render" value is in both fullscreen and windowed mode?

I.e. is the GPU 100% utilized in both cases, as it is with DRI2?


FYI: In an unrelated test it was noticed that with DRI3, X still occasionally waits for Vsync although the test had set the swap interval to zero.

According to Chris' comment on that test:
---------
The current implementation requires 4 buffers to run the entire pipeline in an synchronous manner, Keith is only feeding in 3 (from mesa/src/glx/dri3_glx.c) and so forces the synchronization.
---------

-> With Vsync, test wouldn't be using GPU fully.
Comment 2 zhoujian 2014-07-02 08:39:30 UTC
I have checked with intel_gpu_top: the render busy value is 100% in both fullscreen and windowed mode.
Comment 3 Alejandro Piñeiro (freenode IRC: apinheiro) 2015-08-08 15:51:51 UTC
Created attachment 117594
Equivalent test to the one the bug was reported too

Although the test itself is just the basic triangle example, I uploaded the example I was using, as the source code of the original test is not available. This also confirms that the bug doesn't come from the test.

If you want to test it without vsync and get the FPS, just use environment variables.

So if you want to test what is mentioned in comment 0, for fullscreen:

vblank_mode=0 LIBGL_SHOW_FPS=1 LIBGL_DEBUG=verbose ./simple-opengl-test 1920 1080 1

For windowed:

vblank_mode=0 LIBGL_SHOW_FPS=1 LIBGL_DEBUG=verbose ./simple-opengl-test 1920 1080 0
Comment 4 Alejandro Piñeiro (freenode IRC: apinheiro) 2015-08-08 16:12:39 UTC
Hi, I have been looking at this bug over the last few days, and I will share my first findings.

(In reply to zhoujian from comment #0)
> 
> Bug detailed description:
> ----------------------------------------------
> GpuTest_v0.5_triangle is slower by 10%~50% in fullscreen than in windowed &
> composited mode on BYT-M/IVB/HSW/BDW, The problem exists both on
> gnome-session and Raw X.

At this moment I was not able to find v0.5, but the bug is reproducible with the latest version, v0.7.

I detected a ~10% performance loss at 1920x1080 resolution, and ~50% at 1280x720.

(In reply to Eero Tamminen from comment #1)
> According to Mengmeng, this issue happens only with DRI3, not DRI2.

I reproduced this bug on DRI2. In fact, I was not able to test this on DRI3 after a system upgrade. But I really think that the DRI version is unrelated to this bug.

> Can you check with intel_gpu_top (on BYT where it works fine) what is the
> "render" value both in fullscreen and windowed mode?
> 
> I.e. is GPU 100% utilized in both cases, like it's with DRI2.

As zhoujian mentioned in comment 2, the GPU is at 100% in both cases.

> FYI: In an unrelated test it was noticed that with DRI3, X still does
> occasionally Vsync wait although test had zeroed swap interval.
> 
> According to Chris' comment on that test:
> ---------
> The current implementation requires 4 buffers to run the entire pipeline in
> an synchronous manner, Keith is only feeding in 3 (from
> mesa/src/glx/dri3_glx.c) and so forces the synchronization.
> ---------
> 
> -> With Vsync, test wouldn't be using GPU fully.

If Vsync were happening in fullscreen, the FPS (the measure of performance in this test) would equal the refresh rate, which in my case is 60 FPS. But that is not the case: I get ~360 FPS. The curious thing is that the resolution doesn't matter; in fullscreen the FPS is always ~360. In windowed mode I measured ~380 for 1920x1280, ~860 for 1280x720, and so on.

After that I created a small example in order to rule out the original test as the cause. You can find it in comment 3, and I'm disabling vsync using vblank_mode=0.

Fullscreen:
  1920x1080: ~372
  1280x720:  ~372
  1024x640:  ~372

Windowed:
  1920x1080: ~390
  1280x720:  ~840
  1024x640:  ~1280

So although it is not vsynced, for some reason the fullscreen FPS is capped at a fixed value.
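
As a rough check of the figures above, the following Python sketch computes how much slower fullscreen is than windowed at each resolution, plus the average time per frame. The FPS values are copied from the two tables; the function names (`slowdown_percent`, `frame_time_ms`) are made up for this illustration and are not part of the attached test.

```python
# FPS figures copied from the tables above (measured with vblank_mode=0).
FULLSCREEN = {"1920x1080": 372, "1280x720": 372, "1024x640": 372}
WINDOWED = {"1920x1080": 390, "1280x720": 840, "1024x640": 1280}


def slowdown_percent(fullscreen_fps, windowed_fps):
    """Return how much slower fullscreen is than windowed, in percent."""
    return (windowed_fps - fullscreen_fps) / windowed_fps * 100.0


def frame_time_ms(fps):
    """Return the average time spent per frame, in milliseconds."""
    return 1000.0 / fps


if __name__ == "__main__":
    for resolution, full_fps in FULLSCREEN.items():
        win_fps = WINDOWED[resolution]
        print(
            f"{resolution}: fullscreen {frame_time_ms(full_fps):.2f} ms/frame, "
            f"windowed {frame_time_ms(win_fps):.2f} ms/frame, "
            f"slowdown {slowdown_percent(full_fps, win_fps):.1f}%"
        )
```

With these numbers the fullscreen frame time stays at about 2.69 ms regardless of resolution, while the windowed frame time shrinks as the resolution goes down, which is why the relative slowdown grows at lower resolutions.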
Comment 5 Eero Tamminen 2015-08-10 08:05:47 UTC
(In reply to Alejandro Piñeiro (freenode IRC: apinheiro) from comment #4)
> (In reply to zhoujian from comment #0)
> > FYI: In an unrelated test it was noticed that with DRI3, X still does
> > occasionally Vsync wait although test had zeroed swap interval.
...
> If Vsync were happening with fullscreen, the FPS (that is the measure of
> performance in this test) would be equal to the refresh rate. So for
> example, in my case 60FPS.

Only if it happens for every frame.  A year ago, when Vsync was disabled with *DRI3*, I was seeing some tests being synced to 120 or 180 FPS, i.e. every second or third frame.


> But that is not the case, but ~360 FPS.
>
> curious thing is that it doesn't matter the resolution, on fullscreen the
> FPS are always 360FPS.

This is a multiple of 60 FPS, and the fact that it doesn't change suggests that some of the frames were still synced.  I would suggest checking the timings of individual frames to see whether this is the case.


> But on windowed I detected ~380 for 1920x1280, ~860
> for 1280x720 and so on.

I haven't seen the DRI3 sync-when-vsync-disabled issue in windowed tests.
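
One possible way to do the per-frame check suggested above is sketched below in Python. It assumes you already have a list of frame completion timestamps in milliseconds (how to collect them is not covered here), and it assumes a 60 Hz display. All names (`find_stalls`, `stalls_follow_refresh`, `synthetic_trace`) and the thresholds are hypothetical choices for this illustration, not an existing tool. The idea: if only some frames wait for vblank, those frames show up as unusually long intervals, and the time between them should be close to a whole number of refresh periods.

```python
from statistics import median

REFRESH_HZ = 60.0  # assumption: 60 Hz display


def frame_intervals(timestamps_ms):
    """Return the time between consecutive frames, in milliseconds."""
    return [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]


def find_stalls(timestamps_ms, factor=3.0):
    """Return indices of frames that took more than `factor` times the median interval."""
    intervals = frame_intervals(timestamps_ms)
    typical = median(intervals)
    return [i + 1 for i, dt in enumerate(intervals) if dt > factor * typical]


def stalls_follow_refresh(timestamps_ms, refresh_hz=REFRESH_HZ,
                          tolerance_ms=1.0, factor=3.0):
    """Return True if every gap between stalled frames is close to a whole
    number of refresh periods, which would hint at partial vblank syncing."""
    period = 1000.0 / refresh_hz
    stalls = find_stalls(timestamps_ms, factor)
    if len(stalls) < 2:
        return False
    for a, b in zip(stalls, stalls[1:]):
        gap = timestamps_ms[b] - timestamps_ms[a]
        n = round(gap / period)
        if n < 1 or abs(gap - n * period) > tolerance_ms:
            return False
    return True


def synthetic_trace(refreshes=10, fast_frames=5, fast_ms=1.0,
                    refresh_hz=REFRESH_HZ):
    """Build made-up timestamps: one frame lands on each vblank, followed by
    a few fast frames, so six frames fit into every refresh period."""
    period = 1000.0 / refresh_hz
    stamps = []
    for k in range(refreshes):
        base = k * period
        stamps.extend(base + i * fast_ms for i in range(fast_frames + 1))
    return stamps


if __name__ == "__main__":
    trace = synthetic_trace()
    print("stalled frames:", find_stalls(trace))
    print("stalls follow refresh:", stalls_follow_refresh(trace))
```

The synthetic trace only demonstrates what a partially synced run would look like; a real trace from the test is needed to tell whether that is what happens here.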
Comment 6 Alejandro Piñeiro (freenode IRC: apinheiro) 2015-08-17 17:35:05 UTC
(In reply to Eero Tamminen from comment #5)
> (In reply to Alejandro Piñeiro (freenode IRC: apinheiro) from comment #4)
> > (In reply to zhoujian from comment #0)
> > > FYI: In an unrelated test it was noticed that with DRI3, X still does
> > > occasionally Vsync wait although test had zeroed swap interval.
> ...
> > If Vsync were happening with fullscreen, the FPS (that is the measure of
> > performance in this test) would be equal to the refresh rate. So for
> > example, in my case 60FPS.
> 
> Only if it happens for every frame.  Year ago, when Vsync was disabled with
> *DRI3*, I was seeing some tests being synched to 120 or 180 FPS i.e. every
> second or third frame.

OK. In any case, with LIBGL_DEBUG set to verbose I get this:
libGL: Using DRI2 for screen 0

So, as I mentioned, this also happens on DRI2.

> > But that is not the case, but ~360 FPS.
> >
> > curious thing is that it doesn't matter the resolution, on fullscreen the
> > FPS are always 360FPS.
> 
> This is multiple of 60FPS, and it not changing sounds like some of the
> frames were still synched.  I would suggest checking frame timings for
> individual frames to see whether this is the case.

Doing more tests: my previous numbers were from a gnome-session with metacity. On a gnome-session with gnome-shell, or without a compositor (ctrl+alt+fX, xinit), I get a similar outcome for windowed (slightly better), but fullscreen is capped at a little over 790 FPS at any resolution.

> 
> > But on windowed I detected ~380 for 1920x1280, ~860
> > for 1280x720 and so on.
> 
> I haven't seen DRI3 sync-when-vsync-disabled issue in Windowed tests.

Ok.
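
For reference, a small Python sketch of how the fullscreen caps reported in this thread (~360, ~372 and ~790 FPS) relate to whole multiples of the refresh rate. It assumes a 60 Hz display; `refresh_multiple` is a hypothetical helper written for this illustration. Being near or far from a multiple does not by itself prove or rule out partial syncing, since the FPS values are averages.

```python
def refresh_multiple(fps, refresh_hz=60.0):
    """Return (ratio to refresh rate, nearest whole multiple, deviation in FPS)."""
    ratio = fps / refresh_hz
    nearest = round(ratio)
    return ratio, nearest, fps - nearest * refresh_hz


if __name__ == "__main__":
    # Fullscreen caps mentioned in comments 4 and 6.
    for fps in (360, 372, 790):
        ratio, nearest, deviation = refresh_multiple(fps)
        print(f"{fps} FPS = {ratio:.2f} x 60 Hz "
              f"(nearest multiple {nearest}, off by {deviation:+.0f} FPS)")
```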

