Bug 97260 - R9 290 low performance in Linux 4.7
Status: RESOLVED WONTFIX
Product: Mesa
Classification: Unclassified
Component: Mesa core
Version: git
Hardware: x86-64 (AMD64) Linux (All)
Importance: high major
Assignee: mesa-dev
QA Contact: mesa-dev
Keywords: bisected, regression
Reported: 2016-08-09 13:45 UTC by Clésio Luiz
Modified: 2017-03-20 16:37 UTC

Attachments
git bisect log output for the bisection run leading to c63dd758589b1f7e8398841d1f443f06ebfbcefc (2.55 KB, text/plain)
2016-08-12 19:58 UTC, Kai
radeon: Add some page flip debugging output (2.66 KB, patch)
2016-08-16 03:52 UTC, Michel Dänzer
dmesg output with additional debug info from attachment 125808 (14.04 KB, text/plain)
2016-08-16 15:49 UTC, Kai
loader/dri3: Always use 3 back buffers when flipping (1.16 KB, patch)
2016-08-17 08:05 UTC, Michel Dänzer
Unigine valley benchmark (Kernel 4.6.0-1) (2.57 KB, text/html)
2016-09-21 02:20 UTC, Xavier Sellier
Unigine valley benchmark (Kernel 4.8.0-rc5) (2.57 KB, text/html)
2016-09-21 02:36 UTC, Xavier Sellier
Use 4 buffers for flipping (392 bytes, patch)
2016-09-21 02:41 UTC, Michel Dänzer
disable async flip support (3.36 KB, patch)
2016-10-04 15:06 UTC, Alex Deucher
attachment-13933-0.html (2.11 KB, text/html)
2016-10-12 04:46 UTC, Tim Writer
bisect log leading to 7050c6ef5f0e9bc5e6bf9eb035320b70f731b919 (2.59 KB, text/plain)
2017-03-19 19:44 UTC, tarpoon

Description Clésio Luiz 2016-08-09 13:45:48 UTC
Starting with Linux kernel 4.7, the Radeon R9 290 shows very low performance, about 20-30% of what it delivers with kernels up to 4.6. As can be seen in this article, the performance regression started even before the release of kernel 4.7 RC1, with Alex Deucher's drm-next-4.7 branch:

http://www.phoronix.com/scan.php?page=article&item=radeon-drm-next-47&num=1

Some time later it was speculated that the problem may be with DPM:

http://www.phoronix.com/scan.php?page=news_item&px=Linux-4.7-R9-290-Regression

To reproduce the bug, install Ubuntu 16.04 64-bit and install kernel 4.7 from here (preferably newer than rc5):

http://kernel.ubuntu.com/~kernel-ppa/mainline/ 

The performance regression will show right away. Rebooting the system and choosing a kernel version up to 4.6 will fix the bug.
Comment 1 Alex Deucher 2016-08-09 13:48:29 UTC
Can you bisect?
Comment 2 Jan Ziak 2016-08-10 18:45:28 UTC
Just a note:

I am not experiencing such a performance regression on R9 390 with kernel 4.7.0 - 4.8.0-rc1, amdgpu.ko kernel module, Gentoo Linux.
Comment 3 Clésio Luiz 2016-08-10 19:17:40 UTC
(In reply to Alex Deucher from comment #1)
> Can you bisect?

Unfortunately, no. The best I can say is that it started very early in the kernel 4.7 schedule, since the Phoronix benchmark in the first link was made before 4.7 RC1 was out, using Alex Deucher's drm-next-4.7 branch, on 05/14/2016.

The problem is still present in 4.8 RC1.
Comment 4 Alex Deucher 2016-08-10 19:40:00 UTC
Unfortunately, none of the power management code for these asics changed during that time period, so nothing jumps out as a likely culprit.  Did you also change your mesa stack or firmware in that time period?  With everything else the same, does the issue go away when using an older kernel?  If so, any chance you could narrow it down which kernels have the issue vs. not?
Comment 5 Alex Deucher 2016-08-10 19:43:38 UTC
Can you verify that 4.6 works properly?
Comment 6 Alex Deucher 2016-08-10 19:51:15 UTC
Does reverting 5e031d9fe8b0741f11d49667dfc3ebf5454121fd (drm/radeon/pm: update current crtc info after setting the powerstate) help?
Comment 7 Kai 2016-08-10 20:25:59 UTC
I can confirm this issue with 4.7.0 (vs. 4.6.4 in my case). In XCOM 2 I'm losing a third in FPS (down to <= 19 FPS from 26-29 FPS with 4.6.4, "measured" with Gallium's HUD) or about a fifth in The Talos Principle 64 bit (down to 43 Avg FPS from 53 Avg FPS, measured by the benchmark of TTP when run for 60 seconds). The only change between these numbers is the different kernel (and that's with all the VM faults I'm seeing for XCOM 2 with 4.6.4 on occasion and haven't been able to reproduce with 4.7.0 yet).

The stack I'm using (Debian testing as a base) is:
GPU: Hawaii PRO [Radeon R9 290] (ChipID = 0x67b1)
Mesa: Git:master/3fb4a9b3b3
libdrm: 2.4.70-1
LLVM: SVN:trunk/r277307 (4.0 devel)
X.Org: 2:1.18.4-1
Linux: 4.6.4 / 4.7.0
Firmware: Git:master/c170c8d957 (placed in /lib/firmware/updates; fallback would be firmware-amd-graphics/20160110-1)
libclc: Git:master/785bfd3719
DDX: 1:7.7.0-1

I'm not going to be able to do a bisect before the weekend and even then I don't have high hopes, since the last time I tried bisecting the kernel all the merge commits screwed me over and I was unable to find a single offending commit.
Comment 8 Kai 2016-08-10 20:52:52 UTC
(In reply to Alex Deucher from comment #6)
> Does reverting 5e031d9fe8b0741f11d49667dfc3ebf5454121fd (drm/radeon/pm:
> update current crtc info after setting the powerstate) help?

No, it does not for me (I fetched the patch from <https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/patch/?id=5e031d9fe8b0741f11d49667dfc3ebf5454121fd> and applied it with patch -p1 -R < /path/to.patch). The Talos Principle is still down to 43 FPS, haven't tested XCOM 2 or anything else.

Btw, not sure if it can help tracking down the cause: the GPU doesn't seem to go into its highest performance mode or at least the fan is not reaching its highest RPM when I'm on 4.7.0 (with or without the revert).
Comment 9 Alex Deucher 2016-08-10 21:00:25 UTC
(In reply to Kai from comment #8)
> (In reply to Alex Deucher from comment #6)
> > Does reverting 5e031d9fe8b0741f11d49667dfc3ebf5454121fd (drm/radeon/pm:
> > update current crtc info after setting the powerstate) help?
> 
> No, it does not for me (I fetched the patch from
> <https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/patch/
> ?id=5e031d9fe8b0741f11d49667dfc3ebf5454121fd> and applied it with patch -p1
> -R < /path/to.patch). The Talos Principle is still down to 43 FPS, haven't
> tested XCOM 2 or anything else.
> 
> Btw, not sure if it can help tracking down the cause: the GPU doesn't seem
> to go into its highest performance mode or at least the fan is not reaching
> its highest RPM when I'm on 4.7.0 (with or without the revert).

You can check by running apps and monitoring the content of
/sys/kernel/debug/dri/64/radeon_pm_info
Do you see it scaling up under load on the problematic kernels?  Does forcing the clocks to high via sysfs work ok?
Comment 10 Kai 2016-08-11 17:07:10 UTC
(In reply to Alex Deucher from comment #9)
> (In reply to Kai from comment #8)
> > (In reply to Alex Deucher from comment #6)
> > > Does reverting 5e031d9fe8b0741f11d49667dfc3ebf5454121fd (drm/radeon/pm:
> > > update current crtc info after setting the powerstate) help?
> > 
> > No, it does not for me (I fetched the patch from
> > <https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/patch/
> > ?id=5e031d9fe8b0741f11d49667dfc3ebf5454121fd> and applied it with patch -p1
> > -R < /path/to.patch). The Talos Principle is still down to 43 FPS, haven't
> > tested XCOM 2 or anything else.
> > 
> > Btw, not sure if it can help tracking down the cause: the GPU doesn't seem
> > to go into its highest performance mode or at least the fan is not reaching
> > its highest RPM when I'm on 4.7.0 (with or without the revert).
> 
> You can check by running apps and monitoring the content of
> /sys/kernel/debug/dri/64/radeon_pm_info
> Do you see it scaling up under load on the problematic kernels?  Does
> forcing the clocks to high via sysfs work ok?

While the clocks *do* scale up on 4.7.0, they don't go as high as on 4.6.4. On 4.6.4 I'm reaching the maximum clocks for my GPU (power level avg    sclk: 98000 mclk: 125000). With 4.7.0 the highest I'm seeing is "power level avg    sclk: 90843 mclk: 125000" and the overall level is way lower (lots of sclk values starting with a 3) compared to 4.6.4, where most values are > 60000 for sclk.

Note: I used watch to dump /sys/kernel/debug/dri/64/radeon_pm_info every five seconds to a file. Between runs the actual numbers logged vary, but the general trend stays the same.

Writing "high" to /sys/class/drm/card0/device/power_dpm_force_performance_level locks the clock indeed to the highest values (sclk: 98000 mclk: 125000) on 4.7.0 and recovers almost all of the lost performance. There's still a small hit to performance.

Note: I've only tested "The Talos Principle" so far, if you need me to test XCOM 2 or other titles as well, let me know.

Do you still need the bisect?
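
The clock checks from this comment can be sketched as follows. This is only a minimal sketch: it assumes the debugfs and sysfs paths quoted above and a radeon_pm_info line shaped like "power level avg    sclk: 98000 mclk: 125000". The helper names read_sclk and force_dpm_level are hypothetical, and both need root on a real system.

```shell
#!/bin/sh
# Sketch only: default paths are the ones quoted in this comment; the
# function names are hypothetical helpers, not part of any driver tooling.

# Print the first sclk value found in a radeon_pm_info-style file.
# Assumes a line shaped like "power level avg    sclk: 98000 mclk: 125000".
read_sclk() {
    pm_info="${1:-/sys/kernel/debug/dri/64/radeon_pm_info}"
    sed -n 's/.*sclk: *\([0-9][0-9]*\).*/\1/p' "$pm_info" | head -n 1
}

# Write a DPM performance level to the sysfs file (second argument overrides
# the path, which is handy for dry runs against a scratch file).
force_dpm_level() {
    level="$1"
    target="${2:-/sys/class/drm/card0/device/power_dpm_force_performance_level}"
    case "$level" in
        auto|low|high) printf '%s\n' "$level" > "$target" ;;
        *) echo "unsupported level: $level" >&2; return 2 ;;
    esac
}
```

For example, `while sleep 5; do read_sclk; done > sclk.log` approximates the five-second logging described above, and `force_dpm_level high` followed later by `force_dpm_level auto` pins the clocks and then hands control back to DPM.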
Comment 11 Alex Deucher 2016-08-11 19:35:12 UTC
(In reply to Kai from comment #10)
> 
> Do you still need the bisect?

That would be great.  As I mentioned before, there weren't really any dpm related changes for these parts in 4.7 (there weren't many changes to radeon period), so I don't really have any ideas on what would have caused it right now.
Comment 12 Kai 2016-08-12 19:58:07 UTC
Ok, after 14 steps of bisection I identified the first bad commit as:

c63dd758589b1f7e8398841d1f443f06ebfbcefc is the first bad commit
commit c63dd758589b1f7e8398841d1f443f06ebfbcefc
Author: Michel Dänzer <michel.daenzer@amd.com>
Date:   Fri Apr 1 18:51:34 2016 +0900

    drm/radeon: Support DRM_MODE_PAGE_FLIP_ASYNC
    
    When this flag is set, we program the hardware to execute the flip
    during horizontal blank (i.e. for the next scanline) instead of during
    vertical blank (i.e. for the next frame).
    
    Currently this is only supported on ASICs which have a page flip
    completion interrupt (>= R600), and only if the use_pflipirq parameter
    has value 2 (the default).
    
    Reviewed-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

:040000 040000 2f3d8295e7fa2809a3546a23c64da33311e624b9 6cd9fd9b53df0942efab559295e4c11fc6cc0463 M      drivers

An additional build of 4.7.0 with c63dd758589b1f7e8398841d1f443f06ebfbcefc reverted maintains the performance level of 4.6.x.

Adding Michel to the CC list of this bug, since he authored the offending commit. Let me know, if you need anything else.
Comment 13 Kai 2016-08-12 19:58:58 UTC
Created attachment 125755
git bisect log output for the bisection run leading to c63dd758589b1f7e8398841d1f443f06ebfbcefc
Comment 14 Alexander Tsoy 2016-08-13 18:49:35 UTC
(In reply to Kai from comment #12)

This is interesting. The same bad commit was found here:
https://bugzilla.kernel.org/show_bug.cgi?id=119631
Comment 15 nadro-linux 2016-08-15 17:57:08 UTC
I also noticed a big performance regression since kernel v4.7rc1. With kernel 4.6 everything works fine. I have a Radeon R9 380 2GB (Sapphire ITX Compact). I'll try to build a custom kernel without the "support for DRM_MODE_PAGE_FLIP_ASYNC" commit and check whether performance is OK.
Comment 16 Alex Deucher 2016-08-15 18:12:54 UTC
Does just reverting this chunk fix the issue?

@@ -1630,6 +1631,9 @@ int radeon_modeset_init(struct radeon_device *rdev)
 
        rdev->ddev->mode_config.funcs = &radeon_mode_funcs;
 
+       if (radeon_use_pflipirq == 2 && rdev->family >= CHIP_R600)
+               rdev->ddev->mode_config.async_page_flip = true;
+
        if (ASIC_IS_DCE5(rdev)) {
                rdev->ddev->mode_config.max_width = 16384;
                rdev->ddev->mode_config.max_height = 16384;
Comment 17 nadro-linux 2016-08-15 20:02:34 UTC
Thanks for your help. When I removed the following line:
"adev->ddev->mode_config.async_page_flip = true;"
from all three files "dce_v8_0.c", "dce_v10_0.c", "dce_v11_0.c" (I use the AMDGPU driver), performance is fine again!
Comment 18 Michel Dänzer 2016-08-16 03:52:54 UTC
Created attachment 125808
radeon: Add some page flip debugging output

Well, that's a surprising result of the bisection.

I can imagine two possible causes, or possibly some combination thereof:

* The processing of asynchronous flips or the corresponding completion interrupts
  is delayed for some reason
* Using flips instead of blits for buffer swaps lowers the load on the GPU 3D
  engine, so the SMU doesn't switch to higher clocks

The attached debugging patch should give us more information about the former. With it applied, run the following while an affected application is running in fullscreen:

sudo sh -c 'echo 2 >/sys/module/drm/parameters/debug'; sleep 1; sudo sh -c 'echo 0 >/sys/module/drm/parameters/debug'

Then attach the resulting dmesg output.

BTW, does the problem still happen with Alex's current drm-next-4.9-wip branch?
Comment 19 Michel Dänzer 2016-08-16 06:15:54 UTC
BTW, there are some potential workarounds:

* Disable DRI3 for affected games with the environment variable
  LIBGL_DRI3_DISABLE=1

* Enable sync-to-vblank in affected applications, or force it with vblank_mode=3
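
The two workarounds above can be tried per application. A minimal sketch, assuming a POSIX shell; with_workaround is a hypothetical wrapper name, while the two environment variables are the ones named in the list:

```shell
#!/bin/sh
# Sketch: run a command with one of the workarounds from this comment.
# with_workaround is a hypothetical helper, not a Mesa tool.
with_workaround() {
    mode="$1"
    shift
    case "$mode" in
        # Disable DRI3 for this process only.
        no-dri3) env LIBGL_DRI3_DISABLE=1 "$@" ;;
        # Force sync-to-vblank for this process only.
        vsync)   env vblank_mode=3 "$@" ;;
        *)       echo "unknown workaround: $mode" >&2; return 2 ;;
    esac
}
```

Usage would look like `with_workaround no-dri3 ./game` or `with_workaround vsync ./game`, where `./game` stands in for the affected application. Setting the variable through `env` keeps it scoped to that one process instead of the whole session.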
Comment 20 Kai 2016-08-16 15:49:54 UTC
Created attachment 125820
dmesg output with additional debug info from attachment 125808

(In reply to Alex Deucher from comment #16)
> Does just reverting this chunk fix the issue?
> 
> @@ -1630,6 +1631,9 @@ int radeon_modeset_init(struct radeon_device *rdev)
>  
>         rdev->ddev->mode_config.funcs = &radeon_mode_funcs;
>  
> +       if (radeon_use_pflipirq == 2 && rdev->family >= CHIP_R600)
> +               rdev->ddev->mode_config.async_page_flip = true;
> +
>         if (ASIC_IS_DCE5(rdev)) {
>                 rdev->ddev->mode_config.max_width = 16384;
>                 rdev->ddev->mode_config.max_height = 16384;

Yes, just removing the default enable here has the same effect as reverting the entire patch.

(In reply to Michel Dänzer from comment #18)
> Created attachment 125808 [details] [review]
> radeon: Add some page flip debugging output
> 
> Well, that's a surprising result of the bisection.
> 
> I can imagine two possible causes, or possibly some combination thereof:
> 
> * The processing of asynchronous flips or the corresponding completion
> interrupts
>   is delayed for some reason
> * Using flips instead of blits for buffer swaps lowers the load on the GPU 3D
>   engine, so the SMU doesn't switch to higher clocks
> 
> The attached debugging patch should give us more information about the
> former. With it applied, run the following while an affected application is
> running in fullscreen:
> 
> sudo sh -c 'echo 2 >/sys/module/drm/parameters/debug'; sleep 1; sudo sh -c
> 'echo 0 >/sys/module/drm/parameters/debug'
> 
> Then attach the resulting dmesg output.

Here you go. That was generated by running XCOM 2.

> BTW, does the problem still happen with Alex's current drm-next-4.9-wip
> branch?

Haven't tested that yet. Maybe somebody else can do that. ;-)

(In reply to Michel Dänzer from comment #19)
> BTW, there are some potential workarounds:
> 
> * Disable DRI3 for affected games with the environment variable
>   LIBGL_DRI3_DISABLE=1
> 
> * Enable sync-to-vblank in affected applications, or force it with
> vblank_mode=3

Well, this is going to be odd: I had VSync enabled in XCOM, since without that option I got poorer performance in the past than with it. Now, after your note here I actually *disabled* the VSync option in the game. 4.6.4 (or 4.7.0 without the offending commit/the enable removed) no longer shows a performance difference and I'm getting ~30 FPS in XCOM 2. BUT vanilla 4.7.0 with your ASYNC patch and VSync turned off in the game ALSO gives me ~30 FPS! I'd still say this is a regression as there is no difference without your patch, but maybe this information can help you in narrowing down the cause?

I hope I haven't missed any open question. Let me know if you need anything else.
Comment 21 Michel Dänzer 2016-08-17 08:05:20 UTC
Created attachment 125838
loader/dri3: Always use 3 back buffers when flipping

Does this Mesa patch help?
Comment 22 Kai 2016-08-17 16:31:21 UTC
(In reply to Michel Dänzer from comment #21)
> Created attachment 125838 [details] [review]
> loader/dri3: Always use 3 back buffers when flipping
> 
> Does this Mesa patch help?

Yes! With attachment 125838 applied on top of Mesa master 607ab6d3bf I'm seeing the same performance on 4.7.1 as on 4.6.4.
You can have my
 Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>

The full stack used (Debian testing as a base) was:
GPU: Hawaii PRO [Radeon R9 290] (ChipID = 0x67b1)
Mesa: Git:master/3fb4a9b3b3 / Git:master/607ab6d3bf + attachment 125838
libdrm: 2.4.70-1
LLVM: SVN:trunk/r278555 (4.0 devel)
X.Org: 2:1.18.4-1
Linux: 4.7.1
Firmware: Git:master/c170c8d957 (placed in /lib/firmware/updates; fallback would be firmware-amd-graphics/20160110-1)
libclc: Git:master/785bfd3719
DDX: 1:7.7.0-1
Comment 23 Michel Dänzer 2016-08-18 01:02:31 UTC
Note that since the crash happens when Xorg tries to use the GPU, testing with DRI3 and Xorg not using the GPU directly cannot trigger the crash.
Comment 24 Michel Dänzer 2016-08-18 01:03:11 UTC
Sorry, wrong bug. :(
Comment 25 Alex Deucher 2016-08-23 15:56:35 UTC
Is anyone still having issues with vblank_mode=0?
Comment 26 Kai 2016-08-23 16:09:26 UTC
(In reply to Alex Deucher from comment #25)
> Is anyone still having issues with vblank_mode=0?

With or without Michel's patch (<https://patchwork.freedesktop.org/patch/106274/>), which btw, still hasn't landed despite the reviews?
Comment 27 Michel Dänzer 2016-08-24 00:42:44 UTC
(In reply to Kai from comment #26)
> (In reply to Alex Deucher from comment #25)
> > Is anyone still having issues with vblank_mode=0?
> 
> With or without Michel's patch
> (<https://patchwork.freedesktop.org/patch/106274/>), [...]

My patch doesn't have any effect with vblank_mode=0, which is why Alex asked. However, if there are still issues with that, I'd like them to be tracked in a separate report, ideally with an independent bisection confirming they started with the same kernel change.
Comment 28 Kai 2016-08-24 14:16:27 UTC
(In reply to Michel Dänzer from comment #27)
> (In reply to Kai from comment #26)
> > (In reply to Alex Deucher from comment #25)
> > > Is anyone still having issues with vblank_mode=0?
> > 
> > With or without Michel's patch
> > (<https://patchwork.freedesktop.org/patch/106274/>), [...]
> 
> My patch doesn't have any effect with vblank_mode=0, which is why Alex
> asked.

Ah, ok. That wasn't clear to me.

> However, if there are still issues with that, I'd like them to be
> tracked in a separate report, ideally with an independent bisection
> confirming they started with the same kernel change.

For me it doesn't matter if I set vblank_mode=0 or not. With your patch applied I'm getting 30 FPS in XCOM 2 either way (also doesn't matter if I enable VSync in the game's options or not, though the FPS graph is smoother and closer to 30 FPS flat with VSync on).

This is with 4.7.2, LLVM r279473, Mesa a2ae67aa47 + <https://patchwork.freedesktop.org/patch/106274/> and otherwise the stack described in comment #22.
Comment 29 Michel Dänzer 2016-08-25 06:41:33 UTC
(In reply to Kai from comment #28)
> For me it doesn't matter if I set vblank_mode=0 or not. With your patch
> applied I'm getting 30 FPS in XCOM 2 either way [...]

Right, the point being that you were getting 30 FPS even without the patch with sync-to-vblank disabled (which is equivalent to vblank_mode=0). So the issue you were seeing may not be exactly the same as the one reported on Phoronix, and possibly not the same as the one seen by Clésio.

Clésio (or anyone else who thinks they ran into the same problem), does this patch fix the problem you were seeing?
Comment 30 Clésio Luiz 2016-08-25 11:25:47 UTC
Well, I'm too much a newbie to compile those things (I'm using Ubuntu...). But if you point me to some documentation, I can test it in a clean install on another HDD.
Comment 31 Michel Dänzer 2016-08-26 00:48:53 UTC
(In reply to Clésio Luiz from comment #30)
> Well, I'm too much a newbie to compile those things (I'm using Ubuntu...).

The change is in Mesa Git master now, so you can test it with a PPA using that.
Comment 32 Clésio Luiz 2016-08-27 12:50:16 UTC
Padoka updated (mesa 12.1~git1600826212300.cf7be70~x~padoka0), so I ran a test with kernels 4.7.2 and 4.8 RC3. The problem persists.

The package xserver-xorg-video-ati from his PPA is from 08/24 though.
Comment 33 nadro-linux 2016-08-28 10:04:17 UTC
I didn't test mesa with provided patch, but I always run games with vblank_mode=0 (most of the time via game menu options -> Vsync disabled) and DRI3 enabled, so it looks like provided mesa patch will not fix issues on my R9 380 (I still use kernel 4.8 RC1 without async page flip). I see big regression eg. in Shadow of Mordor and OpenMW, but IIRC Metro 2033 and LL (both Redux) were affected too.
Comment 34 nadro-linux 2016-08-28 11:07:55 UTC
Sorry for the wrong information in my previous post; it looks like I had problems with 'forced vblank_mode to 1' in some apps. I downloaded a fresh kernel 4.8rc3 from the Ubuntu Kernel PPA + Mesa from the Padoka PPA and everything works fine now. (I tested Shadow of Mordor and OpenMW with vblank_mode set to 0 and 1.)
Comment 35 Michel Dänzer 2016-08-29 02:16:50 UTC
(In reply to Clésio Luiz from comment #32)
> The package xserver-xorg-video-ati from his PPA is from 08/24 though.

That doesn't matter; Mesa does, and it looks like it should have my change.

It would be interesting if setting draw->num_back = 4 in dri3_update_num_back helps, but otherwise we really need someone who can still reproduce the problem to bisect the kernel.
Comment 36 Jos van Wolput 2016-09-02 13:26:02 UTC
Since your patch 1e3218bc... (loader/dri3: Overhaul dri3_update_num_back),
vblank_mode=0 glxgears shows a performance regression of about 30% of what I got
before your patch was applied.

I am using Debian/Sid with
GL_RENDERER   = Gallium 0.4 on AMD RS780 (DRM 2.45.0 / 4.7.0-2.1-liquorix-amd64, LLVM 3.9.0)
GL_VERSION    = 3.0 Mesa 12.1.0-devel (git-cee459d8)
libGL: OpenDriver: /usr/lib/x86_64-linux-gnu/dri/r600_dri.so
libGL: Using DRI3 for screen 0

Reverting your patch to the previous state fixes this issue.
Comment 37 Dieter Nützel 2016-09-02 22:42:25 UTC
Have a look at Bug 97549, too.
Comment 38 Dieter Nützel 2016-09-02 22:59:35 UTC
I got a regression with 1e3218b... (loader/dri3: Overhaul dri3_update_num_back)
on Turks XT/Ni 6670

for

glxgears: ~5600 fps -> ~2600 fps (~50%)

mesa-demos/objview 'bobcat.obj': 1415 fps -> ~1300 fps (~8%)

and flickering with Blender 2.76b
'User Preferences...' -> 'Window Draw Method' 'Automatic'
Which is solvable by changing the  'Window Draw Method' to 'Triple Buffer'
and is an app bug like Michel Dänzer told in Bug 97059.

All three are fine again with 1e3218b reverted.
Comment 39 Michel Dänzer 2016-09-05 08:27:57 UTC
It would have been better to file new reports instead of adding comments here.

Anyway, please test the patch I attached to bug 97549.
Comment 40 Dieter Nützel 2016-09-05 16:11:01 UTC
(In reply to Michel Dänzer from comment #39)
> It would have been better to file new reports instead of adding comments
> here.

Sorry for that Michel.
I've appended it here because it overlaps with attachment 125838
https://bugs.freedesktop.org/attachment.cgi?id=125838
and your related patch from patchwork
https://patchwork.freedesktop.org/patch/106274
(comment #26).
 
> Anyway, please test the patch I attached to bug 97549.

I did.

It fixes the former two on _my_ hardware.
Except Blender, but that was expected...

So you have my T-b.
Comment 41 Dieter Nützel 2016-09-05 16:54:40 UTC
With Marek's latest Mesa commits (0d7ec8b) it helps even more.
The regression with 'glxgears' and 'objview/bobcat.obj' is much bigger and
FIXED with your patch from bug 97549.
Comment 42 Dieter Nützel 2016-09-06 11:15:07 UTC
Hello Kai et al.,

can you please retest with current Mesa git master (dc3bb5d).
And let us know if R9 290 low performance regression is fixed for you, too?
Comment 43 Kai 2016-09-06 11:21:03 UTC
(In reply to Dieter Nützel from comment #42)
> Hello Kai et al.,
> 
> can you please retest with current Mesa git master (dc3bb5d).
> And let us know if R9 290 low performance regression is fixed for you, too?

My regression was fixed with 1e3218bc5ba2b739261f0c0bacf4eb662d377236, see eg. comment #28. Michel didn't remove the "always three buffers" part so no need to retest anything.
Comment 44 Jos van Wolput 2016-09-06 11:51:39 UTC
(In reply to Michel Dänzer from comment #39)
> Anyway, please test the patch I attached to bug 97549.

Fixed on my hardware, thanks!
Comment 45 Clésio Luiz 2016-09-13 13:34:53 UTC
Padoka PPA finally updated, version 2.1~git1600912162600.546bc07~x~padoka0.

Here the bug continues. This time I took the time to test various games. Valve's Source Engine takes less of a hit from this bug, only about a 50% loss in performance. But others, like American Truck Simulator and Unigine Valley, take a 70-80% performance hit; neither can get past 10-12 FPS on a beefy Core i7 and an R9 290.

I tested in kernel 4.7.3 and 4.8-RC6.
Comment 46 alvarex 2016-09-14 23:31:09 UTC
I have hit this bug too with an R7 260X. I don't have time for bisecting, but with Mesa from September 7th the performance is fine. I've also noticed a performance difference between the 12.0.2 release and Git from September 7th. In Dirt Showdown with kernel 4.6.7 and LIBGL_DRI3_DISABLE, I get an average of 30-35 FPS; with Mesa from September 7th I get 50 FPS.
Anyway, I have set up a repo in case someone else on openSUSE is hitting the same bug.

https://build.opensuse.org/package/show?project=home%3Aalvarex%3Abranches%3Ahome%3Apontostroy%3AX11&package=Mesa
Comment 47 alvarex 2016-09-14 23:32:45 UTC
*edit* I get 50 FPS with the newer kernel 4.8rc5; with kernel 4.8rc5 and Mesa 12.0.2 the performance is the same.
Comment 48 alvarex 2016-09-15 00:13:40 UTC
edit: It's from September 6th, not from the 7th.
Comment 49 Michel Dänzer 2016-09-15 01:17:51 UTC
Seems like there are various different issues at play here. The bottom line is: Don't expect your issue to get fixed without bisecting it.
Comment 50 Clésio Luiz 2016-09-20 22:17:29 UTC
I opened the bug, but a couple weeks ago my card died, so I cannot provide info anymore.
Comment 51 Michel Dänzer 2016-09-21 02:10:06 UTC
Per comment 50, we won't be able to do anything more about Clésio's problem.

Anyone else still seeing similar symptoms, please file your own report and try narrowing down which change of which component introduced it.
Comment 52 Xavier Sellier 2016-09-21 02:20:01 UTC
Hey,

I have 2 computers with the very same hardware (AMD FX 6300 with an AMD R9 290).
We are experiencing the very same issue.

We're both running Debian sid.

uname -a
Linux binogure 4.6.0-1-amd64 #1 SMP Debian 4.6.4-1 (2016-07-18) x86_64 GNU/Linux

glxinfo | grep -i opengl
OpenGL vendor string: X.Org
OpenGL renderer string: Gallium 0.4 on AMD HAWAII (DRM 2.43.0 / 4.6.0-1-amd64, LLVM 3.8.1)
OpenGL core profile version string: 4.1 (Core Profile) Mesa 12.0.3
OpenGL core profile shading language version string: 4.10
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 3.0 Mesa 12.0.3
OpenGL shading language version string: 1.30
OpenGL context flags: (none)
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.0 Mesa 12.0.3
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.00
OpenGL ES profile extensions:

lspci | grep -i vga
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Hawaii PRO [Radeon R9 290]

Up to this kernel (4.6.1), everything runs fine (around 100 FPS in most games). Each time I upgrade the kernel to 4.7.0-1 or 4.8.0-rc5, the frame rate drops to 25~30 FPS.
To compare FPS, I use the Unigine Valley benchmark; I will attach the results to this bug.
Comment 53 Xavier Sellier 2016-09-21 02:20:36 UTC
Created attachment 126685
Unigine valley benchmark (Kernel 4.6.0-1)
Comment 54 Xavier Sellier 2016-09-21 02:35:38 UTC
I'll upload my next benchmark, and it seems I have the very same result between kernel 4.6 and kernel 4.8.0-rc5
Comment 55 Xavier Sellier 2016-09-21 02:36:01 UTC
Created attachment 126686
Unigine valley benchmark (Kernel 4.8.0-rc5)
Comment 56 Michel Dänzer 2016-09-21 02:41:33 UTC
Created attachment 126687
Use 4 buffers for flipping

(In reply to Xavier Sellier from comment #52)
> We are experiencing the very same issue.

And you ignored my explicit request to file your own report because... ?

Anyway, there are only two things which can move this issue forward: bisecting the kernel, or trying if this Mesa patch helps.
Comment 57 Alex Deucher 2016-10-04 15:06:31 UTC
Created attachment 126995
disable async flip support

If anyone else is seeing a regression not already fixed, please bisect and open a new bug with the results.  If you can't bisect, does disabling async flips help (see attached patch)?
Comment 58 Michel Dänzer 2016-10-05 06:27:59 UTC
(In reply to Alex Deucher from comment #57)
> If you can't bisect, does disabling async flips help (see attached patch)?

Phoronix was still hitting the bad performance even with the broken drm_mode_page_flip_ioctl code, which prevented page flips from working at all. So that problem can't be even indirectly related to async flips. Seems like it's most likely a DPM issue in the radeon driver. We really need someone affected by that to bisect.
Comment 59 Alex Deucher 2016-10-05 17:57:24 UTC
Does reverting either of these patches help?

commit 5e031d9fe8b0741f11d49667dfc3ebf5454121fd
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Wed Feb 24 17:38:38 2016 -0500

    drm/radeon/pm: update current crtc info after setting the powerstate
    
    On CI, we need to see if the number of crtcs changes to determine
    whether or not we need to upload the mclk table again.  In practice
    we don't currently upload the mclk table again after the initial load.
    The only reason you would would be to add new states, e.g., for
    arbitrary mclk setting which is not currently supported.
    
    Acked-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Cc: stable@vger.kernel.org

or

commit d74e766e1916d0e09b86e4b5b9d0f819628fd546
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Tue Mar 8 11:31:00 2016 -0500

    Revert "drm/radeon/pm: adjust display configuration after powerstate"
    
    This reverts commit 39d4275058baf53e89203407bf3841ff2c74fa32.
    
    This caused a regression on some older hardware.
    
    bug:
    https://bugzilla.kernel.org/show_bug.cgi?id=113891
    
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    Cc: stable@vger.kernel.org
Comment 60 alvarex 2016-10-12 04:46:32 UTC
I've been following this on Phoronix and with different builds of Mesa. Recently it was fixed in Mesa for amdgpu but not for radeon (there was an article on Phoronix about that), but as of Oct 11th it is broken again on amdgpu. I'm not sure when it got broken again; I have a working build from Oct 7th, so maybe on the 9th or the 10th. Just my two cents: I'm not quite sure it's a power management issue. I tried bisecting Mesa but failed because sometimes it won't compile. Sorry, I'm not good with git.
Anyway, I'm on different hardware (R7 260X), and yes, you are right, Michel, it is a different issue, because sometimes I will hit the bug with the same build and someone else with an R9 290 won't, or vice versa (not always). But it might be related as well, because the performance hit is similar and with some setups, i.e. kernel 4.6 and Mesa 12, the performance is fine on both kinds of hardware.
Comment 61 Tim Writer 2016-10-12 04:46:44 UTC
Created attachment 127231 [details]
attachment-13933-0.html

I'm out of the office, returning Tue Oct 18. I will be checking mail from time to time but responses will be delayed.

Regards,
Tim
Comment 62 Michel Dänzer 2016-10-12 06:07:23 UTC
(In reply to alvarex from comment #60)
> I tried bisecting mesa but failed, because sometimes it won't compile.
> Sorry, I'm not good with git.

Just run "git bisect skip" for commits which don't compile.


> Anyway, I'm on different hardware (R7 260X), and yes, you are right Michel,
> it is a different issue, because sometimes I will hit the bug with the same
> build while someone else with an R9 290 won't, or vice versa (not always).

This means you should file your own bug report.
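
For an automated bisect, the same skip can be expressed through exit codes: `git bisect run` treats exit code 125 as "this commit cannot be tested", which is what `git bisect skip` does by hand. A minimal sketch, assuming you supply your own build command and your own pass/fail test command (the `bisect_step` name and both arguments are hypothetical, not part of git or mesa):

```shell
# Hypothetical helper: map a build/test outcome to the exit codes that
# `git bisect run` understands (0 = good, 125 = skip, 1 = bad).
bisect_step() {
    build_cmd=$1   # assumption: a command that builds the current checkout
    test_cmd=$2    # assumption: a command that exits 0 when the commit is good

    # A commit that does not build cannot be judged, so ask for a skip.
    if ! sh -c "$build_cmd" >/dev/null 2>&1; then
        return 125
    fi

    # The build worked; let the test decide between good and bad.
    if sh -c "$test_cmd" >/dev/null 2>&1; then
        return 0
    fi
    return 1
}
```

A wrapper script that sources this helper and ends with `bisect_step "<build command>" "<test command>"; exit $?` could then be handed to `git bisect run`. For a regression that can only be judged by eye, the manual `git bisect skip` above remains the simpler route.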
Comment 63 alvarex 2016-10-12 10:24:11 UTC
If I remove the line "adev->ddev->mode_config.async_page_flip = true;" from the files in comment 17 and boot with radeon.dpm=0 radeon.runpm=0, that triggers the bug: Tomb Raider, for example, runs at 9 FPS where it previously ran at 75 FPS, on amdgpu.
Comment 64 Alex Deucher 2016-10-12 15:40:41 UTC
(In reply to alvarex from comment #63)
> If I remove the line "adev->ddev->mode_config.async_page_flip = true;" from
> the files in comment 17 and boot with radeon.dpm=0 radeon.runpm=0, that
> triggers the bug: Tomb Raider, for example, runs at 9 FPS where it
> previously ran at 75 FPS, on amdgpu.

When you set radeon.dpm=0, low performance is expected because you've disabled the power management, so the card stays at its low boot-up clocks.  You shouldn't mess with the radeon.dpm option for this issue.
Comment 65 tarpoon 2017-03-19 00:37:07 UTC
I just upgraded to an R9 290 (Asus Direct CU II OC) from my HD7950 and also get less than half the performance with the 4.8 kernel vs 4.4 on Ubuntu 16.04. I'll try to make some time to bisect it next week. Should I post the results here or open a new bug?
Comment 66 tarpoon 2017-03-19 19:42:12 UTC
Okay. I bisected the issue today already:

7050c6ef5f0e9bc5e6bf9eb035320b70f731b919 is the first bad commit
commit 7050c6ef5f0e9bc5e6bf9eb035320b70f731b919
Author: Arindam Nath <arindam.nath@amd.com>
Date:   Wed Apr 6 15:33:51 2016 -0400

    drm/radeon: add support for loading new UVD fw
    
    Signed-off-by: Christian König <christian.koenig@amd.com>
    Signed-off-by: Arindam Nath <arindam.nath@amd.com>
    Reviewed-by: Leo Liu <leo.liu@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

:040000 040000 25ff8fc8784746ea4e371a13655581a3e844e7b9 9016d0a57c0409e91eff182bbccd91510a8b85c1 M      drivers

I use mesa from oibaf ppa:

glxinfo | grep -i opengl
OpenGL vendor string: X.Org
OpenGL renderer string: Gallium 0.4 on AMD HAWAII (DRM 2.43.0 / 4.6.0-rc3-funfzehn, LLVM 4.0.0)
OpenGL core profile version string: 4.5 (Core Profile) Mesa 17.1.0-devel
OpenGL core profile shading language version string: 4.50
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 3.0 Mesa 17.1.0-devel
OpenGL shading language version string: 1.30
OpenGL context flags: (none)
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.1 Mesa 17.1.0-devel
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.10
OpenGL ES profile extensions:


I also tested the 4.10 kernel, it has the same issue.


Content of /sys/kernel/debug/dri/64/radeon_pm_info in the start menu of Shadow of Mordor (which has 100 FPS in 4.6.7 and 25 FPS in 4.7.0 and later):

Before starting the game:
uvd    disabled
vce    disabled
power level avg    sclk: 30000 mclk: 15000

In the main menu of the game with 4.7.0:
uvd    disabled
vce    disabled
power level avg    sclk: 100000 mclk: 15000

In the main menu of the game with 4.6.7:
uvd    disabled
vce    disabled
power level avg    sclk: 91198 mclk: 126000
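
The readings above differ mainly in mclk: it stays at 15000 under 4.7.0 but reaches 126000 under 4.6.7. To compare such dumps without reading them by eye, something like the following could pull the two clock values out of the "power level" line. This is only a sketch; the `pm_clocks` name is made up here, and it assumes the line format shown in this comment.

```shell
# Hypothetical helper: read radeon_pm_info-style text on stdin and print
# "<sclk> <mclk>" for each "power level" line, in the units the file uses.
pm_clocks() {
    awk '/power level/ {
        s = ""; m = ""
        # Walk the fields and take the value that follows each label.
        for (i = 1; i < NF; i++) {
            if ($i == "sclk:") s = $(i + 1)
            if ($i == "mclk:") m = $(i + 1)
        }
        print s, m
    }'
}
```

Usage would be along the lines of `sudo cat /sys/kernel/debug/dri/64/radeon_pm_info | pm_clocks`, run once per kernel while the game is in its main menu.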
Comment 67 tarpoon 2017-03-19 19:44:21 UTC
Created attachment 130317 [details]
bisect log leading to 7050c6ef5f0e9bc5e6bf9eb035320b70f731b919
Comment 68 Alex Deucher 2017-03-20 14:30:25 UTC
The firmware was fixed a while ago.  Please make sure your distro is using the latest firmware.
Comment 69 Alex Deucher 2017-03-20 14:31:05 UTC
(In reply to Alex Deucher from comment #68)
> The firmware was fixed a while ago.  Please make sure your distro is using
> the latest firmware.

https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git
Comment 70 tarpoon 2017-03-20 16:37:48 UTC
(In reply to Alex Deucher from comment #69)
> (In reply to Alex Deucher from comment #68)
> > The firmware was fixed a while ago.  Please make sure your distro is using
> > the latest firmware.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git

Thanks a lot for the reply. My distro is Kubuntu 16.04 with the Oibaf PPA for Mesa.

Replacing the content of /lib/firmware/radeon with the files found in https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/radeon (I just deleted and redownloaded everything) and doing a "sudo update-initramfs -u" restored the performance to normal levels using radeon (haven't tried amdgpu yet).

Shadow of Mordor Benchmark FPS:
Kernel Version | Menu | AVG | MAX | MIN
Kernel 4.6.7 | 100 | 58 | 161 | 30
Kernel 4.10 (with new Firmware) | 100 | 58 | 114 | 30
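
For anyone repeating the firmware replacement described above, the copy step could look roughly like this. It is a sketch under assumptions: the `refresh_radeon_fw` name is made up, it expects a local checkout of the linux-firmware repository, it assumes the blobs are `*.bin` files, and it overwrites files rather than deleting the directory first as I did.

```shell
# Hypothetical helper: copy the radeon firmware blobs from a linux-firmware
# checkout into a target directory, overwriting files of the same name.
refresh_radeon_fw() {
    src=$1   # assumption: path to a linux-firmware checkout
    dst=$2   # target directory, e.g. /lib/firmware/radeon

    # Refuse to continue if the checkout has no radeon/ directory.
    if [ ! -d "$src/radeon" ]; then
        echo "no radeon/ directory in $src" >&2
        return 1
    fi

    mkdir -p "$dst" && cp -f "$src"/radeon/*.bin "$dst"/
}
```

On Kubuntu 16.04 this needs root for /lib/firmware/radeon, and "sudo update-initramfs -u" still has to be run afterwards so the new files end up in the initramfs.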

