84662 – Long pauses with Unreal demo Elemental on R9270X since : Always flush the HDP cache before submitting a CS to the GPU

Status:	RESOLVED FIXED

Alias:	None

Product:	Mesa
Classification:	Unclassified
Component:	Drivers/Gallium/radeonsi
Version:	git
Hardware:	Other All

Importance:	medium normal
Assignee:	Default DRI bug account

Reported:	2014-10-04 14:50 UTC by Andy Furniss
Modified:	2015-11-05 09:55 UTC
CC List:	3 users

Attachments
dmesg on bad commit (55.21 KB, text/plain) 2014-10-04 14:51 UTC, Andy Furniss
gallium hud on good - longer pause matches bump on vram/gtt graphs (1.40 MB, image/png) 2014-10-04 14:57 UTC, Andy Furniss
gallium hud on bad showing different vram gtt usage (1.40 MB, image/png) 2014-10-04 14:58 UTC, Andy Furniss
Make Mesa behave as if the kernel was older (1.05 KB, patch) 2014-10-06 07:24 UTC, Michel Dänzer
bad gtt usage (Borderlands 2) (2.86 MB, image/png) 2014-10-06 16:59 UTC, Chernovsky Oleg
ttm: Don't evict BOs outside of requested placement range (1.67 KB, patch) 2014-10-07 03:47 UTC, Michel Dänzer
vram usage with Don't evict BOs outside of requested placement range (1.38 MB, image/png) 2014-10-07 13:17 UTC, Andy Furniss
r600g,radeonsi: Use staging texture for transfers if any miplevel is tiled (945 bytes, patch) 2014-10-08 08:03 UTC, Michel Dänzer
winsys/radeon: Use separate caching buffer manager for each set of flags (5.81 KB, patch) 2014-10-08 08:04 UTC, Michel Dänzer
drm/radeon: Try placing NO_CPU_ACCESS BOs outside of CPU accessible VRAM (2.88 KB, patch) 2014-10-08 08:06 UTC, Michel Dänzer
Hud with 2 kernel and 2 mesa patches better but still bad. (1.39 MB, image/png) 2014-10-08 15:22 UTC, Andy Furniss
Kernel errors with mesa patched + agd5f drm-next-3.19 (91.33 KB, text/plain) 2014-10-15 09:55 UTC, Andy Furniss
new issue long pause at end of demo (647.54 KB, image/png) 2014-11-03 22:03 UTC, Andy Furniss
dmesg while playing L4D2 on a RV770 with kernel 3.17.7 (10.06 KB, text/plain) 2014-12-22 21:54 UTC, Benjamin Bellec
elemental stalls near end with large vram -> gtt move (986.82 KB, image/png) 2015-01-08 17:30 UTC, Andy Furniss

Description Andy Furniss 2014-10-04 14:50:48 UTC

R9270X PCIE 2.0 2 gig vram 4 gig ram.

Unreal demo Elemental only recently started working with git llvm/mesa, so no bisection was done on these.

The demo, even when working, is very broken: constant 1/4 sec stutters, plus a couple of 1/2 to 1 sec.

But after -

commit 4439d469706699b4e69ef410ebc9115339f6e9e6
Author: Michel Dänzer <michel.daenzer@amd.com>
Date:   Thu Jul 31 18:43:49 2014 +0900

    drm/radeon: Always flush the HDP cache before submitting a CS to the GPU
    
    This ensures the GPU sees all previous CPU writes to VRAM, which makes it
    safe:
    
    * For userspace to stream data from CPU to GPU via VRAM instead of GTT
    * For IBs to be stored in VRAM instead of GTT
    * For ring buffers to be stored in VRAM instead of GTT, if the HPD flush
      is performed via MMIO


There are far longer pauses of many seconds. Run time to the unreal logo on "good" is 3:45; on "bad" the best is 5:39 and it can be longer.
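To put those two run times side by side, here is a small worked calculation in Python. It only uses the 3:45 and 5:39 figures quoted above; the helper name `to_seconds` is made up for illustration.

```python
def to_seconds(stamp: str) -> int:
    """Convert an 'M:SS' string into whole seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


good = to_seconds("3:45")  # run time to the logo on the "good" kernel
bad = to_seconds("5:39")   # best run time seen on the "bad" kernel

extra = bad - good         # additional seconds spent, mostly in pauses
slowdown = bad / good      # relative slowdown factor

print(f"good={good}s bad={bad}s extra={extra}s slowdown={slowdown:.2f}x")
# prints: good=225s bad=339s extra=114s slowdown=1.51x
```

So even the best "bad" run takes about two minutes longer, roughly 1.5 times the "good" run time.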

Screens to be attached show quite different vram/gtt usage between good and bad.

Related to 

https://bugs.freedesktop.org/show_bug.cgi?id=82050

I bisected to the same commit in that bug, but applying the mesa patch

https://bugs.freedesktop.org/show_bug.cgi?id=82050#c19

doesn't help here.

CONFIG_CMA is not set.

Tried kernel.org 3.17-rc7, as it was reported as working in the above bug, but it is still bad for me.

Comment 1 Andy Furniss 2014-10-04 14:51:41 UTC

Created attachment 107325 [details]
dmesg on bad commit

Comment 2 Andy Furniss 2014-10-04 14:57:10 UTC

Created attachment 107326 [details]
gallium hud on good - longer pause matches bump on vram/gtt graphs

Comment 3 Andy Furniss 2014-10-04 14:58:41 UTC

Created attachment 107327 [details]
gallium hud on bad showing different vram gtt usage

Comment 4 Andy Furniss 2014-10-05 15:41:42 UTC

(In reply to Andy Furniss from comment #0)

> I bisected to the same commit in that bug, but applying the mesa patch
> 
> https://bugs.freedesktop.org/show_bug.cgi?id=82050#c19
> 
> doesn't help here.

It seems that with current llvm/mesa that patch doesn't help Unigine Valley anymore.

Comment 5 smoki 2014-10-05 17:31:25 UTC

(In reply to Andy Furniss from comment #3)
> Created attachment 107327 [details]
> gallium hud on bad showing different vram gtt usage

 Hmm, whenever GTT starts filling up, fps goes down to a crawl... I can't run Elemental on Kabini, I get just 1-3 fps :D , but I am curious what happens for you in the HUD when you just comment out this in r600_buffer_common.c:

	default:
		/* Not listing GTT here improves performance in some apps. */
		res->domains = RADEON_DOMAIN_VRAM;
//		flags |= RADEON_FLAG_GTT_WC;
		break;
	}

Comment 6 smoki 2014-10-05 17:48:50 UTC

 I mean only that line of course:
 
//		flags |= RADEON_FLAG_GTT_WC;

 Just checked on valley so yeah, seems like anything about GTT does not like to be mentioned there as default ;)

Comment 7 Andy Furniss 2014-10-05 19:24:47 UTC

(In reply to smoki from comment #5)
> (In reply to Andy Furniss from comment #3)
> > Created attachment 107327 [details]
> > gallium hud on bad showing different vram gtt usage
> 
>  Hmm, whenever GTT starts filling up, fps goes down to a crawl... I can't run
> Elemental on Kabini, I get just 1-3 fps :D , but I am curious what happens for
> you in the HUD when you just comment out this in r600_buffer_common.c:
> 
> 	default:
> 		/* Not listing GTT here improves performance in some apps. */
> 		res->domains = RADEON_DOMAIN_VRAM;
> //		flags |= RADEON_FLAG_GTT_WC;
> 		break;
> 	}

It looks roughly the same as the pic posted.

Comment 8 smoki 2014-10-05 20:15:15 UTC

 
 And what about Valley? If that is also the same, I don't know (maybe dedicated VRAM does not trigger this or something); that one made a noticeable difference for me :) I'm guessing you run all this without compositing, because there is an unsolvable stutter for me with Unigine Valley on Linux, and on Windows with Aero. When I comment this out it runs the same as in Windows: first round only two slight stutters (normal ones), then the second time there is no stutter.

 I just tried Xonotic on ultimate settings; there is also one scene where this shows a much smaller fps drop for me, and that is also a case where GTT fills up.

Comment 9 Chernovsky Oleg 2014-10-05 20:50:50 UTC

I experience the same freezes (1-1.3 sec) every 10 sec in any Valve Source engine-based games.

Comment 10 Chernovsky Oleg 2014-10-05 21:17:27 UTC

kernel 3.17-rc7, mesa 10.3, llvm 3.5.0

Can provide any additional info on request.

Comment 11 smoki 2014-10-05 21:29:46 UTC

(In reply to Chernovsky Oleg from comment #9)
> I experience the same freezes (1-1.3 sec) every 10 sec in any Valve Source
> engine-based games.

 Well, open a new bug; stutters/freezes can have various causes... hardware combinations, APU or dedicated card, AMD or Intel processor, which governor is used, composite or not, vblank on or not, spilling freezes, some particular effect being unoptimized, etc. This one is, I think, only about GTT load and the fps drop it causes, so if you are sure that is the cause (check with GALLIUM_HUD=fps,GTT-usage), then it is the same bug.
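For anyone who wants to script that check, the following is a minimal Python sketch that launches a program with the GALLIUM_HUD variable set to the graphs named above. The helper names and the example binary path are assumptions for illustration, not part of Mesa.

```python
import os
import subprocess


def hud_env(graphs=("fps", "GTT-usage"), base=None):
    """Return a copy of the environment with GALLIUM_HUD set to `graphs`."""
    env = dict(os.environ if base is None else base)
    env["GALLIUM_HUD"] = ",".join(graphs)
    return env


def run_with_hud(command, graphs=("fps", "GTT-usage")):
    """Run `command` (a list of arguments) with the HUD graphs enabled."""
    return subprocess.run(command, env=hud_env(graphs), check=False)


# Example with a hypothetical binary name:
# run_with_hud(["./ElementalDemo"])
```

If the fps graph drops at the same moments the GTT-usage graph spikes, that matches the behaviour discussed in this report.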

Comment 12 Michel Dänzer 2014-10-06 07:24:18 UTC

Created attachment 107391 [details] [review]
Make Mesa behave as if the kernel was older

(In reply to Andy Furniss from comment #3)
> gallium hud on bad showing different vram gtt usage

That's very interesting indeed — on the 'good' screenshot, requested and actual VRAM usage match up almost perfectly, but on the 'bad' one, there's a large discrepancy.

With a current kernel, does the bad behaviour start with the same Mesa commit as the Unigine stutters?

The attached patch makes Mesa behave the same with current kernels as it does with older ones which don't have the bisected commit. Does that avoid the problem?

Comment 13 Andy Furniss 2014-10-06 11:35:29 UTC

(In reply to Michel Dänzer from comment #12)
> Created attachment 107391 [details] [review]
> Make Mesa behave as if the kernel was older
> 
> (In reply to Andy Furniss from comment #3)
> > gallium hud on bad showing different vram gtt usage
> 
> That's very interesting indeed — on the 'good' screenshot, requested and
> actual VRAM usage match up almost perfectly, but on the 'bad' one, there's a
> large discrepancy.
> 
> With a current kernel, does the bad behaviour start with the same Mesa
> commit as the Unigine stutters?

I can't go back that far with Elemental due to llvm asserts.

> 
> The attached patch makes Mesa behave the same with current kernels as it
> does with older ones which don't have the bisected commit. Does that avoid
> the problem?

No, the patch makes no difference.

Comment 14 Andy Furniss 2014-10-06 14:33:58 UTC

(In reply to Michel Dänzer from comment #12)
> Created attachment 107391 [details] [review]
> Make Mesa behave as if the kernel was older
> 
> (In reply to Andy Furniss from comment #3)
> > gallium hud on bad showing different vram gtt usage
> 
> That's very interesting indeed — on the 'good' screenshot, requested and
> actual VRAM usage match up almost perfectly, but on the 'bad' one, there's a
> large discrepancy.
> 
> With a current kernel, does the bad behaviour start with the same Mesa
> commit as the Unigine stutters?
> 
> The attached patch makes Mesa behave the same with current kernels as it
> does with older ones which don't have the bisected commit. Does that avoid
> the problem?

More testing - incomplete but will post initial findings as I don't know when I'll finish.

I managed to get older mesa to build with new llvm by applying the build fixes.

With mesa on the commit before
r600g,radeonsi: Set RADEON_GEM_NO_CPU_ACCESS flag for tiled BOs
(I don't know yet exactly what mesa commit changed things)

I can with your patch or the stream revert get "good" behavior, this was with a kernel on the "bad" HPD kernel commit.

Will test more over time.

Comment 15 Chernovsky Oleg 2014-10-06 16:59:03 UTC

Created attachment 107438 [details]
bad gtt usage (Borderlands 2)

(In reply to smoki from comment #11)
> (In reply to Chernovsky Oleg from comment #9)
> > I experience the same freezes (1-1.3 sec) every 10 sec in any Valve Source
> > engine-based games.
> 
>  Well, open a new bug; stutters/freezes can have various causes... hardware
> combinations, APU or dedicated card, AMD or Intel processor, which
> governor is used, composite or not, vblank on or not, spilling freezes, some
> particular effect being unoptimized, etc. This one is, I think, only about
> GTT load and the fps drop it causes, so if you are sure that is the cause
> (check with GALLIUM_HUD=fps,GTT-usage), then it is the same bug.

I know what I'm talking about. It's this bug, so I've written here.

Here's the proof (Unreal Engine 3, though, I can provide it with any Valve game at will). Spikes on the GTT usage have corresponding falls of the FPS.

Radeon R7 260X (BONAIRE)

Comment 16 Aaron B 2014-10-06 17:15:06 UTC

Don't know what you guys are testing for exactly, but here's my chart for your consumption. R9 270X 2GB. Where I screen shot, the demo also just freezes. But, that's another issue for later. Here you go. I also have long pauses and freezes and everything.

http://i.imgur.com/K5uMaAr.jpg

Comment 17 Michel Dänzer 2014-10-07 03:47:36 UTC

Created attachment 107451 [details] [review]
ttm: Don't evict BOs outside of requested placement range

Does this kernel patch help?

Comment 18 Andy Furniss 2014-10-07 09:39:29 UTC

(In reply to Andy Furniss from comment #14)

> With mesa on the commit before
> r600g,radeonsi: Set RADEON_GEM_NO_CPU_ACCESS flag for tiled BOs
> (I don't know yet exactly what mesa commit changed things)
> 
> I can with your patch or the stream revert get "good" behavior, this was
> with a kernel on the "bad" HPD kernel commit.
> 
> Will test more over time.

I haven't tried the patch yet. Will do, but I got some strange results last thing yesterday which make me hesitant about declaring any patch good or bad.

I started resetting around in mesa and noticed unexpected goods - so I tried head and got a good both with and without the 1st patch.

Rebooted into 3.17-rc7 still good.

Did some recompiling of llvm/mesa as I had changed my normal setup slightly - got bad.

Repeated resetting mesa to older + patch got good, reset to head again, bad. Applied patch on head good, reversed patch on head - still good.

I am wondering at this time whether card is in some nice state so power cycle - still good. Boot into 3.18 still good, clean & recompile the same mesa without doing anything else - bad again. Still bad after power cycles.

I always make distclean + git clean -dfx when doing anything other than applying or reversing a patch.

So it seems that there is some randomness whether I am good or bad between mesa build/installs.

This does not however tally with my kernel bisect which seemed to go flawlessly.

I assume there is no card state that can survive power off.

So currently a bit confused - as reported by Christoph in the other bug, it is possible to have good with vanilla mesa/kernel, but apparently for me the "same" mesa can also be bad.

I did try more clean/rebuild cycles including single thread but am currently still bad.

Comment 19 Andy Furniss 2014-10-07 13:13:56 UTC

(In reply to Michel Dänzer from comment #17)
> Created attachment 107451 [details] [review]
> ttm: Don't evict BOs outside of requested placement range
> 
> Does this kernel patch help?

agd5f drm-next-3.18-wip + patch is still bad.

Bad can be a bit variable, but I would call it as one of the better bads and the vram usage is closer to a good - pic to follow.

Another observation - On good or bad the demo takes a while to get going.

Time to start of normal rendering on this patched kernel is around 2m5s

If I wiggle the window around while it's loading/stuck = 1m23s

Comment 20 Andy Furniss 2014-10-07 13:17:01 UTC

Created attachment 107493 [details]
vram usage with Don't evict BOs outside of requested placement range

Comment 21 Alexandre Demers 2014-10-07 14:19:41 UTC

(In reply to Andy Furniss from comment #18)
> (In reply to Andy Furniss from comment #14)
> 
> > With mesa on the commit before
> > r600g,radeonsi: Set RADEON_GEM_NO_CPU_ACCESS flag for tiled BOs
> > (I don't know yet exactly what mesa commit changed things)
> > 
> > I can with your patch or the stream revert get "good" behavior, this was
> > with a kernel on the "bad" HPD kernel commit.
> > 
> > Will test more over time.
> 
> I haven't tried the patch yet. Will do, but I got some strange results last
> thing yesterday which make me hesitant about declaring any patch good
> or bad.
> 
> I started resetting around in mesa and noticed unexpected goods - so I tried
> head and got a good both with and without the 1st patch.
> 
> Rebooted into 3.17-rc7 still good.
> 
> Did some recompiling of llvm/mesa as I had changed my normal setup slightly
> - got bad.
> 
> Repeated resetting mesa to older + patch got good, reset to head again, bad.
> Applied patch on head good, reversed patch on head - still good.
> 
> I am wondering at this time whether card is in some nice state so power
> cycle - still good. Boot into 3.18 still good, clean & recompile the same
> mesa without doing anything else - bad again. Still bad after power cycles.
> 
> I always make distclean + git clean -dfx when doing anything other than
> applying or reversing a patch.
> 
> So it seems that there is some randomness whether I am good or bad between
> mesa build/installs.
> 
> This does not however tally with my kernel bisect which seemed to go
> flawlessly.
> 
> I assume there is no card state that can survive power off.
> 
> So currently a bit confused - as reported by Christoph in the other bug, it
> is possible to have good with vanilla mesa/kernel, but apparently for me the
> "same" mesa can also be bad.
> 
> I did try more clean/rebuild cycles including single thread but am currently
> still bad.

I'm also seeing this behavior while bisecting a different bug on a 7950... Probably related...

Comment 22 Andy Furniss 2014-10-07 23:38:13 UTC

(In reply to Alexandre Demers from comment #21)

> > So currently a bit confused - as reported by Christoph in the other bug, it
> > is possible to have good with vanilla mesa/kernel, but apparently for me the
> > "same" mesa can also be bad.
> > 
> > I did try more clean/rebuild cycles including single thread but am currently
> > still bad.
> 
> I'm also seeing this behavior while bisecting a different bug on a 7950...
> Probably related...

It seems like current mesa doesn't build/install properly when patched without being totally clean.

The mesa patch in this bug on an already built tree does provoke -

make[3]: Entering directory '/mnt/sdb1/Src64/Mesa-git/mesa/src/gallium/drivers/radeon'
  CC       r600_buffer_common.lo
  CCLD     libradeon.la

but radeonsi_dri.so is not changed after make install.
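One way to confirm whether a rebuild or install really replaced the driver library is to compare a checksum taken before and after. This is only a sketch in Python; the helper names are made up for illustration, and the path to radeonsi_dri.so depends on your own build or install tree.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def was_updated(digest_before, path):
    """True if the file at `path` no longer matches the earlier digest."""
    return file_digest(path) != digest_before


# Usage sketch (the path is an assumption; adjust to your setup):
# before = file_digest("/usr/lib/dri/radeonsi_dri.so")
# ... apply the patch, run make and make install ...
# print(was_updated(before, "/usr/lib/dri/radeonsi_dri.so"))
```

A result of False after patching a source file that is linked into the driver would point at the stale-build problem described here.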

Comment 23 Andy Furniss 2014-10-07 23:44:46 UTC

(In reply to Andy Furniss from comment #13)
> (In reply to Michel Dänzer from comment #12)
> > Created attachment 107391 [details] [review]
> > Make Mesa behave as if the kernel was older

> > The attached patch makes Mesa behave the same with current kernels as it
> > does with older ones which don't have the bisected commit. Does that avoid
> > the problem?
> 
> No, the patch makes no difference.

It seems I wasn't actually testing it, so the answer now is yes, it does. In fact the goods I get with it on current kernels are better (fewer stutters) than the goods I saw during the kernel bisect.

Comment 24 Michel Dänzer 2014-10-08 03:04:28 UTC

(In reply to Andy Furniss from comment #22)
> It seems like current mesa doesn't build/install properly when patched
> without being totally clean.
> 
> The mesa patch in this bug on an already built tree does provoke -
> 
> make[3]: Entering directory
> '/mnt/sdb1/Src64/Mesa-git/mesa/src/gallium/drivers/radeon'
>   CC       r600_buffer_common.lo
>   CCLD     libradeon.la
> 
> but radeonsi_dri.so is not changed after make install.

How about lib/gallium/radeonsi_dri.so in the build tree? That gets updated correctly for me. I don't use make install though, and I'm using an out-of-tree build, so maybe it's related to one of those. Anyway, please file another report about that.

Comment 25 Michel Dänzer 2014-10-08 08:03:26 UTC

Created attachment 107542 [details] [review]
r600g,radeonsi: Use staging texture for transfers if any miplevel is tiled

Comment 26 Michel Dänzer 2014-10-08 08:04:19 UTC

Created attachment 107543 [details] [review]
winsys/radeon: Use separate caching buffer manager for each set of flags

Comment 27 Michel Dänzer 2014-10-08 08:06:44 UTC

Created attachment 107544 [details] [review]
drm/radeon: Try placing NO_CPU_ACCESS BOs outside of CPU accessible VRAM

Please try current Mesa with no patches except for the two I just attached, plus this kernel patch together with the previous kernel patch.

Comment 28 Andy Furniss 2014-10-08 09:15:24 UTC

(In reply to Michel Dänzer from comment #24)
> (In reply to Andy Furniss from comment #22)
> > It seems like current mesa doesn't build/install properly when patched
> > without being totally clean.
> > 
> > The mesa patch in this bug on an already built tree does provoke -
> > 
> > make[3]: Entering directory
> > '/mnt/sdb1/Src64/Mesa-git/mesa/src/gallium/drivers/radeon'
> >   CC       r600_buffer_common.lo
> >   CCLD     libradeon.la
> > 
> > but radeonsi_dri.so is not changed after make install.
> 
> How about lib/gallium/radeonsi_dri.so in the build tree? That gets updated
> correctly for me. I don't use make install though, and I'm using an
> out-of-tree build, so maybe it's related to one of those. Anyway, please
> file another report about that.

No, so it seems like it's make that fails to update that for me unless I have distcleaned.

Comment 29 Andy Furniss 2014-10-08 15:20:04 UTC

(In reply to Michel Dänzer from comment #27)
> Created attachment 107544 [details] [review]
> drm/radeon: Try placing NO_CPU_ACCESS BOs outside of CPU accessible VRAM
> 
> Please try current Mesa with no patches except for the two I just attached,
> plus this kernel patch together with the previous kernel patch.

With these it's better but still a bad. There are still pauses but they are a bit shorter, the start time without intervention is still around 2min. Start time on a good is around 1 min (assuming first run or mem caches flushed).

Hud wise - pic to follow the vram req/used match well, but the gtt is still different from a good.

On the upside - Unigine Valley is the best I've ever seen it - way better than what I've called as good in the past.

Comment 30 Andy Furniss 2014-10-08 15:22:17 UTC

Created attachment 107560 [details]
Hud with 2 kernel and 2 mesa patches better but still bad.

Comment 31 Andreas Hartmetz 2014-10-08 15:52:37 UTC

The second kernel patch "radeon-NO_CPU_ACCESS-placement.diff" does not apply to 3.17.
Michel, looks like the patch is against some version only you have?
Andy, I think you didn't apply at least the second kernel patch because it's simply not possible, most likely you ignored that all hunks failed.

Comment 32 Andreas Hartmetz 2014-10-08 15:57:14 UTC

OK, looks like the patch will apply to current Linux drm-next. Should have read comment 19. Sorry.

Comment 33 Andy Furniss 2014-10-09 12:54:56 UTC

(In reply to Andy Furniss from comment #29)
> (In reply to Michel Dänzer from comment #27)
> > Created attachment 107544 [details] [review]
> > drm/radeon: Try placing NO_CPU_ACCESS BOs outside of CPU accessible VRAM
> > 
> > Please try current Mesa with no patches except for the two I just attached,
> > plus this kernel patch together with the previous kernel patch.
> 
> With these it's better but still a bad. There are still pauses but they are
> a bit shorter, the start time without intervention is still around 2min.
> Start time on a good is around 1 min (assuming first run or mem caches
> flushed).
> 
> Hud wise - pic to follow the vram req/used match well, but the gtt is still
> different from a good.
> 
> On the upside - Unigine Valley is the best I've ever seen it - way better
> than what I've called as good in the past.

I see the patches are on the list now, so just in case it wasn't expected -

Running patched kernel with unpatched mesa causes signal 7 when running games.

Comment 34 Luzipher 2014-10-09 19:58:27 UTC

As today's commit 7b4276d7acf2e0f77044cb50caa6ad936fa78786, "r600g,radeonsi: Always use GTT again for PIPE_USAGE_STREAM buffers", refers to this bug:
The patch helps tremendously against stuttering (~0.5s pause irregularly about twice a minute) in Borderlands 2. Kernel is an unmodified agd5f 3.18-next, hardware r9 290x.

Thanks !

Comment 35 Chernovsky Oleg 2014-10-09 20:27:01 UTC

(In reply to Luzipher from comment #34)
> The patch helps tremendously against stuttering (~0.5s pause irregularly
> about twice a minute) in Borderlands 2

So are the stutters completely gone or just became smaller?

Comment 36 José Suárez 2014-10-09 21:18:18 UTC

(In reply to Luzipher from comment #34)
> As today's commit 7b4276d7acf2e0f77044cb50caa6ad936fa78786, "r600g,radeonsi:
> Always use GTT again for PIPE_USAGE_STREAM buffers", refers to this bug:
> The patch helps tremendously against stuttering (~0.5s pause irregularly
> about twice a minute) in Borderlands 2. Kernel is an unmodified agd5f
> 3.18-next, hardware r9 290x.
> 
> Thanks !

That mesa patch also reduced my stutter in BL2. Right now (I have just tested 5 minutes) I only get short lags when the game must load new areas/models (e.g. making a turn or entering new map zones). I'm on 3.17 rc7 (Radeon HD 7870).

The stutter was also very noticeable in EuroTruck Simulator 2, making the game almost unplayable. There are still a few stutters in this game, so I would say the stutter is not completely gone (even though my BL2 experience is good).

Comment 37 Michel Dänzer 2014-10-10 01:23:21 UTC

Comments about Borderlands 2 should go to bug 84570.

(In reply to Andy Furniss from comment #33)
> Running patched kernel with unpatched mesa causes signal 7 when running
> games.

I fixed that in the kernel patch I posted to the dri-devel list.


So I finally convinced myself that using VRAM for PIPE_USAGE_STREAM buffers is not a good idea, at least not at this time, and pushed the Mesa change which makes those use GTT again. But while trying to avoid that, I found and fixed a fair number of other issues, so hopefully the end result will be even better than before. :)

Comment 38 Michel Dänzer 2014-10-10 01:26:24 UTC

(In reply to José Suárez from comment #36)
> The stutter was also very noticeable in EuroTruck Simulator 2, making the
> game almost unplayable. There are still a few stutters in this game, so I
> would say the stutter is not completely gone (even though my BL2 experience
> is good).

Please file your own report about EuroTruck Simulator 2, preferably with GALLIUM_HUD screenshots like here.

Comment 39 Michel Dänzer 2014-10-15 06:54:22 UTC

Resolving, please reopen if there are remaining issues with the Elemental demo.

Comment 40 Andy Furniss 2014-10-15 09:52:16 UTC

(In reply to Michel Dänzer from comment #39)
> Resolving, please reopen if there are remaining issues with the Elemental
> demo.

It's still not right for me and never has been.

I haven't had much time to test recently, but from what testing I did, it seems that some of it may happen when it loads from disk/disk cache. Maybe because I only have 4G RAM it's worse for me - but then I don't know why it still messes up from cache.

Last night I updated kernel to agd5f 3.19-wip and current mesa, after applying the mesa patches I got a lockup/crash. I hadn't restarted X though, so glamor was using vanilla. It didn't crash instantly - the first run was OK, and I haven't had time yet to see if it's just something new with different mesa/kernel, or if the patches + 3.19 are the cause.

Are the mesa patches still valid?

I'll attach the log, initially it was just that elemental didn't start, I was still OK, could switch desktops and use top to see it was maxing a single core, after a while I tried to start sysprof, which segfaulted, after that I was unstable eg. free hung and I tried to quit X but hung then I SysRqd.

Comment 41 Andy Furniss 2014-10-15 09:55:10 UTC

Created attachment 107863 [details]
Kernel errors with mesa patched + agd5f drm-next-3.19

Comment 42 Michel Dänzer 2014-10-16 03:12:56 UTC

(In reply to Andy Furniss from comment #40)
> It's still not right for me and never has been.

Define 'right'. What's the target we're aiming for here?


> Last night I updated kernel to agd5f 3.19-wip and current mesa, after
> applying the mesa patches I got a lockup/crash.

Please file another report about that.


> Are the mesa patches still valid?

They're in Git master now.

Comment 43 Andy Furniss 2014-10-19 20:20:44 UTC

(In reply to Michel Dänzer from comment #42)
> (In reply to Andy Furniss from comment #40)
> > It's still not right for me and never has been.
> 
> Define 'right'. What's the target we're aiming for here?
> 
> 
> > Last night I updated kernel to agd5f 3.19-wip and current mesa, after
> > applying the mesa patches I got a lockup/crash.
> 
> Please file another report about that.

Haven't had much time, but I have failed to reproduce so far with updated mesa (and I am making sure to restart X now so glamor is not running different code)

Comment 44 Andy Furniss 2014-10-19 20:34:16 UTC

(In reply to Michel Dänzer from comment #42)
> (In reply to Andy Furniss from comment #40)
> > It's still not right for me and never has been.
> 
> Define 'right'. What's the target we're aiming for here?

It's still stuttery, but it could be my 4G mem/the way the game loads.

If I run from clean, most of the stutters and the long start time correspond with "vmstat 1" showing disk activity. With 4 gig (swap is off to avoid extra disk use) it seems I don't have enough ram to disk cache the whole demo. If I stop it after a while I can get the first bit cached (ie. I can run it without vmstat 1 showing anything), but it's still stuttery in the places where it would be loading from disk were it first run.
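To line stutters up against disk activity more systematically, the "vmstat 1" output can be saved and scanned for non-zero block-in/block-out columns. The Python sketch below looks the `bi` and `bo` columns up by header name; the sample numbers are invented purely for illustration and are not measurements from this machine.

```python
def disk_activity(vmstat_text):
    """Return a list of (bi, bo) tuples, one per sample line of `vmstat 1`."""
    columns = None
    samples = []
    for line in vmstat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if "bi" in fields and "bo" in fields:
            columns = fields  # the column-name header row
            continue
        # Skip the decorative "procs ---memory---" row and anything malformed.
        if columns is None or len(fields) != len(columns) or not fields[0].isdigit():
            continue
        row = dict(zip(columns, fields))
        samples.append((int(row["bi"]), int(row["bo"])))
    return samples


# Invented sample output, for illustration only.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 900000  50000 2000000    0    0     0     0  300  500 20  5 75  0  0
 2  1      0 880000  50000 2020000    0    0 20480    16  900 1500 25 10 40 25  0
"""

samples = disk_activity(SAMPLE)
# Indices (seconds since start, with a 1 s interval) where the disk was busy.
busy = [i for i, (bi, bo) in enumerate(samples) if bi or bo]
print(samples, busy)
```

Comparing the busy seconds against the moments the demo stalls gives a rough idea of how much of the stutter is disk loading rather than the driver.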

Comment 45 Andy Furniss 2014-10-19 20:40:42 UTC

(In reply to Andy Furniss from comment #43)
> (In reply to Michel Dänzer from comment #42)
> > (In reply to Andy Furniss from comment #40)
> > > It's still not right for me and never has been.
> > 
> > Define 'right'. What's the target we're aiming for here?
> > 
> > 
> > > Last night I updated kernel to agd5f 3.19-wip and current mesa, after
> > > applying the mesa patches I got a lockup/crash.
> > 
> > Please file another report about that.
> 
> Haven't had much time, but I have failed to reproduce so far with updated
> mesa (and I am making sure to restart X now so glamor is not running
> different code)

Ignore that, I just reproduced it, will file a new bug, though I am a few days old on mesa/kernel now.

Comment 46 Michel Dänzer 2014-10-20 07:16:34 UTC

(In reply to Andy Furniss from comment #44)
> If I stop it after a while I can get the first bit cached (ie. I
> can run it without vmstat 1 showing anything), but it's still stuttery in
> the places where it would be loading from disk were it first run.

Sounds like it's loading new content in those places. Doesn't that cause stutter with other drivers?

Comment 47 Andy Furniss 2014-11-03 22:00:20 UTC

(In reply to Michel Dänzer from comment #46)
> (In reply to Andy Furniss from comment #44)
> > If I stop it after a while I can get the first bit cached (ie. I
> > can run it without vmstat 1 showing anything), but it's still stuttery in
> > the places where it would be loading from disk were it first run.
> 
> Sounds like it's loading new content in those places. Doesn't that cause
> stutter with other drivers?

Hard for me to test fglrx on my setup, but soon I should have more ram to play with.

The recent changes to agd5f 3.19-wip seem to have provoked a new issue - I haven't had time to find a commit yet.

Will attach a screen showing that near the end of the demo there is a new really long pause (graph doesn't really show length, but it was around 10 sec) that corresponds to a big move from vram to gtt.

Other demos eg. valley don't show any issues and run well.

IIRC there is already a bug somewhere about the rendering issue (common to most unreal demos) that makes part of the logo black in the screen shot.

Comment 48 Andy Furniss 2014-11-03 22:03:24 UTC

Created attachment 108866 [details]
new issue long pause at end of demo

Comment 49 Andy Furniss 2014-11-04 00:30:29 UTC

(In reply to Andy Furniss from comment #48)
> Created attachment 108866 [details]
> new issue long pause at end of demo

Caused by drm/mm: Remove DRM_MM_SEARCH_BEST

Comment 50 Benjamin Bellec 2014-12-22 21:52:52 UTC

I think I'm hitting a similar issue on a RV770 (Radeon HD4850) hardware, while playing the game Left 4 Dead 2.

I was on Fedora 19 (kernel 3.14) and upgraded to Fedora 21 (kernel 3.17). Since then, the game has become unplayable due to GPU lockups. See the dmesg in the attachment.

Here is the summary of the tests I then performed on a fresh Fedora 20 install :
kernel 3.11 + mesa 10.5-devel + sb = OK
kernel 3.17 + mesa 10.5-devel + sb = BAD
kernel 3.17 + mesa 10.5-devel + nosb = OK
kernel 3.17 + mesa 10.3.3 + sb = BAD (worse, 3 times more lockup)

I'm not skilled enough to compile/install a custom kernel, so I will just try to download a 3.16 kernel from koji when the site is up again, in order to see if the issue appeared with 3.17.

Comment 51 Benjamin Bellec 2014-12-22 21:54:04 UTC

Created attachment 111191 [details]
dmesg while playing L4D2 on a RV770 with kernel 3.17.7

Comment 52 Benjamin Bellec 2014-12-22 23:13:43 UTC

Sorry, after testing several kernels, my issue is actually related to the 3.14 series. I will open another bug for that.

Comment 53 Michel Dänzer 2015-01-08 07:12:21 UTC

The latest version of the Elemental demo (and many other UE4 demos, in fact) seems to run much smoother in general for me. Andy, how's that for you?

Comment 54 Andy Furniss 2015-01-08 17:30:32 UTC

Created attachment 111966 [details]
elemental stalls near end with large vram -> gtt move

Generally the new version is a bit better, it's still not right, it has glitches but they are shorter.

I have 8 Gig ram now so can run from tmpfs to exclude the glitching being caused by disk activity. 

There is sometimes a new issue very near the end where there is a chance of a multi second stall corresponding to vram -> gtt move as in the screenshot.

This didn't happen at all the first couple of times I tested.

The third time only 200 MB was moved; the fourth is this shot.

Comment 55 Michel Dänzer 2015-02-06 02:39:03 UTC

Does http://cgit.freedesktop.org/mesa/mesa/commit/?id=a338dc01866ce50bf7555ee8dc08491c7f63b585 help for this by any chance?

Comment 56 Andy Furniss 2015-02-06 18:12:47 UTC

(In reply to Michel Dänzer from comment #55)
> Does
> http://cgit.freedesktop.org/mesa/mesa/commit/
> ?id=a338dc01866ce50bf7555ee8dc08491c7f63b585 help for this by any chance?

No, it's the same on current Mesa head.

Comment 57 Michel Dänzer 2015-11-05 07:45:45 UTC

I'm currently getting a "run time to unreal logo" of just over three minutes on my Kaveri. How about you, Andy? You said 3:45 is "good", so if it's below that now, this report can be resolved, right? :)

AFAICT the remaining pauses are mostly due to delayed shader compiles, which Marek has been working on reducing.

Comment 58 Andy Furniss 2015-11-05 09:55:31 UTC

(In reply to Michel Dänzer from comment #57)
> I'm currently getting a "run time to unreal logo" of just over three minutes
> on my Kaveri. How about you, Andy? You said 3:45 is "good", so if it's below
> that now, this report can be resolved, right? :)
> 
> AFAICT the remaining pauses are mostly due to delayed shader compiles, which
> Marek has been working on reducing.

Yes, this should be resolved -

I can't test the 270X as it died, and I now have a Tonga (so I can't compare times/glitching due to low clocks).

IIRC the "big" glitches were fixed with only smaller ones remaining.

Since I got the Tonga I tried with fglrx and there are some glitches with that, so maybe the demo itself is "special".
