Bug 75900

Summary: [r600g-evergreen] GPU lockup or app segfault in several games (bisected)
Product: Mesa
Component: Drivers/Gallium/r600
Status: RESOLVED FIXED
Severity: major
Priority: medium
Version: git
Hardware: x86-64 (AMD64)
OS: Linux (All)
Reporter: Benjamin Bellec <b.bellec>
Assignee: Default DRI bug account <dri-devel>
CC: b.bellec, gquigs+bugs
Attachments:
- Serious Sam 3 log when segfault
- Log including first crash
- dmesg log showing GPU lockups

Description Benjamin Bellec 2014-03-08 00:50:12 UTC
Several Steam games crash since this commit:
67aef6dafa29fed008ea6065c425a6a92a651be9
winsys/radeon: if there's VRAM-only usage, keep it
author: Marek Olšák

Games which crash:
- Left 4 Dead 2: GPU lockup after 15-20 seconds of gameplay. With lower graphics detail settings I can play longer before the lockup (about 60 seconds).
- Metro Last Light: GPU lockup
- Serious Sam 3: segfault

What I call a "GPU lockup":
- game freezes
- sound freezes and loops
- screen goes dark
- screen then goes into "Out of range" mode
- I need to hard reboot

Games which don't crash:
- Half-Life² Episode Two
- Team Fortress 2
- Counter Strike Source
- Unigine Heaven
- Unigine Valley

Hardware:
AMD CYPRESS (HD5850 with 1GB VRAM)
I can also test with RV770 if necessary.

Software:
Fedora 19 x86-64
kernel 3.13.5-101.fc19.x86_64
libdrm 2.4.50

Build:
./autogen.sh --with-gallium-drivers=r600 --with-dri-drivers= --enable-texture-float --disable-dri3 --disable-r600-llvm-compiler --disable-gallium-llvm --enable-32-bit CFLAGS="-O2 -m32 -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64" CXXFLAGS="-O2 -m32" --libdir=/usr/lib
Comment 1 Benjamin Bellec 2014-03-08 00:51:48 UTC
Created attachment 95322
Serious Sam 3 log when segfault
Comment 2 Benjamin Bellec 2014-03-08 01:00:54 UTC
With R600_DEBUG=nosb, Left 4 Dead 2 crashes even faster (5 seconds after the scenery is loaded).
Comment 3 Benjamin Bellec 2014-03-08 10:22:35 UTC
With R600_DEBUG=hyperz I can play L4D2 several minutes before a GPU lockup.
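
For anyone repeating the tests from the two comments above, here is a minimal sketch of setting R600_DEBUG for a single process only. The helper name run_with_r600_debug is hypothetical and not part of Mesa, and the command being run is a stand-in that only prints the variable. The values nosb and hyperz are the ones mentioned in this report.

```shell
#!/bin/sh
# Run a command with R600_DEBUG set only for that one process.
# run_with_r600_debug is a hypothetical helper name, not part of Mesa.
run_with_r600_debug() {
    flags="$1"
    shift
    env R600_DEBUG="$flags" "$@"
}

# Stand-in command that just prints the variable; replace it with the
# real game launcher when testing.
run_with_r600_debug nosb sh -c 'echo "R600_DEBUG=$R600_DEBUG"'
```

For Steam games, the same effect can usually be had by putting something like `R600_DEBUG=nosb %command%` in the game's launch options.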
Comment 4 Marek Olšák 2014-03-08 15:03:31 UTC
I reverted the commit. Does it help?
Comment 5 Benjamin Bellec 2014-03-08 16:02:45 UTC
Yes, current master works correctly now.
Comment 6 Wojciech Pyczak 2014-03-08 16:33:06 UTC
I'm seeing the same problems on my HD6850 and am currently upgrading to master to see if it helps. Unlike in Benjamin's case, however, my card hung once while I was watching a movie (VDPAU backend) and another time while I was viewing a web page, although an old game was running in the background via Wine.

The crashes seemed completely random, with no indication of what might have caused them. I'll attach a log file covering the first hang, where I had to do a hard reset because SysRq didn't work. On the second occasion SysRq did work, but there was nothing in the log file even though I think I waited long enough, which is strange.
Comment 7 Wojciech Pyczak 2014-03-08 16:34:05 UTC
Created attachment 95360
Log including first crash
Comment 8 Marek Olšák 2014-03-08 19:53:17 UTC
(In reply to comment #6)
Please let me know after you test Mesa master. Thanks.
Comment 9 Nick Tenney 2014-03-08 20:20:09 UTC
I believe I am also seeing issues with Diablo III via Wine after these changes. As soon as the GPU is under any load, random large polygons cover large portions of the screen. Not super pretty. Reverting to a Mesa build from the 5th fixed things, but the most recent update to master did *not*.
Comment 10 Nick Tenney 2014-03-08 21:12:25 UTC
I backtraced and found f112ba03bbd6072df9f8879bb231f7f0abb14d2e to be my first bad commit. Not sure if this should be a separate bug report or if that is causing issues for anyone else.
Comment 11 Chris Rankin 2014-03-08 21:49:02 UTC
Created attachment 95374
dmesg log showing GPU lockups

My RV790 had also started locking up while playing WoW, but is now magically fixed.

cf1c52575d6fea966d818eac4a32ec2decc48576    GOOD
a995f564c7b226438d10e5d5895692ed1fd550e3    BAD	
1e25aa4cdb3bb1f190ea3905eb1d169e0c5a1ef0    GOOD

I have attached the dmesg log from the lockups, in case they might be useful.
Comment 12 Nick Tenney 2014-03-09 03:15:49 UTC
FYI, Marek's latest patches on the ML fixed my issue listed above.
Comment 13 Benjamin Bellec 2014-03-09 18:07:00 UTC
(In reply to comment #12)
> FYI, Marek's latest patches on the ML fixed my issue listed above.

So this is probably not the same bug as mine.
Comment 14 Wojciech Pyczak 2014-03-10 10:32:37 UTC
(In reply to comment #8)
> Please let me know after you test Mesa master. Thanks.

With current master I'm no longer experiencing any hangs. However, I can't guarantee it was the same problem, because I have no idea how to reproduce it.

All I know is that the problem was introduced somewhere between the 4th and the 7th of March. I guess I could cherry-pick or "revert the revert" and see what happens, but if I'm going to do that I might as well collect some additional information. Just let me know what you're interested in.
Comment 15 Marek Olšák 2014-03-10 12:22:07 UTC
(In reply to comment #14)
I just wanted to know if you were still seeing any hangs, and you weren't, which is great.
Comment 16 Marek Olšák 2014-04-03 16:04:09 UTC
This has been fixed. Closing.
