Bug 99055 - Games hang / freeze completely
Status: RESOLVED NOTOURBUG
Product: Mesa
Classification: Unclassified
Component: Other
Version: 13.0
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: mesa-dev
QA Contact: mesa-dev
Reported: 2016-12-11 15:31 UTC by Etienne Bruines
Modified: 2016-12-21 05:03 UTC

Attachments
Backtrace upon crash (2.43 KB, text/plain)
2016-12-12 14:45 UTC, Etienne Bruines

Description Etienne Bruines 2016-12-11 15:31:56 UTC
I recently updated to 13.0.2-1, and since then every game hangs after a few minutes of gameplay (sometimes after less than a minute). I am not entirely sure this is related to Mesa, but I'm confident that you'll be able to point me in the right direction.

Games tested: Factorio, Democracy 3. 
Things that do work: Google Chrome (Netflix / YouTube). 

-
  GL_VERSION:  3.0 Mesa 13.0.2
  GL_VENDOR:   Intel Open Source Technology Center
  GL_RENDERER: Mesa DRI Intel(R) Sandybridge Mobile x86/MMX/SSE2

-
No relevant log entries around the moment of the hang in any of:
/var/log/syslog
/var/log/messages
/var/log/Xorg.0.log
dmesg

The games do not output anything either. The game just freezes, as if the drawing thread is hanging: after switching workspaces (screens), the window refuses to draw anything new and keeps showing the old workspace at that location.

At this stage I have no idea how to debug this. SIGINT won't stop the program, I have to use SIGKILL. 

I have tried using `apitrace trace`. This "fixed" the bug: I was not able to reproduce it when using apitrace. My current workaround is to start the game with `apitrace trace -o /dev/null factorio`; it's a bit slower because of the tracing, but it hasn't crashed so far. (It often looks like it's going to crash, but it seems to recover.)

I'd be happy to provide any additional information.
Comment 1 Etienne Bruines 2016-12-11 20:48:17 UTC
Update: apitrace did not magically fix it, but it did delay the symptom by a few hours.
Comment 2 Timothy Arceri 2016-12-11 23:30:07 UTC
(In reply to Etienne Bruines from comment #0)
> 
> I'd be happy to provide any additional information.

The most helpful thing you could do is build mesa from source and bisect the commit that introduces the hang.

I'd recommend building mesa from the master branch to start with, to see if it's still in master or if this is something introduced in the stable branch only.
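
A bisect along those lines could be sketched as follows. This is a minimal outline rather than a tested Mesa workflow: `GOOD_REV` and `BAD_REV` are placeholders for the last revision known to work and the first one known to hang, and `start_bisect` is a hypothetical helper name. Because the hang can take minutes or longer to show up, every step needs a rebuild and a manual play test.

```shell
# Start a bisect between a known-good and a known-bad revision.
# Usage (inside a Mesa git checkout): start_bisect GOOD_REV BAD_REV
start_bisect() {
    # git bisect expects the bad revision first, then the good one.
    git bisect start "$2" "$1"
}

# After each rebuild and play test, record the verdict by hand:
#   git bisect good    # no hang within the test window
#   git bisect bad     # the game froze
# When git names the first bad commit, return to the original checkout:
#   git bisect reset
```

With an intermittent hang, marking a revision "good" too early sends the bisect in the wrong direction, so the test window should be comfortably longer than the longest time the hang has taken to appear.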
Comment 3 Eero Tamminen 2016-12-12 14:27:29 UTC
Could you attach to the game process with GDB:
  sudo gdb PATH PID

(PID is the game process ID. PATH is path to the game binary:
  ls -l /proc/PID/exe
)

And attach the backtrace for all threads here:
  (gdb) thread apply all bt
?
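
The same steps can be run non-interactively, which makes it easier to save the output as an attachment. The sketch below assumes GDB is installed and that `/proc` is available; `dump_backtraces` and the default output file name are hypothetical. Depending on the system's ptrace restrictions, it may need to be run as root, as with the `sudo gdb` invocation above.

```shell
# Dump backtraces of all threads of a running process into a file.
# Usage: dump_backtraces PID [OUTFILE]
dump_backtraces() {
    pid=$1
    out=${2:-backtrace-$pid.txt}
    # Resolve the binary path the same way as "ls -l /proc/PID/exe".
    exe=$(readlink "/proc/$pid/exe") || return 1
    # -batch exits after running the commands given with -ex.
    gdb -batch -ex 'thread apply all bt' "$exe" "$pid" > "$out" 2>&1
}
```

Running it a few times while the game is frozen shows whether the threads stay blocked in the same place, which is what distinguishes a deadlock from a slow code path.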
Comment 4 Etienne Bruines 2016-12-12 14:43:12 UTC
@Eero  Tamminen:

I have attached gdb. Upon crashing, one of these two errors was shown in gdb:

# This is the first error
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007fb687ee756d in poll () at ../sysdeps/unix/syscall-template.S:84
84	../sysdeps/unix/syscall-template.S: No such file or directory.

# This is the second error
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00000000004a131e in renderInternal ()
    at /tmp/factorio-ZKDrW7/src/Graphics/SpriteDrawOrder.cpp:202
202	/tmp/factorio-ZKDrW7/src/Graphics/SpriteDrawOrder.cpp: No such file or directory.

They are never shown at the same time, but upon repeating the process, one or the other is shown.
Comment 5 Etienne Bruines 2016-12-12 14:45:48 UTC
Created attachment 128433
Backtrace upon crash
Comment 6 Etienne Bruines 2016-12-12 15:33:16 UTC
Apparently this issue got fixed some time between the release of 13.0.2 and current master. I do not know exactly where, but at this stage I'm not sure if anyone cares.

@Timothy Arceri:
When I build it from `master`, the issue is gone.
Comment 7 Eero Tamminen 2016-12-12 15:39:17 UTC
(In reply to Etienne Bruines from comment #5)
> Created attachment 128433
> Backtrace upon crash

If that's the whole backtrace command output, the deadlock doesn't seem to be in Mesa, at least in the Factorio case.  Could you attach the same info for some other game as well?


(In reply to Etienne Bruines from comment #6)
> Apparently some time between the release of 13.0.2 til the current master,
> this issue got fixed. I do not know exactly where, but at this stage I'm not
> sure if anyone cares. 
> 
> @Timothy Arceri:
> When I build it from `master`, the issue is gone.

I just noticed that Ubuntu 16.04 updated Glibc with comment "Disable lock-elision on all targets to avoid regressions":
  https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1642390

Have you updated any components other than Mesa in the meantime, e.g. glibc/pthreads?

(I.e. is this bug rather NOTOURBUG than FIXED?)
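
On Debian-based systems the package manager's own log can answer that question. The sketch below assumes the default log location `/var/log/dpkg.log` and its usual line layout (date, time, action, package, old version, new version); `upgrades_on` is a hypothetical helper name.

```shell
# Print the packages upgraded on a given day, with old and new versions.
# Usage: upgrades_on YYYY-MM-DD [LOGFILE]
upgrades_on() {
    awk -v day="$1" '$1 == day && $3 == "upgrade" {print $4, $5, "->", $6}' \
        "${2:-/var/log/dpkg.log}"
}
```

Older entries may have been rotated into `/var/log/dpkg.log.1` or compressed files next to it, so an empty result for the default file is not conclusive.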
Comment 8 Etienne Bruines 2016-12-12 15:57:35 UTC
The issue was introduced after updating all packages, most likely including the glibc ones. However, I was unable to find any glibc package on my computer, so I cannot confirm whether those were the cause. It would be strange, though, if building Mesa from master fixed this.

@Eero Tamminen:
I am not entirely sure which state for this issue is appropriate. Something like "was fixed even before posting" would have been great. However, it was brought to my attention that this issue wasn't fixed, just delayed a bit :-( . This time it lasted 45 minutes before freezing. 

The issue does feel somewhat like a deadlock. For example: when I exit gdb while the game hangs, the game continues. Sometimes this behavior is also observed when using apitrace (the game hangs and then 'fixes' itself within a second or so).

The backtrace for other games (Democracy 3) is the same as for factorio.
Comment 9 Eero Tamminen 2016-12-12 16:27:30 UTC
(In reply to Etienne Bruines from comment #8)
> However, it was brought to my attention that this issue wasn't fixed, just delayed a bit :-(
> This time it lasted 45 minutes before freezing. 

Do you mean that the freeze happens also with latest Mesa, it just takes a bit more time?


> The issue that's experienced does somewhat feel like a deadlock. For
> example: when I exit gdb when the game hangs, the game continues. Sometimes
> this behavior is observed when using apitrace (the game hanging and then
> 'fixing' itself within a second or so). 
> 
> The backtrace for other games (Democracy 3) is the same as for factorio.

So the locks are all called from game code, Allegro library or libstdc++, there's no Mesa in the deadlock backtraces?

What other games freeze?
Comment 10 Etienne Bruines 2016-12-12 16:32:11 UTC
> Do you mean that the freeze happens also with latest Mesa, it just takes a bit more time?

Yes, the freeze still happens, it just takes a bit more time. The game runs as it should for a while, and then all of a sudden it freezes.

> So the locks are all called from game code, Allegro library or libstdc++, there's no Mesa in the deadlock backtraces?

That appears to be correct (to the best of my limited knowledge).

> What other games freeze?

So far 100% of the games freeze (those two). I don't really have any other games that run on Linux.
Comment 11 Michel Dänzer 2016-12-13 01:23:50 UTC
(In reply to Etienne Bruines from comment #8)
> The issue was introduced after updating all package, among which most likely
> glibc ones. However, I was unable to find any glibc package on my computer.

The glibc packages are called libc6*.
Comment 12 Eero Tamminen 2016-12-13 13:25:54 UTC
Based on the above backtraces, marking as NOTOURBUG.

Note: you can check Glibc package changelog on Debian based systems (like Ubuntu) with:
  zless /usr/share/doc/libc6/changelog.Debian.gz
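
To search that changelog for a specific entry instead of paging through it, something like the following could be used. It is a small sketch: `changelog_mentions` is a hypothetical helper, and the default path is the one given above. The installed glibc packages themselves can be listed with `dpkg -l 'libc6*'`.

```shell
# Succeed if the compressed changelog contains PATTERN (case-insensitive).
# Usage: changelog_mentions PATTERN [CHANGELOG.gz]
changelog_mentions() {
    gzip -dc "${2:-/usr/share/doc/libc6/changelog.Debian.gz}" \
        | grep -i -q -e "$1"
}

# Example:
#   changelog_mentions 'lock-elision' && echo "lock elision change present"
```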

According to Wikipedia, the games are from different indie developers (Factorio from Wube Software, Democracy 3 from Positech Games), so I'm not sure there's a common component causing it; the games themselves could be buggy.

According to your comment 10 both games use Allegro, so there might be some issue with that.  Steam's ancient Ubuntu 12.04 libstdc++ is known to be incompatible with newer distros (after GCC ABI change), so if that is used, that could also explain deadlocks.


Next you should check whether your game binaries are using the Allegro & libstdc++ libraries from the system, from Steam's ancient & incompatible Ubuntu 12.04 dump, or ones included with the game itself.  If they're not using the system libstdc++, remove / rename the non-system one(s) and try again.

You can see which libraries a given process has loaded with:
  awk '{print $6}' /proc/PID/maps|sort -u
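
Building on that command, the sketch below narrows the list down to shared objects and then to those outside the usual system directories. The helper names are hypothetical, and the list of "system" prefixes (`/lib`, `/usr/lib` and their 32/64 variants) is an assumption that may need adjusting per distribution.

```shell
# List the unique shared objects mapped into a process.
# Usage: loaded_libs /proc/PID/maps
loaded_libs() {
    # Field 6 of a maps line is the backing file, if there is one.
    awk '$6 ~ /\.so/ {print $6}' "$1" | sort -u
}

# Show only libraries that do not live under the assumed system prefixes,
# e.g. copies bundled with a game or with the Steam runtime.
non_system_libs() {
    loaded_libs "$1" | grep -Ev '^/(usr/)?lib(32|64)?/' || true
}
```

An empty result from `non_system_libs` suggests the process is using only distribution libraries, in which case renaming or preloading libstdc++ would not change anything.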
Comment 13 Michel Dänzer 2016-12-14 07:54:13 UTC
Note that it's not really necessary to remove anything from the Steam directory; one can force using the system libraries via the LD_PRELOAD environment variable, e.g.

LD_PRELOAD='/usr/$LIB/libstdc++.so.6':$LD_PRELOAD
Comment 14 Eero Tamminen 2016-12-14 08:27:28 UTC
(In reply to Michel Dänzer from comment #13)
> Note that it's not really necessary to remove anything from the Steam
> directory; one can force using the system libraries via the LD_PRELOAD
> environment variable, e.g.
> 
> LD_PRELOAD='/usr/$LIB/libstdc++.so.6':$LD_PRELOAD

It's harder to guarantee that the environment variable always gets passed to the game; better just to rename the offending binary to be sure.
Comment 15 Kenneth Graunke 2016-12-15 05:11:04 UTC
(In reply to Eero Tamminen from comment #14)
> It's harder to guarantee that the environment variable gets always passed to
> the game, better just to rename the offending binary to be sure.

It's really easy, actually, you just right click on the game in Steam, hit properties, click the [Set Launch Options...] and enter

LD_PRELOAD='/usr/$LIB/libstdc++.so.6':$LD_PRELOAD %command%

(or whatever you want before/after %command%)

Renaming the binaries or editing scripts installed with the game is liable to break when new updates for the game come out, because Steam will overwrite those changes.
Comment 16 Michel Dänzer 2016-12-15 06:54:40 UTC
Setting LD_PRELOAD when launching Steam has worked for every game I've tried so far. YMMV.
Comment 17 Eero Tamminen 2016-12-15 11:04:53 UTC
(In reply to Kenneth Graunke from comment #15)
> Renaming the binaries or editing scripts installed with the game is liable
> to break when new updates for the game comes out, because Steam will
> overwrite those changes.

Only very few (if any) games include their own libstdc++, so the main issue is the one included with Steam's Ubuntu 12.04 snapshot.  Does that (ever) get updated?


> (In reply to Eero Tamminen from comment #14)
> > It's harder to guarantee that the environment variable gets always passed to
> > the game, better just to rename the offending binary to be sure.
> 
> It's really easy, actually, you just right click on the game in Steam, hit
> properties, click the [Set Launch Options...] and enter
> 
> LD_PRELOAD='/usr/$LIB/libstdc++.so.6':$LD_PRELOAD %command%
> 
> (or whatever you want before/after %command%)

With people having more games, editing launch options for every one of them would be really tedious.


(In reply to Michel Dänzer from comment #16)
> Setting LD_PRELOAD when launching Steam has worked for every game I've tried
> so far. YMMV.

Good to know.  In general, it would get lost if there's any suid binary or something sets LD_PRELOAD, but those would be really ugly things to do in release versions.
Comment 18 Etienne Bruines 2016-12-15 11:50:00 UTC
It turns out to be using this library (and nothing that shipped with it) :
/usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.22

The libraries in use are all part of the distro.
Comment 19 Eero Tamminen 2016-12-15 12:12:31 UTC
There are also a few other libraries whose outdated versions can cause problems, e.g. some X ones if you're using DRI3.  Could you attach the list of libraries used by your game?

You can find out the DRI version with:
  LIBGL_DEBUG=verbose glxgears

It will print out whether DRI2 or DRI3 is used.  If it's DRI3, try running steam with:
LIBGL_DRI3_DISABLE=1 steam
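
Since the verbose output is fairly noisy, a small filter can pull out just the DRI version. This sketch assumes the loader prints a line containing "Using DRI2" or "Using DRI3" on stderr; the exact wording may differ between Mesa versions, and `dri_in_use` is a hypothetical helper name.

```shell
# Read libGL debug output on stdin and print the DRI version it reports.
# Assumption: the relevant line contains "Using DRI2" or "Using DRI3".
dri_in_use() {
    grep -o 'Using DRI[23]' | sort -u | sed 's/^Using //'
}

# Example (the result is printed once the glxgears window is closed):
#   LIBGL_DEBUG=verbose glxgears 2>&1 | dri_in_use
```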
Comment 20 Kenneth Graunke 2016-12-21 05:03:22 UTC
(In reply to Eero Tamminen from comment #17)
> (In reply to Kenneth Graunke from comment #15)
> > Renaming the binaries or editing scripts installed with the game is liable
> > to break when new updates for the game comes out, because Steam will
> > overwrite those changes.
> 
> Only very few (if any) games include their own libstdc++, so the main issue
> is one included with Steam's Ubuntu 12.04 snapshot.  Does that get (ever)
> updated?

Yes.  Steam client updates can replace those.  Also, if Steam crashes, it thinks that it needs to verify itself and detects missing/corrupt (replaced) files and puts them back.

