Bug 103246 - PoE: GPU hang with mesa >= 17.2.0 + gallium-nine
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Gallium/radeonsi
Version: 17.2
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Default DRI bug account
QA Contact: Default DRI bug account
Keywords: regression
Reported: 2017-10-12 18:20 UTC by kmk3.bugs
Modified: 2018-05-13 18:12 UTC


Attachments
Related packages info (685 bytes, text/plain)
2017-10-12 18:20 UTC, kmk3.bugs

Description kmk3.bugs 2017-10-12 18:20:20 UTC
Created attachment 134818
Related packages info

General system info:
System information:
    Wine build: wine-2.18 (Staging)
    Platform: i386
    Version: Windows 7
    Host system: Linux
    Host version: 4.13.5-1-MANJARO

GPU: R9 280X (GCN 1.0)
GPU driver: xf86-video-ati 1:7.10.0-1 (radeonsi)
DE: plasma-desktop 5.10.5-1
Game info: https://appdb.winehq.org/objectManager.php?sClass=version&iId=30942
Game version: 3.0.1e (Steam)

# Overview
When using mesa 17.2+ with wine-staging-nine, the whole system crashes when
entering certain areas in PoE.

After trying to enter my Hideout in the game, the loading screen appears for a
few seconds (as usual), then a sound loop occurs for about 3 seconds, then
silence and the whole system is completely unresponsive.
The keyboard does not work and the monitor has no video output (blue screen).
Then, I just hard-reset the system.

Also, a few days ago, the game somehow crashed without bringing the whole
system down, it just showed a "game crashed" dialog box.

# Game-specific info
I tried to enter the affected areas from Highgate (Act 9), which seems to have
no major issues after wandering around a bit.
So far, it seems to crash on Sarn and the personal Hideout.
The only similarity that I can think of is the presence of Vagan and Vorici in
both areas.
But, AFAIK, it seems unlikely that character textures would cause a GPU hang.

# Rambling
It is the exact same symptom (including the sound loop) that occurred on my
6770m with Linux 3.13+, when I tried to run "startx" without setting
"radeon.dpm=0".

# Debug
The whole screen freezes, including the terminal that wine is launched from,
so I'm unable to see if wine printed anything during/after the crash.
I tried setting MESA_DEBUG=1 and MESA_LOG_FILE, but nothing is ever written to
the log file (it is not even created).
I'm not sure whether that is because everything really hangs, because no mesa
errors actually occur, or because I need to compile it with debug flags.
In the case of the latter, should I just follow this guide?
https://wiki.ixit.cz/d3d9_debugging

Also, I'm not really sure whether the problem is in mesa or gallium-nine.
If you know how to debug this, please leave a comment.
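
Since the terminal freezes together with the rest of the system, one option is to send everything wine prints to a file and force each line to disk as it arrives. The following is only a sketch of that idea, not a tested procedure for this bug: `MESA_DEBUG` and `MESA_LOG_FILE` are the variables mentioned above, while `debug_env`, `run_logged` and the file names in the usage comment are hypothetical.

```python
import os
import subprocess


def debug_env(mesa_log, base=None):
    """Return a copy of the environment with the Mesa debug variables set."""
    env = dict(os.environ if base is None else base)
    env["MESA_DEBUG"] = "1"
    env["MESA_LOG_FILE"] = mesa_log
    return env


def run_logged(cmd, output_log, mesa_log):
    """Run cmd and append its stdout/stderr to output_log, line by line.

    Every line is flushed and fsync'ed so that it has a chance of
    surviving a hard reset. Returns the exit code of the command.
    """
    with open(output_log, "ab") as log:
        proc = subprocess.Popen(
            cmd,
            env=debug_env(mesa_log),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
        )
        for line in proc.stdout:
            log.write(line)
            log.flush()
            os.fsync(log.fileno())
        return proc.wait()


# Hypothetical usage (executable and log names are placeholders):
# run_logged(["wine", "game.exe"], "wine-output.log", "mesa.log")
```

Whether anything useful reaches the file still depends on how early the hang happens, so an empty log does not rule out a driver problem.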

# Packages
After testing with different packages, the problem occurs only with
wine-staging-nine.
Tested on wine-staging-nine 2.16+ (2.16-2.18) and mesa 17.2.0+ (17.2.0-17.2.2).

Mesa < 17.2.0 has no problems with wine, wine-staging or wine-staging-nine.
Wine + gallium-nine works well enough with mesa < 17.2.0.

Misc: At least on manjaro, mesa 17.1.8 depends on llvm 4.0 and mesa 17.2+
depends on llvm 5.0.
Not sure if the llvm version could be related to the issue.
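
The version ranges above can be written down as a small check, which may help when comparing against other people's package lists. This is just an illustrative sketch: the function names are made up, and the boundaries are only the ones stated in this description (working below 17.2.0, hanging on 17.2.0 through 17.2.2).

```python
import re

FIRST_BAD = (17, 2, 0)        # first mesa release reported to hang
LAST_TESTED_BAD = (17, 2, 2)  # newest mesa release tested in this description


def parse_version(text):
    """Turn a package version such as '17.2.2-1' into the tuple (17, 2, 2)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognised version: {text!r}")
    return tuple(int(part) for part in match.groups())


def classify_mesa(text):
    """Classify a mesa version against the ranges given in this description."""
    version = parse_version(text)
    if version < FIRST_BAD:
        return "reported working"
    if version <= LAST_TESTED_BAD:
        return "reported hanging"
    return "not tested in this description"
```

For example, `classify_mesa("17.1.8-1")` falls in the working range and `classify_mesa("17.2.2-1")` in the hanging one.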
Comment 1 i.kalvachev 2018-04-11 18:11:26 UTC
The symptoms are typical of a shader infinite loop followed by a botched driver restart. Definitely not Wine's fault.

Does the bug still happen with current mesa3d+llvm?

While the game is free, it would still be a good idea to create an apitrace recording of a place where the game crashes. Ask in #d3d9 at irc.freenode about the ftp access, or use google drive or a similar file sharing service. (We might use the trace for regression testing in the future.)
It might also be a good idea to file a bug report in the github Ixit/Mesa-3D issue tracker.

I usually place the apitrace wrapper d3d9.dll inside the game directory (the main one, or wherever game.exe is). Then, using `winecfg`, add a library override for "d3d9" set to "native, built-in".
If everything goes well, the wrapper will create a new trace inside the game directory every time the game is started. So don't forget to remove it when you are done.

Naturally, use a working mesa3d version to record the trace. Then install the new version of mesa and check whether replaying the trace crashes.
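
The wrapper steps above can be sketched roughly as follows. Treat it as an illustration under assumptions rather than the exact procedure: the helper names and paths are hypothetical, and it sets Wine's `WINEDLLOVERRIDES` variable for a single launch instead of using `winecfg`, which is a substitution made only for this sketch.

```python
import os
import shutil
from pathlib import Path


def install_wrapper(wrapper_dll, game_dir):
    """Copy the apitrace d3d9.dll wrapper next to the game executable."""
    target = Path(game_dir) / "d3d9.dll"
    shutil.copyfile(wrapper_dll, target)
    return target


def trace_env(base=None):
    """Environment asking Wine to prefer a native d3d9.dll, then built-in.

    Note: this replaces any WINEDLLOVERRIDES value already present.
    """
    env = dict(os.environ if base is None else base)
    env["WINEDLLOVERRIDES"] = "d3d9=n,b"
    return env


def remove_wrapper(game_dir):
    """Delete the wrapper again so that later runs are not traced."""
    target = Path(game_dir) / "d3d9.dll"
    if target.exists():
        target.unlink()
```

A run would then call `install_wrapper(...)`, start the game with `env=trace_env()`, and finish with `remove_wrapper(...)`.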

If you can compile mesa3d on your own, you might be able to help narrow down the problem further. I cannot do that, since my card is using the R600 driver, so it is very unlikely to have the same shader miscompilation.
Comment 2 i.kalvachev 2018-04-11 18:21:26 UTC
I just saw that there is already an issue open for the same/similar problem:

https://github.com/iXit/Mesa-3D/issues/296
Comment 3 kmk3.bugs 2018-05-13 17:50:00 UTC
Hello iive, thanks for the detailed response.

> Does the bug still happen with current mesa3d+llvm?
I tested briefly a week after you commented and it no longer occurred with mesa
17.3.8 :)

# Packages:
linux416 4.16.0-1

lib32-llvm-libs 6.0.0-1
llvm-libs 6.0.0-4

lib32-mesa 17.3.8-1
lib32-mesa-vdpau 17.3.8-1
libva-mesa-driver 17.3.8-1
mesa 17.3.8-1
mesa-vdpau 17.3.8-1

wine-gaming-nine 3.5-2

# ----------

I just got around to recompiling the latest nine and testing with the latest
mesa, to make sure it still works, and it does.

# Packages:
linux416 4.16.4-1

lib32-llvm-libs 6.0.0-1
llvm-libs 6.0.0-4

lib32-mesa 18.0.1-1
lib32-mesa-vdpau 18.0.1-1
libva-mesa-driver 18.0.1-1
mesa 18.0.1-1
mesa-vdpau 18.0.1-1

wine-gaming-nine 3.7-1

# ----------

> It might also be a good idea to file a bug report in the github Ixit/Mesa-3D issue tracker.
This is something I'm still not sure about: Where is it appropriate to file
bugs?
The next time I find a bug that is likely to be in mesa, should it be
reported to each project (mesa, wine and nine) for tracking, or would that
just add noise?

> While the game is free, it would still be a good idea to create an apitrace recording of a place where the game crashes. Ask in #d3d9 at irc.freenode about the ftp access, or use google drive or a similar file sharing service. (We might use the trace for regression testing in the future.)
> 
> I usually place the apitrace wrapper d3d9.dll inside the game directory (the main one, or wherever game.exe is). Then, using `winecfg`, add a library override for "d3d9" set to "native, built-in".
> If everything goes well, the wrapper will create a new trace inside the game directory every time the game is started. So don't forget to remove it when you are done.
> 
> Naturally, use a working mesa3d version to record the trace. Then install the new version of mesa and check whether replaying the trace crashes.
Thanks for the details, I was struggling to find a proper way to debug it.
Now digging through the mesa website, I found the mention of apitrace on [1],
but not on [2], while the debugging build instructions are only on [2].
It seems to me that it would make sense to merge the pages together.

Also, that's an interesting approach that seems to differ a little from [3].
Would you mind documenting it if/when you have the time?
As someone new to debugging graphics drivers, it seems like information is a
bit scattered around different pages and projects.
It would be quite helpful to have instructions for a general debugging
workflow centralized in one place.
But I digress.

> If you can compile mesa3d on your own, you might be able to help narrow down the problem further. I cannot do that, since my card is using the R600 driver, so it is very unlikely to have the same shader miscompilation.
For future reference, I managed to recompile mesa and wine-staging-nine
(or wine-gaming-nine) months ago (around November by the log date) with all
the debug flags in [4].
But even then, the only lines ever logged to MESA_LOG_FILE were these
(repeated ~1k times):
Mesa: User error: GL_INVALID_OPERATION in glResumeTransformFeedback(wrong program bound)
Mesa: User error: GL_INVALID_OPERATION in glPauseTransformFeedback(feedback not active or already paused)

Maybe it was a really catastrophic error.
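
For a log like this, where the same few messages repeat around a thousand times, it can help to collapse identical lines into counts before reading it. A minimal sketch, with a made-up helper name and assuming the log is plain text with one message per line:

```python
from collections import Counter


def summarize_log(lines):
    """Count identical non-empty log lines, keeping first-seen order."""
    counts = Counter(line.strip() for line in lines if line.strip())
    return list(counts.items())


# Hypothetical usage with the file that MESA_LOG_FILE pointed to:
# with open("mesa.log") as log:
#     for message, count in summarize_log(log):
#         print(f"{count:6d}x {message}")
```

Any message that appears only once or twice then stands out from the repeated ones.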

[1] https://www.mesa3d.org/bugs.html
[2] https://www.mesa3d.org/debugging.html
[3] https://github.com/apitrace/apitrace/blob/master/docs/USAGE.markdown
[4] https://wiki.ixit.cz/d3d9_debugging
Comment 4 kmk3.bugs 2018-05-13 18:12:31 UTC
(In reply to iive from comment #2)
> I just saw that there is already an issue open for the same/similar problem:
> 
> https://github.com/iXit/Mesa-3D/issues/296
Good catch, just commented in there and linked to this bug.

