Bug 79338 - Something's broken after llvm's svn commit 209067
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/Gallium/radeonsi
Version: git
Hardware: Other All
Importance: medium normal
Assignee: Default DRI bug account
 
Reported: 2014-05-27 23:21 UTC by José Suárez
Modified: 2014-05-29 22:22 UTC

Attachments
dmesg log (77.66 KB, text/plain)
2014-05-27 23:36 UTC, José Suárez
dmesg while gpu lock up (185.39 KB, text/plain)
2014-05-29 21:22 UTC, José Suárez

Description José Suárez 2014-05-27 23:21:45 UTC
Something seems to have been broken in llvm 3.5 after svn revision 209067.

What I am experiencing is that the system is stable for desktop usage, but after a few minutes from launching a game (I have tried with CKII, Witcher 2, Wasteland 2, Cities in Motion, Deadfall Adventures and Serious Sam 3) or just after a few seconds after exiting the game, the desktop freezes, the screen goes black (I can't switch to a VT to kill the game or the desktop session) and the system becomes unresponsive. I'm afraid I cannot obtain a dmesg log because of that unresponsiveness.

If a dmesg log is needed to know the system's specs, the following attachment from another bug report may be of use: https://bugs.freedesktop.org/attachment.cgi?id=97787

I am using a llvm ppa to install llvm 3.5 git. I've tried to compile llvm with no luck, so I cannot bisect. However, I still have the deb packages for svn revisions 209564, 209581 and 209600 in /var/cache/apt/archives, and all three of those revisions hang the system in the way described above. So, judging by the R600 llvm backend activity (https://github.com/llvm-mirror/llvm/commits/master/lib/Target/R600) and my experience with those cached .deb packages, it seems the culprit must be a commit made after 16 May.
Comment 1 José Suárez 2014-05-27 23:36:51 UTC
Created attachment 99988 [details]
dmesg log
Comment 2 José Suárez 2014-05-27 23:37:36 UTC
I have uploaded a new dmesg log because I just realised the one in the other bug report was not recent.
Comment 3 Michel Dänzer 2014-05-28 02:58:08 UTC
(In reply to comment #0)
> [...] after a few minutes from launching a game (I have tried with CKII,
> Witcher 2, Wasteland 2, Cities in Motion, Deadfall Adventures and Serious Sam
> 3) or just after a few seconds after exiting the game [...]

If you capture an apitrace of one of those games, does the problem also occur after replaying the apitrace?


> I am using a llvm ppa to install llvm 3.5 git.

Do the ppa packages use any patches versus upstream?

> I've tried to compile llvm with no luck, so I cannot bisect.

What's the problem compiling LLVM? It would probably be best if you could bisect yourself.
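
For reference, the idea behind bisecting can be sketched in a few lines of Python. This is only an illustration of the search, not a real tool: the function name `first_bad`, the intermediate revision numbers and the `is_bad` check are all hypothetical. In practice the check means "build that LLVM revision, run a game and see whether the GPU hangs". Only the two endpoints (209067 believed good, 209564 seen to hang) come from this report.

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[int], is_bad: Callable[[int], bool]) -> int:
    """Return the earliest bad revision.

    Assumes revisions are sorted, revisions[0] is good, revisions[-1] is bad,
    and every revision after the first bad one is bad as well.
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid  # the culprit is at mid or earlier
        else:
            lo = mid  # the culprit is after mid
    return revisions[hi]


# Endpoints are from the report; the revisions in between are made up.
candidates = [209067, 209150, 209300, 209420, 209564]

# Stand-in for the manual "build, run a game, watch for a hang" test.
culprit = first_bad(candidates, lambda rev: rev >= 209300)
print(culprit)
```

Each step halves the remaining range, so even a few hundred revisions need fewer than ten build-and-test rounds. The search is only meaningful if the "good" endpoint really is good.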
Comment 4 José Suárez 2014-05-28 10:09:26 UTC
Thank you for your quick reply, Michel.

The ppa I use is llvm.org's ppa (llvm.org/apt/) for Ubuntu 14.04 so I guess it will just be plain upstream with no additional patches.

With regard to compiling, I have two chroot environments, one is i386 and the other one is amd64, which I use to recompile mesa (from oibaf's ubuntu ppa) with llvm 3.5 via dpkg-buildpackage (previously editing debian/rules so that the config looks for llvm 3.5 rather than llvm 3.4).

I have tried to compile the original tar.gz packages (dpkg-buildpackage) from the llvm.org ppa within those two chroot environments, but I just can't get them to work. If I remember correctly, config fails complaining about llvm tools, even though I downloaded all the tar.gz packages in the ppa, among which there was a package named llvm extra tools or something like that. Using apt-get source fetches the llvm 3.4 source tarball from Ubuntu's repositories, not the one from llvm.org, so that does not help either. I think that manually compiling llvm is a bit too much for me (I am not that advanced), so I'll try to get dpkg-buildpackage working. If that doesn't work I guess I'll take a look at manually compiling.

I have never used apitrace either but I will try to find instructions and give it a go.

I currently don't have too much free time due to real life (TM) but I'll try to address the llvm compilation (and bisect) and apitrace test. I'll also check newer llvm builds from the ppa in order to check if the problem has been solved.
Comment 5 José Suárez 2014-05-29 08:19:10 UTC
OK, I finally managed to set up and compile with the dpkg-buildpackage thing. So I guess I should be able to reproduce the svn versions to bisect by applying patches over the ppa llvm source. I'll post my findings.
Comment 6 José Suárez 2014-05-29 21:22:41 UTC
Created attachment 100123 [details]
dmesg while gpu lock up
Comment 7 José Suárez 2014-05-29 21:33:40 UTC
Hmmm... the problem actually seems to not be related to llvm.

I managed to obtain a crash dmesg log (see attachment above) while using llvm 3.5 svn 209067 which in theory was a good llvm svn revision. That made me rethink the cause of the problem and now I think the problem is coming from the kernel.

So I tried linux 3.14 (3.14.0-031400-generic #201403310035), which I had used before the 3.15 rc's started to come out. The thing is that with llvm 3.5 svn 209708 and linux 3.14 the gpu does not lock up. This svn 209708 llvm revision was making the gpu hang quite fast with linux 3.15-rc6 when launching a game.

GPU hangs with 3.15-rc6 and llvm revs higher than 209067 were quite frequent, so could it be that those higher revs are hitting a bug in linux 3.15-rc6?
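
The reasoning above can be written down as a small cross-check of the observations, sketched here in Python. The three rows are the combinations mentioned in this comment. The kernel used for the 209067 hang is not stated explicitly, so 3.15-rc6 is an assumption, and the helper name `outcomes_by` is made up for the example.

```python
from collections import defaultdict

# (kernel, llvm svn revision, did the GPU hang?)
# The kernel in the first row is an assumption; the report does not state it.
observations = [
    ("3.15-rc6", 209067, True),
    ("3.15-rc6", 209708, True),
    ("3.14", 209708, False),
]


def outcomes_by(index):
    """Group the hang outcomes by one column (0 = kernel, 1 = llvm revision)."""
    groups = defaultdict(set)
    for row in observations:
        groups[row[index]].add(row[2])
    return dict(groups)


by_kernel = outcomes_by(0)
by_llvm = outcomes_by(1)

# A factor is consistent with being the cause if each of its values
# always led to the same outcome.
kernel_consistent = all(len(seen) == 1 for seen in by_kernel.values())
llvm_consistent = all(len(seen) == 1 for seen in by_llvm.values())
print(kernel_consistent, llvm_consistent)
```

With so few data points this proves nothing on its own. It only shows that the same llvm revision (209708) both hung and did not hang depending on the kernel, which points at the kernel rather than llvm.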
Comment 8 José Suárez 2014-05-29 22:22:52 UTC
In the end, the problem I was experiencing was indeed related to the linux kernel. I have installed 3.15-rc7 (I had not updated since rc6) and the problems are gone. Not sure why, but the newer llvm revisions were triggering the bug more frequently.

Judging by the 3.15-rc7 changelog, I was probably affected by a drm bug:

Alex Deucher (4):
      drm/radeon: fix DCE83 check for mullins
      drm/radeon: handle non-VGA class pci devices with ATRM
      drm/radeon: fix register typo on si
      drm/radeon/pm: don't allow debugfs/sysfs access when PX card is off (v2)

This report may now be closed.

Thank you, Michel, for your help.

