Bug 89772

Summary: xf86-video-ati-git unstable
Product: xorg Reporter: John <john.ettedgui>
Component: Driver/Radeon Assignee: xf86-video-ati maintainers <xorg-driver-ati>
Status: CLOSED FIXED QA Contact: Xorg Project Team <xorg-team>
Severity: normal    
Priority: medium    
Version: git   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
Attachments:
Description Flags
dmesg after the restart of Xorg
none
Xorg log after Xorg restarted
none
Xorg log of the previous Xorg session, including the crash information
none
gdb trace of Xorg crashing
none
new gdb log with debug symbols
none
new gdb log with latest git
none
xorg log with latest git
none
and dmesg as well
none
Use radeon_get_pixmap_handle() in radeon_dri2_schedule_flip()
none
gdb trace of Xorg crashing with that patch
none
Only enable SYNC extension fences and the Present extension along with DRI3
none
dri3 gdb log
none
dri3 xorg log
none
dri3 dmesg
none
Handle NULL bo in can_exchange()
none

Description John 2015-03-26 06:15:36 UTC
Hello,

After Michel implemented DRI3 I was excited to test it, but the latest git makes my system quite unstable (even with DRI3 off).

The easiest way for me to test this is to start a game with nine. I am not sure why, but with a bad xf86-video-ati-git build, alt-tabbing within the game restarts the whole Xorg server (I'm guessing it crashed and was auto-restarted).

I've bisected it, and the result is quite surprising to me but...:

af1862a37570fa512a525ab47d72b30400d2e2d6 is the first bad commit
commit af1862a37570fa512a525ab47d72b30400d2e2d6
Author: Michel Dänzer <michel.daenzer@amd.com>
Date:   Wed Mar 18 11:05:40 2015 +0900

    Always include misync.h before other misync headers
    
    Older versions of xserver didn't include misync.h from other misync
    headers as needed.

Before this commit I don't have stability issues.
(I've been using xf86-video-ati-git for a while and never had issues till the batch of DRI3 patches came in).

I have a R7 280x, on Linux 4.0-rc4 and Xorg 1.17, mesa-git and llvm-svn.

Thanks!
Comment 1 John 2015-03-26 06:19:55 UTC
Created attachment 114635 [details]
dmesg after the restart of Xorg
Comment 2 John 2015-03-26 06:20:41 UTC
Created attachment 114636 [details]
Xorg log after Xorg restarted
Comment 3 John 2015-03-26 06:21:40 UTC
Created attachment 114637 [details]
Xorg log of the previous Xorg session, including the crash information
Comment 4 Michel Dänzer 2015-03-26 06:36:57 UTC
Does the patch from bug 89681 help?

If not, can you get a backtrace of the crash with gdb, and double-check that the git bisect result is correct? It really doesn't make sense for that commit to be the culprit, in particular with DRI3 disabled.
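
A backtrace like the one requested here is usually captured by attaching gdb to the running server from an SSH session or a second machine, since a stopped X server freezes the local display. A minimal sketch, assuming gdb is installed and the server process is named `Xorg`; the file names are hypothetical:

```shell
# Write a gdb command file (hypothetical name) that lets the server run
# until it crashes and then dumps a full backtrace.
cmds="${TMPDIR:-/tmp}/xorg-bt.gdb"
cat > "$cmds" <<'EOF'
handle SIGPIPE nostop noprint
handle SIGUSR1 nostop noprint
continue
bt full
EOF

# Then, from an SSH session or a second machine (not from inside X):
#   sudo gdb -batch -x "$cmds" -p "$(pidof Xorg)" > xorg-backtrace.txt 2>&1
```

The backtrace is only useful if debug symbols for both the X server and the driver are available; without them most frames show up as bare addresses.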
Comment 5 John 2015-03-26 07:25:22 UTC
I got a crash in the commit I thought was safe, so I guess it's more randomized than I thought. I was fine for about an hour and then got it...

The patch did not help either, although that other bug describes pretty much the behavior I am seeing (and I'm also using KDE if that matters).

I'll try to find the correct git commit and then the backtrace.

Thanks!
Comment 6 John 2015-03-28 00:29:38 UTC
Alright so far I've had that issue as early as commit:
6c3a721cde9317233072b573f9502348dcd21b16 DRI2: Use helper functions for DRM event queue management v3.

I'll keep using a build at the previous commit for a day or so before calling it good, but I've already used it for about half a day without issue.
Comment 7 John 2015-03-28 22:05:57 UTC
So it's been about a day and still no issue on 
commit c3fa22a479e61d1899fa9d327d9c4e2a7f64b0c1
DRI2: Move radeon_dri2_flip_event_handler

Since the problem is not as easy to trigger as I thought, this may not be it, but for now I'd say the problem came from the following commit:
6c3a721cde9317233072b573f9502348dcd21b16
DRI2: Use helper functions for DRM event queue management v3.
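
With an intermittent crash like this one, a bisect is only as reliable as each "good" verdict, so every candidate needs to be used for a while before it is marked. A minimal sketch of that workflow, assuming a local checkout of xf86-video-ati; `bisect_ddx` is a hypothetical helper name, not part of git:

```shell
# Hypothetical helper: start a bisect between a known-bad and a known-good commit.
# Run it from inside a checkout of the driver repository.
bisect_ddx() {
    bad=$1
    good=$2
    git bisect start "$bad" "$good" || return 1
    echo "Build and install this revision, restart Xorg, and use it for a while."
    echo "Then run 'git bisect bad' on a crash, or 'git bisect good' if it held up."
}

# Example with the hashes quoted in this report (first reported bad, last known good):
#   bisect_ddx af1862a37570fa512a525ab47d72b30400d2e2d6 \
#              c3fa22a479e61d1899fa9d327d9c4e2a7f64b0c1
```

Marking a revision "good" too early is what made the first bisect result point at an unrelated commit, so the soak time per step matters more than the number of steps.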

Next step is to get the gdb backtrace, hopefully I can manage today.

Thanks
Comment 8 John 2015-03-29 00:15:01 UTC
Created attachment 114690 [details]
gdb trace of Xorg crashing

Alright that was easier than I expected!
Though my distribution does not provide debug packages for Xorg... for now only the ddx driver has its debug information. I hope that's enough.
Comment 9 John 2015-03-29 02:49:32 UTC
Created attachment 114693 [details]
new gdb log with debug symbols

Actually rebuilding Xorg with symbols was fairly easy.
Also, if it matters, my new quick way of getting it to crash is loading XBMC and then switching between windowed and fullscreen mode until it crashes. That usually takes a few tries, but I don't think it's a sure way either.
Comment 10 Michel Dänzer 2015-03-30 02:20:46 UTC
Is this still an issue with current xf86-video-ati Git master?
Comment 11 John 2015-03-30 04:51:34 UTC
Created attachment 114716 [details]
new gdb log with latest git

I was unable to quickly crash it with xbmc, but it did quickly crash with wine/nine, so no, master is still not good.
Comment 12 John 2015-03-30 04:56:32 UTC
Created attachment 114717 [details]
xorg log with latest git
Comment 13 John 2015-03-30 04:57:05 UTC
Created attachment 114718 [details]
and dmesg as well
Comment 14 Michel Dänzer 2015-03-30 06:35:02 UTC
Created attachment 114720 [details] [review]
Use radeon_get_pixmap_handle() in radeon_dri2_schedule_flip()

Does this patch fix it?
Comment 15 John 2015-03-30 07:17:26 UTC
No, it's the same as latest master.
Comment 16 Michel Dänzer 2015-03-30 07:23:35 UTC
Please attach another gdb backtrace with the patch applied.
Comment 17 John 2015-03-30 07:29:57 UTC
Alright. I just have to switch rooms for the 2nd computer every time, so I got lazy this time :)
Comment 19 John 2015-03-30 07:42:17 UTC
Created attachment 114722 [details]
gdb trace of Xorg crashing with that patch
Comment 20 Michel Dänzer 2015-03-31 04:12:35 UTC
Created attachment 114741 [details]
Only enable SYNC extension fences and the Present extension along with DRI3

Does this patch avoid the problem?
Comment 21 John 2015-03-31 08:39:22 UTC
So far I have not seen any crash with this patch with xbmc or wine, so this may be the correct patch, but as I don't always crash let's give it a day or so before calling it correct.


Also, should I try with DRI3 or stick to DRI2 for now?
Comment 22 John 2015-04-02 00:11:21 UTC
So it's been a few days now. I can't say that I tried really hard to make it crash, but so far I got no crash, so I think the patch is good.

Now I should turn on DRI3 and see what that changes.
Comment 23 Michel Dänzer 2015-04-03 02:05:35 UTC
Fix pushed to Git. Leaving this report open until you confirm it doesn't happen with DRI3 enabled.

commit 98fb4199e63fedd4607cddee64bf602d6398df81
Author: Michel Dänzer <michel.daenzer@amd.com>
Date:   Tue Mar 31 12:25:18 2015 +0900

    Only enable SYNC extension fences and the Present extension along with DRI3
    
    This avoids some trouble with the Gallium nine state tracker, which uses
    the Present extension even when DRI3 is disabled.
Comment 24 John 2015-04-03 10:20:30 UTC
Unfortunately I hit the crash within a few minutes of turning DRI3 on.
I will attach logs once I get gdb running.
Comment 25 John 2015-04-03 11:27:29 UTC
Created attachment 114846 [details]
dri3 gdb log
Comment 26 John 2015-04-03 11:29:16 UTC
Created attachment 114847 [details]
dri3 xorg log
Comment 27 John 2015-04-03 11:29:46 UTC
Created attachment 114848 [details]
dri3 dmesg
Comment 28 Nick Sarnie 2015-04-07 16:39:50 UTC
This appears to be the issue that I was explaining to MrCooper. Same crash on DRI3 and Nine when Alt-Tabbing. Really strange. Let me know if I can help.
Comment 29 John 2015-04-07 22:35:29 UTC
If you can help, feel free to :)

In this case, I wonder if the issues I had with wine, and the ones with xbmc were somewhat unrelated even if the end result was the same, as I usually was able to easily trigger one but not the other at times.

I've been on dri2 since my last post and didn't have any Xorg crash... so Michel's latest patch has been great for DRI2, but I am still waiting for something on DRI3.
Comment 30 John 2015-04-13 07:24:44 UTC
Is there anything I can add to the report to help?
Thanks!
Comment 31 Michel Dänzer 2015-04-16 07:28:50 UTC
Created attachment 115107 [details] [review]
Handle NULL bo in can_exchange()

Does this patch fix the problem with DRI3 enabled?

If yes, can you try reverting patch 114741 and seeing if the problem is still reproducible with DRI3 disabled?
Comment 32 John 2015-04-16 08:15:32 UTC
Thank you for the patch Michel.
Unfortunately I am having some unrelated computer issue right now, and may not be able to test this for a few days.
I'll get back to you as soon as I can.

Thanks!
Comment 33 Nick Sarnie 2015-04-16 16:04:33 UTC
(In reply to Michel Dänzer from comment #31)
> Created attachment 115107 [details] [review]
> Handle NULL bo in can_exchange()
> 
> Does this patch fix the problem with DRI3 enabled?
> 
> If yes, can you try reverting patch 114741 and seeing if the problem is
> still reproducible with DRI3 disabled?

Hi Michel.

The patch has no effect for me. X still dies when I Alt-Tab out of a game. I've linked below to some debug info from Wine that may be useful. I only see these PRESENT errors right before X crashes after the alt-tab.

Log: http://pastebin.com/raw.php?i=zAM4YJuA

Hope this helps, 
sarnex
Comment 34 Michel Dänzer 2015-04-17 02:18:25 UTC
(In reply to sarnex from comment #33)
> The patch has no effect for me. X still dies when I Alt-Tab out of a game.

Please attach a gdb backtrace of the Xorg crash with that patch applied.
Comment 35 Nick Sarnie 2015-04-17 03:11:03 UTC
(In reply to Michel Dänzer from comment #34)
> (In reply to sarnex from comment #33)
> > The patch has no effect for me. X still dies when I Alt-Tab out of a game.
> 
> Please attach a gdb backtrace of the Xorg crash with that patch applied.

Hi Michel,

I've attached the backtrace. I got it twice and the backtrace was the same.

Backtrace: http://pastebin.com/raw.php?i=KyjRtCaq

Hope it helps,
sarnex
Comment 36 John 2015-04-17 06:20:03 UTC
Thanks for doing this for me Sarnex!
Comment 37 Michel Dänzer 2015-04-17 10:01:07 UTC
(In reply to sarnex from comment #35)
> I've attached the backtrace. I got it twice and the backtrace was the same.
> 
> Backtrace: http://pastebin.com/raw.php?i=KyjRtCaq

Thanks, but for the future, when I say 'attach' I mean using the 'Add an attachment' link on this page.

AFAIR you already pastebinned the same backtrace earlier on IRC, so it's probably not related to the patch and not the same crash as John's. Please file your own report.
Comment 38 Nick Sarnie 2015-04-17 12:24:55 UTC
(In reply to Michel Dänzer from comment #37)
> (In reply to sarnex from comment #35)
> > I've attached the backtrace. I got it twice and the backtrace was the same.
> > 
> > Backtrace: http://pastebin.com/raw.php?i=KyjRtCaq
> 
> Thanks, but for the future, when I say 'attach' I mean using the 'Add an
> attachment' link on this page.
> 
> AFAIR you already pastebinned the same backtrace earlier on IRC, so it's
> probably not related to the patch and not the same crash as John's. Please
> file your own report.

Done. Sorry for the ignorance and noise.

Sarnex
Comment 39 John 2015-04-19 10:05:54 UTC
So I just tried with the new patch and it didn't crash when I alt-tabbed, BUT it kind of froze the UI.

All I found was this line in Xorg.log:
"[   364.633] (II) AIGLX: Suspending AIGLX clients for VT switch".

It could be unrelated to the patch, so let me try a few more things first.
Comment 40 John 2015-04-19 10:07:15 UTC
Oh, and by "froze the UI" I mean this: I could hear the music, and when I alt-tabbed I got to see my other window and could see the mouse moving, but that was it. I could not interact with anything, switch desktops or alt-tab back.
I was able to switch VTs back and forth though.
Comment 41 John 2015-04-19 10:11:42 UTC
Just tried with the same xf86-video-ati-git package, but disabling DRI3 in xorg.conf and this time it worked fine... so I guess it's either DRI3 or the way nine uses it.

Would there be any use in a gdb trace as Xorg didn't crash this time? I'm not sure if I'd even get something...
Comment 42 Michel Dänzer 2015-04-21 02:05:45 UTC
Note that the problem actually happens during a DRI2 page flip operation. Any idea whether it's the app or compositor (which one are you using, BTW?) which is falling back to DRI2?
Comment 43 John 2015-04-21 05:19:31 UTC
I am unable to answer that question, it's too deep for me, maybe Sarnex can.

But I am using kwin_x11 as my compositor. Are there any special settings you'd like me to try there?
Comment 44 Michel Dänzer 2015-05-01 10:09:34 UTC
Does the current master branch of git://people.freedesktop.org/~daenzer/xf86-video-ati help for this?
Comment 45 John 2015-05-07 18:50:36 UTC
This branch is partially better, so we have some progress!


The good part is that it doesn't seem to crash with your new changes. I have not tried it extensively, but with the standard master I usually crash X at my first alt-tab.

The bad part is that I don't really get correct alt-tab behavior: if I tab out of the game I expect to see only the other window (konsole in my case), but with this branch it keeps flipping back and forth between the two.
I can alt-tab fine between chromium windows though.
Comment 46 John 2015-05-07 21:31:44 UTC
Oh with this branch and dri3 on, I get visual corruptions inside of XBMC/Kodi, but only in the menus, not while playing a movie.
Switching dri3 off removes that problem.
Comment 47 Michel Dänzer 2015-05-13 09:40:04 UTC
I've updated that branch again, does it work better now?
Comment 48 John 2015-05-13 23:18:48 UTC
I just tried and the behavior is the same.

Something I have noticed, though, is that I can alt-tab fine during the in-game loading screen, but past it I get the issue. No idea what this means, but it might help.

Is there any native DRI3 app you think I should try, to see if my issue at this point is not in nine instead of the ddx?

Thank you for keeping at this!
Comment 49 Michel Dänzer 2015-05-14 02:20:47 UTC
(In reply to John from comment #48)
> Is there any native DRI3 app you think I should try, to see if my issue at
> this point is not in nine instead of the ddx?

Apps don't use DRI3 or DRI2 directly. If you enable DRI3 for the Mesa build, OpenGL apps should use DRI3. With current Mesa Git, the debugging output enabled by LIBGL_DEBUG=verbose explicitly says whether it's using DRI3 or DRI2.

I don't have nine set up yet, but I was able to reproduce issues when running kwin with DRI3 and xbmc with DRI2 (via LIBGL_DRI3_DISABLE=1) and switching between fullscreen and windowed mode in xbmc, which those changes fix for me. Can you confirm that?
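
The two environment variables mentioned in this comment can be wrapped in small shell helpers. This is only a sketch: `run_dri2` and `which_dri` are hypothetical names, and the exact wording of Mesa's debug output is not assumed beyond it containing "DRI2" or "DRI3".

```shell
# Hypothetical helpers; only the two environment variables come from this comment.

# Run a command with Mesa's DRI3 path disabled for that one process only.
run_dri2() {
    LIBGL_DRI3_DISABLE=1 "$@"
}

# Run a command with verbose libGL debugging and print the first
# "DRI2"/"DRI3" mention found in its output (stderr included).
which_dri() {
    LIBGL_DEBUG=verbose "$@" 2>&1 | grep -o -i 'dri[23]' | head -n 1
}

# Example usage (glxinfo is assumed to be installed):
#   which_dri glxinfo
#   run_dri2 kodi
```

Setting the variable per command rather than exporting it keeps the compositor on DRI3 while a single client falls back to DRI2, which is the mixed setup described above.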
Comment 50 John 2015-05-14 02:42:24 UTC
(In reply to Michel Dänzer from comment #49)
> Apps don't use DRI3 or DRI2 directly. If you enable DRI3 for the Mesa build,
> OpenGL apps should use DRI3. With current Mesa Git, the debugging output
> enabled by LIBGL_DEBUG=verbose explicitly says whether it's using DRI3 or
> DRI2.
> 
Cool!

> I don't have nine set up yet, but I was able to reproduce issues when
> running kwin with DRI3 and xbmc with DRI2 (via LIBGL_DRI3_DISABLE=1) and
> switching between fullscreen and windowed mode in xbmc, which those changes
> fix for me. Can you confirm that?

Hmmm, I had issues with XBMC going from fullscreen to windowed a while back, but you fixed them, so I am not sure what you are asking me to confirm here.
I just tried with dri3 enabled and LIBGL_DRI3_DISABLE=1 and kodi behaves fine, or did you want me to switch to the standard branch before testing this?
Comment 51 Michel Dänzer 2015-05-20 05:50:50 UTC
(In reply to John from comment #50)
> (In reply to Michel Dänzer from comment #49)
> > With current Mesa Git, the debugging output enabled by LIBGL_DEBUG=verbose
> > explicitly says whether it's using DRI3 or DRI2.
> > 
> Cool!

Does it say DRI3 or DRI2 for you?


Branch updated again with another attempt, does that work better?
Comment 52 John 2015-05-20 05:53:04 UTC
It said what I expected (dri3 when not using the disable command option, dri2 when using it).

I'll try your new branch and report a bit later.

Thanks!
Comment 53 John 2015-05-21 03:15:43 UTC
It's the same behavior with the latest pull from your branch.
Comment 54 John 2015-08-07 06:42:12 UTC
So I had not tested this in a while, but it seems to be fixed now.
I just did a quick alt-tab test with Heroes Of the Storm and it looked fine.

I have no clue what fixed it, as there have been changes in the DDX, Mesa and Linux since then, but maybe it is time to close this.
Comment 55 Michel Dänzer 2015-08-10 06:15:17 UTC
(In reply to John from comment #54)
> So I had not tested this in a while, but it seems to be fixed now.
> I just did a quick alt-tab test with Heroes Of the Storm and it looked fine.

Great, resolving this report as fixed, thanks for the update. Please reopen this report if the problem occurs again.

BTW, you did test with DRI3 enabled, right? :)
Comment 56 John 2015-08-10 07:12:16 UTC
I believe so yes :)
