Created attachment 117719 [details] Xorg log from crash I've been experiencing periodic Xorg segfaults on my laptop for some time now. The odd thing is that it only appears to happen while playing videos, and then only with GNOME's Totem video player (not vlc, not HTML5 video in a browser). Sometimes this results in being kicked back to a login screen. Other times the video and input seem to lock up, but the audio keeps playing, forcing me to power off the laptop. I have a Core i5-2540M and am using the integrated HD 3000 graphics. I'm attaching an Xorg log from the crash, containing a stack trace. Please let me know if any further information is needed to help diagnose this.
This looks like bug 91577 (hard to tell with the incorrect stack traces). That was from mishandling an allocation failure. Could you compile xf86-video-intel from http://cgit.freedesktop.org/xorg/driver/xf86-video-intel/ with ./configure --enable-debug so that I know exactly what version you have plus include a little more debugging information?
Oh sorry. Slipped my mind somehow. I'm using xorg-x11-intel-drv-2.99.917-12.20150615 on Fedora 22. I'll see if I can manage to install from git and crash again, though (the trace I submitted is with the debug packages installed, but maybe that isn't good enough).
Created attachment 117720 [details] Crash log with git driver Attached the log after a crash with xf86-video-intel compiled from git.
    % git log -1 --format="%H"
    9b0ed16385ae076c262a2e09639822d9488ccf57
Does this look more helpful?
Hmm, we have symbols, which is a good start! Looks like a different type of confusion. Next step, could you compile with ./configure --enable-debug=full and capture the debug log preceding that event?
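For anyone reproducing this, the build requested above might look roughly like the sketch below. It is only a sketch under assumptions: the function name `build_intel_driver` and the default clone directory are made up for illustration, it assumes the cgit URL quoted earlier in this thread can be cloned directly, and it assumes the tree ships the usual `autogen.sh` wrapper that forwards its arguments to `configure`. Installation is left out on purpose, since the prefix and module path differ between distributions.

```shell
# Hypothetical helper: clone and build xf86-video-intel with a debug level.
#   $1: configure debug flag (--enable-debug or --enable-debug=full)
#   $2: directory to clone into (assumed default: xf86-video-intel)
# The body runs in a subshell so the caller's working directory is untouched.
build_intel_driver() (
    debug_flag=${1:---enable-debug}
    workdir=${2:-xf86-video-intel}

    git clone http://cgit.freedesktop.org/xorg/driver/xf86-video-intel/ "$workdir" &&
    cd "$workdir" &&
    # autogen.sh regenerates the build system, then runs configure with the flag
    ./autogen.sh "$debug_flag" &&
    make &&
    # Print the exact commit that was built, to quote in the bug report
    git log -1 --format="%H"
)

# Example (not run automatically):
#   build_intel_driver --enable-debug=full
```

Printing the commit hash at the end makes it easy to state exactly which version produced a given crash log.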
Created attachment 117722 [details] Crash log with full debugging Do you just need the Xorg.log.old from after the crash? Attached. (xz compressed since it's huge.) If you mean something else, you'll probably have to explain how to get it.
(In reply to Dan Doel from comment #5) > If you mean something else, you'll probably have to explain how to get it. Perfect, thanks.
I experienced the same issue as described (playing videos full screen crashes X; downstream bug report https://bugzilla.redhat.com/show_bug.cgi?id=1252660 ), but the trace looks different. Is this the same issue?
(In reply to (bitlord) from comment #7) > I experienced same issue as described (playing videos full screen crashes > X)(downstream bug report https://bugzilla.redhat.com/show_bug.cgi?id=1252660 > ) but trace look different, is this the same issues as this? Missed you on irc, no this is a different issue, bug 91120.
(In reply to Chris Wilson from comment #8) > (In reply to (bitlord) from comment #7) > > I experienced same issue as described (playing videos full screen crashes > > X)(downstream bug report https://bugzilla.redhat.com/show_bug.cgi?id=1252660 > > ) but trace look different, is this the same issues as this? > > Missed you on irc, no this is a different issue, bug 91120. I'll link it downstream. Thank you! ;-)
Ok, I think I understand that assert and
    commit 07eee812b2047642c76190d043ee4aa4ce338c64
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date: Mon Aug 17 20:38:57 2015 +0100
        sna/dri2: Avoid pushing the triple buffer into the cache list twice
should undo the damage. Can you please update (git pull) and retest? Hopefully we should be getting past that crash and back to the original bug.
Created attachment 117740 [details] Full debug log with patch I still get crashes with this patch. New full debug log attached. Looks like the same assertion failure. I cut the first half of the log off (million lines) because I couldn't compress it small enough. Hope that's all right.
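Trimming a huge debug log so that only the tail (where the crash and the events leading up to it are recorded) gets attached can be sketched as follows. The function name `trim_log`, the line count, and the paths in the example are assumptions for illustration; the location of Xorg.log.old depends on the distribution and on how X is started.

```shell
# Hypothetical helper: keep only the last N lines of a log file.
#   $1: source log
#   $2: number of trailing lines to keep
#   $3: destination file
trim_log() {
    src=$1
    keep=$2
    dest=$3
    tail -n "$keep" "$src" > "$dest"
}

# Example (paths are assumptions; adjust to your system):
#   trim_log /var/log/Xorg.0.log.old 1000000 /tmp/Xorg-crash.log
#   xz -9 /tmp/Xorg-crash.log    # leaves /tmp/Xorg-crash.log.xz
```

Keeping the tail rather than the head matters here, because the assertion failure or backtrace is written at the very end of the log.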
Created attachment 117744 [details] Full crash log for 07eee812b2047642c76190d043ee4aa4ce338c64 I found a much more reliable/quick way to produce the crash (using Epiphany + Youtube), so here's a much smaller and complete log for a crash with the latest git.
Both of those are failures after a stale back buffer.
    commit 79fc9a923cdfa4218868f4c371ca80fd40f41253
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date: Tue Aug 18 09:21:20 2015 +0100
        sna/dri2: Immediately complete a stale swap if any are queued
Both of these patches are for very recent changes, so still onwards towards resolving the original issue...
Created attachment 117766 [details] Full crash log for 18e484502727f2e2e16138a3de5b6727f3879a2b Similar assertion failure, new location. With all new git commits as of this post.
Created attachment 117778 [details] Another Xorg.log showing segfault (w/o debugging) FWIW, it seems like I'm hitting the very same bug on Arch Linux with an HD 4000. This is my Xorg.log showing the segfault. Using the latest git revision (18e4845) with debugging enabled, I hit the same assertion as shown above. Even compressed, the log is too large to attach, so I uploaded it somewhere else instead: http://homepages.uni-paderborn.de/lass/crash-debug.log.xz I can easily reproduce the assertion error just by using the Chromium web browser, as far as I can see without any video playback.
Another go, now with more assertions:
    commit dab1c0f159d74fc82618b88262e064010e6387ec
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date: Tue Aug 18 23:27:22 2015 +0100
        sna/dri2: Move the pending swap from the buffer to the event
        To ease tracking of the next swap, stash it on the event (which is then reused) rather than the back buffer (which changes frequently). In addition, add debug flags and assertions to track event stages (such as making sure we do not decouple/free an event that we have sent a signal back to the client).
As always hopefully this gets us to the point of chasing down the original bug!
It seems like I now got the original segfault back again: http://homepages.uni-paderborn.de/lass/crash-debug-8c59c5b.log.xz
Created attachment 117786 [details] Full crash log for 8c59c5ba4e368af2ee4a4a811ebf3934de7e4402 This commit removed my very easy reproduction method for the crash, so the log is much bigger this time. The assertion failure is also back in sna_dri2_schedule_swap.
Ok, that may well be the original crash, stack trace looks similar enough. So it is really a victim of the bug we were tracing anyway.
    commit 8a59e85801cb0592eb2d0a074d9254d26a65240f
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date: Wed Aug 19 16:39:11 2015 +0100
        sna/dri2: Initialise scratch.pScreen for copying
fixes the crash.
    commit 6027bfc461c01838577be052d1a76ffc6906e3cc
    Author: Chris Wilson <chris@chris-wilson.co.uk>
    Date: Wed Aug 19 16:49:01 2015 +0100
        sna/dri2: Also track the front bo as an active buffer
should catch it happening.
Created attachment 117791 [details] Crash log for 6027bfc461c01838577be052d1a76ffc6906e3cc Here's a new crash log for the latest commit. Quite different this time. It's not with full debugging enabled. I can try to get it to happen with full debugging, but I was having trouble making it happen before the log file got to several hundred megabytes. Maybe I'm confused, though. Was one of those patches intended as a fix for this bug, and this is a separate issue?
The crash was just a result of the bug that I've been looking at in the full debug logs, i.e. the crash itself was just a symptom of the same underlying issue. (Though it was worth fixing by itself, we are still looking for the root cause.) The second patch is trying to track down where the confusion comes from.
Okay, I see. Unfortunately, I'm still having trouble getting it to cause an error during full debugging. I'm not sure if I just got lucky while I only had normal debugging on, or if the full debugging somehow makes it less likely for the problem to happen (if it were a timing issue, everything seems to be running slower, for instance). I'll continue trying.
Created attachment 117799 [details] Full debug log for 6027bfc461c01838577be052d1a76ffc6906e3cc Here's a full debug log for commit 6027bfc461c01838577be052d1a76ffc6906e3cc. I had to chop off the first couple hundred thousand lines. I'll start trying to get an error on the latest commit.
So, I haven't had any luck producing an error with the latest patches (84854419471ebd338ae20d411e44f256be569d1a). I've tried a lot of the things that were causing exits previously, and X is still running. Does it make sense to you that one of your changes has fixed the bug?
That was the intent, yes. Keep running with --enable-debug for a few days as these races can be hard to trigger.
Will do. Thanks for your help.
Still no more crashes. Seems like that last patch did the trick.
I've been running recent versions with debug enabled and can't reproduce this anymore either.
Let's truly test it by marking it as resolved then! Thanks for the testing and feedback.