Bug 13328 - VT switching with OpenGL application running freezes Xorg
Status: RESOLVED DUPLICATE of bug 13196
Alias: None
Product: Mesa
Classification: Unclassified
Component: Drivers/DRI/i915
Version: unspecified
Hardware: Other All
Importance: medium normal
Assignee: Default DRI bug account
 
Reported: 2007-11-20 16:26 UTC by Ben Gamari
Modified: 2007-12-04 09:06 UTC (History)


Attachments
A log from a crashed Xorg session after VT switch (34.85 KB, text/plain)
2007-11-20 16:27 UTC, Ben Gamari
Backtrace from frozen Xorg process (1.15 KB, text/plain)
2007-11-29 14:23 UTC, Ben Gamari
Test patch (627 bytes, patch)
2007-11-30 00:19 UTC, Michel Dänzer
A backtrace from Xorg patched with Attachment 12861 (1.95 KB, text/plain)
2007-11-30 09:18 UTC, Ben Gamari
Xorg.log from crashed patched server (93.74 KB, text/plain)
2007-11-30 09:37 UTC, Ben Gamari

Description Ben Gamari 2007-11-20 16:26:52 UTC
When one attempts to suspend the system while running compiz (I have yet to try any other application), Xorg freezes on resume with nothing but a black screen and a cursor. The machine is otherwise still responsive (tested through a serial console), although the keyboard does nothing. Log attached.
Comment 1 Ben Gamari 2007-11-20 16:27:42 UTC
Created attachment 12660
A log from a crashed Xorg session after VT switch
Comment 2 Ben Gamari 2007-11-20 16:28:56 UTC
The problem actually appears to be an issue with VT switching as a whole. Switching to a VT then switching back to Xorg with compiz running results in a freeze 100% of the time.
Comment 3 Michael Fu 2007-11-27 17:40:54 UTC
Might be a dup of bug 13196...
Comment 4 Jesse Barnes 2007-11-27 17:43:23 UTC
Can you get a backtrace?  See http://www.x.org/wiki/Development/Documentation/ServerDebugging
Comment 5 Ben Gamari 2007-11-29 14:23:52 UTC
Created attachment 12836
Backtrace from frozen Xorg process

This was acquired by attaching a gdb session to Xorg remotely, and switching away from and back to Xorg.
Comment 6 Ben Gamari 2007-11-29 14:28:14 UTC
It might be significant to note that after getting the above backtrace, I was able to restart Xorg and the system returned to fully working order.
Comment 7 Ben Gamari 2007-11-29 14:35:33 UTC
Moreover, it appears that the following sequence of events also causes the same type of freeze:

- Start Xorg with non-compositing window manager
- Switch to a VT
- Switch back to Xorg
- Try starting compiz
- Admire frozen Xorg session
Comment 8 Michel Dänzer 2007-11-30 00:19:53 UTC
Created attachment 12861
Test patch

So it's a deadlock due to recursive locking...

Does this patch happen to work? That 'drop batchbuffer on the floor' code's still a little iffy though.
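
For readers unfamiliar with the failure mode named above, here is a minimal sketch of recursive locking. It is not Mesa or DRI code: `hw_lock`, `flush_locked_work()` and `handle_contended_lock()` are hypothetical names, and a POSIX error-checking mutex stands in for the hardware lock so that the second lock attempt reports `EDEADLK` rather than hanging the way the frozen Xorg process did.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for a driver's hardware lock; not Mesa code. */
static pthread_mutex_t hw_lock;

/* Create the lock as an error-checking mutex so that a relock by the
 * owning thread returns EDEADLK instead of blocking forever. */
static void hw_lock_init(void)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    assert(rc == 0);
    rc = pthread_mutex_init(&hw_lock, &attr);
    assert(rc == 0);
    pthread_mutexattr_destroy(&attr);
    (void)rc;
}

/* Inner step (think "flush pending work") that takes the lock itself. */
static int flush_locked_work(void)
{
    int rc = pthread_mutex_lock(&hw_lock);
    if (rc != 0)
        return rc;              /* EDEADLK: this thread already owns the lock */
    pthread_mutex_unlock(&hw_lock);
    return 0;
}

/* Outer step that holds the lock and then calls the inner step. */
static int handle_contended_lock(void)
{
    int rc;
    pthread_mutex_lock(&hw_lock);
    rc = flush_locked_work();   /* recursive acquisition happens here */
    pthread_mutex_unlock(&hw_lock);
    return rc;
}
```

After `hw_lock_init()`, calling `handle_contended_lock()` returns `EDEADLK`, while calling `flush_locked_work()` on its own returns 0. With a default (non-checking) lock the nested call would typically block forever instead.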
Comment 9 Ben Gamari 2007-11-30 09:18:22 UTC
Created attachment 12877
A backtrace from Xorg patched with Attachment 12861

As you can see, the patch certainly did something. Now, instead of deadlocking, Xorg crashes with a SIGABRT, apparently after waiting for the GPU. Looks promising.
Comment 10 Ben Gamari 2007-11-30 09:21:39 UTC
After restarting X following the crash (which, perhaps unsurprisingly, resulted in another deadlock with X using 100% of a CPU), I noticed the following messages in the kernel log. They could have been produced by either the initial or the restarted X session:

[drm:drm_fence_lazy_wait] *ERROR* Fence timeout. GPU lockup or fence driver was taken down. 0 0x0000344e 0x03 0x01 0x00
[drm:drm_fence_lazy_wait] *ERROR* Pending exe flush 1 0x0000344e
[drm:drm_bo_expire_fence] *ERROR* Detected GPU lockup or fence driver was taken down. Evicting buffer.
Comment 11 Ben Gamari 2007-11-30 09:37:34 UTC
Created attachment 12878
Xorg.log from crashed patched server

Log gives ring buffer dump for your perusal
Comment 12 Michel Dänzer 2007-12-03 00:56:43 UTC
So this results in a GPU lockup instead, which probably isn't too surprising given the recursive batchbuffer flush...

Does it work if you comment out the whole if (sarea->width != intel->width || sarea->height != intel->height) block in intelContendedLock() instead?
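
As a rough illustration of the kind of check being discussed, the sketch below compares a size published in a shared area against the driver's cached copy and resynchronizes on a mismatch. Only the condition itself comes from the comment above; the struct layouts, the `needs_reset` flag and the function name `resync_after_contended_lock()` are assumptions made for this example, and the real body of that block in intelContendedLock() is not shown in this report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; the real Mesa structures differ. */
struct shared_area { int width, height; };  /* state published by the X server */
struct driver_ctx  { int width, height; bool needs_reset; };

/* Called after the hardware lock was lost and reacquired. If the size
 * recorded in the shared area no longer matches the driver's cached
 * copy, adopt the new size and flag the context for a reset.
 * Returns true when a mismatch was found. */
static bool resync_after_contended_lock(struct driver_ctx *ctx,
                                        const struct shared_area *sarea)
{
    assert(ctx != NULL && sarea != NULL);
    if (sarea->width != ctx->width || sarea->height != ctx->height) {
        ctx->width = sarea->width;
        ctx->height = sarea->height;
        ctx->needs_reset = true;
        return true;
    }
    return false;
}
```

Commenting out such a block means the cached size is never refreshed, so this sketch only shows what is being skipped, not why skipping it would affect the lockup.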
Comment 13 Ben Gamari 2007-12-03 07:15:16 UTC
Nope, no dice. Commenting out this block just seems to produce a complete hardware lockup instead. The kernel may still be alive, in which case I could pull more detailed post-mortem information over ssh, but I won't have another computer until later today. Should I try this?
Comment 14 Michael Fu 2007-12-03 22:43:17 UTC
Ben, does the fix in bug# 13196 help?
Comment 15 Ben Gamari 2007-12-03 23:40:27 UTC
(In reply to comment #14)
> Ben, does the fix in bug# 13196 help?
> 

Michael,

Thanks a ton, that was the bug. I can now VT switch with complete reliability. Will this be in git soon?
Comment 16 Jesse Barnes 2007-12-04 09:06:13 UTC
Reopening so I can DUP it.
Comment 17 Jesse Barnes 2007-12-04 09:06:20 UTC

*** This bug has been marked as a duplicate of bug 13196 ***

