Bugzilla – Bug 70123
Freeze caused by 'winsys/radeon: remove cs_queue_empty' commit
Last modified: 2013-10-16 13:50:40 UTC
The whole window manager froze immediately after I logged into my Openbox systems, and I had to switch to a console to roll back to a previously working state.
There was nothing in any of the system logfiles and disabling things like conky and compton (the compositor) didn't resolve anything.
I was able to log in via the slim display manager and could see the wallpaper, but there was no openbox menu (not available via a keyboard shortcut either). Ctrl-Alt-Bksp put me back to slim, but an attempt at using slim's console login produced a responsive but unreadable xterm display (white blocks instead of text).
Through trial and error I traced the regression to the 'winsys/radeon: remove cs_queue_empty' commit from Sept 22. Running git revert -n 0653c66ef40ac553f91b29bbda7f59f7ce6948fa and recompiling fixed the issue for me.
I'm running debian unstable: 32-bit on a laptop with a Mobility Radeon HD 4530/4570; 64-bit on a desktop with an X850 XT.
I haven't yet had a chance to see if the desktop machine works with the above solution, but both machines were previously in sync and went wrong with the same update.
I've just tested the workaround on my 64-bit Debian desktop (X850 XT), and reverting 0653c66ef40ac553f91b29bbda7f59f7ce6948fa fixes this too.
I've changed the category from 'Other' to 'Drivers/Gallium/r600' so that the right people see this.
Created attachment 87229 [details] [review]
Please try the attached patch, it might fix your issue.
Unfortunately this doesn't fix the issue. When I get a spare moment I'll try the patch on the desktop pc (which has a different card).
(In reply to comment #3)
> Unfortunately this doesn't fix the issue. When I get a spare moment I'll try
> the patch on the desktop pc (which has a different card).
Please try to attach a gdb to the deadlocked process and provide the output of the following command: "thread apply all bt"
Thanks in advance,
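For anyone following along, the same backtrace can also be collected non-interactively. The sketch below is only an illustration and not part of this report: it assumes gdb is on the PATH and that you are allowed to attach to the process. The helper names are made up; -p, -batch and -ex are standard gdb command-line options.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches to `pid`, prints every
    thread's backtrace, then exits (hypothetical helper name)."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def collect_backtrace(pid):
    """Run gdb and return its combined stdout/stderr as text.

    Assumes gdb is installed and attaching is permitted; depending on
    the ptrace restrictions in force this may need root.
    """
    result = subprocess.run(
        gdb_backtrace_command(pid),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        check=False,  # return gdb's output even if attaching fails
    )
    return result.stdout
```

Calling collect_backtrace() with the pid of the frozen process and attaching the returned text should give the same information as the interactive session.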
Created attachment 87341 [details]
Backtrace of compton deadlock
gdb attach <pid of compton>
(gdb) thread apply all bt
I should have disabled compton again after applying your patch: with the patch, the session starts fine without compton and only freezes once I run it. Before the patch, disabling compton had no effect and things froze anyway.
(In reply to comment #5)
> Created attachment 87341 [details]
> Backtrace of compton deadlock
Are you sure that this is the whole output of "thread apply all bt"?
There is only one thread shown and that's the locked up one, but there should also be at least the command submission thread.
Created attachment 87498 [details]
'thread apply all bt' with the commit reverted instead of the patch
Sorry for the delay. I've recompiled, reinstalled and double checked everything and there's definitely just one thread showing on 'thread apply all bt'.
If I revert the 'winsys/radeon: remove cs_queue_empty' commit instead of applying the patch then I get 2 threads as per the gdb_compton_good file I've just attached.
Created attachment 87517 [details] [review]
That looks like we are accessing the CS after the winsys has already been destroyed.
Please try the attached workaround on top of the latest mesa code and also compile the driver with debugging symbols (give --enable-debug to configure) and then do the backtrace again.
Created attachment 87588 [details]
thread apply all bt
With that patch things are still freezing up, see the attachment for the gdb output (which still has just one thread).
The command which causes the freeze is :-
compton -b --backend glx --config /dev/null
(The '--config /dev/null' is suggested by compton's maintainer to force everything to its defaults when troubleshooting. Using my own config file and changing the glx-related settings within it seems to have no effect on whatever is at fault.)
In my openbox autostart script the following line invokes compton in the background without any problems at all :-
compton --backend glx --config /dev/null &
If the backend is changed to xrender then things run fine whether the -b switch is used or not.
So compton crashes when using the -b switch to daemonise in conjunction with the glx backend, and omitting the switch works around the problem.
Perhaps not unexpectedly, if I revert the following commits then things run fine.
8bc7673ef874faa95d43c255c7fc631c2d2160c0 radeon/winsys: fix handling in radeon_drm_cs_flush v2
0653c66ef40ac553f91b29bbda7f59f7ce6948fa winsys/radeon: remove cs_queue_empty
I'm starting to wonder if this is a bug in compton.
(In reply to comment #10)
> The command which causes the freeze is :-
> compton -b --backend glx --config /dev/null
Thanks for this, I can reproduce the problem now.
Not sure if that's an issue in compton or not, but it's definitely a bit odd.
Created attachment 87660 [details]
It's indeed a specific problem that only happens with compton's "-b" option.
Compton initializes the X connection and GLX context (loads and starts the driver) and then forks into the background. This creates all kinds of problems with our driver infrastructure and I'm not 100% sure if it's allowed or not.
Previously it just worked by pure coincidence. The attached patch is a workaround for at least this problem, but there might be others as well.
Going to discuss this on the mailing list.
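The missing command submission thread in the earlier backtraces fits this explanation: after fork() the child process contains only the thread that called fork(), so any worker thread started beforehand does not exist there. The following is a minimal sketch of that effect in plain Python, not Mesa or compton code. The function name and the stand-in worker thread are purely illustrative, and it assumes a Unix-like system where os.fork() is available.

```python
import os
import threading


def thread_counts_across_fork():
    """Return (threads seen by the parent, threads seen by the forked child)."""
    stop = threading.Event()
    # Stand-in for a driver worker thread started during initialisation.
    worker = threading.Thread(target=stop.wait, daemon=True)
    worker.start()
    parent_count = threading.active_count()

    read_fd, write_fd = os.pipe()
    # Note: newer Python versions may warn when forking a process that
    # has several threads, which is the hazard illustrated here.
    pid = os.fork()
    if pid == 0:
        # Child: only the forking thread was duplicated, and Python's
        # threading bookkeeping is reset to reflect that.
        os.close(read_fd)
        os.write(write_fd, str(threading.active_count()).encode())
        os._exit(0)

    # Parent: read the child's thread count, then clean up.
    os.close(write_fd)
    child_count = int(os.read(read_fd, 16).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)
    stop.set()
    worker.join()
    return parent_count, child_count


if __name__ == "__main__":
    parent, child = thread_counts_across_fork()
    print("threads in parent:", parent)
    print("threads in child:", child)
```

The parent sees the worker thread while the child does not, which matches the single-thread backtraces from the frozen compton process. Code in such a child that waits for work to be finished by the missing thread could wait indefinitely.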
Please file a bug report with compton. I'm still not 100% sure, but it indeed looks like forking into the background with a current GLX context is not allowed.
Previously it just worked by coincidence.