Bug 27232 - A small java program that causes X to hang
Status: RESOLVED NOTOURBUG
Alias: None
Product: xorg
Classification: Unclassified
Component: Server/General
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Xorg Project Team
QA Contact: Xorg Project Team
Blocks: xserver-1.9
Reported: 2010-03-21 15:15 UTC by Mike Fairbank
Modified: 2010-06-28 08:08 UTC (History)
Attachments
the java program that causes the error (4.58 KB, application/zip)
2010-03-21 15:15 UTC, Mike Fairbank
Output from xscope during hang (218.34 KB, text/plain)
2010-05-29 03:05 UTC, Mike Fairbank

Description Mike Fairbank 2010-03-21 15:15:26 UTC
Created attachment 34303: the java program that causes the error

When this java program is run, it causes X Windows to hang (i.e. no windows respond to mouse clicks at all, even though the mouse pointer can still move).  Even though the bug may be caused by Java, I still think there is a bug in X Windows, because it should handle the problem more gracefully.

To reproduce the error take the following steps (this works every time for me):

1. unzip the attached folder and cd into it.
2. run the java debugger by typing "jdb"
3. in the java debugger, type "stop at src.MainClass:45"
4. in the java debugger, type "run src.MainClass"

A window should open that displays a drop-down item.  Change the
drop-down item from A to B.  This causes X windows/Gnome to hang.  The
mouse pointer still moves but you can't click on anything (i.e. nothing
responds to mouse-clicks).

The java code attached is the minimal code I could create that causes
the problem. It took hours of crashing my OS and rebooting to get it
that far, but it probably can go smaller.

Sorry I haven't managed to test this bug on the latest version of X, but I've had several people verify this bug on different linux OS combinations, so I hope you can reproduce it too.

Thanks, I hope this helps! 

End of bug report


Further Details of my OS + Java:

X.Org X Server 1.6.4
Release Date: 2009-9-27
X Protocol Version 11, Revision 0
Build Operating System: Linux 2.6.24-23-server x86_64 Ubuntu
Current Operating System: Linux mike-desktop 2.6.31-20-generic #58-Ubuntu SMP Fri Mar 12 04:38:19 UTC 2010 x86_64

java version "1.6.0_16"
Java(TM) SE Runtime Environment (build 1.6.0_16-b01)
Java HotSpot(TM) 64-Bit Server VM (build 14.2-b01, mixed mode).
Comment 1 Peter Hutterer 2010-03-25 21:58:08 UTC
Could this be a dupe of bug 25400?
Comment 2 Mike Fairbank 2010-03-26 11:51:34 UTC
I couldn't get bug 25400 to run, but superficially it sounds different because that bug states "Clicking anywhere else on the button traps the mouse cursor within the bounds of the button, with no means of escape other than switching to a console and killing the application.", whereas with this bug (27232), when X windows hangs you can freely move the mouse pointer over the whole screen, it's just that no windows or buttons respond to mouse clicks.  

By the way: 
1. The computer is still running fine despite not being able to mouse-click on anything, for example my CPU graph animation continues running in the background.
2. You can escape from this hang with alt+f7 (in Ubuntu, gnome anyway).


Comment 3 Peter Hutterer 2010-04-15 23:54:58 UTC
(In reply to comment #2)
> I couldn't get bug 25400 to run, but superficially it sounds different because
> that bug states "Clicking anywhere else on the button traps the mouse cursor
> within the bounds of the button, with no means of escape other than switching
> to a console and killing the application.", whereas with this bug (27232), when
> X windows hangs you can freely move the mouse pointer over the whole screen,
> it's just that no windows or buttons respond to mouse clicks.  

whether the pointer is trapped is just one parameter in the grab request (ConfineTo) and no-one but motif still uses it. So it could still be the same bug.

> By the way: 
> 1. The computer is still running fine despite not being able to mouse-click on
> anything, for example my CPU graph animation continues running in the
> background.
> 2. You can escape from this hang with alt+f7 (in Ubuntu, gnome anyway).

I couldn't reproduce it with the test program though (that's on 1.8 with the patches from 25400 applied). Do you have the chance of trying the patches?
Comment 4 Mike Fairbank 2010-04-16 16:02:24 UTC
(In reply to comment #3)

> I couldn't reproduce it with the test program though (that's on 1.8 with the
> patches from 25400 applied).

That's good news, just to double check: you did try to reproduce the error using the java debugger (jdb) with the breakpoint in the correct place, as described in the original post?  

> Do you have the chance of trying the patches?

I'd like to, but I'm a novice/intermediate on linux in general, how easy is it?  Last time I had problems with X starting, I had to reinstall my whole OS distribution, so I'm a bit wary.  But if there's a simple guide you can link me to (for both the xorg upgrade, that will work under ubuntu; and how to install a patch), then I'll give it a go.
Comment 5 Peter Hutterer 2010-04-21 21:12:13 UTC
(In reply to comment #4)
> (In reply to comment #3)
> 
> > I couldn't reproduce it with the test program though (that's on 1.8 with the
> > patches from 25400 applied).
> 
> That's good news, just to double check: you did try to reproduce the error
> using the java debugger (jdb) with the breakpoint in the correct place, as
> described in the original post?  


argh, no, I overlooked this every single time I read the report.

but it makes a lot more sense now :)
I'm pretty sure what you're running into here is a common debugging issue under X. When a popup menu is created, the client (java in your case) requests a grab (either passive or active, what matters is that it _activates_ when the popup is displayed). The reason is simple - if you click outside of the popup, the popup still gets the event and so knows that it must undisplay itself.

if you set a breakpoint between requesting that grab and the grab being released, then nothing will work until you continue and release the grab. this also explains why I didn't see it without the jdb.

in the extreme case, if the app also has a keyboard grab, you won't be able to use the keyboard either since both are grabbed by the now halted client.
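
The sequence described here can be illustrated with a toy model. This is only a sketch of the routing rule (an active grab redirects clicks to the grabbing client, whether or not that client is running); it is not X server code, and the class and method names (GrabModel, deliverClick, halt) are made up for illustration.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of pointer-grab routing. Hypothetical names; not X server code. */
class GrabModel {
    private String grabOwner = null;                      // client holding the pointer grab, if any
    private final Set<String> halted = new HashSet<>();   // clients stopped at a breakpoint

    void grabPointer(String client) {
        if (grabOwner == null) grabOwner = client;        // only one active grab at a time
    }

    void ungrabPointer(String client) {
        if (client.equals(grabOwner)) grabOwner = null;
    }

    void halt(String client)   { halted.add(client); }
    void resume(String client) { halted.remove(client); }

    /** Returns the client that actually handles a click aimed at windowOwner, or "nobody". */
    String deliverClick(String windowOwner) {
        // While a grab is active, every click goes to the grab owner,
        // regardless of which window is under the pointer.
        String target = (grabOwner != null) ? grabOwner : windowOwner;
        return halted.contains(target) ? "nobody" : target;
    }

    public static void main(String[] args) {
        GrabModel x = new GrabModel();
        x.grabPointer("java");                            // popup shown, grab activates
        x.halt("java");                                   // jdb breakpoint hit before the ungrab
        System.out.println(x.deliverClick("terminal"));   // click is swallowed by the halted client
        x.resume("java");                                 // "cont" in jdb
        x.ungrabPointer("java");                          // popup undisplayed, grab released
        System.out.println(x.deliverClick("terminal"));   // clicks reach other clients again
    }
}
```

In this model the pointer itself is never blocked, only the delivery of clicks, which matches the symptom in the report: the cursor moves but nothing responds.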

IIRC GTK works around this with magic flags for debugging that skip the grabs.

does this explanation make sense? do you see it if the breakpoint is anywhere before or after the popup has been displayed and undisplayed again?
Comment 6 Mike Fairbank 2010-04-22 03:33:51 UTC
>argh, no, I overlooked this every single time I read the report.

Can you confirm whether you managed to reproduce the bug then on the latest version of xorg?

>does this explanation make sense?

Sounds like a nice explanation, but I know nothing of xorg or its internal workings so you could tell me anything and I'd agree.  Thanks for looking into it and thinking about it - I hope this leads to a fix.

> do you see it if the breakpoint is anywhere
before or after the popup has been displayed and undisplayed again?

I'll get back to you on that.

Thanks again!
Comment 7 Mike Fairbank 2010-04-25 10:09:12 UTC
Peter, here are some answers to your questions and a further question:

> do you see it if the breakpoint is anywhere before or after the popup has been displayed and undisplayed again?

The breakpoint must be in the "actionPerformed" event handler for that dropdown.  It seems it can be on any of the lines of this method in the Tabfolder class:

public void actionPerformed(ActionEvent event) {
    final Object object = event.getSource();
    if (object == comboBox_descentMode) {
        final Modes descentMode = (Modes) comboBox_descentMode.getSelectedItem();
        mainClass.update(descentMode);
    }
}

> I'm pretty sure what you're running into here is a common debugging issue under X. 
>...
> IIRC GTK works around this with magic flags for debugging that skip the grabs.

Does that mean a fix has already been created (is there a bug duplicate)?  Do you know which version this will be fixed in?  

Thanks!

Mike.
Comment 8 Peter Hutterer 2010-04-28 18:33:22 UTC
(In reply to comment #7)
> Peter, here are some answers to your questions and a further question:
> 
> > do you see it if the breakpoint is anywhere before or after the popup has been displayed and undisplayed again?
> 
> The breakpoint must be in the "actionperformed" event for that dropdown.  It
> seems it can also be on any of the lines of the Tabfolder class:
> 
> public void actionPerformed(ActionEvent event) {
>         final Object object = event.getSource();
>         if (object == comboBox_descentMode) {
>             final Modes descentMode = (Modes)
> comboBox_descentMode.getSelectedItem();
>             mainClass.update(descentMode);
>         }
>     }

urgh. I have no idea what the JVM does in that part though, but a protocol snoop should show us. if you install xscope, you get a localhost:4 to DISPLAY=:0 forwarding with the following command:
  xscope -i4 -o0

if you then start your test app with DISPLAY=localhost:4 ./mycommand all the protocol data should be routed through xscope. set the breakpoint and check the last requests that went to the server. If any of them is a GrabPointer or GrabKeyboard request without a paired UngrabPointer/Keyboard request, that's the issue.


> > I'm pretty sure what you're running into here is a common debugging issue under X. 
> >...
> > IIRC GTK works around this with magic flags for debugging that skip the grabs.
> 
> Does that mean a fix has already been created (is there a bug duplicate)?  Do
> you know which version this will be fixed in?

this is a gtk-internal solution. IIRC (and that's a while ago) if the debug options are set, gtk simply doesn't request the grab.
Comment 9 Mike Fairbank 2010-05-19 12:55:29 UTC
Hi Peter, 
Sorry about the delayed reply.

>urgh. I have no idea what the JVM does in that part though, but a protocol
>snoop should show us. if you install xscope, you get a localhost:4 to
>DISPLAY=:0 forwarding with the following command:
>  xscope -i4 -o0

I can't get this to work easily.  I'm not an experienced xorg person.

Here is my attempt to do this:

mike@mike-desktop:~$ sudo apt-get install xserver-xorg-core
Reading package lists... Done
Building dependency tree       
Reading state information... Done
xserver-xorg-core is already the newest version.
0 upgraded, 0 newly installed, 0 to remove and 14 not upgraded.
mike@mike-desktop:~$ xscope -i4 -o0
No command 'xscope' found, did you mean:
 Command 'cscope' from package 'cscope' (universe)
 Command 'xoscope' from package 'xoscope' (universe)
xscope: command not found
mike@mike-desktop:~$ 


Hmmm.  Even if I got this to work, I'd also need help on your next instruction:

>... and check the last requests that went to the server.
>If any of them is a GrabPointer or GrabKeyboard request without 
> a paired UngrabPointer/Keyboard request, that's the issue.

So can anyone fill me in with fuller instructions for this, or even better, do this for me? I hope the attachment and instructions in the original bug report make the error easily reproducible.  It sounds to me like the diagnosis by Peter is very much on the right track...

Thanks.

Mike.
Comment 10 Tormod Volden 2010-05-24 01:16:04 UTC
Mike, xscope is not packaged for Debian/Ubuntu. There used to be "xmon" but it got dropped with reference to wireshark as an alternative. OTOH, xscope is easy to build: get the source tarball from http://cgit.freedesktop.org/xorg/app/xscope/ then unpack it and run ./autogen.sh && make

I would recommend using the 1.2 tarball, since the latest git commit broke building on Ubuntu (if you are familiar with git you can instead revert that commit of course). If your build fails, it is probably due to missing dependencies. Pulling in build dependencies for some other Xorg drivers should probably take care of it, i.e. sudo apt-get build-dep xserver-xorg-input-evdev
Comment 11 Mike Fairbank 2010-05-29 03:05:42 UTC
Created attachment 35928: Output from xscope during hang

Thanks Tormod, your instructions worked.

Attached is the output from xscope.

Peter does this confirm your diagnosis?

Thanks.

Mike.

PS: Note to myself: as a Gnome user, to recover from the X-windows hang I used "ctrl+alt+f1" followed by login and "sudo /etc/init.d/gdm restart"
Comment 12 Peter Hutterer 2010-05-30 16:15:32 UTC
run egrep "Ungrab|Grab" -A 1 against the file. You'll see output like this (I cut out the false-positive lines and the bit at the start that refers to other things):

............REQUEST: GrabPointer
        owner-events: True
--
	 ..............REPLY: GrabPointer
	              status: Success
--
 ............REQUEST: GrabKeyboard
        owner-events: True
--
	 ..............REPLY: GrabKeyboard
	              status: Success

This simply means that the client requests a pointer grab and succeeds, then
requests a keyboard grab and succeeds. There is no Ungrab following this so by
the time the log ends the client still holds the grab. And while a client
holds a pointer + keyboard grab, you cannot interact with other clients.

So yes, this confirms the hypothesis in comment #5 and there's no easy way to work around this. Sorry.
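
The same check can be automated. The following is a minimal sketch that scans an xscope log for GrabPointer/GrabKeyboard requests with no later matching Ungrab request. It assumes the "REQUEST: <name>" line format shown in the excerpt above, and that Ungrab requests are logged in the same form. It does not inspect the reply status, so a grab that failed would still be reported. The class name GrabLogCheck is made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Reports pointer/keyboard grabs that are never released in an xscope log (sketch). */
class GrabLogCheck {
    // Matches e.g. "....REQUEST: GrabPointer" or "....REQUEST: UngrabKeyboard".
    // REPLY lines and other requests (GrabButton, GrabKey, ...) are ignored.
    private static final Pattern REQUEST =
        Pattern.compile("REQUEST:\\s*(Ungrab|Grab)(Pointer|Keyboard)\\b");

    /** Returns the devices ("Pointer", "Keyboard") still grabbed when the log ends. */
    static List<String> outstandingGrabs(List<String> lines) {
        Map<String, Boolean> held = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = REQUEST.matcher(line);
            if (m.find()) {
                // The most recent request for each device decides its final state.
                held.put(m.group(2), m.group(1).equals("Grab"));
            }
        }
        List<String> outstanding = new ArrayList<>();
        for (Map.Entry<String, Boolean> e : held.entrySet()) {
            if (e.getValue()) outstanding.add(e.getKey());
        }
        return outstanding;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the saved xscope output
        List<String> lines =
            Files.readAllLines(Paths.get(args[0]), StandardCharsets.ISO_8859_1);
        System.out.println("Grabs still held at end of log: " + outstandingGrabs(lines));
    }
}
```

For a log like the one attached, where both grabs succeed and no Ungrab follows, this would list both Pointer and Keyboard.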
Comment 13 Keith Packard 2010-06-15 14:43:42 UTC
Note that most X toolkits have a 'debugger' mode which disables grabs to make debugging possible.
Comment 14 Mike Fairbank 2010-06-15 15:01:15 UTC
Concerning the new status (RESOLVED & NOTOURBUG), is there some other more appropriate project to raise this bug?  I don't want to see this bug report, that I've put a lot of effort into, not yielding any eventual benefit.
Comment 15 Mike Fairbank 2010-06-27 06:58:36 UTC
I've reopened this (just one more time) because there are no clues on what to do with this bug report next.

It was a lot of effort for me to isolate it in a reproducible way, and if it just dies here that effort was wasted, and someone else might go through the whole process of raising it again.  Alternatively you could classify this bug as "wontfix"?


Can you tell me where to raise this bug if it's not xorg's bug?  

If the next step is to raise this with Java developers, then I'm not optimistic about getting a fix since this bug is not reproducible on other windowing systems (non-X systems, e.g. Windows), so it really does seem to me that the problem is with xorg.

Thanks.
Comment 16 Peter Hutterer 2010-06-27 18:10:12 UTC
(In reply to comment #15)
> I've reopened (just one more time) this because there are no clues on what do
> do with this bug-report next.

This is not something we can fix in X.org, at least not yet. There are future plans to work around the complete block but for now we can't. The client obtains a grab and, because the debugger has halted it, never releases it. From an X server POV this is a misbehaving client; the context of the client (i.e. that the client is halted during debugging) is invisible to the X server.

As Keith said, what is needed here is a debugging mode in the toolkit that prevents the client from issuing grabs while it is being debugged.
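
A rough sketch of what such a toolkit-side debugging mode could look like is below. Everything in it is hypothetical: the DebugAwarePopup class, the GrabBackend interface and the example.toolkit.nograb property name are invented for illustration and do not correspond to any real Java or GTK API. The idea is only that the toolkit checks a flag before asking for the grab, and releases only what it actually took.

```java
/** Hypothetical popup that can skip pointer grabs while being debugged (sketch only). */
class DebugAwarePopup {
    /** Stand-in for whatever layer would really talk to the X server. */
    interface GrabBackend {
        void grabPointer();
        void ungrabPointer();
    }

    private final GrabBackend backend;
    private final boolean skipGrabs;     // true = debugging mode, never grab
    private boolean holdingGrab = false;

    DebugAwarePopup(GrabBackend backend, boolean skipGrabs) {
        this.backend = backend;
        this.skipGrabs = skipGrabs;
    }

    /** Reads the (made-up) system property, e.g. -Dexample.toolkit.nograb=true. */
    static boolean debugFlagFromSystemProperty() {
        return Boolean.getBoolean("example.toolkit.nograb");
    }

    void show() {
        if (!skipGrabs) {
            backend.grabPointer();       // normal mode: grab so outside clicks close the popup
            holdingGrab = true;
        }
    }

    void hide() {
        if (holdingGrab) {
            backend.ungrabPointer();     // release only if we actually grabbed
            holdingGrab = false;
        }
    }

    boolean isHoldingGrab() {
        return holdingGrab;
    }
}
```

The trade-off is that in debugging mode a click outside the popup no longer dismisses it, but a breakpoint hit while the popup is open cannot lock up the rest of the desktop.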

> Can you tell me where to raise this bug if it's not xorg's bug?
 
> If the next step is to raise this with Java developers, then I'm not optimistic
> about getting a fix since this bug is not reproducible on other windowing
> systems (non X systems,e.g. Windows), so it really does seem to me that the
> problem is with xorg.

This needs to be fixed in Java, it cannot be fixed in the X server. Sorry.
Comment 17 Mike Fairbank 2010-06-28 08:08:04 UTC
I've raised it in Java's bug reporting system (detailed below), and I cross referenced a link to this bug report so they can use your advice.

Thanks everyone for your work on this bug.

Mike

----------------

Here's the reply I got from Java:

Dear Java Developer,

Thank you for reporting this issue.

We have determined that this report is a new bug and entered the bug into our internal bug tracking system under Bug Id: 6964615.

You can monitor this bug on the Java Bug Database at
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6964615.

It may take a day or two before your bug shows up in this external database.  If you are a member of the Sun Developer Network (SDN), there are two additional options once the bug is visible.

1. Voting for the bug
  Click http://bugs.sun.com/bugdatabase/addVote.do?bug_id=6964615.

2. Adding the report to your Bug Watch list.
  You will receive an email notification when this bug is updated.
  Click http://bugs.sun.com/bugdatabase/addBugWatch.do?bug_id=6964615.

The Sun Developer Network (http://developers.sun.com) is a free service that Sun offers.  To join, visit https://softwarereg.sun.com/registration/developer/en_US/new_user.

Regards,
Java Developer Support

