Bug 30515

Summary: Race means that client can get no response from agent
Product: PolicyKit Reporter: James Westby <jw+debian>
Component: daemon Assignee: David Zeuthen (not reading bugmail) <zeuthen>
Status: RESOLVED WORKSFORME QA Contact: David Zeuthen (not reading bugmail) <zeuthen>
Severity: normal    
Priority: medium CC: jw+debian
Version: unspecified   
Hardware: Other   
OS: All   
Whiteboard:
Attachments: Patch to fix the race
Patch to ensure that the helper is declared running before a handler can fire
Patch to ensure that things written to the child are written in full

Description James Westby 2010-09-30 13:48:27 UTC
Created attachment 39080
Patch to fix the race

Hi,

There's a race that means that the user can type in their
password, have the text entry disappear, and then nothing
else happen for a couple of minutes until the call times
out and they get a cryptic error. If they cancel the
dialog in that time then the program will work as they
are successfully authenticated.

The reason for this is that if the SIGCHLD handler is called
before the stdout one it unregisters the stdout handler, so
it is never triggered, and that is the only way that a response
is sent, positive or negative, except for cancelling.
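
The ordering problem described above can be modelled with a toy event loop. This is only a sketch, not polkit code: `ToyLoop`, `run_session` and the handler names are hypothetical, and the real agent uses GLib sources rather than a Python dictionary. It shows that when both sources are ready at once, the dispatch order alone decides whether a response is ever sent.

```python
from typing import Callable, Dict, List


class ToyLoop:
    """Minimal stand-in for an event loop: named sources, dispatched in order."""

    def __init__(self) -> None:
        self.handlers: Dict[str, Callable[[], None]] = {}

    def add(self, name: str, handler: Callable[[], None]) -> None:
        self.handlers[name] = handler

    def remove(self, name: str) -> None:
        self.handlers.pop(name, None)

    def dispatch(self, ready: List[str]) -> None:
        # Both sources are ready at the same time; only the order differs.
        for name in ready:
            handler = self.handlers.get(name)
            if handler is not None:
                handler()


def run_session(dispatch_order: List[str]) -> List[str]:
    """Return the responses the client receives for one helper run."""
    loop = ToyLoop()
    responses: List[str] = []

    def on_stdout() -> None:
        # The only place where a response (positive or negative) is sent.
        responses.append("SUCCESS")
        loop.remove("stdout")

    def on_child_exit() -> None:
        # Cleanup on child exit also unregisters the stdout handler.
        loop.remove("stdout")
        loop.remove("child")

    loop.add("stdout", on_stdout)
    loop.add("child", on_child_exit)
    loop.dispatch(dispatch_order)
    return responses


print(run_session(["stdout", "child"]))  # -> ['SUCCESS']
print(run_session(["child", "stdout"]))  # -> []
```

In the second call the client gets nothing at all, which matches the symptom of the dialog disappearing and the call eventually timing out.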

I'm attaching a patch which works around this in most
cases by giving the stdout handler a chance to go first.
You may prefer to reorganise a bit to not have one
handler remove the other instead. It also ensures that
the stdout handler is registered first, in case the
child exits very quickly, but I don't know if that's
possible.
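
One way to give the stdout side a chance to go first is for the exit handler to drain the pipe before tearing anything down. The following is a minimal sketch of that idea under stated assumptions, not the attached patch: the helper is a hypothetical one-line Python child, and `handle_stdout`, `handle_child_exit` and `run_helper` are illustrative names.

```python
import subprocess
import sys
from typing import List


def handle_stdout(data: bytes, responses: List[str]) -> None:
    """Stand-in for the stdout handler: each line is one helper response."""
    responses.extend(data.decode().splitlines())


def handle_child_exit(child: subprocess.Popen, responses: List[str]) -> None:
    """Stand-in for the child-exit handler, with the workaround applied."""
    # The child has exited, so its end of the pipe is closed and read()
    # returns whatever is still buffered and then hits end-of-file.
    pending = child.stdout.read()
    if pending:
        handle_stdout(pending, responses)
    # Only now tear down the stdout side.
    child.stdout.close()


def run_helper() -> List[str]:
    responses: List[str] = []
    # Hypothetical helper: prints one response line and exits immediately.
    child = subprocess.Popen(
        [sys.executable, "-c", "print('SUCCESS')"],
        stdout=subprocess.PIPE,
    )
    # Force the unlucky ordering: the exit is noticed before stdout is read.
    # (Safe here only because the output is far smaller than the pipe buffer.)
    child.wait()
    handle_child_exit(child, responses)
    return responses


print(run_helper())
```

Even with the exit noticed first, the response is still delivered because the exit handler reads the pipe before closing it.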

I'm also attaching a couple of other patches I produced
in the course of investigating this that might be of
interest. They fix theoretical problems, but they may
never occur in the real world.

Thanks,

James
Comment 1 James Westby 2010-09-30 13:49:34 UTC
Created attachment 39081
Patch to ensure that the helper is declared running before a handler can fire
Comment 2 James Westby 2010-09-30 13:50:32 UTC
Created attachment 39082
Patch to ensure that things written to the child are written in full
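
For context on why a write might not be "in full": a single write to a pipe may accept fewer bytes than requested, so the caller has to loop. The sketch below illustrates that general technique in Python; it is not the attached patch, and `write_all` is a hypothetical name.

```python
import os


def write_all(fd: int, data: bytes) -> None:
    """Write every byte of data to fd, retrying after short writes."""
    view = memoryview(data)
    while len(view) > 0:
        # os.write returns how many bytes were actually accepted,
        # which may be fewer than len(view).
        written = os.write(fd, view)
        view = view[written:]


# Usage: send a payload through a pipe and read it back.
read_fd, write_fd = os.pipe()
payload = b"x" * 1000  # kept small so the write cannot block on a full pipe
write_all(write_fd, payload)
os.close(write_fd)

received = bytearray()
while True:
    chunk = os.read(read_fd, 4096)
    if not chunk:  # end-of-file: the write end is closed
        break
    received.extend(chunk)
os.close(read_fd)
```

Python 3.5 and later retry `os.write` automatically when it is interrupted by a signal, so the loop only needs to handle short writes.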
Comment 3 James Westby 2010-10-12 11:20:51 UTC
Hi,

I've been asked to patch this in Ubuntu, as there are apparently
a fair number of users for whom policykit rarely works due to this
issue. I'm not sure what the variable is there given that I have
very rarely seen this myself.

Could you review the changes please?

If I don't hear either way soon then I will add a distro patch
reviewed only by Ubuntu developers.

Thanks,

James
Comment 4 David Zeuthen (not reading bugmail) 2010-10-12 11:24:11 UTC
Pretty sure these patches don't apply to current master as there were some changes post 0.96 - any chance you can check if this is still an issue in master (I can't reproduce this) and, if so, update the patches? Thanks.
Comment 5 James Westby 2010-10-12 12:14:52 UTC
(In reply to comment #4)
> Pretty sure these patches don't apply to current master as there were some
> changes post 0.96 - any chance you can check if this is still an issue in
> master (I can't reproduce this) and, if so, update the patches? Thanks.

I can't easily check this either, as I can't reproduce it at will.

I've looked at the changes you made, and textually these won't apply, but
I don't see that they will have fixed the race here, unless the g_source
functions will somehow ensure that the stdout watch is serviced before the
child one?

Thanks,

James
Comment 6 David Zeuthen (not reading bugmail) 2011-02-23 06:20:45 UTC
Still can't reproduce and haven't seen any authentication agent bugs mentioning this problem so closing as WORKSFORME. Please reopen if you manage to reproduce. Thanks.
Comment 7 James Westby 2011-02-23 07:21:01 UTC
Hi,

I have no consistent way to reproduce, but the Ubuntu bug report
was very "popular" indeed, with many duplicates.

See https://bugs.launchpad.net/ubuntu/+source/policykit-1/+bug/649939 for where
I worked with Jean-Baptiste to find the patch, and
https://bugs.launchpad.net/ubuntu/+source/update-manager/+bug/445303 for the
bug that received most of the attention.

There was a theoretical race that I found via code inspection, and a
patch to remove it was confirmed to fix the issue by someone who
could reliably reproduce it. Unless you know that the g_source functions
remove the race, then I think you should apply the patch.

Thanks,

James
Comment 8 Simon McVittie 2015-03-07 15:15:44 UTC
In the absence of a recent polkit release, I'm looking into updating Debian experimental's polkit (which currently includes this patch) to current git master.

This looks suspiciously like Bug #60847. James, does the patch that was merged for that bug look OK? It stops using the child watch at all, and only reads stdout, which seems a more correct solution to this.
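
A rough sketch of the "only read stdout" approach, for illustration: end-of-file on the helper's stdout is treated as the single completion signal, and the child is reaped afterwards, so no second handler exists to race with. This is not the code merged for Bug #60847; `run_helper` and the one-line Python child are assumptions, and the example targets POSIX systems where pipes can be used with `selectors`.

```python
import os
import selectors
import subprocess
import sys
from typing import List, Tuple


def run_helper(argv: List[str]) -> Tuple[List[str], int]:
    """Collect the helper's output lines, then its exit status."""
    child = subprocess.Popen(argv, stdout=subprocess.PIPE)
    selector = selectors.DefaultSelector()
    selector.register(child.stdout, selectors.EVENT_READ)

    output = bytearray()
    finished = False
    while not finished:
        for key, _events in selector.select():
            chunk = os.read(key.fd, 4096)
            if chunk:
                output.extend(chunk)
            else:
                # End-of-file: the helper closed stdout, usually by exiting.
                selector.unregister(key.fileobj)
                finished = True

    selector.close()
    child.stdout.close()
    # Reap the child only after all of its output has been consumed.
    exit_code = child.wait()
    return output.decode().splitlines(), exit_code


lines, code = run_helper([sys.executable, "-c", "print('SUCCESS')"])
print(lines, code)
```

Because the exit status is collected only after end-of-file, every response written by the helper is seen before any cleanup happens.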
Comment 9 James Westby 2015-03-09 15:23:17 UTC
(In reply to Simon McVittie from comment #8)
> In the absence of a recent polkit release, I'm looking into updating Debian
> experimental's polkit (which currently includes this patch) to current git
> master.
> 
> This looks suspiciously like Bug #60847. James, does the patch that was
> merged for that bug look OK? It stops using the child watch at all, and only
> reads stdout, which seems a more correct solution to this.

Hi,

That does sound rather similar, yes, though the symptoms in the arch bug sound a bit different.

If it's not using the child watch then the problem may well be gone. Unfortunately I can't remember the specifics of a race condition from over 4 years ago to say for sure whether it will be handled by the other patch.

Thanks,

James
