Summary: | pkexec doesn't invoke PAM session close hooks | |
---|---|---|---
Product: | PolicyKit | Reporter: | Michael Biebl <mbiebl>
Component: | daemon | Assignee: | David Zeuthen (not reading bugmail) <zeuthen>
Status: | RESOLVED MOVED | QA Contact: | David Zeuthen (not reading bugmail) <zeuthen>
Severity: | normal | |
Priority: | medium | CC: | colin, crrodriguez, david, fred, sam, shirishag75
Version: | unspecified | |
Hardware: | Other | |
OS: | All | |
Whiteboard: | | |
Attachments: | strace of logind | |
Description
Michael Biebl
2012-02-06 00:48:52 UTC
I am pretty sure your sudo is just broken, and doesn't waitpid() before closing the PAM session. That was fixed in sudo quite some time ago iirc.

(In reply to comment #1)
> I am pretty sure your sudo is just broken, and doesn't waitpid() before closing
> the PAM session. That was fixed in sudo quite some time ago iirc.

just for documentation:
The said fix has landed in sudo 1.7.4 and the Debian package is 1.8.3p2-1, so it looks like a different issue.

(In reply to comment #2)
> (In reply to comment #1)
> > I am pretty sure your sudo is just broken, and doesn't waitpid() before closing
> > the PAM session. That was fixed in sudo quite some time ago iirc.
>
> just for documentation:
> The said fix has landed in sudo 1.7.4 and the Debian package is 1.8.3p2-1, so
> it looks like a different issue.

I've also compared the sudo sources from rawhide (where Lennart reported it as working) and Debian unstable and didn't find any relevant differences. So I'm inclined to *not* blame it on sudo.

As a workaround, I've removed pam_systemd from sudo's pam session configuration.

Created attachment 56794 [details]
strace of logind
<mezcalero> mbiebl: so, the problem goes something like this i think:
<mezcalero> mbiebl: on fedora, when the PAM session is opened, the calling process gets returned that fifo fd that logind uses to check when a session goes away
<mezcalero> mbiebl: then, when the session is destructed this fd is closed and logind knows
<mezcalero> mbiebl: now, when you invoke the PAM session hooks from a process that is already in a logind session
<mezcalero> mbiebl: we will pass it an fd to the same fifo, again
<mezcalero> mbiebl: that means that if you login on a getty, and then do sudo
<mezcalero> mbiebl: you'll have two session fds, to the same fifo
<mezcalero> mbiebl: only if both are closed the kernel will tell logind that the fifo is now closed
<mezcalero> mbiebl: now, if i run things here, then everything works fine like this
<mezcalero> mbiebl: and even though the sudo fd ends up being immediately closed simply because the original fd is still open everything works fine
<mezcalero> now, of course sudo should be fixed not to close our logind fd in the parent
<mezcalero> but it appears as if your original login process also closes the login fd

(In reply to comment #5)
> <mezcalero> mbiebl: so, the problem goes something like this i think:
> <mezcalero> mbiebl: on fedora, when the PAM session is opened, the calling process gets returned that fifo fd that logind uses to check when a session goes away
> <mezcalero> mbiebl: then, when the session is destructed this fd is closed and logind knows
> <mezcalero> mbiebl: now, when you invoke the PAM session hooks from a process that is already in a logind session
> <mezcalero> mbiebl: we will pass it an fd to the same fifo, again
> <mezcalero> mbiebl: that means that if you login on a getty, and then do sudo
> <mezcalero> mbiebl: you'll have two session fds, to the same fifo
> <mezcalero> mbiebl: only if both are closed the kernel will tell logind that the fifo is now closed
> <mezcalero> mbiebl: now, if i run things here, then everything works fine like this
> <mezcalero> mbiebl: and even though the sudo fd ends up being immediately closed simply because the original fd is still open everything works fine
> <mezcalero> now, of course sudo should be fixed not to close our logind fd in the parent
> <mezcalero> but it appears as if your original login process also closes the login fd

We can only reproduce this issue with KDM ... using xdm or plain old console makes the issue go away.

This bug is now 100% reproducible once a Debian system is upgraded to systemd 44; logging in via both GDM and login.

Sorry, can't do anything about this, since I don't run Debian. This really needs somebody using Debian to debug.

(In reply to comment #7)
> This bug is now 100% reproducible once a Debian system is upgraded to systemd
> 44; logging in via both GDM and login.

Unfortunately it's not that simple. I'm running an up-to-date Debian sid system with systemd 44-1 and can't reproduce the issue anymore (whereas I could some time ago).

It's still unclear to me under which circumstances the bug occurs (if it's architecture/kernel version/etc. related)

(In reply to comment #8)
> Sorry, can't do anything about this, since I don't run Debian. This really
> needs somebody using Debian to debug.

Apparently this is also seen on non-Debian systems, as e.g. we have reports from openSUSE users. Frederic suspected a kernel bug, but I dunno if he found out more in that direction.

Nothing conclusive: I'm unable to reproduce this bug in KVM and some folks still have it (and apparently, even for them, it is intermittent): https://bugzilla.novell.com/show_bug.cgi?id=746704

(In reply to comment #9)
> (In reply to comment #7)
> > This bug is now 100% reproducible once a Debian system is upgraded to systemd
> > 44; logging in via both GDM and login.
>
> Unfortunately it's not that simple. I'm running an up-to-date Debian sid system
> with systemd 44-1 and can't reproduce the issue anymore (whereas I could
> some time ago).
>
> It's still unclear to me under which circumstances the bug occurs (if it's
> architecture/kernel version/etc. related)

Ah, I see. Both of the systems I hit this on are VirtualBox virtual machines, running the Debian 3.2.0-2-amd64 kernel (version 3.2.12-1 at the present moment).

For me, sudo stopped working as soon as I upgraded the systemd Debian packages to version 44-1. Admittedly a few other packages were upgraded at the same time, but I think the systemd packages were the only ones on both systems that looked like they could have broken sudo in this way.

As a workaround, removing pam_systemd.so from /etc/pam.d/common-session-noninteractive restores sudo to working order.

Seems the reason why I couldn't reproduce the problem anymore is that I've added pam_loginuid.so to my pam config locally. As soon as I comment out those lines again from the login/gdm pam config, I can reproduce the sudo bug.

This also explains why this problem doesn't happen on fedora.

I tried to backport commit 75c8e3cffd7da8eede614cf61384957af2c82a29 to v44 and things got worse:
- the sudo and su-l (and su) pam configs all contain a call to pam_systemd in the session section
- the following test scenario causes the session leader to be terminated by logind:
  login as user
  su -
  logout from root
  sudo -s
  => terminates the user session completely.

and after debugging further, SUSE bug "sudo get broken under systemd" seems to be related to a kdm autologin configuration which doesn't include pam_loginuid.so, resulting in a /proc/self/sessionid equal to (uint32_t) -1, which is always the same and is screwing up pam_systemd a lot.
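For readers following the fifo explanation from the IRC log earlier in the thread, here is a minimal sketch of the kernel behaviour it relies on: a reader is only told about end-of-file once every descriptor for the write side has been closed. It is an illustration under assumptions, not logind or pam_systemd code. An anonymous pipe stands in for logind's fifo, and the names `login_fd`, `sudo_fd` and `reader_sees_eof` are made up for this example.

```python
import os


def reader_sees_eof(read_fd):
    """Return True if a non-blocking read on read_fd reports end-of-file."""
    try:
        return os.read(read_fd, 1) == b""
    except BlockingIOError:
        # No data yet, but at least one write end is still open: not EOF.
        return False


# Assumption: an anonymous pipe stands in for the fifo that logind watches.
read_fd, login_fd = os.pipe()      # login_fd: the fd held by the login process
os.set_blocking(read_fd, False)    # the watcher must not block while polling
sudo_fd = os.dup(login_fd)         # a second fd for the same pipe, as for sudo

os.close(sudo_fd)                  # "sudo" closes its copy immediately
still_open = reader_sees_eof(read_fd)   # login_fd still keeps the pipe alive

os.close(login_fd)                 # the last write end goes away
now_closed = reader_sees_eof(read_fd)   # only now does the reader see EOF
os.close(read_fd)
```

In this sketch `still_open` ends up `False` and `now_closed` ends up `True`, which matches the observation that closing the sudo fd early is harmless as long as the original login fd stays open.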
Second possible way to get a broken state, by using screen or tmux:
- login
- start a screen session
- detach the session
- logout
- login again
- re-attach the session
- try to use sudo or su -l in the screen session

For our case, pam_systemd is called in both the sudo and su-l pam configurations, and will cause the call to be killed by pam_systemd (either immediately with sudo or when exiting the su -l shell), because the audit session id in the screen process is no longer valid and confuses logind.

sudo 1.4.5 seems to have fixed the screen / tmux issue, at least on openSUSE.

I've got something similar in Ubuntu after changing the system language, with sudo and resume from suspend not accepting the password. login and tty are working. I fixed it by creating ~/.pam_environment with the following content:

LC_NUMERIC=en_US.UTF-8
LC_TIME=en_US.UTF-8
LC_MONETARY=en_US.UTF-8
LC_PAPER=en_US.UTF-8
LC_NAME=en_US.UTF-8
LC_ADDRESS=en_US.UTF-8
LC_TELEPHONE=en_US.UTF-8
LC_MEASUREMENT=en_US.UTF-8
LC_IDENTIFICATION=en_US.UTF-8

for en-us, save, reboot, and now it works. Can someone do this fast test?

Fabioy, Ubuntu is not using systemd, your bug must be somewhere else.

Michael, can you verify that sudo 1.4.5 fixes the problem on Debian, too?

(In reply to comment #19)
> Michael, can you verify that sudo 1.4.5 fixes the problem on Debian, too?

I'm using sudo 1.8.3p2. Are you sure you meant 1.4.5?

(In reply to comment #20)
> (In reply to comment #19)
> > Michael, can you verify that sudo 1.4.5 fixes the problem on Debian, too?
>
> I'm using sudo 1.8.3p2. Are you sure you meant 1.4.5?

Hmm, that's what Frederic mentioned in comment #16. Frederic, what is going on?

I should learn to type, I meant 1.8.5

Is this problem still remaining with recent sudo?

Not with sudo 1.8.5. We have similar issues with pkexec though.

OK, renaming bug accordingly. But my guess is that pkexec also incorrectly calls the PAM hooks, much like sudo initially did.

(In reply to comment #23)
> Is this problem still remaining with recent sudo?
I'm using sudo 1.8.5p2-1 on Debian now and this version seems to work fine, indeed. Even the old "workaround" to add pam_loginuid is no longer necessary and sudo works with and without pam_loginuid.

As for pkexec (0.105): I can indeed confirm the behaviour.
If pam_systemd is enabled, but pam_loginuid is not, I simply get a "Hangup" message.
With both pam_systemd and pam_loginuid enabled it works fine.

(In reply to comment #26)
> As for pkexec (0.105): I can indeed confirm the behaviour.
> If pam_systemd is enabled, but pam_loginuid not, I simply get a "Hangup"
> message
> With both pam_systemd and pam_loginuid enabled it works fine.

Since this came up recently again, let me repeat that:
To reproduce the bug, one simply needs to remove pam_loginuid from the PAM stack (for testing purposes I've done it for login, as I can quickly login/logout on the console).

So, as it turns out pkexec doesn't invoke the PAM session hooks at all, but it really should. Reassigning to pkexec.

Or actually, it does invoke pam_open_session(), but never invokes pam_close_session(), and that's what is broken.

It really needs to invoke pam_open_session() in the parent process before forking off the shell, and then pam_close_session() in the parent process after the shell died.

(In reply to comment #29)
> Or actually, it does invoke pam_open_session(), but never invokes
> pam_close_session(), and that's what is broken.
>
> It really needs to invoke pam_open_session() in the parent process before
> forking off the shell, and then pam_close_session() in the parent process
> after the shell died.

Hmm, right now I think we just exec() - this change would include forking a child process and baby-sitting it. Right?

FWIW, I'm not opposed to doing that but I don't have a lot of bandwidth right now to work on it. I'd be happy to review a patch if someone wants to work on this before I get around to it...
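To make the proposed control flow concrete, here is a rough sketch of the "open the session in the parent, fork, exec in the child, wait, then close the session" sequence discussed above. It is not pkexec's code and uses no PAM bindings: `open_session` and `close_session` are hypothetical callables standing in for pam_open_session() and pam_close_session(), and Python is used purely for illustration.

```python
import os
import sys


def run_with_session(argv, open_session, close_session):
    """Open a session, run argv in a child, and close the session afterwards."""
    open_session()                       # stand-in for pam_open_session()
    try:
        pid = os.fork()
        if pid == 0:
            # Child: replace this process with the target program.
            try:
                os.execvp(argv[0], argv)
            finally:
                os._exit(127)            # only reached if exec failed
        _, status = os.waitpid(pid, 0)   # parent baby-sits the child
    finally:
        close_session()                  # stand-in for pam_close_session()
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return 128 + os.WTERMSIG(status)     # shell-style code for signal deaths


if __name__ == "__main__":
    log = []
    code = run_with_session(
        [sys.executable, "-c", "raise SystemExit(3)"],
        lambda: log.append("open"),
        lambda: log.append("close"),
    )
    print(code, log)
```

The point of the sketch is that the process which opened the session stays alive until the child has exited, so the close hook always runs in the same process that ran the open hook.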
(In reply to comment #30)
> (In reply to comment #29)
> > Or actually, it does invoke pam_open_session(), but never invokes
> > pam_close_session(), and that's what is broken.
> >
> > It really needs to invoke pam_open_session() in the parent process before
> > forking off the shell, and then pam_close_session() in the parent process
> > after the shell died.
>
> Hmm, right now I think we just exec() - this change would include forking a
> child process and baby-sitting it. Right?

Yeah, right now the code apparently just does pam_open_session() and then exec(). What it should do is pam_open_session(), followed by exec() in the child, and waitpid() in the parent. Afterwards the parent should invoke pam_close_session() and exit.

(In reply to comment #31)
> (In reply to comment #30)
> > (In reply to comment #29)
> > > Or actually, it does invoke pam_open_session(), but never invokes
> > > pam_close_session(), and that's what is broken.
> > >
> > > It really needs to invoke pam_open_session() in the parent process before
> > > forking off the shell, and then pam_close_session() in the parent process
> > > after the shell died.
> >
> > Hmm, right now I think we just exec() - this change would include forking a
> > child process and baby-sitting it. Right?
>
> Yeah, right now the code apparently just does pam_open_session() and then
> exec(). What it should do is pam_open_session(), followed by exec() in the
> child, and waitpid() in the parent. Afterwards the parent should invoke
> pam_close_session() and exit.

Sorry, let's try this again:

Yeah, right now the code apparently just does pam_open_session() and then exec(). What it should do is pam_open_session(), followed by fork(), then exec() in the child, and waitpid() in the parent. Afterwards the parent should invoke pam_close_session() and exit.

(In reply to comment #32)
> Yeah, right now the code apparently just does pam_open_session() and then
> exec(). What it should do is pam_open_session(), followed by fork(), then
> exec() in the child, and waitpid() in the parent. Afterwards the parent
> should invoke pam_close_session() and exit.

Sounds good to me. This shouldn't be a very big change, I'll try to look into this in the weekend. Thanks.

Btw, is this what su(8) and sudo(8) is doing? I'm also idly wondering if this extra process in the process tree is going to confuse other software (say, the desktop shell, gnome-system-monitor and other top(1)-ish software walking the process tree). If so, my view is that it's probably their bug, yea?

(In reply to comment #33)
> Btw, is this what su(8) and sudo(8) is doing?

Yes, this is what any software using pam sessions _must_ do.

> If so, my view is that it's probably their bug,
> yea?

Yes. If there is any tool that for some reason misbehaves after pkexec makes correct use of the PAM API then it is a bug somewhere else. ;-)

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/polkit/polkit/issues/41.