Created attachment 114751
There are 3 entities/processes:
- M, the manager process: its role is to launch and relaunch session processes S.
- S, the session process: its role is to open a user session through PAM (pam_start, pam_open_session, pam_close_session, pam_end) for a utility process U, on behalf of a user given as a parameter.
- U, a utility process that should run inside a user session.
M forks/execs S and relaunches it when it dies.
S opens the session (using PAM) for the given user, forks/execs U and, when U dies, closes the PAM session and exits.
U does something (for our purposes, it is a graphical launcher).
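To make the relaunch behaviour of M concrete, here is a minimal sketch of its loop. It is not the actual manager: `run_manager`, `session_cmd`, `max_restarts` and `delay` are hypothetical names, and S is stood in for by any command that eventually exits.

```python
import subprocess
import sys
import time


def run_manager(session_cmd, max_restarts=None, delay=0.0):
    """Run S and relaunch it every time it exits (the role of M).

    session_cmd is a hypothetical command line for S. max_restarts only
    exists so that the loop can be bounded when experimenting.
    Returns the list of exit codes collected from S.
    """
    exit_codes = []
    while max_restarts is None or len(exit_codes) <= max_restarts:
        # In the real setup, S opens the PAM session, runs U, closes the
        # session and exits; here we only wait for the child to finish.
        result = subprocess.run(session_cmd)
        exit_codes.append(result.returncode)
        # delay=0 mimics an immediate relaunch of S after it dies.
        time.sleep(delay)
    return exit_codes


if __name__ == "__main__":
    # Stand-in for S: a child process that exits immediately.
    print(run_manager([sys.executable, "-c", "pass"], max_restarts=2))
```

With `max_restarts=2` the stand-in is launched three times (the initial launch plus two relaunches). Raising `delay` is one way to check whether the problem depends on how quickly the session is reopened.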
Let the user be of UID 1000.
This works only the first time: the user session is created correctly, and I can see the status of user-1000.slice, which contains user@1000.service and the session's scope.
Then U dies, so S closes the session and dies, so M relaunches S, which reopens the session. This time the service user@1000.service is not started but stopped. As a result, the user context set up by user@1000.service is NOT available to the new sessions.
From my investigations, closing the session does not immediately close anything: neither the scope, nor the slice, nor the user service is stopped. When a new session is opened, I can see that the user-1000.slice and user@1000.service units are stopped (StopUnit). These stops are interleaved with creation messages, so I suspect an internal problem in systemd-logind.
Another interesting observation: if I manually stop the session's scope unit (systemctl stop session-cXX.scope), user@1000.service is started correctly about half the time (one time out of two).
I attached the status of the slice as returned by the commands:
- systemctl status user-1000.slice
- systemctl status user@1000.service
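For repeated checks between relaunches, the two units can also be polled from a script. This is only a sketch: `unit_names` and `unit_states` are hypothetical helpers, and it assumes `systemctl is-active` is available on the machine.

```python
import subprocess


def unit_names(uid):
    """Return the slice and user manager unit names for a UID."""
    return [f"user-{uid}.slice", f"user@{uid}.service"]


def unit_states(uid, run=subprocess.run):
    """Map each unit to the state printed by `systemctl is-active`.

    `run` is injectable so the function can be exercised without systemd.
    """
    states = {}
    for unit in unit_names(uid):
        # is-active exits non-zero for units that are not active,
        # so the return code is deliberately not checked.
        proc = run(["systemctl", "is-active", unit],
                   capture_output=True, text=True)
        states[unit] = proc.stdout.strip()
    return states
```

Calling `unit_states(1000)` right after S exits and again right after it is relaunched should show whether the user service is still being stopped when the new session is opened.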
Hmm, so did I get this right: if the user fully logs out and immediately logs back in as the same user, then it might happen that the user@.service instance is still being stopped and no new start job gets queued? That is indeed a bug.
This looks like an issue being discussed on the Arch Linux forums here:
In a malfunctioning instance, the logs there *lack* this journal entry:
> Jul 27 18:27:40 Think systemd-logind: Removed session c2.
Its absence subsequently prevents these entries, which appear in a working case but not in the malfunctioning one:
> Jul 27 18:27:45 Think systemd-logind: New session c3 of user username.
> Jul 27 18:27:45 Think systemd: Started Session c3 of user username.
> Jul 27 18:27:45 Think systemd: Starting Session c3 of user username.
In the malfunctioning case I get this instead:
> Jul 27 18:28:41 Think login: pam_systemd(login:session): Cannot create session: Already occupied by a session
This seems to be triggered by any number of backgrounded processes, including gpg-agent, a detached tmux session, or dbus-launch.
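The difference between the two cases can be checked mechanically on a journal excerpt. The sketch below is only an illustration built from the messages quoted above: `classify` is a hypothetical helper, and the patterns assume the exact wording shown in these logs.

```python
import re

REMOVED = re.compile(r"systemd-logind: Removed session (\S+)\.")
NEW = re.compile(r"systemd-logind: New session (\S+) of user (\S+)\.")
OCCUPIED = "Cannot create session: Already occupied by a session"


def classify(lines):
    """Return 'malfunctioning', 'working' or 'unknown' for a journal excerpt."""
    saw_removed = False
    for line in lines:
        if OCCUPIED in line:
            # pam_systemd refused to create the new session.
            return "malfunctioning"
        if REMOVED.search(line):
            saw_removed = True
        elif NEW.search(line) and saw_removed:
            # The old session was removed before the new one appeared.
            return "working"
    return "unknown"
```

For example, passing the "Removed session c2." and "New session c3 of user username." lines quoted above gives `working`, while the `pam_systemd(login:session)` line gives `malfunctioning`.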
https://github.com/systemd/systemd/pull/9824 should fix things, but https://github.com/systemd/systemd/issues/10414 prevents the fix from working. Once #10414 is resolved, we should be able to close this.