Bug 85746 - Required units are not started in certain setups
Summary: Required units are not started in certain setups
Status: RESOLVED WONTFIX
Alias: None
Product: systemd
Classification: Unclassified
Component: general
Version: unspecified
Hardware: Other
OS: All
Importance: medium normal
Assignee: systemd-bugs
QA Contact: systemd-bugs
Reported: 2014-11-01 21:34 UTC by Luca Bruno
Modified: 2014-12-05 10:06 UTC

Description Luca Bruno 2014-11-01 21:34:57 UTC
In certain setups, some required units may not be started. The issue is always reproducible.

- S1.service has Requires=/After= on D.device and S2.service.
- S1 starts activating, which enqueues start jobs for D and S2.
- S2 runs: it activates D and then deactivates it again.
- S1 "can" now start, because D no longer has a job running [1] and no new start job is enqueued for the device. Hence D is not present and the service fails to run.

The steps to reproduce are really simple.

Set up some helper files, e.g. in /root/tests:

- Create a disk image: truncate -s 1M disk
- mkfs.ext4 disk
- mkdir mnt
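
The steps above can be sketched as one script. This is our rendering, not the reporter's: it uses a scratch directory instead of /root/tests, and adds -F so mkfs.ext4 does not prompt when formatting a regular file:

```shell
set -e
dir=$(mktemp -d)               # scratch dir standing in for /root/tests
truncate -s 1M "$dir/disk"     # 1 MiB backing file for the loop device
if command -v mkfs.ext4 >/dev/null; then
    mkfs.ext4 -Fq "$dir/disk"  # -F: allow formatting a regular file
fi
mkdir "$dir/mnt"               # mount point later used by s1.service
```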

Add the following two services.

s1.service:

[Unit]
Requires=s2.service dev-loop0.device
After=s2.service dev-loop0.device
[Service]
ExecStart=/bin/mount /dev/loop0 /root/tests/mnt
Type=oneshot

s2.service:

[Service]
ExecStart=/sbin/losetup /dev/loop0 /root/tests/disk
ExecStart=/sbin/losetup -d /dev/loop0
Type=oneshot
RemainAfterExit=true

systemctl start s1

It fails, because /dev/loop0 has already been detached again. I would instead expect s1 to hang waiting for the device, i.e. for the start job for the device to be added back.

[1] http://lxr.devzen.net/source/xref/systemd/src/core/job.c#429
Comment 1 Lennart Poettering 2014-12-05 01:29:30 UTC
systemd's dependency logic is fully parallel, and it will dispatch and complete jobs as early as possible (i.e. eagerly), looking at each of them individually and independently of the others. With a setup like yours, once D.device appears, that is all systemd was asked to wait for: it completes the job waiting for it, and the dependency for S1 is fulfilled. The fact that D goes away immediately afterwards is not relevant for this case.

Also note the distinction between Requires= and BindTo=. Requires= just means that the start job of some other unit has to have completed successfully, and if it didn't, any start job for your unit would fail too. The start job of a unit doesn't necessarily result in the unit being active, though. For example, all services with Type=oneshot and RemainAfterExit=no generally go from activating directly to inactive, with the start job completing successfully. Now, BindTo= is like Requires=, but it has one additional effect: when the specified other unit goes down, your unit is pulled down again too. It is hence not only about starting, but also about stopping.
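
As a sketch of that distinction (using BindsTo=, the current spelling of the directive, of which BindTo= is the older name), a unit bound to the device would be stopped again when the device disappears:

[Unit]
# BindsTo= (formerly BindTo=) implies the start dependency of Requires=
# and additionally stops this unit again when dev-loop0.device goes away.
BindsTo=dev-loop0.device
After=dev-loop0.device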

That said, I fear BindTo= is not going to make you happy either, since it's not really useful here: things are executed in parallel, and hence the effect of BindTo= pulling your service down is likely to hit your service only after it has already started up and failed.

My suggestion for a workaround would be to re-add the D.device job a second time, from s2.service. When the loop device shows up for the first time, the D.device job is properly completed by systemd. However, after removing the device again you enqueue the job a second time, so that it needs to be processed by systemd a second time. For this, you could add ExecStart=/usr/bin/systemctl start --no-block dev-loop0.device to s2.service, after the two losetup commands...
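
Spelled out, the suggested workaround would make s2.service look roughly like this (a sketch assembled from the suggestion above):

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/sbin/losetup /dev/loop0 /root/tests/disk
ExecStart=/sbin/losetup -d /dev/loop0
# Re-enqueue the device start job so s1's dependency is processed a
# second time; --no-block avoids waiting synchronously for that job.
ExecStart=/usr/bin/systemctl start --no-block dev-loop0.device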

Anyway, beyond this workaround I cannot suggest much. The state engine eagerly processes jobs to make things as quick as possible, and that is simply not compatible with what you are trying to do. Sorry!
Comment 2 Luca Bruno 2014-12-05 10:06:48 UTC
Thanks for the detailed answer, it makes sense to me. For my use case I think I can start the device with --no-block from s1 itself.
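
A sketch of that variant (the placement of the systemctl call is our guess, not stated in the comment, and whether it races with the mount is untested):

[Unit]
Requires=s2.service dev-loop0.device
After=s2.service dev-loop0.device
[Service]
Type=oneshot
# Hypothetical placement: re-enqueue the device job right before mounting.
ExecStartPre=/usr/bin/systemctl start --no-block dev-loop0.device
ExecStart=/bin/mount /dev/loop0 /root/tests/mnt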

