87929 – RFE: more informative mount report upon failure or issue

Status:	NEW

Alias:	None

Product:	systemd
Classification:	Unclassified
Component:	general
Version:	unspecified
Hardware:	Other All

Importance:	medium enhancement
Assignee:	systemd-bugs
QA Contact:	systemd-bugs

Reported:	2015-01-01 06:33 UTC by ivo welch
Modified:	2015-01-29 04:17 UTC
CC List:	0 users

Description ivo welch 2015-01-01 06:33:33 UTC

A start job is running for *.device...waiting for 90s

this is probably a relatively common error message for users.  most of the time, this will be because the partition(s) are not found.

suggestion:

it would be better if there was more information here.  for example, it could list available local devices and/or file systems that it does see and then say that the requested device named 'x' is not among them.  ideally, it could even guess what file system is used, what the file system likely contains (e.g., whether it is a linux boot disk, a linux root disk, etc).

(or that the mount binary is bad.  or that the fsck went awry.  or whatever else that could have gone wrong indeed went wrong.)
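As an illustration of the kind of check suggested above, here is a minimal Python sketch (not systemd code). It assumes the set of visible devices has already been collected by some other means; the names parse_fstab, missing_devices, describe_missing, and PSEUDO_FS are hypothetical, and network or bind mounts are not handled.

```python
# Hypothetical set of filesystem types that have no backing block device.
PSEUDO_FS = {"proc", "sysfs", "tmpfs", "devpts", "devtmpfs"}


def parse_fstab(text):
    """Return (spec, mountpoint, fstype) for each non-comment fstab line."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue  # malformed line, skipped in this sketch
        entries.append((fields[0], fields[1], fields[2]))
    return entries


def missing_devices(fstab_text, available):
    """List (spec, mountpoint) pairs whose spec is not in `available`.

    `available` is an assumed, pre-collected set of strings such as
    "/dev/sda1" or "UUID=..." describing the devices that were found.
    """
    missing = []
    for spec, mountpoint, fstype in parse_fstab(fstab_text):
        if fstype in PSEUDO_FS:
            continue
        if spec not in available:
            missing.append((spec, mountpoint))
    return missing


def describe_missing(fstab_text, available):
    """Build one human-readable line per missing device."""
    found = ", ".join(sorted(available))
    return [
        f"{mountpoint}: requested device '{spec}' is not among "
        f"the devices found ({found})"
        for spec, mountpoint in missing_devices(fstab_text, available)
    ]
```

Each returned line names the mount point, the device that /etc/fstab asked for, and the devices that were actually seen, which is the information the 90s wait currently hides.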

Comment 1 Zbigniew Jedrzejewski-Szmek 2015-01-01 18:53:30 UTC

We can't really add a message here, but I think that a good catalogue entry (that would be printed by journalctl -x) would be helpful.

Comment 2 Chris Atkinson 2015-01-11 18:28:28 UTC

I Googled for forum threads matching the error message to see how it behaved in the wild. As ivo thought, most (but not all) of the cases were mount errors. 

Another issue was people not being able to figure out what unit is actually having the issue as a result of badly written "Description=" tags. (For this would it be possible to have the error message explicitly include the unit name, e.g., "A start job is running for Incredibly Bad Description (foo.service) (89s / 90s)"?)
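To make the proposed message format concrete, here is a small Python sketch of how such a line could be assembled. It is purely illustrative and not how systemd builds the message; the function name format_job_status and its parameters are hypothetical.

```python
def format_job_status(job_type, description, unit, elapsed_s, limit_s):
    """Format a status line that shows both the description and the unit name."""
    return (
        f"A {job_type} job is running for {description} "
        f"({unit}) ({elapsed_s}s / {limit_s}s)"
    )


print(format_job_status("start", "Incredibly Bad Description",
                        "foo.service", 89, 90))
```

With the unit name always present, a vague "Description=" no longer prevents the user from finding the unit to debug.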

See below for a proposed catalog entry:

-- 48b695296b4d42fe8c127c20d01bc9bc
Subject: A @TYPE@ job is running for @UNIT@ (@TIME@ / @LIMIT@)
Defined-By: systemd
Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

Something has delayed the start or stop job for a unit with the description
"@UNIT@". You will need to debug that unit to see why. @TIME@ of a total
timeout of @LIMIT@ have elapsed. If you would like to shorten the timeout
in future, you can edit the "TimeoutStartSec=" or "TimeoutStopSec="
properties.

A common situation is the delay in mounting a filesystem's *.device unit, 
which is typically caused by problems in "/etc/fstab".
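For readers unfamiliar with the @FIELD@ placeholders, the following Python sketch shows one way such a template could be expanded from a set of field values. It only approximates the idea and is not journalctl's implementation; the names PLACEHOLDER and expand_catalog are hypothetical, and the field values are made up for the example.

```python
import re

# A placeholder is an upper-case field name enclosed in a pair of "@".
PLACEHOLDER = re.compile(r"@([A-Z0-9_]+)@")


def expand_catalog(template, fields):
    """Replace each @NAME@ with fields["NAME"]; leave unknown names untouched."""
    def substitute(match):
        return str(fields.get(match.group(1), match.group(0)))
    return PLACEHOLDER.sub(substitute, template)


subject = "A @TYPE@ job is running for @UNIT@ (@TIME@ / @LIMIT@)"
values = {"TYPE": "start", "UNIT": "foo.service", "TIME": "89s", "LIMIT": "90s"}
print(expand_catalog(subject, values))
```

In this sketch a placeholder is only expanded when it is closed by a second "@", so every field reference in the entry needs both delimiters.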

Comment 3 ivo welch 2015-01-11 19:05:30 UTC

thank you for taking my suggestion seriously, whatever you end up deciding.

my last 5 cents: as I wrote, most boot problems that newbies are experiencing with systemd (i.e., after the initrd was found and loaded) are /etc/fstab misconfig related.  anything we can do to be smart about it (for users who are not), even if this just means giving precise fixing instructions, would be good.  we can diagnose this better with our expertise than our newbies can with theirs.  the harm that is being done to expert users by more smarts here is minimal.

the coding effort for some diagnoses and fixing suggestions is probably reasonable.  (if we needed a standalone perl script, even I, as a novice who is ignorant of systemd procedures and collaborative code development, could do this in a day.)  socially speaking, the net time savings will be very positive.

Comment 4 Zbigniew Jedrzejewski-Szmek 2015-01-29 04:17:23 UTC

(In reply to Chris Atkinson from comment #2)
> See below for a proposed catalog entry:
> 
> -- 48b695296b4d42fe8c127c20d01bc9bc
> Subject: A @TYPE@ job is running for @UNIT@ (@TIME@ / @LIMIT@)
> Defined-By: systemd
> Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
> 
> Something has delayed the start or stop job for a unit with the description 
Use @TYPE@ here instead of "start or stop"?
> "@UNIT@". You will need to debug that unit to see why. @TIME@ of a total
> timeout of @LIMIT@ have elapsed.

> If you would like to shorten the timeout 
> in future, you can edit the "TimeoutStartSec=" or "TimeoutStopSec="
> properties.
I don't think we should suggest that. 

> A common situation is the delay in mounting a filesystem's *.device unit, 
> which is typically caused by problems in "/etc/fstab".

Add
"You can list the dependencies with
    systemctl list-dependencies @UNIT@
"?

Can you make this into a patch?
