Bug 87277 - manages Btrfs multiple device volumes incorrectly, cannot umount
Summary: manages Btrfs multiple device volumes incorrectly, cannot umount
Status: NEW
Alias: None
Product: udisks
Classification: Unclassified
Component: general
Version: unspecified
Hardware: Other All
Importance: medium major
Assignee: David Zeuthen (not reading bugmail)
 
Reported: 2014-12-13 09:11 UTC by Chris Murphy
Modified: 2016-09-10 19:21 UTC

Attachments
journalctl -b -u udisks2 (16.54 KB, text/plain)
2014-12-13 09:11 UTC, Chris Murphy

Description Chris Murphy 2014-12-13 09:11:50 UTC
Created attachment 110807
journalctl -b -u udisks2

udisks2-2.1.3-4.fc21.x86_64

1. Btrfs volume made of two drives /dev/sd[bc] using raid1.
2. Plug them in, and GNOME Nautilus shows two identically named volumes (bug1) when there should be just one.
3. Only one of the listed volume names has an arrow to umount the volume, and it doesn't work. Whenever I click it, the mount point name merely changes, with these entries in the journal:


Dec 13 01:52:54 f21m.localdomain udisksd[1502]: Cleaning up mount point /run/media/chris/btrfs13 (device 8:16 is not mounted)
Dec 13 01:52:54 f21m.localdomain udisksd[1502]: Cleaning up mount point /run/media/chris/btrfs15 (device 8:16 is not mounted)
Dec 13 01:52:54 f21m.localdomain udisksd[1502]: Error cleaning up mount point /run/media/chris/btrfs15: Error removing directory: Device or resource busy
Dec 13 01:52:54 f21m.localdomain udisksd[1502]: Cleaning up mount point /run/media/chris/btrfs16 (device 8:16 is not mounted)
Dec 13 01:52:54 f21m.localdomain udisksd[1502]: Error cleaning up mount point /run/media/chris/btrfs16: Error removing directory: Device or resource busy
Dec 13 01:52:54 f21m.localdomain org.gtk.Private.UDisks2VolumeMonitor[1763]: index_parse.c:191: indx_parse(): error opening /run/media/chris/btrfs15/BDMV/index.bdmv
Dec 13 01:52:54 f21m.localdomain org.gtk.Private.UDisks2VolumeMonitor[1763]: index_parse.c:191: indx_parse(): error opening /run/media/chris/btrfs15/BDMV/BACKUP/index.bdmv

When trying to umount the volume from CLI as root, I get the same messages.


Expected results:

a.) Only one instance of this single volume should appear in Nautilus.
b.) Whether I umount it from the CLI or from Nautilus, it should be unmounted, instead of being apparently remounted automatically by udisks2.
Comment 1 Chris Murphy 2014-12-13 09:12:51 UTC
kernel-3.18.0-1.fc22.x86_64
Comment 2 Chris Murphy 2015-03-25 21:51:24 UTC
The behavior is slightly different on Fedora 22 with udisks2-2.1.4-1.fc22 and GNOME 3.15.92.x. The first click of the eject button produces an audible response and no visual response, but the volume is not unmounted. The second click behaves as described in the bulk of this Nautilus bug:
https://bugzilla.gnome.org/show_bug.cgi?id=746769

The biggest problem is that the UI will suggest it's safe to disconnect the device (when it's in fact two devices) even though the fs volume is still mounted read-write.

The Nautilus and gvfs folks say this problem stems from udisks2.
Comment 3 Chris Murphy 2016-09-10 19:21:21 UTC
OK, so this is actually much worse than I originally thought. What udisks does is try to mount each device node. The first attempt typically fails because the two devices don't appear at the same time. If they do both appear at the same time, udisks mounts them both at different mount points. And because Nautilus presents each with an icon, a user clicking between them causes udisks to keep adding more mount points:

[ 1648.696242] localhost.localdomain udisksd[1561]: Mounted /dev/sdb1 at /run/media/chris2/3ba00179-69aa-4335-b94d-2ba01b7b8b951 on behalf of uid 1001
[ 1652.812191] localhost.localdomain udisksd[1561]: Mounted /dev/sdb1 at /run/media/chris2/3ba00179-69aa-4335-b94d-2ba01b7b8b952 on behalf of uid 1001
[ 1654.638652] localhost.localdomain udisksd[1561]: Mounted /dev/sdb1 at /run/media/chris2/3ba00179-69aa-4335-b94d-2ba01b7b8b953 on behalf of uid 1001
[ 1655.983291] localhost.localdomain udisksd[1561]: Mounted /dev/sdb1 at /run/media/chris2/3ba00179-69aa-4335-b94d-2ba01b7b8b954 on behalf of uid 1001
[ 1748.919726] localhost.localdomain udisksd[1561]: Mounted /dev/sdb1 at /run/media/chris2/3ba00179-69aa-4335-b94d-2ba01b7b8b955 on behalf of uid 1001



[root@localhost ~]# cat /proc/mounts | grep btrfs
/dev/sdc1 /run/media/chris2/3ba00179-69aa-4335-b94d-2ba01b7b8b95 btrfs rw,seclabel,nosuid,nodev,relatime,space_cache,subvolid=5,subvol=/ 0 0
/dev/sdc1 /run/media/chris2/3ba00179-69aa-4335-b94d-2ba01b7b8b951 btrfs rw,seclabel,nosuid,nodev,relatime,space_cache,subvolid=5,subvol=/ 0 0
/dev/sdc1 /run/media/chris2/3ba00179-69aa-4335-b94d-2ba01b7b8b952 btrfs rw,seclabel,nosuid,nodev,relatime,space_cache,subvolid=5,subvol=/ 0 0
/dev/sdc1 /run/media/chris2/3ba00179-69aa-4335-b94d-2ba01b7b8b953 btrfs rw,seclabel,nosuid,nodev,relatime,space_cache,subvolid=5,subvol=/ 0 0
/dev/sdc1 /run/media/chris2/3ba00179-69aa-4335-b94d-2ba01b7b8b954 btrfs rw,seclabel,nosuid,nodev,relatime,space_cache,subvolid=5,subvol=/ 0 0
/dev/sdc1 /run/media/chris2/3ba00179-69aa-4335-b94d-2ba01b7b8b955 btrfs rw,seclabel,nosuid,nodev,relatime,space_cache,subvolid=5,subvol=/ 0 0
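
The pile-up in the listing above can be detected mechanically. What follows is only a minimal sketch, not anything udisks does today: it parses text in /proc/mounts format and reports every Btrfs source device that is mounted at more than one mount point. The function names are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def btrfs_mounts_by_device(mounts_text):
    """Group Btrfs mount points by source device.

    mounts_text is text in /proc/mounts format: whitespace-separated
    fields, with the device first, the mount point second and the
    filesystem type third. Spaces in paths appear as \\040 there and
    are left undecoded in this sketch.
    """
    mounts = defaultdict(list)
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] == "btrfs":
            mounts[fields[0]].append(fields[1])
    return dict(mounts)


def duplicate_mounts(mounts_text):
    """Return only the devices that are mounted at more than one place."""
    return {
        device: points
        for device, points in btrfs_mounts_by_device(mounts_text).items()
        if len(points) > 1
    }
```

Feeding it the contents of /proc/mounts from the session above would report /dev/sdc1 with its six mount points. Note that a Btrfs volume can legitimately be mounted more than once (for example, different subvolumes), so a real check would also have to compare the subvol= options.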


Next, when the user tries to eject one of the devices, udisks does not clean up and umount all instances first. It actually *deletes* the device node for that device, causing the raid1 to now be degraded. I'm guessing it does something like 'echo 1 > /sys/block/sdb/device/delete' because the actual device node is simply gone after I've ejected it. But the file system is still mounted via /dev/sdc1.

So in Nautilus I try to eject the second device, but I keep getting "Unable to eject 2.0 GB Volume Cannot eject drive in use: Device /dev/sdc1 is mounted", while it does umount one of the remaining mount points each time, until finally none are left.

[ 2075.070620] localhost.localdomain udisksd[1561]: Error cleaning up mount point /run/media/chris2/3ba00179-69aa-4335-b94d-2ba01b7b8b951: Error removing directory: Device or resource busy
[ 2075.071115] localhost.localdomain udisksd[1561]: Cleaning up mount point /run/media/chris2/3ba00179-69aa-4335-b94d-2ba01b7b8b952 (device 8:17 no longer exist)
[ 2075.071320] localhost.localdomain udisksd[1561]: Error cleaning up mount point /run/media/chris2/3ba00179-69aa-4335-b94d-2ba01b7b8b952: Error removing directory: Device or resource busy
[ 2075.071521] localhost.localdomain udisksd[1561]: Cleaning up mount point /run/media/chris2/3ba00179-69aa-4335-b94d-2ba01b7b8b953 (device 8:17 no longer exist)
[ 2075.071740] localhost.localdomain udisksd[1561]: Error cleaning up mount point /run/media/chris2/3ba00179-69aa-4335-b94d-2ba01b7b8b953: Error removing directory: Device or resource busy
[ 2075.071864] localhost.localdomain udisksd[1561]: Cleaning up mount point /run/media/chris2/3ba00179-69aa-4335-b94d-2ba01b7b8b954 (device 8:17 no longer exist)
[ 2075.071991] localhost.localdomain udisksd[1561]: Error cleaning up mount point /run/media/chris2/3ba00179-69aa-4335-b94d-2ba01b7b8b954: Error removing directory: Device or resource busy
[ 2075.072120] localhost.localdomain udisksd[1561]: Cleaning up mount point /run/media/chris2/3ba00179-69aa-4335-b94d-2ba01b7b8b955 (device 8:17 no longer exist)
[ 2075.072289] localhost.localdomain udisksd[1561]: Error cleaning up mount point /run/media/chris2/3ba00179-69aa-4335-b94d-2ba01b7b8b955: Error removing directory: Device or resource busy
[ 2075.072416] localhost.localdomain udisksd[1561]: Cleaning up mount point /run/media/chris2/3ba00179-69aa-4335-b94d-2ba01b7b8b956 (device 8:17 no longer exist)
[ 2075.072529] localhost.localdomain udisksd[1561]: Error cleaning up mount point /run/media/chris2/3ba00179-69aa-4335-b94d-2ba01b7b8b956: Error removing directory: Device or resource busy
[ 2075.075833] localhost.localdomain dbus-daemon[1422]: Activating service name='org.gnome.Shell.HotplugSniffer'
[ 2075.076948] localhost.localdomain udisksd[1561]: Unmounted /dev/sdc1 on behalf of uid 1001

But then Btrfs also gets pissed:

[ 2245.235467] localhost.localdomain udisksd[1561]: Cleaning up mount point /run/media/chris2/3ba00179-69aa-4335-b94d-2ba01b7b8b951 (device 8:17 no longer exist)
[ 2244.761951] localhost.localdomain kernel: BTRFS error (device sdc1): bdev /dev/sdb1 errs: wr 1, rd 0, flush 0, corrupt 0, gen 0
[ 2244.762089] localhost.localdomain kernel: BTRFS error (device sdc1): bdev /dev/sdb1 errs: wr 2, rd 0, flush 0, corrupt 0, gen 0
[ 2244.762198] localhost.localdomain kernel: BTRFS error (device sdc1): bdev /dev/sdb1 errs: wr 3, rd 0, flush 0, corrupt 0, gen 0
[ 2244.877121] localhost.localdomain kernel: BTRFS error (device sdc1): bdev /dev/sdb1 errs: wr 4, rd 0, flush 0, corrupt 0, gen 0
[ 2244.877245] localhost.localdomain kernel: BTRFS error (device sdc1): bdev /dev/sdb1 errs: wr 5, rd 0, flush 0, corrupt 0, gen 0
[ 2245.024779] localhost.localdomain kernel: BTRFS error (device sdc1): bdev /dev/sdb1 errs: wr 6, rd 0, flush 0, corrupt 0, gen 0
[ 2245.024794] localhost.localdomain kernel: BTRFS error (device sdc1): bdev /dev/sdb1 errs: wr 7, rd 0, flush 0, corrupt 0, gen 0
[ 2245.024800] localhost.localdomain kernel: BTRFS error (device sdc1): bdev /dev/sdb1 errs: wr 8, rd 0, flush 0, corrupt 0, gen 0
[ 2245.273089] localhost.localdomain kernel: BTRFS warning (device sdc1): lost page write due to IO error on /dev/sdb1
[ 2245.273098] localhost.localdomain kernel: BTRFS error (device sdc1): bdev /dev/sdb1 errs: wr 9, rd 0, flush 0, corrupt 0, gen 0
[ 2245.273179] localhost.localdomain kernel: BTRFS warning (device sdc1): lost page write due to IO error on /dev/sdb1
[ 2245.273184] localhost.localdomain kernel: BTRFS error (device sdc1): bdev /dev/sdb1 errs: wr 10, rd 0, flush 0, corrupt 0, gen 0
[ 2245.944628] localhost.localdomain udisksd[1561]: Unmounted /dev/sdc1 on behalf of uid 1001
[ 2245.484756] localhost.localdomain kernel: sdc: detected capacity change from 2004877312 to

We find out that a bunch of writes for sdb failed because it had been deleted before the file system was unmounted.


Upon manually remounting the raid1, it fixes itself, but only because it's raid1.

[ 2387.460784] localhost.localdomain dbus-daemon[1422]: Successfully activated service 'org.gnome.Shell.HotplugSniffer'
[ 2386.980333] localhost.localdomain kernel: BTRFS error (device sdc1): parent transid verify failed on 29622272 wanted 25 found 18
[ 2386.983426] localhost.localdomain kernel: BTRFS info (device sdc1): read error corrected: ino 1 off 29622272 (dev /dev/sdb1 sector 18944)
[ 2386.989515] localhost.localdomain kernel: BTRFS info (device sdc1): read error corrected: ino 1 off 29626368 (dev /dev/sdb1 sector 18952)
[ 2386.991136] localhost.localdomain kernel: BTRFS info (device sdc1): read error corrected: ino 1 off 29630464 (dev /dev/sdb1 sector 18960)
[ 2386.993113] localhost.localdomain kernel: BTRFS info (device sdc1): read error corrected: ino 1 off 29634560 (dev /dev/sdb1 sector 18968)
[ 2386.996330] localhost.localdomain kernel: BTRFS error (device sdc1): parent transid verify failed on 29605888 wanted 25 found 21
[ 2386.999216] localhost.localdomain kernel: BTRFS info (device sdc1): read error corrected: ino 1 off 29605888 (dev /dev/sdb1 sector 18912)
[ 2387.003528] localhost.localdomain kernel: BTRFS info (device sdc1): read error corrected: ino 1 off 29609984 (dev /dev/sdb1 sector 18920)
[ 2387.005113] localhost.localdomain kernel: BTRFS info (device sdc1): read error corrected: ino 1 off 29614080 (dev /dev/sdb1 sector 18928)
[ 2387.007075] localhost.localdomain kernel: BTRFS info (device sdc1): read error corrected: ino 1 off 29618176 (dev /dev/sdb1 sector 18936)
[ 2417.133346] localhost.localdomain kernel: BTRFS error (device sdc1): space cache generation (23) does not match inode (25)
[ 2417.133362] localhost.localdomain kernel: BTRFS warning (device sdc1): failed to load free space cache for block group 29360128, rebuilding i


Had the metadata profile been single or raid0, the fs volume would have been corrupted.

Option A: 
Mount *one* Btrfs device at a time, and then check /sys/fs/btrfs/UUID/devices to find out whether the volume has any other devices. None of the devices listed there should be mounted again.

[root@f24m devices]# ls -la /sys/fs/btrfs/3ba00179-69aa-4335-b94d-2ba01b7b8b95/devices/
total 0
drwxr-xr-x. 2 root root 0 Sep 10 13:16 .
drwxr-xr-x. 5 root root 0 Sep 10 13:16 ..
lrwxrwxrwx. 1 root root 0 Sep 10 13:16 sdb1 -> ../../../../devices/pci0000:00/0000:00:1a.7/usb1/1-1/1-1.3/1-1.3.3/1-1.3.3:1.0/host6/target6:0:0/6:0:0:0/block/sdb/sdb1
lrwxrwxrwx. 1 root root 0 Sep 10 13:16 sdc1 -> ../../../../devices/pci0000:00/0000:00:1a.7/usb1/1-1/1-1.3/1-1.3.4/1-1.3.4:1.0/host7/target7:0:0/7:0:0:0/block/sdc/sdc1
[root@f24m devices]# 
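
A minimal sketch of the Option A check, assuming the sysfs layout shown in the listing above (one entry per member device under /sys/fs/btrfs/UUID/devices). The function names and the sysfs_root parameter are hypothetical; the parameter exists only so the logic can be exercised against a fake directory tree.

```python
import os


def btrfs_member_devices(fs_uuid, sysfs_root="/sys/fs/btrfs"):
    """List the member device names of a Btrfs volume known to the kernel.

    Returns an empty list when the kernel has no sysfs entry for that
    UUID, which is what I would expect before the first mount.
    """
    devices_dir = os.path.join(sysfs_root, fs_uuid, "devices")
    try:
        return sorted(os.listdir(devices_dir))
    except FileNotFoundError:
        return []


def should_skip_mount(device_name, fs_uuid, sysfs_root="/sys/fs/btrfs"):
    """True if device_name already belongs to a volume the kernel knows about.

    device_name is the kernel name without the /dev/ prefix, e.g. "sdb1".
    """
    return device_name in btrfs_member_devices(fs_uuid, sysfs_root)
```

With the volume above mounted via sdc1, should_skip_mount("sdb1", "3ba00179-69aa-4335-b94d-2ba01b7b8b95") would be expected to return True, so sdb1 would not get a second mount point.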


Option B:
Before mounting, use ioctl BTRFS_IOC_FS_INFO and BTRFS_IOC_DEV_INFO to find out which devices are members of the same Btrfs volume, and only mount one device node per fs volume UUID.
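
The selection step of Option B can be sketched independently of how the membership information is obtained. The sketch below assumes the caller has already built a mapping from device node to filesystem UUID, whether through the ioctls named above or, as an assumption on my part, from the ID_FS_UUID udev property. The function name is hypothetical.

```python
def pick_mount_candidates(device_uuids):
    """Choose exactly one device node per filesystem UUID.

    device_uuids maps a device path (e.g. "/dev/sdb1") to the UUID of
    the filesystem it belongs to. The first device in sorted order wins
    for each UUID, so the choice is stable across calls.
    """
    chosen = {}
    for device in sorted(device_uuids):
        chosen.setdefault(device_uuids[device], device)
    return chosen
```

For the two-device raid1 in this report, both /dev/sdb1 and /dev/sdc1 map to the same UUID, so only /dev/sdb1 would be handed to mount and only one icon would need to be shown.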


Option C:
Do not automount Btrfs. Right now this is the best short-term option, because the current behavior is dangerous and just not ready for prime time.

