Summary: | Creating new partition and filesystem sometimes only creates the partition | ||
---|---|---|---|
Product: | udisks | Reporter: | intrigeri |
Component: | operations | Assignee: | David Zeuthen (not reading bugmail) <zeuthen> |
Status: | RESOLVED FIXED | QA Contact: | |
Severity: | normal | ||
Priority: | medium | CC: | intrigeri, marc.deslauriers |
Version: | unspecified | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: |
Description
intrigeri
2014-10-26 10:10:54 UTC
I retried the same when running udisksd without --no-debug. The journal (for the partition+FS creation) says: Oct 26 11:29:02 sid-desktop kernel: sda: sda1 Oct 26 11:29:03 sid-desktop udisksd[1790]: 11:29:03.286:[1790]:[DEBUG]: uevent change /sys/devices/pci0000:00/0000:00:01.2/usb1/1-2/1-2.2/1-2.2:1.0/host8/target8:0:0/8:0:0:0/block/sda [udiskslinuxprovider.c:902, udisks_linux_provider_handle_uevent()] Oct 26 11:29:03 sid-desktop udisksd[1790]: 11:29:03.287:[1795]:[INFO]: Cleanup check start [udisksstate.c:419, udisks_state_check_in_thread()] Oct 26 11:29:03 sid-desktop udisksd[1790]: 11:29:03.287:[1795]:[INFO]: Cleanup check end [udisksstate.c:457, udisks_state_check_in_thread()] Oct 26 11:29:03 sid-desktop udisksd[1790]: 11:29:03.771:[1790]:[DEBUG]: uevent add /sys/devices/pci0000:00/0000:00:01.2/usb1/1-2/1-2.2/1-2.2:1.0/host8/target8:0:0/8:0:0:0/block/sda/sda1 [udiskslinuxprovider.c:902, udisks_linux_provider_handle_uevent()] Oct 26 11:29:03 sid-desktop kernel: sda: sda1 Oct 26 11:29:03 sid-desktop udisksd[1790]: 11:29:03.817:[1790]:[DEBUG]: uevent remove /sys/devices/pci0000:00/0000:00:01.2/usb1/1-2/1-2.2/1-2.2:1.0/host8/target8:0:0/8:0:0:0/block/sda/sda1 [udiskslinuxprovider.c:902, udisks_linux_provider_handle_uevent()] Oct 26 11:29:03 sid-desktop udisksd[1790]: 11:29:03.817:[1795]:[INFO]: Cleanup check start [udisksstate.c:419, udisks_state_check_in_thread()] Oct 26 11:29:03 sid-desktop udisksd[1790]: 11:29:03.818:[1795]:[INFO]: Cleanup check end [udisksstate.c:457, udisks_state_check_in_thread()] Oct 26 11:29:04 sid-desktop udisksd[1790]: 11:29:04.416:[1790]:[DEBUG]: uevent change /sys/devices/pci0000:00/0000:00:01.2/usb1/1-2/1-2.2/1-2.2:1.0/host8/target8:0:0/8:0:0:0/block/sda [udiskslinuxprovider.c:902, udisks_linux_provider_handle_uevent()] Oct 26 11:29:04 sid-desktop udisksd[1790]: 11:29:04.416:[1795]:[INFO]: Cleanup check start [udisksstate.c:419, udisks_state_check_in_thread()] Oct 26 11:29:04 sid-desktop udisksd[1790]: 11:29:04.416:[1795]:[INFO]: Cleanup check end [udisksstate.c:457, udisks_state_check_in_thread()] Oct 26 11:29:05 sid-desktop udisksd[1790]: 11:29:05.004:[1790]:[DEBUG]: uevent change /sys/devices/pci0000:00/0000:00:01.2/usb1/1-2/1-2.2/1-2.2:1.0/host8/target8:0:0/8:0:0:0/block/sda [udiskslinuxprovider.c:902, udisks_linux_provider_handle_uevent()] Oct 26 11:29:05 sid-desktop udisksd[1790]: 11:29:05.004:[1795]:[INFO]: Cleanup check start [udisksstate.c:419, udisks_state_check_in_thread()] Oct 26 11:29:05 sid-desktop udisksd[1790]: 11:29:05.004:[1795]:[INFO]: Cleanup check end [udisksstate.c:457, udisks_state_check_in_thread()] Oct 26 11:29:05 sid-desktop udisksd[1790]: 11:29:05.482:[1790]:[DEBUG]: uevent add /sys/devices/pci0000:00/0000:00:01.2/usb1/1-2/1-2.2/1-2.2:1.0/host8/target8:0:0/8:0:0:0/block/sda/sda1 [udiskslinuxprovider.c:902, udisks_linux_provider_handle_uevent()] Reproduced with udisks 2.1.6-2 on current Debian sid, this time in Python, with any of direct D-Bus calls (both sync and async), libudisks (both sync and async) and gdbus. This tends to confirm that the bug is in UDisks2 itself. Here's my understanding of this issue: In src/udiskslinuxpartitiontable.c, handle_create_partition(): 1- New partition is created by calling parted 2- udisks_linux_block_object_trigger_uevent() is called to trigger udev 3- Wait for partition to show up by using wait_for_partition() 4- Wipe new partition by calling wipefs 5- udisks_linux_block_object_trigger_uevent() is called to trigger udev The problem arises because #2 and #5 aren't done synchronously. When udev processes the device change request, it removes the newly created partition device (ie: sdb1), and then adds it back again. This happens in a race with #4, and results in wipefs being called while the device is removed. This issue can be solved in a few ways, from best to worst: 1- Create a sync version of udisks_linux_block_object_trigger_uevent() that waits until the device is actually handled in #2 and #5 2- Make sure there are no pending events in the udev queue after #2 and #5 3- Adding an artificial delay after #2 and #5 to wait until the device has settled down (some places in the code add a udisks_linux_block_object_reread_partition_table(), which I assume results in enough delay to not hit the race most of the time.) I've tried removing #2 and #5 altogether, and that appeared to work also, but then gnome-disks crashes because udev hasn't added the device back fast enough after the wipefs. I applied Mathieu's patch: https://cgit.freedesktop.org/udisks/commit/?id=5c859c99dfcd3625 Thank you! I think the patch is not fully effective. We still see failures related to unexpected remove/add uevent pairs with storaged and Cockpit. Storaged can create a partition and format it with a single method call, and the remove/add pair caused by the partition creation interferes frequently with the immediately following formatting, so I guess it will still interfere with wipefs as well. The events come from udevd itself, when it calls BLKRRPART as part of synthesizing the change event for the parent device of a partition. The change event is synthesized when the parent device is closed after having been open for writing (udev listens for that via inotify). I don't know whether there are any ordering guarantees between uevents and inotify, so I don't know whether udisks/storaged can do anything to make sure that the removed/add pair has definitely happened by now. My ideal fix would probably be in the kernel: BLKRRPART should not emit remove/add pairs in the first place, but be more accurate about what has really changed. I have a patch proposal for storaged that tries to deal with this situation: https://github.com/storaged-project/storaged/pull/61 This helps a lot, but I guess wipefs or other operations might still hit a missing block device. Hmm... > I think the patch is not fully effective.
In fact, the patch seems to make things worse. :-/
One way to prevent the rogue BLKRRPART ioctl by udevd is to take a shared lock on the parent device. Everything becomes sane with that, except that now udisks2 (or storaged) will issue random BLKRRPARTs themselves, which just make things worse.
I vote to revert 5c859c99d.
I'll work on a patch for storaged to put a shared lock around parted/wipefs/cryptsetup/mkfs.
I think this is better: https://github.com/storaged-project/storaged/pull/64 The current patch 5c859c99dfcd3625 makes our tests fail reliably, so I'll reopen this. Thanks Marius! I finally got around to this, and ported the storaged patches to udisks: https://cgit.freedesktop.org/udisks/commit/?id=e168e59f https://cgit.freedesktop.org/udisks/commit/?id=554daa4b I tried this with an USB stick; I do get the wipefs failure pretty consistently, and with this creating the partition and FS seems to work reliably. |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.