This is a follow-up to a Mandriva bug (see http://qa.mandriva.com/show_bug.cgi?id=23268 for complete logs and analysis).

It is possible to make hald hit the assertion (d_it == NULL) in blockdev.c during boot, while udev is still processing its initial start events and sending them to hald, and while hald itself is still initializing. We hit this problem because we use parallel init, which causes hal to be started very early.

This happens because, when acpi is enabled, /org/freedesktop/Hal/devices/computer is only added to the GDL after the initial hald-probe-smbios probe has returned. In the meantime, hald receives new hotplug events through udev, so the assertion can sometimes be triggered.

I'm not sure what the best way to fix this is:
- make sure /org/freedesktop/Hal/devices/computer is always present in the GDL, even before the initial computer probe
- replace the assertion with a simple break out of the loop (I'm experimenting with this)
- hook up hotplug event handling only after the initial smbios probe is done and /org/freedesktop/Hal/devices/computer is in the GDL
- do the smbios probe synchronously (not a good idea IMO)
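To make the "break instead of assert" option concrete, here is a minimal sketch of the idea. It does not use the real hald code: Device, DeviceStore, store_find() and find_storage_ancestor() are hypothetical stand-ins, and the actual loop in blockdev.c may look quite different. The only point is that a failed parent lookup ends the walk instead of aborting the daemon.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a HAL device object (not the real HalDevice). */
typedef struct {
    const char *udi;        /* unique device identifier */
    const char *parent_udi; /* NULL for the root of the tree */
    int is_storage;         /* non-zero if this node is a storage device */
} Device;

/* Hypothetical stand-in for the GDL: a flat array of devices. */
typedef struct {
    const Device *devices;
    size_t count;
} DeviceStore;

/* Return the device with the given UDI, or NULL if it is not in the store. */
static const Device *store_find(const DeviceStore *s, const char *udi)
{
    if (udi == NULL)
        return NULL;
    for (size_t i = 0; i < s->count; i++)
        if (strcmp(s->devices[i].udi, udi) == 0)
            return &s->devices[i];
    return NULL;
}

/* Walk up the parent chain starting at d and return the nearest storage
 * device. If an ancestor is not (yet) in the store, the loop ends and NULL
 * is returned, where an assertion on the lookup result would abort. */
static const Device *find_storage_ancestor(const DeviceStore *gdl,
                                           const Device *d)
{
    const Device *d_it = d;
    while (d_it != NULL) {
        if (d_it->is_storage)
            return d_it;
        d_it = store_find(gdl, d_it->parent_udi);
    }
    return NULL;
}
```

The caller then has to cope with a NULL result, for example by skipping the device for now.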
(For the sake of correctness: the assertion happens while hald waits for a callout, hal-system-storage-cleanup-mountpoints.) But the issue is more complicated than that.

HAL assumes a specific ordering of hotplug events. After initialization, this ordering is ensured by the kernel (udev) event ordering. During initialization, coldplug synthesizes events in a specific order. In particular, the computer object, being the root of the whole tree, should appear first, before anything else.

What happens is this: osspec_init() sets up udev socket processing. Then osspec_probe() creates and queues the initial coldplug events. But osspec_probe() returns not when /computer has been entered in the GDL but as soon as it has spawned the callouts; after that, hald enters the main loop, which starts udev socket processing. Any udev event received during this time (in our case it was sound initialization: loading of modules etc.) will trigger event queue processing *before* the /computer callout has returned and /computer has appeared in the GDL.

Now that I look at it, this seems to be a general issue. As far as I understand, a node is moved to the GDL only when all of its callouts have finished, but processing of the next event starts as soon as the callouts have been spawned. For example, imagine a hotpluggable device with a node for the controller (e.g. PCMCIA) and a block device. The controller requires a callout that takes a long time to finish. In this case we hit exactly the same situation: the block device may be processed before the controller appears in the GDL.

In view of this, the only fix seems to be to relax the assertion in the storage loop and allow the parent to be present in either the GDL or the TDL.

Coldplug vs. hotplug ordering should be OK, as we create the complete set of coldplug events before entering the main loop. There is a tiny chance of missing some hotplug events between scanning /sys and the start of udev processing.
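The relaxed lookup proposed above could look roughly like the following sketch. It is not the actual hald code: Node, NodeList, list_find() and find_parent() are hypothetical names, and the two lists only model the GDL and TDL as plain arrays. It shows a parent being found while its callouts are still running, i.e. while it is still in the TDL.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical device node (not the real HalDevice). */
typedef struct {
    const char *udi;        /* unique device identifier */
    const char *parent_udi; /* UDI of the parent, NULL for the root */
} Node;

/* Hypothetical device list, used here to model both the GDL and the TDL. */
typedef struct {
    const Node *nodes;
    size_t count;
} NodeList;

/* Return the node with the given UDI, or NULL if the list lacks it. */
static const Node *list_find(const NodeList *l, const char *udi)
{
    if (udi == NULL)
        return NULL;
    for (size_t i = 0; i < l->count; i++)
        if (strcmp(l->nodes[i].udi, udi) == 0)
            return &l->nodes[i];
    return NULL;
}

/* Look for the parent of child in the GDL first. If it is not there, fall
 * back to the TDL, where a node stays until its callouts have finished. */
static const Node *find_parent(const NodeList *gdl, const NodeList *tdl,
                               const Node *child)
{
    const Node *parent = list_find(gdl, child->parent_udi);
    if (parent == NULL)
        parent = list_find(tdl, child->parent_udi);
    return parent;
}
```

With this lookup, a block device whose controller is still waiting on a slow callout finds its parent in the TDL instead of getting NULL.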
Fixed in git (commit b644d7fe9899f863013cc025764dd86c763e54ba).