Bug 8210

Summary: assertion when hald is started very early on system
Product: hal Reporter: Frederic Crozat <fred>
Component: hald Assignee: David Zeuthen (not reading bugmail) <zeuthen>
Status: RESOLVED FIXED QA Contact:
Severity: normal    
Priority: high CC: arvidjaar
Version: unspecified   
Hardware: x86 (IA32)   
OS: Linux (All)   
Whiteboard:

Description Frederic Crozat 2006-09-10 00:54:03 UTC
This is a follow-up to a Mandriva bug
(see http://qa.mandriva.com/show_bug.cgi?id=23268 for complete logs and analysis).

hald can hit an assertion (d_it == NULL) in blockdev.c during the boot
process, when udev is still processing its initial start events and sending
them to hald while hald itself is initializing. We hit this problem because we
use parallel init, which causes hal to be started very early.

This is because /org/freedesktop/Hal/devices/computer is only added to the GDL
after the initial hald-probe-smbios probe has returned, when ACPI is enabled.
In the meantime, hald receives new hotplug events through udev, so the
assertion can sometimes be triggered.

I'm not sure what the best way to fix this is:
- make sure /org/freedesktop/Hal/devices/computer is always present in the GDL,
  even before the initial computer probe
- replace the assertion with a simple break in the loop (I'm experimenting with
  this)
- hook up hotplug event handling only after the initial smbios probe is done
  and /org/freedesktop/Hal/devices/computer is in the GDL
- do the smbios probe synchronously (not a good idea IMO)
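
The second option (a break instead of the assertion) could look roughly like
the sketch below. This is not the code from blockdev.c: struct device,
store_find() and find_physical_ancestor() are hypothetical stand-ins used only
to show the shape of a parent walk that stops quietly when an ancestor is
missing.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a HAL device object; not hald's real type. */
struct device {
    const char *udi;         /* unique device identifier */
    const char *parent_udi;  /* UDI of the parent, NULL for the root */
    int is_physical;         /* hypothetical flag: walk stops here */
};

/* Hypothetical store lookup: linear search by UDI, NULL when absent. */
static const struct device *store_find(const struct device *store, size_t n,
                                       const char *udi)
{
    if (udi == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        if (strcmp(store[i].udi, udi) == 0)
            return &store[i];
    return NULL;
}

/* Walk up the parent chain from `start` (must not be NULL) until a physical
 * device is found. A missing ancestor ends the walk with NULL instead of
 * aborting the daemon with an assertion. */
static const struct device *find_physical_ancestor(const struct device *store,
                                                   size_t n,
                                                   const struct device *start)
{
    const struct device *d_it = start;

    for (;;) {
        if (d_it->is_physical)
            return d_it;
        d_it = store_find(store, n, d_it->parent_udi);
        if (d_it == NULL)
            break;  /* parent not in the store (yet): give up quietly */
    }
    return NULL;
}
```

The caller then has to cope with a NULL result, for example by skipping the
device or retrying later; which of these is appropriate for hald is not
settled here.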

Comment 1 Andrei Borzenkov 2006-09-10 01:32:13 UTC
(To be precise, the assertion happens while hald is waiting for a callout,
hal-system-storage-cleanup-mountpoints.)

But the issue is more complicated than that.

HAL assumes a specific ordering of hotplug events. After initialization, this
is ensured by the ordering of kernel (udev) events. During initialization,
coldplug synthesizes events in a specific order. In particular, the computer
object, being the root of the whole tree, should appear first, before anything
else.

What happens is this: osspec_init() sets up udev socket processing. Then
osspec_probe() creates and queues the initial coldplug events. But
osspec_probe() returns not when /computer has been entered in the GDL but as
soon as it has spawned the callouts; after that hald enters the main loop,
which starts udev socket processing. Any udev event received during this time
(in our case it was sound initialization: loading of modules etc.) will
trigger event queue processing *before* the /computer callout has returned and
/computer has appeared in the GDL.
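
The window can be pictured with a toy model of the two device lists. The
sketch assumes that a device sits in the TDL while its callouts run and moves
to the GDL once they finish; struct store and all function names here are
hypothetical, not hald's API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEVICES 8

/* Hypothetical miniature device store: just a list of UDIs. */
struct store {
    const char *udi[MAX_DEVICES];
    size_t count;
};

static bool store_contains(const struct store *s, const char *udi)
{
    for (size_t i = 0; i < s->count; i++)
        if (strcmp(s->udi[i], udi) == 0)
            return true;
    return false;
}

static void store_add(struct store *s, const char *udi)
{
    if (s->count < MAX_DEVICES)
        s->udi[s->count++] = udi;
}

static void store_remove(struct store *s, const char *udi)
{
    for (size_t i = 0; i < s->count; i++) {
        if (strcmp(s->udi[i], udi) == 0) {
            s->udi[i] = s->udi[--s->count];  /* fill the hole with the last entry */
            return;
        }
    }
}

/* Step 1: callouts have been spawned; the device is only in the TDL. */
static void callouts_spawned(struct store *tdl, const char *udi)
{
    store_add(tdl, udi);
}

/* Step 2: callouts have finished; the device moves from the TDL to the GDL. */
static void callouts_finished(struct store *tdl, struct store *gdl,
                              const char *udi)
{
    store_remove(tdl, udi);
    store_add(gdl, udi);
}
```

Any event handled between the two steps that looks up /computer in the GDL
alone will not find it, which is the situation described above.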

Now that I look at it, this seems to be a general issue. As far as I
understand, a node is moved to the GDL only when all of its callouts have
finished, but processing of the next event starts as soon as the callouts have
been spawned. For example, imagine a hotpluggable device that has one node for
the controller (e.g. PCMCIA) and one for the block device, where the
controller requires a callout that takes a long time to finish. In this case
we hit exactly the same situation: the block device may be processed before
the controller appears in the GDL.

In view of this, the only fix seems to be to relax the assertion in the
storage loop and allow the parent to be present in either the GDL or the TDL.
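
A minimal sketch of such a relaxed lookup follows. It is not the actual fix
that was committed; struct device, struct store, store_find() and
find_parent() are hypothetical names chosen for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for hald's device and device-store types. */
struct device {
    const char *udi;
    const char *parent_udi;  /* NULL for the root */
};

struct store {
    const struct device *items;
    size_t count;
};

/* Linear search by UDI; returns NULL when the device is not in the store. */
static const struct device *store_find(const struct store *s, const char *udi)
{
    if (udi == NULL)
        return NULL;
    for (size_t i = 0; i < s->count; i++)
        if (strcmp(s->items[i].udi, udi) == 0)
            return &s->items[i];
    return NULL;
}

/* Look for the parent in the GDL first, then fall back to the TDL, so a
 * parent whose callouts are still running is still found. */
static const struct device *find_parent(const struct store *gdl,
                                        const struct store *tdl,
                                        const struct device *d)
{
    const struct device *parent = store_find(gdl, d->parent_udi);

    if (parent == NULL)
        parent = store_find(tdl, d->parent_udi);
    return parent;
}
```

With this shape, a NULL result means the parent is in neither list, which is
a stronger condition than the original GDL-only check.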

Coldplug vs. hotplug ordering should be OK, as we create the complete set of
coldplug events before entering the main loop. There is a tiny chance of
missing some hotplug events that arrive after /sys has been scanned and before
udev processing starts.

Comment 2 Frederic Crozat 2006-11-21 01:17:23 UTC
fixed on GIT (commit b644d7fe9899f863013cc025764dd86c763e54ba)
