Forwarding this bug from Ubuntu reporter Florin:
On Ubuntu there appears to be a race condition in libdrm during boot. It appears the i915 drm device exists but isn't fully initialized at the time plymouth wants to use it.
Note I'm filing this against -intel just because it's the intel portion of libdrm where the code is passing through; I think this is really a libdrm bug.
After a force restart of Ubuntu, I've got a System Crash error after logging in.
Description: Ubuntu precise (development branch)
This looks more like a libdrm bug. There's a race condition with the i915 device not being ready by the time plymouth is starting. Possibly it's because it doesn't have drm master.
<Sarvatt> apparently chromeos works around it with http://git.chromium.org/gitweb/?p=chromiumos/third_party/kernel.git;a=commit;h=32a8c5b67163a6ae211ff2683c999b6ad2c76d1f but that's just working around the problem..
Googling intel/intel_bufmgr_gem.c:2783 turns up a lot of hits.
The code in question with the assert is:
    if (IS_GEN2(bufmgr_gem->pci_device))
        bufmgr_gem->gen = 2;
    else if (IS_GEN3(bufmgr_gem->pci_device))
        bufmgr_gem->gen = 3;
    else if (IS_GEN4(bufmgr_gem->pci_device))
        bufmgr_gem->gen = 4;
    else if (IS_GEN5(bufmgr_gem->pci_device))
        bufmgr_gem->gen = 5;
    else if (IS_GEN6(bufmgr_gem->pci_device))
        bufmgr_gem->gen = 6;
    else if (IS_GEN7(bufmgr_gem->pci_device))
        bufmgr_gem->gen = 7;
    else
        assert(0);
$ xpci 8086:0126
snb-m-gt2+ (8086:0126) sandybridge
So it should be going into the IS_GEN6 branch.
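To illustrate why a recognized chip could still hit the assert, here is a minimal C sketch of the same fall-through logic. It assumes (this is not confirmed in the report) that the device ID read from a not-yet-initialized device comes back as an unrecognized value such as 0. SKETCH_IS_GEN6 and sketch_gen_from_devid are hypothetical stand-ins, not libdrm names; the real IS_GEN* macros match long lists of PCI IDs.

```c
/* Hypothetical, heavily simplified stand-in for libdrm's IS_GEN6 macro.
 * Only the device ID from this report (8086:0126) is recognized here. */
#define SKETCH_IS_GEN6(devid) ((devid) == 0x0126)

/* Returns the GPU generation for devid, or -1 where the real code
 * would fall through to its final assert. */
static int sketch_gen_from_devid(unsigned int devid)
{
    if (SKETCH_IS_GEN6(devid))
        return 6;
    return -1; /* unrecognized ID: this is the assert path */
}
```

With this sketch, sketch_gen_from_devid(0x0126) takes the gen 6 branch, while an ID of 0 matches nothing and ends up on the path that asserts in the real code.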
DistroRelease: Ubuntu 12.04
Package: plymouth 0.8.2-2ubuntu28
ProcVersionSignature: Ubuntu 3.2.0-20.33-generic 3.2.12
Uname: Linux 3.2.0-20-generic x86_64
Date: Wed Mar 28 09:33:16 2012
InstallationMedia: Ubuntu 12.04 LTS "Precise Pangolin" - Alpha amd64 (20120322)
MachineType: LENOVO 4284BZ4
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-20-generic root=UUID=f1bb4518-a890-49c0-9339-ecc3d8bd2658 ro quiet splash vt.handoff=7
ProcCmdline: /sbin/plymouthd --mode=boot --attach-to-session
PATH=(custom, no user)
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-20-generic root=UUID=f1bb4518-a890-49c0-9339-ecc3d8bd2658 ro quiet splash vt.handoff=7
Title: plymouthd crashed with SIGABRT in raise()
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.version: 8BET56WW (1.36 )
dmi.board.asset.tag: Not Available
dmi.board.version: Not Available
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.version: Not Available
dmi.product.version: ThinkPad W520
Created attachment 60277
Created attachment 60278
Created attachment 60279
Created attachment 60280
Created attachment 60281
Created attachment 60282
This is another bug that we think has the same root cause: in this one, X comes up before the drm device is ready, and so trips on a different chunk of code.
You can see from comparing timestamps in Xorg.0.log and dmesg when drm is accessed vs. when it is reporting itself ready.
We've got a couple of ideas on how to fix this in the distro. One is to put a loop around the code paths where the failures occur and keep retrying for some number of seconds, but that feels like a big hack. The other would be to have an event indicating that the driver is ready for use, which we could listen for to delay plymouth, X, etc. until it's received. We don't know how feasible that is.
We suspect this happens because of an Ubuntu kernel patch, which was added to work around other crashes during boot:
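For reference, the retry idea could look roughly like the following in C. This is only a sketch under assumptions: it presumes the early failure shows up as open() returning EAGAIN, and the helper name sketch_open_retry and its max_tries and delay_ms parameters are made up for illustration. It is not code from plymouth, X, or libdrm.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: open path read-write, retrying only while open()
 * fails with EAGAIN. Gives up after max_tries attempts, sleeping
 * delay_ms milliseconds between attempts. Returns an fd, or -1 with
 * errno set. */
static int sketch_open_retry(const char *path, int max_tries, long delay_ms)
{
    struct timespec delay;
    int i;

    delay.tv_sec = delay_ms / 1000;
    delay.tv_nsec = (delay_ms % 1000) * 1000000L;

    for (i = 0; i < max_tries; i++) {
        int fd = open(path, O_RDWR);
        if (fd >= 0)
            return fd;
        if (errno != EAGAIN)
            return -1;           /* hard failure, retrying will not help */
        nanosleep(&delay, NULL); /* device not ready yet, wait and retry */
    }
    errno = EAGAIN;              /* still not ready after all attempts */
    return -1;
}
```

A caller might use something like sketch_open_retry("/dev/dri/card0", 50, 100) to wait up to roughly five seconds. The retry count and delay are arbitrary, which is part of why this approach feels like a hack.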
"When a drm driver is initialised we first allocate and initialise the
drm minor numbers including creating the sysfs files, then we trigger
the driver load method. The act of creating the sysfs files triggers the
uevent. This means udev may start programs which open /dev/dri/card0 and
other interfaces, this can occur before the load method has even started
and thus before the driver has fully initialised its data structures.
In the case of plymouthd this leads to it opening and closing (in disgust)
the interface, which in turn leads to a kernel panic as the mutexes are
yet to be initialised.
"This patch delays the linking up of the drm devices minor numbers until
the driver is fully initialised. As it is possible for consumers of
these interfaces to reach them before they are fully initialised we
arrange for opens of these devices to return EAGAIN until the device is
<jbarnes> so for 48894 I'd open a separate bug against drm for the core issue: if you access the device too early you get a crash
<jbarnes> there's a similar bug with accessing the dpms status files in sysfs
<jbarnes> if the module is unloading at the time, you can panic the kernel
<jbarnes> also a kernel bug
I'll move this bug to drm, as I think the core issue is what we're really looking for advice on here.
<jbarnes> ok looks like a core drm kernel bug
<jbarnes> we don't lock properly around initialization
-- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/drm/issues/8.