Forwarding this bug from Ubuntu reporter Florin:
http://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/966868

[Problem]
On Ubuntu there appears to be a race condition in libdrm during boot. It appears the i915 drm device exists but isn't fully initialized at the time plymouth wants to use it.

Note I'm filing this against -intel just because it's the intel portion of libdrm where the code is passing through; I think this is really a libdrm bug.

[Original Description]
After a force restart of Ubuntu, I've got a System Crash error after logging in.

lsb_release -rd
Description: Ubuntu precise (development branch)
Release: 12.04

This looks more like a libdrm bug. There's a race condition with the i915 device not being ready by the time plymouth is starting. Possibly it's because it doesn't have drm master.

<Sarvatt> apparently chromeos works around it with http://git.chromium.org/gitweb/?p=chromiumos/third_party/kernel.git;a=commit;h=32a8c5b67163a6ae211ff2683c999b6ad2c76d1f but thats just working around the problem..

googling intel/intel_bufmgr_gem.c:2783 turns up a lot of hits. The code in question with the assert is:

    if (IS_GEN2(bufmgr_gem->pci_device))
        bufmgr_gem->gen = 2;
    else if (IS_GEN3(bufmgr_gem->pci_device))
        bufmgr_gem->gen = 3;
    else if (IS_GEN4(bufmgr_gem->pci_device))
        bufmgr_gem->gen = 4;
    else if (IS_GEN5(bufmgr_gem->pci_device))
        bufmgr_gem->gen = 5;
    else if (IS_GEN6(bufmgr_gem->pci_device))
        bufmgr_gem->gen = 6;
    else if (IS_GEN7(bufmgr_gem->pci_device))
        bufmgr_gem->gen = 7;
    else
        assert(0);

$ xpci 8086:0126
snb-m-gt2+ (8086:0126) sandybridge

So it should be going into the IS_GEN6 branch.

Thanks!
ProblemType: Crash
DistroRelease: Ubuntu 12.04
Package: plymouth 0.8.2-2ubuntu28
ProcVersionSignature: Ubuntu 3.2.0-20.33-generic 3.2.12
Uname: Linux 3.2.0-20-generic x86_64
ApportVersion: 1.95-0ubuntu1
Architecture: amd64
Date: Wed Mar 28 09:33:16 2012
DefaultPlymouth: /lib/plymouth/themes/ubuntu-logo/ubuntu-logo.plymouth
ExecutablePath: /sbin/plymouthd
InstallationMedia: Ubuntu 12.04 LTS "Precise Pangolin" - Alpha amd64 (20120322)
MachineType: LENOVO 4284BZ4
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-20-generic root=UUID=f1bb4518-a890-49c0-9339-ecc3d8bd2658 ro quiet splash vt.handoff=7
ProcCmdline: /sbin/plymouthd --mode=boot --attach-to-session
ProcEnviron:
 TERM=linux
 PATH=(custom, no user)
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-20-generic root=UUID=f1bb4518-a890-49c0-9339-ecc3d8bd2658 ro quiet splash vt.handoff=7
Signal: 6
SourcePackage: plymouth
TextPlymouth: /lib/plymouth/themes/ubuntu-text/ubuntu-text.plymouth
Title: plymouthd crashed with SIGABRT in raise()
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:
dmi.bios.date: 01/19/2012
dmi.bios.vendor: LENOVO
dmi.bios.version: 8BET56WW (1.36 )
dmi.board.asset.tag: Not Available
dmi.board.name: 4284BZ4
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvr8BET56WW(1.36):bd01/19/2012:svnLENOVO:pn4284BZ4:pvrThinkPadW520:rvnLENOVO:rn4284BZ4:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 4284BZ4
dmi.product.version: ThinkPad W520
dmi.sys.vendor: LENOVO
Created attachment 60277 [details] BootDmesg.txt
Created attachment 60278 [details] CurrentDmesg.txt
Created attachment 60279 [details] Lspci.txt
Created attachment 60280 [details] ProcModules.txt
Created attachment 60281 [details] ProcModules.txt
Created attachment 60282 [details] ThreadStacktrace.txt
This is another bug that we think has the same root cause:
https://bugs.launchpad.net/ubuntu/+source/libdrm/+bug/982889

In this one, X comes up before the drm device is ready, and so trips on a different chunk of code. You can see this by comparing timestamps in Xorg.0.log and dmesg: when drm is accessed vs. when it reports itself ready.

We've got a couple of ideas on how to fix this in the distro. One is to put a loop around the code paths where the failures occur, retrying for some number of seconds. But that feels like a big hack.

The other idea would be an event indicating that the driver is ready for use, which we could listen for to delay plymouth, X, etc. until it's received. But we don't know how feasible that is.
We suspect this happens because of an Ubuntu kernel patch, which was added to work around other boot crashing problems:
http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-precise.git;a=commitdiff;h=6d74feca6235b463ade4ecddd1dfdb73d30a2ff7;hp=e29a4668d7441aa88d8015da51674a7e8159312b

"When a drm driver is initialised we first allocate and initialise the drm minor numbers including creating the sysfs files, then we trigger the driver load method. The act of creating the sysfs files triggers the uevent. This means udev may start programs which open /dev/dri/card0 and other interfaces, this can occur before the load method has even started and thus before the driver has fully initialised its data structures. In the case of plymouthd this leads to it opening and closing (in disgust) the interface, which in turn leads to a kernel panic as the mutexes are yet to be initialised.

"This patch delays the linking up of the drm devices minor numbers until the driver is fully initialised. As it is possible for consumers of these interfaces to reach them before they are fully initialised we arrange for opens of these devices to return EAGAIN until the device is fully initialised."
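For illustration only, here is a minimal sketch of what the userspace side of the "retry for some number of seconds" idea could look like, given that the patched kernel returns EAGAIN from open() until the device is initialised. The helper name open_retry_eagain, the 10 ms polling interval, and the timeout parameter are all assumptions made for this example; this is not code from plymouth, X, or libdrm.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of libdrm or plymouth): retry open()
 * for as long as it fails with EAGAIN, giving up after roughly
 * timeout_ms milliseconds. Returns the file descriptor on success, or
 * -1 with errno set by the last failed open().
 */
static int open_retry_eagain(const char *path, int flags, int timeout_ms)
{
    /* Poll every 10 ms; the interval is an arbitrary choice. */
    const struct timespec delay = { 0, 10 * 1000 * 1000 };
    int waited_ms = 0;

    for (;;) {
        int fd = open(path, flags);
        if (fd >= 0)
            return fd;

        /* Any error other than EAGAIN is treated as permanent. */
        if (errno != EAGAIN || waited_ms >= timeout_ms)
            return -1;

        nanosleep(&delay, NULL);
        waited_ms += 10;
    }
}
```

A caller might use something like open_retry_eagain("/dev/dri/card0", O_RDWR, 5000). Note that this only covers the open() path; it would not help a client that already holds a descriptor to a device that is still initialising, which is part of why it feels like a hack.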
<jbarnes> so for 48894 I'd open a separate bug against drm for the core issue: if you access the device too early you get a crash
<jbarnes> there's a similar bug with accessing the dpms status files in sysfs
<jbarnes> if the module is unloading at the time, you can panic the kernel
<jbarnes> also a kernel bug

I'll move this bug to drm, as I think the core issue is what we're really looking for advice on here.
<jbarnes> ok looks like a core drm kernel bug
<jbarnes> we don't lock properly around initialization
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/drm/issues/8.