Bug 91590 - plymouthd crashes on boot
Summary: plymouthd crashes on boot
Status: RESOLVED MOVED
Alias: None
Product: plymouth
Classification: Unclassified
Component: general (show other bugs)
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
: medium major
Assignee: Ray Strode [halfline]
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-08-09 15:19 UTC by Dāvis
Modified: 2018-08-07 09:25 UTC (History)
1 user (show)

See Also:
i915 platform:
i915 features:


Attachments
boot-splash: don't crash in free if module not loaded (3.92 KB, patch)
2015-08-10 14:15 UTC, Ray Strode [halfline]
Details | Splinter Review

Description Dāvis 2015-08-09 15:19:13 UTC
plymouthd crashes on boot, but sometimes it works on shutdown/reboot. Also some while ago (maybe different config, idk) it sometimes worked, but now lately not at all.
I've complied latest git version from master branch. Using proprietary nvidia drivers 352.30, with initramfs, Linux 4.1.4 and rEFInd UEFI bootloader, fb 0 EFI VGA

#0  0x00007f3002d3ca3c in ply_boot_splash_unset_keyboard (splash=splash@entry=0x112fd70) at ply-boot-splash.c:120
#1  0x00007f3002d3d195 in ply_boot_splash_free (splash=0x112fd70) at ply-boot-splash.c:404
#2  0x000000000040b1d5 in load_built_in_theme (state=<optimized out>) at main.c:1644
#3  show_theme (state=state@entry=0x7fffb2b4d630, theme_path=theme_path@entry=0x40f8a0 "/usr/share/plymouth/themes/default.plymouth") at main.c:1700
#4  0x000000000040b71d in show_default_splash (state=0x7fffb2b4d630) at main.c:475
#5  0x000000000040ce10 in show_default_splash (state=0x7fffb2b4d630) at main.c:946
#6  show_splash (state=0x7fffb2b4d630) at main.c:956
#7  0x00007f3002d33d4e in create_text_displays_for_terminal (terminal=0x112f630, manager=0x112f580) at ply-device-manager.c:640
#8  create_devices_for_terminal_and_renderer_type (manager=0x112f580, device_path=<optimized out>, terminal=0x112f630, renderer_type=<optimized out>)
    at ply-device-manager.c:692
#9  0x00007f3002f4ba86 in ply_event_loop_handle_met_status_for_source (status=PLY_EVENT_LOOP_FD_STATUS_HAS_DATA, source=0x112f4c0, loop=0x1128150)
    at ply-event-loop.c:1034
#10 ply_event_loop_process_pending_events (loop=loop@entry=0x1128150) at ply-event-loop.c:1279
#11 0x00007f3002f4bf38 in ply_event_loop_run (loop=0x1128150) at ply-event-loop.c:1310
#12 0x0000000000404fd6 in main (argc=<optimized out>, argv=<optimized out>) at main.c:2308


Dump of assembler code for function ply_boot_splash_unset_keyboard:
   0x00007f3002d3ca38 <+0>:     mov    rax,QWORD PTR [rdi+0x10]
=> 0x00007f3002d3ca3c <+4>:     mov    rax,QWORD PTR [rax+0x18]
   0x00007f3002d3ca40 <+8>:     test   rax,rax


Looks like splash->plugin_interface is NULL pointer

118     ply_boot_splash_unset_keyboard (ply_boot_splash_t *splash)
119     {
120             if (splash->plugin_interface->unset_keyboard == NULL)
121                     return;
122
123             splash->plugin_interface->unset_keyboard (splash->plugin, splash->keyboard);
124     }



from journal

kernel: vgaarb: setting as boot device: PCI:0000:01:00.0
kernel: vgaarb: device added: PCI:0000:01:00.0,decodes=io+mem,owns=io+mem,locks=none
kernel: vgaarb: loaded
kernel: vgaarb: bridge control possible 0000:01:00.0
kernel: efifb: probing for efifb
kernel: efifb: framebuffer at 0xdf000000, mapped to 0xffffc90001000000, using 3072k, total 3072k
kernel: efifb: mode is 1024x768x32, linelength=4096, pages=1
kernel: efifb: scrolling: redraw
kernel: efifb: Truecolor: size=8:8:8:8, shift=24:16:8:0
kernel: Console: switching to colour frame buffer device 128x48
kernel: fb0: EFI VGA frame buffer device
kernel: GHES: HEST is not enabled!
kernel: Linux agpgart interface v0.103
systemd[1]: Started udev Coldplug all Devices.
systemd[1]: Starting Show Plymouth Boot Screen...
systemd-journal[75]: Journal started
plymouthd[114]: [main.c:1891]                       check_verbosity:redirecting debug output to /dev/tty1
systemd[1]: Reached target System Initialization.
systemd[1]: Started Journal Service.
plymouth[120]: [ply-event-loop.c:759]               ply_event_loop_stop_watching_fd:stopping watching fd 6
plymouth[120]: [ply-event-loop.c:775]               ply_event_loop_stop_watching_fd:removing destination for fd 6
kernel: serio: i8042 KBD port at 0x60,0x64 irq 1
plymouth[120]: [ply-event-loop.c:759]               ply_event_loop_stop_watching_fd:stopping watching fd 6
plymouth[120]: [ply-event-loop.c:775]               ply_event_loop_stop_watching_fd:removing destination for fd 6
plymouth[120]: [./ply-boot-client.c:795]            ply_boot_client_detach_from_event_loop:detaching from event loop
kernel: nvidia: module license 'NVIDIA' taints kernel.
systemd[1]: Started Show Plymouth Boot Screen.
systemd[1]: Reached target Basic System.
kernel: vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=none:owns=io+mem
kernel: [drm] Initialized nvidia-drm 0.0.0 20150116 for 0000:01:00.0 on minor 0
kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module  352.30  Tue Jul 21 18:53:45 PDT 2015
plymouth[731]: [ply-event-loop.c:759]               ply_event_loop_stop_watching_fd:stopping watching fd 6
plymouth[731]: [ply-event-loop.c:775]               ply_event_loop_stop_watching_fd:removing destination for fd 6
plymouth[731]: [ply-event-loop.c:1060]              ply_event_loop_handle_disconnect_for_source:calling disconnected_handler 0x403310 for fd 6
plymouth[731]: [ply-event-loop.c:759]               ply_event_loop_stop_watching_fd:stopping watching fd 6
plymouth[731]: [ply-event-loop.c:768]               ply_event_loop_stop_watching_fd:source for fd 6 is already disconnected
plymouth[731]: error: unexpectedly disconnected from boot status daemon
plymouth[731]: [./plymouth.c:414]                                 on_disconnect:disconnect
plymouth[731]: [ply-event-loop.c:1064]   ply_event_loop_handle_disconnect_for_source:done calling disconnected_handler 0x403310 for fd
plymouth[731]: [ply-event-loop.c:1144]   ply_event_loop_free_destinations_for_source:freeing destination (0, (nil), 0x403310) of fd 6
plymouth[731]: [ply-event-loop.c:1144]   ply_event_loop_free_destinations_for_source:freeing destination (1, 0x402b80, (nil)) of fd 6
systemd[1]: plymouth-start.service: Main process exited, code=dumped, status=11/SEGV
systemd[1]: plymouth-start.service: Unit entered failed state.
systemd[1]: plymouth-start.service: Failed with result 'core-dump'.
systemd[1]: Started Tell Plymouth To Write Out Runtime Data.
plymouth[731]: [./ply-boot-client.c:795]        ply_boot_client_detach_from_event_loop:detaching from event loop
systemd[1]: Reached target System Initialization.
systemd-coredump[799]: Process 117 (plymouthd) of user 0 dumped core.
systemd[1]: Started NVIDIA Persistence Daemon.
systemd[1]: Starting Hold until boot process finishes up...
systemd[1]: Starting Terminate Plymouth Boot Screen...
plymouth[824]: [./ply-boot-client.c:183]                       ply_boot_client_connect:could not connect to /org/freedesktop/plymouthd: Connection refused
plymouth[824]: [./ply-boot-client.c:184]                       ply_boot_client_connect:trying old fallback path /ply-boot-protocol
plymouth[824]: [./ply-boot-client.c:190]                       ply_boot_client_connect:could not connect to /ply-boot-protocol: Connection refused
plymouth[824]: [./plymouth.c:1121]                                          main:daemon not running
plymouth[824]: [./plymouth.c:1132]                                          main:no need to wait
systemd[1]: Started Hold until boot process finishes up.
plymouth[825]: [./ply-boot-client.c:183]                       ply_boot_client_connect:could not connect to /org/freedesktop/plymouthd: Connection refused
plymouth[825]: [./ply-boot-client.c:184]                       ply_boot_client_connect:trying old fallback path /ply-boot-protocol
plymouth[825]: [./ply-boot-client.c:190]                       ply_boot_client_connect:could not connect to /ply-boot-protocol: Connection refused
plymouth[825]: [./plymouth.c:1121]                                          main:daemon not running
plymouth[825]: [./ply-boot-client.c:795]        ply_boot_client_detach_from_event_loop:detaching from event loop
systemd[1]: Started Terminate Plymouth Boot Screen.
systemd[1]: Started Simple Desktop Display Manager.
systemd[1]: Started Login Service.
systemd-logind[822]: New seat seat0.
Comment 1 Ray Strode [halfline] 2015-08-10 14:15:23 UTC
Created attachment 117613 [details] [review]
boot-splash: don't crash in free if module not loaded

ply_boot_splash_free currently calls some code that depends
on a module being loaded.  We call ply_boot_splash_free to
clean up the boot splash object if a module can't be loaded,
leading to crash.

This commit addresses that issue by only calling the module
specific destruction code in ply_boot_splash_free in the case
where a module is loaded.
Comment 2 Ray Strode [halfline] 2015-08-10 14:18:56 UTC
So I pushed this, but there's more to the story. If you're seeing
this crash, then you can't load the built in plugin.  That should
never fail since it's ... built in.  Maybe somehow plymouthd isn't
getting built -rdynamic ?

Attachment 117613 [details] pushed as f806843 - boot-splash: don't crash in free if module not loaded
Comment 3 Dāvis 2015-08-10 17:14:01 UTC
(In reply to Ray Strode [halfline] from comment #2)
> So I pushed this, but there's more to the story. If you're seeing
> this crash, then you can't load the built in plugin.  That should
> never fail since it's ... built in.  Maybe somehow plymouthd isn't
> getting built -rdynamic ?
> 
> Attachment 117613 [details] pushed as f806843 - boot-splash: don't crash in
> free if module not loaded

Thanks for quick fix, it improved situation that atleast now it doesn't crash, only it shows this http://i.imgur.com/GRkzyak.jpg splash instead of my image. Now same on shutdown too, previously sometimes on shutdown it would show my image.

I'm compiling with GCC 5.2.0 and

CPPFLAGS="-D_FORTIFY_SOURCE=2"
CFLAGS=-march="native -O2 -pipe -fstack-protector-strong --param=ssp-buffer-size=4 -g -fvar-tracking-assignments -Wall -Wstrict-aliasing=2"
CXXFLAGS="-march=native -O2 -pipe -fstack-protector-strong --param=ssp-buffer-size=4 -g -fvar-tracking-assignments -Wall -Wstrict-aliasing=2"
LDFLAGS="-Wl,-O1,--sort-common,--as-needed,-z,relro"


./autogen.sh \
    --prefix=/usr \
    --exec-prefix=/usr \
    --sysconfdir=/etc \
    --localstatedir=/var \
    --libdir=/usr/lib \
    --libexecdir=/usr/lib \
    --sbindir=/usr/bin \
    --enable-systemd-integration \
    --enable-drm \
    --enable-tracing \
    --enable-pango \
    --enable-gtk=no \
    --with-release-file=/etc/os-release \
    --with-logo=/usr/share/plymouth/arch-logo.png \
    --with-background-color=0x000000 \
    --with-background-start-color-stop=0x000000 \
    --with-background-end-color-stop=0x4D4D4D \
    --without-rhgb-compat-link \
    --without-system-root-install

make


I tried without any optimizations and even explicitly adding -rdynamic to CFLAGS/CXXFLAGS but that didn't change anything.
Maybe in initramfs I don't have something needed?
Here's a file list of what I've there https://paste.kde.org/p8r6mi0cs

/etc/plymouth/plymouthd.conf contains
[Daemon]
Theme=arch-glow
ShowDelay=9


and see /var/log/plymouth-debug.log at https://paste.kde.org/pquejnnsr


journal

plymouthd[110]: [main.c:1891]                               check_verbosity:redirecting debug output to /dev/tty1
plymouth[114]: [ply-event-loop.c:759]               ply_event_loop_stop_watching_fd:stopping watching fd 6
plymouth[114]: [ply-event-loop.c:775]               ply_event_loop_stop_watching_fd:removing destination for fd 6
plymouth[114]: [ply-event-loop.c:759]               ply_event_loop_stop_watching_fd:stopping watching fd 6
plymouth[114]: [ply-event-loop.c:775]               ply_event_loop_stop_watching_fd:removing destination for fd 6
plymouth[114]: [./ply-boot-client.c:795]        ply_boot_client_detach_from_event_loop:detaching from event loop
plymouth[674]: [ply-event-loop.c:759]               ply_event_loop_stop_watching_fd:stopping watching fd 6
plymouth[674]: [ply-event-loop.c:775]               ply_event_loop_stop_watching_fd:removing destination for fd 6
plymouth[674]: [ply-event-loop.c:759]               ply_event_loop_stop_watching_fd:stopping watching fd 6
plymouth[674]: [ply-event-loop.c:775]               ply_event_loop_stop_watching_fd:removing destination for fd 6
plymouth[674]: [./ply-boot-client.c:795]        ply_boot_client_detach_from_event_loop:detaching from event loop
plymouth[1169]: [ply-event-loop.c:759]               ply_event_loop_stop_watching_fd:stopping watching fd 6
plymouth[1169]: [ply-event-loop.c:775]               ply_event_loop_stop_watching_fd:removing destination for fd 6
systemd[1]: Received SIGRTMIN+21 from PID 113 (plymouthd).
plymouth[1169]: [ply-event-loop.c:759]               ply_event_loop_stop_watching_fd:stopping watching fd 6
plymouth[1169]: [ply-event-loop.c:775]               ply_event_loop_stop_watching_fd:removing destination for fd 6
plymouth[1169]: [./ply-boot-client.c:795]        ply_boot_client_detach_from_event_loop:detaching from event loop
plymouth[1170]: [ply-event-loop.c:1060]   ply_event_loop_handle_disconnect_for_source:calling disconnected_handler 0x403310 for fd 6
plymouth[1170]: [./plymouth.c:414]                                 on_disconnect:disconnect
plymouth[1170]: [ply-event-loop.c:1064]   ply_event_loop_handle_disconnect_for_source:done calling disconnected_handler 0x403310 for fd 6
plymouth[1170]: [ply-event-loop.c:1144]   ply_event_loop_free_destinations_for_source:freeing destination (0, (nil), 0x403310) of fd 6
plymouth[1170]: [./ply-boot-client.c:795]        ply_boot_client_detach_from_event_loop:detaching from event loop
plymouthd[19722]: [main.c:1891]                               check_verbosity:redirecting debug output to /dev/tty63
plymouth[19728]: [ply-event-loop.c:759]               ply_event_loop_stop_watching_fd:stopping watching fd 6
plymouth[19728]: [ply-event-loop.c:775]               ply_event_loop_stop_watching_fd:removing destination for fd 6
plymouth[19728]: [ply-event-loop.c:759]               ply_event_loop_stop_watching_fd:stopping watching fd 6
plymouth[19728]: [ply-event-loop.c:775]               ply_event_loop_stop_watching_fd:removing destination for fd 6
plymouth[19728]: [./ply-boot-client.c:795]        ply_boot_client_detach_from_event_loop:detaching from event loop



Thanks. I'm really interested to getting this to work so I would appreciate any tips on how to debug it further.
Comment 4 Ray Strode [halfline] 2015-08-10 18:00:14 UTC
You have a splash delay of 9 which means it won't try to show the splash for 9 seconds.  shutdown probably takes less than 9 seconds in most cases. Try lowering the splash delay and see if that makes shutdown more reliable.

Now onto the boot log:
[./plugin.c:1003]                                  query_device:Could not get card resources
^-- That's weird. you have drm device but you can't query it.  what sort of graphics hardware do you have?  could be a driver bug there.  I'd suggest adding drm.debug=7 on the kernel command line and see if that helps make things clearer



[ply-device-manager.c:748]                      create_devices_from_udev:Looking for devices from udev
[ply-device-manager.c:273]                  create_devices_for_subsystem:creating objects for drm devices
[ply-device-manager.c:273]                  create_devices_for_subsystem:creating objects for frame buffer devices
[ply-device-manager.c:756]                      create_devices_from_udev:Creating non-graphical devices, since there's no suitable graphics hardware
^-- so despite the devices showing up in udev before, they're not later, after coldplug completes. Is udev crashing or getting killed ? is the graphics driver getting unloaded?
Comment 5 Ray Strode [halfline] 2015-08-10 18:01:19 UTC
oh i just realized you're using the proprietary nvidia drivers.  The proprietary nvidia drivers don't support kernel modesetting. I'm a little confused how it could be working under any circumstance for you.
Comment 6 Dāvis 2015-08-14 04:45:06 UTC
(In reply to Ray Strode [halfline] from comment #4)
> You have a splash delay of 9 which means it won't try to show the splash for
> 9 seconds.  shutdown probably takes less than 9 seconds in most cases. Try
> lowering the splash delay and see if that makes shutdown more reliable.
 
Indeed, I set it to 0 and now it always works on reboot/shutdown, only it first
starts in text mode (3 triangles) and then switches to correct
colourful theme http://i.imgur.com/7oqIdTw.jpg

I didn't find anywhere what exactly means ShowDelay and so I assumed it
means extra delay for how long to show splash. That's why I increased it
hoping I would see it for longer.

> Now onto the boot log:
> [./plugin.c:1003]                                  query_device:Could not
> get card resources
> ^-- That's weird. you have drm device but you can't query it.  what sort of
> graphics hardware do you have?  could be a driver bug there.  I'd suggest
> adding drm.debug=7 on the kernel command line and see if that helps make
> things clearer

I've Nvidia GTX 650 Ti with proprietary nvidia drivers 352.30
Here's journal with drm.debug https://paste.kde.org/ps1ko65wn

> [ply-device-manager.c:748]                     
> create_devices_from_udev:Looking for devices from udev
> [ply-device-manager.c:273]                 
> create_devices_for_subsystem:creating objects for drm devices
> [ply-device-manager.c:273]                 
> create_devices_for_subsystem:creating objects for frame buffer devices
> [ply-device-manager.c:756]                     
> create_devices_from_udev:Creating non-graphical devices, since there's no
> suitable graphics hardware
> ^-- so despite the devices showing up in udev before, they're not later,
> after coldplug completes. Is udev crashing or getting killed ? is the
> graphics driver getting unloaded?

I don't think, I don't have any crash reports, no erorrs in journal.

(In reply to Ray Strode [halfline] from comment #5)
> oh i just realized you're using the proprietary nvidia drivers.  The
> proprietary nvidia drivers don't support kernel modesetting. I'm a little
> confused how it could be working under any circumstance for you.

I actually figured out what's going on. When rebooting/shutting down then
nvidia driver is unloaded and then Plymouth uses framebuffer to display
splash but before nvidia is unloaded it uses that text splash.

There's still some kind of bug that prevents Plymouth from using
framebuffer when starting up even when no driver is loaded, I tested that
if I remove nvidia/nouveau drivers and just leave (U)EFI framebuffer
then Plymouth doesn't display anything on boot (not even that text mode like
it does with nvidia driver) but works correctly on reboot/shutdown. Here's a
plymouth debug log https://paste.kde.org/pwrpkudnh
That (U)EFI framebuffer (/dev/fb0) works fine I tested with fbv that can
disaply images. Also I'm using rEFInd boootloader which displays pretty
boot menu http://i.imgur.com/XjTvjQF.jpg

From kernel log can see that everything is good

kernel: Console: colour dummy device 80x25
kernel: console [tty0] enabled
kernel: vgaarb: setting as boot device: PCI:0000:01:00.0
kernel: vgaarb: device added: PCI:0000:01:00.0,decodes=io+mem,owns=io+mem,locks=none
kernel: vgaarb: loaded
kernel: vgaarb: bridge control possible 0000:01:00.0
kernel: efifb: probing for efifb
kernel: efifb: framebuffer at 0xdf000000, mapped to 0xffffc90001000000, using 3072k, total 3072k
kernel: efifb: mode is 1024x768x32, linelength=4096, pages=1
kernel: efifb: scrolling: redraw
kernel: efifb: Truecolor: size=8:8:8:8, shift=24:16:8:0
kernel: Console: switching to colour frame buffer device 128x48
kernel: fb0: EFI VGA frame buffer device

Only for some reason kernel switches to text console even when rEFInd
quits and starts kernel in (U)EFI graphic mode. Any idea how to make it
stay in that so that Plymouth could directly use it and wouldn't have
any mode switches?

Also by the way I managed to get another crash, but this happend only
few times and not sure what caused it.

#0  0x0000000000000000 in ?? ()
#1  0x00007f9990d7d47d in ply_keyboard_new_for_renderer (renderer=renderer@entry=0xea0720) at ply-keyboard.c:139
#2  0x00007f9990d7ac24 in create_devices_for_terminal_and_renderer_type (manager=0xea04d0, device_path=0xea0240 "/dev/tty63",
    terminal=0xea0580, renderer_type=PLY_RENDERER_TYPE_AUTO) at ply-device-manager.c:675
#3  0x0000000000404f98 in load_devices (flags=PLY_DEVICE_MANAGER_FLAGS_IGNORE_UDEV, state=0x7fffbde81f50) at main.c:1073
#4  main (argc=<optimized out>, argv=<optimized out>) at main.c:2305


Anyway I worked around this plymouth bug that it doesn't work on boot
for (U)EFI console. I modified kernel a bit so that efifb is
a module instead of builtin. Then I wrote this switch-modules.sh
script https://gist.github.com/davispuh/84674924dff1db3e7844 and 
switch-nvidia.service https://gist.github.com/davispuh/0eba8609e333b82114d0

With them I boot using nouveau and see nice splash :) then unload nouveau
and load efifb and nvidia driver. And on reboot/shutdown unload nvidia.
This way now I got Plymouth fully working even when using nvidia
drivers :) It's not ideal because there are several mode switches
firstly from bootloader to kernel and then from Plymouth to display
manager. How does Plymouth determine when it should quit and system
is booted? I think it quits way too early for me and I've quite
long time black screen after it finishes before display manager
starts.

I played a bit with nouveau/efifb/nvidia module loading/unloading
and it actually works fine. When booted with efifb I've 1024x768
console reslution, then when loading nouveau it switches to native
resolution 2560x1440 and then when unloading noueveau and loading
efifb, console switches back to 1024x768 only it's all upscaled to
2560x1440 so you can see it very pixelated, that's because when
unloading nouveau it doesn't change GPU resolution back. I haven't
found how to make nouevau change GPU resolution so that before
unloading it I could set it back. But then if I unload nouevau
and load efifb and nvidia I get correct GPU resolution back again
and perfectly working console. Also there's some bug in noueveau
that if I load nvidia, unload nvidia and load nouveau then
it crashes and screen gets garbled.

nouveau E[   PDISP][0000:01:00.0][0xc000917e] fini: 0x40020088
Comment 7 GitLab Migration User 2018-08-07 09:25:49 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/plymouth/plymouth/issues/21.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.