Bug 76886

Summary: Journald can cause an endless loop
Product: systemd Reporter: Bartlomiej Gluch <bartlomiej.gluch.ext>
Component: generalAssignee: systemd-bugs
Status: RESOLVED NOTOURBUG QA Contact: systemd-bugs
Severity: normal    
Priority: medium    
Version: unspecified   
Hardware: ARM   
OS: Linux (All)   
Whiteboard:
i915 platform: i915 features:

Description Bartlomiej Gluch 2014-04-01 07:43:09 UTC
Systemd version 211
Linux kernel 3.13.3
ARMv7 based embedded system

It is possible o create an endless loop, when I2C device or power supply exists, but only with one of following kernel configs enabled:
CONFIG_POWER_SUPPLY_DEBUG
CONFIG_I2C_DEBUG_CORE

Both of this flags enables debug printout of function dev_dbg() in kernel in following files:
drivers/power/power_supply_sysfs.c (line265, function power_supply_uevent())
drivers/i2c/i2c-core.c (line 102, function i2c_device_uevent())

dev_dbg creates an syslog entry - which triggers journald to read uevent file for this device, which than again dev_dbg() is called causing journald to read uevent file - so there is an endless loop (technically, kernel will kill journald after some time)


Situation with CONFIG_POWER_SUPPLY_DEBUG is not observed in my system, but it is described here:
http://forums.gentoo.org/viewtopic-t-977530-postdays-0-postorder-asc-start-0.html

The reasons are the same as for I2C, uevent function 

Result of the bug:

If the kernel is started with "debug" in the bootargs, as soon as the udev starts there is an loop of debug outputs, making system unusable, after 10-12 minutes journald is killed.

If the kernel is started with loglevel=0 system is usable, but logs are filled with debug output.

Dump stack in uevent function shows that each time uevent was triggered by SySFS read/write:

[<c0422f94>] (unwind_backtrace+0x0/0xf8) from [<c041edac>] (show_stack+0x10/0x14)
[<c041edac>] (show_stack+0x10/0x14) from [<c09151b4>] (dump_stack+0x74/0x90)
[<c09151b4>] (dump_stack+0x74/0x90) from [<c07e0760>] (i2c_sysfs_new_device+0xf8/0x20c)
[<c07e0760>] (i2c_sysfs_new_device+0xf8/0x20c) from [<c06f2df4>] (dev_attr_store+0x18/0x24)
[<c06f2df4>] (dev_attr_store+0x18/0x24) from [<c0536d54>] (sysfs_write_file+0x14c/0x1dc)
[<c0536d54>] (sysfs_write_file+0x14c/0x1dc) from [<c04df208>] (vfs_write+0xa8/0x1b0)
[<c04df208>] (vfs_write+0xa8/0x1b0) from [<c04df638>] (SyS_write+0x3c/0x70)
[<c04df638>] (SyS_write+0x3c/0x70) from [<c041b560>] (ret_fast_syscall+0x0/0x30)


lsof -r1 | grep /sys/devices/2010085000.i2c/i2c-1/1-0054/uevent

systemd-j 146           root   12r      REG       0,13     4096       6833 /sys/devices/2010085000.i2c/i2c-1/1-0054/uevent


Bug can be workarounded in kernel by changing dev_dbg() to pr_debug() in uevent functions.
Comment 1 Kay Sievers 2014-04-02 10:39:06 UTC
(In reply to comment #0)
> dev_dbg creates an syslog entry - which triggers journald to read uevent
> file for this device, which than again dev_dbg() is called causing journald
> to read uevent file - so there is an endless loop (technically, kernel will
> kill journald after some time)

This sounds like a bug in the kernel.

Most important, the conceptually broken idea of the power_supply class
to transport *measurement data*, *user access* or other unrelated things
over uevents should really be fixed, it cannot fly.

Uevents were never made for such a use case, uevents are very expensive
events for global device state changes regarding the device and its
*integration* into system management; there must never be any uevents
generated for normal expected device operations like device access or
charge level changes.

Please try to get the specific kernel driver fixed, or even the
power_supply class itself fixed. What userspace is doing here sounds totally
fine and nothing should need to work around such kernel brokeness. This
kernel behaviour makes no sense for userspace.
Comment 2 Abylay Ospan 2015-05-25 01:57:43 UTC
I observe same infinite loop. systemd-journald read following:
p=7,5120,282337378,-;power_supply AC: prop ONLINE=1
 SUBSYSTEM=power_supply
 DEVICE=+power_supply:AC

and then read '/sys/class/power_supply/AC/uevent' again which cause above message again => infinite loop (100% CPU usage by systemd-journald).
If this kernel bug do we need to send bug report/patch to kernel ?

For now I have disabled 'CONFIG_POWER_SUPPLY_DEBUG'. For me this 'fix' is ok (I don't need this debug).
 


Here is backtrace:

/dev/kmsg buffer overrun, some messages lost.

Breakpoint 1, device_read_uevent_file (device=0x5555555caf20) at src/libsystemd/sd-device/sd-device.c:482
482     in src/libsystemd/sd-device/sd-device.c
(gdb) bt
#0  device_read_uevent_file (device=0x5555555caf20) at src/libsystemd/sd-device/sd-device.c:482
#1  0x0000555555587368 in sd_device_get_devname (device=0x5555555caf20, devname=0x7fffffffb8d0) at src/libsystemd/sd-device/sd-device.c:933
#2  0x0000555555581b14 in udev_device_get_devnode (udev_device=0x5555555cacb0) at src/libudev/libudev-device.c:677
#3  0x000055555555be2d in dev_kmsg_record (s=0x7fffffffe450, p=0x7fffffffc2c3 "power_supply AC: prop ONLINE=1", l=0) at src/journal/journald-kmsg.c:228
#4  0x000055555555c834 in server_read_dev_kmsg (s=0x7fffffffe450) at src/journal/journald-kmsg.c:348
#5  0x000055555555cb13 in dispatch_dev_kmsg (es=0x5555555c96b0, fd=7, revents=1, userdata=0x7fffffffe450) at src/journal/journald-kmsg.c:390
#6  0x000055555558d0c1 in source_dispatch (s=0x5555555c96b0) at src/libsystemd/sd-event/sd-event.c:2114
#7  0x000055555558e4f8 in sd_event_dispatch (e=0x5555555c91e0) at src/libsystemd/sd-event/sd-event.c:2471
#8  0x000055555558e670 in sd_event_run (e=0x5555555c91e0, timeout=18446744073709551615) at src/libsystemd/sd-event/sd-event.c:2499
#9  0x000055555555ae0d in main (argc=1, argv=0x7fffffffe768) at src/journal/journald.c:106
(gdb) frame 3
#3  0x000055555555be2d in dev_kmsg_record (s=0x7fffffffe450, p=0x7fffffffc2c3 "power_supply AC: prop ONLINE=1", l=0) at src/journal/journald-kmsg.c:228
228     src/journal/journald-kmsg.c: No such file or directory.
(gdb) p kernel_device
$1 = 0x5555555ca5ef "+power_supply:AC"

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.