When I have a statement like this in a C program if (some error) { syslog(LOG_ERR | LOG_USER,"some error description"\n); exit(EXIT_FAILURE); } the error description will not be found in the journal DB. Even a closelog()-call before exit() doesn't help. The only way to get the description recorded is a code snippet like this: if (some error) { syslog(LOG_ERR | LOG_USER,"some error description"\n); sleep(1); exit(EXIT_FAILURE); } When using syslog-ng instead if (some error) { syslog(LOG_ERR | LOG_USER,"some error description"\n); exit(EXIT_FAILURE); } records the error message without any problems in /var/log/user.log I'm using Arch Linux x86-64 Kernel 3.11.1-1 Systemd 207
Are you sure that those messages are not logged at all? There's a race condition in which messages which are processed by journald after the logging process has exited are not correctly attributed, and don't have the information necessary to display them properly with journalctl -u and similar.
Yes I think so. I wrote a small test program. When I write the syslog() without an delay before the exit of the program, the message will not be logged. With an sleep(1); statement before the program exit, the message is logged. I watched the journal with $ journal -f while running the test-program. My simple test program looks like this: #include <stdio.h> #include <stdlib.h> #include <syslog.h> int main() { FILE *pf; /* logging made in file /var/log/syslog */ openlog("martins",LOG_CONS,LOG_USER); pf = fopen("not_here","r"); if (!pf) syslog(LOG_CRIT | LOG_USER,"oops -- %m\n"); //sleep(1); //closelog(); //int i; //for (i=0;i<1000000;i++) ; exit(EXIT_FAILURE); return 0; } As you can see, I also experimented with some alternatives of the sleep(1), etc.
Can you run strace on the systemd-journal daemon? This will tell us where the messages are getting lost: sudo strace -p $(pidof systemd-journald) -s999
OK, i've run the strace command. Here is the output: [martin@higgs ~]$ sudo strace -p $(pidof systemd-journald) -s999 Process 119 attached epoll_wait(7, {{EPOLLIN, {u32=5, u64=5}}}, 1, -1) = 1 ioctl(5, FIONREAD, [63]) = 0 recvmsg(5, {msg_name(0)=NULL, msg_iov(1)=[{"<10>Sep 29 21:24:58 martins: oops -- No such file or directory\n", 2320}], msg_controllen=64, {cmsg_len=32, cmsg_level=SOL_SOCKET, cmsg_type=0x1d /* SCM_??? */, ...}, msg_flags=MSG_CMSG_CLOEXEC}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = 63 sendmsg(5, {msg_name(29)={sa_family=AF_LOCAL, sun_path="/run/systemd/journal/syslog"}, msg_iov(1)=[{"<10>Sep 29 21:24:58 martins: oops -- No such file or directory", 62}], msg_controllen=28, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=3254, uid=1000, gid=1000}}, msg_flags=0}, MSG_NOSIGNAL) = -1 ESRCH (No such process) sendmsg(5, {msg_name(29)={sa_family=AF_LOCAL, sun_path="/run/systemd/journal/syslog"}, msg_iov(1)=[{"<10>Sep 29 21:24:58 martins: oops -- No such file or directory", 62}], msg_controllen=28, {cmsg_len=28, cmsg_level=SOL_SOCKET, cmsg_type=SCM_CREDENTIALS{pid=119, uid=1000, gid=1000}}, msg_flags=0}, MSG_NOSIGNAL) = -1 ENOENT (No such file or directory) open("/proc/1/cgroup", O_RDONLY|O_CLOEXEC) = 23 fstat(23, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0e14b14000 read(23, "9:blkio:/\n8:net_cls:/\n7:freezer:/\n6:devices:/\n5:memory:/\n4:cpuacct,cpu:/\n3:cpuset:/\n2:name=systemd:/system.slice\n", 1024) = 113 close(23) = 0 munmap(0x7f0e14b14000, 4096) = 0 open("/proc/3254/cgroup", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) open("/proc/3254/comm", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) readlink("/proc/3254/exe", 0xea6ba0, 99) = -1 ENOENT (No such file or directory) open("/proc/3254/cmdline", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) open("/proc/3254/status", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) open("/proc/1/cgroup", O_RDONLY|O_CLOEXEC) = 23 fstat(23, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0e14b14000 read(23, "9:blkio:/\n8:net_cls:/\n7:freezer:/\n6:devices:/\n5:memory:/\n4:cpuacct,cpu:/\n3:cpuset:/\n2:name=systemd:/system.slice\n", 1024) = 113 close(23) = 0 munmap(0x7f0e14b14000, 4096) = 0 open("/proc/3254/cgroup", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) uname({sys="Linux", node="higgs", ...}) = 0 ftruncate(14, 18391040) = 0 ioctl(5, FIONREAD, [0]) = 0 recvmsg(5, 0x7fff74e14350, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable) epoll_wait(7, ^CProcess 119 detached <detached ...> [martin@higgs ~]$
Hmm, your test program works just fine here: Sep 30 00:25:40 lon martins[2196]: oops -- No such file or directory
This is very interesting Kay, May you state which version of systemd you are using, which distro and the hardware you are running? I guess it's a timing problem what I'm facing to. Note the 'No such process' message in my trace output, I posted yesterday. It seems my application has gone too early for writing the output to the journal properly. Or it might also be a configuration problen, but I did nothing special with systemd on my machine, just the installation defaults of Arch Linux. Oh, I can reproduce the problem on two machines, both with Arch Linux installed: - A Lenovo X 1 with an Intel Core i5 with Arch 64Bits - An old Lenovo X 61 with an Intel Core 2 Duo with Arch 32Bits. I hope this information help to find the problem.
(In reply to comment #6) > May you state which version of systemd you are using, which distro and the > hardware you are running? Kay's running systemd from git, and the hardware is the same, although it is unlikely to be relevant. > I guess it's a timing problem what I'm facing to. > Note the 'No such process' message in my trace output, I posted yesterday. > It seems my application has gone too early for writing the output to the > journal properly. Yes. This means that the ancillary information like cgroup, commandline, etc., cannot be attached to the message. This is a known problem and doesn't explain why the message gets lost. > Or it might also be a configuration problen, but I did nothing special with > systemd on my machine, just the installation defaults of Arch Linux. > > Oh, I can reproduce the problem on two machines, both with Arch Linux > installed: > - A Lenovo X 1 with an Intel Core i5 with Arch 64Bits > - An old Lenovo X 61 with an Intel Core 2 Duo with Arch 32Bits. Looking at the strace, journald gets to write_to_journal() correctly, and even calls journal_file_append_entry(). I think so, because ftruncate() can be only called from journal_file_post_change(), and journal_file_append_entry() is in turn the only place where that can be called from. Normally j_file_append_entry() would generate some fallocate calls: open("/proc/28702/cgroup", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) uname({sys="Linux", node="spora", ...}) = 0 fallocate(25, 0, 3817472, 4096) = 0 fstat(25, {st_mode=S_IFREG|0640, st_size=3821568, ...}) = 0 mmap(NULL, 3821568, PROT_READ|PROT_WRITE, MAP_SHARED, 25, 0) = 0x7f271a1f6000 ftruncate(25, 3821568) = 0 In your strace, we go directly to ftruncate (which is used to tell listeners that new data is available in the journal). I'm not sure what is going on here. I'll try to get arch installed and test if I can reproduce this, but it's going to be at least a few days. It would be great if you could attach gdb to journald, and set a breakpoint in write_to_journal() and see what happens there. > I hope this information help to find the problem.
First of all, you're right. My syslog messages are really logged, but I didn't see it. The problem was: even, I was member of the group 'adm' on my system, I couldn't see the messages with the race condition. The reason was that the Arch guys apparently changed the corresponding group from 'adm' to 'systemd-journal' last April. I didn't recognize that. Now, as I'm member of the right group again, I can also see the race condition's messages. I'm sorry that I didn't found this earlier. As you mentioned that the race condition is an already known issue, I would mark this bug ticket as a duplicate, but honestly, I didn't find the appropriate ticket.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.