Bug 55213 - Early boot vsnprintf/vasprintf segfault
Summary: Early boot vsnprintf/vasprintf segfault
Status: RESOLVED FIXED
Alias: None
Product: systemd
Classification: Unclassified
Component: general (show other bugs)
Version: unspecified
Hardware: x86 (IA32) Linux (All)
: high blocker
Assignee: systemd-bugs
QA Contact: systemd-bugs
URL:
Whiteboard:
Keywords:
: 55231 (view as bug list)
Depends on:
Blocks:
 
Reported: 2012-09-22 07:28 UTC by jpsinthemix
Modified: 2012-09-25 10:40 UTC (History)
6 users (show)

See Also:
i915 platform:
i915 features:


Attachments
Working (but ugly) fix of early boot vsnprintf/vasprintf segfault (20.97 KB, text/plain)
2012-09-22 07:28 UTC, jpsinthemix
Details
Working (less ugly) fix of early boot vsnprintf/vasprintf segfault (4.75 KB, text/plain)
2012-09-24 11:22 UTC, jpsinthemix
Details

Description jpsinthemix 2012-09-22 07:28:15 UTC
Created attachment 67533 [details]
Working (but ugly) fix of early boot vsnprintf/vasprintf segfault

Hi,

Beginning in post-systemd-189 git, and still present in systemd-191, there is a nasty problem in function log_struct_internal() (src/shared/log.c), which renders my PC unbootable, giving an early boot systemd[1] segfault in /lib/libc-2.16.90.so, and 'Freezing execution.'

My system: i686-pc-linux-gnu, gcc-4.7.1, glibc-2.16.0, linux-3.6.0-rc6, systemd-191 (all software built from source).

The segfault is due to a misuse of the vsnprintf() and vasprintf() functions. Specifically, one must not use va_arg() after calls to vsnprintf() or vasprintf() as these function/macros internally use va_arg() and may modify it undefined ways. In particular, from the vsnprintf manpage:

The functions vprintf(), vfprintf(), vsprintf(), vsnprintf() are equivalent to the functions printf(), fprintf(), sprintf(), snprintf(), respectively, except that they are called with a va_list instead of a variable number of arguments.  These  functions  do  not  call  the va_end macro.  Because they invoke the va_arg macro, the value of ap is undefined after the call.  See stdarg(3),

and from the stdarg manpage:

va_arg()
    ...
    If ap is passed to a function that uses va_arg(ap,type) then the value of ap is undefined after the return of that function.


In systemd-191/src/shared/log.c, we have the following problematic code in log_struct_internal():

705                 va_start(ap, format);
706                 while (format && n + 1 < ELEMENTSOF(iovec)) {
707                         char *buf;
708 
709                         if (vasprintf(&buf, format, ap) < 0) {
710                                 r = -ENOMEM;
711                                 goto finish;
712                         }
713 
714                         IOVEC_SET_STRING(iovec[n++], buf);
715 
716                         iovec[n].iov_base = (char*) &nl;
717                         iovec[n].iov_len = 1;
718                         n++;
719 
720                         format = va_arg(ap, char *); <---- use of ap here is invalid/non-portable
721                 }
722                 va_end(ap);


741                 /* Fallback if journal logging is not available */
742 
743                 va_start(ap, format);
744                 while (format) {
745 
746                         vsnprintf(buf, sizeof(buf), format, ap);
747                         char_array_0(buf);
748 
749                         if (startswith(buf, "MESSAGE=")) {
750                                 found = true;
751                                 break;
752                         }
753 
754                         format = va_arg(ap, char *); <---- use of ap here is invalid/non-portable
755                 }
756                 va_end(ap);


It appears that the assumption has been made that ap is internally incremented as values are formatted into buf, but there is no guarantee that this will occur. Apparently it does occur for some architecture/compiler/libcs, while not for others. This is truly unfortunate as the vsnprintf/vasprintf-relaled code now in log_struct_internal() is, in my judgement, quite elegant.

One way around this would be to write code to parse the format strings enough to figure out the number and types of args associated with each format (tricky and counter to the use of vsnprintf/vasprintf in the first place, and probably a pain to maintain and port).

I have attached a patch which 'fixes' this issue in the sense that I'm now up and running with systemd-191, and all appears to be well.. but it's a bit of a hack, and admittedly rather ugly. What I've done is to pass each format-vararglist pair in now occuring in log_struct_internal() calls to a new function log_struct_entry() which produces the desired string. Then, log_struct_internal() calls simply have to deal with a vararglist of strings.

hope this helps, and thanks for your time,
John

PS: I've been using systemd since v188 (w/o sysvinit) for desktop/laptops (and even in ramroots) and love it. The unit file syntax is nice and intuitive and things mostly work great out-of-the-box. This, is after several years of using Upstart, which worked adequately, but still was not as flexible as it should've been, and a was bit fragile..
Comment 1 Oleksii Shevchuk 2012-09-23 08:47:27 UTC
*** Bug 55231 has been marked as a duplicate of this bug. ***
Comment 2 Zbigniew Jedrzejewski-Szmek 2012-09-24 10:31:46 UTC
I think that the analysis is correct. I don't like the posted patch though. It is intrusive, and the formatting is done early, and the message might be skipped afterwards.

I have a patch in the works, will post soon, taking a different approach:
va_copy(aq, ap)
vasprintf(aq, ...)
update original ap using parse_printf_format
continue with va_arg(ap, ...)

The downsides are: slight ugliness of having to update ap manually, C99 requirement for parse_printf_format.
Comment 3 Lennart Poettering 2012-09-24 11:00:29 UTC
It really sucks that we have to go though the format string twice. But well, I guess the nice syntax does justify this. 

I wasn't aware of the existance of parse_printf_format(). Sounds like the best possible choice to fix this! Thanks for finding that!
Comment 4 jpsinthemix 2012-09-24 11:20:13 UTC
Wholeheartedly agree.

I have a nicer patch which I've been using for several days w/o problems which does enough format string parsing to allow manual ap updating (turns out to be not that difficult). On the other hand, it'd probably be better/safer to go with parse_printf_format (didn't know about that one). I'll post it for your amusement..
thanks for your time,
John
Comment 5 jpsinthemix 2012-09-24 11:22:19 UTC
Created attachment 67625 [details]
Working (less ugly) fix of early boot vsnprintf/vasprintf segfault
Comment 6 Lennart Poettering 2012-09-24 21:27:56 UTC
OK, I have now commited a fix for this, using Zbigniew's idea. (Sorry for being impatient, I'd really like 192 out of the door, fixing this).
Comment 7 Lennart Poettering 2012-09-24 21:29:11 UTC
Oh, I'd be really thankful if somebody could test this fix on his machine, since I actually never ran into it myself!

http://cgit.freedesktop.org/systemd/systemd/patch/?id=963ddb917de3140308ee62fb642b2307a577a39e
Comment 8 jpsinthemix 2012-09-24 22:57:26 UTC
Thumbs up here.
Comment 9 Gert Michael Kulyk 2012-09-25 05:32:48 UTC
Seems like your commit is fixing this. Thanks.
Comment 10 Lennart Poettering 2012-09-25 10:40:37 UTC
OK, closing then.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.