Original bug report: https://bugzilla.gnome.org/show_bug.cgi?id=720179 Long standing bug 73072 describes a problem where logger may fail to parse logs containing weird strings. Related to this are certain code paths within logger whose poor error and/or NULL checking escalates this problem from an inconvenience to a crash during application startup. A stack trace example of this is: #0 0x00007fffd29c8be7 in tpl_log_iter_xml_get_events (iter=<optimized out>, num_events=5, error=<optimized out>) at log-iter-xml.c:88 #1 0x00007fffd29d4303 in tpl_log_walker_fill_cache_async_thread (simple=0x66bae0 [GSimpleAsyncResult], object=<optimized out>, cancellable=<optimized out>) at log-walker.c:310 #2 0x00007ffff579779f in run_in_thread (job=<optimized out>, c=0x0, _data=0x121c960) at gsimpleasyncresult.c:858 #3 0x00007ffff57845e6 in io_job_thread (task=<optimized out>, source_object=<optimized out>, task_data=0x127f540, cancellable=<optimized out>) at gioscheduler.c:85 #4 0x00007ffff57a7c75 in g_task_thread_pool_thread (thread_data=0x10420e0, pool_data=<optimized out>) at gtask.c:1215 #5 0x00007ffff520bd68 in g_thread_pool_thread_proxy (data=<optimized out>) at gthreadpool.c:307 #6 0x00007ffff520b3d5 in g_thread_proxy (data=0x9b1230) at gthread.c:764 #7 0x00007ffff371b52a in start_thread (arg=0x7fffd0cfe700) at pthread_create.c:310 #8 0x00007ffff345722d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109 A little exploring makes clear how this can happen: (gdb) list 83 84 priv->next_event = g_list_last (priv->events); 85 priv->next_date = g_list_previous (priv->next_date); 86 } 87 88 event = TPL_EVENT (priv->next_event->data); 89 events = g_list_prepend (events, g_object_ref (event)); 90 i++; 91 92 priv->next_event = g_list_previous (priv->next_event); (gdb) print priv $1 = (TplLogIterXmlPriv *) 0x10eb760 (gdb) print priv->next_event $2 = 0x0 (gdb) print priv->events $3 = 0x0 The attached patch fixes the crash (but, to be clear, it does *not* fix bug 73072). Nevertheless I view the patch as sensible robustness improvement that fails call is a reasonably sensible way. Messages from polari without the change are: --- cut here --- /home/drt/.local/share/TpLogger/logs/idle_irc_danielt0/chatrooms/#linaro-lava/20150313.log:44: parser error : xmlParseCharRef: invalid xmlChar value 8 arendra2' name='narendra2' token='' isuser='false' type='normal'>Loading: * ^ (org.gnome.Polari:5812): tp-logger-WARNING **: Failed to parse file:'/home/drt/.local/share/TpLogger/logs/idle_irc_danielt0/chatrooms/#linaro-lava/20150313.log' Segmentation fault (core dumped) --- cut here --- With the change we can see the system stay alive with the GUI running: --- cut here --- /home/drt/.local/share/TpLogger/logs/idle_irc_danielt0/chatrooms/#linaro-lava/20150313.log:44: parser error : xmlParseCharRef: invalid xmlChar value 8 arendra2' name='narendra2' token='' isuser='false' type='normal'>Loading: * ^ (org.gnome.Polari:17277): tp-logger-WARNING **: Failed to parse file:'/home/drt/.local/share/TpLogger/logs/idle_irc_danielt0/chatrooms/#linaro-lava/20150313.log' /home/drt/.local/share/TpLogger/logs/idle_irc_danielt0/chatrooms/#linaro-lava/20150313.log:44: parser error : xmlParseCharRef: invalid xmlChar value 8 arendra2' name='narendra2' token='' isuser='false' type='normal'>Loading: * ^ (org.gnome.Polari:17277): tp-logger-WARNING **: Failed to parse file:'/home/drt/.local/share/TpLogger/logs/idle_irc_danielt0/chatrooms/#linaro-lava/20150313.log' --- cut here ---
Created attachment 114349 [details] [review] logger: Better NULL checking in tpl_log_iter_xml_get_event
Comment on attachment 114349 [details] [review] logger: Better NULL checking in tpl_log_iter_xml_get_event Review of attachment 114349 [details] [review]: ----------------------------------------------------------------- Thanks for tracking this down, Daniel. I am not a telepathy reviewer, but I did write most of that file... It would be better if we tried to skip the date with the faulty file and tried the next one. Instead of completely bailing out. ::: telepathy-logger/log-iter-xml.c @@ +80,5 @@ > priv->events = _tpl_log_store_get_events_for_date (priv->store, > priv->account, priv->target, priv->type_mask, > (GDate *) priv->next_date->data); > + if (!priv->events) > + break; Nitpick: priv->events == NULL
Created attachment 115213 [details] [review] log-iter-xml: Fix crash when unable to parse a file How about something like this?
I try to test this if I can. Unfortunately the broken log files have been rotated out now so I no longer get the error when starting Polari. I'll see if I can corrupt them deliberately.
Created attachment 115217 [details] [review] tests: Add new TpLogger logs that don't parse as XML
Created attachment 115218 [details] [review] tests: Add a test for log-iter-xml/get-events
Since bug 40675 , we try to recover from XML parsing failures as much as possible. That seems to be enough for all the faulty XML files that were reported for the original Polari bug [1]. Unfortunately, that patch isn't in the 0.8.x series, so users don't benefit from it. Even then, I guess, it doesn't hurt to defend against this case. [1] https://bugzilla.gnome.org/show_bug.cgi?id=720179
Created attachment 115219 [details] [review] tests: Add new TpLogger logs that don't parse as XML
Created attachment 115220 [details] [review] tests: Add a test for log-iter-xml/get-events
Created attachment 115221 [details] [review] log-iter-xml: Fix crash when logs are corrupted or invalid XML
From #telepathy on Freenode: 16:44 <rishi> stormer: Are you ok with the pushing the patch that checks for NULL? 16:45 <stormer> rishi: yes 16:45 <rishi> ok 16:45 <stormer> definitly Pushed to both master and telepathy-logger-0.8
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.