Bug 89595

Summary: Logs with weird strings may causes SEGV when reloaded
Product: Telepathy Reporter: Daniel Thompson <daniel.thompson>
Component: loggerAssignee: Telepathy bugs list <telepathy-bugs>
Status: RESOLVED FIXED QA Contact: Telepathy bugs list <telepathy-bugs>
Severity: normal    
Priority: medium CC: nicolas
Version: 0.8   
Hardware: Other   
OS: All   
Whiteboard:
i915 platform: i915 features:
Attachments: logger: Better NULL checking in tpl_log_iter_xml_get_event
log-iter-xml: Fix crash when unable to parse a file
tests: Add new TpLogger logs that don't parse as XML
tests: Add a test for log-iter-xml/get-events
tests: Add new TpLogger logs that don't parse as XML
tests: Add a test for log-iter-xml/get-events
log-iter-xml: Fix crash when logs are corrupted or invalid XML

Description Daniel Thompson 2015-03-16 14:47:31 UTC
Original bug report: https://bugzilla.gnome.org/show_bug.cgi?id=720179

Long standing bug 73072 describes a problem where logger may fail to parse logs containing weird strings. Related to this are certain code paths within logger whose poor error and/or NULL checking escalates this problem from an inconvenience to a crash during application startup.

A stack trace example of this is:

#0  0x00007fffd29c8be7 in tpl_log_iter_xml_get_events (iter=<optimized out>, num_events=5, error=<optimized out>) at log-iter-xml.c:88
#1  0x00007fffd29d4303 in tpl_log_walker_fill_cache_async_thread (simple=0x66bae0 [GSimpleAsyncResult], object=<optimized out>, cancellable=<optimized out>)
    at log-walker.c:310
#2  0x00007ffff579779f in run_in_thread (job=<optimized out>, c=0x0, _data=0x121c960) at gsimpleasyncresult.c:858
#3  0x00007ffff57845e6 in io_job_thread (task=<optimized out>, source_object=<optimized out>, task_data=0x127f540, cancellable=<optimized out>)
    at gioscheduler.c:85
#4  0x00007ffff57a7c75 in g_task_thread_pool_thread (thread_data=0x10420e0, pool_data=<optimized out>) at gtask.c:1215
#5  0x00007ffff520bd68 in g_thread_pool_thread_proxy (data=<optimized out>)
    at gthreadpool.c:307
#6  0x00007ffff520b3d5 in g_thread_proxy (data=0x9b1230) at gthread.c:764
#7  0x00007ffff371b52a in start_thread (arg=0x7fffd0cfe700)
    at pthread_create.c:310
#8  0x00007ffff345722d in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

A little exploring makes clear how this can happen:

(gdb) list
83	
84	          priv->next_event = g_list_last (priv->events);
85	          priv->next_date = g_list_previous (priv->next_date);
86	        }
87	
88	      event = TPL_EVENT (priv->next_event->data);
89	      events = g_list_prepend (events, g_object_ref (event));
90	      i++;
91	
92	      priv->next_event = g_list_previous (priv->next_event);
(gdb) print priv
$1 = (TplLogIterXmlPriv *) 0x10eb760
(gdb) print priv->next_event
$2 = 0x0
(gdb) print priv->events
$3 = 0x0

The attached patch fixes the crash (but, to be clear, it does *not* fix bug 73072). Nevertheless I view the patch as sensible robustness improvement that fails call is a reasonably sensible way.

Messages from polari without the change are:
--- cut here ---
/home/drt/.local/share/TpLogger/logs/idle_irc_danielt0/chatrooms/#linaro-lava/20150313.log:44: parser error : xmlParseCharRef: invalid xmlChar value 8
arendra2' name='narendra2' token='' isuser='false' type='normal'>Loading: *&#x8;
                                                                               ^
(org.gnome.Polari:5812): tp-logger-WARNING **: Failed to parse file:'/home/drt/.local/share/TpLogger/logs/idle_irc_danielt0/chatrooms/#linaro-lava/20150313.log'
Segmentation fault (core dumped)

--- cut here ---

With the change we can see the system stay alive with the GUI running:
--- cut here ---
/home/drt/.local/share/TpLogger/logs/idle_irc_danielt0/chatrooms/#linaro-lava/20150313.log:44: parser error : xmlParseCharRef: invalid xmlChar value 8
arendra2' name='narendra2' token='' isuser='false' type='normal'>Loading: *&#x8;
                                                                               ^
(org.gnome.Polari:17277): tp-logger-WARNING **: Failed to parse file:'/home/drt/.local/share/TpLogger/logs/idle_irc_danielt0/chatrooms/#linaro-lava/20150313.log'
/home/drt/.local/share/TpLogger/logs/idle_irc_danielt0/chatrooms/#linaro-lava/20150313.log:44: parser error : xmlParseCharRef: invalid xmlChar value 8
arendra2' name='narendra2' token='' isuser='false' type='normal'>Loading: *&#x8;
                                                                               ^
(org.gnome.Polari:17277): tp-logger-WARNING **: Failed to parse file:'/home/drt/.local/share/TpLogger/logs/idle_irc_danielt0/chatrooms/#linaro-lava/20150313.log'
--- cut here ---
Comment 1 Daniel Thompson 2015-03-16 14:51:10 UTC
Created attachment 114349 [details] [review]
logger: Better NULL checking in tpl_log_iter_xml_get_event
Comment 2 Debarshi Ray 2015-04-20 08:56:14 UTC
Comment on attachment 114349 [details] [review]
logger: Better NULL checking in tpl_log_iter_xml_get_event

Review of attachment 114349 [details] [review]:
-----------------------------------------------------------------

Thanks for tracking this down, Daniel. I am not a telepathy reviewer, but I did write most of that file...

It would be better if we tried to skip the date with the faulty file and tried the next one. Instead of completely bailing out.

::: telepathy-logger/log-iter-xml.c
@@ +80,5 @@
>            priv->events = _tpl_log_store_get_events_for_date (priv->store,
>                priv->account, priv->target, priv->type_mask,
>                (GDate *) priv->next_date->data);
> +          if (!priv->events)
> +            break;

Nitpick: priv->events == NULL
Comment 3 Debarshi Ray 2015-04-20 08:58:32 UTC
Created attachment 115213 [details] [review]
log-iter-xml: Fix crash when unable to parse a file

How about something like this?
Comment 4 Daniel Thompson 2015-04-20 09:30:59 UTC
I try to test this if I can. Unfortunately the broken log files have been rotated out now so I no longer get the error when starting Polari.

I'll see if I can corrupt them deliberately.
Comment 5 Debarshi Ray 2015-04-20 11:51:14 UTC
Created attachment 115217 [details] [review]
tests: Add new TpLogger logs that don't parse as XML
Comment 6 Debarshi Ray 2015-04-20 11:51:48 UTC
Created attachment 115218 [details] [review]
tests: Add a test for log-iter-xml/get-events
Comment 7 Debarshi Ray 2015-04-20 12:16:20 UTC
Since bug 40675 , we try to recover from XML parsing failures as much as possible. That seems to be enough for all the faulty XML files that were reported for the original Polari bug [1]. Unfortunately, that patch isn't in the 0.8.x series, so users don't benefit from it.

Even then, I guess, it doesn't hurt to defend against this case.

[1] https://bugzilla.gnome.org/show_bug.cgi?id=720179
Comment 8 Debarshi Ray 2015-04-20 12:17:13 UTC
Created attachment 115219 [details] [review]
tests: Add new TpLogger logs that don't parse as XML
Comment 9 Debarshi Ray 2015-04-20 12:17:42 UTC
Created attachment 115220 [details] [review]
tests: Add a test for log-iter-xml/get-events
Comment 10 Debarshi Ray 2015-04-20 12:18:15 UTC
Created attachment 115221 [details] [review]
log-iter-xml: Fix crash when logs are corrupted or invalid XML
Comment 11 Debarshi Ray 2015-04-28 16:49:33 UTC
From #telepathy on Freenode:

16:44 <rishi> stormer: Are you ok with the pushing the patch that checks for    
      NULL?
16:45 <stormer> rishi: yes
16:45 <rishi> ok
16:45 <stormer> definitly

Pushed to both master and telepathy-logger-0.8

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.