Bug 97539 - weston-terminal race condition crash
Summary: weston-terminal race condition crash
Status: RESOLVED FIXED
Alias: None
Product: Wayland
Classification: Unclassified
Component: weston (show other bugs)
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
: medium major
Assignee: Bryce Harrington
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-08-29 23:17 UTC by Bryce Harrington
Modified: 2018-06-04 08:57 UTC (History)
0 users

See Also:
i915 platform:
i915 features:


Attachments
Add diagnostics (9.83 KB, patch)
2016-08-29 23:17 UTC, Bryce Harrington
Details | Splinter Review
Minimal fix to the problem (1.61 KB, patch)
2016-08-29 23:18 UTC, Bryce Harrington
Details | Splinter Review

Description Bryce Harrington 2016-08-29 23:17:30 UTC
Created attachment 126109 [details] [review]
Add diagnostics

weston-terminal will crash on startup intermittently.  The triggering conditions are unclear - some times I'll get 5-6 crashes in a row before it starts, other times it'll start right up fine several dozen times without any crash.  It seems like crashing is more likely in a freshly started weston than in a weston that has been running for a time and/or has already launched a few client programs.

I will attach two patches.  The first adds diagnostics, the second adds a fix.

With the diagnostics applied, but without the fix, this is what a non-crashing run looks like:

$ ./weston-terminal
Resizing terminal to 20, 5 as default minimum
Resizing terminal to 80, 25
display_watch_fd
calling terminal_resize(..., 80, 24)
resize_handler(0xdb7b60, 730, 394, 0xdb5590)        !! Resize handler
triggers
resizing widget: width=730, height=394              !!
resizing cells: 80, 24                              !!
terminal_resize_cells                               !! Sets data_pitch
width = 80, utf8_char size = 4; data_pitch = 320    !!
redraw_handler                                      !!
terminal_get_row: row = 0, index = 0
terminal_get_row: terminal->buffer_height 1024
terminal_get_row: terminal->data_pitch 320          // <-- sane data
pitch
terminal_get_row: row = 1, index = 1
terminal_get_row: terminal->buffer_height 1024
terminal_get_row: terminal->data_pitch 320
...
terminal_get_row: row = 23, index = 23
terminal_get_row: terminal->buffer_height 1024
terminal_get_row: terminal->data_pitch 320
handle_special_char                                 // <-- handling
special char
terminal_get_row: row = 0, index = 0
terminal_get_row: terminal->buffer_height 1024
terminal_get_row: terminal->data_pitch 320

But when crashing, this is the output:

$ ./weston-terminal
Resizing terminal to 20, 5 as default minimum
Resizing terminal to 80, 25
display_watch_fd
calling terminal_resize(..., 80, 24)
handle_special_char                                !! Handling special
char before resized
terminal_get_row: row = 0, index = 0
terminal_get_row: terminal->buffer_height 1024
terminal_get_row: terminal->data_pitch 0           // <-- invalid
data_pitch
handle_char
terminal_scroll_buffer                             // <-- unnecessary
scroll
terminal_get_row: row = -1, index = 0
terminal_get_row: terminal->buffer_height 1024
terminal_get_row: terminal->data_pitch 0
terminal_get_row: row = -1, index = 0
terminal_get_row: terminal->buffer_height 1024
terminal_get_row: terminal->data_pitch 0
Segmentation fault (core dumped)

This issue appears to be a race condition.  data_pitch (and perhaps other parameters) are initialized to 0.  But that is not a valid value, and will cause miscalculations elsewhere (such as when trying to scroll the buffer).  These values are initialized to valid values in terminal_resize_cells() when it gets called by the resize_handler().  Whether that handler gets called in time appears to be a matter of luck; presumably it sometimes doesn't happen in time for whatever reason, and thus leads to the crash.

A trivial fix would be to simply force a call to terminal_resize_cells() using plausible defaults.  The init routine already does this in a pair of calls to terminal_resize().  The second attached patch implements this.  With this change, the crash is non-reproducible after dozens of tries.  I kept doing it until I could reproduce the scenario where a late-resolving resize_handler() occurred, and verified it did not crash when the race condition is lost.  Here's the output in this situation:


Resizing terminal to 20, 5 as default minimum
terminal_resize_cells
width = 20, utf8_char size = 4; data_pitch = 80
Resizing terminal to 80, 25
display_watch_fd
calling terminal_resize(..., 80, 24)
handle_special_char
terminal_get_row: row = 0, index = 0
terminal_get_row: terminal->buffer_height 1024
terminal_get_row: terminal->data_pitch 80
handle_char
terminal_get_row: row = 0, index = 0
terminal_get_row: terminal->buffer_height 1024
terminal_get_row: terminal->data_pitch 80
handle_special_char
terminal_get_row: row = 0, index = 0
terminal_get_row: terminal->buffer_height 1024
terminal_get_row: terminal->data_pitch 80
handle_char
terminal_get_row: row = 0, index = 0

While this one-line fix is suitable to remove the bug, I suspect the init routines ought to be changed to explicitly set all of the weston_terminal structure parameters to valid values.  This would make things a bit cleaner and clearer, but such refactoring may be better done as followup.
Comment 1 Bryce Harrington 2016-08-29 23:18:08 UTC
Created attachment 126110 [details] [review]
Minimal fix to the problem
Comment 2 Daniel Stone 2018-06-04 08:57:15 UTC
This got committed.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.