Created attachment 126109 [details] [review] Add diagnostics weston-terminal will crash on startup intermittently. The triggering conditions are unclear - some times I'll get 5-6 crashes in a row before it starts, other times it'll start right up fine several dozen times without any crash. It seems like crashing is more likely in a freshly started weston than in a weston that has been running for a time and/or has already launched a few client programs. I will attach two patches. The first adds diagnostics, the second adds a fix. With the diagnostics applied, but without the fix, this is what a non-crashing run looks like: $ ./weston-terminal Resizing terminal to 20, 5 as default minimum Resizing terminal to 80, 25 display_watch_fd calling terminal_resize(..., 80, 24) resize_handler(0xdb7b60, 730, 394, 0xdb5590) !! Resize handler triggers resizing widget: width=730, height=394 !! resizing cells: 80, 24 !! terminal_resize_cells !! Sets data_pitch width = 80, utf8_char size = 4; data_pitch = 320 !! redraw_handler !! terminal_get_row: row = 0, index = 0 terminal_get_row: terminal->buffer_height 1024 terminal_get_row: terminal->data_pitch 320 // <-- sane data pitch terminal_get_row: row = 1, index = 1 terminal_get_row: terminal->buffer_height 1024 terminal_get_row: terminal->data_pitch 320 ... terminal_get_row: row = 23, index = 23 terminal_get_row: terminal->buffer_height 1024 terminal_get_row: terminal->data_pitch 320 handle_special_char // <-- handling special char terminal_get_row: row = 0, index = 0 terminal_get_row: terminal->buffer_height 1024 terminal_get_row: terminal->data_pitch 320 But when crashing, this is the output: $ ./weston-terminal Resizing terminal to 20, 5 as default minimum Resizing terminal to 80, 25 display_watch_fd calling terminal_resize(..., 80, 24) handle_special_char !! Handling special char before resized terminal_get_row: row = 0, index = 0 terminal_get_row: terminal->buffer_height 1024 terminal_get_row: terminal->data_pitch 0 // <-- invalid data_pitch handle_char terminal_scroll_buffer // <-- unnecessary scroll terminal_get_row: row = -1, index = 0 terminal_get_row: terminal->buffer_height 1024 terminal_get_row: terminal->data_pitch 0 terminal_get_row: row = -1, index = 0 terminal_get_row: terminal->buffer_height 1024 terminal_get_row: terminal->data_pitch 0 Segmentation fault (core dumped) This issue appears to be a race condition. data_pitch (and perhaps other parameters) are initialized to 0. But that is not a valid value, and will cause miscalculations elsewhere (such as when trying to scroll the buffer). These values are initialized to valid values in terminal_resize_cells() when it gets called by the resize_handler(). Whether that handler gets called in time appears to be a matter of luck; presumably it sometimes doesn't happen in time for whatever reason, and thus leads to the crash. A trivial fix would be to simply force a call to terminal_resize_cells() using plausible defaults. The init routine already does this in a pair of calls to terminal_resize(). The second attached patch implements this. With this change, the crash is non-reproducible after dozens of tries. I kept doing it until I could reproduce the scenario where a late-resolving resize_handler() occurred, and verified it did not crash when the race condition is lost. Here's the output in this situation: Resizing terminal to 20, 5 as default minimum terminal_resize_cells width = 20, utf8_char size = 4; data_pitch = 80 Resizing terminal to 80, 25 display_watch_fd calling terminal_resize(..., 80, 24) handle_special_char terminal_get_row: row = 0, index = 0 terminal_get_row: terminal->buffer_height 1024 terminal_get_row: terminal->data_pitch 80 handle_char terminal_get_row: row = 0, index = 0 terminal_get_row: terminal->buffer_height 1024 terminal_get_row: terminal->data_pitch 80 handle_special_char terminal_get_row: row = 0, index = 0 terminal_get_row: terminal->buffer_height 1024 terminal_get_row: terminal->data_pitch 80 handle_char terminal_get_row: row = 0, index = 0 While this one-line fix is suitable to remove the bug, I suspect the init routines ought to be changed to explicitly set all of the weston_terminal structure parameters to valid values. This would make things a bit cleaner and clearer, but such refactoring may be better done as followup.
Created attachment 126110 [details] [review] Minimal fix to the problem
This got committed.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.