<html>
    <head>
      <base href="https://bugs.freedesktop.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - weston-terminal race condition crash"
   href="https://bugs.freedesktop.org/show_bug.cgi?id=97539">97539</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>weston-terminal race condition crash
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>Wayland
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>unspecified
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>x86-64 (AMD64)
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux (All)
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>major
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>medium
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>weston
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>wayland-bugs@lists.freedesktop.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>bryce@osg.samsung.com
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Created <span class=""><a href="attachment.cgi?id=126109" name="attach_126109" title="Add diagnostics">attachment 126109</a> <a href="attachment.cgi?id=126109&action=edit" title="Add diagnostics">[details]</a></span> <a href='page.cgi?id=splinter.html&bug=97539&attachment=126109'>[review]</a>
Add diagnostics

weston-terminal will crash on startup intermittently.  The triggering
conditions are unclear - some times I'll get 5-6 crashes in a row before it
starts, other times it'll start right up fine several dozen times without any
crash.  It seems like crashing is more likely in a freshly started weston than
in a weston that has been running for a time and/or has already launched a few
client programs.

I will attach two patches.  The first adds diagnostics, the second adds a fix.

With the diagnostics applied, but without the fix, this is what a non-crashing
run looks like:

$ ./weston-terminal
Resizing terminal to 20, 5 as default minimum
Resizing terminal to 80, 25
display_watch_fd
calling terminal_resize(..., 80, 24)
resize_handler(0xdb7b60, 730, 394, 0xdb5590)        !! Resize handler
triggers
resizing widget: width=730, height=394              !!
resizing cells: 80, 24                              !!
terminal_resize_cells                               !! Sets data_pitch
width = 80, utf8_char size = 4; data_pitch = 320    !!
redraw_handler                                      !!
terminal_get_row: row = 0, index = 0
terminal_get_row: terminal->buffer_height 1024
terminal_get_row: terminal->data_pitch 320          // <-- sane data
pitch
terminal_get_row: row = 1, index = 1
terminal_get_row: terminal->buffer_height 1024
terminal_get_row: terminal->data_pitch 320
...
terminal_get_row: row = 23, index = 23
terminal_get_row: terminal->buffer_height 1024
terminal_get_row: terminal->data_pitch 320
handle_special_char                                 // <-- handling
special char
terminal_get_row: row = 0, index = 0
terminal_get_row: terminal->buffer_height 1024
terminal_get_row: terminal->data_pitch 320

But when crashing, this is the output:

$ ./weston-terminal
Resizing terminal to 20, 5 as default minimum
Resizing terminal to 80, 25
display_watch_fd
calling terminal_resize(..., 80, 24)
handle_special_char                                !! Handling special
char before resized
terminal_get_row: row = 0, index = 0
terminal_get_row: terminal->buffer_height 1024
terminal_get_row: terminal->data_pitch 0           // <-- invalid
data_pitch
handle_char
terminal_scroll_buffer                             // <-- unnecessary
scroll
terminal_get_row: row = -1, index = 0
terminal_get_row: terminal->buffer_height 1024
terminal_get_row: terminal->data_pitch 0
terminal_get_row: row = -1, index = 0
terminal_get_row: terminal->buffer_height 1024
terminal_get_row: terminal->data_pitch 0
Segmentation fault (core dumped)

This issue appears to be a race condition.  data_pitch (and perhaps other
parameters) are initialized to 0.  But that is not a valid value, and will
cause miscalculations elsewhere (such as when trying to scroll the buffer). 
These values are initialized to valid values in terminal_resize_cells() when it
gets called by the resize_handler().  Whether that handler gets called in time
appears to be a matter of luck; presumably it sometimes doesn't happen in time
for whatever reason, and thus leads to the crash.

A trivial fix would be to simply force a call to terminal_resize_cells() using
plausible defaults.  The init routine already does this in a pair of calls to
terminal_resize().  The second attached patch implements this.  With this
change, the crash is non-reproducible after dozens of tries.  I kept doing it
until I could reproduce the scenario where a late-resolving resize_handler()
occurred, and verified it did not crash when the race condition is lost. 
Here's the output in this situation:


Resizing terminal to 20, 5 as default minimum
terminal_resize_cells
width = 20, utf8_char size = 4; data_pitch = 80
Resizing terminal to 80, 25
display_watch_fd
calling terminal_resize(..., 80, 24)
handle_special_char
terminal_get_row: row = 0, index = 0
terminal_get_row: terminal->buffer_height 1024
terminal_get_row: terminal->data_pitch 80
handle_char
terminal_get_row: row = 0, index = 0
terminal_get_row: terminal->buffer_height 1024
terminal_get_row: terminal->data_pitch 80
handle_special_char
terminal_get_row: row = 0, index = 0
terminal_get_row: terminal->buffer_height 1024
terminal_get_row: terminal->data_pitch 80
handle_char
terminal_get_row: row = 0, index = 0

While this one-line fix is suitable to remove the bug, I suspect the init
routines ought to be changed to explicitly set all of the weston_terminal
structure parameters to valid values.  This would make things a bit cleaner and
clearer, but such refactoring may be better done as followup.</pre>
        </div>
      </p>


      <hr>
      <span>You are receiving this mail because:</span>

      <ul>
          <li>You are the assignee for the bug.</li>
      </ul>
    </body>
</html>