[Pixman] [PATCH] Fix server crash in pixman (to be discussed)

Soeren Sandmann sandmann at daimi.au.dk
Wed Mar 24 10:28:07 PDT 2010


Hi Matthias,

> The following patch fixes Novell bug 568811:
>   VNC Installation aborts right in the middle due to an assertion in Xvnc/libpixman
> 
> The bug seems occur only on *very* special occasions (in this case, only
> in SLES, but *not* in SLED, which is based on the same code basis...).
> 
> Backtrace looks as follows:
> 
> #2  0xb71f7baa in __assert_fail () from /lib/libc.so.6
> #3  0xb781feab in pixman_region_append_non_o (y2=<value optimized out>,
>     y1=<value optimized out>, r_end=<value optimized out>,
>     r=<value optimized out>, region=<value optimized out>) at pixman-region.c:670
> #4  pixman_op (y2=<value optimized out>, y1=<value optimized out>,
>     r_end=<value optimized out>, r=<value optimized out>,
>     region=<value optimized out>) at pixman-region.c:996
> #5  0xb7820dbe in pixman_region_union (new_reg=0x82ee57c, reg1=0x82ee57c,
>     reg2=0xbfd77c90) at pixman-region.c:1439
> #6  0x080f023b in miUnion (newReg=0x82ee57c, reg1=0x82ee57c, reg2=0xbfd77c90)
>     at miregion.c:1005

Thanks for tracking down this issue.

Please note that while pixman is not as strict as the X server in who
can push to the repository, it is not a complete free-for-all.
Committing small, obvious patches that fixes typos or oversights is
fine, but don't commit non-obvious stuff without sending it to the
mailing list and giving people time to comment on it.

When we are this close to stable release, we need to be even more
careful about what goes in. Patches committed between now and next
week will likely ship without no testing at all.

This patch in particular, I don't think shold ship with no testing at
all. So please revert it, and we can consider it again for 0.19.x.

Comments and questions:

- Why are you seeing an assert in the first place? 

  The asserts have been disabled since 0.16.4/0.17.6 because pixman
  shouldn't take down the X server with asserts for things that would
  otherwise be harmless.

  If you are using an older version than those in SLES, then you might
  want to either upgrade to 0.16.4, or backport

       ec6de472d042bec05aaa53f9d14bfbe23edbe01e

  which is the commit that disables asserts.

- How does the degenerate region get created?

  If you call pixman_init_rectangle() with an empty rectangle, the
  region will be correctly initialized to have the 'data' field point
  to pixman_region_empty_data_. I believe the various operations all
  make sure that if the resulting region is empty, they will store a
  pointer to pixman_region_empty_data_ in the data field.
  
  For better or worse, the region code currently relies on this
  invariant, so users shouldn't be filling out the region data
  structure themselves.

  You can argue that this is bad and pixman should cope with regions
  initialized directly with empty rectangles, and I probably would
  agree; it may make sense to fix it in pixman 0.20.

  It certainly does not make sense to make this change with no testing
  this close to a stable release unless it is demonstrably a serious
  bug in pixman that is causing crashes.

- The patch does not look wrong to me - a region with empty extents
  should probably be considered empty rather than degenerate. However,
  unless this issue is truly causing crashes, changing it is the kind
  of thing that could cause all kinds of mysterious bugs to show up
  and therefore definitely should not happen between a release
  candidate and a stable release. Did you carefully review all the
  code to make sure it always uses PIXMAN_NIL to check for empty
  regions?

- The region code is rather tricky, so when bugs are fixed in it, I'd
  like to have tests/region-test.c updated to test the issue. That way

        - we have a clear way to reproduce the bug

        - we can verify that the bug is really fixed

        - we make sure it doesn't get reintroduced

- Finally, we need much more detail in commit messages than "Fixes
  Novell bug xxxxxx".

  Is that even a public bug? I couldn't figure out how to look at
  it. I did not create a Novell Login though.


Soren



More information about the xorg mailing list