[PATCH weston 3/3] tests: fix the cursor race in internal-screenshot

Derek Foreman derekf at osg.samsung.com
Wed Jun 22 23:09:50 UTC 2016


On 17/06/16 03:57 AM, Pekka Paalanen wrote:
> On Thu, 16 Jun 2016 10:59:21 -0500 Derek Foreman
> <derekf at osg.samsung.com> wrote:
> 
>> On 16/06/16 05:36 AM, Pekka Paalanen wrote:
>>> From: Pekka Paalanen <pekka.paalanen at collabora.co.uk>
>>> 
>>> This fix also depends on "compositor-headless: do not create a
>>> seat".
>>> 
>>> If we lose the race against weston-desktop-shell setting
>>> cursors, which is very rare, we get a cursor image in the
>>> screenshot, causing the test to fail. This is now fixed by
>>> moving the (remaining) cursor out of the way.
>>> 
>>> Arguably we should have better solutions for this, but that is
>>> another story. This is a stop-gap measure we can copy also in
>>> new screenshooting tests.
>>> 
>>> Code and explanation on how to lose the race for sure are
>>> included.
>>> 
>>> Signed-off-by: Pekka Paalanen <pekka.paalanen at collabora.co.uk> 
>>> --- tests/internal-screenshot-test.c | 27
>>> +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+)
>>> 
>>> diff --git a/tests/internal-screenshot-test.c
>>> b/tests/internal-screenshot-test.c index 563aa3d..e022d20
>>> 100644 --- a/tests/internal-screenshot-test.c +++
>>> b/tests/internal-screenshot-test.c @@ -26,6 +26,7 @@ #include
>>> "config.h"
>>> 
>>> #include <stdio.h> +#include <time.h>
>>> 
>>> #include "weston-test-client-helper.h"
>>> 
>>> @@ -48,6 +49,27 @@ draw_stuff(void *pixels, int w, int h) } }
>>> 
>>> +/* + * We are racing our screenshooting against
>>> weston-desktop-shell setting the + * cursor. If w-d-s wins, our
>>> screenshot will have a cursor shown, which makes + * the image
>>> comparison fail. Our window and the default pointer position
>>> are + * accidentally causing an overlap that intersects our
>>> test clip rectangle. + * + * This delay, if enabled, ensures
>>> this test loses that race. + * + * To actually see the failure,
>>> comment out the workaround call to + *
>>> weston_test_move_pointer() below. + */ +static void 
>>> +lose_the_race(void) +{ +#if 0 +	struct timespec delay = {
>>> .tv_sec = 0, .tv_nsec = 100 * 1000 * 1000 }; + +
>>> nanosleep(&delay, NULL);
>> 
>> BAAAaaaarrf. :)
>> 
>> I don't get the play here, this is still if 0 so does nothing, so
>> the race is still present.
> 
> Hi,
> 
> this is not what fixes the race. The fix is the move_pointer call 
> below. This is here just to remind a casual reader on how you can 
> ensure to lose the race and see the result.

I think perhaps the comment and the associated code is too elaborate -
I'm willing to take your word that a race condition exists without
needed to reproduce it myself.  The end result is confusing.

Losing the race 100% of the time functionally eliminates the race,
which is why I misunderstood the intent of "lose_the_race".

>> So we're going to propagate this "fix" to other new tests, and
>> "just" set all these #if 0 to #if 1 before running make
>> distcheck, or what?
> 
> No. lose_the_race() should not be copied anywhere. It's just
> something you can enable if you start seeing random failures here,
> or you want to inspect what the failure looks like without having
> to wait a hundred attempts to get lucky.
> 
> I could have just left the delay be there, it's only 100 ms, and
> it would make it much much more likely that the race is lost, and
> if the bug fixed by this patch comes back it will be noticed soon.
> However, the nanonsleep is not a guarantee, and I hate tests that
> include delays because they arbitrarily make the check take longer,
> so I'd rather not have it on by default. Delays are never a good
> solution for races.
> 
> OTOH, actually guaranteeing that w-d-s is ready before the test
> begins would be useful, but quite a lot more effort for just this
> case. I also did consider moving all seat and input device control
> interfaces from weston-test.so into the headless-backend, so that
> we could just remove all pointer devices (or rather not create them
> in the first place) in tests that do not need them, which is what I
> could consider the proper fix for the cursor issue. However, as my
> current goal is to fix some of Weston's sub-surface issues, which
> lead me to wanting screenshot-based tests, which lead to refactor
> the whole pixel buffer stuff in tests (see the 11 patch series), I
> thought it's time to skip one rabbit hole to get forward. So pardon
> my lazyness. :-P

I can completely understand not wanting to spend more time on this
than is immediately necessary for the task at hand.

> (Oh, and there is the issue of ensuring that weston tests will
> *only* use the fallback cursors from libwayland-cursor, so that
> custom cursor themes won't cause false failures in tests.)
> 
> In essence, lose_the_race() function is *purely* documentation.
> 
>> I'm not really sure this is all that good even as a stop-gap
>> measure...
> 
> Well, the real fix is the move_pointer, not the delay.
> 
>>> +#endif +} + TEST(internal_screenshot) { struct wl_buffer
>>> *buf; @@ -62,12 +84,17 @@ TEST(internal_screenshot) bool
>>> dump_all_images = true; void *pixels;
>>> 
>>> +	lose_the_race(); + /* Create the client */ printf("Creating
>>> client for test\n"); client =
>>> create_client_and_test_surface(100, 100, 100, 100); 
>>> assert(client); surface = client->surface->wl_surface;
>>> 
>>> +	/* Move the pointer away from the screenshot area. */ +
>>> weston_test_move_pointer(client->test->weston_test, 0, 0);
> 
> This line ^ here is the actual fix. I suppose I should underline
> that in the commit message?
> 
> Certainly something was off in this patch to receive such comments.
> ;-)

How about just keep the actual fix and put a better comment above it
explaining why it's necessary, and lose the lose_the_race stuff entirely?

Such a patch gets my Rb. :)

Though, I am wondering how hard it would be to make the default cursor
location be 0,0 if the test module is loaded...

Thanks,
Derek
(Who never managed to make this race manifest in practice, even when
trying - and is quite happy someone else bothered to sort it out!)
>>> + buf = create_shm_buffer(client, 100, 100, &pixels); 
>>> draw_stuff(pixels, 100, 100); wl_surface_attach(surface, buf,
>>> 0, 0);
>>> 
>> 
> 
> Thanks, pq
> 



More information about the wayland-devel mailing list