[RFC weston 13/16] compositor: Add a function to test if images transformed by a matrix should be bilinearly filtered

Fri Oct 3 08:21:49 PDT 2014

On 02/10/14 07:09 PM, Jason Ekstrand wrote:
> 
> 
> On Thu, Oct 2, 2014 at 3:21 PM, Derek Foreman <derekf at osg.samsung.com
> <mailto:derekf at osg.samsung.com>> wrote:
> 
>     On 02/10/14 02:37 PM, Jason Ekstrand wrote:
>     >
>     > On Oct 2, 2014 12:37 AM, "Pekka Paalanen" <ppaalanen at gmail.com <mailto:ppaalanen at gmail.com>
>     > <mailto:ppaalanen at gmail.com <mailto:ppaalanen at gmail.com>>> wrote:
>     >>
>     >> On Wed, 1 Oct 2014 18:09:32 -0700
>     >> Jason Ekstrand <jason at jlekstrand.net <mailto:jason at jlekstrand.net>
>     <mailto:jason at jlekstrand.net <mailto:jason at jlekstrand.net>>> wrote:
>     >>
>     >> > Allow me to chip in here.  Sorry that I haven't had a chance to
>     > really look
>     >> > over things carefully.  I have been reading this thread, just
>     > haven't had a
>     >> > chance to respond.
>     >> >
>     >> > On Wed, Oct 1, 2014 at 12:41 AM, Pekka Paalanen <ppaalanen at gmail.com <mailto:ppaalanen at gmail.com>
>     > <mailto:ppaalanen at gmail.com <mailto:ppaalanen at gmail.com>>> wrote:
>     >> >
>     >> > > On Tue, 30 Sep 2014 14:35:24 -0500
>     >> > > Derek Foreman <derekf at osg.samsung.com <mailto:derekf at osg.samsung.com>
>     > <mailto:derekf at osg.samsung.com <mailto:derekf at osg.samsung.com>>>
>     wrote:
>     >> > >
>     >> > > > Thanks for taking a look!
>     >> > > >
>     >> > > > On 26/09/14 05:48 PM, Bill Spitzak wrote:
>     >> > > > > 90 degree rotation about x or y will require filtering.
>     >> > > >
>     >> > > > Yup, you're right.
>     >> > > >
>     >> > > > > You test y scale twice, must be a typo. I think you intended
>     > to test z,
>     >> > > > > but in fact z scale is not relevant so you should not test it
>     > at all.
>     >> > > >
>     >> > > > Argh - thanks.  Why isn't Z scale relevant?  I'm worried about
>     > making
>     >> > > > assumptions about the transformations these matrices
>     represent and
>     >> > > > having those assumptions violated in the future...  For Z
>     not to
>     > matter
>     >> > > > are we assuming projection will always be orthographic?
>     >> > >
>     >> > > Weston never uses the final Z coordinate for anything, so in that
>     > sense
>     >> > > it is always orthographic. Essentially, we could just do with 3x3
>     >> > > matrices perfectly fine. 3x3 supports 2D-projective which is
>     enough to
>     >> > > implement fake-3D effects like
>     >> > > http://people.collabora.com/~pq/rotate3d-fun.webm
>     >> > > (The gl-renderer does not route the W element at all, I had
>     to patch
>     >> > > that. Pixman-renderer OTOH just worked.)
>     >> > >
>     >> > > Weston also hardcodes the input Z coordinate always to 0, no
>     matter
>     >> > > which way you are going between buffer and output spaces.
>     >> > >
>     >> > > I suppose the 4x4 matrix was originally chosen to fit the
>     OpenGL API.
>     >> > > And maybe with some speculation about a desktop cube
>     implementation or
>     >> > > something, but I don't really see the cube or such coming,
>     not as a
>     >> > > generic thing anyway as only the gl-renderer could support it
>     with
>     >> > > true 3D space.
>     >> > >
>     >> > > > > Translation by non-integer will also require filtering.
>     >> > > >
>     >> > > > Good point.
>     >> > > >
>     >> > > > > I recommend instead of checking the rotation to instead look
>     > for zeros
>     >> > > > > in the correct locations in the matrix. Matrix must be of the
>     > form:
>     >> > > > >
>     >> > > > >  |sx 0  0 tx|
>     >> > > > >  |0  sy 0 ty|
>     >> > > > >  |?  ?  ?  ?|
>     >> > > > >  |0  0  0  1|
>     >> > > > >
>     >> > > > > or
>     >> > > > >
>     >> > > > >  |0  sx 0 tx|
>     >> > > > >  |sy 0  0 ty|
>     >> > > > >  |?  ?  ?  ?|
>     >> > > > >  |0  0  0  1|
>     >> > > > >
>     >> > > > > sx and sy must be ±1, and tx and ty must be integers. The ?
>     > can be any
>     >> > > > > value.
>     >> > > >
>     >> > > > That could save us the very expensive matrix decomposition.
>     > I'll try
>     >> > > > this out.  Thanks.
>     >> > > >
>     >> > > > I think this may be better than decomposition for deciding to
>     > use video
>     >> > > > planes in compositor-drm as well.
>     >> > > >
>     >> > > > (In fact, you've got me wondering if we ever need to split a
>     > matrix into
>     >> > > > basic transformations at all...)
>     >> > >
>     >> > > I'd be wondering about that, too. My intuition would say
>     there is no
>     >> > > need to really decompose. Just checking the elements should
>     suffice,
>     >> > > and when the matrix is supportable for whatever, then you
>     pick the
>     >> > > right elements (which is a bit like decomposition, but no
>     need to be
>     >> > > generic at all).
>     >> > >
>     >> >
>     >> > Yeah, I'm not convinced we need to be able to do a full
>     decomposition
>     >> > either.  My original intention was something like this:
>     >> >
>     >> > bool
>     >> > weston_matrix_to_integer_transform(const weston_matrix *mat, enum
>     >> > wl_output_transform& transform, int *scale, int *x, int *y)
>     >>
>     >> Why would there be 'transform' parameter? That implies that the
>     matrix
>     >> is not really the total transformation, which I find odd here.
>     >>
>     >> (Total transformation is between buffer pixel coords and
>     output/scanout
>     >> pixel coords, i.e. buffer-to-output.)
> 
>     btw, what exactly is the buffer-to-output transform?  I think in the
>     pixman renderer that's already calculated in a convenient location (in
>     "matrix" in repaint_region()
> 
> 
> We create it here, among other things:
> 
> https://github.com/ManMower/weston/blob/transforms/src/pixman-renderer.c#L213
> 
> Basically, it's just the full transformation from buffer pixels to
> output pixels.  If that's not scaled or rotated, we want NEAREST filtering.

Ok, that's what I meant by "matrix" in repaint_region...

One thing that's bugging me... I think normally a 90 degree rotation
doesn't require LINEAR filtering - doesn't this change if device pixels
aren't square?

>  
> 
> 
>     For gl-renderer, I suspect I need to build it myself in draw_view():
>     weston_matrix_init(&foo);
>     weston_matrix_multiply(&foo, &ev->surface->buffer_to_surface_matrix);
>     if (ev->transform.enabled)
>             weston_matrix_multiply(&foo, &ev->transform.matrix)
>     weston_matrix_multiply(&foo, &output->matrix);
> 
>     Is that right?  Do I have the order backwards?
> 
>     I'd like to test just that one matrix and no additional if
>     (ev->tranform.enabled) etc to decide on whether to use linear or
>     nearest...
> 
>     > I'm sorry I mistyped but I meant the transform to be an output
>     > parameter.  That way you know of the matrix is a 90-degree rotation or
>     > flip.  Not sure if this is needed but for figuring out GL_LINEAR vs
>     > GL_NEAREST we don't want to fail if there is a 90-degree rotation.
>     >>
>     >> > (do we use "bool" in weston?  Maybe just return int).  We may need
> 
>     I'm kind of interested in the answer to the question "do we use bool in
>     weston?" - It's used in some places and not others - do we care? :)
> 
>     > both x
>     >> > and y scales and it may be useful to get those as floats.  I'm not
>     > sure on
>     >> > that.  Pekka, what would the RPi backend use?
>     >>
>     >> The rpi-renderer uses pretty much the same as what DRM planes/overlays
>     >> offer wrt. coordinates, IIRC: integer position and size on the output,
>     >> and 16.16 fixed point position and size on input (buffer).
>     >>
>     >> Whether scaling factor is integer or not is irrelevant there. I do not
>     >> recall there being an option for sampling (nearest/linear/...) in
>     >> either DispmanX nor DRM.
> 
>     I think rpi has horizontal and vertical flip capabilities as well?
> 
>     I don't think that's exposed by drmModeSetPlane.
> 
> 
> Exactly.  The matrix decompose function should provide enough
> information to figure out this stuff.
>  
> 
> 
>     >> >  Basically, we want to be
>     >> > able to do 2 things: First, detect if it's an entirely integer transform
>     >> > and use GL_NEAREST instead of GL_LINEAR and second, know how to flip the
>     >> > surface around in cases when we can do some simple transformations but
>     >> > can't do an arbitrary matrix transformation.  One example here is DRM
>     >> > planes.  We can only use a plane when there's no scale and the
>     >> > buffer-to-output transform has no rotation.  We need to check for that
>     >> > condition and then pull the needed data out.
>     >>
>     >> I think DRM planes do handle at least limited scaling, as do DispmanX
>     >> (IIRC something like if scaling to less than 1/8 you take an
>     >> additional performance hit, or other funny effects) also.
>     >>
>     >> We may not know all the limitations of a DRM plane in advance, so we
>     >> only need to make sure it can fit through the KMS API, and then the
>     >> kernel will reject it if it violates some hw-specific restrictions.
>     >> (Fallback will be implemented in Weston when atomic/nuclear support
>     >> arrives.)
>     >
>     > Yes, but we probably want to use the above function and then check that
>     > transform == WL_OUTPUT_TRANSFORM_NORMAL.  In any case, having some idea
>     > of the rotation is probably needed.  (I don't know that much about KMS).
> 
>     I think we need to transform the points by the buffer-to-output matrix
>     in order to create the destination x, y, width and height if we can
>     successfully put something in a plane - might it be easier to do that
>     transformation unconditionally and then test the results for viability?
> 
> 
> We should also be able to pick that information directly out of the
> matrix.  The (0, 0) point gets transformed to the top two entries on the
> right hand side divided by the lower right hand corner.

Ah yes, that's true.  That saves a little math.

I'm curious as to how zero a zero has to be (in the d[0],d[5] or
d[4],d[1] positions, depending on rotation) or how 1 a 1 has to be,
since we're using floats.  testing for equality may miss cases where a
screen transform and a surface transform cancel eachother out "almost"
completely.  The tolerances of the zero seem to me to depend on the size
of the surface.

Testing on the transformed vertices makes that a little easier, I think.
 But maybe that won't happen very often.  (or will have a negative
impact if, say, an animated surface rotates through plane viable ->
plane not viable frequently...)

>  
> 
> 
>     (ie: for drm test that the output rect is axis aligned, no flips.  for
>     rpi, axis aligned but flips are ok and result in slight additional
>     setup.)
> 
>     That would leave the the test for linear vs nearest in a separate
>     function (much like the one Bill described)
> 
> 
> Eh, I think the ability to pull out the transform is useful.  Also, once
> you've gone to all the work to check if you can do it, doing it
> shouldn't be hard.

Fair enough, I guess.

I'm not completely convinced knowing what the transform is will really
be all that useful, since much of this effort is to remove big switch
(transform) { constructs in the first place.

Places (rpi) that can put flipped video in a plane could just as easily
get the HFLIP bit from testing dest x1 < x0 instead of a switch that
converts FLIPPED_180 into... wait I have to think about that.

That said, I'll try it with the transforms and see how it goes. :)

> --Jason
>  
> 
> 
>     >>
>     >> > Point is, we don't need a full matrix decomposition.  Also, it's worth
>     >> > throwing out there that the caching probably doesn't help us at all
>     > because
>     >> > we're going to usually be calling this on freshly computed matrices
>     > such as
>     >> > the above mentioned buffer-to-output transform.
>     >> >
>     >> > Does that make sense?
>     >>
>     >> Yes, to me at least.
>     >>
>     >> Futhermore, if you wanted to cache the buffer-to-output matrix, you
>     >> would end up with number_of_views * number_of_outputs matrices to be
>     >> cached. The buffer-to-global per-view matrix might not change too
>     >> often, but we tend to paint outputs in turns, which means doing just
>     >> per-view cached total matrix is a waste.
>     >>
>     >> So you might have buffer-to-surface matrix in weston_surface, then
>     >> buffer-to-global matrix cached in weston_view. I'm not sure it makes
>     >> sense to cache buffer-to-output anywhere.
> 
>     Right - I think we all agree that:
>     Full decomposition is a waste of time (whether or not I try to cache the
>     decomposition results)
> 
>     Caching buffer-to-output matrices is also not a win.
> 
> 
> 
> 
> 
> _______________________________________________
> wayland-devel mailing list
> wayland-devel at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/wayland-devel
>