[Pixman] [cairo] Planar YUV support

Fri Mar 4 11:56:46 PST 2011

Siarhei Siamashka <siarhei.siamashka at gmail.com> writes:

> On Wed, Mar 2, 2011 at 4:16 PM, Soeren Sandmann <sandmann at cs.au.dk> wrote:
>> Siarhei Siamashka <siarhei.siamashka at gmail.com> writes:
>>
>>> I'm not a big fan of "let's make this totally universal and future
>>> proof" approach if only a very small fraction of this functionality is
>>> going to be actually used. Moreover, I suspect that trying to be too
>>> general was responsible for slowing down the original "Planar YUV
>>> support" plan.
>>
>> Part of what was derailing that plan may have been my insisting on being
>> precise about how these formats fit into the image processing pipeline,
>> including how it related to gamma correction and other colorspace
>> issues. I still think this is important. However, we can probably leave
>> out some specific features, if there is a credible story about how to do
>> them later.
>>
>> The pipeline as it is now:
>>
>>    1 convert image sample to a8r8g8b8
>>    2 extend sample grid in all directions according to repeat
>>    3 interpolate between sample according to filter
>>    4 transform
>>    5 resample
>>    6 combine
>>    7 store

> What is the difference between "3 interpolate between sample according
> to filter" and "5 resample"?

The output of stage 3 is an image that is defined on all of the real
plane. There are no pixels any more, so there is no question about what
stage 4, "transform", means. Stage 5 converts back to pixels by point
sampling.

Of course, pixman is implemented starting from the destination, so
the function of the real plane from stage 3 is never actually
computed. Because of the sampling, we only need the value of that
function at discrete locations.

(Better gradient and image quality would involve improving stage 5 to do
something better than point sampling).
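
To make the destination-driven evaluation concrete, here is a minimal
Python sketch, not pixman code. The names `bilinear` and `render`, the
half-integer pixel centers and the clamped edges are all assumptions
made for illustration.

```python
import math

def bilinear(src, x, y):
    """Value of the interpolated image at real coordinates (x, y).

    Assumes sample (i, j) sits at (i + 0.5, j + 0.5) and that coordinates
    outside the grid are clamped to the nearest edge sample.
    """
    h, w = len(src), len(src[0])
    fx, fy = x - 0.5, y - 0.5
    x0, y0 = math.floor(fx), math.floor(fy)
    tx, ty = fx - x0, fy - y0

    def at(i, j):
        return src[min(max(j, 0), h - 1)][min(max(i, 0), w - 1)]

    top = at(x0, y0) * (1 - tx) + at(x0 + 1, y0) * tx
    bot = at(x0, y0 + 1) * (1 - tx) + at(x0 + 1, y0 + 1) * tx
    return top * (1 - ty) + bot * ty

def render(src, dst_w, dst_h, dest_to_src):
    """Walk the destination; evaluate the continuous image only where needed."""
    out = []
    for j in range(dst_h):
        row = []
        for i in range(dst_w):
            # Map the destination pixel center back into source space,
            # then point sample the interpolated function there.
            sx, sy = dest_to_src(i + 0.5, j + 0.5)
            row.append(bilinear(src, sx, sy))
        out.append(row)
    return out

src = [[0.0, 10.0], [20.0, 30.0]]
identity = render(src, 2, 2, lambda x, y: (x, y))
upscaled = render(src, 4, 4, lambda x, y: (x / 2, y / 2))
```

The function on the real plane only exists implicitly, as `bilinear`;
`render` evaluates it at one location per destination pixel.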

>> To add support for potentially subsampled YUV, some additional stages
>> have to be inserted before the first:
>>
>>   -2 interpolate subsampled components of YUV to get the same
>>      resolution as the Y plane
>>
>>   -1 if the format is planar, stitch together components to form YUV
>>      pixels
>>
>>    0 convert to sRGB
>>
>> Stage -2 is important because the filter used in that interpolation
>> should probably be user-specifiable eventually, which has the
>> implication that whatever simple support is added first, it needs to be
>> clear what filter precisely is being used.
>>
>> Stage 0 is a color space conversion and need to eventually be
>> configurable too, which means it has to be specified which matrix is
>> being used.
>
> I'm not totally sure about stage -2. So source interpolation is going
> to happen twice in the pipeline, once when getting rid of subsampling,
> and the second time when applying transform? To me it looks more
> natural to just start with the transform, for each pixel in the
> destination space get a transformed fractional coordinate in the
> source image, look it up in the source YUV grid and interpolate each
> color component for each color plane separately according to the
> selected filter, and finally convert interpolated YUV color to
> a8r8g8b8 according to the used color matrix.

Yes, I thought those two stages were the same for a long time, too. In
fact, in this mail:

    http://lists.freedesktop.org/archives/cairo/2010-May/019876.html

I argue that they are the same, and that the pipeline should look like
this:

      * Widen to 8 bit components
      * Extend 
      * Interpolate between samples according to filter
      * Transform
      * Convert to RGB coding
      * Resample
      * Combine
      * Store

which is also what you are saying. However, as mentioned here:

    http://lists.cairographics.org/archives/cairo/2010-June/020232.html

I have become convinced that this is wrong and that undoing the chroma
subsampling is something different from interpolation happening on sRGB
pixels, although it may be that they can be combined in some special
cases. In that mail I say that the pipeline should (eventually) look
like this:

    extend
    widen

    chroma reconstruct  <= upsample filter

    convert to ar'g'b'  <= \
    linearize           <= | color space convert
    premultiply         <= /

    interpolate         <= interpolation filter
    transform
    resample            <= resampling filter, plus sampling rate
    combine

    convert to dest format

    store

The basic argument is that the interpolation filtering should be done in
premultiplied RGB (linear, ideally, but sRGB for now), whereas the
chroma reconstruction obviously has to be done on chroma
channels. Therefore, these two operations have to be considered
distinct.

I.e., since a YUV image is generated by this process:

     1. Start with sRGB
     2. Convert to YUV
     3. Filter and subsample chroma

if we are to do some operation on the image in sRGB, reversing this
process as the first step can't possibly be incorrect:

     1. Reconstruct chroma samples
     2. Convert to sRGB
     3. Interpolate sRGB samples
     4. Transform

And so if the alternative process:

     1. Interpolate YUV samples
     2. Transform
     3. Convert to sRGB

produces something different, it can't be correct. 

Note that if some day we add compositing in linear RGB, the alternative
process breaks down because the initial interpolation will be taking
place in non-linear color space, whereas with intermediates in linear
RGB, you'd want to do the second interpolation (but not the first) in
linear light.
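
As a small illustration of why the color space of the interpolation
matters, here is a Python sketch using the standard sRGB transfer
function on values in [0, 1]. The function names are made up for this
example; nothing here is pixman API.

```python
def srgb_to_linear(c):
    """Decode one sRGB-coded component in [0, 1] to linear light."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def linear_to_srgb(l):
    """Encode one linear-light component in [0, 1] as sRGB."""
    return 12.92 * l if l <= 0.0031308 else 1.055 * l ** (1 / 2.4) - 0.055

black, white = 0.0, 1.0

# Halfway between the two samples, interpolating the coded values directly.
mid_nonlinear = (black + white) / 2

# The same halfway point, interpolated in linear light and re-encoded.
mid_linear = linear_to_srgb(
    (srgb_to_linear(black) + srgb_to_linear(white)) / 2)
```

The two midpoints differ noticeably (0.5 versus roughly 0.735 in sRGB
coding), so which space an interpolation step runs in is not a detail
that can be decided later without changing the output.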

There is also a question of what to do with YUV images with a
non-premultiplied alpha channel. Interpolating the samples of such an
image directly is definitely wrong, but it may be that simply
premultiplying first will work.
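
The problem with interpolating non-premultiplied samples is easiest to
see on plain RGBA, so this Python sketch uses RGBA rather than YUV.
Whether premultiplying YUV samples directly is equally valid is exactly
the open question above; the helper names are hypothetical.

```python
def lerp(a, b, t):
    """Component-wise linear interpolation between two pixels."""
    return tuple(p * (1 - t) + q * t for p, q in zip(a, b))

def premultiply(px):
    r, g, b, a = px
    return (r * a, g * a, b * a, a)

def unpremultiply(px):
    r, g, b, a = px
    if a == 0:
        return (0.0, 0.0, 0.0, 0.0)
    return (r / a, g / a, b / a, a)

opaque_red = (1.0, 0.0, 0.0, 1.0)
clear_green = (0.0, 1.0, 0.0, 0.0)   # fully transparent: color is invisible

# Interpolating the non-premultiplied samples lets the invisible green leak in.
naive = lerp(opaque_red, clear_green, 0.5)

# Premultiplying first weights each color by its coverage.
correct = unpremultiply(
    lerp(premultiply(opaque_red), premultiply(clear_green), 0.5))
```

Here `naive` is (0.5, 0.5, 0.0, 0.5), a half-transparent yellow, while
`correct` is (1.0, 0.0, 0.0, 0.5), pure red at half coverage.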

The two-interpolation pipeline has the practical benefit that chroma
reconstruction can be done in the fetchers, at least as long as the
chroma filter is fixed, whereas the one-step process means the general
code for bilinear filtering would have to sample each component
individually, then filter, and then do a color conversion. It would no
longer be able to simply ask the underlying system to fetch an RGB
pixel.
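
As a rough illustration of chroma reconstruction in a fetcher, here is
a Python sketch (not pixman's actual fetcher code). It assumes 4:2:0
planar data, a fixed nearest-neighbour chroma filter and the full-range
BT.601 (JFIF) matrix; `fetch_argb` and `clamp8` are hypothetical names.

```python
def clamp8(v):
    """Round to the nearest integer and clamp to the 8-bit range."""
    return min(max(int(round(v)), 0), 255)

def fetch_argb(y_plane, u_plane, v_plane, x, y):
    """Return pixel (x, y) of a 4:2:0 planar image as packed a8r8g8b8.

    Chroma is reconstructed by replication: each chroma sample is shared
    by a 2x2 block of luma samples.
    """
    Y = y_plane[y][x]
    U = u_plane[y // 2][x // 2] - 128
    V = v_plane[y // 2][x // 2] - 128
    r = clamp8(Y + 1.402 * V)
    g = clamp8(Y - 0.344136 * U - 0.714136 * V)
    b = clamp8(Y + 1.772 * U)
    return (0xFF << 24) | (r << 16) | (g << 8) | b

# A 2x2 image: one chroma sample covers all four luma samples.
y_plane = [[128, 128], [128, 76]]
gray = fetch_argb(y_plane, [[128]], [[128]], 0, 0)
reddish = fetch_argb(y_plane, [[85]], [[255]], 1, 1)
```

With a fetcher like this, the general bilinear code keeps asking for
a8r8g8b8 pixels and never needs to know that the image is planar.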

Finally, one-step interpolation also means it could no longer be
guaranteed that "identity transformation + bilinear filter" is identical
to a nearest filter.
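
This can be checked in one dimension with a Python sketch. The
assumptions are mine: 2x horizontal chroma subsampling, chroma sample k
centered between luma samples 2k and 2k+1, replication as the chroma
reconstruction filter in the two-step case, and the full-range BT.601
(JFIF) matrix without clamping.

```python
import math

def lerp_at(samples, x):
    """Linear filter; sample i is centered at i + 0.5, edges are clamped."""
    f = x - 0.5
    i0 = math.floor(f)
    t = f - i0
    def at(i):
        return samples[min(max(i, 0), len(samples) - 1)]
    return at(i0) * (1 - t) + at(i0 + 1) * t

def nearest_at(samples, x):
    """Nearest filter: the sample whose cell contains x."""
    return samples[min(max(math.floor(x), 0), len(samples) - 1)]

def yuv_to_rgb(y, u, v):
    return (y + 1.402 * (v - 128),
            y - 0.344136 * (u - 128) - 0.714136 * (v - 128),
            y + 1.772 * (u - 128))

Y = [100.0, 100.0, 100.0, 100.0]
U = [128.0, 128.0]
V = [128.0, 228.0]

def one_step(x, sample):
    # Filter each plane on its own grid, then convert once.
    return yuv_to_rgb(sample(Y, x), sample(U, x / 2), sample(V, x / 2))

def two_step(x, sample):
    # Reconstruct chroma, convert every pixel, then filter the RGB samples.
    U_full = [U[i // 2] for i in range(len(Y))]
    V_full = [V[i // 2] for i in range(len(Y))]
    rgb = [yuv_to_rgb(*p) for p in zip(Y, U_full, V_full)]
    return tuple(sample([p[c] for p in rgb], x) for c in range(3))

x = 1.5  # center of luma pixel 1 under an identity transform
```

At `x = 1.5` the two-step pipeline gives (100, 100, 100) with either
filter. The one-step pipeline gives the same with the nearest filter,
but with the linear filter the chroma lookup lands at 0.75 on the chroma
grid, blends the two V samples, and the red component becomes about
135.05.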

I can see why avoiding an extra interpolation step is desirable,
especially in a fast path. I'm just not convinced it's the same
operation mathematically, or that it's the correct operation. Maybe
there is a combination of chroma filter and interpolation filter such
that for sRGB compositing it *would* be equivalent?

> Though I see some trouble using such approach when handling repeat for
> odd sized images with subsampling (imagine handling of NORMAL repeat
> for fractional sized RGB images as a somewhat similar example). I
> don't know what would be the best solution here.

Can we simply prohibit fractional chroma planes? If people want such a
thing, they would just have to make something up for the final luma
column/row themselves. Fractional chroma planes are going to be
problematic no matter how interpolation is done.
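
Such a restriction could be as simple as the check in this Python
sketch. The function name and the idea of passing the subsampling
factors explicitly are assumptions for illustration, not a proposed
pixman interface.

```python
def chroma_plane_size(width, height, sub_x, sub_y):
    """Return the chroma plane size for a luma plane of width x height.

    Raises ValueError when the luma dimensions are not whole multiples of
    the subsampling factors, i.e. when a chroma plane would be fractional.
    """
    if width % sub_x or height % sub_y:
        raise ValueError(
            "luma size %dx%d is not a multiple of the %dx%d subsampling"
            % (width, height, sub_x, sub_y))
    return width // sub_x, height // sub_y

size_420 = chroma_plane_size(640, 480, 2, 2)   # 4:2:0
size_422 = chroma_plane_size(640, 480, 2, 1)   # 4:2:2
```

A caller with a 641 pixel wide image would then have to pad or crop the
luma plane before creating the image.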

>> There is also the question of how to *write* to a subsampled YUV
>> image. I don't particularly like read-only image formats, but writing to
>> YUV is not simple when you consider clip rectangles, because subsampling
>> involves a filter that probably extends outside the clip boxes.
>
> Writing to subsampled YUV images somewhat resembles writing to
> transformed RGB images to me (if this was supported by pixman of
> course). Both are inherently lossy to the point of becoming
> ridiculous. And if a sequence of compositing operations has to be
> applied to YUV destination image, then each step would introduce a
> major distortion by itself. I see only one recipe here - never allow
> to do this :-) If really high quality image processing has to be done,
> then any subsampled YUV destination image needs to be converted to
> some intermediate format (preferably losslessly). Then all the
> composition has to be done with this intermediate format and only
> converted to the target format at the very last step. A variation of
> this is to hide all these steps from the end user by doing everything
> behind the scene, but it may be too complex and inefficient.

Maybe Benjamin can comment, but I think one of the reasons for
supporting YUV destinations was to allow things like subtitles and
infographics to be overlaid without incurring an expensive roundtrip to
RGB. How are these things done normally?

> As I don't have a serious need for writing to YUV images at this
> point, I would like to skip this YUV writing part completely for now
> :-)

We probably don't need to support writing at first, although that would
mean that this earlier statement:

    The distinction between source and destination pixman formats was
    probably a mistake, and if we bump the ABI of pixman at some point,
    we'll want to fix it.

    So don't base anything around that distinction. Just pretend all
    pixman formats that can be used as sources can also be used as
    destinations. This will be true for all new formats, and is only
    false for YUV formats at the moment. [1]

from http://lists.freedesktop.org/archives/cairo/2009-June/017409.html
was not true.

>> What Andrea is getting at, is presumably how to specify image formats in
>> the API. A fully general API like he suggests is perhaps interesting at
>> some point, but I agree it shouldn't be prerequisite for getting some
>> YUV support in.
>
> I think he tried to suggest that if any pixman API extension is needed
> at all, then it's better to do it flexible enough and more future
> proof. Please correct me if I'm wrong.
>
> Right now I'm only interested in a new function
> 'pixman_image_create_planar' (with or without 'color_space' argument)
> from:
> http://cgit.freedesktop.org/~company/pixman/tree/pixman/pixman.h?h=planar#n764
>
> The color matrix and chroma siting can be assumed to be set to some
> sane default, with the API to actually tweak them to be introduced in
> pixman later.

Agreed. Color management can be introduced later with additional API.

Soren