[cairo] Concerns about using filters for downscaling

Bill Spitzak spitzak at gmail.com
Thu Mar 27 16:41:21 PDT 2014

Owen Taylor wrote:

>> I'm not sure what "Sampling a sparse set of irregularly distributed
>> points" means.
> This is a bit of a throw-away thought - essentially, if the set of
> source pixels corresponding to one destination pixel is large enough,
> then averaging *all* the source pixels is not necessary to produce
> a high quality result. If you want to find the average hair color
> of the US, you don't need to examine 314 million heads.

Sampling only a subset of the pixels does not produce acceptable 
results. Some source images have very high frequency information (such 
as scanned text), and skipping pixels will produce unacceptable changes 
as the transformation is changed.

You can instead produce an intermediate image that is scaled down. This 
is what mipmapping does.
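A toy 1-D sketch of this argument (illustrative only, not cairo code): downscale a high-frequency source 2x by skipping pixels versus averaging all of them.

```python
# High-frequency 1-D source, like a scanline of scanned text:
# alternating black (0.0) and white (1.0) pixels.
src = [0.0, 1.0] * 8

# Downscale 2x by skipping pixels: the result depends entirely on which
# phase the transformation happens to land on.
skip_even = src[0::2]   # all black
skip_odd = src[1::2]    # all white -- the image flips as the transform moves

# Downscale 2x by averaging *all* source pixels (a 2x box, as a
# mipmap level would): stable mid-gray regardless of phase.
boxed = [(a + b) / 2.0 for a, b in zip(src[0::2], src[1::2])]

print(skip_even)  # [0.0, 0.0, ...]
print(skip_odd)   # [1.0, 1.0, ...]
print(boxed)      # [0.5, 0.5, ...]
```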

>> For GOOD and BEST, first compute scale factors xscale and yscale
>> based on the bounding box of a transformed destination
>> pixel. That is, consider the destination pixel as a square and
>> apply the transformation to that square. Then take the bounding
> box of the resulting parallelogram and compute the scale factors
>> that would turn the destination pixel into that bounding box.
> Hmm, I worry about fuzziness with that approach. Think about pure
> rotations - with your approach the area of the sampling matrix
> doubles for 45 degree rotations. But most nice filters are going
> to be more or less rotationally invariant (a Gaussian filter is
> perfectly so), so we expect the *same* sampling matrix with
> respect to rotation.
> Maybe something like: transform the unit circle into an ellipse - find
> the bounding box of that ellipse to get an axis-aligned ellipse and then
> scale it down uniformly so it has the same area as the transformed
> ellipse.

This is exactly the same as what I said. I used a rhombus and rectangle 
as the source and destination; you are describing ellipses inscribed in 
these shapes.

The axes of the ellipse are the derivatives of the sampling point. 
Picking two new axes aligned with x and y, where the new x length is 
sqrt(ax^2 + bx^2) for the x components ax and bx of the two axes, and y 
similarly, produces a rectangle with the same area that matches for all 
90 degree rotations.
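A minimal sketch of that construction (my naming, not a cairo API): the two ellipse axes are the derivatives of the source sampling point with respect to destination x and y, and the new axis-aligned scales are the root-sum-squares of their x and y components.

```python
from math import sqrt, cos, sin, radians

def filter_scales(ax, ay, bx, by):
    """(ax, ay) and (bx, by) are the ellipse axes: the derivatives of the
    source sampling point with respect to destination x and y."""
    xscale = sqrt(ax * ax + bx * bx)  # root-sum-square of the x components
    yscale = sqrt(ay * ay + by * by)  # root-sum-square of the y components
    return xscale, yscale

# A pure axis-aligned scale passes through unchanged:
print(filter_scales(2.0, 0.0, 0.0, 3.0))  # (2.0, 3.0)

# A pure 45-degree rotation keeps a 1x1 filter (up to rounding), where a
# bounding box of the rotated square would have grown by sqrt(2):
t = radians(45)
print(filter_scales(cos(t), sin(t), -sin(t), cos(t)))
```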

> For reference, the size of Box/Box is ceil(S+1) for a downscale of S, so
> for scales between 1 and 2 it's generally sampling a 3x3 grid (with
> more or less 0 elements depending on the phase and the scale.)

I would not round box sizes up to the next integer until you get to 
scales smaller than about 1/32. The rounding can produce unwanted moire 
patterns.

Instead the "box" must multiply the pixels at the very edge by a smaller 
factor than the interior pixels. Bilinear sampling is a degenerate case 
of this for a 1-wide box: the two pixels are both multiplied by values 
less than 1.
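A sketch of such a box with fractional edge weights (invented helper, not pixman's code): each pixel's weight is its overlap with the box, so the weights always sum to 1, and a 1-wide box degenerates to the two bilinear weights.

```python
from math import floor, ceil

def box_weights(center, width):
    """Weight of each source pixel under a box spanning
    [center - width/2, center + width/2].  Edge pixels get only their
    fractional overlap, interior pixels the full 1/width share."""
    lo, hi = center - width / 2.0, center + width / 2.0
    return {i: (min(hi, i + 1) - max(lo, i)) / width
            for i in range(floor(lo), ceil(hi))}

# Width-1 box = bilinear: the two pixels are multiplied by values < 1.
print(box_weights(0.75, 1.0))  # {0: 0.75, 1: 0.25}
```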

> Probably need to clamp the subpixel precision for huge upscales.

I have found that the location of the filter can be quantized to about 
1/64 of the width of the filter (i.e. if you are scaling up by 64 then I 
completely ignore the fractional part of the sample point). However, I 
was impulse-sampling the filter to get the factors to multiply each 
pixel by. On modern hardware I think it is quite possible to 
bilinear-sample the filter, which would allow subpixel precision up to 
whatever the bilinear limits are. I suspect the filter can then be 
stored as only a few numbers, perhaps one every .25 in filter space.
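A 1-D sketch of that idea (names invented): store the filter only at every 0.25 in filter space and linearly interpolate the coefficients, rather than impulse-sampling a dense table.

```python
def make_lut(filt, radius, spacing=0.25):
    """Sample a symmetric filter at coarse spacing over [0, radius]."""
    n = int(radius / spacing)
    return [filt(i * spacing) for i in range(n + 1)]

def lookup(lut, x, spacing=0.25):
    """Linearly interpolate the coarse table at position x."""
    x = abs(x)                     # symmetric filter
    i = int(x / spacing)
    if i + 1 >= len(lut):
        return 0.0                 # outside the filter support
    f = x / spacing - i
    return lut[i] * (1.0 - f) + lut[i + 1] * f

# Triangle (bilinear) filter of radius 1 as a stand-in test filter:
tri = lambda x: max(0.0, 1.0 - abs(x))
lut = make_lut(tri, 1.0)           # [1.0, 0.75, 0.5, 0.25, 0.0]
print(lookup(lut, 0.5))            # 0.5 -- exact for a piecewise-linear filter
```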

>>       - If downscaling, use Impulse/Lanczos3/4
> Do you have sample images where Lanczos3 gives significantly better
> results than Cubic for downscaling?

I have not seen much improvement from wider windowed sinc filters.

Bigger differences are found by deviating from a sinc. What I called the 
"Mitchell filter" (but may have mis-named) actually amplified 
high-frequency information by making the negative lobes larger than the 
cubic's. A lot of people liked the results of this.

>> Regarding mipmaps, I think this is a somewhat orthogonal issue. They
>> likely are useful for applications that do interactive or animated
>> transformations, and so it would make sense to support them. But for
>> one-off scalings on CPUs, I just don't see how they would be faster or
>> higher quality than the convolutions.

What I think will work for cairo is to keep a single power-of-2 
downscale for each source image. This is not exactly a mipmap as the 
power may be different horizontally and vertically. It is not created 
until first used, and thrown away if the image is altered or a different 
downrez is wanted. For GOOD this image is bilinear sampled. For BEST it 
is twice as large and a filter is used on it.
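A hypothetical sketch of that cache policy (names invented; not cairo's actual code), choosing independent power-of-2 levels per axis:

```python
from math import floor, log2

def intermediate_level(xscale, yscale, best=False):
    """Per-axis power-of-2 level for the cached intermediate image.
    xscale/yscale are the downscale factors (source pixels per
    destination pixel).  For BEST the intermediate is kept one level
    larger (twice the size) so a real filter can still be run on it."""
    def level(scale):
        if scale <= 1.0:
            return 0               # upscaling: use the source directly
        lvl = floor(log2(scale))   # largest power of 2 not above the scale
        return max(0, lvl - 1) if best else lvl
    return level(xscale), level(yscale)

# Downscaling 5x horizontally, 1.5x vertically:
print(intermediate_level(5.0, 1.5))             # (2, 0): a 4x-by-1x downscale
print(intermediate_level(5.0, 1.5, best=True))  # (1, 0): kept twice as large
```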
