[cairo] Concerns about using filters for downscaling

Bill Spitzak spitzak at gmail.com
Thu Mar 27 14:42:25 PDT 2014

Søren Sandmann wrote:

> Under the assumption that all we can do is change what the existing
> cairo API does, my suggestions would be:
> * BILINEAR and NEAREST should do what they are doing now

Agreed. However, BILINEAR must not be the default; GOOD should be the 
default value.

> For GOOD and BEST, first compute scale factors xscale and yscale
> based on the bounding box of a transformed destination
> pixel. That is, consider the destination pixel as a square and
> apply the transformation to that square. Then take the bounding
> box of the resulting parallelogram and compute the scale factors
> that would turn the destination pixel into that bounding box.

It is better to convert the parallelogram into an equal-area rectangle, not 
the bounding box. The bounding box will produce excessive blur for 
rotations. If the two derivatives are dx1,dy1 and dx2,dy2 I believe the 
correct x is hypot(dx1,dx2) and y is hypot(dy1,dy2). This also removes 
any reflections from the calculation.
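
To make the difference concrete, here is a minimal sketch comparing the two estimates. It assumes the two derivatives are the vectors (dx1,dy1) and (dx2,dy2) described above; the function names are made up for illustration and are not cairo or pixman API.

```python
import math

def bbox_scales(dx1, dy1, dx2, dy2):
    # Size of the bounding box of the parallelogram spanned by the
    # two derivative vectors (dx1, dy1) and (dx2, dy2).
    return abs(dx1) + abs(dx2), abs(dy1) + abs(dy2)

def hypot_scales(dx1, dy1, dx2, dy2):
    # The estimate suggested in the text: x from the two x components,
    # y from the two y components. Signs drop out, so reflections
    # do not affect the result.
    return math.hypot(dx1, dx2), math.hypot(dy1, dy2)

def rotation(angle_deg):
    # Derivative vectors of a pure rotation (hypothetical helper).
    a = math.radians(angle_deg)
    return math.cos(a), math.sin(a), -math.sin(a), math.cos(a)
```

For a pure 45 degree rotation the bounding box gives about 1.414 in both directions, which would blur an image that is not being scaled at all, while the hypot version gives 1 in both directions.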

I usually calculate the area in the *source*, not the destination. This 
just produces numbers that are the inverse of your scale numbers.

> Then,
> * GOOD should do:
>   - If xscale and yscale are both > 1, then
>   - Otherwise, for each dimension:
>     - If downscaling, use Box/Box/4
>     - If upscaling, use Linear/Impulse/6

The filters are totally independent for the horizontal and vertical. You 
can apply either one first (converting the rectangle into a 1-pixel wide 
rectangle) and then the other one (converting this rectangle into a 
single point). Besides making it much easier because you only think 
about one dimension at a time, this is also a good deal faster. It also 
means you don't need the "upscaling" line.
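
A minimal sketch of the two-pass idea, assuming a plain box filter with an integer factor and an image stored as a list of rows. This only illustrates that the two axes are independent; it is not how cairo or pixman implement it, and all names are hypothetical.

```python
def box_downscale_1d(row, factor):
    # Average each run of `factor` consecutive samples (integer factor only).
    return [sum(row[i:i + factor]) / factor
            for i in range(0, len(row) - factor + 1, factor)]

def transpose(img):
    return [list(col) for col in zip(*img)]

def downscale_rows(img, factor):
    # Apply the 1-D filter to every row of the image.
    return [box_downscale_1d(row, factor) for row in img]

def downscale_separable(img, fx, fy, horizontal_first=True):
    # Filter one axis, then the other. Each pass only ever thinks
    # about a single dimension and a single factor.
    if horizontal_first:
        tmp = downscale_rows(img, fx)
        return transpose(downscale_rows(transpose(tmp), fy))
    tmp = transpose(downscale_rows(transpose(img), fy))
    return downscale_rows(tmp, fx)
```

Passing a factor of 1 for one axis leaves that axis untouched, so a transform that shrinks only horizontally needs no special case.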

If scale is < 1, then by definition you are downscaling. There will not 
be any upscaling because the two scales are completely independent. In 
addition I believe the cutoff can be at a value smaller than 1, perhaps 
as small as .5, which is the smallest scale where bilinear actually 
touches every pixel (even if the weights are wrong).

There are two schools of thought about the filter. I'm sorry but I am 
going to have to resort to describing filters the way I have always seen 
them used: a function that is stretched by the sample derivative, 
centered on the sample point, and then evaluated at the center of every 
source pixel. I also tend to use a "scale" that is the inverse of what 
you are using, so I will attempt to use the word "derivative" instead. 
The "cutoff" is the derivative (1/scale) at which the filter switches; it 
is some number between 1 and 2.

The filter should blend cleanly into the bilinear. The bilinear in my 
terminology is a triangle with width 2. There are two schools of thought 
about how to "widen" this: one is to make it a triangle of width 
2*(derivative-cutoff+1). The other is a trapezoid of width 
(derivative-cutoff+2) with the sloping parts having a width of 1 each.
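
The two shapes can be sketched as follows. This assumes derivative >= cutoff, measures t as the distance in source pixels from the sample point, and leaves the weights unnormalized; the function names are hypothetical.

```python
def triangle_weight(t, derivative, cutoff):
    # Triangle of total width 2*(derivative - cutoff + 1).
    half_width = derivative - cutoff + 1.0
    return max(0.0, 1.0 - abs(t) / half_width)

def trapezoid_weight(t, derivative, cutoff):
    # Trapezoid of total width (derivative - cutoff + 2): a flat top of
    # width (derivative - cutoff) with a slope of width 1 on each side.
    half_width = (derivative - cutoff + 2.0) / 2.0
    return max(0.0, min(1.0, half_width - abs(t)))
```

When derivative equals cutoff both functions reduce to max(0, 1 - |t|), the width-2 bilinear triangle, so either one blends into bilinear without a jump. For larger derivatives the trapezoid has the narrower support of the two.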

I prefer the second because it is smaller and is a box filter, though 
the triangle is probably higher quality.

It is certainly possible (as you suggest below) to bilinear sample the 
source image recursively to downscale it by a power of 2, and then do 
all the filtering with a derivative that is between 1 and 2. For the 
triangle this is perfect, for the box it is an approximation.
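
A rough sketch of that split, in one dimension only. It assumes the derivative is at least 1 and that "bilinear sampling to halve" means sampling exactly between each pair of pixels, which is the average of the pair. The names are made up for illustration.

```python
def split_derivative(derivative):
    # Number of power-of-2 halvings to apply first, and the leftover
    # derivative (at least 1, below 2) for the final filtering pass.
    levels = 0
    while derivative >= 2.0:
        derivative /= 2.0
        levels += 1
    return levels, derivative

def halve(row):
    # A bilinear sample midway between two pixel centers is their average.
    return [(row[i] + row[i + 1]) / 2.0 for i in range(0, len(row) - 1, 2)]
```

For example, a derivative of 5 becomes two halvings followed by a final pass with derivative 1.25.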

> * BEST should do:
>   - For each dimension:
>     - If upscaling, use Cubic/Impulse/MAX(3, log of scale factors)
>     - If downscaling, use Impulse/Lanczos3/4
> Where the filters are given as Reconstruction/Sampling/Subsample-bits.

BEST should do a number of modifications to GOOD:

1. It changes the cutoff to scale = 1.

2. It changes the filter function at both ends to something more 
sinc-like. However the two filters must be equal to each other at a 
scale of 1 to avoid a discontinuity. I have not seen any scheme where 
this works and the filter is not bilinear other than making the two 
filters the same.

>>  * In the future, the benchmark for GOOD is downscaling by
>>    factors of two to the next biggest power of two, and sampling
>>    from that with bilinear filtering. Pixel based backends
>>    should do at least that well in *both* performance and
>>    quality.

I have not experimented with this enough, but I do feel this is 
acceptable. This description is equivalent (for scales between 1 and .5) 
to my description of setting the cutoff to scale=.5.

Downscaling to the next *lower* power of 2 and then bilinear filtering 
would be "accurate" in that the weights have some relationship to 
physical reality. However this is excessively blurry.
