# [cairo] Concerns about using filters for downscaling

Bill Spitzak spitzak at gmail.com
Thu Mar 27 14:42:25 PDT 2014

```
Søren Sandmann wrote:

> Under the assumption that all we can do is change what the existing
> cairo API does, my suggestions would be:
>
> * BILINEAR and NEAREST should do what they are doing now

Agreed. However, BILINEAR must not be the default; GOOD should be the
default value.

> For GOOD and BEST, first compute scale factors xscale and yscale
> based on the bounding box of a transformed destination
> pixel. That is, consider the destination pixel as a square and
> apply the transformation to that square. Then take the bounding
> box of the resulting parallelogram and compute the scale factors
> that would turn the destination pixel into that bounding box.

It is better to convert the parallelogram into an equal-area rectangle,
not the bounding box. The bounding box will produce excessive blur for
rotations. If the two derivatives are dx1,dy1 and dx2,dy2, I believe the
correct x is hypot(dx1,dx2) and y is hypot(dy1,dy2). This also removes
any reflections from the calculation.

I usually calculate the area in the *source*, not the destination. This
just produces numbers that are the inverse of your scale numbers.

> Then,
>
> * GOOD should do:
>   - If xscale and yscale are both > 1, then
>     - Use PIXMAN_FILTER_BILINEAR
>   - Otherwise, for each dimension:
>     - If downscaling, use Box/Box/4
>     - If upscaling, use Linear/Impulse/6

The horizontal and vertical filters are totally independent. You can
apply either one first (converting the rectangle into a 1-pixel wide
rectangle) and then the other one (converting this rectangle into a
single point). Besides making it much easier because you only think
about one dimension at a time, this is also a good deal faster. It also
means you don't need the "upscaling" line.

If scale is < 1, then by definition you are downscaling. There will not
be any upscaling because the two scales are completely independent. In
addition I believe the cutoff can be at a value smaller than 1, perhaps
as small as .5, which is the smallest scale where bilinear actually
touches every pixel (even if the weights are wrong).

There are two schools of thought about the filter. I'm sorry, but I am
going to have to describe filters the way I have always seen them used:
a function that is stretched by the sample derivative, centered on the
sample point, and then evaluated at the center of every source pixel. I
also tend to use a "scale" that is the inverse of what you are using, so
I will try to use the word "derivative" instead. The "cutoff" is the
value of 1/scale at which the filter switches; it is some number between
1 and 2.

The filter should blend cleanly into the bilinear. The bilinear in my
terminology is a triangle with width 2. There are two schools of thought
about how to "widen" this: one is to make it a triangle of width
2*(derivative-cutoff+1). The other is a trapezoid of width
(derivative-cutoff+2) with the sloping parts having a width of 1 each.

I prefer the second because it is smaller and is a box filter, though
the triangle is probably higher quality.

It is certainly possible (as you suggest below) to bilinear sample the
source image recursively to downscale it by a power of 2, and then do
all the filtering with a derivative that is between 1 and 2. For the
triangle this is perfect; for the box it is an approximation.

> * BEST should do:
>   - For each dimension:
>     - If upscaling, use Cubic/Impulse/MAX(3, log of scale factors)
>     - If downscaling, use Impulse/Lanczos3/4
>
> Where the filters are given as Reconstruction/Sampling/Subsample-bits.

BEST should do a number of modifications to GOOD:

1. It changes the cutoff to scale = 1.

2. It changes the filter function at both ends to something more
sinc-like. However the two filters must be equal to each other at a
scale of 1 to avoid a discontinuity. I have not seen any scheme where
this works and the filter is not bilinear other than making the two
filters the same.

>>  * In the future, the benchmark for GOOD is downscaling by
>>    factors of two to the next biggest power of two, and sampling
>>    from that with bilinear filtering. Pixel based backends
>>    should do at least that well in *both* performance and
>>    quality.

I have not experimented with this enough, but I do feel this is
acceptable. This description is equivalent (for scales between 1 and .5)
to my description of setting the cutoff to scale=.5.

Downscaling to the next *lower* power of 2 and then bilinear filtering
would be "accurate" in that the weights have some relationship to
physical reality. However this is excessively blurry.
```
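
The per-axis derivative estimates and the two ways of widening the bilinear triangle described in the message can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from cairo or pixman: every function name is hypothetical, source pixel `i` is assumed to have its center at `i + 0.5`, the bounding-box size is taken as the sum of absolute derivatives, and the weights are normalized to sum to 1 (the message does not say how normalization is done).

```python
import math

def bbox_derivatives(dx1, dy1, dx2, dy2):
    # Size of the bounding box of the transformed destination pixel.
    return abs(dx1) + abs(dx2), abs(dy1) + abs(dy2)

def hypot_derivatives(dx1, dy1, dx2, dy2):
    # The alternative proposed in the message: hypot per axis.
    return math.hypot(dx1, dx2), math.hypot(dy1, dy2)

def triangle(t, derivative, cutoff):
    # Triangle of total width 2*(derivative - cutoff + 1).
    half_width = max(derivative - cutoff, 0.0) + 1.0
    return max(0.0, 1.0 - abs(t) / half_width)

def trapezoid(t, derivative, cutoff):
    # Trapezoid of total width (derivative - cutoff + 2) whose two
    # sloping parts are each 1 wide.
    half_width = (max(derivative - cutoff, 0.0) + 2.0) / 2.0
    return min(1.0, max(0.0, half_width - abs(t)))

def pixel_weights(shape, center, derivative, cutoff=1.0):
    # Evaluate the filter at the center of every source pixel it can
    # touch, then normalize (normalization is an assumption here).
    reach = max(derivative - cutoff, 0.0) + 1.0  # widest half-width of the two shapes
    first = math.floor(center - reach)
    last = math.ceil(center + reach)
    raw = {i: shape((i + 0.5) - center, derivative, cutoff)
           for i in range(first, last + 1)}
    total = sum(raw.values())
    return {i: w / total for i, w in raw.items() if w > 0.0}

if __name__ == "__main__":
    c = math.sqrt(0.5)  # pure 45 degree rotation, no scaling
    print("bbox :", bbox_derivatives(c, c, -c, c))
    print("hypot:", hypot_derivatives(c, c, -c, c))
    # At derivative == cutoff both shapes reduce to the bilinear triangle.
    print(pixel_weights(triangle, 2.0, 1.0), pixel_weights(trapezoid, 2.0, 1.0))
    # Above the cutoff the trapezoid touches fewer source pixels.
    print(len(pixel_weights(triangle, 4.0, 3.0)),
          len(pixel_weights(trapezoid, 4.0, 3.0)))
```

For a pure 45 degree rotation the hypot estimate stays at about 1 on both axes while the bounding box grows to about 1.41, which is the extra blur the message objects to. At a derivative equal to the cutoff the two shapes give identical weights, so either one blends into bilinear; above the cutoff the trapezoid covers fewer source pixels than the triangle.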