[cairo] Concerns about using filters for downscaling
spitzak at gmail.com
Wed Mar 26 12:45:49 PDT 2014
Søren Sandmann wrote:
> Bill Spitzak <spitzak at gmail.com> writes:
>> I'm also still mystified by the "two filters" approach. I have no idea
>> what you mean by the sampling and reconstruction filter. I have done a
>> lot of image processing, and transformations only use *one* filter,
>> called the "sampling filter". This filter changes depending on the
>> scale and on the fractional portion of the sample point.
> Say that we want to transform a source image with a transformation that
> is a downscale in one dimension and an upscale in another.
Actually this is no more difficult than two unequal scales that are
both larger or both smaller than one; see below.
Each pixel in an affine transformation actually has a source position
and two vectors (which I usually call the derivative) describing how the
source position moves as the output pixel is stepped horizontally or
vertically. It is easier to picture this as describing a rhombus in the
source space; all the rhombuses are the same shape and tile to cover the
source image. Note that the filtering can (and should) sample outside
this rhombus; it just describes the "size" of the source filter.
Virtually all implementations, including pixman's, convert this rhombus
into an axis-aligned rectangle. These rectangles do not tile the
source image, so an approximation has already been made at this point.
This means that approximations are allowed in describing the algorithm.
(For non-affine transforms the shape is an arbitrary quadrilateral, with
8 degrees of freedom. All systems I have seen abandon this immediately,
instead using the rhombus described by the derivative at the center of
the output pixel, which adds even more inaccuracy.)
> (a) Get rid of the replicated copies of the baseband. This is
> done by lowpass filtering so that only one copy remains.
> (b) Transform the image. This will distort the one remaining baseband
> copy so that it becomes bigger in one dimension and smaller in
> another. Parts of it will now stick out outside the baseband.
> (c) Lowpass filter again to get rid of the bits of baseband copy that
> ended up outside the baseband square. Otherwise those bits will cause
> aliasing in the next step.
> (d) Sample the image; this will once again replicate the baseband to
Yes, this is correct, but if you assume perfect lowpass filters, one of
(a) or (c) is redundant. Only the lowpass filter with the smaller
passband is needed, since it also cuts off all the frequencies that the
larger one would. A smaller lowpass passband corresponds to a wider sinc
filter in the spatial domain. The end result is that at any scale less
than one, the lowpass filter in (c) is the only one needed; back-mapped
through the transform, this is a sinc filter widened to 1/scale. For
scales greater than one, the lowpass filter in (a) is the only one
needed; this is a sinc of width one.
It is true that filters which are not perfect lowpass filters will
produce different results if you actually implemented this by convolving
one width-1 filter with another filter of width 1/scale (as pixman
appears to do). However, since the purpose is to *simulate* a lowpass
filter, it works much better to assume your filter is a true lowpass
filter and thus just use the wider one, ignoring the narrower one. This
seems to be where the confusion in pixman arises: we are *not* trying to
simulate two imperfect filters; we are trying to pick a single imperfect
filter that is as close as possible to a perfect one!
The end result is that sampling should be done by multiplying a single
filter by the source image. The width of this filter is max(1, 1/scale).
This works even if the scale is greater than 1 in one direction and less
than 1 in the other; it just means the filter is 1 wide in the direction
that is scaled up.
For filters with frequencies <= 1/2 (which is true of all the above
ones), sampling theory indicates that only one sample per pixel is
needed, multiplied by the filter evaluated at that point. Although the
box used by bilinear has unbounded frequency content, it can be
represented as a triangle of width 2 when used this way, and thus
integrated into algorithms using other filters.
Now, I have certainly seen algorithms that *switch* filters at a given
scale. If the switch is done at scale = 1, then you could say step (a)
is one of these filters and step (c) is a different one. However, if you
assume they are simulating perfect lowpass filters, then you just choose
one of them; you don't multiply. Furthermore, I have seen the cutoff
placed at points other than 1: one example is the suggestion that
bilinear be used for scales >= 1/2. It is also common to switch to a
pixel-aligned box filter at very small scales.
> Now, when the transform is affine, all the four steps can be combined
> into one by convolving the two filters, and this is what pixman does.
This can be done for non-affine transforms too. It just means the
convolved filter varies depending on which pixel is being computed, but
the single convolved result can be used for each pixel.
In actual implementations, where the filter is a set of weights to
multiply each pixel by, the filter will "vary" even for affine
transforms: it depends on the fractional portion of the sample point.
The only time the filter will not "vary" is when the scale is an integer
and the rotation is a multiple of 90 degrees.