[gst-devel] Software anti-patent: Luminaplex: constant luminance AND chrominance encoding algorithm

Karel Kulhavy clock at twibright.com
Sun Jan 28 17:35:08 CET 2007


Hi

I have developed an algorithm for chroma subsampling that keeps both constant
luminance and constant chrominance. It's called Twibright Luminaplex. I am
sending a description for two purposes. Maybe GStreamer will find it useful and
will be able to implement it (there is a reference implementation under the GPL). I
am also opposed to the idea of software patents and want to publish prior art
on an independent mailing list, which is a recommended practice to ensure that
if someone tries to patent the idea, the patent will fail at the first court
trial.

I have also developed another algorithm that produces a slightly worse result,
but is faster and its execution time doesn't depend on the content.

The following description has been published at
http://ronja.twibright.com/hyperluma.php and on Freshmeat under the name
Luminaplex (still a new pending entry). Anyone who wants to be able to testify
against a possible software patent should save this e-mail, together with the
aforementioned HTML, somewhere with their received e-mail. Thanks.

Regards,

-- Karel Kulhavy


<p>I am publishing the algorithm description at the moment of release on an
independent server (Freshmeat) to have a reasonable proof of prior art and
resulting patent invalidity in case someone tries to patent my idea.

<p><b>Hyperluma and Luminaplex - algorithms for improved video
encoding</b></p>

<p>Twibright Labs have developed a family of algorithms to
encode ordinary 4:2:0 and 4:2:2 Y'CbCr video with unprecedented colour detail
quality. There are three algorithms - Hyperluma 1, Hyperluma 2 and Luminaplex,
in order of increasing picture quality.
The algorithms solve the so-called <a href=
"http://en.wikipedia.org/wiki/Chroma_subsampling">chroma bleed</a> problem,
which results from the mathematically incorrect idea of chroma subsampling used in
analog and digital video broadcast, computer video encoding and similar
applications. Chroma bleed is most visible as dark contours along edges of
bright colours in the picture - for example, the black bar on the edge between
green and magenta and between red and blue on a TV testcard. However, it exists
for any kind of colour detail and contributes to the overall corruption of
image details.</p>

<p>5 vertical strips of the screenshot show the quality of the
individual encoding algorithms compared with the original. Note how the edge
corruption decreases from left to right. Use a CRT to view this image properly;
LCDs usually display false artifacts on the edges. Nearest-neighbour chroma
interpolation is assumed in the decoder. </p>

<p><b>Twibright Hyperluma 1</b></p>

<p>Hyperluma 1 internally simulates the receiver (decoder) of the signal
and then adjusts the luma channel until the effect is cancelled. As a result,
the artifacts that could prevent the viewer from resolving details
are gone.</p>

<p>During chroma subsampling we are presented with 4 pixels, each containing R, G,
and B values. We are supposed to produce four Y' (luma) values, one Cb
(blue chroma) value and one Cr (red chroma) value. The original, ancient idea
of the TV system was that the
luma values would describe perceived brightness (luminance), which the human
eye perceives at twice the resolution of the colour information.
However, as the luma and chroma are calculated from gamma-corrected (nonlinear)
R' G' B' values, luma doesn't represent just luminance (perceived brightness),
but is also influenced by the chroma values. To reconstruct luminance properly,
both luma and chroma values are necessary - luminance depends on both.</p>

<p>When the chroma values are subsampled and therefore changed, the luminance
suffers too, because it depends partially on the chroma. In an ordinary encoding
system, the first step is to convert the R' G' B' values into Y' Cb Cr, and the
second is to subsample the Cb Cr. Hyperluma instead converts R' G' B' into
gL Cb Cr, where gL is the gamma-transformed luminance. gL accurately represents the
brightness as perceived by the viewer. In the second step, the Cb and Cr are
subsampled as usual. Hyperluma then adds a third step, where gL Cb Cr are used
to calculate Y' - a luma that, together with the given Cb Cr, reproduces the brightness
perception that was in the picture at the given pixel position.</p>

<p><b>The core of Hyperluma - the Hyperluma Table</b></p>
<p>The core of the Hyperluma calculation is to solve the following mathematical
problem: given Cb, Cr, and gL, what is the Y' that, together with Cb and Cr, produces
the given gL?</p>

<p>Because the function transforming Y'CbCr into gL (gamma-transformed
luminance) cannot easily be inverted mathematically, the software uses bisection
(binary search) to find the solution. As this is 8 times slower than the forward
transformation and would slow down the video encoding, it precalculates the
reverse transformation into a <a href="http://ronja.twibright.com/utils/hyperluma_table">universal
12.96-megabyte lookup table</a>.  This table can be compressed down to
just 29 kB with bzip and then easily distributed with the program.  When the
program starts, it loads the table into RAM, allowing fast video processing
without a need to wait on program startup until the lookup table is
precalculated.</p>
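<p>The inversion behind the table can be sketched directly: with Cb and Cr fixed,
the forward transform from Y' to gL is monotonic, so bisection converges. The
following Python sketch is an illustration only, assuming a simple power-law
gamma of 2.2, the Rec.601 matrix, and values normalized to 0..1 (the constants
in the reference pix-y4m.c may differ):</p>

```python
GAMMA = 2.2  # assumed simple power-law transfer, not necessarily what pix-y4m.c uses

def ycbcr_to_gl(yp, cb, cr):
    """Forward transform: luma + chroma -> gamma-companded luminance gL.
    Rec.601 conversion; values normalized to 0..1, chroma centered on 0."""
    rp = yp + 1.402 * cr
    gp = yp - 0.344136 * cb - 0.714136 * cr
    bp = yp + 1.772 * cb
    # linearize (undo gamma), clamping to the displayable range
    r, g, b = (max(0.0, min(1.0, c)) ** GAMMA for c in (rp, gp, bp))
    y = 0.299 * r + 0.587 * g + 0.114 * b   # linear-light luminance
    return y ** (1.0 / GAMMA)               # gamma-companded luminance

def luma_for_gl(gl, cb, cr, iters=20):
    """Bisect for the Y' that reproduces the requested gL with the given
    chroma. Works because gL is monotonically increasing in Y'."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if ycbcr_to_gl(mid, cb, cr) < gl:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

<p>Precalculating luma_for_gl over a quantized (gL, Cb, Cr) grid would yield a
lookup table playing the role described above.</p>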

<p>After I developed the algorithm, published it and consulted specialists,
it turned out that the algorithm is patented. That was the impulse to develop
Hyperluma 2 and ultimately Luminaplex, which improve the quality even more
and don't seem to be covered by any patent.</p>

<p>The problem with Hyperluma 1 is that while it removes the problem of nonconstant
luminance by providing constant luminance, the chrominance stays
the same. Through chroma filtering and the nonlinearities in the chroma
definition, the chrominance in the picture suffers. This is partially solved
by Hyperluma 2.</p>

<p><b>Twibright Hyperluma 2</b></p>
This is how 4:2:0 Y'CbCr is produced with Hyperluma 2:
<br><br>
<br>&nbsp;&nbsp;* Input: a 2x2 block of RGB pixels, 16 bits per channel
<br>&nbsp;&nbsp;* Calculate a subsampled RGB pixel, with 2:1 subsampling in both dimensions
<br>&nbsp;&nbsp;* Calculate subsampled R'G'B' from the subsampled RGB by applying the
gamma function
<br>&nbsp;&nbsp;* Calculate Cb and Cr from the R'G'B'
<br>&nbsp;&nbsp;* Calculate the 16-bit luminance Y for each of the 4 pixels from the original
high-resolution RGB
<br>&nbsp;&nbsp;* Calculate Yc, a companded luminance, by applying the gamma function to the
luminance Y (Yc has nothing to do with Y'!)
<br>&nbsp;&nbsp;* Use Yc, Cb and Cr as input to the hyperluma table to calculate the luma Y'
for each of the four pixels
<br>&nbsp;&nbsp;* Send the resulting 6 numbers - Y', Y', Y', Y', Cb, Cr - to the output

<p>4:2:2 encoding is analogous. Note that the algorithm doesn't depend on
which subsampling filter you use.</p>
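<p>The steps above can be sketched in Python. This is an illustration, not the
reference implementation: it assumes a power-law gamma of 2.2, Rec.601
coefficients, values normalized to 0..1, and it inverts the luma relation by
bisection instead of using the precalculated hyperluma table:</p>

```python
GAMMA = 2.2  # assumed power-law transfer function (sketch only)

def gam(x):   return max(0.0, x) ** (1.0 / GAMMA)    # linear -> companded
def degam(x): return max(0.0, min(1.0, x)) ** GAMMA  # companded -> linear

def luminance(r, g, b):  # linear-light luminance, Rec.601 weights assumed
    return 0.299 * r + 0.587 * g + 0.114 * b

def gl_from_ycbcr(yp, cb, cr):
    rp = yp + 1.402 * cr
    gp = yp - 0.344136 * cb - 0.714136 * cr
    bp = yp + 1.772 * cb
    return gam(luminance(degam(rp), degam(gp), degam(bp)))

def luma_for(yc, cb, cr, iters=20):
    # stand-in for the hyperluma table lookup: bisection on monotonic gL(Y')
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gl_from_ycbcr(mid, cb, cr) < yc:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def hyperluma2_block(block):
    """block: list of 4 (r, g, b) linear-light pixels of one 2x2 square.
    Returns (Y'0, Y'1, Y'2, Y'3, Cb, Cr)."""
    # 1. subsample by physically correct averaging in linear light
    ar = sum(p[0] for p in block) / 4.0
    ag = sum(p[1] for p in block) / 4.0
    ab = sum(p[2] for p in block) / 4.0
    # 2. gamma-correct the averaged pixel and take its chroma
    rp, gp, bp = gam(ar), gam(ag), gam(ab)
    yp_avg = 0.299 * rp + 0.587 * gp + 0.114 * bp
    cb = (bp - yp_avg) / 1.772
    cr = (rp - yp_avg) / 1.402
    # 3. per-pixel companded luminance Yc, inverted to luma Y'
    lumas = [luma_for(gam(luminance(*p)), cb, cr) for p in block]
    return (*lumas, cb, cr)
```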

<p>Hyperluma 2 improves the chrominance by calculating the chroma not by
chroma filtering, but by physically correct photon averaging. The luminance
is corrected completely, as in Hyperluma 1; the chrominance is corrected only
partially. The chrominance correction is not perfect - the luma shift
introduced by the hyperluma-table correction causes the colours to
drift from their ideal values.</p>

<p>Hyperluma 1 and Hyperluma 2 are constant-luminance algorithms. But the
viewer doesn't watch the luminance only; he cares about the chrominance too!
I took a completely different approach and set a goal of getting both constant
luminance and constant chrominance. The result is the Twibright Luminaplex
algorithm, an algorithm that produces the ultimately correct result.</p>

<p><b>Twibright Luminaplex</b></p>
<p>The ultimate idea of encoding RGB or 4:4:4 Y'CbCr into 4:2:0 or 4:2:2 Y'CbCr
is strikingly simple. You want the viewer to see on the TV the same thing that is in
the input signal. And that's what led me to the Luminaplex algorithm,
the algorithm that provides the ultimate quality.</p>

<p><b>How to tell whether the viewer sees the same</b></p>
<p>"Sees the same" doesn't mean the TV shows the same. The human eye is alleged
to have half the resolution in colour that it has in brightness. But how do you
quantify this, when the error is perceived according to a power function, while
the colour mixing is linear? I cut this Gordian knot with a simple algorithm that
I still believe is accurate. This is what the perceived_error function in the
source code calculates:</p>

<p><b>The perceived error measure</b></p>
<br><br>
<br>&nbsp;&nbsp;* Operates on 2x2 pixel blocks
<br>&nbsp;&nbsp;* Decodes the Y'CbCr data as a decoder with nearest-neighbour
interpolation would
<br>&nbsp;&nbsp;* Calculates the 16-bit linear RGB that appears on the screen phosphors
<br>&nbsp;&nbsp;* Calculates the luminance Y from RGB for each of the 4 pixels
<br>&nbsp;&nbsp;* Calculates the gamma-companded luminance Yc from each Y by simply applying
the gamma function (do not confuse Yc and Y'!); that's the brightness
perception
<br>&nbsp;&nbsp;* Averages the (linear) RGB values of all 4 pixels together, then applies
gamma so R'G'B' come out; that's the colour perception
<br>&nbsp;&nbsp;* We now have 7 numbers - Yc, Yc, Yc, Yc, R', G', B'. Calculate
the Yc, Yc, Yc, Yc, R', G', B' that would be displayed from the original
RGB signal without 4:2:0 Y'CbCr encoding. Take the difference for each
of the 7 channels, square it and sum the squares. Take the square root. This
is called the <a href="http://en.wikipedia.org/wiki/Root_mean_square">Root Mean
Square (RMS)</a> and it says how bad the viewer considers the picture.
<br>&nbsp;&nbsp;* I defined it as an RMS because it has the property that when the same absolute
difference is spread over multiple pixels, it is considered less
bad.  I think this corresponds to the reality of perception. A single
screaming blemish in the image is worse than a small imperfection uniformly
spread over the picture.
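<p>A sketch of this error measure in Python, under the same assumptions as
before (power-law gamma 2.2, Rec.601 coefficients, values normalized to 0..1);
the actual perceived_error function in pix-y4m.c may use different constants:</p>

```python
import math

GAMMA = 2.2  # assumed power-law transfer (sketch only)

def gam(x):   return max(0.0, x) ** (1.0 / GAMMA)
def degam(x): return max(0.0, min(1.0, x)) ** GAMMA

def percept(block):
    """7 perceptual numbers for a 2x2 block of linear RGB pixels:
    4x companded luminance Yc plus the companded average colour R'G'B'."""
    ycs = [gam(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in block]
    avg = [sum(p[i] for p in block) / 4.0 for i in range(3)]
    return ycs + [gam(c) for c in avg]

def decode_nn(lumas, cb, cr):
    """Nearest-neighbour decoder: every pixel of the 2x2 block shares
    the single Cb, Cr pair."""
    out = []
    for yp in lumas:
        rp = yp + 1.402 * cr
        gp = yp - 0.344136 * cb - 0.714136 * cr
        bp = yp + 1.772 * cb
        out.append((degam(rp), degam(gp), degam(bp)))
    return out

def perceived_error(lumas, cb, cr, original):
    """Root of the summed squared differences over the 7 channels."""
    a = percept(decode_nn(lumas, cb, cr))
    b = percept(original)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```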


<p><b>The Luminaplex algorithm itself</b></p>
<br><br>
<br>&nbsp;&nbsp;* Operates on blocks of 2x2 pixels
<br>&nbsp;&nbsp;* Calculate the 6 values - Y', Y', Y', Y', Cb, Cr - as for Hyperluma 2. This is
our starting point.
<br>&nbsp;&nbsp;* Now continuously monitor the quality of the data by calculating
the perceived error from these 6 numbers - Y', Y', Y', Y', Cb, Cr
<br>&nbsp;&nbsp;* Take some dimension (for example the first Y') and try to increase
the value. Does the error go up? If yes, try going the other way.
If no, keep increasing until the error would go up.
<br>&nbsp;&nbsp;* Take another dimension, then another, until you exhaust all 6 dimensions.
<br>&nbsp;&nbsp;* Go round after round until you make a round in which you didn't
do any adjustment. This is the final result.
<br>&nbsp;&nbsp;* You can see the algorithm proceeds like a satellite dish viewer who wants
to get the best picture. He starts with some picture. Then he adjusts the
azimuth for the best picture. Then he adjusts the elevation for the best picture
while keeping the improved azimuth. Then he repeats until he realizes there is
nothing more to improve. Then he sits down on the couch and
watches.
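<p>The per-block search described above is a coordinate descent, which can be
sketched generically in Python. The step size of 1/255 (one 8-bit LSB) is an
assumption, and the error callback stands in for the perceived-error measure:</p>

```python
def coordinate_descent(values, error, step=1.0 / 255.0):
    """Greedy per-axis minimization: nudge one value at a time in the
    direction that lowers the error, sweep all axes, and repeat until a
    full sweep makes no change. `values` would be the [Y',Y',Y',Y',Cb,Cr]
    list; `error` maps such a list to the perceived error."""
    values = list(values)
    best = error(values)
    improved = True
    while improved:
        improved = False
        for i in range(len(values)):
            for direction in (step, -step):
                # walk in this direction while the error keeps dropping
                while True:
                    values[i] += direction
                    e = error(values)
                    if e < best:
                        best = e
                        improved = True
                    else:
                        values[i] -= direction  # undo the bad step
                        break
    return values, best
```

<p>On a separable error surface this finds the exact grid minimum; on the real
perceived-error surface it finds the local minimum the text describes.</p>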


<p>Note that this algorithm, unlike Hyperluma 1 and 2, is defined under the
assumption that the receiver uses a nearest-neighbour filter for the chroma
upsampling. There is actually no need to use any more complicated filter than
this. A transmission chain with a Twibright Luminaplex encoder and a
nearest-neighbour decoder preserves both the luminance
and the chrominance in the passband of the human eye. Some spatial
frequency mirror images are generated in the chrominance area above the capability
of the human eye to discern colours, because the average-4-pixels filter doesn't
cut off well above the Nyquist frequency. However, this doesn't matter, because
the human eye cannot see these mirror images anyway.</p>

<p>Running a decoder with a more sophisticated chroma filter on Luminaplex
output is not only a waste of CPU time, but also degrades the picture somewhat,
because the Luminaplex output was optimized for the simplest filter. For example,
mplayer with PNG output (where I can study the output exactly) uses this simplest filter.</p>

<p><b>How to optimize Luminaplex output for decoders with sophisticated
filters?</b></p>

<p>But what if we need to encode with Twibright Luminaplex for a decoder that
uses a more sophisticated filter, like bilinear or even bicubic? Then the
task becomes more complicated, but not impossible.</p>

<p>Instead of optimizing 2x2 blocks, we now have to optimize the whole picture
as a single block, because changing one luma or chroma value influences the
errors in several neighbouring pixels. It's the same algorithm, but instead
of 6 dimensions we have several hundred thousand or millions of dimensions.</p>

<p>In each iteration we don't have to recalculate the global image error; we need
to calculate the error only for those output pixels that are influenced
by the changed value. The size of this area is given by the size of the filter
kernel in the decoder. This will surely slow down the calculation several times,
depending on the kernel size, but it can still be practical for one-time encoding
of videos for distribution, for example on a website or on a DVD. I didn't
implement this variant in pix-y4m.c because it would be more complicated and
slower, and I didn't intend to use it on the Ronja project.</p>

<p>The Luminaplex algorithm is not suitable for live broadcast because in theory
one could feed in a poisonous picture containing many difficult
blocks that would slow down the encoder and cause a stream dropout.
Hyperluma 2 doesn't have this problem; its execution always takes the same
time.</p>

<p>Some 8-bit R'G'B' triplets are not representable in 8-bit Y'CbCr -
192,64,192 seems to be an example. In such cases Luminaplex sometimes produces a
kind of 2x2 dithering to minimize the error, leaving very slight dithering
artifacts on colour edges or in areas of constant colour. If this is not
acceptable for some reason, Hyperluma 2 must be used instead. Sometimes the
colour transition is so sharp that Y'CbCr would have to go out of range to
produce the correct result. The program was originally written to allow this,
but it turned out that mplayer seems to clip Y'CbCr into the 16-235, 16-240,
16-240 range before processing, and the result was horrible - a green tint on
black. So I removed this and allowed only Y'CbCr values within the range to be
used. This degraded the SNR in the numeric test (below) only by about
a decibel, from 45 dB to 44 dB.</p>

<p><b>Numeric quality comparison</b></p>
<p>I made a test with randomly generated pixel content. Such content exercises
all possible colour transitions in the linear RGB cube homogeneously. The RMS
error is calculated in LSBs per pixel - if it comes out as N, it means you
would need to increase all 8-bit R'G'B' channels in a picture by N to get
the same level of perceived error. I also provide an SNR value in dB derived from
this RMS error. 0 dB means an error excursion of 127.5 LSB - the same excursion from the
average as the strongest possible signal, white/black bars. 400,000 randomly
filled pixel squares were used for each run.  Iterations per pixel are also
indicated; this counts how many times the perceived_error calculation is
performed per pixel (of which 0.25 is for evaluation purposes only). The
time/pixel was measured on a 1.5 GHz Pentium M with OpenBSD 4.0, gcc 3.3.5, and
optimization flags -g -O2 -fomit-frame-pointer -funroll-loops -fstrength-reduce.</p>

<br><b>Algorithm - RMS error [LSB] - SNR [dB] - Time/pixel [usec]
- Iterations/pixel</b>
<br>
Ordinary - 9.075 LSB - 22.95 dB - 0.20 usec - 0.25 iter. <br>
Hyperluma 1 - 5.635 LSB - 27.09 dB - 0.18 usec - 0.25 iter. <br>
Hyperluma 2 - 2.800 LSB - 33.17 dB - 0.18 usec - 0.25 iter. <br>
Luminaplex - 0.731 LSB - 44.83 dB - 2.01 usec - 6.96 iter. <br>
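<p>Assuming the 0 dB reference stated above (an error excursion of 127.5 LSB),
the SNR column follows from the RMS column as 20&middot;log10(127.5/RMS), which can
be checked in Python:</p>

```python
import math

def snr_db(rms_lsb):
    """SNR in dB relative to the strongest possible signal, interpreted
    as a 127.5 LSB excursion (the stated 0 dB reference)."""
    return 20.0 * math.log10(127.5 / rms_lsb)
```

<p>Plugging in the RMS errors from the table reproduces its SNR column:
for example snr_db(9.075) gives about 22.95 dB and snr_db(0.731) about 44.83 dB.</p>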
<p><b>Source codes</b></p>
<p><a href="http://ronja.twibright.com/utils/pix-y4m.c">pix-y4m.c</a> is the source code. Hyperluma 1
is disabled in the code: as it stands, the code implements Twibright Luminaplex,
and the disabled Hyperluma 1 code is there only for illustration.</p>

<p><b>Usage in real world</b></p>
<p><a href="http://ronja.twibright.com/3d/">Ronja 3D videos</a>, which are used to illustrate how to put
<a href="http://ronja.twibright.com/">Ronja</a> together, are now all encoded with the Luminaplex
algorithm. This is indicated by a small Luminaplex logo in the upper left corner.
I think it was a good idea to develop the Luminaplex algorithm for this application:
the models are brightly coloured for comprehensibility, which is where the
chroma bleed problem is strongest. The chroma bleed disease attacks
details of the picture, and people could have problems identifying all those
small nuts, bolts and holes in the Ronja models.</p>

<p><b>Patents</b></p>
<p>As I already mentioned, the Hyperluma 1 algorithm seems to be covered by
some patents. Hyperluma 2 and Luminaplex are not, I hope. On January 28, 2007 I
published these algorithms and their GPL free software implementation to
constitute prior art and to prevent someone from appropriating them - patenting
them and then being able to prevent people from using them. I have
sent a note to various mailing lists and respected individuals so that a plausible
proof of prior art can be constructed to challenge a patent if someone tried to
abuse the patent system this way.</p>

<p>This of course applies only if the Hyperluma 2 and Luminaplex
algorithms don't accidentally happen to be already covered by a patent, as
happened in the case of Hyperluma 1. I didn't conduct a patent search, because
these are expensive; I checked just the patents connected with Hyperluma 1. If
you are curious you can conduct a patent search yourself. Should you find a
patent that covers either algorithm, let me know.</p>

<p>Software patents are harmful to software development, especially free
software. If you look into a public patent database you can see the patent
system is full of simple algorithms performing simple tasks, effectively
banning free use of independently invented innovative ideas.</p>

<p><b>Bibliography</b></p>
<br><br>
<br>&nbsp;&nbsp;* <a href="http://www.poynton.com/PDFs/Mag_of_nonconst_luminance.pdf">The magnitude of nonconstant luminance errors (Acrobat PDF format)</a> by Charles
Poynton
<br>&nbsp;&nbsp;* <a href="http://www.poynton.com/ColorFAQ.html">Poynton's Color FAQ</a>
<br>&nbsp;&nbsp;* <a href="http://en.wikipedia.org/wiki/Chroma_subsampling">Chroma Subsampling</a> on Wikipedia
<br>&nbsp;&nbsp;* <a
href="http://poynton.com/papers/YUV_and_luminance_harmful.html">YUV and luminance considered harmful</a>
by Charles Poynton
<br>&nbsp;&nbsp;* <a href="http://en.wikipedia.org/wiki/Mach_bands">Mach Bands</a>
<br>&nbsp;&nbsp;* <a href="http://en.wikipedia.org/wiki/Cornsweet_illusion">Cornsweet illusion</a>
<br>&nbsp;&nbsp;* <a
href="http://v3.espacenet.com/textdoc?DB=EPODOC&IDX=GB2293514&F=0">GB2293514</a>
&quot;Luminance signal coding; correcting for failure of constant
luminance&quot;
<br>&nbsp;&nbsp;* <a href="http://v3.espacenet.com/textdoc?DB=EPODOC&IDX=WO9609724&F=0">WO9609724 </a>
&quot;Luminance signal coding; correcting for failure of constant
luminance&quot;
<br>&nbsp;&nbsp;* <a href="http://www.google.com/patents?vid=USPAT4999702&id=ZFIfAAAAEBAJ&dq=4999702&ie=ISO-8859-1">US4999702</a>
&quot;Method and apparatus for processing component signals to preserve high frequency intensity&quot;







