[Mesa-dev] GSoC : Video decoding state tracker for Gallium3d

Mon Apr 4 13:33:03 PDT 2011

Hello again!
Last call for comments: this is the GSoC proposal that I will submit,
probably tomorrow.

Thank you,
Emeric

GSoC 2011 Proposal
Hardware Accelerated VP8 Video Decoding for Gallium3D

==================== Abstract ===================
The goal of this project is to write a Gallium3D state tracker capable
of hardware accelerated VP8 video decoding through the VDPAU API. This
would allow every graphics card with a Gallium3D driver (primary
targets are the r300g, r600g and Nouveau drivers) to decode VP8
videos, and every VDPAU-enabled multimedia application to play these
videos.
Hardware acceleration will be built upon the graphics card’s shader
units, which can take care of heavy computations such as motion
compensation, intra-prediction, IDCT and the deblocking filter.

================= Benefits to Community =================
Video decoding can be a heavy task even for today’s hardware. High
definition videos need a great deal of computational power to decode
smoothly, and low-end CPUs are sometimes unable to play videos at a
decent speed. When a CPU is not powerful enough, the usual solution is
to offload the video decoding process to a dedicated chip living
on-board the GPU. Unfortunately, these dedicated chips cannot be used
by free software drivers, because no documentation is available for
them, and the only solution left is pure CPU-based decoding. Bringing
generic hardware accelerated video decoding to free software drivers
would be a great feature, allowing users to fully enjoy their video
experiences.

VP8 has been chosen for this project because of its straightforward
design and its openness. Unlike most existing video formats, no
patent problems are expected when the time comes to merge the GSoC
project with mesa master.
VDPAU has been chosen because the API is widely used by multimedia
software and has a solid reputation.

=========== Implementation plan, with two major steps : ==========
* Get a purely software implementation of a VP8 decoder running inside a
proper Gallium3D state tracker, and use it to play videos through
various "VDPAU ready" applications.
- The goal of this project is not to build a VP8 decoder from scratch,
as that task would be an entire GSoC project in itself, but rather to
work on GPU optimizations, so the first step will be to port an
existing VP8 decoder into a state tracker. My first choice is to work
with the ffmpeg VP8 decoder, as it is a clean from-scratch
implementation.
- Since VP8 is currently not supported by the official nVidia VDPAU
implementation, multimedia software will need to be slightly patched
in order to recognize the VP8 decoding ability of the state tracker.
- That step also includes making the state tracker ready to host more
video formats in the near future.
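
One way to keep the state tracker open to more video formats is to hide
each codec behind a small table of function pointers. The sketch below
is only an illustration of that idea: every name in it (vid_codec,
vid_frame, vid_decoder_ops, vid_find_decoder, vid_decode) is
hypothetical and does not come from Mesa, VDPAU or ffmpeg, and the VP8
entry is a stub that fills the frame with grey instead of decoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical codec list; adding a format means adding one entry. */
enum vid_codec { VID_CODEC_VP8, VID_CODEC_COUNT };

/* Hypothetical output frame: a single luma plane owned by the caller. */
struct vid_frame {
    int width, height;
    uint8_t *luma; /* width * height bytes */
};

/* Per-codec entry points; decode_frame returns 0 on success, -1 on error. */
struct vid_decoder_ops {
    const char *name;
    int (*decode_frame)(const uint8_t *data, size_t size,
                        struct vid_frame *out);
};

/* Stand-in for the CPU VP8 path: it does NOT decode anything, it only
 * validates its arguments and paints the frame mid-grey. */
int vp8_decode_frame_stub(const uint8_t *data, size_t size,
                          struct vid_frame *out)
{
    if (data == NULL || size == 0 || out->luma == NULL)
        return -1;
    memset(out->luma, 128, (size_t)out->width * (size_t)out->height);
    return 0;
}

static const struct vid_decoder_ops decoder_table[VID_CODEC_COUNT] = {
    [VID_CODEC_VP8] = { "vp8", vp8_decode_frame_stub },
};

/* Look up the ops for a codec id; NULL if the codec is unknown. */
const struct vid_decoder_ops *vid_find_decoder(int codec)
{
    if (codec < 0 || codec >= VID_CODEC_COUNT)
        return NULL;
    return &decoder_table[codec];
}

/* Codec-neutral entry point that the API layer would call. */
int vid_decode(int codec, const uint8_t *data, size_t size,
               struct vid_frame *out)
{
    const struct vid_decoder_ops *ops = vid_find_decoder(codec);

    assert(out != NULL); /* passing no frame is a programming error */
    if (ops == NULL)
        return -1;
    return ops->decode_frame(data, size, out);
}
```

With this layout the API-facing code only ever calls vid_decode(), so a
second format would need a new enum value and a new table entry rather
than changes to the callers.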

* Start implementing shader-based optimizations, where they can be useful.
This part of the project will consist of optimizing the most
shader-friendly parts of the decoding process, including motion
compensation, the deblocking filter, intra-prediction and IDCT, by
moving them from the CPU to the GPU shaders. These shader programs will
be written using the Tokenized Gallium Shader Instructions (TGSI)
intermediate representation (IR).
The main advantage of using the GPU shaders for video decoding is
that they provide additional computational power, accessible in a
generic way across many different graphics cards and operating systems.
This raw power can be dedicated to video decoding tasks, significantly
offloading the CPU and leaving it free to perform other common tasks.
The trick is to properly manage the power of these shaders. They
perform very well on vectorized arithmetic operations, and very poorly
on control flow (branches, loops, ...). They don’t have proper
cache memory and have very slow access to the main memory. Instead
they can access memory areas stored on the video card, and read from or
write to these areas (but cannot both read from and write to the same
memory area).
In order to achieve a significant speed gain, the different decoding
tasks must be carefully divided into small independent and repetitive
tasks operating on large sets of data, which can be simultaneously fed
to several shader units.
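
As a rough illustration of such a division, the C sketch below writes
motion compensation as a per-pixel function with no dependency between
output pixels, which is the shape a shader needs. It is a simplified
model and not the proposed implementation: it assumes full-pixel motion
vectors (real VP8 also has sub-pixel vectors that require filtering),
it assumes the caller keeps the displaced block inside the reference
frame, and the names mc_pixel and mc_block are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp an intermediate value to the 8-bit pixel range. */
static uint8_t clamp_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* One output pixel: prediction fetched from the reference frame at the
 * displaced position, plus the decoded residual. This plays the role of
 * a single shader invocation; it reads only its inputs and shares no
 * state with other pixels. */
uint8_t mc_pixel(const uint8_t *ref, int ref_stride,
                 const int16_t *residual, int res_stride,
                 int x, int y, int mv_x, int mv_y)
{
    int pred = ref[(y + mv_y) * ref_stride + (x + mv_x)];
    return clamp_u8(pred + residual[y * res_stride + x]);
}

/* CPU stand-in for launching the "shader" over a w x h block. The
 * iterations are independent, so they could run in any order or in
 * parallel. Source and destination are separate buffers, mirroring the
 * restriction that a shader cannot read and write the same area. */
void mc_block(const uint8_t *ref, int ref_stride,
              const int16_t *residual,
              uint8_t *dst, int dst_stride,
              int w, int h, int mv_x, int mv_y)
{
    assert(ref != dst);
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            dst[y * dst_stride + x] =
                mc_pixel(ref, ref_stride, residual, w, x, y, mv_x, mv_y);
}
```

The residual is stored with a stride equal to the block width, so a 2x2
block uses four consecutive int16_t values.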

=================== Deliverables ===================
The deliverables will correspond to the two major milestones of the
project. The first one will be a functional Gallium3D state tracker
able to decode VP8 videos. This is an important foundation for building
more features beyond the scope of this project. The second one
will be a more elaborate version of the state tracker, using shaders
to achieve faster video decoding.

================= Roadmap estimation =================
I will be able to work full time, 35h per week, on this project starting
mid-June, after the end of my academic year. Additionally, I will have
some free time after the Summer of Code ends if some tasks need to be
polished, since my next academic year only begins in November.

May
A - Study state tracker implementation.
B - Get an “empty” Gallium3D state tracker prototype to compile.
C - Study a suitable architecture for a generic video decoding system.
D - Start the implementation of a VP8 decoder in the state tracker,
using the ffmpeg VP8 decoder as a model.

June
E - Implementation of the CPU based VP8 decoder.
F - Start working on GPU optimization designs.

July
G - Implementation of motion compensation.
H - Implementation of loop filter.
I - Implementation of intra-prediction?

August
J - Implementation of IDCT?
K - Get videos to work without visual artifacts (debug).
L - Deliver the project.

================ Biographical Information ===============
My name is Emeric Grange, I am a 23-year-old French student,
currently in the first year of a master's degree in computer science. I
spent last year's internship working part time at a French company
specializing in IPTV. My job was to build an H.264 decoding library
from scratch. I learned a lot about video decoding, and as I liked what
I learned, I decided to apply to the Summer of Code with a project
related to what I had been doing during the past year.

As far back as I can remember, I've always been in contact with
computers through my father’s hobby, slowly learning how to use them
for playing, working, and then how to make them better. The languages I
learned include QBasic, C, HTML, PHP, C++, Java, x86 assembly,
Python, Ruby and a few others. I have now decided to focus on low
level programming, the most interesting area in my opinion.
I've used free software for many years, and made various small
contributions to existing software along the way, including
debugging, support and translations, while I was learning to be a
programmer. My motivation to join the Google Summer of Code is to dive
into a concrete project and to enrich it with new features, now
that I feel capable of doing so.

=================== Related Works ===================
Shader-accelerated MPEG-2 decoding, using XvMC in a Gallium3D state
tracker (previous GSoC work)
http://www.bitblit.org/gsoc/g3dvl/proposal.shtml

OpenCL accelerated VP8 decoder, using libvpx
https://github.com/awatry/libvpx.opencl

OpenGL accelerated H.264 decoder, using ffmpeg (previous GSoC work)
http://wiki.xbmc.org/?title=GSoC_-_GPU_Assisted_Video_Decoding
https://github.com/kasbah/gsoc#readme (an attempt to finish the
project mentioned above)

2011/3/23 ★ Emeric <emeric.grange at gmail.com>
>
> Hi everyone,
> My name is Emeric, I am a 22-year-old French student, and I am
> currently looking to apply to the Google Summer of Code 2011.
> I saw the "Gallium H.264 decoding" idea on the X.Org GSoC page, and I
> am particularly interested in this project.