[Mesa-dev] hardware xvmc video decoding with nouveau
deathsimple at vodafone.de
Fri Aug 12 07:04:42 PDT 2011
> Take a closer look at decode_macroblocks in vl_mpeg12_decoder. Having
> the state tracker need to care about how much the decoder is buffering
> is a bad idea.
I totally agree on that, so I don't want to force the driver to use
whatever buffering method the state tracker thinks is best; I just want
to give the driver enough information to decide how it needs to lay out
its buffers.
> I don't see how this is going back to a flawed design,
> it's doing almost exactly what is currently done, just not in the
> state tracker.
I think I have to apologise for that; what I wanted to say is that it
isn't a good idea to redefine stuff that is already defined in the spec.
Things like picture structure, macroblock addressing/type, motion modes,
etc. are part of the spec, so we should not redefine them.
> > 1. We have a create_buffer call which will return just a void
> > pointer. This makes it possible for decoders which need frame based
> > decoding to return a pointer to an internal frame decoding buffer,
> > while it makes it possible for a decoder which is slice based to just
> > return a pointer to itself (for example), with this value not being
> > used otherwise.
> That's not useful, the hardware decoder needs to parse the incoming
> macroblocks for itself. It needs to know things like the CBP, motion
> type, etc. Your current decode buffer design throws all that info
> away. All I did here is pass the macroblocks on to the driver and let
> it do the parsing. In the shader case it still dumps stuff to the
> decode buffer.
I agree on that; my intention in making the decode buffers with their
access methods a public interface wasn't to make it a long term
solution. I just looked at the problem at hand and tried to solve that
one, with the hope that the information is still usable by everybody.
> Valid for it to be 0 in the case where field-based motion estimation
> isn't used as far as I can tell. This never gets called for that case.
Even for field-based motion estimation 0 is a valid value, it just means
that every prediction is coming from the top field.
> Well, if that code was still there for me to refer to I would have
> caught it; whatever you fixed before is lost in git history so
> apologies if I don't catch it, I'm just going by what I recall things
> looking like.
I know, my fault; just take a look at the initial commit comment. I
wasn't sure if that was a good idea either, but after realising that
your code doesn't use the spec definitions either, I just got bold and
tried something else to solve the problems I was seeing.
> > B) or implement my own video buffer which, in addition to the video
> > data, holds a reference to the command buffer which should be used
> > (also because with vdpau you just don't know whether a video buffer
> > is used with hardware decoding or software decoding, but maybe that
> > is just another state tracker dependent thing that needs to be
> > hidden from the driver)
> This is actually not an ugly idea and makes the most sense. If you
> need every surface to have its own separate decode buffer for XvMC
> you do that. If you need to implement decode buffers in the decoder
> for VDPAU you do that also.
But we somehow need a consistent interface for all state tracker
implementations, regardless of the use of the video buffer. If I tie the
decoder buffer to the video buffer then that implies the same (stupid)
buffering policy for XvMC is also used for VDPAU and everybody else.
> The problem of what to do with VDPAU's context-free surface
> creation when it comes to which decoder is being used is completely
> separate from this. Any driver that wants to support both shader-based
> decoding and hardware decoding at runtime should use a delayed
> allocation pattern similar to what the failover driver did for
> HW/softpipe mixing, that is don't create the actual buffer until the
> decoder sees it and at that point create the right buffer. This means
> of course that a buffer allocation failure would fail later than it
> would under a regular VDPAU driver, but the VDPAU API wasn't designed
> with runtime switching of decoders in mind so I'm fine with that
The problem is even worse: VDPAU even allows using the same surface
with different decoders. It just says that a surface decoded with one
decoder should not be used as a reference frame for a different decoder
(and even that only for H264).
I can't speak for nvidia hardware, but at least with AMD hardware we
can still have the same surface used in hardware and software decoding
(the proprietary driver does exactly this). But as you already stated,
that's a completely different problem, and yes, delayed allocation
seems to be the right way to go.
> The attached patch I believe should satisfy everyone's needs here. It
> removes the use of pipe_video_decode_buffer from the state tracker and
> moves it to the shader decoder. This lets every decoder parse the
> incoming macroblocks themselves so they can do whatever they need.
> Too big to inline, see attached. Tested on XvMC/VDPAU with softpipe
> and an RV620 with a couple of videos (this doesn't imply that any/all
> combinations produced correct output, just that it was the same
Ok, take a look at the attached set of patches, they are roughly based
on your patch:
1) I try to keep the buffer management inside the driver as much as
possible, while still having enough information from the state tracker
to decide what is the right thing to do with the buffers.
2) The structure used to pass in macroblock data is (hopefully) based
on the spec. It's obvious that I used vaapi's structure as a reference,
but I hacked that together all by myself.
3) The decoder now has some more hooks to get information from the
state tracker, but not all of them need to be implemented. The flush
function, for example, is called right before the surface is used by
the 3D engine, but is just a no-op in shader based implementations.
In case you wonder why I don't mention your patch in the commit
messages: I would really like to have your "Signed-off-by" line in the
commit message before it gets pushed.