[Mesa-dev] hardware xvmc video decoding with nouveau

Mon Aug 8 06:00:59 PDT 2011

On 08/08/2011 12:10 PM, Christian König wrote:
> Am Samstag, den 06.08.2011, 14:37 -0400 schrieb Younes Manton:
>> The attached patch I believe should satisfy everyone's needs here. It
>> removes the use of pipe_video_decode_buffer from the state tracker and
>> moves it to the shader decoder. This lets every decoder parse the
>> incoming macroblocks themselves so they can do whatever they need.
> Beside the fact that this patch adds a couple of bugs, you are just
> going back to a design wich has already proven to be a bit flaw full.
>
> Please take a look at these two documents first:
> http://msdn.microsoft.com/en-us/library/ff568441(v=VS.85).aspx
> http://cgit.freedesktop.org/libva/tree/va/va.h
>
> The problem I tried to solve with pipe_video_decode_buffer is that it is
> valid for XvMC to do something like that:
>
> XvMCRenderSurface(target A);
> XvMCRenderSurface(target B);
> XvMCRenderSurface(target A);
> XvMCPutSurface(target A);
> XvMCRenderSurface(target B);
> XvMCPutSurface(target B);
>
> That behaviour creates a bunch of problems when the decoder can only
> work on whole frames, because you can't switch the decoding target while
> inside a frame. This is an ugliness of the XvMC, which in the past made
> it problematic to use it with a couple of hardware decoder.
>
> Most modern players doesn't do it like this any more, but it still seems
> to cause a bunch of problems when seeking or fast forward both with
> mplayer and xine.
>
> So I would suggest we design the interface something like this:
As far as I can tell, the hardware decoder I'm using has no problem with that.
I just queue all commands and on putsurface I flush them. In fact, with flushing
on putsurface it works better than other alternatives, since it can decode future
and current surface simultaneously. Each command has to say which surface it's
being applied to. :)
> 1. We have a create_buffer call witch will return just a void pointer.
> This makes it possible for decoders witch needs frame based decoding to
> return an pointer to an internal frame decoding buffer, while it still
> makes it possible for an decoder which is slice based to just return a
> pointer to itself (for example) with this value not being used
> otherwise.
>
> 2. We have a begin_frame/render/render.../end_frame calls just like
> DXVA, to clearly distinct the start and end of a frame, for frame based
> decoder this makes it possible to do clean housekeeping of the internal
> buffer, while with slice level decoder begin_frame and end_frame are
> just noops.
>
> 3. We pass the render target and the buffer to begin/render/end.
> This enables slice level decoder to start their decoding earlier, while
> it still keeps the burden of figuring out where a frame start/ends to
> the XvMC state tracker.
The nouveau hardware acceleration really doesn't need more than
the current patch by ymanton, and the XvMC api doesn't expose this
kind of information, so I don't really think it makes sense to add that.
>> Too big to inline, see attached. Tested on XvMC/VDPAU with softpipe
>> and an RV620 with a couple of videos (this doesn't imply that any/all
>> combinations produced correct output, just that it was the same
>> before/after).
> Beside the design flaws, the patch shows a couple of problems wich needs
> to be addressed first:
>
>> static unsigned MotionVFSToPipe(int xvmc_mo_vfs)
>> {
>> ...
>>    assert(pipe_mo_vfs != 0);
>>
>>    return pipe_mo_vfs;
>>  }
> The motion_vertical_field_select bitfield is part of the mpeg2 spec (see
> section 6.3.17.2). We should stick our definitions to the spec as most
> as possible and don't try to define our own, since XvMC is also using
> this field as defined in the spec it can simple be copied over.
>
> And by the way, it is completely valid for this field to be 0.
>
>> static enum pipe_mpeg12_macroblock_type TypeToPipe(int xvmc_mb_type)
>> {
>>    if (xvmc_mb_type & XVMC_MB_TYPE_INTRA)
>>       return PIPE_MPEG12_MACROBLOCK_TYPE_INTRA;
>>    if ((xvmc_mb_type & (XVMC_MB_TYPE_MOTION_FORWARD | XVMC_MB_TYPE_MOTION_BACKWARD)) == XVMC_MB_TYPE_MOTION_FORWARD)
>>       return PIPE_MPEG12_MACROBLOCK_TYPE_FORWARD;
>>    /* Workaround for Xine's xxmc video out plugin */
>>    if (!(xvmc_mb_type & ~XVMC_MB_TYPE_PATTERN))
>>       return PIPE_MPEG12_MACROBLOCK_TYPE_FORWARD;
>>    if ((xvmc_mb_type & (XVMC_MB_TYPE_MOTION_FORWARD | XVMC_MB_TYPE_MOTION_BACKWARD)) == XVMC_MB_TYPE_MOTION_BACKWARD)
>>       return PIPE_MPEG12_MACROBLOCK_TYPE_BACKWARD;
>>    if ((xvmc_mb_type & (XVMC_MB_TYPE_MOTION_FORWARD | XVMC_MB_TYPE_MOTION_BACKWARD)) == (XVMC_MB_TYPE_MOTION_FORWARD | XVMC_MB_TYPE_MOTION_BACKWARD))
>>       return PIPE_MPEG12_MACROBLOCK_TYPE_BI;
>>
>>    assert(0);
>>
>>    XVMC_MSG(XVMC_ERR, "[XvMC] Unrecognized mb type 0x%08X.\n", xvmc_mb_type);
>>
>>    return -1;
>> }
> Same problem here, the macroblocktype is defined in section 6.3.17.1 of
> the spec, and like XvMC it just defines forward, backward, pattern and
> intra flags. Since XvMC is also following the spec here this field can
> be copied over without a conversion. For a good reference see the
> VAMacroblockParameterBufferMPEG2 structure in the vaapi specification.
Avoiding the extra copy would be nice. :)
>> static enum pipe_mpeg12_motion_type MotionToPipe(int xvmc_motion_type, int xvmc_dct_type)
>> {
>>    switch (xvmc_motion_type) {
>>       case XVMC_PREDICTION_FRAME:
>>          if (xvmc_dct_type == XVMC_DCT_TYPE_FIELD)
>>             return PIPE_MPEG12_MOTION_TYPE_16x8;
>>          else if (xvmc_dct_type == XVMC_DCT_TYPE_FRAME)
>>             return PIPE_MPEG12_MOTION_TYPE_FRAME;
>>          break;
>>       case XVMC_PREDICTION_FIELD:
>>          return PIPE_MPEG12_MOTION_TYPE_FIELD;
>>       case XVMC_PREDICTION_DUAL_PRIME:
>>          return PIPE_MPEG12_MOTION_TYPE_DUALPRIME;
>>       default:
>>          assert(0);
>>    }
>>
>>    XVMC_MSG(XVMC_ERR, "[XvMC] Unrecognized motion type 0x%08X (with DCT type 0x%08X).\n", xvmc_motion_type, xvmc_dct_type);
>>
>>    return -1;
>> }
> The distinction between 16x8 and Frame based motion compensation isn't
> the dct type, it is the picture coding, I fixed this bug long ago, and
> you are just going back to the initial broken code here.
>
>> struct pipe_mpeg12_picture_desc
>> {
>> ...
>>    uint8_t *intra_quantizer_matrix;
>>    uint8_t *non_intra_quantizer_matrix;
>> ...
>> };
> Both vaapi and DXVA split those out of the picture description into
> separate structures. Honestly I still haven't figured out why, maybe
> this is just to stick more to the spec, but it could be helpful to do
> the same.
>
> So do you have time to work on this a bit more or should I take a break
> from the h264 implementation and try to do it right this time? Since I
> think I now knew what you guys need it should be possible for me to
> change the interface to something everybody is happy with.
>
I'm happy with the current api for nouveau for XvMC (minus bug above).

For vdpau, I wonder if it makes more sense to have each driver have its own library,
for example vdpau_g3dvl.so and vdpau_nouveau.so, if/when it comes to adding nouveau
acceleration. If you look at the vdpau wrapper, it's using dri2 to query for the driver name
to use. This would allow setting the right binary to use for each driver.

~Maarten