Update: UVD status on loongson 3a platform

Christian König deathsimple at vodafone.de
Fri Sep 6 01:56:39 PDT 2013

On 06.09.2013 04:52, cee1 wrote:
> 2013/9/6 Jerome Glisse <j.glisse at gmail.com>:
>> On Thu, Sep 05, 2013 at 03:29:52PM -0400, Jerome Glisse wrote:
>>> On Thu, Sep 05, 2013 at 10:14:32PM +0800, Chen Jie wrote:
>>>> Hi all,
>>>> This thread is about
>>>> http://lists.freedesktop.org/archives/dri-devel/2013-April/037598.html.
>>>> We recently found something interesting about UVD-based playback on
>>>> the Loongson 3A platform, and also found a way to fix the problem.
>>>> First, we found that memcpy in [mesa]src/gallium/drivers/radeon/radeon_uvd.c
>>>> caused the problem:
>>>> * If memcpy is implemented through 16B or 8B load/store instructions,
>>>> it will normally cause video mosaic. When a memcmp is inserted after the
>>>> copying code in memcpy, it reports that the src and dest are not equal.
>>>> * If memcpy uses 1B load/store instructions only, the memcmp after the
>>>> copying code reports equal.
>>>> Then we found that the following changeset fixes our problem:
>>>> diff --git a/src/gallium/drivers/radeon/radeon_uvd.c
>>>> b/src/gallium/drivers/radeon/radeon_uvd.c
>>>> index 2f98de2..f9599b6 100644
>>>> --- a/src/gallium/drivers/radeon/radeon_uvd.c
>>>> +++ b/src/gallium/drivers/radeon/radeon_uvd.c
>>>> @@ -162,7 +162,7 @@ static bool create_buffer(struct ruvd_decoder *dec,
>>>>     unsigned size)
>>>>   {
>>>>    buffer->buf = dec->ws->buffer_create(dec->ws, size, 4096, false,
>>>> +     RADEON_DOMAIN_GTT);
>>>>    if (!buffer->buf)
>>>>    return false;
>>>> The VRAM is mapped to an uncached area on our platform, so my
>>>> question is: what could go wrong when using >4B load/store
>>>> instructions in the UVD workflow? Any ideas?
>>> How do you map the VRAM into the user process address space? I.e., do you
>>> have something like Intel PAT, or something like MTRR, or something else?
>>> In other words, can you map a region of I/O memory (GPU VRAM in this
>>> case) into the process address space and mark it as uncached, so that
>>> none of the accesses to it go through the CPU cache?
>>> Cheers,
>>> Jerome
>> Also, it might be that you can't do write combining on your platform,
>> which would be a major drawback, as it's assumed by radeon userspace.
>> I would need to check the PCIe specification, but write combining is
>> probably not mandatory, meaning that your architecture might not have
>> it. This would explain why only memcpy with byte-sized copies works.
>> I don't think there is any easy way to work around that.
> The original mesa code allows buffers to be allocated in the GTT and VRAM
> domains. We changed it so that all buffers are allocated in the GTT
> domain, which seems to fix our problem.

Actually it's not a fix, but a quite ugly hack instead.

Depending on the UVD generation, some buffers *must* be allocated in VRAM;
only starting with NI+ can most buffers be in GTT space instead, and I'm
not even 100% sure that this feature is validated and working reliably.

Anyway, not having reliable CPU access to VRAM is a quite critical
platform bug that should be fixed before even thinking about UVD support.
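
For reference, the copy-then-memcmp check described earlier in the thread
could be sketched roughly as below. This is only an illustration, not code
from mesa: copy_bytes, copy_words, readback_matches and check_copy_widths
are hypothetical names, and dst is assumed to be a pointer to an already
mapped, 8-byte-aligned buffer (how that mapping is obtained is not shown).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy using one 1-byte store per element. */
static void copy_bytes(volatile uint8_t *dst, const uint8_t *src, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = src[i];
}

/* Hypothetical helper: copy using one 8-byte store per element.
 * Assumes dst is 8-byte aligned and len is a multiple of 8. */
static void copy_words(volatile uint64_t *dst, const uint8_t *src, size_t len)
{
    assert(len % sizeof(uint64_t) == 0);
    for (size_t i = 0; i < len / sizeof(uint64_t); i++) {
        uint64_t w;
        memcpy(&w, src + i * sizeof(w), sizeof(w)); /* alignment-safe load */
        dst[i] = w;
    }
}

/* Returns 1 if dst reads back identical to src, 0 otherwise. */
static int readback_matches(const void *dst, const void *src, size_t len)
{
    return memcmp(dst, src, len) == 0;
}

/* Runs both copy variants into dst and compares the readback each time.
 * Returns a bitmask: bit 0 set if the byte-wise copy read back correctly,
 * bit 1 set if the 8-byte copy did; -1 if the pattern buffer could not
 * be allocated. */
int check_copy_widths(void *dst, size_t len)
{
    uint8_t *src = malloc(len);
    int result = 0;

    if (!src)
        return -1;
    for (size_t i = 0; i < len; i++)
        src[i] = (uint8_t)(i * 31 + 7);   /* arbitrary test pattern */

    memset(dst, 0, len);
    copy_bytes(dst, src, len);
    if (readback_matches(dst, src, len))
        result |= 1;

    memset(dst, 0, len);
    copy_words(dst, src, len);
    if (readback_matches(dst, src, len))
        result |= 2;

    free(src);
    return result;
}
```

On ordinary cached system memory both bits should be set. A result with
only bit 0 set on a VRAM mapping would correspond to the behaviour reported
above, where wide stores do not read back correctly but byte stores do.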

