[Mesa-dev] Update: UVD status on loongson 3a platform
Christian König
deathsimple at vodafone.de
Fri Sep 6 01:56:39 PDT 2013
On 06.09.2013 04:52, cee1 wrote:
> 2013/9/6 Jerome Glisse <j.glisse at gmail.com>:
>> On Thu, Sep 05, 2013 at 03:29:52PM -0400, Jerome Glisse wrote:
>>> On Thu, Sep 05, 2013 at 10:14:32PM +0800, Chen Jie wrote:
>>>> Hi all,
>>>>
>>>> This thread is about
>>>> http://lists.freedesktop.org/archives/dri-devel/2013-April/037598.html.
>>>>
>>>> We recently found something interesting about UVD-based playback on
>>>> the Loongson 3A platform, and also found a way to fix the problem.
>>>>
>>>> First, we found that the memcpy in
>>>> [mesa]src/gallium/drivers/radeon/radeon_uvd.c caused the problem:
>>>> * If memcpy is implemented through 16B or 8B load/store instructions,
>>>> it normally causes video mosaic. When we insert a memcmp after the
>>>> copying code in memcpy, it reports that the src and dest are not equal.
>>>> * If memcpy uses 1B load/store instructions only, the memcmp after the
>>>> copying code reports equal.
>>>>
>>>> Then we found that the following changeset fixes our problem:
>>>>
>>>> diff --git a/src/gallium/drivers/radeon/radeon_uvd.c b/src/gallium/drivers/radeon/radeon_uvd.c
>>>> index 2f98de2..f9599b6 100644
>>>> --- a/src/gallium/drivers/radeon/radeon_uvd.c
>>>> +++ b/src/gallium/drivers/radeon/radeon_uvd.c
>>>> @@ -162,7 +162,7 @@ static bool create_buffer(struct ruvd_decoder *dec, unsigned size)
>>>>  {
>>>>      buffer->buf = dec->ws->buffer_create(dec->ws, size, 4096, false,
>>>> -            RADEON_DOMAIN_GTT | RADEON_DOMAIN_VRAM);
>>>> +            RADEON_DOMAIN_GTT);
>>>>      if (!buffer->buf)
>>>>          return false;
>>>>
>>>> The VRAM is mapped to an uncached area on our platform, so my
>>>> question is: what could go wrong when using >4B load/store
>>>> instructions in the UVD workflow? Any ideas?
>>>>
>>> How do you map the VRAM into the user process address space? I.e., do
>>> you have something like Intel PAT, MTRR, or something else?
>>>
>>> In other words, can you map a region of I/O memory (GPU VRAM in this
>>> case) into the process address space and mark it as uncached, so that
>>> no access to it goes through the CPU cache?
>>>
>>> Cheers,
>>> Jerome
>> Also, it might be that you can't do write combining on your platform,
>> which would be a major drawback, as it's assumed by radeon userspace.
>> I would need to check the PCIe specification, but write combining is
>> probably not mandatory, meaning that your architecture might not have
>> it. This would explain why only memcpy with byte-sized copies works.
>>
>> I don't think there is any easy way to work around that.
> The original mesa code allows buffers to be allocated in the GTT and
> VRAM domains. We changed it so that all buffers are allocated in the
> GTT domain, and that seems to fix our problem.
Actually it's not a fix, but quite an ugly hack instead.
Depending on the UVD generation, some buffers *must* be allocated in VRAM;
only starting with NI+ can most buffers be in GTT space instead, and I'm
not even 100% sure that this feature is validated/working reliably.
Anyway, not having reliable CPU access to VRAM is quite a critical
platform bug that should be fixed before even thinking about UVD support.
Christian.