[Mesa-dev] [PATCH] radeonsi: Enable VGPR spilling for all shader types v3

Wed Jan 21 04:12:56 PST 2015

We also had a case where the CPU accidentally corrupted shaders: the
shaders were mapped after textures, and a CPU texture upload overflowed
and overwrote them. I suppose we should have unmapped the shaders.

Marek

On Wed, Jan 21, 2015 at 12:31 PM, Christian König
<deathsimple at vodafone.de> wrote:
> In general I don't think that modifying shader code with the GPU is such a
> good idea. We have already had quite a number of occasions where the GPU
> accidentally corrupted the shader data, which usually leads to lockups.
>
> Because of this, I already wanted to suggest that everything the GPU
> executes (e.g. rings, IBs, shaders, etc.) be mapped read-only into the
> GPU address space.
>
> Regards,
> Christian.
>
> On 21.01.2015 at 12:20, Marek Olšák wrote:
>>
>> On Wed, Jan 21, 2015 at 3:03 AM, Michel Dänzer <michel at daenzer.net> wrote:
>>>
>>> On 20.01.2015 22:39, Marek Olšák wrote:
>>>>
>>>> The problem with CP DMA (DMA_DATA and WRITE_DATA) is that the ordering
>>>> of flushes must be correct. First, partial flushes must be done so
>>>> that the shaders are idle.
>>>
>>> That's only necessary when reusing a single BO for the shader code, not
>>> when allocating a new BO when the relocations change, right?
>>
>> Yes.
>>
>>>
>>>> Then you can use CP DMA to update the binary. After that, ICACHE should
>>>> be invalidated.
>>>
>>> ICACHE has to be invalidated when writing with the CPU as well, right?
>>
>> Yes, but the invalidation at the beginning of IBs is sufficient for
>> all CPU accesses, so nothing needs to be done.
>>
>>>
>>>> The problem with mapping VRAM can be avoided by keeping a CPU copy of
>>>> the binary from the beginning. We would only need a CPU copy of those
>>>> shaders that use the scratch buffer. Then, you wouldn't have to read
>>>> VRAM at all, which would make it even simpler.
>>>
>>> Right, but CPU writes to the new BO in VRAM could cause stalls anyway.
>>
>> If CPU writes are the problem, we can create a temporary BO in GTT,
>> upload and update the shader there, and copy it to the shader BO in
>> VRAM using CP DMA. In this case, the shader BO in VRAM doesn't have to
>> be reallocated, and shader state doesn't have to be re-emitted. Only
>> the ICACHE should be flushed after the CP DMA copy.
>>
>> One copy packet is better than a lot of small WRITE_DATA packets.
>>
>> Marek
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
>
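
For reference, the ordering discussed in the quoted thread (partial flush,
then one CP DMA copy from a GTT staging buffer, then ICACHE invalidation)
can be modeled roughly as below. This is only a sketch under assumptions:
every name in it (cmd, cmd_stream, shader, emit, update_shader, execute) is
hypothetical and made up for illustration. None of it is radeonsi code or a
real PM4 packet encoding, and plain arrays stand in for the GTT and VRAM
BOs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical command kinds for this model only; not real PM4 packets. */
enum cmd {
    CMD_PARTIAL_FLUSH, /* wait until running shaders are idle */
    CMD_CP_DMA_COPY,   /* one copy: GTT staging -> VRAM shader BO */
    CMD_INV_ICACHE,    /* invalidate the instruction cache */
    CMD_DRAW           /* anything that executes the shader */
};

#define MAX_CMDS 16
#define MAX_DWORDS 64

struct cmd_stream {
    enum cmd cmds[MAX_CMDS];
    size_t count;
};

struct shader {
    uint32_t cpu_copy[MAX_DWORDS]; /* CPU copy kept from the beginning */
    uint32_t staging[MAX_DWORDS];  /* stands in for a temporary BO in GTT */
    uint32_t vram[MAX_DWORDS];     /* stands in for the shader BO in VRAM */
    size_t num_dwords;
};

static bool emit(struct cmd_stream *cs, enum cmd c)
{
    assert(cs != NULL);
    if (cs->count >= MAX_CMDS)
        return false;
    cs->cmds[cs->count++] = c;
    return true;
}

/* Patch one dword in the CPU copy (so VRAM is never read back), stage the
 * binary, and queue the three commands in the order the thread describes. */
static bool update_shader(struct cmd_stream *cs, struct shader *sh,
                          size_t offset_dw, uint32_t value)
{
    if (sh->num_dwords > MAX_DWORDS || offset_dw >= sh->num_dwords)
        return false;
    if (cs->count + 3 > MAX_CMDS)
        return false;

    sh->cpu_copy[offset_dw] = value;
    memcpy(sh->staging, sh->cpu_copy, sh->num_dwords * sizeof(uint32_t));

    emit(cs, CMD_PARTIAL_FLUSH);
    emit(cs, CMD_CP_DMA_COPY);
    emit(cs, CMD_INV_ICACHE);
    return true;
}

/* Replay the stream on the model. Returns false if a copy runs while
 * shaders may be busy, or if a shader could run with a stale ICACHE. */
static bool execute(const struct cmd_stream *cs, struct shader *sh)
{
    bool shaders_idle = false;
    bool icache_stale = false;

    if (sh->num_dwords > MAX_DWORDS)
        return false;

    for (size_t i = 0; i < cs->count; i++) {
        switch (cs->cmds[i]) {
        case CMD_PARTIAL_FLUSH:
            shaders_idle = true;
            break;
        case CMD_CP_DMA_COPY:
            if (!shaders_idle)
                return false;
            memcpy(sh->vram, sh->staging, sh->num_dwords * sizeof(uint32_t));
            icache_stale = true;
            break;
        case CMD_INV_ICACHE:
            icache_stale = false;
            break;
        case CMD_DRAW:
            if (icache_stale)
                return false;
            shaders_idle = false;
            break;
        }
    }
    return !icache_stale;
}
```

Calling update_shader() on an empty stream queues exactly three commands,
and execute() then accepts the stream and leaves the patched dword in the
vram array. A stream that starts with CMD_CP_DMA_COPY, or that issues
CMD_DRAW between the copy and CMD_INV_ICACHE, is rejected by the model.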