[Mesa-dev] [PATCH 1/2] radeonsi: Lock a mutex when checking scratch relocations.

Fri Apr 22 08:19:02 UTC 2016

On Fri, Apr 22, 2016 at 6:41 AM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
> On 21.04.2016 13:51, Bas Nieuwenhuizen wrote:
>>
>> We can use shaders from multiple contexts, and they were not
>> otherwise locked yet.
>
>
> Ouch... I guess this is why compute scratch buffers used to be per-program?
>
> I'm still trying to wrap my head around the possible code paths here... are
> you sure that nothing can go wrong with the radeon_add_to_buffer_list of the
> scratch buffer? I'm thinking a code sequence like
>
> Context 1 checks that sctx(1)->scratch_buffer == shader->scratch_bo.
> Context 2 allocates a new scratch buffer sctx(2)->scratch_buffer and uploads
> a modified shader.
> Context 1 adds the old sctx(1)->scratch_buffer to its buffer list and
> submits its CS. It now runs with the new shader, and may cause a VM fault.

Yes, this can happen.

>
> More generally, what if two contexts use the same shader with different
> scratch buffers? Do we just ping-pong, re-uploading the shader each time?

Yes.

>
> Bonus question: Do we even need per-context scratch buffers? As long as we
> idle all shaders at CS flush, we should be fine with a per-device/per-screen
> scratch buffer.

Things would be simpler if the scratch buffer pointer was set via a
user data SGPR. This needs LLVM support though.

>
> (Curiously, there is an old "this is probably not needed anymore" comment on
> the PS_PARTIAL_FLUSH in si_context_gfx_flush, but this may be wrong: since
> shaders can write memory, and the kernel may want to swap buffers around, we
> have to be extra careful about this...)

PS_PARTIAL_FLUSH is not emitted in this case, because SURFACE_SYNC
does the same thing.

Also, the kernel emits EVENT_WRITE_EOP(CACHE_FLUSH_AND_INV_TS_EVENT |
TC_ACTION_EN | TCL1_ACTION_EN) as part of the fence sequence, but it
doesn't wait. SI also flushes SMEM L1 and ICACHE before
EVENT_WRITE_EOP. The GPU scheduler waits for the fence when there is a
dependency (we can't have inter-process dependencies though). Memory
migrations always wait for idle.

It looks like we only need partial flushes at the end of IBs, and SMEM
L1 + ICACHE flushes at the beginning for non-SI.

I'll prepare a patch.

Marek