[Mesa-dev] GBM and the Device Memory Allocator Proposals

Wed Dec 6 07:01:00 UTC 2017

On 12/01/2017 10:34 AM, Nicolai Hähnle wrote:
> On 01.12.2017 18:09, Nicolai Hähnle wrote:
> [snip]
>>>> As for the actual transition API, I accept that some metadata may be
>>>> required, and the metadata probably needs to depend on the memory
>>>> layout, which is often vendor-specific. But even linear layouts need
>>>> some transitions for caches. We probably need at least some generic
>>>> "off-device usage" bit.
>>>
>>> I've started thinking of cached as a capability with a transition. I
>>> think that helps.  Maybe it needs to somehow be more specific (i.e. if
>>> you have two devices, both with their own cache, with no coherency
>>> between the two).
>>
>> As I wrote above, I'd prefer not to think of "cached" as a capability 
>> at least for radeonsi.
>>
>> From the desktop perspective, I would say let's ignore caches; the
>> drivers know which caches they need to flush to make data visible to
>> other devices on the system.
>>
>> On the other hand, there are probably SoC cases where non-coherent 
>> caches are shared between some but not all devices, and in that case 
>> perhaps we do need to communicate this.
>>
>> So perhaps we should have two kinds of "capabilities".
>>
>> The first, like framebuffer compression, is a capability of the 
>> allocated memory layout (because the compression requires a meta 
>> surface), and devices that expose it may opportunistically use it.
>>
>> The second, like caches, is a capability that the device/driver will
>> use and you don't get a say in it, but other devices/drivers also
>> don't need to be aware of it.
>>
>> So then you could theoretically have a system that gives you:
>>
>> GPU:     FOO/tiled(layout-caps=FOO/cc, dev-caps=FOO/gpu-cache)
>> Display: FOO/tiled(layout-caps=FOO/cc)
>> Video:   FOO/tiled(dev-caps=FOO/vid-cache)
>> Camera:  FOO/tiled(dev-caps=FOO/vid-cache)
> [snip]
> 
> FWIW, I think all that stuff about different caches quite likely 
> over-complicates things. At the end of each "command submission" of 
> whichever type of engine, the buffer must be in a state where the kernel 
> is free to move it around for memory management purposes. This already 
> puts a big constraint on the kind of (non-coherent) caches that can be 
> supported anyway, so I wouldn't be surprised if we could get away with a 
> *much* simpler approach.

I'd rather not depend on this type of cleverness if possible.  Other
kernels/OSes may not behave this way, and I'd like the allocator
mechanism to be something we can use across all, or at least most, of
the POSIX and POSIX-like OSes we support.  Also, this particular
assumption does not hold for our proprietary Linux driver, and I suspect
it won't always hold for other drivers.  If a particular driver or OS
does fit this assumption, the driver is always free to return no-op
transitions in that case.
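
To make the "no-op transition" point concrete, here is a rough sketch in C.
Every name in it (the usage enum, struct transition, the two get_transition
functions) is hypothetical and made up for illustration; none of it is taken
from the proposed allocator API or from any existing driver.  It only shows
that a driver which needs no cache work can return an empty transition, while
one that does need it can return vendor-specific metadata through the same
interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical usage identifiers, invented for this sketch. */
enum usage { USAGE_GPU_RENDER, USAGE_DISPLAY_SCANOUT, USAGE_VIDEO_DECODE };

/* Hypothetical transition descriptor: metadata_size == 0 means "no-op". */
struct transition {
    size_t  metadata_size;
    uint8_t metadata[16];   /* opaque, vendor-specific payload */
};

/* A driver/OS where buffers are always left in a movable, visible state
 * at the end of each submission: every transition is a no-op. */
static void simple_get_transition(enum usage from, enum usage to,
                                  struct transition *out)
{
    (void)from;
    (void)to;
    assert(out != NULL);
    out->metadata_size = 0;
}

/* A driver that cannot make that assumption: leaving GPU rendering for
 * any other usage requires (hypothetical) cache-flush metadata. */
static void cached_get_transition(enum usage from, enum usage to,
                                  struct transition *out)
{
    assert(out != NULL);
    out->metadata_size = 0;
    if (from == USAGE_GPU_RENDER && to != USAGE_GPU_RENDER) {
        out->metadata[0] = 0x01;   /* made-up "flush GPU cache" token */
        out->metadata_size = 1;
    }
}

/* The caller does not need to know which kind of driver it talks to. */
static bool transition_is_noop(const struct transition *t)
{
    return t->metadata_size == 0;
}
```

The caller's code path is identical in both cases: it asks for a transition,
and skips the work when transition_is_noop() says there is nothing to do.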

Thanks,
-James

> Cheers,
> Nicolai
>