PRT support for amdgpu

Tue Jan 31 16:28:52 UTC 2017

On 31.01.2017 at 14:06, Bas Nieuwenhuizen wrote:
>
> On Mon, Jan 30, 2017, at 13:57, Christian König wrote:
>> Hi Dave and Bas,
>>
>> the following set of patches is a proposal for adding support for partially
>> resident textures (PRT) to the amdgpu kernel module.
>>
>> The basic idea behind PRT support is that you turn off VM fault reporting
>> and only map parts of a texture into your virtual address space.
> If we add some backing to a range, do we need to unmap the PRT range,
> split and map two PRT ranges? Or will this be handled like mmap and a
> new map just overrides the earlier maps for that range?

Currently the idea is to unmap the PRT range first and then map the new
stuff. But I've already discussed a couple of alternatives internally
with Nicolai.

The problem is that IOCTLs are supposed to be transactional, i.e. they
either fail completely or they succeed completely. But that is rather
tricky when you need to split mappings as you suggested.

So at least for the initial implementation I would like to stick to
manual unmap calls. We can still add the ability to split mappings later
on if we find performance problems with that approach.
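
A rough sketch of what the manual approach means for userspace, purely as a
toy model in Python and not the actual amdgpu interface. The names `VASpace`,
`map`, `unmap` and `add_backing` are made up for illustration. Each single
call either succeeds or leaves the state untouched, and the split into
PRT / backed / PRT pieces is done by the caller:

```python
class VASpace:
    """Toy model of a GPU virtual address space (hypothetical, not amdgpu)."""

    def __init__(self):
        self.mappings = []  # sorted list of (start, end, kind), end exclusive

    def _overlaps(self, start, end):
        return any(s < end and start < e for s, e, _ in self.mappings)

    def map(self, start, size, kind):
        """Map [start, start + size) or fail without touching any state."""
        end = start + size
        if size <= 0 or self._overlaps(start, end):
            return False  # nothing was modified: the call is all-or-nothing
        self.mappings.append((start, end, kind))
        self.mappings.sort()
        return True

    def unmap(self, start, size):
        """Remove the mapping that covers exactly [start, start + size)."""
        end = start + size
        for m in self.mappings:
            if m[0] == start and m[1] == end:
                self.mappings.remove(m)
                return True
        return False


def add_backing(vm, prt_start, prt_size, off, size):
    """Caller-side split: unmap the PRT range, then map up to three pieces."""
    if not vm.unmap(prt_start, prt_size):
        return False
    pieces = [
        (prt_start, off, "prt"),
        (prt_start + off, size, "backed"),
        (prt_start + off + size, prt_size - off - size, "prt"),
    ]
    for start, length, kind in pieces:
        if length > 0 and not vm.map(start, length, kind):
            return False
    return True
```

Note that `add_backing` as a whole is not transactional: if one of the later
`map` calls failed, the range would be left half mapped. Doing the same split
inside a single call would need a rollback path, which is the tricky part.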

>> When a shader now tries to sample from a not present page it gets a
>> notification instead of a VM fault and can react gracefully by switching to
>> a different LOD for example.
> So to confirm, this is just using texture instructions with the TFE bit
> enabled, no trap handlers or such?

I'm not that deep into the shader instructions, but I think so, yes.

>> On our currently available hardware generation you can unfortunately only
>> turn off VM faults globally, but on future generations you can do this on a
>> per page basis. So my proposal is to have a consistent interface over all
>> generations with a per mapping PRT flag, but enable/disable it globally
>> on current hardware when the first/last mapping is made/destroyed.
>>
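
The first/last handling above boils down to a reference count. A minimal
sketch in Python, as a toy model only: `PrtState` and its methods are
hypothetical names and say nothing about how the patches implement it.

```python
class PrtState:
    """Toy model: per-mapping PRT flag on top of a global fault switch."""

    def __init__(self):
        self.prt_mappings = 0       # number of live mappings with the PRT flag
        self.faults_enabled = True  # global VM fault reporting (assumed default)

    def add_prt_mapping(self):
        self.prt_mappings += 1
        if self.prt_mappings == 1:
            # First PRT mapping: turn VM fault reporting off globally.
            self.faults_enabled = False

    def remove_prt_mapping(self):
        if self.prt_mappings == 0:
            raise ValueError("no PRT mapping to remove")
        self.prt_mappings -= 1
        if self.prt_mappings == 0:
            # Last PRT mapping gone: turn VM fault reporting back on.
            self.faults_enabled = True
```
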
>> An open problem with the proposal is that we don't know when or if we
>> want to add the userspace implementation into radeonsi.
>>
>> So the prize question: could you guys use this for radv as well? Or is it
>> sufficient to just write a unit test for it?
> So this API seems usable, and I think this is something we can use for
> radv. However, I'm not sure how much time it takes for us to implement,
> as the TFE variants are not in LLVM yet and I haven't figured out what
> values the NACKs get.

Actually this is also useful without the special NACK handling. E.g. 
when you sample from a texture part which isn't present you always get 
zero and writes are ignored.

The TFE bit and the extra signaling for special handling in shader
code are only optional, if I'm not completely mistaken.
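
To illustrate the behavior described above, here is a toy model in Python,
nothing hardware specific. `SparseTexture`, the 4096 byte page size and the
`resident` return value are assumptions made for the sketch; the boolean only
stands in for the optional TFE-style signaling and does not reflect the real
NACK values.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class SparseTexture:
    """Toy model of a partially resident texture (hypothetical)."""

    def __init__(self, num_pages):
        self.pages = [None] * num_pages  # None means "not resident"

    def make_resident(self, page):
        self.pages[page] = bytearray(PAGE_SIZE)

    def read(self, addr):
        """Return (value, resident); non-resident pages read as zero."""
        page = self.pages[addr // PAGE_SIZE]
        if page is None:
            return 0, False
        return page[addr % PAGE_SIZE], True

    def write(self, addr, value):
        """Writes to non-resident pages are silently dropped."""
        page = self.pages[addr // PAGE_SIZE]
        if page is None:
            return
        page[addr % PAGE_SIZE] = value
```

A caller that ignores the second return value just sees zeros for the missing
parts, which is the case without any special handling in the shader.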

> Furthermore, if addrlib is missing stuff like Nicolai suggests then that
> could result in complications. I can try to get something working
> over the weekend, but no promises.

Not sure what concern Nicolai has about addrlib here. In general we 
should know where the different parts of a texture start (LODs, layers 
etc...) and as far as I can see that's all you need to know.

> As far as a unit test being sufficient, I assume you mean as open
> source user for inclusion into the kernel?

Yes.

> I think that'd be a question
> answered better by Dave.

Yeah, thought so as well. Dave, can you comment?

Thanks for the comments,
Christian.

>
>> Best regards,
>> Christian.
>>
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx