[Mesa-dev] [PATCH 2/2] radeonsi: Implement ddx/ddy on VI using ds_bpermute
Marek Olšák
maraeo at gmail.com
Sat Apr 16 14:36:03 UTC 2016
On Sat, Apr 16, 2016 at 3:28 PM, Roland Scheidegger <sroland at vmware.com> wrote:
> On 16.04.2016 at 15:19, eocallaghan at alterapraxis.com wrote:
>> On 2016-04-16 20:20, Marek Olšák wrote:
>>> On Sat, Apr 16, 2016 at 8:04 AM, Michel Dänzer <michel at daenzer.net>
>>> wrote:
>>>> On 16.04.2016 14:51, Michel Dänzer wrote:
>>>>> On 16.04.2016 11:39, Tom Stellard wrote:
>>>>>> The ds_bpermute instruction allows threads to transfer data directly
>>>>>> to or from the vgprs of other threads. These instructions use the lds
>>>>>> hardware to transfer data, but do not read or write lds memory.
>>>>>>
>>>>>> DDX BEFORE:                       | DDX AFTER:
>>>>>>                                   |
>>>>>> v_mbcnt_lo_u32_b32_e64 v2, -1, 0  | v_mbcnt_lo_u32_b32_e64 v2, -1, 0
>>>>>> v_mbcnt_hi_u32_b32_e64 v2, -1, v2 | v_mbcnt_hi_u32_b32_e64 v2, -1, v2
>>>>>> v_lshlrev_b32_e32 v4, 2, v2       | v_and_b32_e32 v2, 0x3ffffffc, v2
>>>>>> v_and_b32_e32 v2, -4, v2          | v_lshlrev_b32_e32 v2, 2, v2
>>>>>> v_lshlrev_b32_e32 v3, 2, v2       | ds_bpermute_b32 v3, v2, v0
>>>>>> s_mov_b32 m0, -1                  | ds_bpermute_b32 v0, v2, v0 offset:4
>>>>>> ds_write_b32 v4, v0               | s_waitcnt lgkmcnt(0)
>>>>>> s_waitcnt lgkmcnt(0)              |
>>>>>> v_or_b32_e32 v0, 1, v2            |
>>>>>> v_lshlrev_b32_e32 v0, 2, v0       |
>>>>>> ds_read_b32 v1, v3                |
>>>>>> ds_read_b32 v0, v0                |
>>>>>> s_waitcnt lgkmcnt(0)              |
>>>>>>                                   |
>>>>>> LDS: 1 blocks                     | LDS: 0 blocks
>>>>>
>>>>> Nice.
>>>>>
>>>>>
>>>>> Were these intrinsics already available in LLVM 3.6? If not, the old
>>>>> code needs to be kept for backwards compatibility.
>>>>
>>>> I can see now that you're taking care of this for the bpermute
>>>> intrinsic, but AFAICT the mbcnt intrinsics were only added in LLVM 3.8.
>>>
>>> How do you feel about increasing the requirement to LLVM 3.8 for Mesa
>>> git?
>>
>> +1 from me. Supporting more than two generations of LLVM is a bit much
>> to carry imho.
>>
>
> You don't want to support any released version which is older than one
> month?
> (This isn't an objection, just a remark...)

Life's hard. Sometimes we have to make hard choices. :)

Now seriously, LLVM 3.7 enables OpenGL 4.0-4.1 and LLVM 3.8 enables
immediate shader compilation (without recompilations) for radeonsi.
I'll let others assess how important those two are.

Marek
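
For readers following the quoted assembly, here is a minimal C sketch of
what the "DDX AFTER" sequence appears to compute. It is a scalar model, not
radeonsi or LLVM code. The names bpermute_b32 and wave_ddx are hypothetical.
The quad layout (lanes 4k..4k+3, with lane 4k as top-left and lane 4k+1 as
top-right) and the final subtraction are assumptions, since the quoted
listing stops at the s_waitcnt.

```c
#include <assert.h>
#include <stdint.h>

#define WAVE_SIZE 64 /* lanes per wavefront on GCN */

/* Hypothetical model of ds_bpermute_b32: every lane reads the source value
 * held by the lane selected by its own byte address, (addr + offset) / 4.
 * No LDS memory is read or written; only the data path is modelled. */
static void bpermute_b32(float dst[WAVE_SIZE], const uint32_t addr[WAVE_SIZE],
                         const float src[WAVE_SIZE], uint32_t offset)
{
    for (int lane = 0; lane < WAVE_SIZE; lane++) {
        uint32_t from = ((addr[lane] + offset) / 4) % WAVE_SIZE;
        assert(from < WAVE_SIZE);
        dst[lane] = src[from];
    }
}

/* Assumed DDX: (top-right) - (top-left) of the 2x2 quad a lane belongs to. */
static void wave_ddx(float out[WAVE_SIZE], const float val[WAVE_SIZE])
{
    uint32_t addr[WAVE_SIZE];
    float left[WAVE_SIZE], right[WAVE_SIZE];

    for (int lane = 0; lane < WAVE_SIZE; lane++) {
        /* v_and_b32 v2, 0x3ffffffc, v2: lane index of the quad's first lane.
         * v_lshlrev_b32 v2, 2, v2: convert that index to a byte address. */
        addr[lane] = ((uint32_t)lane & 0x3ffffffcu) << 2;
    }

    bpermute_b32(left, addr, val, 0);  /* ds_bpermute_b32 v3, v2, v0 */
    bpermute_b32(right, addr, val, 4); /* ds_bpermute_b32 v0, v2, v0 offset:4 */

    for (int lane = 0; lane < WAVE_SIZE; lane++)
        out[lane] = right[lane] - left[lane]; /* assumed; not in the listing */
}
```

All four lanes of a quad end up with the same result, which matches the
"BEFORE" path where every lane reads the same two LDS slots.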