[Mesa-dev] [PATCH 2/2] radeonsi: Implement ddx/ddy on VI using ds_bpermute

Michel Dänzer michel at daenzer.net
Sat Apr 16 06:04:08 UTC 2016


On 16.04.2016 14:51, Michel Dänzer wrote:
> On 16.04.2016 11:39, Tom Stellard wrote:
>> The ds_bpermute instruction allows threads to transfer data directly
>> to or from the vgprs of other threads.  These instructions use the lds
>> hardware to transfer data, but do not read or write lds memory.
>>
>> DDX BEFORE:                        |  DDX AFTER:
>>                                    |
>> v_mbcnt_lo_u32_b32_e64 v2, -1, 0   |  v_mbcnt_lo_u32_b32_e64 v2, -1, 0
>> v_mbcnt_hi_u32_b32_e64 v2, -1, v2  |  v_mbcnt_hi_u32_b32_e64 v2, -1, v2
>> v_lshlrev_b32_e32 v4, 2, v2        |  v_and_b32_e32 v2, 0x3ffffffc, v2
>> v_and_b32_e32 v2, -4, v2           |  v_lshlrev_b32_e32 v2, 2, v2
>> v_lshlrev_b32_e32 v3, 2, v2        |  ds_bpermute_b32 v3, v2, v0
>> s_mov_b32 m0, -1                   |  ds_bpermute_b32 v0, v2, v0 offset:4
>> ds_write_b32 v4, v0                |  s_waitcnt lgkmcnt(0)
>> s_waitcnt lgkmcnt(0)               |
>> v_or_b32_e32 v0, 1, v2             |
>> v_lshlrev_b32_e32 v0, 2, v0        |
>> ds_read_b32 v1, v3                 |
>> ds_read_b32 v0, v0                 |
>> s_waitcnt lgkmcnt(0)               |
>>                                    |
>> LDS: 1 blocks                      |  LDS: 0 blocks
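
For readers following the "DDX AFTER" column, here is a minimal C model of
what those instructions compute across one 64-lane wave. It is only a sketch
under stated assumptions: the function names and the array-per-register
representation are hypothetical, register values are treated as raw 32-bit
words, and the lane id is assumed to equal the mbcnt result with a full mask.

```c
#include <stdint.h>

#define WAVE_SIZE 64

/* Model of ds_bpermute_b32 dst, addr, src offset:N.
 * Each lane reads src from the lane selected by its own byte address
 * (4 bytes per lane). Nothing is stored in LDS memory. */
static void bpermute_b32(uint32_t dst[WAVE_SIZE],
                         const uint32_t addr[WAVE_SIZE],
                         const uint32_t src[WAVE_SIZE],
                         uint32_t offset)
{
    uint32_t tmp[WAVE_SIZE]; /* allows dst and src to be the same register */
    for (int lane = 0; lane < WAVE_SIZE; lane++)
        tmp[lane] = src[((addr[lane] + offset) / 4) % WAVE_SIZE];
    for (int lane = 0; lane < WAVE_SIZE; lane++)
        dst[lane] = tmp[lane];
}

/* Hypothetical helper mirroring the "DDX AFTER" sequence: every lane
 * fetches the values held by lanes 0 and 1 of its own quad. */
static void ddx_operands(const uint32_t v0[WAVE_SIZE],
                         uint32_t left[WAVE_SIZE],
                         uint32_t right[WAVE_SIZE])
{
    uint32_t addr[WAVE_SIZE];
    for (int lane = 0; lane < WAVE_SIZE; lane++) {
        uint32_t tid = (uint32_t)lane;          /* v_mbcnt_lo + v_mbcnt_hi  */
        addr[lane] = (tid & 0x3ffffffcu) << 2;  /* v_and_b32, v_lshlrev_b32 */
    }
    bpermute_b32(left, addr, v0, 0);   /* ds_bpermute_b32 v3, v2, v0          */
    bpermute_b32(right, addr, v0, 4);  /* ds_bpermute_b32 v0, v2, v0 offset:4 */
}
```

The derivative itself would then presumably be the difference of the two
fetched values; that step is not part of the listing quoted above.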
> 
> Nice.
> 
> 
> Were these intrinsics already available in LLVM 3.6? If not, the old
> code needs to be kept for backwards compatibility.

I can see now that you're taking care of this for the bpermute
intrinsic, but AFAICT the mbcnt intrinsics were only added in LLVM 3.8.


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer

