[Mesa-dev] [PATCH 2/2] radeonsi: Implement ddx/ddy on VI using ds_bpermute

Ilia Mirkin imirkin at alum.mit.edu
Sat Apr 16 16:41:54 UTC 2016


On Sat, Apr 16, 2016 at 10:36 AM, Marek Olšák <maraeo at gmail.com> wrote:
> On Sat, Apr 16, 2016 at 3:28 PM, Roland Scheidegger <sroland at vmware.com> wrote:
>> On 16.04.2016 at 15:19, eocallaghan at alterapraxis.com wrote:
>>> On 2016-04-16 20:20, Marek Olšák wrote:
>>>> On Sat, Apr 16, 2016 at 8:04 AM, Michel Dänzer <michel at daenzer.net> wrote:
>>>>> On 16.04.2016 14:51, Michel Dänzer wrote:
>>>>>> On 16.04.2016 11:39, Tom Stellard wrote:
>>>>>>> The ds_bpermute instruction allows threads to transfer data directly
>>>>>>> to or from the VGPRs of other threads.  It uses the LDS hardware
>>>>>>> to transfer data, but does not read or write LDS memory.
>>>>>>>
>>>>>>> DDX BEFORE:                        |  DDX AFTER:
>>>>>>>                                    |
>>>>>>> v_mbcnt_lo_u32_b32_e64 v2, -1, 0   |  v_mbcnt_lo_u32_b32_e64 v2, -1, 0
>>>>>>> v_mbcnt_hi_u32_b32_e64 v2, -1, v2  |  v_mbcnt_hi_u32_b32_e64 v2, -1, v2
>>>>>>> v_lshlrev_b32_e32 v4, 2, v2        |  v_and_b32_e32 v2, 0x3ffffffc, v2
>>>>>>> v_and_b32_e32 v2, -4, v2           |  v_lshlrev_b32_e32 v2, 2, v2
>>>>>>> v_lshlrev_b32_e32 v3, 2, v2        |  ds_bpermute_b32 v3, v2, v0
>>>>>>> s_mov_b32 m0, -1                   |  ds_bpermute_b32 v0, v2, v0 offset:4
>>>>>>> ds_write_b32 v4, v0                |  s_waitcnt lgkmcnt(0)
>>>>>>> s_waitcnt lgkmcnt(0)               |
>>>>>>> v_or_b32_e32 v0, 1, v2             |
>>>>>>> v_lshlrev_b32_e32 v0, 2, v0        |
>>>>>>> ds_read_b32 v1, v3                 |
>>>>>>> ds_read_b32 v0, v0                 |
>>>>>>> s_waitcnt lgkmcnt(0)               |
>>>>>>>                                    |
>>>>>>> LDS: 1 blocks                      |  LDS: 0 blocks
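
For readers following the "AFTER" column, here is a rough CPU-side model
of what that sequence appears to compute. It is only a sketch under stated
assumptions: a 64-lane wave with all lanes active, ds_bpermute_b32 taking a
per-lane byte address (lane index times 4) plus an immediate offset, and a
final subtraction that the listing above does not show. The names
bpermute_b32 and quad_ddx are made up for illustration and are not Mesa or
LLVM functions.

```c
#include <stdint.h>

#define WAVE_SIZE 64  /* assumption: one 64-lane wave, all lanes active */

/* Hypothetical model of ds_bpermute_b32: lane i fetches src from the lane
 * selected by its byte address addr[i] + offset. No LDS memory is touched. */
static void bpermute_b32(float dst[WAVE_SIZE], const uint32_t addr[WAVE_SIZE],
                         const float src[WAVE_SIZE], uint32_t offset)
{
    float tmp[WAVE_SIZE];  /* temporary so dst may alias src */
    for (int i = 0; i < WAVE_SIZE; i++)
        tmp[i] = src[((addr[i] + offset) / 4) % WAVE_SIZE];
    for (int i = 0; i < WAVE_SIZE; i++)
        dst[i] = tmp[i];
}

/* Hypothetical helper mirroring the DDX AFTER listing. */
static void quad_ddx(float out[WAVE_SIZE], const float in[WAVE_SIZE])
{
    uint32_t addr[WAVE_SIZE];
    float left[WAVE_SIZE], right[WAVE_SIZE];

    /* v_mbcnt_* yields the lane id; v_and with 0x3ffffffc rounds it down to
     * the first lane of its quad; v_lshlrev by 2 turns it into a byte address. */
    for (uint32_t lane = 0; lane < WAVE_SIZE; lane++)
        addr[lane] = (lane & 0x3ffffffcu) << 2;

    bpermute_b32(left, addr, in, 0);   /* ds_bpermute_b32 v3, v2, v0          */
    bpermute_b32(right, addr, in, 4);  /* ds_bpermute_b32 v0, v2, v0 offset:4 */

    /* Assumed follow-up, not part of the listing: the derivative is the
     * difference between the two fetched values. */
    for (int i = 0; i < WAVE_SIZE; i++)
        out[i] = right[i] - left[i];
}
```

Under this model every lane in a quad ends up with the same value, namely
the second lane's input minus the first lane's input.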
>>>>>>
>>>>>> Nice.
>>>>>>
>>>>>>
>>>>>> Were these intrinsics already available in LLVM 3.6? If not, the old
>>>>>> code needs to be kept for backwards compatibility.
>>>>>
>>>>> I can see now that you're taking care of this for the bpermute
>>>>> intrinsic, but AFAICT the mbcnt intrinsics were only added in LLVM 3.8.
>>>>
>>>> How do you feel about increasing the requirement to LLVM 3.8 for Mesa
>>>> git?
>>>
>>> +1 from me. Supporting more than two generations of LLVM is a bit much
>>> to carry imho.
>>>
>>
>> You don't want to support any released version which is older than one
>> month?
>> (This isn't an objection, just a remark...)
>
> Life's hard. Sometimes we have to make hard choices. :)
>
> Now seriously, LLVM 3.7 enables OpenGL 4.0-4.1 and LLVM 3.8 enables
> immediate shader compilation (without recompilations) for radeonsi.
> I'll let others assess how important those two are.

From a practical standpoint, Gentoo and Arch are shipping LLVM 3.7.1
by default. It may cause a bunch of frustration for people if you
require 3.8. I don't actually build or use radeonsi, just pointing out
some potentially pertinent facts.

Cheers,

  -ilia

