[Intel-gfx] [PATCH 2/2] libdrm_radeon: Optimize reloc writing to do less looping.

Wed Mar 10 19:26:22 CET 2010

2010/3/10 Michel Dänzer <michel at daenzer.net>:
> On Wed, 2010-03-10 at 18:20 +0200, Pauli Nieminen wrote:
>> A bit hash table in the BO is checked first to see whether we can guarantee
>> that this BO is not already in this cs.
>>
>> To guarantee that no other cs has the same id, the number of CSs that can
>> have an id is limited to 32. Adding and removing a reference in a bo is done
>> with atomic operations to allow parallel access to a bo from multiple contexts.
>>
>> This optimization decreases the cs_write_reloc share of the torcs profile
>> from 4.3% to 2.6%.
>>
>> Signed-off-by: Pauli Nieminen <suokkos at gmail.com>
>
> [...]
>
>> diff --git a/radeon/radeon_cs_gem.c b/radeon/radeon_cs_gem.c
>> index 45a219c..83aabea 100644
>> --- a/radeon/radeon_cs_gem.c
>> +++ b/radeon/radeon_cs_gem.c
>> @@ -68,6 +69,66 @@ struct cs_gem {
>>      struct radeon_bo_int        **relocs_bo;
>>  };
>>
>> +
>> +#if !defined(__GNUC__) || __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 2)
>> +/* no built in sync support in compiler define place holders */
>> +uint32_t __sync_add_and_fetch(uint32_t *a, uint32_t val)
>> +{
>> +     *a += val;
>> +     return val;
>> +}
>> +
>> +uint32_t __sync_add_and_fetch(uint32_t *a, uint32_t val)
>> +{
>> +     *a -= val;
>> +     return val;
>> +}
>> +#endif
>
> This doesn't look like it could build... presumably the latter should be
> called __sync_sub_and_fetch()?
>

Sorry, the wrong patch came from somewhere :/
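
For reference, a minimal sketch of what the quoted placeholder pair presumably
intends, assuming the second function is meant to be the subtracting one, as
suggested above. The names fallback_add_and_fetch and fallback_sub_and_fetch
are hypothetical, chosen so they do not clash with the GCC builtins. GCC's
__sync_add_and_fetch returns the new value rather than val, so the sketch does
the same. These are plain read-modify-write operations and are not atomic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical non-atomic stand-in for __sync_add_and_fetch():
 * adds val to *a and returns the NEW value, as the builtin does. */
static uint32_t fallback_add_and_fetch(uint32_t *a, uint32_t val)
{
	*a += val;
	return *a;
}

/* Hypothetical non-atomic stand-in for __sync_sub_and_fetch():
 * subtracts val from *a and returns the NEW value. */
static uint32_t fallback_sub_and_fetch(uint32_t *a, uint32_t val)
{
	*a -= val;
	return *a;
}

/* Single-threaded usage; nothing here protects against concurrent access. */
void fallback_example(void)
{
	uint32_t refs = 0;

	assert(fallback_add_and_fetch(&refs, 4) == 4);
	assert(fallback_sub_and_fetch(&refs, 1) == 3);
	assert(refs == 3);
}
```

Two threads running either function on the same word can still lose an update,
so this only helps where a bo is never shared between contexts.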

> Do these stand any chance of working properly in circumstances where
> atomicity is actually important though?
>
>
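
To make the atomicity question concrete, here is a rough sketch of the scheme
as the commit message describes it, assuming GCC 4.2 or later so that the
__sync builtins are available. The struct and function names (sketch_bo,
sketch_cs, bo_is_marked_for_cs and so on) are hypothetical and simplified; this
is an illustration, not the code from the patch. It also assumes each cs is
only driven by one thread at a time, so only the owner of an id bit ever
changes that bit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the real libdrm_radeon structures. */
struct sketch_bo {
	uint32_t cs_mask;	/* one bit per cs that currently references this bo */
};

struct sketch_cs {
	uint32_t id;		/* exactly one bit set, so at most 32 ids exist */
};

/* Nonzero if this cs has marked the bo; zero means the bo is guaranteed
 * not to be in this cs, so the reloc search loop can be skipped. */
static int bo_is_marked_for_cs(const struct sketch_bo *bo,
			       const struct sketch_cs *cs)
{
	return (bo->cs_mask & cs->id) != 0;
}

/* Set this cs's bit. The add is atomic, so another cs updating its own
 * bit in the same word at the same time cannot be lost. */
static void bo_mark_for_cs(struct sketch_bo *bo, const struct sketch_cs *cs)
{
	if (!bo_is_marked_for_cs(bo, cs))
		__sync_add_and_fetch(&bo->cs_mask, cs->id);
}

/* Clear this cs's bit again, also atomically. */
static void bo_unmark_for_cs(struct sketch_bo *bo, const struct sketch_cs *cs)
{
	if (bo_is_marked_for_cs(bo, cs))
		__sync_sub_and_fetch(&bo->cs_mask, cs->id);
}

void cs_mask_example(void)
{
	struct sketch_bo bo = { 0 };
	struct sketch_cs cs0 = { 1u << 0 };
	struct sketch_cs cs5 = { 1u << 5 };

	bo_mark_for_cs(&bo, &cs0);
	bo_mark_for_cs(&bo, &cs5);
	assert(bo_is_marked_for_cs(&bo, &cs0));
	assert(bo.cs_mask == ((1u << 0) | (1u << 5)));

	bo_unmark_for_cs(&bo, &cs0);
	assert(!bo_is_marked_for_cs(&bo, &cs0));
	assert(bo_is_marked_for_cs(&bo, &cs5));
}
```

With a non-atomic add or subtract in place of the builtins, two contexts
touching the same bo concurrently could overwrite each other's bit, which is
exactly the case where atomicity matters.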
> --
> Earthling Michel Dänzer           |                http://www.vmware.com
> Libre software enthusiast         |          Debian, X and DRI developer
>