[PATCH 1/2] drm/amdgpu: drop volatile from ring buffer
Tvrtko Ursulin
tvrtko.ursulin at igalia.com
Wed Oct 9 14:11:41 UTC 2024
On 09/10/2024 13:07, Christian König wrote:
> Am 09.10.24 um 09:41 schrieb Tvrtko Ursulin:
>>
>> On 08/10/2024 19:11, Christian König wrote:
>>> Volatile only prevents the compiler from re-ordering reads and writes.
>>> Since we always only modify the ring buffer from one CPU thread and have
>>> an explicit barrier before signaling the HW this should have no
>>> effect at
>>> all and just prevents compiler optimisations.
>>>
>>> Signed-off-by: Christian König <christian.koenig at amd.com>
>>> ---
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 10 +++-------
>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 11 ++++-------
>>> 2 files changed, 7 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> index 910293664902..a6e28fe3f8d6 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
>>> @@ -109,21 +109,17 @@ int amdgpu_ring_alloc(struct amdgpu_ring *ring,
>>> unsigned int ndw)
>>> void amdgpu_ring_insert_nop(struct amdgpu_ring *ring, uint32_t count)
>>> {
>>> uint32_t occupied, chunk1, chunk2;
>>> - uint32_t *dst;
>>> occupied = ring->wptr & ring->buf_mask;
>>> - dst = &ring->ring[occupied];
>>> chunk1 = ring->buf_mask + 1 - occupied;
>>> chunk1 = (chunk1 >= count) ? count : chunk1;
>>> chunk2 = count - chunk1;
>>> if (chunk1)
>>> - memset32(dst, ring->funcs->nop, chunk1);
>>> + memset32(&ring->ring[occupied], ring->funcs->nop, chunk1);
>>> - if (chunk2) {
>>> - dst = ring->ring;
>>> - memset32(dst, ring->funcs->nop, chunk2);
>>> - }
>>> + if (chunk2)
>>> + memset32(ring->ring, ring->funcs->nop, chunk2);
>>> ring->wptr += count;
>>> ring->wptr &= ring->ptr_mask;
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> index 574336d6714a..36fc9578c53c 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
>>> @@ -246,7 +246,7 @@ struct amdgpu_ring {
>>> struct drm_gpu_scheduler sched;
>>> struct amdgpu_bo *ring_obj;
>>> - volatile uint32_t *ring;
>>> + uint32_t *ring;
>>> unsigned rptr_offs;
>>> u64 rptr_gpu_addr;
>>> volatile u32 *rptr_cpu_addr;
>>> @@ -288,7 +288,7 @@ struct amdgpu_ring {
>>> u64 cond_exe_gpu_addr;
>>> volatile u32 *cond_exe_cpu_addr;
>>> unsigned int set_q_mode_offs;
>>> - volatile u32 *set_q_mode_ptr;
>>> + u32 *set_q_mode_ptr;
>>> u64 set_q_mode_token;
>>> unsigned vm_hub;
>>> unsigned vm_inv_eng;
>>> @@ -386,10 +386,8 @@ static inline void
>>> amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>> void *src, int count_dw)
>>> {
>>> unsigned occupied, chunk1, chunk2;
>>> - void *dst;
>>> occupied = ring->wptr & ring->buf_mask;
>>> - dst = (void *)&ring->ring[occupied];
>>> chunk1 = ring->buf_mask + 1 - occupied;
>>> chunk1 = (chunk1 >= count_dw) ? count_dw : chunk1;
>>> chunk2 = count_dw - chunk1;
>>> @@ -397,12 +395,11 @@ static inline void
>>> amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
>>> chunk2 <<= 2;
>>> if (chunk1)
>>> - memcpy(dst, src, chunk1);
>>> + memcpy(&ring->ring[occupied], src, chunk1);
>>> if (chunk2) {
>>> src += chunk1;
>>> - dst = (void *)ring->ring;
>>> - memcpy(dst, src, chunk2);
>>> + memcpy(ring->ring, src, chunk2);
>>> }
>>> ring->wptr += count_dw;
>>
>> Dropping volatile and removal of local variables in
>> amdgpu_ring_insert_nop() and amdgpu_ring_write_multiple() look
>> unrelated to me. Best practice would be if you had split those two
>> changes. Or at least add to the commit message the usual "while at it"
>> blurb.
>
> To me it looked like the main purpose of the local variable was to add
> the void* cast to avoid the compiler warning of removing the volatile
> when giving the parameter to memcpy().
>
> Since that isn't needed any more I've got rid of the local variable as
> well.
Agreed on that, but I would still add a paragraph in the commit message.
Can add while merging too, no need to respin. Something like:
"With that removed we can also remove the local variable in two places
which relied on it to strip the volatile property."
Regards,
Tvrtko
More information about the amd-gfx
mailing list