[Mesa-dev] [PATCH] radeonsi: add basic glClearBufferSubData acceleration
Marek Olšák
maraeo at gmail.com
Thu Nov 5 11:56:10 PST 2015
On Thu, Nov 5, 2015 at 7:58 PM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
> On 04.11.2015 00:47, Marek Olšák wrote:
>>
>> From: Marek Olšák <marek.olsak at amd.com>
>>
>> ---
>> src/gallium/drivers/radeonsi/si_blit.c | 55
>> ++++++++++++++++++++++++++++++++++
>> 1 file changed, 55 insertions(+)
>>
>> diff --git a/src/gallium/drivers/radeonsi/si_blit.c
>> b/src/gallium/drivers/radeonsi/si_blit.c
>> index fce014a..e934146 100644
>> --- a/src/gallium/drivers/radeonsi/si_blit.c
>> +++ b/src/gallium/drivers/radeonsi/si_blit.c
>> @@ -731,9 +731,64 @@ static void si_flush_resource(struct pipe_context
>> *ctx,
>> }
>> }
>>
>> +static void si_pipe_clear_buffer(struct pipe_context *ctx,
>> + struct pipe_resource *dst,
>> + unsigned offset, unsigned size,
>> + const void *clear_value,
>> + int clear_value_size)
>> +{
>> + struct si_context *sctx = (struct si_context*)ctx;
>> + const uint32_t *u32 = clear_value;
>> + unsigned i;
>> + bool clear_value_fits_dword = true;
>> + uint8_t *map;
>> +
>> + if (clear_value_size > 4)
>> + for (i = 1; i < clear_value_size / 4; i++)
>> + if (u32[0] != u32[i]) {
>> + clear_value_fits_dword = false;
>> + break;
>> + }
>> +
>> + /* Use CP DMA for the simple case. */
>> + if (offset % 4 == 0 && size % 4 == 0 && clear_value_fits_dword) {
>> + uint32_t value = u32[0];
>> +
>> + switch (clear_value_size) {
>> + case 1:
>> + value &= 0xff;
>> + value |= (value << 8) | (value << 16) | (value <<
>> 24);
>> + break;
>> + case 2:
>> + value &= 0xffff;
>> + value |= value << 16;
>> + break;
>> + }
>
>
> To reduce the chance of complaints by valgrind et al:
>
> switch (clear_value_size) {
> case 1:
>     value = *(uint8_t *)u32;
>     value |= (value << 8) | (value << 16) | (value << 24);
>     break;
> case 2:
>     value = *(uint16_t *)u32;
>     value |= value << 16;
>     break;
> default:
>     value = *u32;
>     break;
> }
Thanks.
The preliminary plan is to use transform feedback for fills >= 64 bits
(already implemented by u_blitter) and CP DMA for 32-bit fills and any
fills that can be lowered to 32 bits.
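
A minimal sketch of what "lowered to 32-bit" could look like, written as
standalone C rather than driver code. The helper name
lower_clear_value_to_dword is hypothetical and not part of the patch. It
replicates 8-bit and 16-bit values into a dword and accepts larger values
only when every dword in them is identical. It reads the value through
memcpy so that no more than clear_value_size bytes are touched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not in the patch): returns true and stores the
 * replicated 32-bit pattern in *out if the fill can be done per dword. */
static bool lower_clear_value_to_dword(const void *clear_value,
                                       unsigned clear_value_size,
                                       uint32_t *out)
{
    const uint8_t *bytes = clear_value;
    uint32_t first, other;
    uint16_t half;
    unsigned i;

    assert(clear_value != NULL && out != NULL);

    switch (clear_value_size) {
    case 1:
        /* Replicate one byte into all four byte lanes. */
        *out = bytes[0] * 0x01010101u;
        return true;
    case 2:
        /* Replicate one 16-bit value into both halves. */
        memcpy(&half, bytes, sizeof(half));
        *out = (uint32_t)half | ((uint32_t)half << 16);
        return true;
    }

    if (clear_value_size < 4 || clear_value_size % 4 != 0)
        return false;

    /* Larger values qualify only if all of their dwords match. */
    memcpy(&first, bytes, sizeof(first));
    for (i = 1; i < clear_value_size / 4; i++) {
        memcpy(&other, bytes + i * 4, sizeof(other));
        if (other != first)
            return false;
    }

    *out = first;
    return true;
}
```

A caller would fall back to another path (for example the transform
feedback fill mentioned above) whenever the helper returns false.
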
Unaligned 8-bit and 16-bit fills should fill the largest aligned
subrange using CP DMA. The unaligned leading and trailing dwords will
then be filled separately using:
  1. COPY_DATA from mem to reg
  2. REG_RMW to fill the requested bytes
  3. COPY_DATA from reg to mem
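
The range splitting described above can be sketched on the CPU side as
follows. This is only an illustration under stated assumptions: the names
fill_split, split_fill_range, byte_mask and rmw_dword are hypothetical,
byte k of a dword is assumed to sit at bits 8k..8k+7 (little endian), and
the offset is assumed to be a multiple of the clear value size so that the
replicated dword pattern stays in phase. rmw_dword only models the effect
of the read-modify-write step on one dword; it does not emit any packets.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a byte range split at dword boundaries. */
struct fill_split {
    unsigned head_offset, head_size;     /* partial leading dword */
    unsigned middle_offset, middle_size; /* dword-aligned part (CP DMA) */
    unsigned tail_offset, tail_size;     /* partial trailing dword */
};

static struct fill_split split_fill_range(unsigned offset, unsigned size)
{
    struct fill_split s = {0};
    unsigned end = offset + size;
    unsigned aligned_start = (offset + 3) & ~3u; /* round up */
    unsigned aligned_end = end & ~3u;            /* round down */

    if (aligned_start > aligned_end) {
        /* The whole range lies inside a single dword. */
        s.head_offset = offset;
        s.head_size = size;
        return s;
    }

    s.head_offset = offset;
    s.head_size = aligned_start - offset;
    s.middle_offset = aligned_start;
    s.middle_size = aligned_end - aligned_start;
    s.tail_offset = aligned_end;
    s.tail_size = end - aligned_end;
    return s;
}

/* Mask selecting num_bytes bytes starting at first_byte within a dword. */
static uint32_t byte_mask(unsigned first_byte, unsigned num_bytes)
{
    uint32_t m;

    assert(first_byte + num_bytes <= 4);
    if (num_bytes == 0)
        return 0;
    m = num_bytes == 4 ? 0xffffffffu : (1u << (num_bytes * 8)) - 1;
    return m << (first_byte * 8);
}

/* Keep the bytes outside the mask, take the fill pattern inside it. */
static uint32_t rmw_dword(uint32_t old, uint32_t fill, uint32_t mask)
{
    return (old & ~mask) | (fill & mask);
}
```

For the leading dword the mask would be byte_mask(offset & 3, head_size),
and for the trailing dword byte_mask(0, tail_size).
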
Marek