[PATCH 3/8] drm/amdgpu: support mem copy for LSDMA

Christian König ckoenig.leichtzumerken at gmail.com
Mon May 9 06:06:37 UTC 2022


Am 07.05.22 um 02:03 schrieb Luben Tuikov:
> [SNIP]
> +     expect_val = LSDMA_PIO_STATUS__PIO_IDLE_MASK | LSDMA_PIO_STATUS__PIO_FIFO_EMPTY_MASK;
> +     for (i = 0; i < usec_timeout; i++) {
> +             tmp = RREG32_SOC15(LSDMA, 0, regLSDMA_PIO_STATUS);
> +             if ((tmp & expect_val) == expect_val)
> +                     break;
> +             udelay(1);
> +     }
>>> Does the hardware specify a minimum delay after the write to the LSDMA_PIO_COMMAND,
>>> and before we check the LSDMA_PIO_STATUS?
>> I'm not sure, but it should be pretty minimal.
> My point is that there should be a delay at least as large as the polling delay,
> after writing to the command register and before reading the status register.
> So that should be at least a 1 us.
>
> There should be a udelay(1) after writing the command to the LSDMA_PIO_COMMAND
> register, before the for () loop begins.

I can't count how often I had to reject that approach. This is exactly 
what we should *not* do.

The SRBM (or whatever that's called on newer hardware) is the one who 
inserts the delay here. The driver should not explicitly do that 
additionally.

The background is that a) the read is often used to commit the write 
operations, both for the PCIe as well as internally in the GPU and b) 
the read only comes back when the SRBM has synced up between the bus 
interface and the hardware block in question.

So when the SRBM has synced between the two clock domains the operation 
has usually been completed or at least started. The minimal delay we 
enter between reads is just to avoid hammering on the register bus when 
the hardware block in question is (for example) power gated or otherwise 
not reachable.

Regards,
Christian.

>
> Regards,
> Luben
>
>
>>> At the moment the code above checks the status immediately after writing to
>>> LSDMA_PIO_COMMAND, and the copy wouldn't be completed.
>>>
>>> If the poll timeout is 1 us, as the loop shows us, maybe the command completion
>>> is larger than that (the time after writing to LSDMA_PIO_COMMAND and before
>>> checking LSDMA_PIO_STATUS)?
>> The time it takes for the copy will be dependent on the amount of data
>> there is to copy.  While the command is probably not complete on the
>> first read of the status register, it may be required to make sure the
>> previous write goes through.
>>
>> Alex
>>
>>>> +
>>>> +     if (i >= usec_timeout) {
>>>> +             dev_err(adev->dev, "LSDMA PIO failed to copy memory!\n");
>>>> +             return -ETIMEDOUT;
>>>> +     }
>>>> +
>>>> +     return 0;
>>>> +}
>>>> +
>>>>   const struct amdgpu_lsdma_funcs lsdma_v6_0_funcs = {
>>>> +     .copy_mem = lsdma_v6_0_copy_mem
>>>>   };
>>> Regards,
>>> --
>>> Luben
> Regards,



More information about the amd-gfx mailing list