[PATCH 3/8] drm/amdgpu: support mem copy for LSDMA
Luben Tuikov
luben.tuikov at amd.com
Mon May 9 14:59:31 UTC 2022
On 2022-05-09 02:06, Christian König wrote:
> Am 07.05.22 um 02:03 schrieb Luben Tuikov:
>> [SNIP]
>> + expect_val = LSDMA_PIO_STATUS__PIO_IDLE_MASK | LSDMA_PIO_STATUS__PIO_FIFO_EMPTY_MASK;
>> + for (i = 0; i < usec_timeout; i++) {
>> + tmp = RREG32_SOC15(LSDMA, 0, regLSDMA_PIO_STATUS);
>> + if ((tmp & expect_val) == expect_val)
>> + break;
>> + udelay(1);
>> + }
>>>> Does the hardware specify a minimum delay after the write to the LSDMA_PIO_COMMAND,
>>>> and before we check the LSDMA_PIO_STATUS?
>>> I'm not sure, but it should be pretty minimal.
>> My point is that there should be a delay at least as large as the polling delay,
>> after writing to the command register and before reading the status register.
>> So that should be at least a 1 us.
>>
>> There should be a udelay(1) after writing the command to the LSDMA_PIO_COMMAND
>> register, before the for () loop begins.
>
> I can't count how often I had to reject that approach. This is exactly
> what we should *not* do.
>
> The SRBM (or whatever that's called on newer hardware) is the one who
> inserts the delay here. The driver should not explicitly do that
> additionally.
>
> The background is that a) the read is often used to commit the write
> operations, both for the PCIe as well as internally in the GPU and b)
> the read only comes back when the SRBM has synced up between the bus
> interface and the hardware block in question.
>
> So when the SRBM has synced between the two clock domains the operation
> has usually been completed or at least started. The minimal delay we
> enter between reads is just to avoid hammering on the register bus when
> the hardware block in question is (for example) power gated or otherwise
> not reachable.
Yeah, that's what I wanted to avoid, but if you say that we need it to
make sure that the write has committed, then it's okay.
Regards,
Luben
>
> Regards,
> Christian.
>
>>
>> Regards,
>> Luben
>>
>>
>>>> At the moment the code above checks the status immediately after writing to
>>>> LSDMA_PIO_COMMAND, and the copy wouldn't be completed.
>>>>
>>>> If the poll timeout is 1 us, as the loop shows us, maybe the command completion
>>>> is larger than that (the time after writing to LSDMA_PIO_COMMAND and before
>>>> checking LSDMA_PIO_STATUS)?
>>> The time it takes for the copy will be dependent on the amount of data
>>> there is to copy. While the command is probably not complete on the
>>> first read of the status register, it may be required to make sure the
>>> previous write goes through.
>>>
>>> Alex
>>>
>>>>> +
>>>>> + if (i >= usec_timeout) {
>>>>> + dev_err(adev->dev, "LSDMA PIO failed to copy memory!\n");
>>>>> + return -ETIMEDOUT;
>>>>> + }
>>>>> +
>>>>> + return 0;
>>>>> +}
>>>>> +
>>>>> const struct amdgpu_lsdma_funcs lsdma_v6_0_funcs = {
>>>>> + .copy_mem = lsdma_v6_0_copy_mem
>>>>> };
>>>> Regards,
>>>> --
>>>> Luben
>> Regards,
>
Regards,
--
Luben
More information about the amd-gfx
mailing list