[PATCH] drm/amdgpu: Adding wait time before reading upll control register

Mon Jun 29 22:46:04 UTC 2020

On 2020-06-26 1:04 p.m., Christian König wrote:
> Am 26.06.20 um 18:12 schrieb Alex Jivin:
>> Adding a delay between writing to UVD control register and reading from it.
>> This is to allow the HW to process the write command.
>>
>> Signed-off-by: Alex Jivin <alex.jivin at amd.com>
>> Suggested-By: Luben Tukov <luben.tuikov at amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/si.c | 6 ++++++
>>   1 file changed, 6 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/si.c b/drivers/gpu/drm/amd/amdgpu/si.c
>> index 9d7b4ccd17b8..42cdc14fb79d 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/si.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/si.c
>> @@ -1435,6 +1435,12 @@ static int si_uvd_send_upll_ctlreq(struct amdgpu_device *adev,
>>   	/* Assert UPLL_CTLREQ */
>>   	WREG32_P(cg_upll_func_cntl, UPLL_CTLREQ_MASK, ~UPLL_CTLREQ_MASK);
>>   
>> +	/* Waiting for HW to process the previous write.
>> +	 * This is to give a chance to HW to act before
>> +	 * the first read is done.
>> +	 */
>> +	mdelay(1);
>> +
> 
> Mhm, that is most likely not a good idea.
> 
> We need to issue a read after the write to make sure that the stuff is 
> send out to the hardware.

Tracing down WREG32_P(), it seems to be writing and then reading the register,
back to back, twice over, when deferring to PCIe space, and just writel() when in mmio
space. (There is separate thread on this as it doesn't seem correct.)

> Adding a delay here is probably just postponing that. Do we have some 
> note in the documentation that this is necessary?

Flushing the write buffer (by issuing a read if necessary) is different
than waiting for the hardware to process the request.

The current code does flush the write buffer by reading back,
and then delaying (both in the loop), which does achieve this,
but it leaves a corner case as I wrote in my review.
The corner case is that if the status
register changes to "done" while in the last "delay()"
we then check the loop invariant and exit, as opposed to reading
the register and determining that it is done successfully.

Current, all in pseudo-code:

write()

for (count = 0; count < MAX_COUNT; count++) {
	res = read()
	if (res is success) break
	mdelay(10)              <-- if it changes here, we miss it on the last iteration
}

Optimal:

write()   ; assume write buffer flush
mdelay(9)
do {
	mdelay(1)
	res = read()
} while (res != success && ++count < MAX_COUNT)

This solves the corner case, and ensures a time delay of 10 for
the hardware to process its job, but a delay of 1 for polling
the status, as it could be done "anytime now."

Assuming that WREG32_P() flushes the write buffer, as it seems
that it does, the idea here is to give the hardware some time to process
the request (from writing a value to it), but when polling to poll
a shorter amount of time.

I would've also liked to see the mutex business fixed
as well from my original patch review, as it is easy to prove
that it is always taken, so no need to embed it inside the if().

Regards,
Luben

> 
> Christian.
> 
>>   	/* Wait for CTLACK and CTLACK2 to get asserted */
>>   	for (i = 0; i < SI_MAX_CTLACKS_ASSERTION_WAIT; ++i) {
>>   		uint32_t mask = UPLL_CTLACK_MASK | UPLL_CTLACK2_MASK;
>