[PATCH 2/8] drm/amdgpu: fix sdma v4 startup under SRIOV

Christian König ckoenig.leichtzumerken at gmail.com
Tue Oct 9 10:56:59 UTC 2018


On 09.10.2018 at 11:17, Huang Rui wrote:
> On Mon, Oct 08, 2018 at 03:35:15PM +0200, Christian König wrote:
>> [SNIP]
>> -	if (adev->firmware.load_type != AMDGPU_FW_LOAD_PSP) {
>> -		r = sdma_v4_0_load_microcode(adev);
>> +	/* start the gfx rings and rlc compute queues */
>> +	for (i = 0; i < adev->sdma.num_instances; i++)
>> +		sdma_v4_0_gfx_resume(adev, i);
>> +
>> +	if (amdgpu_sriov_vf(adev)) {
>> +		sdma_v4_0_ctx_switch_enable(adev, true);
>> +		sdma_v4_0_enable(adev, true);
>> +	} else {
>> +		r = sdma_v4_0_rlc_resume(adev);
>>   		if (r)
>>   			return r;
>>   	}
> + Monk, Frank,
>
> I probably cannot judge here. Under SRIOV, I saw you disabled ctx switch
> before. Do you have any concerns if we enable it here?

The problem was that those calls were mixed into sdma_v4_0_gfx_resume() 
for the first SDMA instance.

What was happening is that SDMA0 was initialized and, while doing so, 
enabled both SDMA0 and SDMA1. So SDMA1 was starting up before its ring 
buffer was even set up.

That this didn't crash was pure coincidence, and it is most likely also 
the reason why we ran into problems when ring buffers weren't initialized.
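
The ordering issue can be illustrated with a small standalone toy model. 
This is only a sketch and not driver code: struct toy_sdma and all toy_* 
functions are hypothetical names made up for illustration, and the model 
only counts how many engines get enabled before their ring buffer is 
programmed.

```c
#include <stdbool.h>

#define NUM_INSTANCES 2

/* Toy state (hypothetical, not the real amdgpu structures). */
struct toy_sdma {
	bool ring_ready[NUM_INSTANCES];	/* ring buffer has been programmed */
	bool enabled[NUM_INSTANCES];	/* engine has been started */
	int started_without_ring;	/* engines started before ring setup */
};

/* Start every engine and count those whose ring is not set up yet. */
static void toy_enable_all(struct toy_sdma *s)
{
	for (int i = 0; i < NUM_INSTANCES; i++) {
		if (!s->enabled[i] && !s->ring_ready[i])
			s->started_without_ring++;
		s->enabled[i] = true;
	}
}

/* Program the ring buffer of one instance. */
static void toy_ring_setup(struct toy_sdma *s, int i)
{
	s->ring_ready[i] = true;
}

/* Old ordering: the global enable is mixed into the first instance's resume. */
static int toy_start_old(void)
{
	struct toy_sdma s = {0};

	for (int i = 0; i < NUM_INSTANCES; i++) {
		toy_ring_setup(&s, i);
		if (i == 0)
			toy_enable_all(&s);	/* also starts instance 1 too early */
	}
	return s.started_without_ring;
}

/* New ordering: set up all rings first, then enable the engines once. */
static int toy_start_new(void)
{
	struct toy_sdma s = {0};

	for (int i = 0; i < NUM_INSTANCES; i++)
		toy_ring_setup(&s, i);
	toy_enable_all(&s);
	return s.started_without_ring;
}
```

In this model toy_start_old() reports one engine started without a ring 
buffer (the second instance), while toy_start_new() reports none.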

Regards,
Christian.

>
> Otherwise, looks good to me. Christian, may we know which kind of jobs will
> use the sdma page queue (ring)? You know, we only used the sdma gfx queue (ring) before.
>
> Thanks,
> Ray
>
