[PATCH 1/2] drm/amdgpu: use multipipe compute policy on non PL11 asics

Mon Nov 6 06:49:35 UTC 2017

Hi Andres,

With this patch of yours applied, OCLperf hangs.

Could you explain more?

If I understand correctly, the difference between running with and 
without this patch is whether the first two queues or all queues of 
pipe0 are set in queue_bitmap.
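
As a rough illustration of the two policies, here is a small standalone
sketch (not the driver code). The topology constants, the index
decomposition, and the name toy_acquire_queues are assumptions made for
the example. The "first two queues" condition follows the quoted diff;
the pipe0 condition is taken from the description above, since that
branch is not visible in the quoted hunk.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical topology, chosen only for illustration. */
enum { TOY_NUM_MEC = 2, TOY_PIPES_PER_MEC = 4, TOY_QUEUES_PER_PIPE = 8 };
enum { TOY_MAX_QUEUES = TOY_NUM_MEC * TOY_PIPES_PER_MEC * TOY_QUEUES_PER_PIPE };

/* Build a bitmap of the compute queues the toy "driver" would own. */
static uint64_t toy_acquire_queues(bool multipipe_policy)
{
	uint64_t bitmap = 0;

	for (int i = 0; i < TOY_MAX_QUEUES; ++i) {
		/* Assumed layout: queue index varies fastest, then pipe, then MEC. */
		int queue = i % TOY_QUEUES_PER_PIPE;
		int pipe = (i / TOY_QUEUES_PER_PIPE) % TOY_PIPES_PER_MEC;
		int mec = (i / TOY_QUEUES_PER_PIPE) / TOY_PIPES_PER_MEC;
		bool owned;

		if (multipipe_policy)
			/* As in the quoted diff: queues 0 and 1 on MEC 0, any pipe. */
			owned = (mec == 0 && queue < 2);
		else
			/* Assumption: every queue of pipe 0 on MEC 0. */
			owned = (mec == 0 && pipe == 0);

		if (owned)
			bitmap |= (uint64_t)1 << i;
	}

	return bitmap;
}
```

With these assumed numbers both policies own eight queues. The
difference is only in how they are spread: two per pipe across four
pipes, or all eight on a single pipe.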

The UMD can then submit compute commands on a different number of 
queues, with the queue selected by amdgpu_queue_mgr_map.

I checked the amdgpu_queue_mgr_map implementation. CS_IOCTL can map a 
user ring to different hw rings depending on whether they are busy or 
idle, right?

If so, I see a bug in it that will break our sched_fence. Our sched 
fence assumes that jobs are executed in order, and your queue mapping 
breaks this assumption.
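
The ordering concern can be shown with a toy model. This is only a
sketch of the worry raised here, not of how amdgpu_queue_mgr_map or the
GPU scheduler actually behave. The struct, the function name, and the
tick-based timing are all hypothetical. It assumes each hw ring runs
its own jobs FIFO and that rings run independently of each other.

```c
#include <stdbool.h>
#include <stddef.h>

enum { TOY_MAX_RINGS = 8 };	/* hypothetical upper bound on hw rings */

struct toy_job {
	int hw_ring;		/* hw ring this job was mapped to */
	unsigned duration;	/* execution time in arbitrary ticks */
};

/*
 * Jobs are listed in submission order. Returns true if every job
 * finishes no earlier than the job submitted before it.
 */
static bool toy_completes_in_order(const struct toy_job *jobs, size_t n)
{
	unsigned ring_done_at[TOY_MAX_RINGS] = { 0 };
	unsigned last_done = 0;

	for (size_t i = 0; i < n; ++i) {
		int r = jobs[i].hw_ring;

		if (r < 0 || r >= TOY_MAX_RINGS)
			return false;	/* invalid ring index */

		/* FIFO per ring: the job ends after all earlier work on that ring. */
		ring_done_at[r] += jobs[i].duration;

		if (ring_done_at[r] < last_done)
			return false;	/* finished before an earlier submission */

		last_done = ring_done_at[r];
	}

	return true;
}
```

In this model, a long job followed by a short job on the same ring
always completes in order. If the short job is mapped to an idle second
ring instead, it finishes first, which is the case an in-order fence
assumption would not cover.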

Regards,

David Zhou

On 2017-09-27 00:22, Andres Rodriguez wrote:
> A performance regression for OpenCL tests on Polaris11 had this feature
> disabled for all asics.
>
> Instead, disable it selectively on the affected asics.
>
> Signed-off-by: Andres Rodriguez <andresx7 at gmail.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 14 ++++++++++++--
>   1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> index 4f6c68f..3d76e76 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
> @@ -109,9 +109,20 @@ void amdgpu_gfx_parse_disable_cu(unsigned *mask, unsigned max_se, unsigned max_s
>   	}
>   }
>   
> +static bool amdgpu_gfx_is_multipipe_capable(struct amdgpu_device *adev)
> +{
> +	/* FIXME: spreading the queues across pipes causes perf regressions
> +	 * on POLARIS11 compute workloads */
> +	if (adev->asic_type == CHIP_POLARIS11)
> +		return false;
> +
> +	return adev->gfx.mec.num_mec > 1;
> +}
> +
>   void amdgpu_gfx_compute_queue_acquire(struct amdgpu_device *adev)
>   {
>   	int i, queue, pipe, mec;
> +	bool multipipe_policy = amdgpu_gfx_is_multipipe_capable(adev);
>   
>   	/* policy for amdgpu compute queue ownership */
>   	for (i = 0; i < AMDGPU_MAX_COMPUTE_QUEUES; ++i) {
> @@ -125,8 +136,7 @@ void amdgpu_gfx_compute_queue_acquire(struct amdgpu_device *adev)
>   		if (mec >= adev->gfx.mec.num_mec)
>   			break;
>   
> -		/* FIXME: spreading the queues across pipes causes perf regressions */
> -		if (0) {
> +		if (multipipe_policy) {
>   			/* policy: amdgpu owns the first two queues of the first MEC */
>   			if (mec == 0 && queue < 2)
>   				set_bit(i, adev->gfx.mec.queue_bitmap);