[PATCH] drm/amdgpu: enable separate timeout setting for every ring type

Quan, Evan Evan.Quan at amd.com
Mon Apr 29 11:57:21 UTC 2019


Thanks for reviewing. Will update them in V2.

Regards,
Evan
> -----Original Message-----
> From: Michel Dänzer <michel at daenzer.net>
> Sent: Monday, April 29, 2019 6:17 PM
> To: Quan, Evan <Evan.Quan at amd.com>
> Cc: amd-gfx at lists.freedesktop.org; Deucher, Alexander
> <Alexander.Deucher at amd.com>; Lou, Wentao <Wentao.Lou at amd.com>;
> Koenig, Christian <Christian.Koenig at amd.com>
> Subject: Re: [PATCH] drm/amdgpu: enable separate timeout setting for
> every ring type
> 
> On 2019-04-29 10:57 a.m., Evan Quan wrote:
> > Every ring type can have its own timeout setting.
> >
> > Change-Id: I992f224f36bb33acd560162bffd2c3e987840a7e
> > Signed-off-by: Evan Quan <evan.quan at amd.com>
> 
> This is going in a good direction, but there are still some minor/cosmetic
> issues.
> 
> 
> > @@ -958,13 +960,16 @@ static void
> amdgpu_device_check_arguments(struct amdgpu_device *adev)
> >  		amdgpu_vram_page_split = 1024;
> >  	}
> >
> > -	if (amdgpu_lockup_timeout == 0) {
> > -		dev_warn(adev->dev, "lockup_timeout msut be > 0,
> adjusting to 10000\n");
> > -		amdgpu_lockup_timeout = 10000;
> > +	ret = amdgpu_device_get_job_timeout(adev);
> > +	if (ret) {
> > +		dev_err(adev->dev, "invalid job timeout setting\n");
> > +		return ret;
> >  	}
> 
> The error message should still explicitly mention lockup_timeout, or it may
> not be clear to the user what it's about. E.g. "Invalid lockup_timeout
> parameter syntax".
> 
> 
> > @@ -232,12 +234,20 @@ MODULE_PARM_DESC(msi, "MSI support (1 =
> enable,
> > 0 = disable, -1 = auto)");  module_param_named(msi, amdgpu_msi, int,
> > 0444);
> >
> >  /**
> > - * DOC: lockup_timeout (int)
> > - * Set GPU scheduler timeout value in ms. Value 0 is invalidated, will be
> adjusted to 10000.
> > - * Negative values mean 'infinite timeout' (MAX_JIFFY_OFFSET). The
> default is 10000.
> > + * DOC: lockup_timeout (string)
> > + * Set GPU scheduler timeout value in ms.
> > + *
> > + * The format is non-compute[.gfx.sdma.video][:compute].
> > + * With no "non-compute[.gfx.sdma.video]" timeout specified, the
> default timeout for non-compute_job is 10000.
> > + * The "non-compute" timeout setting applys to all non compute
> > + IPs(gfx, sdma and video). Unless
> > + * the timeout for this IP is specified separately(by "[.gfx.sdma.video]").
> 
> A dot is a bit weird as a separator, comma (or maybe semicolon) would be
> better.
> 
> Also, instead of requiring a general non-compute value (which is unused if all
> 3 engine specific values are specified) before the engine specific ones, how
> about: If only one non-compute value is specified, it applies to all non-
> compute engines. If multiple values are specified, the first one is for GFX,
> second one for SDMA, third one for video.
> 
> 
> > @@ -1307,6 +1317,66 @@ int amdgpu_file_to_fpriv(struct file *filp, struct
> amdgpu_fpriv **fpriv)
> >  	return 0;
> >  }
> >
> > +int amdgpu_device_get_job_timeout(struct amdgpu_device *adev) {
> > +	char *input = amdgpu_lockup_timeout;
> > +	char *non_compute_setting = NULL;
> > +	char *timeout_setting = NULL;
> > +	int index = 0;
> > +	int ret = 0;
> > +
> > +	/* Default timeout for non compute job is 10000 */
> > +	adev->gfx_timeout =
> > +		adev->sdma_timeout =
> > +			adev->video_timeout = 10000;
> 
> This is a bit weird formatting. :) Maybe it can be one or two lines, otherwise
> the second continuation line should have the same indentation as the first
> one.
> 
> 
> > +	/* Enforce no timeout on compute job at default */
> 
> "by default" (same in amdgpu_fence_driver_init_ring).
> 
> 
> --
> Earthling Michel Dänzer               |              https://www.amd.com
> Libre software enthusiast             |             Mesa and X developer


More information about the amd-gfx mailing list