[PATCH] drm/amdgpu: support gpu recovery tests on compute rings

Deucher, Alexander Alexander.Deucher at amd.com
Mon Apr 29 03:00:38 UTC 2019


Maybe just:
amdgpu.lockup_timeout=<global>,<gfx>,<compute>,<sdma>,<video>
I don't think we really need separate timeouts for all the different video-related engines.

Alex
________________________________
From: Quan, Evan
Sent: Sunday, April 28, 2019 1:37 AM
To: Deucher, Alexander; Michel Dänzer; Koenig, Christian
Cc: Xu, Feifei; Cui, Flora; amd-gfx at lists.freedesktop.org
Subject: RE: [PATCH] drm/amdgpu: support gpu recovery tests on compute rings


How about amdgpu.lockup_timeout=non-compute-jobs[, gfx, sdma, decode, encode][: compute-jobs]?

This will not break backward compatibility.



Also, I’m not sure how to map “decode” and “encode” to the uvd/vce/vcn rings, since there are many rings related to these IPs (uvd, uvd_enc, vce, vcn_dec, vcn_enc, vcn_jpeg).

Maybe we should use the IP names (uvd, vce or vcn) instead of “decode/encode”?
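
To illustrate the IP-name idea, here is a minimal sketch (not driver code) that groups the ring names listed above under their IP by prefix. The function name `ring_to_ip` and the prefix rule are assumptions made for illustration only; the ring names are taken verbatim from this message and may not match the names the driver actually registers.

```python
# Hypothetical sketch: map a ring name from the list above to its IP name.
# IP names come from the message (uvd, vce, vcn); the prefix rule is assumed.
IP_NAMES = ("uvd", "vce", "vcn")


def ring_to_ip(ring_name):
    """Return the IP name a ring belongs to, or None if it is not a video ring."""
    for ip in IP_NAMES:
        # A ring belongs to an IP if it is named exactly after the IP
        # or starts with "<ip>_", e.g. "vcn_jpeg" -> "vcn".
        if ring_name == ip or ring_name.startswith(ip + "_"):
            return ip
    return None
```

With a mapping like this, one timeout per IP would cover every ring of that IP, so uvd and uvd_enc would share a value.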



Regards,

Evan

From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> On Behalf Of Deucher, Alexander
Sent: Friday, April 26, 2019 10:24 PM
To: Michel Dänzer <michel at daenzer.net>; Quan, Evan <Evan.Quan at amd.com>; Koenig, Christian <Christian.Koenig at amd.com>
Cc: Xu, Feifei <Feifei.Xu at amd.com>; Cui, Flora <Flora.Cui at amd.com>; amd-gfx at lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: support gpu recovery tests on compute rings



How about an interface to change the timeout on a per engine (gfx, compute, dma, etc.) basis?

amdgpu.lockup_timeout=<global>,<gfx>,<compute>,<sdma>,<decode>,<encode>

If only one parameter is given, we change the timeout globally. If more are given, they override the global one. We could also add a sysfs interface to change it on the fly.
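
A rough sketch of how such a value could be interpreted, written in Python for brevity rather than as kernel code. The function name `parse_lockup_timeout`, the treatment of an empty field as "keep the global value", and the rejection of extra fields are all assumptions; only the field order comes from the proposal above.

```python
# Hypothetical sketch: field order follows the proposed
# <global>,<gfx>,<compute>,<sdma>,<decode>,<encode> layout.
ENGINES = ("gfx", "compute", "sdma", "decode", "encode")


def parse_lockup_timeout(value):
    """Return a dict mapping each engine name to its timeout in ms."""
    fields = [field.strip() for field in value.split(",")]
    if len(fields) > 1 + len(ENGINES):
        raise ValueError("too many fields in lockup_timeout")

    # The first field is the global timeout and applies to every engine.
    global_timeout = int(fields[0])
    timeouts = {engine: global_timeout for engine in ENGINES}

    # Any further fields override the global value, in engine order.
    for engine, field in zip(ENGINES, fields[1:]):
        if field:  # assumption: an empty field keeps the global value
            timeouts[engine] = int(field)
    return timeouts
```

For example, `parse_lockup_timeout("10000")` gives every engine 10000 ms, while `parse_lockup_timeout("10000,5000,0")` overrides only gfx and compute.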



Alex

________________________________

From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> on behalf of Michel Dänzer <michel at daenzer.net>
Sent: Friday, April 26, 2019 4:35 AM
To: Quan, Evan; Koenig, Christian
Cc: Xu, Feifei; Cui, Flora; amd-gfx at lists.freedesktop.org
Subject: Re: [PATCH] drm/amdgpu: support gpu recovery tests on compute rings



On 2019-04-26 10:20 a.m., Quan, Evan wrote:
> My concern is there is already one module parameter "lockup_timeout".
> parm:           lockup_timeout:GPU lockup timeout in ms > 0 (default 10000) (int)
>
> Adding one more "timeout" seems redundant.
> And that would make the description of "lockup_timeout" (which appears to apply to all jobs) no longer match its real effect (it affects only non-compute jobs).
>
> A better way is to rename "lockup_timeout" to "non-compute lockup_timeout". But I do not think we can change an existing module parameter. Right?

Right. Also, there are already too many amdgpu module parameters, we
should try to remove some rather than adding new ones for every little
thing that could be tweaked. :)

One possibility might be to optionally allow passing multiple values to
lockup_timeout, e.g.

 amdgpu.lockup_timeout=10000,0

The first value would need to have the same meaning as now for backwards
compatibility.
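
A minimal sketch of that idea, in Python rather than kernel C. The function name `parse_two_value_timeout` is made up for illustration, and reading the second value as the compute timeout is an assumption based on the topic of this thread; what should happen when it is omitted is left open here.

```python
# Hypothetical sketch: accept "A" or "A,B" for lockup_timeout.
def parse_two_value_timeout(value):
    """Return (first_ms, second_ms); second_ms is None if not given."""
    parts = value.split(",")
    if len(parts) > 2:
        raise ValueError("expected at most two values")

    # The first value keeps its current meaning, so an existing
    # single-value setting such as "10000" parses exactly as before.
    first_ms = int(parts[0])

    # Assumption: the optional second value is the compute timeout.
    second_ms = int(parts[1]) if len(parts) == 2 else None
    return first_ms, second_ms
```

An old-style setting such as `parse_two_value_timeout("10000")` still yields 10000 as the first value, which is the backwards-compatibility requirement; `parse_two_value_timeout("10000,0")` additionally carries the second value.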


--
Earthling Michel Dänzer               |              https://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
amd-gfx mailing list
amd-gfx at lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx