<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
</head>
<body>
<div><br>
</div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> Joonas Lahtinen <joonas.lahtinen@linux.intel.com><br>
<b>Sent:</b> Friday, May 10, 2024 2:39:56 PM<br>
<b>To:</b> De Marchi, Lucas <lucas.demarchi@intel.com>; Brost, Matthew <matthew.brost@intel.com><br>
<b>Cc:</b> Ghimiray, Himal Prasad <himal.prasad.ghimiray@intel.com>; intel-xe@lists.freedesktop.org <intel-xe@lists.freedesktop.org>; Upadhyay, Tejas <tejas.upadhyay@intel.com><br>
<b>Subject:</b> Re: [PATCH] drm/xe: Modify minimum of various scheduler timeout to 0</font>
<div> </div>
</div>
<div class="BodyFragment"><font size="2"><span style="font-size:11pt;">
<div class="PlainText">Quoting Matthew Brost (2024-05-09 19:00:43)<br>
> On Wed, May 08, 2024 at 08:06:52AM -0500, Lucas De Marchi wrote:<br>
> > On Mon, May 06, 2024 at 07:04:47PM GMT, Himal Prasad Ghimiray wrote:<br>
> > > Remove the min/max configs and change minimum values of preemption and<br>
> > > timeslice to 0. The sysman teams must deactivate preemption and<br>
> > > timeslice to assess exclusivity mode. They depend on configuring<br>
> > > timeslice_duration_us and preempt_timeout_us to 0, ensuring these<br>
> > > values are set at their minimum.<br>
> > <br>
> > nack for now, based on this explanation. See below.<br>
> > <br>
> > > <br>
> > > Cc: Matthew Brost <matthew.brost@intel.com><br>
> > > Cc: Tejas Upadhyay <tejas.upadhyay@intel.com><br>
> > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com><br>
> > > Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com><br>
<br>
<SNIP><br>
<br>
> > > +++ b/drivers/gpu/drm/xe/xe_hw_engine.h<br>
> > > @@ -10,41 +10,17 @@<br>
> > > <br>
> > > struct drm_printer;<br>
> > > <br>
> > > -#ifdef CONFIG_DRM_XE_JOB_TIMEOUT_MIN<br>
> > > -#define XE_HW_ENGINE_JOB_TIMEOUT_MIN CONFIG_DRM_XE_JOB_TIMEOUT_MIN<br>
> > > -#else<br>
> > > #define XE_HW_ENGINE_JOB_TIMEOUT_MIN 1<br>
> > > -#endif<br>
> > > -#ifdef CONFIG_DRM_XE_JOB_TIMEOUT_MAX<br>
> > > -#define XE_HW_ENGINE_JOB_TIMEOUT_MAX CONFIG_DRM_XE_JOB_TIMEOUT_MAX<br>
> > > -#else<br>
> > > #define XE_HW_ENGINE_JOB_TIMEOUT_MAX (10 * 1000)<br>
> > <br>
> > previously one could configure a kernel with a different max.<br>
> > And now it's hardcoded. How is this better and related to the<br>
> > motivation in the commit message?<br>
> > <br>
> > is this commit doing 2 different things?<br>
> > <br>
> > 1) Removing the _MAX from config since nobody is or should change that<br>
> > 2) Modify the min value to 0... That seems weird as basically there is<br>
> > no minimum.<br>
> <br>
> This is true.<br>
<br>
Maybe I've missed some discussion but what is wrong with the previously<br>
agreed approach where KConfig would determine hard _MIN/_MAX bounds<br>
(and maybe _DEFAULT) that the kernel can work with, then there would be<br>
sysfs min/max (+ default) which the sysadmin sets as comfortable with<br>
considering the expected use-cases on the system and application then<br>
choose within those bounds (if it wants expedited hang detection for an<br>
example)?<br>
<br>
> > I'd like to hear Matt Brost's opinion on that.<br>
> <br>
> I chatted with Lucas and second his nack. The requirements for this<br>
> change are coming from sysman and are unclear. If sysman needs to set<br>
> this to 0 (akin to no maximum), why can't sysman build the kernel with a<br>
> Kconfig to allow this? This use case appears to be an HPC/AI data center<br>
> scenario in which the kernel is probably built with a non-standard<br>
> Kconfig anyway. Why can't CONFIG_DRM_*_MIN be set to zero? This needs to<br>
> be addressed clearly and publicly.<br>
<br>
I also thought we agreed on doing different build for such scenario<br>
where the interactive desktop is not expected to be used, so larger<br>
delays are more allowable.<br>
<br>
The default build simply can't allow unlimited blocking of the HW by<br>
certain workloads in order to make sure the desktop/compositor stays<br>
operational. That can only be enabled by default build when the long-running<br>
workloads can be isolated so they won't interfere with the workloads<br>
using dma-fences. And I don't think that code exists yet.<br>
<br>
Sysman should be able to report a feature missing from the kernel if it<br>
detects that it can't set the limits wanted. This can be reported to<br>
user via error code and/or error message asking to install a different<br>
kernel build.<br>
<br>
Same strategy has been applied in the past for various kernel features<br>
like realtime/virtualization etc. <br>
<br>
Regards, Joonas<br>
<br>
> <br>
> Matt<br>
> <br>
> > <br>
> > Lucas De Marchi<br>
> > <br>
> > > -#endif<br>
> > > -#ifdef CONFIG_DRM_XE_TIMESLICE_MIN<br>
> > > -#define XE_HW_ENGINE_TIMESLICE_MIN CONFIG_DRM_XE_TIMESLICE_MIN<br>
> > > -#else<br>
> > > -#define XE_HW_ENGINE_TIMESLICE_MIN 1<br>
> > > -#endif<br>
> > > -#ifdef CONFIG_DRM_XE_TIMESLICE_MAX<br>
> > > -#define XE_HW_ENGINE_TIMESLICE_MAX CONFIG_DRM_XE_TIMESLICE_MAX<br>
> > > -#else<br>
> > > +#define XE_HW_ENGINE_TIMESLICE_MIN 0<br>
> > > #define XE_HW_ENGINE_TIMESLICE_MAX (10 * 1000 * 1000)<br>
> > > -#endif<br>
> > > #ifdef CONFIG_DRM_XE_PREEMPT_TIMEOUT<br>
> > > #define XE_HW_ENGINE_PREEMPT_TIMEOUT CONFIG_DRM_XE_PREEMPT_TIMEOUT<br>
> > > #else<br>
> > > #define XE_HW_ENGINE_PREEMPT_TIMEOUT (640 * 1000)<br>
> > > #endif<br>
> > > -#ifdef CONFIG_DRM_XE_PREEMPT_TIMEOUT_MIN<br>
> > > -#define XE_HW_ENGINE_PREEMPT_TIMEOUT_MIN CONFIG_DRM_XE_PREEMPT_TIMEOUT_MIN<br>
> > > -#else<br>
> > > -#define XE_HW_ENGINE_PREEMPT_TIMEOUT_MIN 1<br>
> > > -#endif<br>
> > > -#ifdef CONFIG_DRM_XE_PREEMPT_TIMEOUT_MAX<br>
> > > -#define XE_HW_ENGINE_PREEMPT_TIMEOUT_MAX CONFIG_DRM_XE_PREEMPT_TIMEOUT_MAX<br>
> > > -#else<br>
> > > +#define XE_HW_ENGINE_PREEMPT_TIMEOUT_MIN 0<br>
> > > #define XE_HW_ENGINE_PREEMPT_TIMEOUT_MAX (10 * 1000 * 1000)<br>
> > > -#endif<br>
> > > <br>
> > > int xe_hw_engines_init_early(struct xe_gt *gt);<br>
> > > int xe_hw_engines_init(struct xe_gt *gt);<br>
> > > -- <br>
> > > 2.25.1<br>
> > > <br>
</div>
</span></font></div>
</body>
</html>