[Intel-xe] [RFC 0/3] drm/xe: Add preemption timeout config options

Thomas Hellström thomas.hellstrom at linux.intel.com
Tue Jun 13 17:32:46 UTC 2023


Hi, again

On 6/13/23 18:52, Thomas Hellström wrote:
> Hi, Niranjana,
>
> On 6/13/23 17:52, Niranjana Vishwanathapura wrote:
>> On Tue, Jun 13, 2023 at 09:31:00AM +0200, Thomas Hellström wrote:
>>> Hi, Niranjana
>>>
>>> The plan to handle non-preemptible hardware with Xe is either to use 
>>> faulting or to pin memory and use CGROUPS to control the
>>> pinning. So when that is fully flushed out, there shouldn't be a 
>>> need for these changes?
>>>
>>
>> Hi Thomas,
>> Currently the preemption timeout is hardcoded in the driver to 640 ms.
>> These patches are for preemptible hardware and allow the preemption
>> timeout to be configured at compile time. They are being ported from
>> i915.
>>
>> Also, the purpose is actually to give compute tasks more time to reach
>> a preemption point after a preemption request has been issued. This is
>> necessary because Gen12 does not support mid-thread preemption and
>> compute tasks can have long-running threads.
>
> For ordinary (!LR) contexts, I don't see a problem really as long as 
> this can't be used to extend dma-fence signalling time beyond 10s, I 
> perhaps need to sync up with Matt to understand how that limit is 
> enforced in Xe.
>
> But for the compute context, I'm worried that people will start using 
> this to extend the signalling time of the preempt fences to make sure 
> jobs aren't killed shortly after, for example, userptr invalidation 
> because the compute kernel didn't preempt fast enough. That is not 
> what we want for Xe clients.
>
> The suggested solution to this latter problem in Xe for non-preemptible 
> hardware is to avoid having preempt fences request a timeout in the 
> first place by pinning memory, which will need the CGROUPS support that 
> Maarten is working on.
>
> So in short I believe this is OK if there is no way this can be abused 
> to extend signalling time of ordinary dma-fences or compute 
> preempt-fences beyond 10s.
>
> /Thomas

So after discussing with Matt, I think this series is fine if we limit 
the configurable compute timeout to 10 s, both at compile time and at 
run time.
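A minimal sketch of such a clamp (the function and constant names here are hypothetical; the 10 s cap and the millisecond units are from this thread, not from the posted patches):

```c
#include <stdint.h>

/* Upper bound discussed in this thread: dma-fence signalling must
 * complete within 10 seconds, so cap any configured preempt timeout. */
#define XE_PREEMPT_TIMEOUT_MAX_MS 10000u

/* Hypothetical helper: clamp a compile-time or run-time supplied
 * preempt timeout (in milliseconds) to the 10 s limit. */
static inline uint32_t xe_clamp_preempt_timeout_ms(uint32_t timeout_ms)
{
	return timeout_ms > XE_PREEMPT_TIMEOUT_MAX_MS ?
	       XE_PREEMPT_TIMEOUT_MAX_MS : timeout_ms;
}
```

The same clamp would apply whether the value comes from Kconfig or from a run-time knob, so the limit cannot be bypassed by either path.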

If we need longer compute timeouts, and indeed want to use this as a 
workaround for non-preemptible hardware, we need to redo the discussion 
we had previously on how to tackle that problem.

/Thomas


>
>
>>
>> Similarly, DRM_XE_PREEMPT_TIMEOUT_COMPUTE_COPY sets the default timeout
>> for the copy engine to a higher value on PVC.
>>
>> Niranjana
>>
>>> /Thomas
>>>
>>> On 6/10/23 02:12, Niranjana Vishwanathapura wrote:
>>>> Allow preemption timeouts to be specified as config options.
>>>>
>>>> Signed-off-by: Niranjana Vishwanathapura 
>>>> <niranjana.vishwanathapura at intel.com>
>>>>
>>>> Niranjana Vishwanathapura (3):
>>>>   drm/xe: Add CONFIG_DRM_XE_PREEMPT_TIMEOUT
>>>>   drm/xe: Add DRM_XE_PREEMPT_TIMEOUT_COMPUTE
>>>>   drm/xe: Add DRM_XE_PREEMPT_TIMEOUT_COMPUTE_COPY
>>>>
>>>>  drivers/gpu/drm/xe/Kconfig         |  6 +++++
>>>>  drivers/gpu/drm/xe/Kconfig.profile | 40 ++++++++++++++++++++++++++++++
>>>>  drivers/gpu/drm/xe/xe_engine.c     | 30 +++++++++++++++++++++-
>>>>  3 files changed, 75 insertions(+), 1 deletion(-)
>>>>  create mode 100644 drivers/gpu/drm/xe/Kconfig.profile
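For reference, the new Kconfig.profile listed in the diffstat above presumably carries entries along these lines (a sketch only: the option name comes from the patch titles, the 640 ms default from earlier in this thread, and the range and help text are assumptions):

```kconfig
config DRM_XE_PREEMPT_TIMEOUT
	int "Preempt timeout (ms)"
	default 640 # milliseconds; the value currently hardcoded in the driver
	help
	  How long to wait (in milliseconds) for a preemption event to
	  occur after a preemption request has been issued. If the running
	  context does not reach a preemption point and yield within this
	  time, it is considered hung and may be reset.

	  May be 0 to disable the timeout.
```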
>>>>

