[Intel-xe] [PATCH 0/2] Fix deadlock issue on d3cold

Thomas Hellström thomas.hellstrom at linux.intel.com
Mon Dec 4 11:19:28 UTC 2023


On 12/4/23 11:57, Matthew Auld wrote:
> Hi,
>
> On Mon, 4 Dec 2023 at 05:18, Riana Tauro <riana.tauro at intel.com> wrote:
>> kernel BOs need to be restored to the same place in VRAM, and with
>> d3cold that means that any VRAM allocation can
>> potentially steal the spot from kernel BOs which then blows up when
>> waking the device up.
>>
>> However if we end up moving xe_device_mem_access_get() much higher
>> up in the hierarchy (start of the gem_create_ioctl) then
>> this is no longer possible.
>>
>> This patch fixes the deadlock issue seen in
>> Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/256
>> Also enables d3cold to get CI results
>>
>> Riana Tauro (2):
>>    RFC drm/xe: Move xe_device_mem_access_get to the top of
>>      gem_create_ioctl
>>    CI drm/xe: Enable d3cold
> Tried this locally on DG2 and it triggers lockdep splats for me when
> loading the module, so it looks like a lot more is needed before
> turning on d3cold.

IMHO, for the backup of pinned kernel BOs we should either do something
similar to what i915 does, with a separate backup BO, or, if it is
impossible to grab the object lock, put together a function that backs
up all non-freed memory of a TTM VRAM manager to a set of system
pages...
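
To make the second option concrete, here is a rough userspace sketch of
the idea: walk the manager's allocation state rather than individual
objects, copy every non-free page to system memory, and copy it back to
the same offset on resume. Everything in it (the toy_* names, the flat
page array, the per-page "allocated" flag) is made up for illustration
and is not the TTM or xe API; a real version would have to walk the
actual VRAM manager's allocations and deal with locking and DMA.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE  4096u
#define TOY_VRAM_PAGES 8u

/* Hypothetical stand-in for a VRAM manager: flat memory plus a
 * per-page "allocated" flag. Not a real TTM structure. */
struct toy_vram_mgr {
	unsigned char vram[TOY_VRAM_PAGES][TOY_PAGE_SIZE];
	unsigned char allocated[TOY_VRAM_PAGES];
};

/* One saved page: where it lived in VRAM and a system-memory copy. */
struct toy_backup_page {
	size_t index;
	unsigned char data[TOY_PAGE_SIZE];
};

struct toy_backup {
	struct toy_backup_page *pages;
	size_t count;
};

/* Copy every allocated (non-freed) page to system memory. No object
 * lock is involved; only the manager's allocation state is consulted.
 * Returns 0 on success, -1 if the system pages cannot be allocated. */
int toy_vram_backup(const struct toy_vram_mgr *mgr, struct toy_backup *bk)
{
	size_t i, n = 0;

	for (i = 0; i < TOY_VRAM_PAGES; i++)
		if (mgr->allocated[i])
			n++;

	bk->count = 0;
	bk->pages = n ? malloc(n * sizeof(*bk->pages)) : NULL;
	if (n && !bk->pages)
		return -1;

	for (i = 0; i < TOY_VRAM_PAGES; i++) {
		if (!mgr->allocated[i])
			continue;
		bk->pages[bk->count].index = i;
		memcpy(bk->pages[bk->count].data, mgr->vram[i], TOY_PAGE_SIZE);
		bk->count++;
	}
	assert(bk->count == n);
	return 0;
}

/* Write each saved page back to the same VRAM offset it came from,
 * then release the system-memory copy. */
void toy_vram_restore(struct toy_vram_mgr *mgr, struct toy_backup *bk)
{
	size_t i;

	for (i = 0; i < bk->count; i++)
		memcpy(mgr->vram[bk->pages[i].index], bk->pages[i].data,
		       TOY_PAGE_SIZE);
	free(bk->pages);
	bk->pages = NULL;
	bk->count = 0;
}
```

In this model, calling toy_vram_backup() before power is cut and
toy_vram_restore() after it returns puts the allocated pages back in
place, while free pages are never copied. Because each page is restored
by offset, nothing needs to be re-allocated on resume, which is the
property the kernel BOs depend on.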

/Thomas

> However, I also had to manually set d3cold.capable=true. Wondering
> if we have machines in CI that are d3cold capable, since BAT results
> are reporting success?

>
>>   drivers/gpu/drm/xe/xe_bo.c | 26 ++++++++++++++++++++------
>>   drivers/gpu/drm/xe/xe_pm.h |  2 +-
>>   2 files changed, 21 insertions(+), 7 deletions(-)
>>
>> --
>> 2.40.0
>>