[RFC 34/34] drm/xe: Enable D3Cold on 'low' VRAM utilization

Gupta, Anshuman anshuman.gupta at intel.com
Tue Jan 30 15:01:00 UTC 2024



> -----Original Message-----
> From: Vivi, Rodrigo <rodrigo.vivi at intel.com>
> Sent: Tuesday, January 30, 2024 12:31 AM
> To: intel-xe at lists.freedesktop.org; Knop, Ryszard <ryszard.knop at intel.com>;
> Gupta, Anshuman <anshuman.gupta at intel.com>; Auld, Matthew
> <matthew.auld at intel.com>; Musial, Ewelina <ewelina.musial at intel.com>
> Subject: Re: [RFC 34/34] drm/xe: Enable D3Cold on 'low' VRAM utilization
> 
> On Mon, 2024-01-29 at 12:12 +0000, Matthew Auld wrote:
> > On 26/01/2024 20:30, Rodrigo Vivi wrote:
> > > Now that we eliminated all the mem_access get/put with its locking
> > > issues from the inner calls of migration, we can allow D3Cold.
> > >
> > > Enable it when VRAM utilization is lower than 300MB. On higher
> > > utilization we only allow D3hot, so that memory restoration does not
> > > add too much latency to runtime resume.
> >
> > Note that nothing in CI is d3cold capable it seems, so they will never
> > trigger this path AFAIK. All platforms return:
> >
> > [drm:xe_pci_probe [xe]] d3cold: capable=no
> 
> Bummer... :/
> 
> >
> > Is that a bug, or perhaps there is something else needed? 
> 
> No, it is not a bug, it is a limitation in the DG2 parts that we have in our CI.
> Anshuman, do you know exactly why these CI parts are like this? and what
> should we put as a requirement for our CI to get parts that do support
> D3Cold?
There are a couple of prerequisites, on both the card and the host.
PCIe card: PMC - Power Management Capabilities register (offset = 2)
PME_Support, bit 15 set (1XXXXb) - PME# can be asserted from D3cold
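
To make the card-side bit concrete, here is a minimal user-space sketch of the decode. It is only an illustration, not driver code: the macro and function names are made up for this example, and the mask value is assumed to match PCI_PM_CAP_PME_D3cold (0x8000) from the kernel's pci_regs.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Offset of the PMC register inside the PCI Power Management capability. */
#define PM_PMC_OFFSET        2
/* PME_Support bit 15 (1XXXXb): PME# can be asserted from D3cold. */
#define PM_CAP_PME_D3COLD    0x8000

/* Hypothetical helper: true if the raw 16-bit PMC value advertises
 * PME# from D3cold. */
static bool pmc_pme_from_d3cold(uint16_t pmc)
{
	return (pmc & PM_CAP_PME_D3COLD) != 0;
}
```

On a running system the same information should be visible in the Power Management capability line of `lspci -vv` output (the D3cold+ / D3cold- flag in the PME(...) list).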

Host:
ACPI _PR3 resources, which depend on the BIOS.
I have noticed that if ACPI D3cold is disabled in the BIOS, we don't see the _PR3 resources, so we need to review the host BIOS settings in CI.

Probably we should add a drm_dbg log for each of these requirements?
        if (!pci_pme_capable(root_pdev, PCI_D3cold) || !pci_pr3_present(root_pdev))
                return false;
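
Roughly, splitting the check so that each failed requirement is reported separately could look like the sketch below. This is a stand-alone mock under stated assumptions: struct fake_root_port and d3cold_capable() are hypothetical names, the two booleans stand in for the results of pci_pme_capable(root_pdev, PCI_D3cold) and pci_pr3_present(root_pdev), and fprintf() stands in for drm_dbg().

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the root port state; not a kernel type. */
struct fake_root_port {
	bool pme_from_d3cold; /* result pci_pme_capable(root_pdev, PCI_D3cold) would give */
	bool pr3_present;     /* result pci_pr3_present(root_pdev) would give */
};

/* Report which prerequisite is missing instead of failing silently. */
static bool d3cold_capable(const struct fake_root_port *rp)
{
	if (!rp->pme_from_d3cold) {
		/* In the driver this would be a drm_dbg() call. */
		fprintf(stderr, "d3cold: PME# from D3cold not supported\n");
		return false;
	}

	if (!rp->pr3_present) {
		fprintf(stderr, "d3cold: ACPI _PR3 resources not present\n");
		return false;
	}

	return true;
}
```

With something like this, a CI log would show whether a machine fails on the card side or on the host BIOS side.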
Thanks,
Anshuman Gupta.
> 
> >  Main question
> > is how to get CI test coverage for this here.
> 
> Well, that's tricky anyways. There are many other factors that can block
> D3cold underneath. To validate here sometimes I need to force the rpm auto
> for all the devices in the same root port. We have a tool in IGT to do that...
> probably need some adaptation to xe.
> 
> But well, there's no reliable way to ensure that the entire CI runs with always
> entering D3Cold. Unless we can make that tool and request CI folks to ensure a
> pre-script before every run on DG2?
> 
> >
> > >
> > > Signed-off-by: Rodrigo Vivi <rodrigo.vivi at intel.com>
> > > ---
> > >   drivers/gpu/drm/xe/xe_pm.h | 7 +------
> > >   1 file changed, 1 insertion(+), 6 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
> > > index 0ae4f6a60c6d..bc6bd2a01189 100644
> > > --- a/drivers/gpu/drm/xe/xe_pm.h
> > > +++ b/drivers/gpu/drm/xe/xe_pm.h
> > > @@ -8,12 +8,7 @@
> > >
> > >   #include <linux/pm_runtime.h>
> > >
> > > -/*
> > > - * TODO: Threshold = 0 will block D3Cold.
> > > - *       Before we can move this to a higher value (like 300), we
> > > need to:
> > > - *           1. rewrite the VRAM save / restore to avoid buffer
> > > object locks
> > > - */
> > > -#define DEFAULT_VRAM_THRESHOLD 0 /* in MB */
> > > +#define DEFAULT_VRAM_THRESHOLD 300 /* in MB */
> > >
> > >   struct xe_device;
> > >


