[Intel-gfx] [RFCv5 2/2] drm/i915: Introduce private PAT management

Tue Aug 29 11:31:05 UTC 2017

Thanks for the reply! 

For the hole, per my understanding, the author wanted the mapping to be consistent: i915 cache level <-> PPAT index <-> cache attribute in IA page table in case the GPU and IA may share page tables in future, since actually PPAT index is represented as PAT/PCD/PWT bits in GPU page table entries. We can see the PPAT index is defined with those PAT/PCD/PWT bits from IA page tables. Maybe that's the reason of the hole. :)

Thanks,
Zhi. 

-----Original Message-----
From: Chris Wilson [mailto:chris at chris-wilson.co.uk] 
Sent: Tuesday, August 29, 2017 2:24 PM
To: Wang, Zhi A <zhi.a.wang at intel.com>; intel-gfx at lists.freedesktop.org; intel-gvt-dev at lists.freedesktop.org
Cc: joonas.lahtinen at linux.intel.com; zhenyuw at linux.intel.com; Widawsky, Benjamin <benjamin.widawsky at intel.com>; Vivi, Rodrigo <rodrigo.vivi at intel.com>
Subject: RE: [RFCv5 2/2] drm/i915: Introduce private PAT management

Quoting Wang, Zhi A (2017-08-29 12:13:26)
> Hi Chris:
>     There is mapping between i915 cache level -> PPAT index. Currently it's:
> 
> static gen8_pte_t gen8_pte_encode(dma_addr_t addr,
>                                   enum i915_cache_level level) { ...
>         switch (level) {
>         case I915_CACHE_NONE:
>                 pte |= PPAT_UNCACHED_INDEX;
>                 break;
>         case I915_CACHE_WT:
>                 pte |= PPAT_DISPLAY_ELLC_INDEX;
>                 break;
>         default:
>                 pte |= PPAT_CACHED_INDEX;
>                 break;
>         }
> ...
> 
> According to bspec, the PPAT index filled in the page table is calculated as:
> 
> PPAT index = 4 * PAT + 2 * PCD + PWT
> 
> In the i915_gem_gtt.c
> 
> #define PPAT_UNCACHED_INDEX             (_PAGE_PWT | _PAGE_PCD)     // PPAT INDEX =  1 + 2 * 1 = 3
> #define PPAT_CACHED_PDE_INDEX           0 /* WB LLC */                  //  PPAT INDEX = 0
> #define PPAT_CACHED_INDEX               _PAGE_PAT /* WB LLCeLLC */      // PPAT INDEX = 4 * 1 = 4
> #define PPAT_DISPLAY_ELLC_INDEX         _PAGE_PCD /* WT eLLC */           // PPAT INDEX = 2 * 1 = 2
> 
> Actually the PPAT index used by i915 are: 0 2 3 4, which has already been reserved in the RFC patches.

So we can use these values in alloc_ppat, right? Still very concerned about the hole -- it kind of implies there is an entry we should be using but have forgotten!

> > > @@ -2864,17 +3038,22 @@ static void bdw_setup_private_ppat(struct drm_i915_private *dev_priv)
> > >                  * So we can still hold onto all our assumptions wrt cpu
> > >                  * clflushing on LLC machines.
> > >                  */
> > > -               pat = GEN8_PPAT(0, GEN8_PPAT_UC);
> > > +               alloc_ppat_entry(ppat, 0, GEN8_PPAT_UC);
> > > +               return;
> > > +       }
> > >  
> > > -       /* XXX: spec defines this as 2 distinct registers. It's unclear if a 64b
> > > -        * write would work. */
> > > -       I915_WRITE(GEN8_PRIVATE_PAT_LO, pat);
> > > -       I915_WRITE(GEN8_PRIVATE_PAT_HI, pat >> 32);
> > > +       alloc_ppat_entry(ppat, 0, GEN8_PPAT_WB | GEN8_PPAT_LLC);     /* for normal objects, no eLLC */
> > > +       alloc_ppat_entry(ppat, 2, GEN8_PPAT_WT | GEN8_PPAT_LLCELLC); /* for scanout with eLLC */
> > > +       alloc_ppat_entry(ppat, 3, GEN8_PPAT_UC);                     /* Uncached objects, mostly for scanout */
> > > +       alloc_ppat_entry(ppat, 4, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC 
> > > + |

/* See gen8_pte_encode() for the mapping from cache-level to ppat */ alloc_ppage_entry(ppat, PPAT_CACHED_PDE_INDEX, GEN8_PPAT_WB | GEN8_PPAT_LLC); alloc_ppage_entry(ppat, PPAT_DISPLAY_ELLC_INDEX, GEN8_PPAT_WT | GEN8_PPAT_LLCELLC); alloc_ppage_entry(ppat, PPAT_UNCACHED_INDEX, GEN8_PPAT_UC); alloc_ppage_entry(ppat, PPAT_CACHED_INDEX, GEN8_PPAT_WB | GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(0));

etc.
-Chris