[Intel-gfx] Usage of _PAGE_PCD et al in i915 driver

Mon Aug 18 12:36:41 CEST 2014

On 08/18/2014 12:21 PM, Ville Syrjälä wrote:
> On Mon, Aug 18, 2014 at 07:31:58AM +0200, Juergen Gross wrote:
>> On 08/15/2014 12:21 PM, Ville Syrjälä wrote:
>>> On Thu, Aug 14, 2014 at 05:55:11AM +0200, Juergen Gross wrote:
>>>> On 08/13/2014 05:07 PM, Jesse Barnes wrote:
>>>>> On Fri, 8 Aug 2014 15:14:15 +0200
>>>>> Daniel Vetter <daniel.vetter at ffwll.ch> wrote:
>>>>>
>>>>>> Adding relevant mailing lists.
>>>>>>
>>>>>> On Fri, Aug 8, 2014 at 1:23 PM, Juergen Gross <jgross at suse.com> wrote:
>>>>>>> I'm just about to create a patch for full PAT support in the Linux
>>>>>>> kernel, including Xen. For this purpose I introduce a translation
>>>>>>> between cache modes and pte bits.
>>>>>>>
>>>>>>> Scanning the kernel sources for usage of the cache mode bits in the
>>>>>>> pte I discovered  drivers/gpu/drm/i915/i915_gem_gtt.h is using
>>>>>>> _PAGE_PCD, _PAGE_PWT and _PAGE_PAT. I think those defines are used
>>>>>>> to create ptes not for usage by the main processor, but for the
>>>>>>> graphics processor. Is this true? In this case I'd suggest to define
>>>>>>> i915-specific macros instead of using the x86 ones.
>>>>>>
>>>>>> Yeah, those are gpu specific PAT tables, but the hw engineers
>>>>>> specifically designed this to match, and we've tried to follow the cpu
>>>>>> side to match it. Especially in the future that will be somewhat
>>>>>> important, since we want to fully share the entire address space
>>>>>> between cpu and gpu on the next platform. Jesse is working on that.
>>>>>
>>>>> Right, we have an x86 compatible MMU in the GPU itself, so re-using the
>>>>> defines makes sense.  I suppose with your work you'll move them and
>>>>> make them a bit more opaque?  If so, we'll still want a way to get at
>>>>> them directly, or access your mapping functions for generating PTE bits
>>>>> for the GPU MMU.
>>>>
>>>> Using the mapping functions I'm introducing should work, if the MMU has
>>>> an x86 compatible MSR_IA32_CR_PAT which is configured the same way as
>>>> on the x86 processor (be aware that Xen is using another MSR_IA32_CR_PAT
>>>> setting as the Linux kernel).
>>>
>>> We have a PAT that is structured the same way as the x86 PAT. But the
>>> contents of the PAT entries are obviously specific to the GPU so it's
>>> not identical. But the pcd/pwt/pat bits index the PAT in exactly the
>>> same way as on x86.
>>>
>>> See bdw_setup_private_ppat() and chv_setup_private_ppat() for how we
>>> set up the PAT.
>>>
>>
>> So you are using the PAT bit in the ptes, but the semantic for the GPU
>> will be different as for the x86 processor, because the GPU PAT is set
>> up differently from the x86 one.
>>
>> In case you are sharing ptes between GPU and x86 processor in future,
>> this might lead to problems when the x86 processor will use ptes with
>> the PAT bit set.
>
> I'm not sure why you single out the PAT bit. It's just another index bit
> like PCD and PWT.

I single out the PAT bit because all entries of CPU PAT-register and
GPU PAT-register differ with PAT==1. With PAT==0 they are configured
to have the same semantics.

> Currently we play around with the GPU caching mode rather freely because
> the hardware is already fully coherent wrt. CPU caches (well, apart from
> display scanout which knows nothing about any caches). What we do
> currently is leave all the CPU mappings as WB and just change the GPU
> caching mode depending on the need.

The Xen hypervisor is already using a different PAT configuration than
then Linux Kernel.

So your approach could break Xen when sharing the page tables between
CPU and GPU.

> However once we share the page tables I'm not sure what's the plan wrt.
> changing the caching mode for GPU buffers since that would involve
> changing the CPU cachine mode as well, and we may still want finer
> granularity control over the various GPU caches. Maybe we need to
> reserve some PAT entries for GPU specific purposes so that the CPU
> might have no difference between two PAT entries but the GPU would.
> But I'm not sure there are any extra PAT entries left which could be
> reserved for such things.

There should be 2 entries left in the PAT-register which could be used
by the GPU, I think: there are only 6 different cache modes defined for
x86 and we have 8 PAT register entries, so at least 2 entries must be
duplicates.

> We do have ways to override the GPU caching mode using inline information
> in the GPU command buffers though, so in theory at least, it doesn't
> matter all that much to the GPU how the page table caching bits are
> configured. However not all commands may have such inline caching
> information, and we still have the display scanout to worry about which
> still relies on the page tables to avoid expensive manual clflushes.