[Intel-gfx] [PATCH 02/21] drm/i915/gtt: Workaround for HW preload not flushing pdps

Wed Aug 12 00:56:49 PDT 2015

On 8/11/2015 1:05 PM, Zhiyuan Lv wrote:
> Hi Mika/Dave/Michel,
>
> I saw that the patch using LRI for root pointer updates has been merged to
> drm-intel. When the i915 driver runs inside a virtual machine, e.g.
> with XenGT, we may still need this patch from Mika, used as below:
>
> "
>          if (intel_vgpu_active(ppgtt->base.dev))
>                  gen8_preallocate_top_level_pdps(ppgtt);
> "
>
> Could you share with us your opinion? Thanks in advance!

Hi Zhiyuan,

The change looks ok to me. If you need to preallocate the PDPs,
gen8_ppgtt_init is the right place to do it. Just add a similar
intel_vgpu_active check to disable the LRI updates (in gen8_emit_bb_start).
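
To make the two checks concrete, here is a rough standalone sketch of what I mean. It is not driver code: struct sketch_ppgtt, the sketch_* helpers and the pd_dirty flag are stand-ins I made up for illustration, and the real checks would sit in gen8_ppgtt_init and gen8_emit_bb_start.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_LEGACY_PDPES 4	/* 32-bit mode: four top-level PDP entries */

/* Stand-in for the PPGTT; NOT the real struct i915_hw_ppgtt. */
struct sketch_ppgtt {
	bool vgpu_active;	/* what intel_vgpu_active() would report */
	bool pdp_allocated[SKETCH_LEGACY_PDPES];
};

/* Stand-in for gen8_preallocate_top_level_pdps(): mark every PDP as backed. */
static void sketch_preallocate_top_level_pdps(struct sketch_ppgtt *ppgtt)
{
	for (size_t i = 0; i < SKETCH_LEGACY_PDPES; i++)
		ppgtt->pdp_allocated[i] = true;
}

/* Init path: preallocate only when running as a guest. */
static void sketch_ppgtt_init(struct sketch_ppgtt *ppgtt, bool vgpu_active)
{
	assert(ppgtt != NULL);
	*ppgtt = (struct sketch_ppgtt){ .vgpu_active = vgpu_active };
	if (ppgtt->vgpu_active)
		sketch_preallocate_top_level_pdps(ppgtt);
}

/*
 * Batch-start path: decide whether the PDPs should be reloaded with LRI.
 * pd_dirty is a hypothetical "page directories changed" flag.
 */
static bool sketch_should_emit_pdp_lri(const struct sketch_ppgtt *ppgtt,
				       bool pd_dirty)
{
	assert(ppgtt != NULL);
	if (ppgtt->vgpu_active)
		return false;	/* guest: PDPs are fixed, never rewrite them */
	return pd_dirty;
}
```

With both checks in place a guest never emits the LRI, and a native driver keeps the merged behaviour.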

>
> The reason behind this is that the LRI command makes the shadow PPGTT
> implementation hard. In XenGT, we construct a shadow page table for each PPGTT
> in the guest i915 driver, and then track every guest page table change in
> order to update the shadow page table accordingly. The problem with page table
> updates done by GPU commands is that they cannot be trapped by the hypervisor
> to complete the shadow page table update. In XenGT, the only chance we have is
> the command scan at context submission. But that is not exactly the right time
> to do the shadow page table update.
>
> Mika's patch can address the problem nicely. With the preallocation, the root
> pointers in the EXECLIST context will always stay the same. Then we can treat
> any attempt to change the guest PPGTT with GPU commands as malicious behavior.
> Thanks!
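
For illustration, the check in your command scan could be as small as the sketch below. It assumes the four PDP entries live in LDW/UDW register pairs starting at mmio_base + 0x270, which is how I read GEN8_RING_PDP_LDW/UDW; the helper names are made up, and parsing the LRI command header itself is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed layout: PDP0..PDP3, each an LDW/UDW pair, 8 bytes apart,
 * starting at mmio_base + 0x270. Verify against i915_reg.h before use.
 */
#define SKETCH_PDP_FIRST(base)	((base) + 0x270u)
#define SKETCH_PDP_LAST(base)	((base) + 0x270u + 3u * 8u + 4u)

/* True if reg is one of the eight PDP registers of this ring. */
static bool sketch_reg_is_pdp(uint32_t mmio_base, uint32_t reg)
{
	return reg >= SKETCH_PDP_FIRST(mmio_base) &&
	       reg <= SKETCH_PDP_LAST(mmio_base);
}

/*
 * payload holds the (register, value) dword pairs of one LRI command.
 * Returns true if any pair targets a PDP register, i.e. the guest is
 * trying to change its root pointers from the GPU side.
 */
static bool sketch_lri_touches_pdp(uint32_t mmio_base,
				   const uint32_t *payload, size_t npairs)
{
	assert(payload != NULL || npairs == 0);
	for (size_t i = 0; i < npairs; i++)
		if (sketch_reg_is_pdp(mmio_base, payload[2 * i]))
			return true;
	return false;
}
```

Since the root pointers never change after preallocation, any hit from this check can be rejected outright.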
>
> Regards,
> -Zhiyuan
>
> On Thu, Jun 11, 2015 at 04:57:42PM +0300, Mika Kuoppala wrote:
>> Dave Gordon <david.s.gordon at intel.com> writes:
>>
>>> On 10/06/15 12:42, Michel Thierry wrote:
>>>> On 5/29/2015 1:53 PM, Michel Thierry wrote:
>>>>> On 5/29/2015 12:05 PM, Michel Thierry wrote:
>>>>>> On 5/22/2015 6:04 PM, Mika Kuoppala wrote:
>>>>>>> With BDW/SKL and 32bit addressing mode only, the hardware preloads
>>>>>>> pdps. However the TLB invalidation only has effect on levels below
>>>>>>> the pdps. This means that if pdps change, hw might access with
>>>>>>> stale pdp entry.
>>>>>>>
>>>>>>> To combat this problem, preallocate the top pdps so that hw sees
>>>>>>> them as immutable for each context.
>>>>>>>
>>>>>>> Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
>>>>>>> Cc: Rafael Barbalho <rafael.barbalho at intel.com>
>>>>>>> Signed-off-by: Mika Kuoppala <mika.kuoppala at intel.com>
>>>>>>> ---
>>>>>>>    drivers/gpu/drm/i915/i915_gem_gtt.c | 50 +++++++++++++++++++++++++++++++++++++
>>>>>>>    drivers/gpu/drm/i915/i915_reg.h     | 17 +++++++++++++
>>>>>>>    drivers/gpu/drm/i915/intel_lrc.c    | 15 +----------
>>>>>>>    3 files changed, 68 insertions(+), 14 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
>>>>>>> index 0ffd459..1a5ad4c 100644
>>>>>>> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
>>>>>>> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
>>>>>>> @@ -941,6 +941,48 @@ err_out:
>>>>>>>           return ret;
>>>>>>>    }
>>>>>>>
>>>>>>> +/* With some architectures and 32bit legacy mode, hardware pre-loads the
>>>>>>> + * top level pdps but the tlb invalidation only invalidates the lower levels.
>>>>>>> + * This might lead to hw fetching with stale pdp entries if top level
>>>>>>> + * structure changes, ie va space grows with dynamic page tables.
>>>>>>> + */
>>>
>>> Is this still necessary if we reload PDPs via LRI instructions whenever
>>> the address map has changed? That always (AFAICT) causes sufficient
>>> invalidation, so then we might not need to preallocate at all :)
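
For reference, the LRI reload discussed here boils down to one MI_LOAD_REGISTER_IMM covering the four PDP register pairs. Below is a rough standalone sketch of the dword stream. The opcode encoding and register offsets are written from memory of the i915 MI_LOAD_REGISTER_IMM and GEN8_RING_PDP_* macros, so treat them as assumptions, and sketch_emit_pdps is a made-up name that writes into a plain array instead of a ring buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PDPES 4	/* 32-bit mode: four top-level PDP entries */
#define SKETCH_MI_NOOP 0u
/* Assumed encoding: opcode 0x22 in bits 28:23, length field 2 * nregs - 1. */
#define SKETCH_MI_LRI(nregs) ((0x22u << 23) | (2u * (nregs) - 1u))
/* Assumed offsets: LDW/UDW pairs, 8 bytes apart, from mmio_base + 0x270. */
#define SKETCH_PDP_LDW(base, n) ((base) + 0x270u + (uint32_t)(n) * 8u)
#define SKETCH_PDP_UDW(base, n) (SKETCH_PDP_LDW(base, n) + 4u)
/* Header + (reg, value) for UDW and LDW of each PDP + trailing NOOP. */
#define SKETCH_PDP_LRI_DWORDS (1 + 4 * SKETCH_PDPES + 1)

/*
 * Write the reload sequence for all PDPs into out, highest PDP first.
 * out must have room for SKETCH_PDP_LRI_DWORDS dwords.
 * Returns the number of dwords written.
 */
static size_t sketch_emit_pdps(uint32_t *out, uint32_t mmio_base,
			       const uint64_t pd_addr[SKETCH_PDPES])
{
	size_t n = 0;

	assert(out != NULL && pd_addr != NULL);
	out[n++] = SKETCH_MI_LRI(2 * SKETCH_PDPES);
	for (int i = SKETCH_PDPES - 1; i >= 0; i--) {
		out[n++] = SKETCH_PDP_UDW(mmio_base, i);
		out[n++] = (uint32_t)(pd_addr[i] >> 32);
		out[n++] = SKETCH_PDP_LDW(mmio_base, i);
		out[n++] = (uint32_t)(pd_addr[i] & 0xffffffffu);
	}
	out[n++] = SKETCH_MI_NOOP;	/* pad to an even number of dwords */
	return n;
}
```

Because the registers are rewritten from the command stream, a hypervisor has no trap point for them, which is what makes this path awkward under XenGT.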
>>>
>>
>> LRI reload gets my vote. Please ignore this patch.
>> -Mika
>>
>>> .Dave.
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx at lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx