your mail

David Hildenbrand david at redhat.com
Thu Aug 21 10:16:48 UTC 2025


On 21.08.25 12:05, Lorenzo Stoakes wrote:
> On Thu, Aug 21, 2025 at 11:30:43AM +0200, David Hildenbrand wrote:
>>> I will add this xen/apply_to_page_range() thing to my TODOs, which atm
>>> would involve changing these drivers to use vmf_insert_pfn_prot() instead.
>>>
>>
>> Busy today (want to reply to Christian) but
>>
>> a) Re: performance, we would want something like
>>     vmf_insert_pfns_prot(), similar to vm_insert_pages(), to bulk-insert
>>     multiple PFNs.
>>
>> b) Re: PAT, we'll have to figure out why PAT information is wrong here
>>     (was there no previous PAT reservation from the driver?), but IF we
>>     really have to override, we'd want a way to tell
>>     vmf_insert_pfn_prot() to force the selected caching mode.
>>
> 
> Ack, ok good that we have a feasible way forward.
> 
> FYI, I spoke to Peter off-list and he mentioned he had a more general
> series to get rid of this kind of [ab]use of apply_to_page_range() (see
> [0]). I gather he hasn't had the time to resurrect it, but perhaps one of
> us can at some point?
> 
> Perhaps we need a shorter term fix to _this_ issue (which involves not
> using this interface), and then follow it up with an adaptation of the
> below?

We need to understand why the PAT information would be wrong here, and why
it would even be OK to ignore it, rather than hacking around it.

FWIW, I just recently documented:

+/**
+ * pfnmap_setup_cachemode - setup the cachemode in the pgprot for a pfn range
+ * @pfn: the start of the pfn range
+ * @size: the size of the pfn range in bytes
+ * @prot: the pgprot to modify
+ *
+ * Lookup the cachemode for the pfn range starting at @pfn with the size
+ * @size and store it in @prot, leaving other data in @prot unchanged.
+ *
+ * This allows for a hardware implementation to have fine-grained control of
+ * memory cache behavior at page level granularity. Without a hardware
+ * implementation, this function does nothing.
+ *
+ * Currently there is only one implementation for this - x86 Page Attribute
+ * Table (PAT). See Documentation/arch/x86/pat.rst for more details.
+ *
+ * This function can fail if the pfn range spans pfns that require differing
+ * cachemodes. If the pfn range was previously verified to have a single
+ * cachemode, it is sufficient to query only a single pfn. The assumption is
+ * that this is the case for drivers using the vmf_insert_pfn*() interface.
+ *
+ * Returns 0 on success and -EINVAL on error.
+ */
+int pfnmap_setup_cachemode(unsigned long pfn, unsigned long size,
+               pgprot_t *prot);
  extern int track_pfn_copy(struct vm_area_struct *dst_vma,
                 struct vm_area_struct *src_vma, unsigned long *pfn);
  extern void untrack_pfn_copy(struct vm_area_struct *dst_vma,
@@ -1563,6 +1584,21 @@ extern void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
  extern void untrack_pfn_clear(struct vm_area_struct *vma);
  #endif
  
+/**
+ * pfnmap_setup_cachemode_pfn - setup the cachemode in the pgprot for a pfn
+ * @pfn: the pfn
+ * @prot: the pgprot to modify
+ *
+ * Lookup the cachemode for @pfn and store it in @prot, leaving other
+ * data in @prot unchanged.
+ *
+ * See pfnmap_setup_cachemode() for details.
+ */
+static inline void pfnmap_setup_cachemode_pfn(unsigned long pfn, pgprot_t *prot)
+{
+       pfnmap_setup_cachemode(pfn, PAGE_SIZE, prot);
+}


Something is certainly missing here: the driver should previously have
verified that the cachemode is as expected.

-- 
Cheers

David / dhildenb


More information about the Intel-xe mailing list