your mail
David Hildenbrand
david at redhat.com
Mon Aug 25 19:20:20 UTC 2025
On 25.08.25 20:35, Christian König wrote:
> On 21.08.25 12:05, Lorenzo Stoakes wrote:
>> On Thu, Aug 21, 2025 at 11:30:43AM +0200, David Hildenbrand wrote:
>>>> I will add this xen/apply_to_page_range() thing to my TODOs, which atm
>>>> would involve changing these drivers to use vmf_insert_pfn_prot() instead.
>>>>
>>>
>>> Busy today (want to reply to Christian) but
>>>
>>> a) Re: performance, we would want something like
>>> vmf_insert_pfns_prot(), similar to vm_insert_pages(), to bulk-insert
>>> multiple PFNs.
>
> Yes, exactly that. Ideally something like an iterator/callback like interface.
>
> I've seen at least four or five different representations of the PFNs in drivers.
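
To make the iterator/callback idea concrete, here is a minimal userspace sketch, not kernel code. It assumes a hypothetical callback type pfn_next_fn that yields one PFN per call, and a toy "page table" that is just an array. All names (toy_insert_pfns, range_src, array_src, ...) are made up for illustration. The point is only that one bulk-insert loop can consume different driver-side PFN representations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: stores the next PFN in *pfn and returns 1,
 * or returns 0 once the source is exhausted. */
typedef int (*pfn_next_fn)(void *ctx, unsigned long *pfn);

#define TOY_SLOTS 64

/* Stand-in for a page table: a flat array of inserted PFNs. */
struct toy_table {
    unsigned long slots[TOY_SLOTS];
    size_t used;
};

/* Bulk insert: pull PFNs from the callback until it is exhausted or the
 * table is full. Returns the number of PFNs inserted by this call. */
static size_t toy_insert_pfns(struct toy_table *t, pfn_next_fn next, void *ctx)
{
    size_t inserted = 0;
    unsigned long pfn;

    assert(t != NULL && next != NULL);
    /* Check capacity first so no PFN is consumed without being stored. */
    while (t->used < TOY_SLOTS && next(ctx, &pfn)) {
        t->slots[t->used++] = pfn;
        inserted++;
    }
    return inserted;
}

/* Representation 1: a physically contiguous range. */
struct range_src {
    unsigned long base;
    size_t count;
    size_t pos;
};

static int range_next(void *ctx, unsigned long *pfn)
{
    struct range_src *s = ctx;

    if (s->pos >= s->count)
        return 0;
    *pfn = s->base + s->pos++;
    return 1;
}

/* Representation 2: an explicit array of scattered PFNs. */
struct array_src {
    const unsigned long *pfns;
    size_t count;
    size_t pos;
};

static int array_next(void *ctx, unsigned long *pfn)
{
    struct array_src *s = ctx;

    if (s->pos >= s->count)
        return 0;
    *pfn = s->pfns[s->pos++];
    return 1;
}
```

A caller would pass range_next or array_next together with its own context. The insert loop itself never needs to know how the driver stores its PFNs.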
>
>>> b) Re: PAT, we'll have to figure out why PAT information is wrong here
>>> (was there no previous PAT reservation from the driver?), but IF we
>>> really have to override, we'd want a way to tell
>>> vmf_insert_pfn_prot() to force the selected caching mode.
>>>
>
> Well the difference between vmf_insert_pfn() and vmf_insert_pfn_prot() is that the driver actually wants to specify the caching modes.

Yes, it's all a mess. x86/PAT doesn't want inconsistencies, so it
expects that a previous reservation would make sure that that caching
mode is actually valid.

>
> That this is overridden by the PAT even for pages which are not part of the linear mapping is really surprising.

Yes, IIUC, it expects an earlier reservation on PAT systems.

>
> As far as I can see there is no technical necessity for that. Even for pages in the linear mapping only a handful of x86 CPUs actually need that. See Intel's i915 GPU driver for reference.
>
> Intel has used that approach for ages and for AMD CPUs the only reference I could find where the kernel needs it are Athlons produced between 1996 and 2004.
>
> Maybe we should disable the PAT on CPUs which actually don't need it?

Not sure if that will solve our problems on systems that need it because
of some devices.

I guess the problem with pfnmap_setup_cachemode_pfn() is that there is no
interface to undo it: pfnmap_track() is paired with pfnmap_untrack() such
that it can simply do/undo the reservation itself.

That's why pfnmap_setup_cachemode_pfn() leaves it up to the caller to
make sure that a reservation was triggered earlier by other means -- one
that can properly be undone.

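
A minimal userspace sketch of that distinction, assuming a toy reservation table: toy_track()/toy_untrack() own a reservation and can undo it, while toy_setup_cachemode_pfn() only looks one up and has nothing to undo. All toy_* names, the enum and the table are hypothetical and only loosely mirror pfnmap_track(), pfnmap_untrack() and pfnmap_setup_cachemode_pfn(). This is not kernel code, and reporting a missing reservation as an error is a simplification of the model, not a statement about what x86/PAT does.

```c
#include <assert.h>
#include <stddef.h>

/* Toy caching modes; hypothetical stand-ins for real PAT memory types. */
enum cache_mode { CM_NONE = 0, CM_WB, CM_WC, CM_UC };

#define MAX_RES 8

struct reservation {
    unsigned long start_pfn;
    unsigned long nr_pages;
    enum cache_mode mode;
    int in_use;
};

static struct reservation table[MAX_RES];

/* Paired interface, part 1: create a reservation for [pfn, pfn + nr).
 * Returns 0 on success, -1 on overlap or when the table is full. */
static int toy_track(unsigned long pfn, unsigned long nr, enum cache_mode mode)
{
    struct reservation *free_slot = NULL;

    for (size_t i = 0; i < MAX_RES; i++) {
        struct reservation *r = &table[i];

        if (!r->in_use) {
            if (!free_slot)
                free_slot = r;
            continue;
        }
        /* Toy policy: any overlap with an existing reservation conflicts. */
        if (pfn < r->start_pfn + r->nr_pages && r->start_pfn < pfn + nr)
            return -1;
    }
    if (!free_slot)
        return -1;
    *free_slot = (struct reservation){ pfn, nr, mode, 1 };
    return 0;
}

/* Paired interface, part 2: drop exactly the reservation toy_track() made. */
static void toy_untrack(unsigned long pfn, unsigned long nr)
{
    for (size_t i = 0; i < MAX_RES; i++) {
        struct reservation *r = &table[i];

        if (r->in_use && r->start_pfn == pfn && r->nr_pages == nr)
            r->in_use = 0;
    }
}

/* Lookup-only interface: overwrite *mode with the reserved mode for pfn.
 * It creates nothing, so there is nothing to undo; it depends on someone
 * else having made (and later removing) the reservation.
 * Returns 0 if a reservation covers pfn, -1 otherwise (*mode untouched). */
static int toy_setup_cachemode_pfn(unsigned long pfn, enum cache_mode *mode)
{
    assert(mode != NULL);
    for (size_t i = 0; i < MAX_RES; i++) {
        const struct reservation *r = &table[i];

        if (r->in_use && pfn >= r->start_pfn &&
            pfn - r->start_pfn < r->nr_pages) {
            *mode = r->mode;
            return 0;
        }
    }
    return -1;
}
```

In this model, a lookup only succeeds between a toy_track() call and the matching toy_untrack() call, which is the dependency on an earlier reservation described above.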
--
Cheers
David / dhildenb
More information about the Intel-xe
mailing list