mm: fix cache mode tracking in vm_insert_mixed() breaks AMDGPU [was: Re: Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?]
Dave Airlie
airlied at gmail.com
Wed Oct 19 06:42:08 UTC 2016
On 18 October 2016 at 23:53, Dan Williams <dan.j.williams at intel.com> wrote:
> On Mon, Oct 17, 2016 at 8:48 PM, Dave Airlie <airlied at gmail.com> wrote:
> [..]
>>>> Aren't there only 2 possibilities for this regression?
>>>>
>>>> 1/ a memtype entry was never made so track_pfn_insert() returns an
>>>> uncached mapping
>>>>
>>>> 2/ a conflicting memtype entry exists and undefined behavior due to
>>>> mixed mapping types is avoided with the change.
>>>
>>> 3/ The CPU usage through this path goes up, and slows things down,
>>> though I suspect it's more an uncached mapping showing up
>>> when we don't expect it.
>>
>> It's looking like number 1: there is no mapping, so now we get uncached
>> where we used to get write through.
>>
>> difference in page prot 7f7bbc0e0000, pfn 20000000000e71e4,
>> 8000000000000037, 800000000000002f
>>
>> 0x2f is the vma pg prot which has PWT set in it, 0x37 is the returned
>> pgprot which lacks that bit.
>>
>> not sure where to go from here, suggestions?
>
> If the driver established an ioremap_wt() across the range, or just
> called reserve_memtype() directly that should restore WT mappings.
>
> Although Daniel's suggestion to use the i915 mapping helpers sounds
> like it avoids problem 3/ as well.

Well, we shouldn't be doing that many VRAM mappings on the CPU, so
I doubt we'll hit the overheads here that often.

Ideally we'd always use DMA to move stuff in/out of VRAM, but there
are some places where we still do WC VRAM writes for uploads.

So I've sent the patches; any major opinions on them? We can't just
ioremap_wc the whole BAR, as on 32-bit that just messes things up,
and it's unnecessary anyway.

Dave.
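
For readers following the pgprot comparison quoted above, here is a minimal
user-space sketch that decodes the two values from the debug output
(8000000000000037 and 800000000000002f). The bit positions are the
architectural x86 page-table flag positions; the macro and function names
(PTE_*, pat_index, cache_bits_diff, describe) are hypothetical and chosen
only for this illustration, they are not kernel identifiers. The sketch
reports which PAT slot each value selects; which memory type that slot
means depends on how the PAT MSR is programmed, which is not decoded here.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Architectural x86 page-table entry flag bits (names are illustrative). */
#define PTE_PRESENT  (1ULL << 0)
#define PTE_RW       (1ULL << 1)
#define PTE_USER     (1ULL << 2)
#define PTE_PWT      (1ULL << 3)   /* page-level write-through */
#define PTE_PCD      (1ULL << 4)   /* page-level cache disable */
#define PTE_ACCESSED (1ULL << 5)
#define PTE_PAT      (1ULL << 7)   /* PAT bit position for 4 KiB PTEs */
#define PTE_NX       (1ULL << 63)

/* PAT slot selected by a 4 KiB PTE: index = PAT:PCD:PWT (0..7). */
unsigned pat_index(uint64_t prot)
{
    return (unsigned)(((prot & PTE_PAT) ? 4 : 0) |
                      ((prot & PTE_PCD) ? 2 : 0) |
                      ((prot & PTE_PWT) ? 1 : 0));
}

/* Cache-attribute bits that differ between two protection values. */
uint64_t cache_bits_diff(uint64_t a, uint64_t b)
{
    return (a ^ b) & (PTE_PWT | PTE_PCD | PTE_PAT);
}

/* Print the cache-related bits of one protection value. */
void describe(const char *label, uint64_t prot)
{
    assert(label != NULL);
    printf("%-9s 0x%016" PRIx64 "  PWT=%d PCD=%d PAT=%d NX=%d  PAT slot %u\n",
           label, prot,
           (prot & PTE_PWT) != 0, (prot & PTE_PCD) != 0,
           (prot & PTE_PAT) != 0, (prot & PTE_NX) != 0,
           pat_index(prot));
}
```

Calling describe("vma", 0x800000000000002fULL) and
describe("returned", 0x8000000000000037ULL) shows PWT set and PCD clear for
the vma value, and PCD set and PWT clear for the returned value, so the two
differ in exactly the PWT and PCD bits (cache_bits_diff gives 0x18). That
matches the observation in the thread that the returned pgprot lacks the
PWT bit the vma pgprot has.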