mm: fix cache mode tracking in vm_insert_mixed() breaks AMDGPU [was: Re: Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?]
Marek Olšák
maraeo at gmail.com
Wed Oct 19 10:33:52 UTC 2016
On Wed, Oct 19, 2016 at 8:42 AM, Dave Airlie <airlied at gmail.com> wrote:
> On 18 October 2016 at 23:53, Dan Williams <dan.j.williams at intel.com> wrote:
>> On Mon, Oct 17, 2016 at 8:48 PM, Dave Airlie <airlied at gmail.com> wrote:
>> [..]
>>>>> Aren't there only 2 possibilities for this regression?
>>>>>
>>>>> 1/ a memtype entry was never made so track_pfn_insert() returns an
>>>>> uncached mapping
>>>>>
>>>>> 2/ a conflicting memtype entry exists and undefined behavior due to
>>>>> mixed mapping types is avoided with the change.
>>>>
>>>> 3/ The CPU usage through this path goes up, and slows things down,
>>>> though I suspect it's more an uncached mapping showing up
>>>> when we don't expect it.
>>>
>>> It's looking like number 1: there is no memtype entry, so now we get
>>> uncached where we used to get write through.
>>>
>>> difference in page prot 7f7bbc0e0000, pfn 20000000000e71e4,
>>> 8000000000000037, 800000000000002f
>>>
>>> 0x2f is the vma pg prot which has PWT set in it, 0x37 is the returned
>>> pgprot which lacks that bit.
>>>
>>> not sure where to go from here, suggestions?
>>
>> If the driver established an ioremap_wt() across the range, or just
>> called reserve_memtype() directly that should restore WT mappings.
>>
>> Although Daniel's suggestion to use the i915 mapping helpers sounds
>> like it avoids problem 3/ as well.
>
> Well we shouldn't be doing that many VRAM mappings on the CPU so
> I doubt we'll hit the overheads here that often.
>
> Ideally we'd always use DMA to move stuff in/out of VRAM, but there
> are some places where we still do WC VRAM writes for uploads.

WC VRAM for uploads is better than WC GART IMO.

Marek
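
For readers following the pgprot comparison quoted above (0x800000000000002f for the vma versus 0x8000000000000037 returned), here is a minimal sketch that decodes the two values into x86 page-table flag names. The bit positions are the architectural ones for a 4K PTE; the helper name decode_pgprot and the variable names are hypothetical and not taken from the kernel. How a PWT/PCD/PAT combination translates into an actual memory type depends on how the kernel programmed the PAT MSR, which the thread does not show, so the sketch only reports which bits differ.

```python
# Architectural x86 flag bits in the low byte of a 4K page-table entry.
PTE_FLAGS = {
    0x001: "PRESENT",
    0x002: "RW",
    0x004: "USER",
    0x008: "PWT",
    0x010: "PCD",
    0x020: "ACCESSED",
    0x040: "DIRTY",
    0x080: "PAT",
}
NX_BIT = 1 << 63  # no-execute bit in a 64-bit entry


def decode_pgprot(value):
    """Return the names of the flag bits set in a pgprot value (hypothetical helper)."""
    names = [name for bit, name in sorted(PTE_FLAGS.items()) if value & bit]
    if value & NX_BIT:
        names.append("NX")
    return names


# Values copied from the debug output quoted in the thread.
vma_prot = 0x800000000000002F
returned_prot = 0x8000000000000037

vma_flags = decode_pgprot(vma_prot)
returned_flags = decode_pgprot(returned_prot)

# Flags present in one value but not the other.
only_in_vma = [f for f in vma_flags if f not in returned_flags]
only_in_returned = [f for f in returned_flags if f not in vma_flags]

print("vma:     ", vma_flags)
print("returned:", returned_flags)
print("only in vma:", only_in_vma, "only in returned:", only_in_returned)
```

Under these bit definitions the vma value carries PWT without PCD, while the returned value carries PCD without PWT, which matches the observation in the quoted message that the returned pgprot lacks the PWT bit.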