mm: fix cache mode tracking in vm_insert_mixed() breaks AMDGPU [was: Re: Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?]

Marek Olšák maraeo at
Thu Oct 20 09:06:57 UTC 2016

On Thu, Oct 20, 2016 at 3:11 AM, Michel Dänzer <michel at> wrote:
> On 19/10/16 07:33 PM, Marek Olšák wrote:
>> On Wed, Oct 19, 2016 at 8:42 AM, Dave Airlie <airlied at> wrote:
>>> On 18 October 2016 at 23:53, Dan Williams <dan.j.williams at> wrote:
>>>> On Mon, Oct 17, 2016 at 8:48 PM, Dave Airlie <airlied at> wrote:
>>>> [..]
>>>>>>> Aren't there only 2 possibilities for this regression?
>>>>>>> 1/ a memtype entry was never made so track_pfn_insert() returns an
>>>>>>> uncached mapping
>>>>>>> 2/ a conflicting memtype entry exists and undefined behavior due to
>>>>>>> mixed mapping types is avoided with the change.
>>>>>> 3/ The CPU usage through this path goes up, and slows things down,
>>>>>> though I suspect it's more an uncached mapping showing up
>>>>>> when we don't expect it.
>>>>> It's looking like number 1, there is no mapping, now we get uncached
>>>>> where we used to get write through.
>>>>> difference in page prot 7f7bbc0e0000, pfn 20000000000e71e4,
>>>>> 8000000000000037, 800000000000002f
>>>>> 0x2f is the vma pg prot which has PWT set in it, 0x37 is the returned
>>>>> pgprot which lacks that bit.
>>>>> not sure where to go from here, suggestions?
>>>> If the driver established an ioremap_wt() across the range, or just
>>>> called reserve_memtype() directly that should restore WT mappings.
>>>> Although Daniel's suggestion to use the i915 mapping helpers sounds
>>>> like it avoids problem 3/ as well.
>>> Well we shouldn't be doing that many VRAM mappings on the CPU so
>>> I doubt we'll hit the overheads here that often.
>>> Ideally we'd always use DMA to move stuff in/out of VRAM, but there
>>> are some places where we still do WC VRAM writes for uploads.
>> WC VRAM for uploads is better than WC GART IMO.
> It's not a simple choice I'm afraid. While writing directly to WC VRAM
> can be faster than writing to WC GART and then DMA'ing to VRAM, doing so
> increases pressure on the first 256MB of VRAM. That's why I disabled
> direct VRAM writes for streaming uploads again in
> . It's possible that something has changed since then though, feel free
> to play with enabling it again.

amdgpu should handle any memory pressure gracefully. radeon is not so
robust though.
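
For reference, the two pgprot values quoted above (0x800000000000002f
from the vma and 0x8000000000000037 as returned) can be compared bit by
bit. This is a minimal sketch, assuming the architectural x86 PTE flag
layout for 4 KiB pages. The macro and function names are made up for
illustration and are not kernel identifiers. Which memory type each PAT
index selects depends on how the PAT MSR is programmed, which this sketch
does not model.

```c
#include <stdint.h>

/* Cache-control bits in an x86 page table entry (4 KiB pages).
 * Names are illustrative, not the kernel's own macros. */
#define PTE_PWT (1ULL << 3)  /* page-level write-through */
#define PTE_PCD (1ULL << 4)  /* page-level cache disable */
#define PTE_PAT (1ULL << 7)  /* PAT index high bit       */

#define PTE_CACHE_MASK (PTE_PWT | PTE_PCD | PTE_PAT)

/* Build the 3-bit PAT index (PAT:PCD:PWT) that the CPU uses to pick
 * a memory type from the PAT MSR. */
static unsigned pat_index(uint64_t prot)
{
    return ((prot & PTE_PWT) ? 1u : 0u) |
           ((prot & PTE_PCD) ? 2u : 0u) |
           ((prot & PTE_PAT) ? 4u : 0u);
}

/* Return the cache-control bits that differ between two pgprot values. */
static uint64_t cache_bits_changed(uint64_t a, uint64_t b)
{
    return (a ^ b) & PTE_CACHE_MASK;
}
```

With the values from the thread, pat_index() gives 1 for
0x800000000000002f and 2 for 0x8000000000000037, and
cache_bits_changed() returns 0x18: the returned pgprot has PWT cleared
and PCD set relative to the vma's.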

