mm: fix cache mode tracking in vm_insert_mixed() breaks AMDGPU [was: Re: Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?]
Marek Olšák
maraeo at gmail.com
Thu Oct 20 09:06:57 UTC 2016
On Thu, Oct 20, 2016 at 3:11 AM, Michel Dänzer <michel at daenzer.net> wrote:
> On 19/10/16 07:33 PM, Marek Olšák wrote:
>> On Wed, Oct 19, 2016 at 8:42 AM, Dave Airlie <airlied at gmail.com> wrote:
>>> On 18 October 2016 at 23:53, Dan Williams <dan.j.williams at intel.com> wrote:
>>>> On Mon, Oct 17, 2016 at 8:48 PM, Dave Airlie <airlied at gmail.com> wrote:
>>>> [..]
>>>>>>> Aren't there only 2 possibilities for this regression?
>>>>>>>
>>>>>>> 1/ a memtype entry was never made so track_pfn_insert() returns an
>>>>>>> uncached mapping
>>>>>>>
>>>>>>> 2/ a conflicting memtype entry exists and undefined behavior due to
>>>>>>> mixed mapping types is avoided with the change.
>>>>>>
>>>>>> 3/ The CPU usage through this path goes up, and slows things down,
>>>>>> though I suspect it's more an uncached mapping showing up
>>>>>> when we don't expect it.
>>>>>
>>>>> It's looking like number 1, there is no mapping, now we get uncached
>>>>> where we used to get write through.
>>>>>
>>>>> difference in page prot 7f7bbc0e0000, pfn 20000000000e71e4,
>>>>> 8000000000000037, 800000000000002f
>>>>>
>>>>> 0x2f is the vma pg prot which has PWT set in it, 0x37 is the returned
>>>>> pgprot which lacks that bit.
>>>>>
>>>>> not sure where to go from here, suggestions?
>>>>
>>>> If the driver established an ioremap_wt() across the range, or just
>>>> called reserve_memtype() directly that should restore WT mappings.
>>>>
>>>> Although Daniel's suggestion to use the i915 mapping helpers sounds
>>>> like it avoids problem 3/ as well.
>>>
>>> Well we shouldn't be doing that many VRAM mappings on the CPU so
>>> I doubt we'll hit the overheads here that often.
>>>
>>> Ideally we'd always use DMA to move stuff in/out of VRAM, but there
>>> are some places where we still do WC VRAM writes for uploads.
>>
>> WC VRAM for uploads is better than WC GART IMO.
>
> It's not a simple choice I'm afraid. While writing directly to WC VRAM
> can be faster than writing to WC GART and then DMA'ing to VRAM, doing so
> increases pressure on the first 256MB of VRAM. That's why I disabled
> direct VRAM writes for streaming uploads again in
> https://cgit.freedesktop.org/mesa/mesa/commit/?id=7b4276d7acf2e0f77044cb50caa6ad936fa78786
> . It's possible that something has changed since then though, feel free
> to play with enabling it again.

amdgpu should handle any memory pressure gracefully. radeon is not so
robust though.

Marek
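
For reference, the two pgprot values quoted earlier in the thread
(8000000000000037 and 800000000000002f) can be decoded with a small
user-space sketch like the one below. The bit positions are the
architectural x86 page-table-entry positions for PWT, PCD and PAT on 4K
pages; the macro and function names (PTE_PWT, pat_index, describe) are
made up for this illustration and are not kernel APIs. Which memory type
a given PAT index selects depends on how the kernel programmed the PAT
MSR, so the sketch deliberately stops at the index.

```c
#include <stdint.h>
#include <stdio.h>

/* Architectural x86 PTE cache-control bits (4K pages).
 * Macro names are illustrative, not taken from kernel headers. */
#define PTE_PWT (1ULL << 3)  /* page-level write-through */
#define PTE_PCD (1ULL << 4)  /* page-level cache disable */
#define PTE_PAT (1ULL << 7)  /* PAT index bit */

/* Hypothetical helper: PAT index (0..7) selected by the PWT/PCD/PAT bits. */
static unsigned pat_index(uint64_t prot)
{
    return ((prot & PTE_PWT) ? 1u : 0u) |
           ((prot & PTE_PCD) ? 2u : 0u) |
           ((prot & PTE_PAT) ? 4u : 0u);
}

/* Hypothetical helper: print the cache-control bits of one pgprot value. */
static void describe(const char *name, uint64_t prot)
{
    printf("%s = %#llx: PWT=%d PCD=%d PAT=%d -> PAT index %u\n",
           name, (unsigned long long)prot,
           !!(prot & PTE_PWT), !!(prot & PTE_PCD), !!(prot & PTE_PAT),
           pat_index(prot));
}
```

Calling describe() on 0x800000000000002f (the vma page prot) reports PWT
set and PCD clear, i.e. PAT index 1; calling it on 0x8000000000000037
(the returned pgprot) reports PCD set and PWT clear, i.e. PAT index 2.
The two values differ only in bits 3 and 4, which matches the
observation that the returned pgprot lacks the PWT bit.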