mm: fix cache mode tracking in vm_insert_mixed() breaks AMDGPU [was: Re: Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?]
Michel Dänzer
michel at daenzer.net
Thu Oct 20 01:11:58 UTC 2016
On 19/10/16 07:33 PM, Marek Olšák wrote:
> On Wed, Oct 19, 2016 at 8:42 AM, Dave Airlie <airlied at gmail.com> wrote:
>> On 18 October 2016 at 23:53, Dan Williams <dan.j.williams at intel.com> wrote:
>>> On Mon, Oct 17, 2016 at 8:48 PM, Dave Airlie <airlied at gmail.com> wrote:
>>> [..]
>>>>>> Aren't there only 2 possibilities for this regression?
>>>>>>
>>>>>> 1/ a memtype entry was never made so track_pfn_insert() returns an
>>>>>> uncached mapping
>>>>>>
>>>>>> 2/ a conflicting memtype entry exists and undefined behavior due to
>>>>>> mixed mapping types is avoided with the change.
>>>>>
>>>>> 3/ The CPU usage through this path goes up, and slows things down,
>>>>> though I suspect it's more an uncached mapping showing up
>>>>> when we don't expect it.
>>>>
>>>> It's looking like number 1: there is no mapping, so now we get uncached
>>>> where we used to get write through.
>>>>
>>>> difference in page prot 7f7bbc0e0000, pfn 20000000000e71e4,
>>>> 8000000000000037, 800000000000002f
>>>>
>>>> 0x2f is the vma pg prot which has PWT set in it, 0x37 is the returned
>>>> pgprot which lacks that bit.
>>>>
>>>> not sure where to go from here, suggestions?
>>>
>>> If the driver established an ioremap_wt() across the range, or just
>>> called reserve_memtype() directly that should restore WT mappings.
>>>
>>> Although Daniel's suggestion to use the i915 mapping helpers sounds
>>> like it avoids problem 3/ as well.
>>
>> Well we shouldn't be doing that many VRAM mappings on the CPU so
>> I doubt we'll hit the overheads here that often.
>>
>> Ideally we'd always use DMA to move stuff in/out of VRAM, but there
>> are some places where we still do WC VRAM writes for uploads.
>
> WC VRAM for uploads is better than WC GART IMO.
It's not a simple choice I'm afraid. While writing directly to WC VRAM
can be faster than writing to WC GART and then DMA'ing to VRAM, doing so
increases pressure on the first 256MB of VRAM. That's why I disabled
direct VRAM writes for streaming uploads again in
https://cgit.freedesktop.org/mesa/mesa/commit/?id=7b4276d7acf2e0f77044cb50caa6ad936fa78786 .
It's possible that something has changed since then though, feel free
to play with enabling it again.
--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer