mm: fix cache mode tracking in vm_insert_mixed() breaks AMDGPU [was: Re: Latest testing with drm-next-4.9-wip and latest LLVM/mesa stack - Regression in PowerPlay/DPM on CIK?]

Dave Airlie airlied at gmail.com
Sun Oct 16 20:53:25 UTC 2016


On 17 October 2016 at 04:41, Marek Olšák <maraeo at gmail.com> wrote:
> On Fri, Oct 14, 2016 at 3:33 AM, Michel Dänzer <michel at daenzer.net> wrote:
>>
>> [ Adding Dan Williams and dri-devel ]
>>
>> On 14/10/16 03:28 AM, Shawn Starr wrote:
>>> Hello AMD folks,
>>>
>>> I have discovered a problem in Linus master that affects AMDGPU; nobody would
>>> notice this in drm-next-4.9-wip since it's not in this repo.
>>
>> [...]
>>
>>> 87744ab3832b83ba71b931f86f9cfdb000d07da5 is the first bad commit
>>> commit 87744ab3832b83ba71b931f86f9cfdb000d07da5
>>> Author: Dan Williams <dan.j.williams at intel.com>
>>> Date:   Fri Oct 7 17:00:18 2016 -0700
>>>
>>>     mm: fix cache mode tracking in vm_insert_mixed()
>>>
>>>     vm_insert_mixed() unlike vm_insert_pfn_prot() and vmf_insert_pfn_pmd(),
>>>     fails to check the pgprot_t it uses for the mapping against the one
>>>     recorded in the memtype tracking tree.  Add the missing call to
>>>     track_pfn_insert() to preclude cases where incompatible aliased mappings
>>>     are established for a given physical address range.
>>>
>>>     Link: http://lkml.kernel.org/r/147328717909.35069.14256589123570653697.stgit@dwillia2-desk3.amr.corp.intel.com
>>>     Signed-off-by: Dan Williams <dan.j.williams at intel.com>
>>>     Cc: David Airlie <airlied at linux.ie>
>>>     Cc: Matthew Wilcox <mawilcox at microsoft.com>
>>>     Cc: Ross Zwisler <ross.zwisler at linux.intel.com>
>>>     Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
>>>     Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
>>>
>>> :040000 040000 7517c0019fe49c1830b5a1d81f1dc099c5aab98a fd497a604a2af5995db2b8ed1e9c640bede6adf3 M      mm
>>>
>>>
>>> Removal of this patch stops graphics stalls.
>>
>> Thanks for bisecting this Shawn.
>>
>>
>>> A friend of mine mentions,
>>>
>>> "looks like a graphics thingy you depend on is requesting a mapping with a
>>> not-allowed cache mode, and now you are (rightfully) getting errors?"
>>
>> It would be nice to get some more specific pointers to what amdgpu (or
>> maybe ttm, since that calls vm_insert_mixed in ttm_bo_vm_fault) might be
>> doing wrong.

        /*
         * We'd like to use VM_PFNMAP on shared mappings, where
         * (vma->vm_flags & VM_SHARED) != 0, for performance reasons,
         * but for some reason VM_PFNMAP + x86 PAT + write-combine is very
         * bad for performance. Until that has been sorted out, use
         * VM_MIXEDMAP on all mappings. See freedesktop.org bug #75719
         */
        vma->vm_flags |= VM_MIXEDMAP;

We have that comment in the ttm code, which to me implies that
VM_MIXEDMAP is now doing the right thing, but that the right thing is
slow, just like the interface we should be using (VM_PFNMAP).

Dave.
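
For readers following the thread, here is a rough userspace sketch of the
kind of check the bisected commit adds. It is a toy model, not kernel code:
the names (toy_lookup_memtype, toy_insert_mixed, the cache_mode enum), the
tracked address range, and the fallback to UC- for untracked addresses are
all assumptions made for illustration. It only shows how a caller that asks
for write-combine could end up with a different cache mode once the
requested mode is checked against a tracking table.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy cache modes; illustrative names, not the kernel's own enum. */
enum cache_mode { CM_WB, CM_WC, CM_UC_MINUS, CM_UC };

/* One tracked physical range [start, end) and the mode reserved for it. */
struct memtype_entry {
    uint64_t start;
    uint64_t end;
    enum cache_mode mode;
};

/* Hypothetical stand-in for the memtype tracking tree: a flat table. */
static const struct memtype_entry tracked[] = {
    { 0xe0000000ULL, 0xe1000000ULL, CM_WC },
};

/*
 * Return the tracked mode for paddr. Untracked addresses fall back to
 * UC- here; that default is an assumption of this model.
 */
static enum cache_mode toy_lookup_memtype(uint64_t paddr)
{
    for (size_t i = 0; i < sizeof(tracked) / sizeof(tracked[0]); i++) {
        if (paddr >= tracked[i].start && paddr < tracked[i].end)
            return tracked[i].mode;
    }
    return CM_UC_MINUS;
}

/*
 * Model of an insert path. With check_enabled == 0 the caller's requested
 * mode is used as-is (the old behaviour in this model); with
 * check_enabled != 0 the tracked mode overrides the request.
 */
static enum cache_mode toy_insert_mixed(uint64_t paddr,
                                        enum cache_mode requested,
                                        int check_enabled)
{
    enum cache_mode effective = requested;

    if (check_enabled)
        effective = toy_lookup_memtype(paddr);
    return effective;
}
```

In this model, toy_insert_mixed(0xf0000000ULL, CM_WC, 1) yields CM_UC_MINUS
because the address is not in the table, while the same call with the check
disabled yields CM_WC. Whether that is what actually happens to the ttm
mappings is exactly the open question in this thread.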

