[Mesa-dev] [PATCH 1/2] radeonsi: set a per-buffer flag that disables inter-process sharing (v4)

Marek Olšák maraeo at gmail.com
Thu Sep 7 10:14:38 UTC 2017


On Sep 7, 2017 12:08 PM, "Christian König" <deathsimple at vodafone.de> wrote:

Am 07.09.2017 um 11:23 schrieb Michel Dänzer:

> On 01/09/17 07:40 PM, Christian König wrote:
>
>> Am 01.09.2017 um 12:28 schrieb Michel Dänzer:
>>
>>> On 01/09/17 07:23 PM, Nicolai Hähnle wrote:
>>>
>>>> On 01.09.2017 11:58, Michel Dänzer wrote:
>>>>
>>>>> On 29/08/17 11:47 PM, Christian König wrote:
>>>>>
>>>>>> From: Marek Olšák <marek.olsak at amd.com>
>>>>>>
>>>>>> For lower overhead in the CS ioctl.
>>>>>> Winsys allocators are not used with interprocess-sharable resources.
>>>>>>
>>>>>> v2: It shouldn't crash anymore, but the kernel will reject the new
>>>>>> flag.
>>>>>> v3 (christian): Rename the flag, avoid sending those buffers in the
>>>>>> BO list.
>>>>>> v4 (christian): Remove setting the kernel flag for now
>>>>>>
>>>>> This change seems to have caused a GPU hang when running piglit on my
>>>>> Kaveri with the radeon kernel driver.
>>>>>
> I think we can remove "seems to have". I'm still reliably getting the
> GPUVM fault and hang with current master, but not if I revert this
> commit (and the one after it).
>
>>>>> Haven't been able to isolate it to a specific test, seems to only
>>>>> happen when running multiple tests concurrently.
>>>>>
> I reproduced the problem with piglit process separation enabled as well,
> and all four tests running when it hung were textureGather tests.
> Before, reproducing the problem twice with piglit process separation
> disabled, three textureGather tests were running when it hung both times
> as well. I've been unable to reproduce the problem by manually running
> the same textureGather tests in parallel though.
>
>
>>>>> There's a GPUVM fault before the hang, I suspect it's related:
>>>>>
>>>>> radeon 0000:00:01.0: GPU fault detected: 146 0x0ae6760c
>>>>> radeon 0000:00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x000001D7
>>>>> radeon 0000:00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0607600C
>>>>> VM fault (0x0c, vmid 3) at page 471, read from 'CPF' (0x43504600) (118)
>>>>>
>>>>>
>>>>> Any ideas?
>>>>>
>> Not the slightest, but I'm still investigating problems with that on
>> amdgpu.
>>
>> If we can't find the root cause till Monday it might be a good idea to
>> revert the patches for now.
>>
> What's the status on that?
>


I've found and fixed the remaining kernel bugs over the last
weekend/beginning of this week.

Still need to commit the fix for UVD/VCE, but that one shouldn't affect GFX
at all.


Michel is seeing hangs on the radeon KMD, which should be unaffected by your
kernel work, I think.

We could revert this to unbreak Michel's Kaveri, but I think it shouldn't
be so difficult to find the culprit in this patch if there is one.

Marek



Christian.

