[Mesa-dev] [PATCH 1/2] radeonsi: set a per-buffer flag that disables inter-process sharing (v4)

Christian König deathsimple at vodafone.de
Thu Sep 7 10:08:17 UTC 2017


On 07.09.2017 at 11:23, Michel Dänzer wrote:
> On 01/09/17 07:40 PM, Christian König wrote:
>> On 01.09.2017 at 12:28, Michel Dänzer wrote:
>>> On 01/09/17 07:23 PM, Nicolai Hähnle wrote:
>>>> On 01.09.2017 11:58, Michel Dänzer wrote:
>>>>> On 29/08/17 11:47 PM, Christian König wrote:
>>>>>> From: Marek Olšák <marek.olsak at amd.com>
>>>>>>
>>>>>> For lower overhead in the CS ioctl.
>>>>>> Winsys allocators are not used with interprocess-sharable resources.
>>>>>>
>>>>>> v2: It shouldn't crash anymore, but the kernel will reject the new
>>>>>> flag.
>>>>>> v3 (christian): Rename the flag, avoid sending those buffers in the
>>>>>> BO list.
>>>>>> v4 (christian): Remove setting the kernel flag for now
>>>>> This change seems to have caused a GPU hang when running piglit on my
>>>>> Kaveri with the radeon kernel driver.
> I think we can remove "seems to have". I'm still reliably getting the
> GPUVM fault and hang with current master, but not if I revert this
> commit (and the one after it).
>
>>>>> I haven't been able to isolate it to a specific test; it seems to
>>>>> happen only when running multiple tests concurrently.
> I reproduced the problem with piglit process separation enabled as well,
> and all four tests running when it hung were textureGather tests.
> Earlier, when I reproduced the problem twice with piglit process
> separation disabled, three textureGather tests were running when it hung,
> both times. I've been unable to reproduce the problem by manually running
> the same textureGather tests in parallel though.
>
>
>>>>> There's a GPUVM fault before the hang, I suspect it's related:
>>>>>
>>>>> radeon 0000:00:01.0: GPU fault detected: 146 0x0ae6760c
>>>>> radeon 0000:00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_ADDR   0x000001D7
>>>>> radeon 0000:00:01.0:   VM_CONTEXT1_PROTECTION_FAULT_STATUS 0x0607600C
>>>>> VM fault (0x0c, vmid 3) at page 471, read from 'CPF' (0x43504600) (118)
>>>>>
>>>>>
>>>>> Any ideas?
>> Not the slightest, but I'm still investigating problems with that on
>> amdgpu.
>>
>> If we can't find the root cause by Monday, it might be a good idea to
>> revert the patches for now.
> What's the status on that?


I found and fixed the remaining kernel bugs over the weekend and at the
beginning of this week.

I still need to commit the fix for UVD/VCE, but that one shouldn't affect
GFX at all.
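
For anyone reading the quoted fault lines by hand, here is a minimal sketch
of how the raw register values map to the decoded "VM fault" line. The bit
positions are an assumption: they were chosen because they reproduce the
decoded values in the log above, so treat them as illustrative rather than
as the authoritative register layout. The function name decode_vm_fault is
hypothetical.

```python
def decode_vm_fault(addr, status, mc_client):
    """Split the raw fault registers into the fields shown in the log.

    Assumed layout of the status register (inferred from the quoted log):
      bits  0-3   protection bits
      bits 12-19  memory client id
      bit  24     0 = read, 1 = write
      bits 25-28  VMID
    """
    protections = status & 0xF
    mc_id = (status >> 12) & 0xFF
    is_write = bool((status >> 24) & 0x1)
    vmid = (status >> 25) & 0xF
    # The client value holds the block name as big-endian ASCII, NUL padded.
    block = mc_client.to_bytes(4, "big").rstrip(b"\0").decode("ascii")
    return {
        "page": addr,  # the fault address register holds a page number
        "protections": protections,
        "mc_id": mc_id,
        "access": "write" if is_write else "read",
        "vmid": vmid,
        "block": block,
    }


# Values taken from the quoted log lines.
fault = decode_vm_fault(0x000001D7, 0x0607600C, 0x43504600)
print(fault)
```

With the values from the log, this yields page 471, protection bits 0x0c,
vmid 3, a read access, and client 'CPF' with id 118, which matches the
decoded line printed by the kernel.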

Christian.


