[PATCH] drm/amdgpu: fix NULL pointer dereference when run App with DRI_PRIME=1

Zhang, Jerry (Junwei) Jerry.Zhang at amd.com
Mon May 28 07:23:20 UTC 2018


On 05/25/2018 07:23 PM, Christian König wrote:
> Am 25.05.2018 um 11:51 schrieb Zhang, Jerry (Junwei):
>> On 05/25/2018 05:35 PM, Christian König wrote:
>>> Am 25.05.2018 um 10:23 schrieb Zhang, Jerry (Junwei):
>>>> On 05/25/2018 03:54 PM, Christian König wrote:
>>>>> Am 25.05.2018 um 09:20 schrieb Zhang, Jerry (Junwei):
>>>>>> On 05/25/2018 02:44 PM, Christian König wrote:
>>>>>>> NAK, that probably just fixed the symptom but not the underlying problem.
>>>>>>>
>>>>>>> Somebody is accessing the page array when it should never be accessed.
>>>>>>
>>>>>> If prime imports were GTT BOs by default (currently they are CPU BOs), it
>>>>>> would fail right away at GTT sg BO creation rather than at the next CS validation.
>>>>>>
>>>>>> Since ttm_sg_tt_init() only allocates gtt->ttm.dma_address when an sg BO is
>>>>>> created, accessing ttm->pages during TTM populate fails.
>>>>>
>>>>> And that's exactly the problem: an imported BO should never be populated.
>>>>>
>>>>>>
>>>>>> The current error happens in TTM populate during CS validation; the sg BO
>>>>>> was imported from the exporter.
>>>>>>
>>>>>>>
>>>>>>> How did you manage to trigger this?
>>>>>>
>>>>>> DRI_PRIME=1 with Unigine Heaven.
>>>>>
>>>>> Going to give that a try, but the last time I checked, it worked as expected.
>>>>
>>>> FYI.
>>>> DRI_PRIME=1 glxinfo does not trigger it, but the game does.
>>>
>>> Just tested and it works perfectly fine.
>>>
>>> Is that on the closed stack or the open stack?
>>
>> I used the unified driver (latest 18.20 build) + drm-next kernel, installed as
>> an all-open stack on an A+A platform.
>> (The issue was found with the 18.20 build, all-open stack (DKMS driver).)
>>
>> BTW, how did you get the UMD? From apt-get, or did you build it yourself?
>
> That's a self-built Mesa+libdrm.
>
> Do you have the apt url and/or package versions at hand you used for the test?

I found that the Ubuntu 4.13/4.15 kernels do not have the patch below:
   * https://cgit.freedesktop.org/~agd5f/linux/commit/?h=amd-staging-drm-next&id=186ca446aea19e49d2e1433dd170c6e1c211a52a

So we could fix that in the DKMS support rather than upstream.

I double-checked that the drm-next kernel has no such issue.
(I'm not sure what went wrong last week; I did fetch the latest code and build
the kernel, and it failed. Sorry for the inconvenience.)
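
For reference, the failure mode discussed above can be sketched as a small
userspace model. This is only an illustration under assumptions: every name
in it (model_tt, MODEL_FLAG_SG, model_sg_tt_init, model_populate,
model_tt_fini) is hypothetical and is not the TTM or amdgpu API. It shows an
sg object that gets a dma_address array but no page array, and a populate
step that has to skip such an object instead of touching its pages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MODEL_FLAG_SG 0x1u

struct model_tt {
    unsigned int flags;
    size_t num_pages;
    void **pages;               /* stays NULL for imported (sg) objects */
    unsigned long *dma_address; /* the only array an sg object gets */
};

/* Mirrors the behaviour described in the thread: only dma_address is
 * allocated for an sg object, the page array is left unset. */
static int model_sg_tt_init(struct model_tt *tt, size_t num_pages)
{
    assert(tt != NULL);
    tt->flags = MODEL_FLAG_SG;
    tt->num_pages = num_pages;
    tt->pages = NULL;
    tt->dma_address = calloc(num_pages, sizeof(*tt->dma_address));
    return tt->dma_address ? 0 : -1;
}

/* Populate must return early for an imported object: its backing memory
 * belongs to the exporter. Without this check the code below would go on
 * to work with a page array that was never set up for sg objects. */
static int model_populate(struct model_tt *tt)
{
    assert(tt != NULL);
    if (tt->flags & MODEL_FLAG_SG)
        return 0;

    if (!tt->pages) {
        tt->pages = calloc(tt->num_pages, sizeof(*tt->pages));
        if (!tt->pages)
            return -1;
    }
    return 0;
}

/* Release whatever the model allocated. */
static void model_tt_fini(struct model_tt *tt)
{
    assert(tt != NULL);
    free(tt->pages);
    free(tt->dma_address);
    tt->pages = NULL;
    tt->dma_address = NULL;
}
```

In this model, calling model_populate() on an object set up by
model_sg_tt_init() succeeds and leaves pages as NULL, which is the
"imported BO should never populate" behaviour from the discussion.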

Thanks for your time to check it.

Jerry

>
> Christian.
>
>>
>>
>> Jerry
>

