CIK hangs with kernel 3.15, bisected

Christian König deathsimple at vodafone.de
Wed May 28 03:38:53 PDT 2014


I already tried a similar patch as well, without noticeably more
crashes. But I'm going to give this another round with your patch and openarena.

Thanks,
Christian.

On 27.05.2014 23:55, Marek Olšák wrote:
> Hi Christian,
>
> I'm testing on Bonaire (ChipID = 0x665c). Unfortunately, the hangs are not
> fixed yet. They are very rare and very random. Therefore, I have come
> up with a patch which evicts page tables between IBs. See the
> attachment. With that patch applied, the system starts fine, compiz
> and glxgears work, but once I start playing openarena, it locks up
> pretty quickly.
>
> The patch shouldn't do anything in theory, because pages are moved
> back to VRAM immediately after that. However, the VRAM address of page
> tables may end up being different from before, which might be the root
> cause.
>
> Marek
>
> On Wed, May 14, 2014 at 2:11 PM, Christian König
> <deathsimple at vodafone.de> wrote:
>> Crap, any chance you can narrow it down a bit more?
>>
>> I've just tried a piglit quick test on my Bonaire and it seems to work
>> perfectly fine.
>>
>> What hw do you test on?
>>
>> Regards,
>> Christian.
>>
>> On 13.05.2014 23:21, Marek Olšák wrote:
>>
>>> Hi Christian,
>>>
>>> Even though some regressions are fixed by these patches:
>>>
>>> drm/radeon: fix page directory update size estimation
>>> drm/radeon: fix buffer placement under memory pressure v2
>>>
>>> and indeed, the texelFetch tests no longer hang, there is one more
>>> hang which needs to be fixed. :( All I know is the exact same commit
>>> causes it and it can only be reproduced by running whole piglit with
>>> concurrency enabled.
>>>
>>> My kernel git log:
>>>
>>> * 2ba22c8 - drm/radeon: fix buffer placement under memory pressure v2
>>> (10 hours ago) <Christian König>
>>> * 3af91e5 - drm/radeon: fix page directory update size estimation (21
>>> hours ago) <Christian König>
>>> * 6d2f294 - drm/radeon: use normal BOs for the page tables v4 (2
>>> months ago) <Christian König>
>>> * fa68834 - drm/radeon: further cleanup vm flushing & fencing (2
>>> months ago) <Christian König>
>>>
>>> fa68834 doesn't hang, but 2ba22c8 hangs, which means 6d2f294 or either
>>> of the two fixes is the first bad commit.
>>>
>>> Marek
>>>
>>> On Fri, May 9, 2014 at 8:03 PM, Marek Olšák <maraeo at gmail.com> wrote:
>>>> Hi Christian,
>>>>
>>>> This commit which first appeared in 3.15-rc1 causes hangs on Bonaire:
>>>>
>>>> commit 6d2f2944e95e504a7d33385eeeb9bb7fcca72592
>>>> Author: Christian König <christian.koenig at amd.com>
>>>> Date:   Thu Feb 20 13:42:17 2014 +0100
>>>>
>>>>       drm/radeon: use normal BOs for the page tables v4
>>>>
>>>>       No need to make it more complicated than necessary,
>>>>       just allocate the page tables as normal BO and
>>>>       flush whenever the address change.
>>>>
>>>>       v2: update comments and function name
>>>>       v3: squash bug fixes, page directory and tables patch
>>>>       v4: rebased on Mareks changes
>>>>
>>>>       Signed-off-by: Christian König <christian.koenig at amd.com>
>>>>
>>>>
>>>> Reverting the commit gives me a lot of merge conflicts.
>>>>
>>>> The simplest way to reproduce the hangs is to run piglit with these
>>>> parameters:
>>>> -t texelFetch.fs
>>>>
>>>> Some of the tests allocate a lot of MSAA textures and the tests also
>>>> run in parallel, which creates a lot of memory pressure and probably
>>>> causes buffer evictions.
>>>>
>>>> Any idea what is wrong with it?
>>>>
>>>> Thanks,
>>>>
>>>> Marek
>>
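
For reference, the narrowing described in the quoted git log (fa68834 is
good, 2ba22c8 is bad) can be sketched as below. This is only an
illustration and not part of any patch: it assumes the four commits shown
in the log form a linear history with nothing in between, and the names
HISTORY and first_bad_candidates are made up for the example.

```python
# Commits from the quoted git log, oldest first (abbreviated hashes as posted).
# Assumption: linear history with no other commits in between.
HISTORY = ["fa68834", "6d2f294", "3af91e5", "2ba22c8"]


def first_bad_candidates(history, last_good, known_bad):
    """Return the commits that could still be the first bad one."""
    good = history.index(last_good)
    bad = history.index(known_bad)
    if good >= bad:
        raise ValueError("the good commit must be older than the bad commit")
    # Every commit after the last known-good one, up to and including
    # the known-bad one, remains a suspect.
    return history[good + 1 : bad + 1]


candidates = first_bad_candidates(HISTORY, "fa68834", "2ba22c8")
print(candidates)
```

With only three suspects left, testing 3af91e5 or 6d2f294 on its own would
be enough to tell whether the page table commit or one of the two fixes
introduces the remaining hang.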


