[PATCH 1/3] drm/radeon: stop poisoning the GART TLB

Christian König deathsimple at vodafone.de
Fri Jun 13 08:45:36 PDT 2014


Hi Marek,

Ah, yes! Piglit in combination with that patch can indeed crash the box.

Going to investigate now that I can reproduce it.

Thanks,
Christian.

On 13.06.2014 15:19, Marek Olšák wrote:
> Hi,
>
> With my "force_gtt" patch, Cape Verde is unstable too, so all GCN
> chips are affected.
>
> I recommend applying that patch, because it will reproduce the problem
> faster. Without it, the hangs are very rare and it may take a while
> before they occur.
>
> Marek
>
> On Thu, Jun 12, 2014 at 1:23 PM, Christian König
> <deathsimple at vodafone.de> wrote:
>> Please do so, and you might want to try 3.15.0 as well.
>>
>> I've tested multiple piglit runs overnight with my Bonaire and 3.15.0 and
>> that seemed to work perfectly fine.
>>
>> Going to test Alex's drm-next-3.16 a bit more as well.
>>
>> Christian.
>>
>> On 11.06.2014 12:56, Marek Olšák wrote:
>>
>>> I only tested Bonaire. I can test Cape Verde if needed.
>>>
>>> Marek
>>>
>>> On Wed, Jun 11, 2014 at 11:29 AM, Christian König
>>> <deathsimple at vodafone.de> wrote:
>>>> Crap, I already wanted to check back with you if that really fixes your
>>>> problems.
>>>>
>>>> Thanks for the info. This crash also only happens on CIK, doesn't it?
>>>>
>>>> Christian.
>>>>
>>>> On 11.06.2014 01:30, Marek Olšák wrote:
>>>>
>>>>> Sorry to tell you the bad news. This patch doesn't fix the hangs on my
>>>>> machine.
>>>>>
>>>>> I tested drm-next-3.16 from Alex's tree. I also switched copying from
>>>>> SDMA to CP DMA, which hung too.
>>>>>
>>>>> I also tried this:
>>>>>
>>>>> git checkout (the problematic commit):
>>>>> 6d2f294 - drm/radeon: use normal BOs for the page tables v4
>>>>>
>>>>> git cherry-pick (fixes):
>>>>> 0e97703c - drm/radeon: add define for flags used in R600+ GTT
>>>>> 0986c1a5 - drm/radeon: stop poisoning the GART TLB
>>>>> 4906f689 - drm/radeon: fix page directory update size estimation
>>>>> 4b095566 - drm/radeon: fix buffer placement under memory pressure v2
>>>>>
>>>>> Then I tested both SDMA and CP DMA copying. Both were unstable.
>>>>>
>>>>> Testing was done with piglit / quick.tests.
>>>>>
>>>>> Marek
>>>>>
>>>>>
>>>>> On Wed, Jun 4, 2014 at 3:29 PM, Christian König
>>>>> <deathsimple at vodafone.de>
>>>>> wrote:
>>>>>> From: Christian König <christian.koenig at amd.com>
>>>>>>
>>>>>> When we set the valid bit on invalid GART entries they are
>>>>>> loaded into the TLB when an adjacent entry is loaded. This
>>>>>> poisons the TLB with invalid entries which are sometimes
>>>>>> not correctly removed on TLB flush.
>>>>>>
>>>>>> For stable inclusion the patch probably needs to be modified a bit.
>>>>>>
>>>>>> Signed-off-by: Christian König <christian.koenig at amd.com>
>>>>>> Cc: stable at vger.kernel.org
>>>>>> ---
>>>>>>     drivers/gpu/drm/radeon/rs600.c | 5 ++++-
>>>>>>     1 file changed, 4 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/radeon/rs600.c b/drivers/gpu/drm/radeon/rs600.c
>>>>>> index 0a8be63..e0465b2 100644
>>>>>> --- a/drivers/gpu/drm/radeon/rs600.c
>>>>>> +++ b/drivers/gpu/drm/radeon/rs600.c
>>>>>> @@ -634,7 +634,10 @@ int rs600_gart_set_page(struct radeon_device *rdev, int i, uint64_t addr)
>>>>>>                    return -EINVAL;
>>>>>>            }
>>>>>>            addr = addr & 0xFFFFFFFFFFFFF000ULL;
>>>>>> -       addr |= R600_PTE_GART;
>>>>>> +       if (addr == rdev->dummy_page.addr)
>>>>>> +               addr |= R600_PTE_SYSTEM | R600_PTE_SNOOPED;
>>>>>> +       else
>>>>>> +               addr |= R600_PTE_GART;
>>>>>>            writeq(addr, ptr + (i * 8));
>>>>>>            return 0;
>>>>>>     }
>>>>>> --
>>>>>> 1.9.1
>>>>>>
>>>>>> _______________________________________________
>>>>>> dri-devel mailing list
>>>>>> dri-devel at lists.freedesktop.org
>>>>>> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>>>>
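
For readers following the quoted diff, here is a minimal user-space sketch of the entry encoding the patch changes. Only the address mask and the dummy-page branch come from the diff. The function name gart_pte, the PTE_* macro names, and the bit positions are assumptions made for illustration; check radeon.h for the real R600_PTE_* definitions before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only (not copied from radeon.h). */
#define PTE_VALID     (1ULL << 0)
#define PTE_SYSTEM    (1ULL << 1)
#define PTE_SNOOPED   (1ULL << 2)
#define PTE_READABLE  (1ULL << 5)
#define PTE_WRITEABLE (1ULL << 6)

/* Assumed composition of the flag set used for mapped GART pages. */
#define PTE_GART (PTE_VALID | PTE_SYSTEM | PTE_SNOOPED | \
                  PTE_READABLE | PTE_WRITEABLE)

/*
 * Hypothetical helper mirroring the branch in the patch: build the 64-bit
 * entry for a page address. Entries that point at the dummy page are
 * written without the valid bit, so the hardware has nothing valid to
 * pull into the TLB when it loads an adjacent entry.
 */
static uint64_t gart_pte(uint64_t addr, uint64_t dummy_addr)
{
	/* The dummy page address is expected to be 4 KiB aligned. */
	assert((dummy_addr & 0xFFFULL) == 0);

	addr &= 0xFFFFFFFFFFFFF000ULL;              /* keep the page frame only */
	if (addr == dummy_addr)
		addr |= PTE_SYSTEM | PTE_SNOOPED;   /* unmapped: valid bit clear */
	else
		addr |= PTE_GART;                   /* mapped: full flag set */
	return addr;
}
```

Under these assumed bit positions, an entry built for the dummy page has PTE_VALID clear, while any other page address gets the full PTE_GART flag set, which is the distinction the commit message describes.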