[PATCH] Revert "drm/radeon: Try evicting from CPU accessible to inaccessible VRAM first"

Julien Isorce julien.isorce at gmail.com
Tue Mar 28 08:24:43 UTC 2017


Hi Michel,

About the hard lockup, I noticed that I cannot reproduce it when all of the
following conditions hold:

1. The soft lockup fix (the 0->i change, which avoids the infinite loop)
2. Your suggestion: the !(rbo->flags & RADEON_GEM_CPU_ACCESS) check
3. radeon.gartsize=512 radeon.vramlimit=1024 (larger values do not help,
for example (1024, 1024) or (1024, 2048))
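
To make condition 2 concrete, here is a minimal user-space sketch of the
check that your patch adds to radeon_evict_flags(). It is only an
illustration: the struct, the function name should_move_to_invisible_vram(),
the 4 KiB page size and the flag value are stand-ins I made up for the
sketch, not the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12            /* assumption: 4 KiB pages */
#define SKETCH_GEM_CPU_ACCESS (1u << 3) /* stand-in for RADEON_GEM_CPU_ACCESS */

/* Hypothetical, reduced view of a buffer object placed in VRAM. */
struct sketch_bo {
    uint32_t flags;      /* creation flags of the buffer */
    uint64_t start_page; /* first VRAM page used by the buffer */
};

/*
 * Returns true when an evicted buffer should first be tried in the
 * CPU-inaccessible part of VRAM: it was not created for CPU access,
 * part of VRAM is actually invisible to the CPU, and the buffer
 * currently starts inside the CPU-visible part.
 */
static bool should_move_to_invisible_vram(const struct sketch_bo *bo,
                                          uint64_t visible_vram_size,
                                          uint64_t real_vram_size)
{
    return !(bo->flags & SKETCH_GEM_CPU_ACCESS) &&
           visible_vram_size < real_vram_size &&
           bo->start_page < (visible_vram_size >> SKETCH_PAGE_SHIFT);
}
```

With this check, a buffer flagged for CPU access is never steered towards
the CPU-inaccessible range on eviction.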

Without 1 and 2, but with 3, our test reproduces the soft lockup (which we
discovered only a few days ago).
Without 3 (with or without 1 and 2), our test reproduces the hard lockup,
which leaves no information in kern.log (sometimes a few NUL ^@
characters, but not always).

We are converting this repro test to a piglit test in order to share it, but
it will take some time. In short, it continuously uploads images of randomly
chosen sizes, up to 4K, so TTM's eviction mechanism is exercised heavily.
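
As a rough illustration only (this is not our actual test code), the size
selection can be sketched as below. The 3840x2160 bound for "4K", the 4
bytes per pixel and all names are assumptions made for the sketch; the real
test then uploads an image of each chosen size.

```c
#include <stddef.h>
#include <stdlib.h>

#define MAX_WIDTH 3840    /* assumption: "4K" means 3840x2160 */
#define MAX_HEIGHT 2160
#define BYTES_PER_PIXEL 4 /* assumption: 32-bit RGBA pixels */

/* Hypothetical helper type holding one randomly chosen image size. */
struct image_size {
    unsigned width;
    unsigned height;
};

/*
 * Pick a width in [1, MAX_WIDTH] and a height in [1, MAX_HEIGHT].
 * rand() with modulo is slightly biased, which is acceptable for a
 * stress test that only needs varied allocation sizes.
 */
static struct image_size pick_random_size(void)
{
    struct image_size s;

    s.width = 1 + (unsigned)rand() % MAX_WIDTH;
    s.height = 1 + (unsigned)rand() % MAX_HEIGHT;
    return s;
}

/* Number of bytes one image of the given size occupies. */
static size_t image_bytes(struct image_size s)
{
    return (size_t)s.width * s.height * BYTES_PER_PIXEL;
}
```

Under these assumptions the largest image is about 32 MiB, so a loop over
such sizes quickly produces a mix of small and large buffers.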

(The card is a FirePro W600, Cape Verde, 2048M.)

I am happy to try any other suggestion.

Thx
Julien

On 24 March 2017 at 19:01, Julien Isorce <julien.isorce at gmail.com> wrote:

> Hi Michel,
>
> No this change does not help on the other issue (hard lockup).
> I have not tried it in combination with the 0 -> i change.
>
> Thx anyway.
> Julien
>
>
> On 24 March 2017 at 10:03, Michel Dänzer <michel at daenzer.net> wrote:
>
>> On 24/03/17 12:31 AM, Zachary Michaels wrote:
>> >
>> > I should also note that we are experiencing another issue where the
>> > kernel locks up in similar circumstances. As Julien noted, we get no
>> > output, and the watchdogs don't seem to work. It may be the case that
>> > Xorg and our process are calling ttm_bo_mem_force_space concurrently,
>> > but I don't think we have enough information yet to say for
>> > sure. Reverting this commit does not fix that issue. I have some small
>> > amount of evidence indicating that bos flagged for CPU access are
>> > getting placed in CPU inaccessible memory. Could that cause this sort of
>> > kernel lockup?
>>
>> Possibly, does this help?
>>
>> diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
>> index 37d68cd1f272..40d1bb467a71 100644
>> --- a/drivers/gpu/drm/radeon/radeon_ttm.c
>> +++ b/drivers/gpu/drm/radeon/radeon_ttm.c
>> @@ -198,7 +198,8 @@ static void radeon_evict_flags(struct ttm_buffer_object *bo,
>>         case TTM_PL_VRAM:
>>                 if (rbo->rdev->ring[radeon_copy_ring_index(rbo->rdev)].ready == false)
>>                         radeon_ttm_placement_from_domain(rbo, RADEON_GEM_DOMAIN_CPU);
>> -               else if (rbo->rdev->mc.visible_vram_size < rbo->rdev->mc.real_vram_size &&
>> +               else if (!(rbo->flags & RADEON_GEM_CPU_ACCESS) &&
>> +                        rbo->rdev->mc.visible_vram_size < rbo->rdev->mc.real_vram_size &&
>>                          bo->mem.start < (rbo->rdev->mc.visible_vram_size >> PAGE_SHIFT)) {
>>                         unsigned fpfn = rbo->rdev->mc.visible_vram_size >> PAGE_SHIFT;
>>                         int i;
>>
>>
>>
>> --
>> Earthling Michel Dänzer               |               http://www.amd.com
>> Libre software enthusiast             |             Mesa and X developer
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>>
>
>