[PATCH 1/1] drm/mm: optimize rb_hole_addr rbtree search in high addr mode

Nirmoy nirmodas at amd.com
Tue Apr 28 17:05:59 UTC 2020


On 4/28/20 6:18 PM, Chris Wilson wrote:
> Quoting Nirmoy Das (2020-04-28 17:04:23)
>> Userspace can severely fragment the rb_hole_addr rbtree by manipulating
>> alignment while allocating buffers. A fragmented rb_hole_addr rbtree
>> results in large delays while allocating a buffer object for a userspace
>> application. Finding a suitable hole takes a long time because, if we
>> fail to find one on the first attempt, we look for neighbouring nodes
>> using rb_prev(), and traversing the rbtree with rb_prev() can take a
>> really long time if the tree is fragmented.
>>
>> This patch improves searches in a fragmented rb_hole_addr rbtree by
>> storing an extra field in drm_mm_node, max_hole_size. Each drm_mm_node
>> now stores the maximum hole size for its subtree in
>> drm_mm_node->max_hole_size. Using drm_mm_node->max_hole_size, it is
>> possible to eliminate a complete subtree if that subtree is unable to
>> serve a request, hence reducing the number of rb_prev() calls needed.
>>
>> Note: the implementation details of the rb_hole_addr rbtree updates
>> after node removal and insertion are borrowed from block/bfq-wf2q.c,
>> which tries to achieve a similar goal.
> Feels like it is an augmented rbtree?


I just read up on augmented rbtrees quickly, and yes, it does feel like
one. I will try to understand them better and see whether I can convert
this logic to an augmented rbtree.
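For reference, the core idea can be sketched outside the kernel with a toy
binary tree: each node caches the largest hole size found anywhere in its
subtree, and after a change the cache is recomputed bottom-up, stopping as
soon as an ancestor's cached value is unchanged. This is exactly what an
augmented rbtree's propagate callback does. All names below (toy_node,
propagate, etc.) are made up for illustration; this is not the kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for drm_mm_node: a plain binary tree node that caches
 * the largest hole size found anywhere in its subtree. */
struct toy_node {
	size_t hole_size;	/* hole directly at this node */
	size_t max_hole_size;	/* cached max over this subtree */
	struct toy_node *left, *right, *parent;
};

static size_t subtree_max(const struct toy_node *n)
{
	return n ? n->max_hole_size : 0;
}

/* Recompute one node's cached maximum from itself and its children. */
static size_t compute_max(const struct toy_node *n)
{
	size_t m = n->hole_size;

	if (subtree_max(n->left) > m)
		m = subtree_max(n->left);
	if (subtree_max(n->right) > m)
		m = subtree_max(n->right);
	return m;
}

/* Propagate a change upward, stopping once an ancestor's cached value
 * is already correct -- the same early-out an augmented rbtree's
 * propagate callback uses. */
static void propagate(struct toy_node *n)
{
	while (n) {
		size_t m = compute_max(n);

		if (n->max_hole_size == m)
			break;
		n->max_hole_size = m;
		n = n->parent;
	}
}
```

After a node's hole shrinks or grows, calling propagate() from that node
restores every ancestor's cached maximum in O(height) steps.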


>
>> +static struct drm_mm_node *
>> +next_hole_high_addr(struct drm_mm_node *entry, u64 size)
>> +{
>> +       struct rb_node *rb_node, *left_rb_node, *parent_rb_node;
>> +       struct drm_mm_node *left_node;
>> +
>> +       if (!entry)
>> +               return NULL;
>> +
>> +       rb_node = &entry->rb_hole_addr;
>> +       if (rb_node->rb_left) {
>> +               left_rb_node = rb_node->rb_left;
>> +               parent_rb_node = rb_parent(rb_node);
>> +               left_node = rb_entry(left_rb_node,
>> +                                    struct drm_mm_node, rb_hole_addr);
>> +               if ((left_node->max_hole_size < size ||
>> +                    entry->size == entry->max_hole_size) &&
>> +                   parent_rb_node && parent_rb_node->rb_left != rb_node)
>> +                       return rb_hole_addr_to_node(parent_rb_node);
>> +       }
>> +
>> +       return rb_hole_addr_to_node(rb_prev(rb_node));
>> +}
> The max_hole_size would equally apply to next_hole_low_addr(), right?


Sorry, yes, we need that too. I will add next_hole_low_addr().
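A low-address-first variant can reuse the same cached maximum to skip whole
subtrees during an in-order walk. Here is a rough standalone sketch of that
pruning (toy structures again, not the kernel's drm_mm helpers; hole_node
and first_fit_low are hypothetical names):

```c
#include <assert.h>
#include <stddef.h>

/* Toy node (not the kernel's drm_mm_node): each node caches the
 * largest hole size anywhere in its subtree. */
struct hole_node {
	size_t hole_size;
	size_t max_hole_size;
	struct hole_node *left, *right;
};

/* Lowest-first search: an in-order walk that prunes any subtree whose
 * cached maximum cannot possibly satisfy the request, so fragmented
 * regions full of too-small holes are skipped in one comparison. */
static struct hole_node *first_fit_low(struct hole_node *n, size_t size)
{
	struct hole_node *hit;

	if (!n || n->max_hole_size < size)
		return NULL;	/* whole subtree pruned */

	hit = first_fit_low(n->left, size);
	if (hit)
		return hit;
	if (n->hole_size >= size)
		return n;
	return first_fit_low(n->right, size);
}
```

The high-address case is the mirror image: visit the right subtree first,
with the same max_hole_size check guarding each descent.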


>
> Sadly, I did not explore the fragmented search LOW/HIGH in test-drm_mm.c.
> That would be a useful addition as well.


Yes, that would be useful too.

I used ./tools/testing/selftests/drivers/gpu/drm_mm.sh to verify this patch.

> -Chris


Thanks for reviewing,

Nirmoy


