[Nouveau] [PATCH 2/2] nouveau: Do not add most bo's to the global bo list.
Emil Velikov
emil.l.velikov at gmail.com
Wed Feb 25 08:14:04 PST 2015
On 25 February 2015 at 15:43, Maarten Lankhorst
<maarten.lankhorst at ubuntu.com> wrote:
> Op 25-02-15 om 16:28 schreef Patrick Baggett:
>> On Wed, Feb 25, 2015 at 9:07 AM, Maarten Lankhorst <
>> maarten.lankhorst at ubuntu.com> wrote:
>>
>>> Op 25-02-15 om 16:04 schreef Patrick Baggett:
>>>> On Wed, Feb 25, 2015 at 8:59 AM, Maarten Lankhorst <
>>>> maarten.lankhorst at ubuntu.com> wrote:
>>>>
>>>>> Op 25-02-15 om 15:11 schreef Emil Velikov:
>>>>>> On 24 February 2015 at 09:01, Maarten Lankhorst
>>>>>> <maarten.lankhorst at ubuntu.com> wrote:
>>>>>>> Only add wrapped bo's and bo's that have been exported through flink
>>> or
>>>>> dma-buf.
>>>>>>> This avoids a lock in the common case, and decreases traversal needed
>>>>> for importing
>>>>>>> a dma-buf or flink.
>>>>>>>
>>>>>>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst at ubuntu.com>
>>>>>>> ---
>>>>>>> nouveau/nouveau.c | 47
>>> +++++++++++++++++++++++------------------------
>>>>>>> 1 file changed, 23 insertions(+), 24 deletions(-)
>>>>>>>
>>>>>>> diff --git a/nouveau/nouveau.c b/nouveau/nouveau.c
>>>>>>> index 1c723b9..d411523 100644
>>>>>>> --- a/nouveau/nouveau.c
>>>>>>> +++ b/nouveau/nouveau.c
>>>>>>> @@ -349,8 +349,8 @@ nouveau_bo_del(struct nouveau_bo *bo)
>>>>>>> struct nouveau_bo_priv *nvbo = nouveau_bo(bo);
>>>>>>> struct drm_gem_close req = { bo->handle };
>>>>>>>
>>>>>>> - pthread_mutex_lock(&nvdev->lock);
>>>>>>> - if (nvbo->name) {
>>>>>>> + if (nvbo->head.next) {
>>>>>>> + pthread_mutex_lock(&nvdev->lock);
>>>>>>> if (atomic_read(&nvbo->refcnt) == 0) {
>>>>>>> DRMLISTDEL(&nvbo->head);
>>>>>>> /*
>>>>>>> @@ -365,8 +365,6 @@ nouveau_bo_del(struct nouveau_bo *bo)
>>>>>>> }
>>>>>>> pthread_mutex_unlock(&nvdev->lock);
>>>>>>> } else {
>>>>>>> - DRMLISTDEL(&nvbo->head);
>>>>>>> - pthread_mutex_unlock(&nvdev->lock);
>>>>>>> drmIoctl(bo->device->fd, DRM_IOCTL_GEM_CLOSE, &req);
>>>>>>> }
>>>>>>> if (bo->map)
>>>>>>> @@ -379,7 +377,6 @@ nouveau_bo_new(struct nouveau_device *dev,
>>> uint32_t
>>>>> flags, uint32_t align,
>>>>>>> uint64_t size, union nouveau_bo_config *config,
>>>>>>> struct nouveau_bo **pbo)
>>>>>>> {
>>>>>>> - struct nouveau_device_priv *nvdev = nouveau_device(dev);
>>>>>>> struct nouveau_bo_priv *nvbo = calloc(1, sizeof(*nvbo));
>>>>>>> struct nouveau_bo *bo = &nvbo->base;
>>>>>>> int ret;
>>>>>>> @@ -397,10 +394,6 @@ nouveau_bo_new(struct nouveau_device *dev,
>>>>> uint32_t flags, uint32_t align,
>>>>>>> return ret;
>>>>>>> }
>>>>>>>
>>>>>>> - pthread_mutex_lock(&nvdev->lock);
>>>>>>> - DRMLISTADD(&nvbo->head, &nvdev->bo_list);
>>>>>>> - pthread_mutex_unlock(&nvdev->lock);
>>>>>>> -
>>>>>>> *pbo = bo;
>>>>>>> return 0;
>>>>>>> }
>>>>>>> @@ -457,6 +450,18 @@ nouveau_bo_wrap_locked(struct nouveau_device
>>> *dev,
>>>>> uint32_t handle,
>>>>>>> return -ENOMEM;
>>>>>>> }
>>>>>>>
>>>>>>> +static void
>>>>>>> +nouveau_bo_make_global(struct nouveau_bo_priv *nvbo)
>>>>>>> +{
>>>>>>> + if (!nvbo->head.next) {
>>>>>>> + struct nouveau_device_priv *nvdev =
>>>>> nouveau_device(nvbo->base.device);
>>>>>>> + pthread_mutex_lock(&nvdev->lock);
>>>>>>> + if (!nvbo->head.next)
>>>> I guess the bo_make_global call is not particularly sensitive, so
>>>>> removing's fine with me.
>>>>>
>>>> I would be worried about the duplicated check. It seems like a "smart"
>>>> compiler could cache the value of "nvbo->head.next" (unless marked as
>>>> volatile), rendering the second if() useless. If the field is marked
>>>> volatile, then of course, this does not apply.
>>>>
>>> In that case I would be worried about a compiler that ignores the acquire
>>> semantics of pthread_mutex_lock..
>>>
>> This isn't about the acquire semantics here, because you're reading it on
>> both sides of the lock. The compiler need not reorder the reads at all,
>> merely just save the value of the first read. Unless the compiler knows
>> that the value can change after the lock is acquired, it can simply cache
>> the result. Consider the nearly identical transformation:
>>
>> const int condition = (!nvbo->head.next);
>> if(condition){
>> pthread_mutex_lock(...);
>> if(condition){ //redundant, already can assert it is true
>> ...
>>
> So you're saying a compiler can optimize:
>
> - statement with memory access
> - read memory barrier
> - statement with memory access
>
> To drop the second statement? I would worry about your definition of memory barrier then..
>
Patrick thanks for elaborating my concern in a more complete manner.
Maarten, I would assume that the overhead of dropping the first if
statement will be negligible wrt the time spent debugging when the
compiler does the silly thing. Afaict even with a single if, your
patch does reduce the time spend in libdrm_nouveau quite a bit.
That said, it's up-to you and others to draw the final call.
Cheers,
Emil
More information about the Nouveau
mailing list