[Nouveau] [PATCH 2/2] nouveau: Do not add most bo's to the global bo list.
Maarten Lankhorst
maarten.lankhorst at ubuntu.com
Wed Feb 25 07:43:23 PST 2015
Op 25-02-15 om 16:28 schreef Patrick Baggett:
> On Wed, Feb 25, 2015 at 9:07 AM, Maarten Lankhorst <
> maarten.lankhorst at ubuntu.com> wrote:
>
>> Op 25-02-15 om 16:04 schreef Patrick Baggett:
>>> On Wed, Feb 25, 2015 at 8:59 AM, Maarten Lankhorst <
>>> maarten.lankhorst at ubuntu.com> wrote:
>>>
>>>> Op 25-02-15 om 15:11 schreef Emil Velikov:
>>>>> On 24 February 2015 at 09:01, Maarten Lankhorst
>>>>> <maarten.lankhorst at ubuntu.com> wrote:
>>>>>> Only add wrapped bo's and bo's that have been exported through flink
>> or
>>>> dma-buf.
>>>>>> This avoids a lock in the common case, and decreases traversal needed
>>>> for importing
>>>>>> a dma-buf or flink.
>>>>>>
>>>>>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst at ubuntu.com>
>>>>>> ---
>>>>>> nouveau/nouveau.c | 47
>> +++++++++++++++++++++++------------------------
>>>>>> 1 file changed, 23 insertions(+), 24 deletions(-)
>>>>>>
>>>>>> diff --git a/nouveau/nouveau.c b/nouveau/nouveau.c
>>>>>> index 1c723b9..d411523 100644
>>>>>> --- a/nouveau/nouveau.c
>>>>>> +++ b/nouveau/nouveau.c
>>>>>> @@ -349,8 +349,8 @@ nouveau_bo_del(struct nouveau_bo *bo)
>>>>>> struct nouveau_bo_priv *nvbo = nouveau_bo(bo);
>>>>>> struct drm_gem_close req = { bo->handle };
>>>>>>
>>>>>> - pthread_mutex_lock(&nvdev->lock);
>>>>>> - if (nvbo->name) {
>>>>>> + if (nvbo->head.next) {
>>>>>> + pthread_mutex_lock(&nvdev->lock);
>>>>>> if (atomic_read(&nvbo->refcnt) == 0) {
>>>>>> DRMLISTDEL(&nvbo->head);
>>>>>> /*
>>>>>> @@ -365,8 +365,6 @@ nouveau_bo_del(struct nouveau_bo *bo)
>>>>>> }
>>>>>> pthread_mutex_unlock(&nvdev->lock);
>>>>>> } else {
>>>>>> - DRMLISTDEL(&nvbo->head);
>>>>>> - pthread_mutex_unlock(&nvdev->lock);
>>>>>> drmIoctl(bo->device->fd, DRM_IOCTL_GEM_CLOSE, &req);
>>>>>> }
>>>>>> if (bo->map)
>>>>>> @@ -379,7 +377,6 @@ nouveau_bo_new(struct nouveau_device *dev,
>> uint32_t
>>>> flags, uint32_t align,
>>>>>> uint64_t size, union nouveau_bo_config *config,
>>>>>> struct nouveau_bo **pbo)
>>>>>> {
>>>>>> - struct nouveau_device_priv *nvdev = nouveau_device(dev);
>>>>>> struct nouveau_bo_priv *nvbo = calloc(1, sizeof(*nvbo));
>>>>>> struct nouveau_bo *bo = &nvbo->base;
>>>>>> int ret;
>>>>>> @@ -397,10 +394,6 @@ nouveau_bo_new(struct nouveau_device *dev,
>>>> uint32_t flags, uint32_t align,
>>>>>> return ret;
>>>>>> }
>>>>>>
>>>>>> - pthread_mutex_lock(&nvdev->lock);
>>>>>> - DRMLISTADD(&nvbo->head, &nvdev->bo_list);
>>>>>> - pthread_mutex_unlock(&nvdev->lock);
>>>>>> -
>>>>>> *pbo = bo;
>>>>>> return 0;
>>>>>> }
>>>>>> @@ -457,6 +450,18 @@ nouveau_bo_wrap_locked(struct nouveau_device
>> *dev,
>>>> uint32_t handle,
>>>>>> return -ENOMEM;
>>>>>> }
>>>>>>
>>>>>> +static void
>>>>>> +nouveau_bo_make_global(struct nouveau_bo_priv *nvbo)
>>>>>> +{
>>>>>> + if (!nvbo->head.next) {
>>>>>> + struct nouveau_device_priv *nvdev =
>>>> nouveau_device(nvbo->base.device);
>>>>>> + pthread_mutex_lock(&nvdev->lock);
>>>>>> + if (!nvbo->head.next)
>>> I guess the bo_make_global call is not particularly sensitive, so
>>>> removing's fine with me.
>>>>
>>> I would be worried about the duplicated check. It seems like a "smart"
>>> compiler could cache the value of "nvbo->head.next" (unless marked as
>>> volatile), rendering the second if() useless. If the field is marked
>>> volatile, then of course, this does not apply.
>>>
>> In that case I would be worried about a compiler that ignores the acquire
>> semantics of pthread_mutex_lock..
>>
> This isn't about the acquire semantics here, because you're reading it on
> both sides of the lock. The compiler need not reorder the reads at all,
> merely just save the value of the first read. Unless the compiler knows
> that the value can change after the lock is acquired, it can simply cache
> the result. Consider the nearly identical transformation:
>
> const int condition = (!nvbo->head.next);
> if(condition){
> pthread_mutex_lock(...);
> if(condition){ //redundant, already can assert it is true
> ...
>
So you're saying a compiler can optimize:
- statement with memory access
- read memory barrier
- statement with memory access
To drop the second statement? I would worry about your definition of memory barrier then..
~Maarten
More information about the Nouveau
mailing list