reservation questions

Christian König ckoenig.leichtzumerken at gmail.com
Tue Mar 6 12:46:57 UTC 2018


> e.g. the excl fence is on the same ctx as (but later than) the one in
> the shared list.
>
Well, it's not on the same context, but it is guaranteed not to complete
before all shared fences.

See for example how it is used in amdgpu_copy_buffer(). We first sync to 
all fences in the reservation object:
>                 r = amdgpu_sync_resv(adev, &job->sync, resv,
>                                      AMDGPU_FENCE_OWNER_UNDEFINED,
>                                      false);
>                 if (r) {
>                         DRM_ERROR("sync failed (%d).\n", r);
>                         goto error_free;
>                 }
This way the resulting fence will never signal before any of the fences
already in the object, and so it can safely be used as the exclusive
fence in the reservation object.
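
To make that concrete, here is a minimal sketch of the calling pattern.
The job helpers, the entity/ring names and the error label are
assumptions based on the amdgpu code of that era; treat it as an
illustration, not the verbatim amdgpu_copy_buffer():

        /* make the new job depend on every fence currently in the resv
         * object, i.e. the exclusive fence plus all shared fences */
        r = amdgpu_sync_resv(adev, &job->sync, resv,
                             AMDGPU_FENCE_OWNER_UNDEFINED, false);
        if (r)
                goto error_free;

        /* submit the job; "fence" signals when the copy has finished */
        r = amdgpu_job_submit(job, ring, &adev->mman.entity,
                              AMDGPU_FENCE_OWNER_UNDEFINED, &fence);
        if (r)
                goto error_free;

        /* the job cannot start before everything it synced to, so its
         * fence cannot signal before the old shared fences and can be
         * installed as the new exclusive fence without any waiting */
        reservation_object_add_excl_fence(resv, fence);

The waiting therefore happens on the scheduler/hardware side, not in
reservation_object_add_excl_fence() itself.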

Regards,
Christian.

On 06.03.2018 at 13:32, Liu, Monk wrote:
>
> You mean the caller must guarantee that the excl fence will only signal
> after all shared fences have signaled, so you can just ignore all shared
> fences when add_excl_fence() is invoked.
>
>
> e.g. the excl fence is on the same ctx as (but later than) the one in
> the shared list.
>
>
> thanks for the explanation
>
>
> /Monk
>
>
>
>
>
> ------------------------------------------------------------------------
> *From:* Koenig, Christian
> *Sent:* Tuesday, March 6, 2018 7:11:50 PM
> *To:* Liu, Monk
> *Cc:* dri-devel at lists.freedesktop.org; Chris Wilson
> *Subject:* Re: reservation questions
> Hi Monk,
>
> your check isn't correct because you still haven't understood the 
> semantics here.
>
>> the assumption that all shared fences are already signaled before the
>> excl fence is added doesn't look 100% guaranteed in LKG,
> The semantics are NOT that all shared fences are already signaled when
> the exclusive fence is added.
>
> Instead, the requirement is that the exclusive fence signals only after
> all shared fences have signaled. In other words, the handling here is
> asynchronous.
>
> I honestly don't know how else to explain it.
>
> Regards,
> Christian.
>
> On 06.03.2018 at 12:03, Liu, Monk wrote:
>>
>> Hi Christian
>>
>>
>> I used the below patch to capture the incorrect case:
>>
>>
>> @@ -267,12 +267,21 @@ void reservation_object_add_excl_fence(struct reservation_object *obj,
>>         write_seqcount_end(&obj->seq);
>>         preempt_enable();
>> -       /* inplace update, no shared fences */
>> -       while (i--)
>> -               dma_fence_put(rcu_dereference_protected(old->shared[i],
>> -                               reservation_object_held(obj)));
>> +       /* inplace update, no shared fences continue after all shared signaled */
>> +       while (i--) {
>> +               struct dma_fence *f = rcu_dereference_protected(old->shared[i],
>> +                               reservation_object_held(obj));
>> +               if (!dma_fence_is_signaled(f))
>> +                       BUG();
>> +
>> +               dma_fence_put(f);
>> +               /* better assign shared[i] with NULL for sure */
>> +               rcu_assign_pointer(old->shared[i], NULL);
>> +       }
>>         dma_fence_put(old_fence);
>> +
>> +
>>  }
>>  EXPORT_SYMBOL(reservation_object_add_excl_fence);
>>
>> and I hit this BUG() during testing:
>>
>> [  105.244816] [drm] Initialized amdgpu 3.24.0 20150101 for 
>> 0000:00:08.0 on minor 0
>> [  105.623332] ------------[ cut here ]------------
>> [  105.623335] kernel BUG at drivers/dma-buf/reservation.c:275!
>> [  105.624470] invalid opcode: 0000 [#1] SMP
>> [  105.624915] Modules linked in: amdgpu chash gpu_sched ttm 
>> drm_kms_helper drm i2c_algo_bit fb_sys_fops syscopyarea sysfillrect 
>> sysimgblt snd_hda_codec_generic snd_hda_intel snd_hda_codec 
>> snd_hda_core snd_hwdep crct10dif_pclmul crc32_pclmul snd_pcm 
>> ghash_clmulni_intel pcbc snd_seq_midi snd_seq_midi_event snd_rawmidi 
>> aesni_intel aes_x86_64 crypto_simd glue_helper cryptd snd_seq 
>> snd_seq_device snd_timer serio_raw snd soundcore i2c_piix4 mac_hid 
>> parport_pc ppdev lp parport autofs4 8139too psmouse 8139cp mii floppy 
>> pata_acpi
>> [  105.630547] CPU: 3 PID: 1216 Comm: 3dmark Not tainted 4.13.0-debug #1
>> [  105.631762] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), 
>> BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
>> [  105.633528] task: ffff8f8a6a165a00 task.stack: ffffb1204159c000
>> [  105.634676] RIP: 0010:reservation_object_add_excl_fence+0x9c/0xf0
>> [  105.635824] RSP: 0018:ffffb1204159f9f0 EFLAGS: 00010246
>> [  105.636805] RAX: 0000000000000000 RBX: ffff8f8a64bee760 RCX: 
>> ffff8f8a6bfa2f50
>> [  105.638123] RDX: ffff8f8a6bfa6770 RSI: ffff8f8a64bee660 RDI: 
>> ffff8f8a6635f628
>> [  105.639440] RBP: ffffb1204159fa18 R08: 0000000000000000 R09: 
>> 0000000000000001
>> [  105.640702] R10: ffffb1204159f808 R11: 0000000000000003 R12: 
>> 0000000000000000
>> [  105.641947] R13: 0000000000000000 R14: ffff8f8a6d0f0200 R15: 
>> ffff8f8a64beee60
>> [  105.643165] FS:  00007fd13c73d940(0000) GS:ffff8f8a76d80000(0000) 
>> knlGS:0000000000000000
>> [  105.644573] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [  105.646482] CR2: 00007fd13c6fd000 CR3: 00000001a2a58000 CR4: 
>> 00000000001406e0
>> [  105.648467] Call Trace:
>> [  105.652480]  amdgpu_bo_do_create+0x3a1/0x540 [amdgpu]
>> [  105.654233]  amdgpu_bo_create+0x3a/0x220 [amdgpu]
>> [  105.655956]  amdgpu_vm_alloc_levels.isra.14+0x1dc/0x370 [amdgpu]
>> [  105.657641]  amdgpu_vm_alloc_pts+0x49/0x70 [amdgpu]
>> [  105.659155]  amdgpu_gem_va_ioctl+0x365/0x520 [amdgpu]
>> [  105.660698]  ? amdgpu_gem_create_ioctl+0x19a/0x280 [amdgpu]
>> [  105.662515]  ? amdgpu_gem_metadata_ioctl+0x1c0/0x1c0 [amdgpu]
>> [  105.664203]  drm_ioctl_kernel+0x69/0xb0 [drm]
>> [  105.665491]  ? drm_ioctl_kernel+0x69/0xb0 [drm]
>> [  105.666959]  drm_ioctl+0x2d2/0x390 [drm]
>> [  105.668373]  ? amdgpu_gem_metadata_ioctl+0x1c0/0x1c0 [amdgpu]
>> [  105.670056]  ? call_rcu_sched+0x1d/0x20
>> [  105.671516]  ? put_object+0x26/0x30
>> [  105.672741]  ? __delete_object+0x39/0x50
>> [  105.674048]  amdgpu_drm_ioctl+0x4c/0x80 [amdgpu]
>> [  105.675551]  do_vfs_ioctl+0x92/0x5a0
>> [  105.676874]  ? kvm_sched_clock_read+0x1e/0x30
>> [  105.678276]  ? sched_clock+0x9/0x10
>> [  105.679553]  ? get_vtime_delta+0x99/0xc0
>> [  105.681007]  SyS_ioctl+0x79/0x90
>> [  105.684574]  do_syscall_64+0x6e/0x150
>> [  105.685910]  entry_SYSCALL64_slow_path+0x25/0x25
>> [  105.687354] RIP: 0033:0x7fd13b25ff47
>> [  105.688666] RSP: 002b:00007fff5422b2c8 EFLAGS: 00000202 ORIG_RAX: 
>> 0000000000000010
>> [  105.691268] RAX: ffffffffffffffda RBX: 0000000001886130 RCX: 
>> 00007fd13b25ff47
>> [  105.693148] RDX: 00007fff5422b390 RSI: 00000000c0286448 RDI: 
>> 0000000000000007
>> [  105.695003] RBP: 00007fff5422b300 R08: 0000000300000000 R09: 
>> 000000000000000e
>> [  105.696774] R10: 0000000001887c28 R11: 0000000000000202 R12: 
>> 000000000188a430
>> [  105.698459] R13: 0000000001886130 R14: 00007fff5422b638 R15: 
>> 0000000000000000
>> [  105.700168] Code: 74 3f 41 89 c4 45 89 e5 4b 8b 5c ee 18 48 8b 43 
>> 48 a8 01 75 cc 48 8b 43 08 48 8b 40 18 48 85 c0 74 09 48 89 df ff d0 
>> 84 c0 75 0c <0f> 0b 48 89 df e8 2a ed ff ff eb b4 48 89 df e8 80 ef 
>> ff ff eb
>> [  105.704982] RIP: reservation_object_add_excl_fence+0x9c/0xf0 RSP: 
>> ffffb1204159f9f0
>>
>>
>> the assumption that all shared fences are already signaled before the
>> excl fence is added doesn't look 100% guaranteed in LKG,
>>
>> Going to take a deep look ...
>>
>>
>> /Monk
>>
>>
>>
>> ------------------------------------------------------------------------
>> *From:* Liu, Monk
>> *Sent:* Tuesday, March 6, 2018 6:47 PM
>> *To:* Koenig, Christian; Chris Wilson; dri-devel at lists.freedesktop.org
>> *Subject:* Re: reservation questions
>>
>> ok, that's good point ...
>>
>> ------------------------------------------------------------------------
>> *From:* Koenig, Christian
>> *Sent:* Tuesday, March 6, 2018 6:42:44 PM
>> *To:* Liu, Monk; Chris Wilson; dri-devel at lists.freedesktop.org
>> *Subject:* Re: reservation questions
>> Hi Monk,
>>
>> that is done to avoid the problem that allocating memory could fail.
>>
>> E.g. we only add the fence after sending the command to the hardware,
>> so there is no way back: we must add the fence or we break memory
>> management.
>>
>> So reservation_object_reserve_shared() makes sure there is a free 
>> fence slot *before* we start to prepare things for the hardware.
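>>
>> A minimal sketch of the resulting calling pattern in a driver (the
>> resv/fence/r names are only illustrative, not taken from a specific
>> function):
>>
>>         /* before touching the hardware: may fail, we can still back out */
>>         r = reservation_object_reserve_shared(resv);
>>         if (r)
>>                 return r;
>>
>>         /* ... build and submit the command stream, obtaining its fence ... */
>>
>>         /* after submission: must not fail, the slot was reserved above */
>>         reservation_object_add_shared_fence(resv, fence);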
>>
>> Regards,
>> Christian.
>>
>> On 06.03.2018 at 11:19, Liu, Monk wrote:
>>>
>>> Hi Chris
>>>
>>>
>>> another question is why we don't just call
>>> "reservation_object_reserve_shared" inside the add_shared_fence()
>>> function below, so the BUG_ON() could be avoided and the caller
>>> wouldn't need to worry about when and how many times it should call
>>> reserve_shared()?
>>>
>>> thanks !
>>>
>>>
>>> void reservation_object_add_shared_fence(struct reservation_object *obj,
>>>                                          struct dma_fence *fence)
>>> {
>>>         struct reservation_object_list *old, *fobj = obj->staged;
>>>
>>>         old = reservation_object_get_list(obj);
>>>         obj->staged = NULL;
>>>
>>>         if (!fobj) {
>>>                 BUG_ON(old->shared_count >= old->shared_max);
>>>                 reservation_object_add_shared_inplace(obj, old, fence);
>>>         } else
>>>                 reservation_object_add_shared_replace(obj, old, fobj, fence);
>>> }
>>> EXPORT_SYMBOL(reservation_object_add_shared_fence);
>>> ------------------------------------------------------------------------
>>> *From:* Chris Wilson <chris at chris-wilson.co.uk>
>>> *Sent:* Tuesday, March 6, 2018 6:10:21 PM
>>> *To:* Liu, Monk; dri-devel at lists.freedesktop.org; Koenig, Christian
>>> *Subject:* Re: reservation questions
>>> Quoting Liu, Monk (2018-03-06 09:45:19)
>>> > call reservation_object_add_excl_fence,
>>> > it sets obj->fence->shared_count to 0, and puts all shared fences
>>> > from obj->fence without waiting for them to signal.
>>> > (this action looks inappropriate, I think at least before putting all
>>> > those shared fences we should dma_fence_wait() on them to make sure
>>> > they are signaled)
>>>
>>> No. Serialisation of resv updates is handled by the caller; the fences
>>> are ordered asynchronously, so the wait is implicit in the construction.
>>> (I.e. before the excl fence can be signaled, all of the earlier shared
>>> fences must be signaled. You can even say that before the operation
>>> whose completion the excl fence signals can begin, all the shared
>>> fences must have been signaled. But that is all implicit so that we can
>>> do it asynchronously.)
>>>
>>> > call reservation_object_reserve_shared,
>>> > this time obj->staged isn't NULL, and it is freed (nothing bad yet,
>>> > since obj->fence points somewhere else),
>>> > and obj->staged is set to NULL,
>>> >
>>> > call reservation_object_add_shared_fence,
>>> > this time it should go through reservation_object_add_shared_inplace,
>>> > but BUG_ON(old->shared_count >= old->shared_max) will hit!
>>>
>>> How? You only free staged iff shared_count < shared_max.
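>>>
>>> For reference, the in-place branch looks roughly like this (paraphrased
>>> and trimmed from the current reservation_object_reserve_shared(), so
>>> the exact code may differ slightly):
>>>
>>>         old = reservation_object_get_list(obj);
>>>         if (old && old->shared_max) {
>>>                 if (old->shared_count < old->shared_max) {
>>>                         /* a free slot is left, an in-place add will work,
>>>                          * so any previously staged copy can be dropped */
>>>                         kfree(obj->staged);
>>>                         obj->staged = NULL;
>>>                         return 0;
>>>                 } else
>>>                         max = old->shared_max * 2;
>>>         } else
>>>                 max = 4;
>>>         /* otherwise allocate or grow obj->staged to hold max fences */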
>>>
>>> You've reminded me that we should cover all this with a bunch of
>>> selftests.
>>> -Chris
>>
