reservation questions
Christian König
christian.koenig at amd.com
Tue Mar 6 11:11:50 UTC 2018
Hi Monk,
your check isn't correct because you still haven't understood the
semantics here.
> the assumption that all shared fences should be signaled before adding
> excl fence looks not 100% guaranteed in LKG,
The semantic is NOT that all shared fences are signaled when the
exclusive fence is added.
Instead the requirement is that the exclusive fence signals after all
shared fences signaled. In other words that is an asynchronous handling
here.
I honestly don't know how else to explain it.
Regards,
Christian.
Am 06.03.2018 um 12:03 schrieb Liu, Monk:
>
> Hi Christian
>
>
> I use blow patch to capture the incorrect case :
>
>
> @@ -267,12 +267,21 @@ void reservation_object_add_excl_fence(struct
> reservation_object *obj,
> write_seqcount_end(&obj->seq);
> preempt_enable();
> - /* inplace update, no shared fences */
> - while (i--)
> - dma_fence_put(rcu_dereference_protected(old->shared[i],
> - reservation_object_held(obj)));
> + /* inplace update, no shared fences continue after all shared
> signaled */
> + while (i--) {
> + struct dma_fence *f =
> rcu_dereference_protected(old->shared[i],
> + reservation_object_held(obj));
> + if (!dma_fence_is_signaled(f))
> + BUG();
> +
> + dma_fence_put(f);
> + /* better assign shared[i] with NULL for sure */
> + rcu_assign_pointer(old->shared[i], NULL);
> + }
> dma_fence_put(old_fence);
> +
> +
> }
> EXPORT_SYMBOL(reservation_object_add_excl_fence);
>
> and I hit this BUG() during test:
>
> [ 105.244816] [drm] Initialized amdgpu 3.24.0 20150101 for
> 0000:00:08.0 on minor 0
> [ 105.623332] ------------[ cut here ]------------
> [ 105.623335] kernel BUG at drivers/dma-buf/reservation.c:275!
> [ 105.624470] invalid opcode: 0000 [#1] SMP
> [ 105.624915] Modules linked in: amdgpu chash gpu_sched ttm
> drm_kms_helper drm i2c_algo_bit fb_sys_fops syscopyarea sysfillrect
> sysimgblt snd_hda_codec_generic snd_hda_intel snd_hda_codec
> snd_hda_core snd_hwdep crct10dif_pclmul crc32_pclmul snd_pcm
> ghash_clmulni_intel pcbc snd_seq_midi snd_seq_midi_event snd_rawmidi
> aesni_intel aes_x86_64 crypto_simd glue_helper cryptd snd_seq
> snd_seq_device snd_timer serio_raw snd soundcore i2c_piix4 mac_hid
> parport_pc ppdev lp parport autofs4 8139too psmouse 8139cp mii floppy
> pata_acpi
> [ 105.630547] CPU: 3 PID: 1216 Comm: 3dmark Not tainted 4.13.0-debug #1
> [ 105.631762] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
> [ 105.633528] task: ffff8f8a6a165a00 task.stack: ffffb1204159c000
> [ 105.634676] RIP: 0010:reservation_object_add_excl_fence+0x9c/0xf0
> [ 105.635824] RSP: 0018:ffffb1204159f9f0 EFLAGS: 00010246
> [ 105.636805] RAX: 0000000000000000 RBX: ffff8f8a64bee760 RCX:
> ffff8f8a6bfa2f50
> [ 105.638123] RDX: ffff8f8a6bfa6770 RSI: ffff8f8a64bee660 RDI:
> ffff8f8a6635f628
> [ 105.639440] RBP: ffffb1204159fa18 R08: 0000000000000000 R09:
> 0000000000000001
> [ 105.640702] R10: ffffb1204159f808 R11: 0000000000000003 R12:
> 0000000000000000
> [ 105.641947] R13: 0000000000000000 R14: ffff8f8a6d0f0200 R15:
> ffff8f8a64beee60
> [ 105.643165] FS: 00007fd13c73d940(0000) GS:ffff8f8a76d80000(0000)
> knlGS:0000000000000000
> [ 105.644573] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 105.646482] CR2: 00007fd13c6fd000 CR3: 00000001a2a58000 CR4:
> 00000000001406e0
> [ 105.648467] Call Trace:
> [ 105.652480] amdgpu_bo_do_create+0x3a1/0x540 [amdgpu]
> [ 105.654233] amdgpu_bo_create+0x3a/0x220 [amdgpu]
> [ 105.655956] amdgpu_vm_alloc_levels.isra.14+0x1dc/0x370 [amdgpu]
> [ 105.657641] amdgpu_vm_alloc_pts+0x49/0x70 [amdgpu]
> [ 105.659155] amdgpu_gem_va_ioctl+0x365/0x520 [amdgpu]
> [ 105.660698] ? amdgpu_gem_create_ioctl+0x19a/0x280 [amdgpu]
> [ 105.662515] ? amdgpu_gem_metadata_ioctl+0x1c0/0x1c0 [amdgpu]
> [ 105.664203] drm_ioctl_kernel+0x69/0xb0 [drm]
> [ 105.665491] ? drm_ioctl_kernel+0x69/0xb0 [drm]
> [ 105.666959] drm_ioctl+0x2d2/0x390 [drm]
> [ 105.668373] ? amdgpu_gem_metadata_ioctl+0x1c0/0x1c0 [amdgpu]
> [ 105.670056] ? call_rcu_sched+0x1d/0x20
> [ 105.671516] ? put_object+0x26/0x30
> [ 105.672741] ? __delete_object+0x39/0x50
> [ 105.674048] amdgpu_drm_ioctl+0x4c/0x80 [amdgpu]
> [ 105.675551] do_vfs_ioctl+0x92/0x5a0
> [ 105.676874] ? kvm_sched_clock_read+0x1e/0x30
> [ 105.678276] ? sched_clock+0x9/0x10
> [ 105.679553] ? get_vtime_delta+0x99/0xc0
> [ 105.681007] SyS_ioctl+0x79/0x90
> [ 105.684574] do_syscall_64+0x6e/0x150
> [ 105.685910] entry_SYSCALL64_slow_path+0x25/0x25
> [ 105.687354] RIP: 0033:0x7fd13b25ff47
> [ 105.688666] RSP: 002b:00007fff5422b2c8 EFLAGS: 00000202 ORIG_RAX:
> 0000000000000010
> [ 105.691268] RAX: ffffffffffffffda RBX: 0000000001886130 RCX:
> 00007fd13b25ff47
> [ 105.693148] RDX: 00007fff5422b390 RSI: 00000000c0286448 RDI:
> 0000000000000007
> [ 105.695003] RBP: 00007fff5422b300 R08: 0000000300000000 R09:
> 000000000000000e
> [ 105.696774] R10: 0000000001887c28 R11: 0000000000000202 R12:
> 000000000188a430
> [ 105.698459] R13: 0000000001886130 R14: 00007fff5422b638 R15:
> 0000000000000000
> [ 105.700168] Code: 74 3f 41 89 c4 45 89 e5 4b 8b 5c ee 18 48 8b 43
> 48 a8 01 75 cc 48 8b 43 08 48 8b 40 18 48 85 c0 74 09 48 89 df ff d0
> 84 c0 75 0c <0f> 0b 48 89 df e8 2a ed ff ff eb b4 48 89 df e8 80 ef ff
> ff eb
> [ 105.704982] RIP: reservation_object_add_excl_fence+0x9c/0xf0 RSP:
> ffffb1204159f9f0
>
>
> the assumption that all shared fences should be signaled before adding
> excl fence looks not 100% guaranteed in LKG,
>
> Going to take a deep look ...
>
>
> /Monk
>
>
>
> ------------------------------------------------------------------------
> *From:* Liu, Monk
> *Sent:* Tuesday, March 6, 2018 6:47 PM
> *To:* Koenig, Christian; Chris Wilson; dri-devel at lists.freedesktop.org
> *Subject:* Re: reservation questions
>
> ok, that's good point ...
>
> ------------------------------------------------------------------------
> *From:* Koenig, Christian
> *Sent:* Tuesday, March 6, 2018 6:42:44 PM
> *To:* Liu, Monk; Chris Wilson; dri-devel at lists.freedesktop.org
> *Subject:* Re: reservation questions
> Hi Monk,
>
> that is to remove the problem that allocating memory could fail.
>
> E.g. we only add the fence after sending the command to the hardware,
> so there is now way back and we need to add the fence or break memory
> management.
>
> So reservation_object_reserve_shared() makes sure there is a free
> fence slot *before* we start to prepare things for the hardware.
>
> Regards,
> Christian.
>
> Am 06.03.2018 um 11:19 schrieb Liu, Monk:
>>
>> Hi Chris
>>
>>
>> another question is why we not just call
>> "reservation_object_reserve_shared"
>>
>> during below add_shared_fence function, so the BUG_ON() could be
>> avoided and caller won't need
>>
>> to worry when and how much time it should call reserve_shared() ?
>>
>> thanks !
>>
>>
>> void reservation_object_add_shared_fence(struct reservation_object *obj,
>> struct dma_fence *fence)
>> {
>> struct reservation_object_list *old, *fobj = obj->staged;
>> old = reservation_object_get_list(obj);
>> obj->staged = NULL;
>> if (!fobj) {
>> BUG_ON(old->shared_count >= old->shared_max);
>> reservation_object_add_shared_inplace(obj, old, fence);
>> } else
>> reservation_object_add_shared_replace(obj, old, fobj, fence);
>> }
>> EXPORT_SYMBOL(reservation_object_add_shared_fence);
>> ------------------------------------------------------------------------
>> *From:* Chris Wilson <chris at chris-wilson.co.uk>
>> <mailto:chris at chris-wilson.co.uk>
>> *Sent:* Tuesday, March 6, 2018 6:10:21 PM
>> *To:* Liu, Monk; dri-devel at lists.freedesktop.org
>> <mailto:dri-devel at lists.freedesktop.org>; Koenig, Christian
>> *Subject:* Re: reservation questions
>> Quoting Liu, Monk (2018-03-06 09:45:19)
>> > call reservation_object_add_excl_fence,
>> > it set obj->fence->shared_count to 0, and put all shared fence from
>> obj->fence
>> > without waiting signaling.
>> > (this action looks inappropriate, I think at least before put all
>> those shared
>> > fences
>> > we should dma_wait_fence() on them to make sure they are signaled)
>>
>> No. Serialisation of resv updates are handled by the caller, the fences
>> are ordered asynchronously so the wait is implicit in the construction.
>> (I.e. before the excl fence can be signaled, all of the earlier shared
>> fences must be signaled. You can even say before the operation that the
>> excl fence signals completion of can begin, all the shared fences must
>> have been signaled. But that is all implicit so that we can do it
>> asynchronously.)
>>
>> > call reservation_object_reserve_shared,
>> > this time obj->staged isn't NULL, and it is freed (nothing bad now
>> > since obj->fence points to other place),
>> > and obj->staged set to NULL,
>> >
>> > call reservation_object_add_shared_fence,
>> > this time should going through reservation_object_add_shared_inplace,
>> > But BUG_ON(old->shared_count >= old->shared_max) will hit !
>>
>> How? You only free staged iff shared_count < shared_max.
>>
>> You've reminded me that we should cover all this with a bunch of
>> selftests.
>> -Chris
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20180306/0c891950/attachment-0001.html>
More information about the dri-devel
mailing list