Re: [V2 04/11] drm/amdgpu/virt: use kiq to access registers

Liu, Monk Monk.Liu at amd.com
Wed Jan 11 08:30:40 UTC 2017


Hi Christian


Here is the latest update on the fence grab/release:


I now recall why I added that grab/release pair:


I checked the history. I first added the pair because it was first introduced/implemented in the amd-sriov-4.3 branch, and that branch's fence implementation differs considerably from the current one (the 4.3 fence code doesn't use RCU or kmem_cache).


In the amd-sriov-4.3 branch, without this grab/release pair, I found that a kernel page fault is triggered if the fence signals quickly, before fence_wait() is reached, because amdgpu_fence_process() puts the fence and so the fence is released before the later fence_wait().


But for the staging-4.7 code, I didn't find such a page fault after removing the grab/release pair, and I noticed that the reason may be that we increase the fence's kref to 2 in amdgpu_fence_emit() (in the 4.3 kernel we only increase it to 1):


The increase to 1 is done by fence_init(), and the increase to 2 by rcu_assign_pointer(*ptr, fence_get(&fence->base)).

For this reason the fence's refcount is still 1 after it signals. I agree with you that we should remove the grab/release pair and leave only one fence_put() in the KIQ register access routines.
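
To make the counting concrete, here is a minimal user-space sketch of the two cases. It is a toy model, not the kernel fence API: toy_fence and every toy_* helper are hypothetical names, and the ring_takes_ref flag is an assumption that stands in for whether amdgpu_fence_emit() takes the second reference (as described for staging-4.7) or not (as described for amd-sriov-4.3). It only tracks the count and does not model RCU, interrupts, or actual freeing.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct fence: a reference count plus two flags. */
struct toy_fence {
	int refcount;
	bool signaled;
	bool released;
};

/* Models fence_init(): the creator holds the first reference. */
static void toy_fence_init(struct toy_fence *f)
{
	f->refcount = 1;
	f->signaled = false;
	f->released = false;
}

/* Models fence_get(): take one more reference. */
static struct toy_fence *toy_fence_get(struct toy_fence *f)
{
	f->refcount++;
	return f;
}

/* Models fence_put(): drop a reference, release on the last one. */
static void toy_fence_put(struct toy_fence *f)
{
	assert(f->refcount > 0);
	if (--f->refcount == 0)
		f->released = true; /* real code would free the fence here */
}

/*
 * Models amdgpu_fence_emit(). ring_takes_ref == true is the assumed
 * staging-4.7 behaviour (ring slot holds its own reference, count 2);
 * false is the assumed amd-sriov-4.3 behaviour (count stays at 1).
 */
static void toy_fence_emit(struct toy_fence *f, bool ring_takes_ref)
{
	toy_fence_init(f);
	if (ring_takes_ref)
		toy_fence_get(f);
}

/* Models amdgpu_fence_process(): signal, then drop one reference. */
static void toy_fence_process(struct toy_fence *f)
{
	f->signaled = true;
	toy_fence_put(f);
}

/*
 * The fence signals before the caller reaches its wait. Returns true
 * if the fence is still alive at that point, i.e. waiting on it
 * without an extra grab/release pair would be safe.
 */
bool toy_alive_at_wait(bool ring_takes_ref)
{
	struct toy_fence f;
	bool alive;

	toy_fence_emit(&f, ring_takes_ref);
	toy_fence_process(&f);	/* interrupt handler runs first */
	alive = !f.released;	/* state seen by the later wait */
	if (alive)
		toy_fence_put(&f); /* the single remaining put */
	return alive;
}
```

In this model toy_alive_at_wait(false) returns false (the count drops to 0 before the wait, the 4.3 page-fault case), while toy_alive_at_wait(true) returns true (the count is still 1 at the wait, and the single put releases it afterwards).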


I hadn't checked the code in detail previously, thanks!


BR Monk


________________________________
From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> on behalf of Liu, Monk <Monk.Liu at amd.com>
Sent: January 11, 2017 11:06:34
To: Christian König; Yu, Xiangliang; amd-gfx at lists.freedesktop.org
Subject: Re: [V2 04/11] drm/amdgpu/virt: use kiq to access registers

> +     fence_get(f);
> +     amdgpu_ring_commit(ring);
> +     mutex_unlock(&adev->virt.lock);
> +
> +     r = fence_wait(f, false);
> +     fence_put(f);

Why do you grab and release an extra fence reference here?

Christian.

[ML] E.g., without the grab/release pair, if the fence is signaled right after ring_commit(), then
"amdgpu_fence_process" will be invoked by the interrupt and will call fence_put() on this fence, so the
fence is no longer valid and any subsequent fence_wait() on it will trigger a page fault ...

BR Monk



-----Original Message-----
From: amd-gfx [mailto:amd-gfx-bounces at lists.freedesktop.org] On Behalf Of Christian König
Sent: Tuesday, January 10, 2017 9:09 PM
To: Yu, Xiangliang; amd-gfx at lists.freedesktop.org
Subject: Re: [V2 04/11] drm/amdgpu/virt: use kiq to access registers

On 10.01.2017 at 11:00, Xiangliang Yu wrote:
> For virtualization, the driver must use KIQ to access registers
> when it is out of GPU full access mode.
>
> Signed-off-by: Xiangliang Yu <Xiangliang.Yu at amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/Makefile        |  2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  6 +++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   | 86 ++++++++++++++++++++++++++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h   |  5 ++
>   drivers/gpu/drm/amd/amdgpu/vi.c            |  3 ++
>   5 files changed, 101 insertions(+), 1 deletion(-)
>   create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile
> b/drivers/gpu/drm/amd/amdgpu/Makefile
> index 4185b03..0b8e470 100644
> --- a/drivers/gpu/drm/amd/amdgpu/Makefile
> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
> @@ -30,7 +30,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
>        atombios_encoders.o amdgpu_sa.o atombios_i2c.o \
>        amdgpu_prime.o amdgpu_vm.o amdgpu_ib.o amdgpu_pll.o \
>        amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \
> -     amdgpu_gtt_mgr.o amdgpu_vram_mgr.o
> +     amdgpu_gtt_mgr.o amdgpu_vram_mgr.o amdgpu_virt.o
>
>   # add asic specific block
>   amdgpu-$(CONFIG_DRM_AMDGPU_CIK)+= cik.o cik_ih.o kv_smc.o kv_dpm.o \
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index f82919d..9a2fd3e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -95,6 +95,9 @@ uint32_t amdgpu_mm_rreg(struct amdgpu_device *adev, uint32_t reg,
>   {
>        uint32_t ret;
>
> +     if (amdgpu_sriov_runtime(adev) && !in_interrupt())
> +             return amdgpu_virt_kiq_rreg(adev, reg);
> +
>        if ((reg * 4) < adev->rmmio_size && !always_indirect)
>                ret = readl(((void __iomem *)adev->rmmio) + (reg * 4));
>        else {
> @@ -114,6 +117,9 @@ void amdgpu_mm_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v,
>   {
>        trace_amdgpu_mm_wreg(adev->pdev->device, reg, v);
>
> +     if (amdgpu_sriov_runtime(adev))
> +             return amdgpu_virt_kiq_wreg(adev, reg, v);
> +
>        if ((reg * 4) < adev->rmmio_size && !always_indirect)
>                writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));
>        else {
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> new file mode 100644
> index 0000000..6520a4e
> --- /dev/null
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> @@ -0,0 +1,86 @@
> +/*
> + * Copyright 2017 Advanced Micro Devices, Inc.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> + * OTHER DEALINGS IN THE SOFTWARE.
> + */
> +
> +#include "amdgpu.h"
> +#include "amdgpu_virt.h"
> +
> +void amdgpu_virt_init_setting(struct amdgpu_device *adev)
> +{
> +     mutex_init(&adev->virt.lock);
> +}
> +
> +uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)
> +{
> +     signed long r;
> +     uint32_t val;
> +     struct fence *f;
> +     struct amdgpu_kiq *kiq = &adev->gfx.kiq;
> +     struct amdgpu_ring *ring = &kiq->ring;
> +
> +     BUG_ON(!ring->funcs->emit_rreg);
> +
> +     mutex_lock(&adev->virt.lock);
> +     amdgpu_ring_alloc(ring, 32);
> +     amdgpu_ring_emit_hdp_flush(ring);
> +     amdgpu_ring_emit_rreg(ring, reg);
> +     amdgpu_ring_emit_hdp_invalidate(ring);
> +     amdgpu_fence_emit(ring, &f);
> +     fence_get(f);
> +     amdgpu_ring_commit(ring);
> +     mutex_unlock(&adev->virt.lock);
> +
> +     r = fence_wait(f, false);
> +     fence_put(f);

Why do you grab and release an extra fence reference here?

Christian.

> +     if (r)
> +             DRM_ERROR("wait for kiq fence error: %ld.\n", r);
> +     fence_put(f);
> +
> +     val = adev->wb.wb[adev->virt.val_offs];
> +
> +     return val;
> +}
> +
> +void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v)
> +{
> +     signed long r;
> +     struct fence *f;
> +     struct amdgpu_kiq *kiq = &adev->gfx.kiq;
> +     struct amdgpu_ring *ring = &kiq->ring;
> +
> +     BUG_ON(!ring->funcs->emit_wreg);
> +
> +     mutex_lock(&adev->virt.lock);
> +     amdgpu_ring_alloc(ring, 32);
> +     amdgpu_ring_emit_hdp_flush(ring);
> +     amdgpu_ring_emit_wreg(ring, reg, v);
> +     amdgpu_ring_emit_hdp_invalidate(ring);
> +     amdgpu_fence_emit(ring, &f);
> +     fence_get(f);
> +     amdgpu_ring_commit(ring);
> +     mutex_unlock(&adev->virt.lock);
> +
> +     r = fence_wait(f, false);
> +     fence_put(f);
> +     if (r)
> +             DRM_ERROR("wait for kiq fence error: %ld.\n", r);
> +     fence_put(f);
> +}
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
> index 79619b6..24f0590 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
> @@ -33,6 +33,7 @@
>   struct amdgpu_virt {
>        uint32_t                caps;
>        uint32_t                val_offs;
> +     struct mutex            lock;
>   };
>
>   #define amdgpu_sriov_enabled(adev) \
> @@ -59,4 +60,8 @@ static inline bool is_virtual_machine(void)
>   #endif
>   }
>
> +void amdgpu_virt_init_setting(struct amdgpu_device *adev);
> +uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg);
> +void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v);
> +
>   #endif
> diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c
> b/drivers/gpu/drm/amd/amdgpu/vi.c
> index 7350a8f..dc0d4fa 100644
> --- a/drivers/gpu/drm/amd/amdgpu/vi.c
> +++ b/drivers/gpu/drm/amd/amdgpu/vi.c
> @@ -892,6 +892,9 @@ static int vi_common_early_init(void *handle)
>                (amdgpu_ip_block_mask & (1 << AMD_IP_BLOCK_TYPE_SMC)))
>                smc_enabled = true;
>
> +     if (amdgpu_sriov_vf(adev))
> +             amdgpu_virt_init_setting(adev);
> +
>        adev->rev_id = vi_get_rev_id(adev);
>        adev->external_rev_id = 0xFF;
>        switch (adev->asic_type) {


_______________________________________________
amd-gfx mailing list
amd-gfx at lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx