Re: Reply: [V2 04/11] drm/amdgpu/virt: use kiq to access registers

Christian König deathsimple at vodafone.de
Wed Jan 11 11:48:48 UTC 2017


Ah, that issue again. Yeah, that was fixed in the meantime. Please remove
the extra fence_get()/fence_put() for upstreaming.

If an old branch still has that issue, we should backport the proper
fixes and not work around it like this.
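
As a rough illustration of the reference counting discussed in the quoted
message below, here is a small userspace model in C. It is only a sketch,
not kernel code: struct model_fence, model_fence_init/get/put() and
refs_at_wait() are hypothetical names made up for this example, and the
behaviour of the 4.3 and staging-4.7 branches is taken from the description
in this thread rather than from the sources.

```c
/* Userspace model of the fence refcount; NOT kernel code. */
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kref embedded in struct fence. */
struct model_fence {
	int refcount;
};

void model_fence_init(struct model_fence *f)
{
	f->refcount = 1;		/* fence_init() starts the count at 1 */
}

void model_fence_get(struct model_fence *f)
{
	assert(f->refcount > 0);	/* taking a ref on a freed fence is a bug */
	f->refcount++;
}

void model_fence_put(struct model_fence *f)
{
	assert(f->refcount > 0);	/* dropping a ref on a freed fence is a bug */
	f->refcount--;			/* 0 means the fence would be freed */
}

/*
 * Emit a fence, let it signal before the caller waits, and return the
 * refcount seen when the caller reaches the wait (0 == already freed).
 *
 * ring_holds_ref: emit also stores a reference in the ring's fence slot
 *                 (staging-4.7 as described in this thread).
 * extra_get:      the caller takes the extra reference under discussion.
 */
int refs_at_wait(bool ring_holds_ref, bool extra_get)
{
	struct model_fence f;

	model_fence_init(&f);
	if (ring_holds_ref)
		model_fence_get(&f);	/* reference kept in the ring slot */
	if (extra_get)
		model_fence_get(&f);	/* the extra grab in the patch */

	model_fence_put(&f);		/* fence processing drops one ref on signal */
	return f.refcount;
}
```

With the ring holding its own reference (the staging-4.7 case),
refs_at_wait(true, false) leaves one reference for the caller's single
fence_put(). Without it (the amd-sriov-4.3 case), refs_at_wait(false, false)
reaches zero before the wait, which is the situation the extra grab/release
worked around.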

Thanks for looking into it,
Christian.

On 11.01.2017 at 09:30, Liu, Monk wrote:
>
> Hi Christian
>
>
> Latest update on the fence grab/release:
>
>
> I now recall why I added the grab/release pair:
>
>
> I checked the history. I first added the pair because they were first
> introduced/implemented in amd-sriov-4.3, and that branch's fence
> implementation differs considerably from the current one (the 4.3
> fence does not use RCU or kmem_cache).
>
>
> In the amd-sriov-4.3 branch, without this grab/release pair, I found
> that a kernel page fault is triggered if the fence signals before
> fence_wait() is reached, because amdgpu_fence_process() puts the
> fence, which releases it before the later fence_wait().
>
>
> For the staging-4.7 code, however, I did not find such a page fault
> after removing the grab/release pair. The reason may be that
> amdgpu_fence_emit() now increases the kref of the fence to 2 (in the
> 4.3 kernel we only increase it to 1):
>
>
> The increase to 1 comes from fence_init(), and the increase to 2 from
> rcu_assign_pointer(*ptr, fence_get(&fence->base)).
>
>
> For this reason the fence's kref is still 1 after it signals. I agree
> with you that we should remove the grab/release pair and leave only
> one fence_put() in the KIQ register access routines.
>
>
> I had not checked the code in detail previously, thanks!
>
>
> BR Monk
>
>
> ------------------------------------------------------------------------
> *From:* amd-gfx <amd-gfx-bounces at lists.freedesktop.org> on behalf of Liu, Monk
> <Monk.Liu at amd.com>
> *Sent:* January 11, 2017 11:06:34
> *To:* Christian König; Yu, Xiangliang; amd-gfx at lists.freedesktop.org
> *Subject:* Reply: [V2 04/11] drm/amdgpu/virt: use kiq to access registers
> > +     fence_get(f);
> > +     amdgpu_ring_commit(ring);
> > +     mutex_unlock(&adev->virt.lock);
> > +
> > +     r = fence_wait(f, false);
> > +     fence_put(f);
>
> Why do you grab and release an extra fence reference here?
>
> Christian.
>
> [ML] E.g., without the grab/release pair, if the fence signals right
> after ring_commit(), then "amdgpu_fence_process" will be invoked by
> the interrupt and will call fence_put() on this fence, so the fence is
> no longer valid and every subsequent fence_wait() on it will trigger a
> page fault ...
>
> BR Monk
>
>
>
> -----Original Message-----
> From: amd-gfx [mailto:amd-gfx-bounces at lists.freedesktop.org] On Behalf Of
> Christian König
> Sent: Tuesday, January 10, 2017 9:09 PM
> To: Yu, Xiangliang; amd-gfx at lists.freedesktop.org
> Subject: Re: [V2 04/11] drm/amdgpu/virt: use kiq to access registers
>
> On 10.01.2017 at 11:00, Xiangliang Yu wrote:
> > For virtualization, the driver must use KIQ to access registers
> > when it is outside GPU full access mode.
> >
> > Signed-off-by: Xiangliang Yu <Xiangliang.Yu at amd.com>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/Makefile        |  2 +-
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  6 +++
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   | 86 ++++++++++++++++++++++++++++++
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h   |  5 ++
> >   drivers/gpu/drm/amd/amdgpu/vi.c            |  3 ++
> >   5 files changed, 101 insertions(+), 1 deletion(-)
> >   create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile
> > index 4185b03..0b8e470 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/Makefile
> > +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
> > @@ -30,7 +30,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
> >        atombios_encoders.o amdgpu_sa.o atombios_i2c.o \
> >        amdgpu_prime.o amdgpu_vm.o amdgpu_ib.o amdgpu_pll.o \
> >        amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \
> > -     amdgpu_gtt_mgr.o amdgpu_vram_mgr.o
> > +     amdgpu_gtt_mgr.o amdgpu_vram_mgr.o amdgpu_virt.o
> >
> >   # add asic specific block
> >   amdgpu-$(CONFIG_DRM_AMDGPU_CIK)+= cik.o cik_ih.o kv_smc.o kv_dpm.o \
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index f82919d..9a2fd3e 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -95,6 +95,9 @@ uint32_t amdgpu_mm_rreg(struct amdgpu_device *adev, uint32_t reg,
> >   {
> >        uint32_t ret;
> >
> > +     if (amdgpu_sriov_runtime(adev) && !in_interrupt())
> > +             return amdgpu_virt_kiq_rreg(adev, reg);
> > +
> >        if ((reg * 4) < adev->rmmio_size && !always_indirect)
> >                ret = readl(((void __iomem *)adev->rmmio) + (reg * 4));
> >        else {
> > @@ -114,6 +117,9 @@ void amdgpu_mm_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v,
> >   {
> >        trace_amdgpu_mm_wreg(adev->pdev->device, reg, v);
> >
> > +     if (amdgpu_sriov_runtime(adev))
> > +             return amdgpu_virt_kiq_wreg(adev, reg, v);
> > +
> >        if ((reg * 4) < adev->rmmio_size && !always_indirect)
> >                writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));
> >        else {
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> > new file mode 100644
> > index 0000000..6520a4e
> > --- /dev/null
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
> > @@ -0,0 +1,86 @@
> > +/*
> > + * Copyright 2017 Advanced Micro Devices, Inc.
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a
> > + * copy of this software and associated documentation files (the "Software"),
> > + * to deal in the Software without restriction, including without limitation
> > + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> > + * and/or sell copies of the Software, and to permit persons to whom the
> > + * Software is furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice shall be included in
> > + * all copies or substantial portions of the Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> > + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
> > + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> > + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> > + * OTHER DEALINGS IN THE SOFTWARE.
> > + */
> > +
> > +#include "amdgpu.h"
> > +#include "amdgpu_virt.h"
> > +
> > +void amdgpu_virt_init_setting(struct amdgpu_device *adev)
> > +{
> > +     mutex_init(&adev->virt.lock);
> > +}
> > +
> > +uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg)
> > +{
> > +     signed long r;
> > +     uint32_t val;
> > +     struct fence *f;
> > +     struct amdgpu_kiq *kiq = &adev->gfx.kiq;
> > +     struct amdgpu_ring *ring = &kiq->ring;
> > +
> > +     BUG_ON(!ring->funcs->emit_rreg);
> > +
> > +     mutex_lock(&adev->virt.lock);
> > +     amdgpu_ring_alloc(ring, 32);
> > +     amdgpu_ring_emit_hdp_flush(ring);
> > +     amdgpu_ring_emit_rreg(ring, reg);
> > +     amdgpu_ring_emit_hdp_invalidate(ring);
> > +     amdgpu_fence_emit(ring, &f);
> > +     fence_get(f);
> > +     amdgpu_ring_commit(ring);
> > +     mutex_unlock(&adev->virt.lock);
> > +
> > +     r = fence_wait(f, false);
> > +     fence_put(f);
>
> Why do you grab and release an extra fence reference here?
>
> Christian.
>
> > +     if (r)
> > +             DRM_ERROR("wait for kiq fence error: %ld.\n", r);
> > +     fence_put(f);
> > +
> > +     val = adev->wb.wb[adev->virt.val_offs];
> > +
> > +     return val;
> > +}
> > +
> > +void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v)
> > +{
> > +     signed long r;
> > +     struct fence *f;
> > +     struct amdgpu_kiq *kiq = &adev->gfx.kiq;
> > +     struct amdgpu_ring *ring = &kiq->ring;
> > +
> > +     BUG_ON(!ring->funcs->emit_wreg);
> > +
> > +     mutex_lock(&adev->virt.lock);
> > +     amdgpu_ring_alloc(ring, 32);
> > +     amdgpu_ring_emit_hdp_flush(ring);
> > +     amdgpu_ring_emit_wreg(ring, reg, v);
> > +     amdgpu_ring_emit_hdp_invalidate(ring);
> > +     amdgpu_fence_emit(ring, &f);
> > +     fence_get(f);
> > +     amdgpu_ring_commit(ring);
> > +     mutex_unlock(&adev->virt.lock);
> > +
> > +     r = fence_wait(f, false);
> > +     fence_put(f);
> > +     if (r)
> > +             DRM_ERROR("wait for kiq fence error: %ld.\n", r);
> > +     fence_put(f);
> > +}
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
> > index 79619b6..24f0590 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
> > @@ -33,6 +33,7 @@
> >   struct amdgpu_virt {
> >        uint32_t                caps;
> >        uint32_t                val_offs;
> > +     struct mutex            lock;
> >   };
> >
> >   #define amdgpu_sriov_enabled(adev) \
> > @@ -59,4 +60,8 @@ static inline bool is_virtual_machine(void)
> >   #endif
> >   }
> >
> > +void amdgpu_virt_init_setting(struct amdgpu_device *adev);
> > +uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg);
> > +void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v);
> > +
> >   #endif
> > diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c
> > index 7350a8f..dc0d4fa 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/vi.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/vi.c
> > @@ -892,6 +892,9 @@ static int vi_common_early_init(void *handle)
> >                (amdgpu_ip_block_mask & (1 << AMD_IP_BLOCK_TYPE_SMC)))
> >                smc_enabled = true;
> >
> > +     if (amdgpu_sriov_vf(adev))
> > +             amdgpu_virt_init_setting(adev);
> > +
> >        adev->rev_id = vi_get_rev_id(adev);
> >        adev->external_rev_id = 0xFF;
> >        switch (adev->asic_type) {
>
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

