答复: [V3 04/11] drm/amdgpu/virt: use kiq to access registers

Liu, Monk Monk.Liu at amd.com
Thu Jan 12 03:21:28 UTC 2017


Xiangliang

please BUG() when register access occured in RUNTIME and IRQ context, e.g.:

for register read:
if (amdgpu_sriov_runtime(adev)) {
   if (in_interrupt())
        BUG();
   else
       return amdgpu_virt_kiq_rreg(adev, reg, v);
}

and also for register write, with above addressed, Reviewed-by: Monk Liu <monk.liu at amd.com>


________________________________
发件人: Liu, Monk
发送时间: 2017年1月12日 11:19:40
收件人: Christian König; Yu, Xiangliang; amd-gfx at lists.freedesktop.org
主题: 答复: [V3 04/11] drm/amdgpu/virt: use kiq to access registers


Xiangliang


please BUG() when register access occured in RUNTIME and IRQ context, e.g.:


if (amdgpu_sriov_runtime(adev)) {


}
  return amdgpu_virt_kiq_wreg(adev, reg, v);


with above addressed, Reviewed-by: Monk Liu <monk.liu at amd.com>

________________________________
发件人: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> 代表 Liu, Monk <Monk.Liu at amd.com>
发送时间: 2017年1月11日 22:54:52
收件人: Christian König; Yu, Xiangliang; amd-gfx at lists.freedesktop.org
主题: RE: [V3 04/11] drm/amdgpu/virt: use kiq to access registers

We should BUG and not access register at all if found during RUNTIME && IRQ_context
By far.

BR Monk
-----Original Message-----
From: Christian König [mailto:deathsimple at vodafone.de]
Sent: Wednesday, January 11, 2017 10:44 PM
To: Liu, Monk <Monk.Liu at amd.com>; Yu, Xiangliang <Xiangliang.Yu at amd.com>; amd-gfx at lists.freedesktop.org
Subject: Re: [V3 04/11] drm/amdgpu/virt: use kiq to access registers

> do you think that's okay ?
Fine with me, but I'm not sure if the hypervisor doesn't gets a hickup when a client writes to some registers from interrupt context.

Christian.

Am 11.01.2017 um 15:38 schrieb Liu, Monk:
> Yeah, make sense
>
> I think before we have a full version of KIQ reg accessing without context switch, we should totally disable KIQ reg read/write in IRQ context, do you think that's okay ?
>
> BR Monk
>
> -----Original Message-----
> From: Christian König [mailto:deathsimple at vodafone.de]
> Sent: Wednesday, January 11, 2017 10:16 PM
> To: Liu, Monk <Monk.Liu at amd.com>; Yu, Xiangliang
> <Xiangliang.Yu at amd.com>; amd-gfx at lists.freedesktop.org
> Subject: Re: [V3 04/11] drm/amdgpu/virt: use kiq to access registers
>
>> Because if we are in interrupt , we are forbid to do schedule, and use kiq to read register will invoke fence_wait() ...
> That won't work. Locking a mutext like it is done in the write path can cause scheduling as well.
>
> If we need to push writes through the KIQ in interrupt context we need to change the code to use a spinlock with disabled interrupts instead of a mutex.
>
> Otherwise the whole thing can easily deadlock when an interrupt happens to come while the lock is taken.
>
> Please also test the patchset with lockdep enabled, that should yield a bunch of error messages when you try to do something like this.
>
> Regards,
> Christian.
>
> Am 11.01.2017 um 14:51 schrieb Liu, Monk:
>> Because if we are in interrupt , we are forbid to do schedule, and use kiq to read register will invoke fence_wait() ...
>>
>> If you think that's odds, we can use BUG_ON() to stop driver running if we need read registers in SRIOV case.
>>
>> And to fully support register reading, we need to implement a new
>> method to use KIQ access register without any schedule() chance, but
>> that's overhead for current stage, and we don't observe registers
>> reading in IRQ context now on 3D apps. (kfd observed such case, and
>> @Shaoyun dispatches a kernel thread/tasklet in IRQ to do the register
>> dumping, instead of directly reading them via KIQ )
>>
>> BR Monk
>>
>> -----Original Message-----
>> From: Christian König [mailto:deathsimple at vodafone.de]
>> Sent: Wednesday, January 11, 2017 9:38 PM
>> To: Yu, Xiangliang <Xiangliang.Yu at amd.com>;
>> amd-gfx at lists.freedesktop.org
>> Cc: Liu, Monk <Monk.Liu at amd.com>
>> Subject: Re: [V3 04/11] drm/amdgpu/virt: use kiq to access registers
>>
>> Am 11.01.2017 um 14:18 schrieb Xiangliang Yu:
>>> For virtualization, it is must for driver to use KIQ to access
>>> registers when it is out of GPU full access mode.
>>>
>>> Signed-off-by: Xiangliang Yu <Xiangliang.Yu at amd.com>
>>> Signed-off-by: Monk Liu <Monk.Liu at amd.com>
>>> ---
>>>     drivers/gpu/drm/amd/amdgpu/Makefile        |  2 +-
>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  6 +++
>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   | 82 ++++++++++++++++++++++++++++++
>>>     drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h   |  5 ++
>>>     drivers/gpu/drm/amd/amdgpu/vi.c            |  3 ++
>>>     5 files changed, 97 insertions(+), 1 deletion(-)
>>>     create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile
>>> b/drivers/gpu/drm/amd/amdgpu/Makefile
>>> index 4185b03..0b8e470 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/Makefile
>>> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
>>> @@ -30,7 +30,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
>>>      atombios_encoders.o amdgpu_sa.o atombios_i2c.o \
>>>      amdgpu_prime.o amdgpu_vm.o amdgpu_ib.o amdgpu_pll.o \
>>>      amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \
>>> -   amdgpu_gtt_mgr.o amdgpu_vram_mgr.o
>>> +   amdgpu_gtt_mgr.o amdgpu_vram_mgr.o amdgpu_virt.o
>>>
>>>     # add asic specific block
>>>     amdgpu-$(CONFIG_DRM_AMDGPU_CIK)+= cik.o cik_ih.o kv_smc.o
>>> kv_dpm.o \ diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> index f82919d..9a2fd3e 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>> @@ -95,6 +95,9 @@ uint32_t amdgpu_mm_rreg(struct amdgpu_device *adev, uint32_t reg,
>>>     {
>>>      uint32_t ret;
>>>
>>> +   if (amdgpu_sriov_runtime(adev) && !in_interrupt())
>>> +           return amdgpu_virt_kiq_rreg(adev, reg);
>>> +
>> Mhm, why is the in_interrupt() necessary here?
>>
>>>      if ((reg * 4) < adev->rmmio_size && !always_indirect)
>>>              ret = readl(((void __iomem *)adev->rmmio) + (reg * 4));
>>>      else {
>>> @@ -114,6 +117,9 @@ void amdgpu_mm_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v,
>>>     {
>>>      trace_amdgpu_mm_wreg(adev->pdev->device, reg, v);
>>>
>>> +   if (amdgpu_sriov_runtime(adev))
>>> +           return amdgpu_virt_kiq_wreg(adev, reg, v);
>>> +
>> And why it isn't necessary here? What happens when we write a register from an interrupt routine?
>>
>> Christian.
>>
>>>      if ((reg * 4) < adev->rmmio_size && !always_indirect)
>>>              writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));
>>>      else {
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>>> new file mode 100644
>>> index 0000000..7861b6b
>>> --- /dev/null
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>>> @@ -0,0 +1,82 @@
>>> +/*
>>> + * Copyright 2017 Advanced Micro Devices, Inc.
>>> + *
>>> + * Permission is hereby granted, free of charge, to any person
>>> +obtaining a
>>> + * copy of this software and associated documentation files (the
>>> +"Software"),
>>> + * to deal in the Software without restriction, including without
>>> +limitation
>>> + * the rights to use, copy, modify, merge, publish, distribute,
>>> +sublicense,
>>> + * and/or sell copies of the Software, and to permit persons to
>>> +whom the
>>> + * Software is furnished to do so, subject to the following conditions:
>>> + *
>>> + * The above copyright notice and this permission notice shall be
>>> +included in
>>> + * all copies or substantial portions of the Software.
>>> + *
>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
>>> +EXPRESS OR
>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>>> +MERCHANTABILITY,
>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO
>>> +EVENT SHALL
>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM,
>>> +DAMAGES OR
>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
>>> +OTHERWISE,
>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
>>> +USE OR
>>> + * OTHER DEALINGS IN THE SOFTWARE.
>>> + */
>>> +
>>> +#include "amdgpu.h"
>>> +#include "amdgpu_virt.h"
>>> +
>>> +void amdgpu_virt_init_setting(struct amdgpu_device *adev) {
>>> +   mutex_init(&adev->virt.lock);
>>> +}
>>> +
>>> +uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t
>>> +reg) {
>>> +   signed long r;
>>> +   uint32_t val;
>>> +   struct fence *f;
>>> +   struct amdgpu_kiq *kiq = &adev->gfx.kiq;
>>> +   struct amdgpu_ring *ring = &kiq->ring;
>>> +
>>> +   BUG_ON(!ring->funcs->emit_rreg);
>>> +
>>> +   mutex_lock(&adev->virt.lock);
>>> +   amdgpu_ring_alloc(ring, 32);
>>> +   amdgpu_ring_emit_hdp_flush(ring);
>>> +   amdgpu_ring_emit_rreg(ring, reg);
>>> +   amdgpu_ring_emit_hdp_invalidate(ring);
>>> +   amdgpu_fence_emit(ring, &f);
>>> +   amdgpu_ring_commit(ring);
>>> +   mutex_unlock(&adev->virt.lock);
>>> +
>>> +   r = fence_wait(f, false);
>>> +   if (r)
>>> +           DRM_ERROR("wait for kiq fence error: %ld.\n", r);
>>> +   fence_put(f);
>>> +
>>> +   val = adev->wb.wb[adev->virt.reg_val_offs];
>>> +
>>> +   return val;
>>> +}
>>> +
>>> +void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg,
>>> +uint32_t v) {
>>> +   signed long r;
>>> +   struct fence *f;
>>> +   struct amdgpu_kiq *kiq = &adev->gfx.kiq;
>>> +   struct amdgpu_ring *ring = &kiq->ring;
>>> +
>>> +   BUG_ON(!ring->funcs->emit_wreg);
>>> +
>>> +   mutex_lock(&adev->virt.lock);
>>> +   amdgpu_ring_alloc(ring, 32);
>>> +   amdgpu_ring_emit_hdp_flush(ring);
>>> +   amdgpu_ring_emit_wreg(ring, reg, v);
>>> +   amdgpu_ring_emit_hdp_invalidate(ring);
>>> +   amdgpu_fence_emit(ring, &f);
>>> +   amdgpu_ring_commit(ring);
>>> +   mutex_unlock(&adev->virt.lock);
>>> +
>>> +   r = fence_wait(f, false);
>>> +   if (r)
>>> +           DRM_ERROR("wait for kiq fence error: %ld.\n", r);
>>> +   fence_put(f);
>>> +}
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
>>> index 3d38a9f..2869980 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
>>> @@ -33,6 +33,7 @@
>>>     struct amdgpu_virt {
>>>      uint32_t                caps;
>>>      uint32_t                reg_val_offs;
>>> +   struct mutex            lock;
>>>     };
>>>
>>>     #define amdgpu_sriov_enabled(adev) \ @@ -59,4 +60,8 @@ static
>>> inline bool is_virtual_machine(void)
>>>     #endif
>>>     }
>>>
>>> +void amdgpu_virt_init_setting(struct amdgpu_device *adev); uint32_t
>>> +amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg);
>>> +void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg,
>>> +uint32_t v);
>>> +
>>>     #endif
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c
>>> b/drivers/gpu/drm/amd/amdgpu/vi.c index 7350a8f..dc0d4fa 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/vi.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/vi.c
>>> @@ -892,6 +892,9 @@ static int vi_common_early_init(void *handle)
>>>              (amdgpu_ip_block_mask & (1 << AMD_IP_BLOCK_TYPE_SMC)))
>>>              smc_enabled = true;
>>>
>>> +   if (amdgpu_sriov_vf(adev))
>>> +           amdgpu_virt_init_setting(adev);
>>> +
>>>      adev->rev_id = vi_get_rev_id(adev);
>>>      adev->external_rev_id = 0xFF;
>>>      switch (adev->asic_type) {
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx


_______________________________________________
amd-gfx mailing list
amd-gfx at lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/amd-gfx/attachments/20170112/abc2dc6e/attachment-0001.html>


More information about the amd-gfx mailing list