[V3 04/11] drm/amdgpu/virt: use kiq to access registers

Christian König deathsimple at vodafone.de
Wed Jan 11 14:16:11 UTC 2017


> Because if we are in interrupt , we are forbid to do schedule, and use kiq to read register will invoke fence_wait() ...
That won't work. Locking a mutext like it is done in the write path can 
cause scheduling as well.

If we need to push writes through the KIQ in interrupt context we need 
to change the code to use a spinlock with disabled interrupts instead of 
a mutex.

Otherwise the whole thing can easily deadlock when an interrupt happens 
to come while the lock is taken.

Please also test the patchset with lockdep enabled, that should yield a 
bunch of error messages when you try to do something like this.

Regards,
Christian.

Am 11.01.2017 um 14:51 schrieb Liu, Monk:
> Because if we are in interrupt , we are forbid to do schedule, and use kiq to read register will invoke fence_wait() ...
>
> If you think that's odds, we can use BUG_ON() to stop driver running if we need read registers in SRIOV case.
>
> And to fully support register reading, we need to implement a new method to use KIQ access register without any schedule() chance, but that's overhead for current stage, and we don't observe registers reading in IRQ context now on 3D apps. (kfd observed such case, and @Shaoyun dispatches a kernel thread/tasklet in IRQ to do the register dumping, instead of directly reading them via KIQ )
>   
> BR Monk
>
> -----Original Message-----
> From: Christian König [mailto:deathsimple at vodafone.de]
> Sent: Wednesday, January 11, 2017 9:38 PM
> To: Yu, Xiangliang <Xiangliang.Yu at amd.com>; amd-gfx at lists.freedesktop.org
> Cc: Liu, Monk <Monk.Liu at amd.com>
> Subject: Re: [V3 04/11] drm/amdgpu/virt: use kiq to access registers
>
> Am 11.01.2017 um 14:18 schrieb Xiangliang Yu:
>> For virtualization, it is must for driver to use KIQ to access
>> registers when it is out of GPU full access mode.
>>
>> Signed-off-by: Xiangliang Yu <Xiangliang.Yu at amd.com>
>> Signed-off-by: Monk Liu <Monk.Liu at amd.com>
>> ---
>>    drivers/gpu/drm/amd/amdgpu/Makefile        |  2 +-
>>    drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  6 +++
>>    drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c   | 82 ++++++++++++++++++++++++++++++
>>    drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h   |  5 ++
>>    drivers/gpu/drm/amd/amdgpu/vi.c            |  3 ++
>>    5 files changed, 97 insertions(+), 1 deletion(-)
>>    create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile
>> b/drivers/gpu/drm/amd/amdgpu/Makefile
>> index 4185b03..0b8e470 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/Makefile
>> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile
>> @@ -30,7 +30,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \
>>    	atombios_encoders.o amdgpu_sa.o atombios_i2c.o \
>>    	amdgpu_prime.o amdgpu_vm.o amdgpu_ib.o amdgpu_pll.o \
>>    	amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \
>> -	amdgpu_gtt_mgr.o amdgpu_vram_mgr.o
>> +	amdgpu_gtt_mgr.o amdgpu_vram_mgr.o amdgpu_virt.o
>>    
>>    # add asic specific block
>>    amdgpu-$(CONFIG_DRM_AMDGPU_CIK)+= cik.o cik_ih.o kv_smc.o kv_dpm.o \
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> index f82919d..9a2fd3e 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>> @@ -95,6 +95,9 @@ uint32_t amdgpu_mm_rreg(struct amdgpu_device *adev, uint32_t reg,
>>    {
>>    	uint32_t ret;
>>    
>> +	if (amdgpu_sriov_runtime(adev) && !in_interrupt())
>> +		return amdgpu_virt_kiq_rreg(adev, reg);
>> +
> Mhm, why is the in_interrupt() necessary here?
>
>>    	if ((reg * 4) < adev->rmmio_size && !always_indirect)
>>    		ret = readl(((void __iomem *)adev->rmmio) + (reg * 4));
>>    	else {
>> @@ -114,6 +117,9 @@ void amdgpu_mm_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v,
>>    {
>>    	trace_amdgpu_mm_wreg(adev->pdev->device, reg, v);
>>    
>> +	if (amdgpu_sriov_runtime(adev))
>> +		return amdgpu_virt_kiq_wreg(adev, reg, v);
>> +
> And why it isn't necessary here? What happens when we write a register from an interrupt routine?
>
> Christian.
>
>>    	if ((reg * 4) < adev->rmmio_size && !always_indirect)
>>    		writel(v, ((void __iomem *)adev->rmmio) + (reg * 4));
>>    	else {
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>> new file mode 100644
>> index 0000000..7861b6b
>> --- /dev/null
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
>> @@ -0,0 +1,82 @@
>> +/*
>> + * Copyright 2017 Advanced Micro Devices, Inc.
>> + *
>> + * Permission is hereby granted, free of charge, to any person
>> +obtaining a
>> + * copy of this software and associated documentation files (the
>> +"Software"),
>> + * to deal in the Software without restriction, including without
>> +limitation
>> + * the rights to use, copy, modify, merge, publish, distribute,
>> +sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom
>> +the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice shall be
>> +included in
>> + * all copies or substantial portions of the Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
>> +EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
>> +MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
>> +SHALL
>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM,
>> +DAMAGES OR
>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
>> +OTHERWISE,
>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE
>> +OR
>> + * OTHER DEALINGS IN THE SOFTWARE.
>> + */
>> +
>> +#include "amdgpu.h"
>> +#include "amdgpu_virt.h"
>> +
>> +void amdgpu_virt_init_setting(struct amdgpu_device *adev) {
>> +	mutex_init(&adev->virt.lock);
>> +}
>> +
>> +uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t
>> +reg) {
>> +	signed long r;
>> +	uint32_t val;
>> +	struct fence *f;
>> +	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
>> +	struct amdgpu_ring *ring = &kiq->ring;
>> +
>> +	BUG_ON(!ring->funcs->emit_rreg);
>> +
>> +	mutex_lock(&adev->virt.lock);
>> +	amdgpu_ring_alloc(ring, 32);
>> +	amdgpu_ring_emit_hdp_flush(ring);
>> +	amdgpu_ring_emit_rreg(ring, reg);
>> +	amdgpu_ring_emit_hdp_invalidate(ring);
>> +	amdgpu_fence_emit(ring, &f);
>> +	amdgpu_ring_commit(ring);
>> +	mutex_unlock(&adev->virt.lock);
>> +
>> +	r = fence_wait(f, false);
>> +	if (r)
>> +		DRM_ERROR("wait for kiq fence error: %ld.\n", r);
>> +	fence_put(f);
>> +
>> +	val = adev->wb.wb[adev->virt.reg_val_offs];
>> +
>> +	return val;
>> +}
>> +
>> +void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg,
>> +uint32_t v) {
>> +	signed long r;
>> +	struct fence *f;
>> +	struct amdgpu_kiq *kiq = &adev->gfx.kiq;
>> +	struct amdgpu_ring *ring = &kiq->ring;
>> +
>> +	BUG_ON(!ring->funcs->emit_wreg);
>> +
>> +	mutex_lock(&adev->virt.lock);
>> +	amdgpu_ring_alloc(ring, 32);
>> +	amdgpu_ring_emit_hdp_flush(ring);
>> +	amdgpu_ring_emit_wreg(ring, reg, v);
>> +	amdgpu_ring_emit_hdp_invalidate(ring);
>> +	amdgpu_fence_emit(ring, &f);
>> +	amdgpu_ring_commit(ring);
>> +	mutex_unlock(&adev->virt.lock);
>> +
>> +	r = fence_wait(f, false);
>> +	if (r)
>> +		DRM_ERROR("wait for kiq fence error: %ld.\n", r);
>> +	fence_put(f);
>> +}
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
>> index 3d38a9f..2869980 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
>> @@ -33,6 +33,7 @@
>>    struct amdgpu_virt {
>>    	uint32_t		caps;
>>    	uint32_t		reg_val_offs;
>> +	struct mutex		lock;
>>    };
>>    
>>    #define amdgpu_sriov_enabled(adev) \ @@ -59,4 +60,8 @@ static inline
>> bool is_virtual_machine(void)
>>    #endif
>>    }
>>    
>> +void amdgpu_virt_init_setting(struct amdgpu_device *adev); uint32_t
>> +amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg); void
>> +amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg,
>> +uint32_t v);
>> +
>>    #endif
>> diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c
>> b/drivers/gpu/drm/amd/amdgpu/vi.c index 7350a8f..dc0d4fa 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/vi.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/vi.c
>> @@ -892,6 +892,9 @@ static int vi_common_early_init(void *handle)
>>    		(amdgpu_ip_block_mask & (1 << AMD_IP_BLOCK_TYPE_SMC)))
>>    		smc_enabled = true;
>>    
>> +	if (amdgpu_sriov_vf(adev))
>> +		amdgpu_virt_init_setting(adev);
>> +
>>    	adev->rev_id = vi_get_rev_id(adev);
>>    	adev->external_rev_id = 0xFF;
>>    	switch (adev->asic_type) {
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx




More information about the amd-gfx mailing list