[PATCH] drm/amdkfd: don't add DOORBELL and MMIO BOs to validate list
Lang Yu
Lang.Yu at amd.com
Wed May 25 12:46:25 UTC 2022
On 05/25/ , Christian König wrote:
> Am 25.05.22 um 13:37 schrieb Lang Yu:
> > On 05/25/ , Christian König wrote:
> > > Am 25.05.22 um 11:25 schrieb Lang Yu:
> > > > On 05/25/ , Christian König wrote:
> > > > > Am 25.05.22 um 10:43 schrieb Lang Yu:
> > > > > > DOORBELL and MMIO BOs never move once created.
> > > > > > No need to validate them after that.
> > > > > Yeah, but you still need to make sure their page tables are up to date.
> > > > >
> > > > > So this here might break horribly.
> > > > These BOs (and attachments) are validated when allocated and mapped.
> > > > Their page tables should be determined at that time.
> > > >
> > > > The kfd_bo_list is used to restore BOs after evictions.
> > > >
> > > > Do you mean their page tables could be changed? Thanks.
> > > Yes, page tables can be destroyed under memory pressure as well.
> > Destroyed? Do you mean the contents of the page table BOs disappear?
>
> Currently we try to just free up the backing store of them, but the idea is
> to really get rid of the whole BO under memory pressure.
>
> See, page tables are managed in a hierarchy and their content can be fully
> restored from the metadata.
>
> So except for the root PD all page tables in a VM can (in theory) be
> destroyed and re-created when they are not used.
>
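To illustrate the point about restoring page tables from metadata, here is a
minimal userspace sketch. It is not amdgpu code: every name in it (toy_vm,
toy_map, toy_shrink, toy_validate, toy_walk) is hypothetical, and the
"metadata" is just an array of mappings kept outside the tables. It only shows
why a mapping can go missing after leaf tables are dropped, and why a
validation pass brings it back.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PT_ENTRIES   4   /* entries per leaf table (toy size) */
#define PD_ENTRIES   4   /* entries in the root directory */
#define MAX_MAPPINGS 16

/* "Metadata": the mappings themselves, kept outside the page tables. */
struct toy_mapping { uint32_t vpn; uint64_t pfn; };

struct toy_vm {
	uint64_t *pd[PD_ENTRIES];              /* root PD, never freed */
	struct toy_mapping maps[MAX_MAPPINGS]; /* source of truth */
	int nmaps;
};

/* Return the leaf covering vpn; if it is missing, re-create it and
 * refill it from the recorded mappings. */
static uint64_t *toy_get_leaf(struct toy_vm *vm, uint32_t vpn)
{
	uint32_t pdi = vpn / PT_ENTRIES;

	if (!vm->pd[pdi]) {
		vm->pd[pdi] = calloc(PT_ENTRIES, sizeof(uint64_t));
		if (!vm->pd[pdi])
			return NULL;
		for (int i = 0; i < vm->nmaps; i++)
			if (vm->maps[i].vpn / PT_ENTRIES == pdi)
				vm->pd[pdi][vm->maps[i].vpn % PT_ENTRIES] =
					vm->maps[i].pfn;
	}
	return vm->pd[pdi];
}

/* Record a mapping in the metadata and write it into the leaf table. */
static int toy_map(struct toy_vm *vm, uint32_t vpn, uint64_t pfn)
{
	uint64_t *leaf;

	if (vm->nmaps == MAX_MAPPINGS || vpn >= PD_ENTRIES * PT_ENTRIES)
		return -1;
	vm->maps[vm->nmaps++] = (struct toy_mapping){ vpn, pfn };
	leaf = toy_get_leaf(vm, vpn);
	if (!leaf)
		return -1;
	leaf[vpn % PT_ENTRIES] = pfn;
	return 0;
}

/* Memory pressure: drop every leaf table, keep only the root PD. */
static void toy_shrink(struct toy_vm *vm)
{
	for (int i = 0; i < PD_ENTRIES; i++) {
		free(vm->pd[i]);
		vm->pd[i] = NULL;
	}
}

/* Validation: make sure every recorded mapping has a leaf table again. */
static int toy_validate(struct toy_vm *vm)
{
	for (int i = 0; i < vm->nmaps; i++)
		if (!toy_get_leaf(vm, vm->maps[i].vpn))
			return -1;
	return 0;
}

/* Walk the tables only, without consulting the metadata; 0 means unmapped. */
static uint64_t toy_walk(const struct toy_vm *vm, uint32_t vpn)
{
	const uint64_t *leaf;

	if (vpn >= PD_ENTRIES * PT_ENTRIES)
		return 0;
	leaf = vm->pd[vpn / PT_ENTRIES];
	return leaf ? leaf[vpn % PT_ENTRIES] : 0;
}
```

In this toy, a mapping that is never visited by toy_validate() stays invisible
to toy_walk() after toy_shrink(), which is the analogue of a BO that is not on
the validation list.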
> > If so, could other BOs be destroyed under memory pressure? Thanks!
>
> I don't think so, everything else is just referenced somewhere.
Thanks, I got it. Just curious: how do we identify PT BOs when we want
to destroy them under memory pressure? And does this happen during the
eviction process?
Regards,
Lang
> Regards,
> Christian.
>
> >
> > Regards,
> > Lang
> >
> > > Not sure how the KFD handles that, but in theory we should have every BO
> > > used by a process on the validation list. Even the ones pinned.
> > >
> > > Regards,
> > > Christian.
> > >
> > > >
> > > > > Christian.
> > > > >
> > > > > > Signed-off-by: Lang Yu <Lang.Yu at amd.com>
> > > > > > ---
> > > > > > drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 14 +++++++++-----
> > > > > > 1 file changed, 9 insertions(+), 5 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > > > > > index 34ba9e776521..45de9cadd771 100644
> > > > > > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > > > > > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> > > > > > @@ -808,6 +808,10 @@ static void add_kgd_mem_to_kfd_bo_list(struct kgd_mem *mem,
> > > > > > struct ttm_validate_buffer *entry = &mem->validate_list;
> > > > > > struct amdgpu_bo *bo = mem->bo;
> > > > > > + if (mem->alloc_flags & (KFD_IOC_ALLOC_MEM_FLAGS_DOORBELL |
> > > > > > + KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP))
> > > > > > + return;
> > > > > > +
> > > > > > INIT_LIST_HEAD(&entry->head);
> > > > > > entry->num_shared = 1;
> > > > > > entry->bo = &bo->tbo;
> > > > > > @@ -824,6 +828,10 @@ static void remove_kgd_mem_from_kfd_bo_list(struct kgd_mem *mem,
> > > > > > {
> > > > > > struct ttm_validate_buffer *bo_list_entry;
> > > > > > + if (mem->alloc_flags & (KFD_IOC_ALLOC_MEM_FLAGS_DOORBELL |
> > > > > > + KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP))
> > > > > > + return;
> > > > > > +
> > > > > > bo_list_entry = &mem->validate_list;
> > > > > > mutex_lock(&process_info->lock);
> > > > > > list_del(&bo_list_entry->head);
> > > > > > @@ -1649,7 +1657,6 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
> > > > > > unsigned long bo_size = mem->bo->tbo.base.size;
> > > > > > struct kfd_mem_attachment *entry, *tmp;
> > > > > > struct bo_vm_reservation_context ctx;
> > > > > > - struct ttm_validate_buffer *bo_list_entry;
> > > > > > unsigned int mapped_to_gpu_memory;
> > > > > > int ret;
> > > > > > bool is_imported = false;
> > > > > > @@ -1677,10 +1684,7 @@ int amdgpu_amdkfd_gpuvm_free_memory_of_gpu(
> > > > > > }
> > > > > > /* Make sure restore workers don't access the BO any more */
> > > > > > - bo_list_entry = &mem->validate_list;
> > > > > > - mutex_lock(&process_info->lock);
> > > > > > - list_del(&bo_list_entry->head);
> > > > > > - mutex_unlock(&process_info->lock);
> > > > > > + remove_kgd_mem_from_kfd_bo_list(mem, process_info);
> > > > > > /* No more MMU notifiers */
> > > > > > amdgpu_mn_unregister(mem->bo);
>
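For reference, the reason the flag check in the patch appears in both the add
and the remove helper can be shown with a small userspace sketch. This is not
the kernel code: the list helpers, the toy_mem struct and the TOY_FLAG_* values
are all hypothetical stand-ins (the real KFD_IOC_ALLOC_MEM_FLAGS_* values come
from the kernel UAPI header). The point is only that an entry which was never
linked must not be unlinked either.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values, for illustration only. */
#define TOY_FLAG_VRAM     (1u << 0)
#define TOY_FLAG_DOORBELL (1u << 1)
#define TOY_FLAG_MMIO     (1u << 2)
#define TOY_SKIP_MASK     (TOY_FLAG_DOORBELL | TOY_FLAG_MMIO)

/* Minimal circular doubly linked list, in the spirit of list_head. */
struct toy_node { struct toy_node *prev, *next; };

struct toy_mem {
	unsigned int flags;
	struct toy_node entry;
};

static void toy_list_init(struct toy_node *head)
{
	head->prev = head->next = head;
}

static void toy_list_add_tail(struct toy_node *n, struct toy_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void toy_list_del(struct toy_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
}

static size_t toy_list_len(const struct toy_node *head)
{
	size_t len = 0;

	for (const struct toy_node *n = head->next; n != head; n = n->next)
		len++;
	return len;
}

/* Skip doorbell and MMIO entries when adding to the list ... */
static void toy_add(struct toy_mem *mem, struct toy_node *list)
{
	if (mem->flags & TOY_SKIP_MASK)
		return;
	toy_list_add_tail(&mem->entry, list);
}

/* ... and apply the same check when removing: a skipped entry was never
 * linked, so unlinking it would follow pointers that were never set up. */
static void toy_remove(struct toy_mem *mem)
{
	if (mem->flags & TOY_SKIP_MASK)
		return;
	toy_list_del(&mem->entry);
}
```

With the check in only one of the two helpers, toy_remove() on a doorbell
entry would operate on an unlinked node, which is why routing the free path
through the same remove helper keeps the two sides consistent in this sketch.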