[PATCH 2/3] drm/amdgpu: add AMDGPU_VM_NOALLOC
Christian König
ckoenig.leichtzumerken at gmail.com
Mon May 16 16:13:35 UTC 2022
I don't have access to any gfx10 hardware.
Can you give me a dmesg and/or a backtrace, etc.?
I can't push this unless it's working properly.
Christian.
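
For readers following the quoted discussion, here is a toy model of the three behaviours being compared there: normal caching, noalloc (cache is checked, misses are not filled), and bypass (cache is ignored). It is only a sketch of the semantics as described in the thread. The direct-mapped layout, the struct, and every name in it are made up for illustration and say nothing about how MALL is actually implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical policies, named after the terms used in the thread. */
enum toy_policy { TOY_NORMAL, TOY_NOALLOC, TOY_BYPASS };

#define TOY_LINES 4

/* Hypothetical direct-mapped cache with a few counters. */
struct toy_cache {
	bool valid[TOY_LINES];
	uint64_t tag[TOY_LINES];
	unsigned lookups;	/* how often the cache was checked */
	unsigned hits;		/* how often the check succeeded */
	unsigned fills;		/* how often a miss allocated a line */
};

/* Returns true on a cache hit. */
static bool toy_access(struct toy_cache *c, uint64_t addr, enum toy_policy p)
{
	unsigned idx = addr % TOY_LINES;

	/* Bypass: contents are ignored, so no lookup cost and no hit. */
	if (p == TOY_BYPASS)
		return false;

	/* Normal and noalloc both pay for the lookup. */
	c->lookups++;
	if (c->valid[idx] && c->tag[idx] == addr) {
		c->hits++;
		return true;
	}

	/* Only the normal policy allocates on a miss (and may evict). */
	if (p == TOY_NORMAL) {
		c->valid[idx] = true;
		c->tag[idx] = addr;
		c->fills++;
	}
	return false;
}
```

In this model a noalloc access can still hit a line that a normal access filled earlier, but a noalloc miss never evicts anything, which matches the "no additional eviction pressure, no latency win" description in the thread.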
On 16.05.22 at 14:56, Marek Olšák wrote:
> Reproduction steps:
> - use mesa/main on gfx10.3 (not sure what other GPUs do)
> - run: radeonsi_mall_noalloc=true glxgears
>
> Marek
>
> On Mon, May 16, 2022 at 7:53 AM Christian König
> <ckoenig.leichtzumerken at gmail.com> wrote:
>
> Crap, do you have a link to the failure?
>
> On 16.05.22 at 13:10, Marek Olšák wrote:
>> I forgot to say: The NOALLOC flag causes an allocation failure,
>> so there is a kernel bug somewhere.
>>
>> Marek
>>
>> On Mon, May 16, 2022 at 7:06 AM Marek Olšák <maraeo at gmail.com> wrote:
>>
>> FYI, I think it's time to merge this because the Mesa commits
>> are going to be merged in ~30 minutes if Gitlab CI is green,
>> and that includes updated amdgpu_drm.h.
>>
>> Marek
>>
>> On Wed, May 11, 2022 at 2:55 PM Marek Olšák
>> <maraeo at gmail.com> wrote:
>>
>> Ok sounds good.
>>
>> Marek
>>
>> On Wed., May 11, 2022, 03:43 Christian König,
>> <ckoenig.leichtzumerken at gmail.com> wrote:
>>
>> It really *is* a NOALLOC feature. In other words
>> there is no latency improvement on reads because the
>> cache is always checked, even with the noalloc flag set.
>>
>> The only thing it affects is that misses do not enter
>> the cache and so don't put any additional eviction
>> pressure on existing cache lines.
>>
>> You might want to double check with the hardware
>> guys, but I'm something like 95% sure that it works
>> this way.
>>
>> Christian.
>>
>> On 11.05.22 at 09:22, Marek Olšák wrote:
>>> Bypass means that the contents of the cache are
>>> ignored, which decreases latency at the cost of no
>>> coherency between bypassed and normal memory
>>> requests. NOA (noalloc) means that the cache is
>>> checked and can give you cache hits, but misses are
>>> not cached and the overall latency is higher. I
>>> don't know what the hw does, but I hope it was
>>> misnamed and it really means bypass because there is
>>> no point in doing cache lookups on every memory
>>> request if the driver wants to disable caching to
>>> *decrease* latency in the situations when the cache
>>> isn't helping.
>>>
>>> Marek
>>>
>>> On Wed, May 11, 2022 at 2:15 AM Lazar, Lijo
>>> <lijo.lazar at amd.com> wrote:
>>>
>>>
>>>
>>> On 5/11/2022 11:36 AM, Christian König wrote:
>>> > Mhm, it doesn't really bypass MALL. It just doesn't allocate any MALL
>>> > entries on write.
>>> >
>>> > How about AMDGPU_VM_PAGE_NO_MALL ?
>>>
>>> One more - AMDGPU_VM_PAGE_LLC_* [ LLC = last level cache, * = some sort
>>> of attribute which decides LLC behaviour]
>>>
>>> Thanks,
>>> Lijo
>>>
>>> >
>>> > Christian.
>>> >
>>> > On 10.05.22 at 23:21, Marek Olšák wrote:
>>> >> A better name would be:
>>> >> AMDGPU_VM_PAGE_BYPASS_MALL
>>> >>
>>> >> Marek
>>> >>
>>> >> On Fri, May 6, 2022 at 7:23 AM Christian König
>>> >> <ckoenig.leichtzumerken at gmail.com> wrote:
>>> >>
>>> >> Add the AMDGPU_VM_NOALLOC flag to let userspace control MALL
>>> >> allocation.
>>> >>
>>> >> Only compile tested!
>>> >>
>>> >> Signed-off-by: Christian König <christian.koenig at amd.com>
>>> >> ---
>>> >> drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 2 ++
>>> >> drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 3 +++
>>> >> drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c | 3 +++
>>> >> include/uapi/drm/amdgpu_drm.h | 2 ++
>>> >> 4 files changed, 10 insertions(+)
>>> >>
>>> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>> >> index bf97d8f07f57..d8129626581f 100644
>>> >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>> >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>> >> @@ -650,6 +650,8 @@ uint64_t amdgpu_gem_va_map_flags(struct amdgpu_device *adev, uint32_t flags)
>>> >>          pte_flag |= AMDGPU_PTE_WRITEABLE;
>>> >>      if (flags & AMDGPU_VM_PAGE_PRT)
>>> >>          pte_flag |= AMDGPU_PTE_PRT;
>>> >> +    if (flags & AMDGPU_VM_PAGE_NOALLOC)
>>> >> +        pte_flag |= AMDGPU_PTE_NOALLOC;
>>> >>
>>> >>      if (adev->gmc.gmc_funcs->map_mtype)
>>> >>          pte_flag |= amdgpu_gmc_map_mtype(adev,
>>> >> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>>> >> index b8c79789e1e4..9077dfccaf3c 100644
>>> >> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>>> >> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>>> >> @@ -613,6 +613,9 @@ static void gmc_v10_0_get_vm_pte(struct amdgpu_device *adev,
>>> >>      *flags &= ~AMDGPU_PTE_MTYPE_NV10_MASK;
>>> >>      *flags |= (mapping->flags & AMDGPU_PTE_MTYPE_NV10_MASK);
>>> >>
>>> >> +    *flags &= ~AMDGPU_PTE_NOALLOC;
>>> >> +    *flags |= (mapping->flags & AMDGPU_PTE_NOALLOC);
>>> >> +
>>> >>      if (mapping->flags & AMDGPU_PTE_PRT) {
>>> >>          *flags |= AMDGPU_PTE_PRT;
>>> >>          *flags |= AMDGPU_PTE_SNOOPED;
>>> >> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
>>> >> index 8d733eeac556..32ee56adb602 100644
>>> >> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
>>> >> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
>>> >> @@ -508,6 +508,9 @@ static void gmc_v11_0_get_vm_pte(struct amdgpu_device *adev,
>>> >>      *flags &= ~AMDGPU_PTE_MTYPE_NV10_MASK;
>>> >>      *flags |= (mapping->flags & AMDGPU_PTE_MTYPE_NV10_MASK);
>>> >>
>>> >> +    *flags &= ~AMDGPU_PTE_NOALLOC;
>>> >> +    *flags |= (mapping->flags & AMDGPU_PTE_NOALLOC);
>>> >> +
>>> >>      if (mapping->flags & AMDGPU_PTE_PRT) {
>>> >>          *flags |= AMDGPU_PTE_PRT;
>>> >>          *flags |= AMDGPU_PTE_SNOOPED;
>>> >> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
>>> >> index 57b9d8f0133a..9d71d6330687 100644
>>> >> --- a/include/uapi/drm/amdgpu_drm.h
>>> >> +++ b/include/uapi/drm/amdgpu_drm.h
>>> >> @@ -533,6 +533,8 @@ struct drm_amdgpu_gem_op {
>>> >>  #define AMDGPU_VM_MTYPE_UC        (4 << 5)
>>> >>  /* Use Read Write MTYPE instead of default MTYPE */
>>> >>  #define AMDGPU_VM_MTYPE_RW        (5 << 5)
>>> >> +/* don't allocate MALL */
>>> >> +#define AMDGPU_VM_PAGE_NOALLOC    (1 << 9)
>>> >>
>>> >>  struct drm_amdgpu_gem_va {
>>> >>      /** GEM object handle */
>>> >> --
>>> >> 2.25.1
>>> >>
>>> >
>>>
>>
>
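
The flag plumbing in the quoted patch can be sketched as a standalone snippet, which may help when checking the two steps in isolation: the uapi flag is translated to a PTE flag, and the per-mapping PTE flag then replaces whatever the caller's flags had in that bit. Only AMDGPU_VM_PAGE_NOALLOC and its value come from the patch. The SKETCH_* bit positions and the sketch_* function names are placeholders for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Value taken from the quoted uapi hunk. */
#define AMDGPU_VM_PAGE_NOALLOC	(1 << 9)

/* Placeholder bit positions, assumed for this sketch only. */
#define SKETCH_PTE_NOALLOC	(1ULL << 58)
#define SKETCH_PTE_OTHER	(1ULL << 0)

/* Mirrors the amdgpu_gem_va_map_flags() hunk: uapi flag -> PTE flag. */
static uint64_t sketch_va_map_flags(uint32_t vm_flags)
{
	uint64_t pte_flag = 0;

	if (vm_flags & AMDGPU_VM_PAGE_NOALLOC)
		pte_flag |= SKETCH_PTE_NOALLOC;

	return pte_flag;
}

/*
 * Mirrors the gmc_v10_0/gmc_v11_0 hunks: clear the bit first, then
 * copy it from the mapping, so a stale bit in *flags never survives.
 */
static void sketch_get_vm_pte(uint64_t mapping_flags, uint64_t *flags)
{
	*flags &= ~SKETCH_PTE_NOALLOC;
	*flags |= (mapping_flags & SKETCH_PTE_NOALLOC);
}
```

The clear-then-copy pattern is the same one the surrounding context lines use for the MTYPE mask, which is presumably why the patch follows it here.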