[PATCH] drm/amdgpu: cache in more vm fault information

Wed Mar 6 16:41:53 UTC 2024

On 3/6/2024 9:59 PM, Alex Deucher wrote:
> On Wed, Mar 6, 2024 at 11:21 AM Khatri, Sunil <sukhatri at amd.com> wrote:
>>
>> On 3/6/2024 9:45 PM, Alex Deucher wrote:
>>> On Wed, Mar 6, 2024 at 11:06 AM Khatri, Sunil <sukhatri at amd.com> wrote:
>>>> On 3/6/2024 9:07 PM, Christian König wrote:
>>>>> Am 06.03.24 um 16:13 schrieb Khatri, Sunil:
>>>>>> On 3/6/2024 8:34 PM, Christian König wrote:
>>>>>>> Am 06.03.24 um 15:29 schrieb Alex Deucher:
>>>>>>>> On Wed, Mar 6, 2024 at 8:04 AM Khatri, Sunil <sukhatri at amd.com> wrote:
>>>>>>>>> On 3/6/2024 6:12 PM, Christian König wrote:
>>>>>>>>>> Am 06.03.24 um 11:40 schrieb Khatri, Sunil:
>>>>>>>>>>> On 3/6/2024 3:37 PM, Christian König wrote:
>>>>>>>>>>>> Am 06.03.24 um 10:04 schrieb Sunil Khatri:
>>>>>>>>>>>>> When a page fault interrupt is raised, there
>>>>>>>>>>>>> is a lot more information that is useful for
>>>>>>>>>>>>> developers to analyse the page fault.
>>>>>>>>>>>> Well, actually that information is not that interesting because
>>>>>>>>>>>> it is hw generation specific.
>>>>>>>>>>>>
>>>>>>>>>>>> You should probably rather use the decoded strings here, e.g. hub,
>>>>>>>>>>>> client, xcc_id, node_id etc...
>>>>>>>>>>>>
>>>>>>>>>>>> See gmc_v9_0_process_interrupt() for an example.
>>>>>>>>>>>> I saw that v9 does provide more information than v10 and v11,
>>>>>>>>>>>> like node_id and which die the fault came from, but that is again
>>>>>>>>>>>> very specific to IP_VERSION(9, 4, 3). I don't know why that
>>>>>>>>>>>> information is not there in v10 and v11.
>>>>>>>>>>> I agree with your point, but as of now, during a page fault we
>>>>>>>>>>> dump this information, which is useful: which client generated
>>>>>>>>>>> the interrupt, for which src, and other information like the
>>>>>>>>>>> address. So I think we should provide similar information in the
>>>>>>>>>>> devcoredump.
>>>>>>>>>>>
>>>>>>>>>>> Currently we do not have all this information from either the job
>>>>>>>>>>> or the vm derived from the job during a reset. We could certainly
>>>>>>>>>>> add more relevant information later on request, but this
>>>>>>>>>>> information is useful, as eventually it is only developers who
>>>>>>>>>>> would use the dump file provided by a customer to debug.
>>>>>>>>>>>
>>>>>>>>>>> Below is the information that I dump in the devcoredump. I feel it
>>>>>>>>>>> is good information, and new information could be added later.
>>>>>>>>>>>
>>>>>>>>>>>> Page fault information
>>>>>>>>>>>> [gfxhub] page fault (src_id:0 ring:24 vmid:3 pasid:32773)
>>>>>>>>>>>> in page starting at address 0x0000000000000000 from client 0x1b
>>>>>>>>>>>> (UTCL2)
>>>>>>>>>> This is a perfect example of what I mean. What you record in the
>>>>>>>>>> patch is the client_id, but this is basically meaningless unless
>>>>>>>>>> you have access to the AMD internal hw documentation.
>>>>>>>>>>
>>>>>>>>>> What you really need is the client in decoded form, in this case
>>>>>>>>>> UTCL2. You can keep the client_id additionally, but the decoded
>>>>>>>>>> client string is mandatory to have I think.
>>>>>>>>>>
>>>>>>>>> Sure, I am capturing that information. I am trying to keep memory
>>>>>>>>> accesses to a minimum as we are still in interrupt context here.
>>>>>>>>> That is why I recorded the integer values instead of decoding and
>>>>>>>>> writing strings there, postponing the decoding until we dump.
>>>>>>>>>
>>>>>>>>> For example, decoding gfxhub/mmhub from vmhub/vmid_src and the
>>>>>>>>> client string from the client id. So are we good to go with sharing
>>>>>>>>> the details above in the devcoredump, using the additional
>>>>>>>>> information cached from the page fault?
>>>>>>>> I think amdgpu_vm_fault_info() has everything you need already (vmhub,
>>>>>>>> status, and addr).  client_id and src_id are just tokens in the
>>>>>>>> interrupt cookie so we know which IP to route the interrupt to. We
>>>>>>>> know what they will be because otherwise we'd be in the interrupt
>>>>>>>> handler for a different IP.  I don't think ring_id has any useful
>>>>>>>> information in this context and vmid and pasid are probably not too
>>>>>>>> useful because they are just tokens to associate the fault with a
>>>>>>>> process.  It would be better to have the process name.
>>>>>> Just to share context here, Alex: I am preparing this for devcoredump,
>>>>>> and my intention was to replicate the information that the KMD
>>>>>> already shares in dmesg for page faults. If we do not add the client
>>>>>> id, we would not be able to share enough information in the
>>>>>> devcoredump.
>>>>>> It would be just the address and hub (gfxhub/mmhub), and I think that
>>>>>> is only partial information, as the src id, client id and ip block
>>>>>> carry useful information.
>>>>>>
>>>>>> Process-related information is already captured as part of the dump
>>>>>> by existing functionality:
>>>>>> **** AMDGPU Device Coredump ****
>>>>>> version: 1
>>>>>> kernel: 6.7.0-amd-staging-drm-next
>>>>>> module: amdgpu
>>>>>> time: 45.084775181
>>>>>> process_name: soft_recovery_p PID: 1780
>>>>>>
>>>>>> Ring timed out details
>>>>>> IP Type: 0 Ring Name: gfx_0.0.0
>>>>>>
>>>>>> Page fault information
>>>>>> [gfxhub] page fault (src_id:0 ring:24 vmid:3 pasid:32773)
>>>>>> in page starting at address 0x0000000000000000 from client 0x1b (UTCL2)
>>>>>> VRAM is lost due to GPU reset!
>>>>>>
>>>>>> Regards
>>>>>> Sunil
>>>>>>
>>>>>>> The decoded client name would be really useful I think, since the
>>>>>>> fault handler is a catch-all and handles a whole bunch of different
>>>>>>> clients.
>>>>>>>
>>>>>>> But that should be ideally passed in as const string instead of the
>>>>>>> hw generation specific client_id.
>>>>>>>
>>>>>>> As long as it's only a pointer we also don't run into the trouble
>>>>>>> that we need to allocate memory for it.
>>>>>> I agree, but I would prefer adding the client id and decoding it in
>>>>>> devcoredump using soc15_ih_clientid_name[fault_info->client_id].
>>>>>> Otherwise we have to sprintf this string into fault_info in irq
>>>>>> context, which writes more bytes to memory, I guess, compared to an
>>>>>> integer :)
>>>>> Well, I totally agree that we shouldn't fiddle too much in the interrupt
>>>>> handler, but exactly what you suggest here won't work.
>>>>>
>>>>> The client_id is hw generation specific, so the only one who has that
>>>>> is the hw generation specific fault handler. Just compare the defines
>>>>> here:
>>>>>
>>>>> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c#L83
>>>>>
>>>>>
>>>>> and here:
>>>>>
>>>>> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/gfxhub_v11_5_0.c#L38
>>>>>
>>>>>
>>>> Got your point. Let me see, but this is a lot of work in irq context.
>>>> Either we can drop the client id entirely, as Alex is suggesting
>>>> here, since it will always be the same client and src id, or let me come
>>>> up with a patch and see if it is acceptable.
>>>>
>>>> Also, as Alex pointed out, we need to decode from the status register
>>>> which kind of page fault it is (permission, read, write etc.). This is
>>>> again family specific, and it is all in IRQ context. I do not feel good
>>>> about it, but let me try to share all that in a new patch.
>>>>
>>> I don't think you need to decode it.  As long as you have a way to
>>> identify the chip, we can just include the raw status register and the
>>> developer can decode it when they look at the devcoredump.
>> Got it Alex.
>> I will add only the chip information along with the status register
>> value. We have the two values below in adev; I think these, along with
>> the status register, should suffice.
>> enum amd_asic_type        asic_type;
> You can skip asic_type.  It's not really used anymore.
>
>> uint32_t            family;
Ok, then only the family id above is fine. Do we need a string for
the family name, or is the integer value good enough, with the developer
decoding it?
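
To illustrate the trade-off in this question: one option is to cache only the integer and translate it to a name when the dump is written, printing both. The following is a minimal standalone sketch, not driver code. The table `example_family_names`, the helpers `example_family_name()` and `example_format_family()`, and the numeric IDs are all hypothetical placeholders and do not correspond to real family values.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical ID-to-name table; the IDs are placeholders, not real values. */
struct example_family_entry {
	uint32_t id;
	const char *name;
};

static const struct example_family_entry example_family_names[] = {
	{ 1, "FAMILY_A" },
	{ 2, "FAMILY_B" },
};

/* Return a pointer to a constant string; nothing is allocated or copied. */
static const char *example_family_name(uint32_t family)
{
	size_t i;
	size_t n = sizeof(example_family_names) / sizeof(example_family_names[0]);

	for (i = 0; i < n; i++) {
		if (example_family_names[i].id == family)
			return example_family_names[i].name;
	}
	return "unknown";
}

/* Format the family line at dump time: raw value plus decoded name. */
static int example_format_family(char *buf, size_t len, uint32_t family)
{
	return snprintf(buf, len, "family: %u (%s)\n",
			(unsigned int)family, example_family_name(family));
}
```

Printing the raw integer alongside the name keeps the dump usable even when the table has no entry for a given ID.
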
> Please also include the PCI DID, VID and RID and
> amdgpu_ip_version(adev, GC_HWIP, 0).  You can include all of the IP
> versions if you want for completeness, but GC should be enough.

Sure, noted, but I will add this in a new patch, which adds info on
all IPs of the GPU.
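
As a rough illustration of what such a dump section could contain, here is a standalone sketch that prints the chip identification values mentioned in this thread next to the raw, undecoded fault status. The struct `example_fault_dump`, the function `example_print_fault()`, the field widths and the output layout are assumptions made for this example only; they are not the actual devcoredump format or driver structures.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical snapshot of the values discussed above. */
struct example_fault_dump {
	uint32_t family;        /* family id, kept as a raw integer */
	uint16_t pci_vid;       /* PCI vendor id */
	uint16_t pci_did;       /* PCI device id */
	uint8_t  pci_rid;       /* PCI revision id */
	uint32_t gc_ip_version; /* assumed to be an opaque 32-bit value */
	uint64_t fault_addr;    /* faulting address */
	uint32_t fault_status;  /* raw status register, left undecoded */
};

/* Write the snapshot as text; the reader decodes the status per chip. */
static int example_print_fault(char *buf, size_t len,
			       const struct example_fault_dump *d)
{
	return snprintf(buf, len,
			"family: %u\n"
			"pci: vid 0x%04x did 0x%04x rid 0x%02x\n"
			"gc_ip_version: 0x%08x\n"
			"fault addr: 0x%016llx status: 0x%08x\n",
			(unsigned int)d->family,
			(unsigned int)d->pci_vid,
			(unsigned int)d->pci_did,
			(unsigned int)d->pci_rid,
			(unsigned int)d->gc_ip_version,
			(unsigned long long)d->fault_addr,
			(unsigned int)d->fault_status);
}
```

Keeping the status register raw avoids any generation-specific decoding at capture time; the identification fields are what let a developer pick the right register layout later.
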

Regards
Sunil.

> Alex
>
>> Regards
>> Sunil Khatri
>>
>>> Alex
>>>
>>>
>>>> Regards
>>>> Sunil.
>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>> We can discuss dropping values like pasid, vmid and ring id if they
>>>>>> are not useful at all.
>>>>>>
>>>>>> Regards
>>>>>> Sunil
>>>>>>
>>>>>>> Christian.
>>>>>>>
>>>>>>>> Alex
>>>>>>>>
>>>>>>>>> regards
>>>>>>>>> sunil
>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Christian.
>>>>>>>>>>
>>>>>>>>>>> Regards
>>>>>>>>>>> Sunil Khatri
>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Christian.
>>>>>>>>>>>>
>>>>>>>>>>>>> Add all such information in the last cached
>>>>>>>>>>>>> pagefault from an interrupt handler.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Signed-off-by: Sunil Khatri <sunil.khatri at amd.com>
>>>>>>>>>>>>> ---
>>>>>>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 9 +++++++--
>>>>>>>>>>>>>      drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 7 ++++++-
>>>>>>>>>>>>>      drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 2 +-
>>>>>>>>>>>>>      drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c | 2 +-
>>>>>>>>>>>>>      drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c  | 2 +-
>>>>>>>>>>>>>      drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c  | 2 +-
>>>>>>>>>>>>>      drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 2 +-
>>>>>>>>>>>>>      7 files changed, 18 insertions(+), 8 deletions(-)
>>>>>>>>>>>>>
>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>>>>>> index 4299ce386322..b77e8e28769d 100644
>>>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>>>>>>>>>>>> @@ -2905,7 +2905,7 @@ void amdgpu_debugfs_vm_bo_info(struct
>>>>>>>>>>>>> amdgpu_vm *vm, struct seq_file *m)
>>>>>>>>>>>>>       * Cache the fault info for later use by userspace in
>>>>>>>>>>>>> debugging.
>>>>>>>>>>>>>       */
>>>>>>>>>>>>>      void amdgpu_vm_update_fault_cache(struct amdgpu_device *adev,
>>>>>>>>>>>>> -                  unsigned int pasid,
>>>>>>>>>>>>> +                  struct amdgpu_iv_entry *entry,
>>>>>>>>>>>>>                        uint64_t addr,
>>>>>>>>>>>>>                        uint32_t status,
>>>>>>>>>>>>>                        unsigned int vmhub)
>>>>>>>>>>>>> @@ -2915,7 +2915,7 @@ void amdgpu_vm_update_fault_cache(struct
>>>>>>>>>>>>> amdgpu_device *adev,
>>>>>>>>>>>>>          xa_lock_irqsave(&adev->vm_manager.pasids, flags);
>>>>>>>>>>>>>
>>>>>>>>>>>>> -    vm = xa_load(&adev->vm_manager.pasids, pasid);
>>>>>>>>>>>>> +    vm = xa_load(&adev->vm_manager.pasids, entry->pasid);
>>>>>>>>>>>>>          /* Don't update the fault cache if status is 0.  In the
>>>>>>>>>>>>> multiple
>>>>>>>>>>>>>           * fault case, subsequent faults will return a 0 status
>>>>>>>>>>>>> which is
>>>>>>>>>>>>>           * useless for userspace and replaces the useful fault
>>>>>>>>>>>>> status, so
>>>>>>>>>>>>> @@ -2924,6 +2924,11 @@ void amdgpu_vm_update_fault_cache(struct
>>>>>>>>>>>>> amdgpu_device *adev,
>>>>>>>>>>>>>          if (vm && status) {
>>>>>>>>>>>>>              vm->fault_info.addr = addr;
>>>>>>>>>>>>>              vm->fault_info.status = status;
>>>>>>>>>>>>> +        vm->fault_info.client_id = entry->client_id;
>>>>>>>>>>>>> +        vm->fault_info.src_id = entry->src_id;
>>>>>>>>>>>>> +        vm->fault_info.vmid = entry->vmid;
>>>>>>>>>>>>> +        vm->fault_info.pasid = entry->pasid;
>>>>>>>>>>>>> +        vm->fault_info.ring_id = entry->ring_id;
>>>>>>>>>>>>>              if (AMDGPU_IS_GFXHUB(vmhub)) {
>>>>>>>>>>>>>                  vm->fault_info.vmhub = AMDGPU_VMHUB_TYPE_GFX;
>>>>>>>>>>>>>                  vm->fault_info.vmhub |=
>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>>>>>>>>>> index 047ec1930d12..c7782a89bdb5 100644
>>>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
>>>>>>>>>>>>> @@ -286,6 +286,11 @@ struct amdgpu_vm_fault_info {
>>>>>>>>>>>>>          uint32_t    status;
>>>>>>>>>>>>>          /* which vmhub? gfxhub, mmhub, etc. */
>>>>>>>>>>>>>          unsigned int    vmhub;
>>>>>>>>>>>>> +    unsigned int    client_id;
>>>>>>>>>>>>> +    unsigned int    src_id;
>>>>>>>>>>>>> +    unsigned int    ring_id;
>>>>>>>>>>>>> +    unsigned int    pasid;
>>>>>>>>>>>>> +    unsigned int    vmid;
>>>>>>>>>>>>>      };
>>>>>>>>>>>>>        struct amdgpu_vm {
>>>>>>>>>>>>> @@ -605,7 +610,7 @@ static inline void
>>>>>>>>>>>>> amdgpu_vm_eviction_unlock(struct amdgpu_vm *vm)
>>>>>>>>>>>>>      }
>>>>>>>>>>>>>        void amdgpu_vm_update_fault_cache(struct amdgpu_device
>>>>>>>>>>>>> *adev,
>>>>>>>>>>>>> -                  unsigned int pasid,
>>>>>>>>>>>>> +                  struct amdgpu_iv_entry *entry,
>>>>>>>>>>>>>                        uint64_t addr,
>>>>>>>>>>>>>                        uint32_t status,
>>>>>>>>>>>>>                        unsigned int vmhub);
>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>>>>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>>>>>>>>>>>>> index d933e19e0cf5..6b177ce8db0e 100644
>>>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>>>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>>>>>>>>>>>>> @@ -150,7 +150,7 @@ static int gmc_v10_0_process_interrupt(struct
>>>>>>>>>>>>> amdgpu_device *adev,
>>>>>>>>>>>>>              status = RREG32(hub->vm_l2_pro_fault_status);
>>>>>>>>>>>>>              WREG32_P(hub->vm_l2_pro_fault_cntl, 1, ~1);
>>>>>>>>>>>>>
>>>>>>>>>>>>> -        amdgpu_vm_update_fault_cache(adev, entry->pasid, addr, status,
>>>>>>>>>>>>> +        amdgpu_vm_update_fault_cache(adev, entry, addr, status,
>>>>>>>>>>>>>                               entry->vmid_src ? AMDGPU_MMHUB0(0) : AMDGPU_GFXHUB(0));
>>>>>>>>>>>>>          }
>>>>>>>>>>>>>      diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
>>>>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
>>>>>>>>>>>>> index 527dc917e049..bcf254856a3e 100644
>>>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
>>>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
>>>>>>>>>>>>> @@ -121,7 +121,7 @@ static int gmc_v11_0_process_interrupt(struct
>>>>>>>>>>>>> amdgpu_device *adev,
>>>>>>>>>>>>>              status = RREG32(hub->vm_l2_pro_fault_status);
>>>>>>>>>>>>>              WREG32_P(hub->vm_l2_pro_fault_cntl, 1, ~1);
>>>>>>>>>>>>>
>>>>>>>>>>>>> -        amdgpu_vm_update_fault_cache(adev, entry->pasid, addr, status,
>>>>>>>>>>>>> +        amdgpu_vm_update_fault_cache(adev, entry, addr, status,
>>>>>>>>>>>>>                               entry->vmid_src ? AMDGPU_MMHUB0(0) : AMDGPU_GFXHUB(0));
>>>>>>>>>>>>>          }
>>>>>>>>>>>>>      diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
>>>>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
>>>>>>>>>>>>> index 3da7b6a2b00d..e9517ebbe1fd 100644
>>>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
>>>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
>>>>>>>>>>>>> @@ -1270,7 +1270,7 @@ static int
>>>>>>>>>>>>> gmc_v7_0_process_interrupt(struct
>>>>>>>>>>>>> amdgpu_device *adev,
>>>>>>>>>>>>>          if (!addr && !status)
>>>>>>>>>>>>>              return 0;
>>>>>>>>>>>>>
>>>>>>>>>>>>> -    amdgpu_vm_update_fault_cache(adev, entry->pasid,
>>>>>>>>>>>>> +    amdgpu_vm_update_fault_cache(adev, entry,
>>>>>>>>>>>>>                           ((u64)addr) << AMDGPU_GPU_PAGE_SHIFT, status, AMDGPU_GFXHUB(0));
>>>>>>>>>>>>>            if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_FIRST)
>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>>>>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>>>>>>>>>>>>> index d20e5f20ee31..a271bf832312 100644
>>>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>>>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>>>>>>>>>>>>> @@ -1438,7 +1438,7 @@ static int
>>>>>>>>>>>>> gmc_v8_0_process_interrupt(struct
>>>>>>>>>>>>> amdgpu_device *adev,
>>>>>>>>>>>>>          if (!addr && !status)
>>>>>>>>>>>>>              return 0;
>>>>>>>>>>>>>
>>>>>>>>>>>>> -    amdgpu_vm_update_fault_cache(adev, entry->pasid,
>>>>>>>>>>>>> +    amdgpu_vm_update_fault_cache(adev, entry,
>>>>>>>>>>>>>                           ((u64)addr) << AMDGPU_GPU_PAGE_SHIFT, status, AMDGPU_GFXHUB(0));
>>>>>>>>>>>>>            if (amdgpu_vm_fault_stop == AMDGPU_VM_FAULT_STOP_FIRST)
>>>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>>>>>>>>>>> index 47b63a4ce68b..dc9fb1fb9540 100644
>>>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>>>>>>>>>>> @@ -666,7 +666,7 @@ static int gmc_v9_0_process_interrupt(struct
>>>>>>>>>>>>> amdgpu_device *adev,
>>>>>>>>>>>>>          rw = REG_GET_FIELD(status,
>>>>>>>>>>>>> VM_L2_PROTECTION_FAULT_STATUS, RW);
>>>>>>>>>>>>>          WREG32_P(hub->vm_l2_pro_fault_cntl, 1, ~1);
>>>>>>>>>>>>>
>>>>>>>>>>>>> -    amdgpu_vm_update_fault_cache(adev, entry->pasid, addr, status, vmhub);
>>>>>>>>>>>>> +    amdgpu_vm_update_fault_cache(adev, entry, addr, status, vmhub);
>>>>>>>>>>>>>            dev_err(adev->dev,
>>>>>>>>>>>>> "VM_L2_PROTECTION_FAULT_STATUS:0x%08X\n",