[Intel-gfx] [PATCH] drm/i915: Improve debug print in vm_fault_ttm

Das, Nirmoy nirmoy.das at intel.com
Fri Sep 23 07:27:54 UTC 2022


On 9/22/2022 6:38 PM, Matthew Auld wrote:
> On 22/09/2022 13:09, Nirmoy Das wrote:
>> Print the error code returned by __i915_ttm_migrate()
>> for better debuggability.
>>
>> References: https://gitlab.freedesktop.org/drm/intel/-/issues/6889
>> Signed-off-by: Nirmoy Das <nirmoy.das at intel.com>
>> ---
>>   drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>> index e3fc38dd5db0..9619c0fe1025 100644
>> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
>> @@ -1034,7 +1034,7 @@ static vm_fault_t vm_fault_ttm(struct vm_fault *vmf)
>>           }
>>             if (err) {
>> -            drm_dbg(dev, "Unable to make resource CPU accessible\n");
>> +            drm_dbg(dev, "Unable to make resource CPU accessible(err = %pe)\n", err);
>
> Yeah, looks useful. I think for that bug the object is just too large 
> for the mappable part of lmem, so this just gives -E2BIG or similar on 
> small-bar systems. I presume that the test needs to be updated to 
> account for the cpu_size or so.


Yeah, I can't think of any other case. The test needs to be updated; I am 
going to send out IGT fixes for this.
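
A minimal sketch of the size check the test would presumably need on small-BAR systems. The struct, field and function names are hypothetical and not taken from IGT or i915, and -E2BIG is only the likely error mentioned above:

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of a local-memory region on a small-BAR system. */
struct toy_region {
	uint64_t total_size; /* all of lmem */
	uint64_t cpu_size;   /* CPU-mappable part, assumed <= total_size */
};

/* 0 if an object of obj_size fits in the mappable part, -E2BIG otherwise. */
static int toy_check_cpu_mappable(const struct toy_region *r, uint64_t obj_size)
{
	return obj_size > r->cpu_size ? -E2BIG : 0;
}

/* Size a test could use when it intends to fault the object from the CPU. */
static uint64_t toy_pick_test_size(const struct toy_region *r, uint64_t wanted)
{
	return wanted < r->cpu_size ? wanted : r->cpu_size;
}
```

With something like toy_pick_test_size(), the test would cap its object size at the CPU-visible size instead of assuming that all of lmem is mappable.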

>
> With the kernel test robot warning fixed:
> Acked-by: Matthew Auld <matthew.auld at intel.com>


Thanks, I will resend an updated one.
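
The robot warning is not quoted in this thread. Presumably it is a format warning, since %pe takes an error pointer and err here is an int; in the kernel that is usually handled by passing ERR_PTR(err). A userspace sketch of the idea follows, where the toy_* helpers are stand-ins written for illustration and not the kernel implementation:

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Userspace stand-ins for the kernel's ERR_PTR()/PTR_ERR() helpers. */
static inline void *toy_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline long toy_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/*
 * Rough imitation of %pe: print a symbolic name for an errno we know,
 * and the plain number otherwise. Only two names are handled here.
 */
static int toy_format_pe(char *buf, size_t len, const void *ptr)
{
	long err = toy_ptr_err(ptr);

	switch (err) {
	case -E2BIG:
		return snprintf(buf, len, "-E2BIG");
	case -ENOMEM:
		return snprintf(buf, len, "-ENOMEM");
	default:
		return snprintf(buf, len, "%ld", err);
	}
}
```

The point is only that the error code travels as a pointer-sized value, so the format specifier and the argument type agree.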

>
> I looked at the GEM_BUG_ON(rq->reserved_space > ring->space), and I 
> think the issue is maybe with emit_pte() using the ring->space to 
> manually figure out the number of dwords it can emit (instead of the 
> usual ring_begin()), which I guess works, but if we are unlucky and 
> get interrupted (like with a very well timed sigbus here), while 
> waiting for more ring space and end up bailing early, we might have 
> trampled over the reserved_space when submitting the request. I guess 
> normally the next ring_begin() would take care of the reserved_space, 
> like when constructing the actual copy packet.


I am not so familiar with the code, but that sounds logical.
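
To make the reasoning above concrete, here is a toy model of the accounting, assuming a ring that only tracks free dwords. All names are hypothetical and the real intel_ring/emit_pte() code is more involved:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified ring: only free-space accounting, in dwords. */
struct toy_ring {
	int space;
};

struct toy_request {
	struct toy_ring *ring;
	int reserved_space; /* kept free for the commands that close the request */
};

/* ring_begin()-style claim: either leaves reserved_space untouched or fails. */
static int toy_ring_begin(struct toy_request *rq, int ndwords)
{
	if (ndwords + rq->reserved_space > rq->ring->space)
		return -ENOSPC; /* real code would wait for space here instead */
	rq->ring->space -= ndwords;
	return 0;
}

/* Manual emit sized from ring->space alone: may eat into the reservation. */
static int toy_emit_manual(struct toy_request *rq, int wanted)
{
	int n = wanted < rq->ring->space ? wanted : rq->ring->space;

	rq->ring->space -= n;
	return n;
}

/* The invariant behind GEM_BUG_ON(rq->reserved_space > ring->space). */
static bool toy_reservation_intact(const struct toy_request *rq)
{
	return rq->reserved_space <= rq->ring->space;
}
```

In this model a manual emit followed by an early bail-out can leave toy_reservation_intact() false, whereas a later toy_ring_begin() would have refused to proceed until the reservation fit again.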


Nirmoy

>
>>               dma_resv_unlock(bo->base.resv);
>>               ret = VM_FAULT_SIGBUS;
>>               goto out_rpm;

