[PATCH] drm/i915: Improve debug print in vm_fault_ttm

Matthew Auld matthew.auld at intel.com
Thu Sep 22 16:38:20 UTC 2022


On 22/09/2022 13:09, Nirmoy Das wrote:
> Print the error code returned by __i915_ttm_migrate()
> for better debuggability.
> 
> References: https://gitlab.freedesktop.org/drm/intel/-/issues/6889
> Signed-off-by: Nirmoy Das <nirmoy.das at intel.com>
> ---
>   drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> index e3fc38dd5db0..9619c0fe1025 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c
> @@ -1034,7 +1034,7 @@ static vm_fault_t vm_fault_ttm(struct vm_fault *vmf)
>   		}
>   
>   		if (err) {
> -			drm_dbg(dev, "Unable to make resource CPU accessible\n");
> +			drm_dbg(dev, "Unable to make resource CPU accessible(err = %pe)\n", err);

Yeah, looks useful. I think for that bug the object is just too large 
for the mappable part of lmem, so this just gives -E2BIG or similar on 
small-BAR systems. I presume the test needs to be updated to account 
for the cpu_size or so.

With the kernel test robot warning fixed:
Acked-by: Matthew Auld <matthew.auld at intel.com>

I also looked at the GEM_BUG_ON(rq->reserved_space > ring->space). I 
think the issue may be that emit_pte() uses ring->space to manually 
work out the number of dwords it can emit, instead of going through 
the usual ring_begin(). I guess that works, but if we are unlucky and 
get interrupted while waiting for more ring space (like with a very 
well timed SIGBUS here) and end up bailing early, we might have 
trampled over the reserved_space when submitting the request. I guess 
normally the next ring_begin() would take care of the reserved_space, 
like when constructing the actual copy packet.
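
To make the accounting concern above a bit more concrete, here is a 
rough userspace sketch. It is a toy model and not the i915 code: the 
toy_* types and helpers are made up for illustration, and the only 
things taken from the discussion are the ring->space and 
rq->reserved_space fields and the invariant the GEM_BUG_ON checks. It 
assumes a ring_begin()-style helper refuses to hand out space that would 
cut into the reservation, while a path that sizes its emit from 
ring->space alone does not.

```c
#include <stdbool.h>

/* Toy model only; the names mirror the discussion but are hypothetical. */
struct toy_ring {
	unsigned int space;          /* free dwords left in the ring */
};

struct toy_request {
	struct toy_ring *ring;
	unsigned int reserved_space; /* dwords held back for the end of the request */
};

/* Checked path: refuse any emit that would eat into the reservation. */
static bool toy_ring_begin(struct toy_request *rq, unsigned int dwords)
{
	if (rq->ring->space < dwords + rq->reserved_space)
		return false; /* a real driver would wait for space here */

	rq->ring->space -= dwords;
	return true;
}

/* Unchecked path: sizes the emit from ring->space alone. */
static unsigned int toy_emit_unchecked(struct toy_request *rq,
				       unsigned int want)
{
	unsigned int n = want < rq->ring->space ? want : rq->ring->space;

	rq->ring->space -= n;
	return n;
}

/* The condition the GEM_BUG_ON expects to hold at submission. */
static bool toy_reservation_intact(const struct toy_request *rq)
{
	return rq->reserved_space <= rq->ring->space;
}
```

With a 64 dword ring and 16 dwords reserved, toy_ring_begin() keeps the 
invariant however often it is called, whereas a single 
toy_emit_unchecked() that is not followed by another toy_ring_begin() 
can leave ring->space below reserved_space, which is the state the 
GEM_BUG_ON would catch.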

>   			dma_resv_unlock(bo->base.resv);
>   			ret = VM_FAULT_SIGBUS;
>   			goto out_rpm;

