[Intel-gfx] [PATCH 3/8] drm/i915: Cope with request list state change during error state capture

Daniel Vetter daniel at ffwll.ch
Tue Oct 13 04:39:03 PDT 2015


On Fri, Oct 09, 2015 at 12:25:26PM +0100, Tomas Elf wrote:
> On 09/10/2015 08:48, Chris Wilson wrote:
> >On Thu, Oct 08, 2015 at 07:31:35PM +0100, Tomas Elf wrote:
> >>Since we're not synchronizing the ring request list during error state capture
> >>the request list state might change between the time the corresponding error
> >>request list was allocated and dimensioned to the time when the ring request
> >>list is actually captured into the error state. If this happens, throw a
> >>WARNING and do early exit and be aware that the captured error state might not
> >>be fully reliable.
> >>
> >>Signed-off-by: Tomas Elf <tomas.elf at intel.com>
> >>---
> >>  drivers/gpu/drm/i915/i915_gpu_error.c | 13 +++++++++++++
> >>  1 file changed, 13 insertions(+)
> >>
> >>diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> >>index 32c1799..cc75ca4 100644
> >>--- a/drivers/gpu/drm/i915/i915_gpu_error.c
> >>+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> >>@@ -1071,6 +1071,19 @@ static void i915_gem_record_rings(struct drm_device *dev,
> >>  		list_for_each_entry_safe(request, tmpreq, &ring->request_list, list) {
> >>  			struct drm_i915_error_request *erq;
> >>
> >>+			if (WARN_ON(!request || count >= error->ring[i].num_requests)) {
> >
> >Request cannot be null, count can legitmately be more, the WARN on is
> >inappropriate. Again, I sent several patches over the past couple of
> >years to fix this.
> >-Chris
> >
> 
> Ok, so having the driver request list change grow in between the point where
> we allocate and set up the error state request list to the same size as the
> driver request list (since that's what count being larger than the list size
> implies) is legitimate? Traversing into unallocated memory seems pretty
> dodgy to me but if you say so.

We still need to handle it ofc, but just not WARN on this condition since
it can happen.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


More information about the Intel-gfx mailing list