[Mesa-dev] [PATCH 4/4] i965: Drop the batch and limp along if execbuf fails.

Jason Ekstrand jason at jlekstrand.net
Mon Sep 25 18:41:39 UTC 2017


On September 25, 2017 11:46:18 AM Kenneth Graunke <kenneth at whitecape.org> 
wrote:

> On Monday, September 25, 2017 8:05:29 AM PDT Chris Wilson wrote:
>> Quoting Jason Ekstrand (2017-09-24 22:53:04)
>> > I've got this a few times recently and it's really annoying.  I don't know
>> > if this will fix anything or not but it may be worth a go.  I fear,
>> > however, that ignoring an execbuf failure will lead to permanently
>> > corrupted rendering or even additional hangs due to a chunk of the stream
>> > being missing.  That seems undesirable.  I would feel more comfortable
>> > about it if you flagged BRW_NEW_CONTEXT in this case to force a full
>> > state re-emit.
>>
>> The other step to consider is dependency chains. If you replace the
>> batch with an empty one (just MI_BB_END) and execute it with the rest of
>> the execobjects, then all the fences (both implicit and explicit) will
>> be valid. Just the contents are garbage, and GPU state can be fixed up
>> by NEW_CONTEXT as Jason suggested.
>> -Chris
>
> That seems like a good idea.  What do you think we should do if that
> fails?  I guess it would still try to pin all the execobject BOs, so
> couldn't it still hit a low memory problem and fail?
>
> Jason's email also reminded me that there's a bunch of CPU-side "current
> state of the GPU" fields that we'd need to reset too (which I also need
> to fix when we exceed the aperture).  They're kind of scattered all over
> the place now, so I'll try and round them up and fix that...

The most dangerous one I can think of is URB size.  That one can lead to
some nasty issues.
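
For reference, here is a rough, self-contained sketch of the recovery path
discussed in this thread.  It is not i965 code: every sketch_* name, the
SKETCH_NEW_CONTEXT flag (a stand-in for BRW_NEW_CONTEXT) and the fake
"kernel" that rejects oversized batches are assumptions made up for
illustration.  Only the MI_BATCH_BUFFER_END and MI_NOOP encodings are meant
to match the hardware.  What to do when the fallback submit also fails is
still an open question above, so the sketch just reports the original error.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MI_NOOP              0u
#define MI_BATCH_BUFFER_END  (0xAu << 23)
#define SKETCH_NEW_CONTEXT   (1ull << 0)  /* hypothetical stand-in for BRW_NEW_CONTEXT */
#define SKETCH_BATCH_DWORDS  64

/* Hypothetical stand-in for the kernel: rejects batches longer than `limit`. */
struct sketch_kernel { size_t limit; unsigned submitted; };

struct sketch_batch { uint32_t map[SKETCH_BATCH_DWORDS]; size_t used; };

/* CPU-side view of GPU state; a real driver tracks far more than this. */
struct sketch_ctx { uint64_t dirty; bool urb_size_valid; unsigned dropped; };

/* Pretend execbuf: returns 0 on success or a negative errno on failure. */
static int sketch_execbuf(struct sketch_kernel *k, const uint32_t *cmds,
                          size_t count)
{
   (void)cmds;
   if (count > k->limit)
      return -ENOMEM;
   k->submitted++;
   return 0;
}

/* Submit the batch.  On failure, drop its contents, submit an empty batch
 * in its place so dependency chains stay intact, and invalidate the cached
 * state so that everything is re-emitted in the next batch.
 */
int sketch_flush(struct sketch_kernel *k, struct sketch_ctx *ctx,
                 struct sketch_batch *batch)
{
   assert(batch->used <= SKETCH_BATCH_DWORDS);

   int ret = sketch_execbuf(k, batch->map, batch->used);
   if (ret != 0) {
      /* Replace the commands with MI_BB_END, padded to an even dword count. */
      batch->map[0] = MI_BATCH_BUFFER_END;
      batch->map[1] = MI_NOOP;
      batch->used = 2;

      /* Best effort only: this submit can fail as well. */
      (void)sketch_execbuf(k, batch->map, batch->used);

      /* The GPU never saw the dropped commands, so nothing cached on the
       * CPU side can be trusted any more, URB size included.
       */
      ctx->dirty |= SKETCH_NEW_CONTEXT;
      ctx->urb_size_valid = false;
      ctx->dropped++;
   }

   batch->used = 0;
   return ret;
}
```

With limit set to 4, flushing an 8-dword batch returns -ENOMEM, submits the
2-dword replacement instead, and leaves the context flagged for a full
re-emit with the URB size marked unknown.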

--Jason
