[Mesa-dev] [PATCH v3 10/25] panfrost: Make sure the BO is 'ready' when picked from the cache

Thu Sep 5 21:26:39 UTC 2019

On Thu, 5 Sep 2019 16:43:23 -0400
Alyssa Rosenzweig <alyssa at rosenzweig.io> wrote:

> > +bool
> > +panfrost_bo_wait(struct panfrost_bo *bo, int64_t timeout_ns)
> > +{
> > +        struct drm_panfrost_wait_bo req = {
> > +                .handle = bo->gem_handle,
> > +		.timeout_ns = timeout_ns,
> > +        };
> > +        int ret;
> > +
> > +        ret = drmIoctl(bo->screen->fd, DRM_IOCTL_PANFROST_WAIT_BO, &req);
> > +        if (ret != -1)
> > +                return true;
> > +
> > +        assert(errno == ETIMEDOUT || errno == EBUSY);
> > +        return false;
> > +}  
> 
> I would appreciate a comment explaining what the return value of this
> ioctl is. `ret != -1` and asserting an errno is... suspicious? Not
> wrong, to my knowledge, but hard to decipher without context.

Will document that.

> 
> > +        /* Before creating a BO, we first want to check the cache but without
> > +         * waiting for BO readiness (BOs in the cache can still be referenced
> > +         * by jobs that are not finished yet).
> > +         * If the cached allocation fails we fall back on fresh BO allocation,
> > +         * and if that fails too, we try one more time to allocate from the
> > +         * cache, but this time we accept to wait.
> >           */  
> 
> Conceptually:
> 
> We first try a ready BO from the cache. OK.
> 
> If that fails, there is no BO in the cache that is currently ready for
> use; by definition of BO readiness, this is because another concurrent
> job is using it. We then try to create a new BO. Suppose a given job
> uses an average of `b` BOs. Then for `j` concurrent jobs, assuming all
> of these allocations succeed, we have `j * b` BOs in the cache. This is
> an unfortunate bump in memory usage but necessary for pipelining.
> 
> If that allocation fails, by definition of memory allocation failures,
> we ran out of memory and cannot proceed with the allocation. Either:
> 
>  - The BO cache is responsible for this. In this case, continuing to use
>    the BO cache (even with the waits) will just dig us deeper into the
>    hole. Perhaps we should call bo_evict_all from userspace to handle
>    the memory pressure? Or does madvise render this irrelevant?

Evict won't help here as memory will only be released after the jobs
are done using it. And madvise doesn't help either, for the same reason.

> 
>  - The BO cache is not responsible for this. In this case, we could
>    continue to use the BO cache, but then either:
> 
>    	- There is a BO we can wait for. Then waiting is okay.
> 	- There is not. Then that cache fetch fails and we kerplutz.
> 	  What now? If we need an allocation, cache or no cache, if the
> 	  kernel says no, no means no. What then?

The behavior hasn't changed regarding allocation failures: it's still
an assert(), so the code is not more or less buggy than it was :p. What
happens when assert()s are disabled? probably a segfault because of a
NULL pointer dereference. So, adding the fprintf() is probably a good
idea as a first step, and then we can see if we can handle the OOM case
gracefully.

> 
> In short, I'm not convinced this algorithm (specifically the last step)
> is ideal.

It really depends on how robust you want to be when the system is under
memory pressure vs how long you accept to wait. Note that, in the worst
case scenario we wouldn't wait more than we currently do, as having each
batch wait on BOs of the previous batch is just like the serialization
we had in panfrost_flush(). I don't see it as a huge problem, but maybe
I'm wrong.

> 
> If there is no memory left for us, is it responsible to continue at all?

It's not exactly no memory, it's no immediately available memory.

> Should we just fail the allocation after step 2, and if the caller has a
> problem with that, it's their issue? Or we abort here after step 2?

I think that one is a separate issue. I mean, that's something we have
to handle even if we go through step 3 and step 3 fails. 

> I
> don't like the robustness implications but low memory behaviour is a
> risky subject as it is; I don't want to add more unknowns into it --
> aborting it with an assert(0) is something we can recognize immediately.
> Strange crashes in random places with no explanation, less so.

And that hasn't changed. We still have an assert after step 3.

> 
> CC'ing Rob to see if he has any advise re Panfrost madvise interactions
> as well as general kernel OOM policy.