[Mesa-dev] [PATCH v3 10/25] panfrost: Make sure the BO is 'ready' when picked from the cache
Alyssa Rosenzweig
alyssa at rosenzweig.io
Thu Sep 5 22:21:10 UTC 2019
> Will document that.
+1
> Evict won't help here as memory will only be released after the jobs
> are done using it. And madvise doesn't help either, for the same reason.
Ah-ha, I understand the distinction; thank you.
> The behavior hasn't changed regarding allocation failures: it's still
> an assert(), so the code is not more or less buggy than it was :p. What
> happens when assert()s are disabled? probably a segfault because of a
> NULL pointer dereference. So, adding the fprintf() is probably a good
> idea as a first step, and then we can see if we can handle the OOM case
> gracefully.
Haha, that's reasonable. I'm wondering if we should try some assert-less
stress test but maybe that doesn't matter until "productization".
> > In short, I'm not convinced this algorithm (specifically the last step)
> > is ideal.
>
> It really depends on how robust you want to be when the system is under
> memory pressure vs how long you accept to wait. Note that, in the worst
> case scenario we wouldn't wait more than we currently do, as having each
> batch wait on BOs of the previous batch is just like the serialization
> we had in panfrost_flush(). I don't see it as a huge problem, but maybe
> I'm wrong.
Ya, I don't know; these seem like hard problems to say the least :-(
> > If there is no memory left for us, is it responsible to continue at all?
>
> It's not exactly no memory, it's no immediately available memory.
'fraid I don't know enough about Linux allocators to grok the
distinction.
> > Should we just fail the allocation after step 2, and if the caller has a
> > problem with that, it's their issue? Or we abort here after step 2?
>
> I think that one is a separate issue. I mean, that's something we have
> to handle even if we go through step 3 and step 3 fails.
>
> > I
> > don't like the robustness implications but low memory behaviour is a
> > risky subject as it is; I don't want to add more unknowns into it --
> > aborting it with an assert(0) is something we can recognize immediately.
> > Strange crashes in random places with no explanation, less so.
>
> And that hasn't changed. We still have an assert after step 3.
OK, let's hold off until after this series is merged :)
More information about the mesa-dev
mailing list