[Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [1/4] dma-buf: add caching of sg_table
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Tue Aug 7 15:13:53 UTC 2018
On 07/08/2018 14:21, Daniel Vetter wrote:
> On Wed, Jul 18, 2018 at 03:24:26PM +0200, Christian König wrote:
>> Hi Daniel,
>>
>> Am 18.07.2018 um 14:07 schrieb Patchwork:
>>> == Series Details ==
>>>
>>> Series: series starting with [1/4] dma-buf: add caching of sg_table
>>> URL : https://patchwork.freedesktop.org/series/46778/
>>> State : failure
>>> [SNIP]
>>
>> it looks like I'm a step further understanding the problems which come with
>> this change.
>>
>> I've more or less audited all use cases and think that only i915 is left
>> with the following lock inversion: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_9705/fi-cfl-8700k/igt@gem_mmap_gtt@basic-small-bo-tiledx.html
>>
>> Now my question is what is &obj->mm.lock used for and why do you guys call
>> dma_buf_map_attachment() while holding it?
>
> obj->mm.lock is the lock to protect getting at the backing storage.
> i915_gem_object_get_pages and _put_pages are the relevant functions.
>
> Existing paths want to pin the backing storage while holding the
> reservation lock. And your new path needs to do the inverse since
> dma_buf_map_attachment now also requires the reservation lock. And that is
> obviously called from within the dma-buf importer version of get_pages.
>
> I think there's 2 solutions:
>
> - Merge obj->mm.lock and the reservation lock. Probably the cleaner
> solution, but likely more work.
>
> - Make sure the obj->mm.lock always nests within the reservation lock, and
> grab the reservation lock anywhere it's not yet grabbed. Then you can
> use the dma_buf_map_attachment_locked variant in
> i915_gem_object_get_pages_dmabuf to avoid the locking inversion. This
> would essentially make the obj->mm.lock fully redundant.
>
> Either way is going to be quite a bit of work. I expect that you need to
> replace all the cases of dma_buf_map_attachment in i915 with
> dma_buf_map_attachment_locked, and adjust the entire callchain to the new
> locking scheme.
>
> The real trouble here imo is that i915 CI is just the canary, I expect a
> bunch of other drivers will also look at an inverted locking hierarchy if
> dma_buf_map_attachment needs the reservation lock. And there's no
> convenient CI for them, and code audit won't cut it really (at least I'm
> too stupid to keep the locking hierarchy of an entire driver in my head).
We chatted about this on #intel-gfx and concluded that either solution
derives to replacing the obj->mm.lock with the reservation lock. And
that is problematic for i915, both from the reason of a general
direction towards more fine-grained locking, and also issue that
reservation lock needs to be avoided under the shrinker path (we lock
obj->mm.lock when dropping the backing store there).
I proposed that maybe we could re-jig how we use obj->mm.lock a bit, to
ensure backing store vfunc (get_pages) is not called under it (although
I haven't thought it fully through it may be possible without too
significant drawbacks), but Chris also has some patches which may work
around this in a different way. So I'll wait to see those first.
On whether or not reservation lock is the right lock to use from dma-buf
for this purpose I'll leave other guys to comment - I am not fully into
the details of dma-buf design.
Regards,
Tvrtko
More information about the Intel-gfx
mailing list