[Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [1/4] dma-buf: add caching of sg_table

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Tue Aug 7 15:13:53 UTC 2018


On 07/08/2018 14:21, Daniel Vetter wrote:
> On Wed, Jul 18, 2018 at 03:24:26PM +0200, Christian König wrote:
>> Hi Daniel,
>>
>> Am 18.07.2018 um 14:07 schrieb Patchwork:
>>> == Series Details ==
>>>
>>> Series: series starting with [1/4] dma-buf: add caching of sg_table
>>> URL   : https://patchwork.freedesktop.org/series/46778/
>>> State : failure
>>> [SNIP]
>>
>> it looks like I'm a step further understanding the problems which come with
>> this change.
>>
>> I've more or less audited all use cases and think that only i915 is left
>> with the following lock inversion: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_9705/fi-cfl-8700k/igt@gem_mmap_gtt@basic-small-bo-tiledx.html
>>
>> Now my question is what is &obj->mm.lock used for and why do you guys call
>> dma_buf_map_attachment() while holding it?
> 
> obj->mm.lock is the lock to protect getting at the backing storage.
> i915_gem_object_get_pages and _put_pages are the relevant functions.
> 
> Existing paths want to pin the backing storage while holding the
> reservation lock. And your new path needs to do the inverse since
> dma_buf_map_attachment now also requires the reservation lock. And that is
> obviously called from within the dma-buf importer version of get_pages.
> 
> I think there's 2 solutions:
> 
> - Merge obj->mm.lock and the reservation lock. Probably the cleaner
>    solution, but likely more work.
> 
> - Make sure the obj->mm.lock always nests within the reservation lock, and
>    grab the reservation lock anywhere it's not yet grabbed. Then you can
>    use the dma_buf_map_attachment_locked variant in
>    i915_gem_object_get_pages_dmabuf to avoid the locking inversion. This
>    would essentially make the obj->mm.lock fully redundant.
> 
> Either way is going to be quite a bit of work. I expect that you need to
> replace all the cases of dma_buf_map_attachment in i915 with
> dma_buf_map_attachment_locked, and adjust the entire callchain to the new
> locking scheme.
> 
> The real trouble here imo is that i915 CI is just the canary, I expect a
> bunch of other drivers will also look at an inverted locking hierarchy if
> dma_buf_map_attachment needs the reservation lock. And there's no
> convenient CI for them, and code audit won't cut it really (at least I'm
> too stupid to keep the locking hierarchy of an entire driver in my head).

We chatted about this on #intel-gfx and concluded that either solution 
derives to replacing the obj->mm.lock with the reservation lock. And 
that is problematic for i915, both from the reason of a general 
direction towards more fine-grained locking, and also issue that 
reservation lock needs to be avoided under the shrinker path (we lock 
obj->mm.lock when dropping the backing store there).

I proposed that maybe we could re-jig how we use obj->mm.lock a bit, to 
ensure backing store vfunc (get_pages) is not called under it (although 
I haven't thought it fully through it may be possible without too 
significant drawbacks), but Chris also has some patches which may work 
around this in a different way. So I'll wait to see those first.

On whether or not reservation lock is the right lock to use from dma-buf 
for this purpose I'll leave other guys to comment - I am not fully into 
the details of dma-buf design.

Regards,

Tvrtko


More information about the Intel-gfx mailing list