[Intel-gfx] [PATCH] drm/i915: Remove temporary allocation of dma addresses when rotating

Mon Feb 27 14:31:17 UTC 2017

On 27/02/2017 10:21, Chris Wilson wrote:
> On Mon, Feb 27, 2017 at 10:14:12AM +0000, Tvrtko Ursulin wrote:
>>
>> On 27/02/2017 10:06, Chris Wilson wrote:
>>> On Mon, Feb 27, 2017 at 09:55:10AM +0000, Tvrtko Ursulin wrote:
>>>>
>>>> On 22/02/2017 08:44, Chris Wilson wrote:
>>>>> I also think that's an argument for improving the general cache rather
>>>>> than arguing against using it.
>>>>
>>>> Well I wasn't concerned about the cache per se, but about whether it
>>>> is completely appropriate (best choice) to use it in this particular
>>>> case.
>>>>
>>>> Because as I said before, for 1920x1080x32 we are talking about a
>>>> 16KiB extremely short lived temporary allocation, vs the similar
>>>> size for the sg radix cache. But radix cache sticks around the the
>>>> lifetime of obj->mm.pages and it wouldn't otherwise be there since
>>>> AFAICS in practice no one really touches frame buffers in a way to
>>>> trigger its creation.
>>>>
>>>> Those amounts of memory are not a concern, but again, is the
>>>> simplification of the code worth the conceptual downsides mentioned
>>>> above? Even if we considered 4K frame buffers, when both allocations
>>>> go to ~64KiB, would that change anything? I am not sure, probably
>>>> not for me.
>>>>
>>>> So I am still unsure that we should go with this change.
>>>
>>> Again, the complaint you have here are general concerns about caching
>>> the mapping. Avoiding using the cache instead of improving the cache
>>> seems the wrong approach.
>>
>> Depends what kind of improvments to the cache you have in mind. If
>> you are thinking about size then I disagree, I think it is efficient
>> enough already. But if you are thinking about the lifetime
>> management then it is obvious from all that I have written so far
>> that I would agree with that. Since the core of my "complaint" is
>> the lifetime mismatch, and not the size.
>>
>> For lifetime I am not sure what you could do. Exposing the size of
>> it, with maybe some other bits attached the the object, to the
>> shrinker I think doesn't make much sense since the sizes are so
>> small compared to the backing store sizes.
>>
>> Perhaps you could add an explicit reset of the cache after the
>> rotation is done with it, but then the only remaining benefit will
>> be avoiding greater than zero order allocations. I say the only
>> one.. it would still be a good one. Just have no idea if this level
>> of cache usage would satisfy you!
>>
>> Perhaps you could say what kind of optimisation you have in mind to
>> save me guessing? :)
>
> I was thinking you would like an inactivity timer. Or we could have a
> separate shrinker, as that's the principal cache management system.

I thought about the shrinker myself. Even wrote some code to more 
accurately size the objects as part of the existing passes. But as I 
said the contribution of anything object and not backing store is so 
small that, even though it would conceptually be more correct and 
perhaps avoid some marginal over-shrinking, I am not sure it is worth 
doing it. Assuming of course that I got the sizing of the radix tree 
correct! I just hacked something up based on some debug dumping code 
from radix-tree.c.

So the complication is there is no API to get the size of the radix tree 
(or the scatter list table) and we would have to add something, either 
internally to i915, or try and upstream it.

Or we avoid that with your timer idea and just purge all caches which 
haven't been used in a while. Maybe from idle work or something.

But for this immediate patch, would you be happy with adding and 
exporting i915_gem_object_reset_page_iter and calling it after rotation 
is done with accessing the pages? Benefit would be avoidance of 
drm_malloc_gfp if that bothers you most.

Regards,

Tvrtko