[Intel-gfx] [PATCH 2/3] drm/i915: Infrastructure for supporting different GGTT views per object
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Thu Dec 4 04:17:51 PST 2014
On 12/04/2014 10:59 AM, Daniel Vetter wrote:
> On Thu, Dec 04, 2014 at 10:26:14AM +0000, Chris Wilson wrote:
>> On Thu, Dec 04, 2014 at 10:19:09AM +0000, Tvrtko Ursulin wrote:
>>>
>>> On 12/04/2014 09:53 AM, Chris Wilson wrote:
>>>> On Wed, Dec 03, 2014 at 02:59:25PM +0000, Tvrtko Ursulin wrote:
>>>>> +void i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
>>>>> +		   u32 flags)
>>>>> +{
>>>>> +	struct sg_table *pages = i915_ggtt_view_pages(vma);
>>>>> +
>>>>> +	if (pages && !IS_ERR(pages)) {
>>>>> +		vma->bind_vma(vma, pages, cache_level, flags);
>>>>> +
>>>>> +		if (vma->ggtt_view.type != I915_GGTT_VIEW_NORMAL) {
>>>>> +			sg_free_table(pages);
>>>>> +			kfree(pages);
>>>>> +		}
>>>>> +	}
>>>>> +}
>>>>
>>>> Stop. Even if the failure path is impossible with the present
>>>> implementation, here you are masking the error only to go and pretend
>>>> the binding succeeded.
>>>>
>>>> Don't be lazy, this is a very nasty bug that should be hit during igt -
>>>> or else you are not testing well enough.
>>>
>>> Fair comment, even if a bit too assuming. I actually had this as
>>> TODO but somehow lost it.
>>>
>>> I don't have any ideas on how to provoke this to fail from an IGT?
>>> Even with future implementations it boils down to a couple of small
>>> allocations which would have to fail reliably.
>>
>> We have quite a few thrash tests now that are fairly good at getting
>> even the small allocations to fail.
>>
>> What we don't have is a single-fd, multi-ctx thrash test (well except
>> for some GL tests...)
>
> But none of these tests result in permanent memory failures (only the
> occasional ioctl restart when waiting for gpu rendering). And sg table
> alloc only recurses through the shrinker so that can't happen. So I think
> we just have to get by with review.
>
> We did have issues with sg table allocations in stress tests though,
> before we've added the recursive shrinker locking, hence sg table alloc
> can indeed go south.
I looked at propagating errors from i915_vma_bind() out to its callers.
That is mostly fine, apart from i915_gem_restore_gtt_mappings() during
i915_drm_resume(), where a failed rebind cannot sensibly be handled.
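
To make the propagation part concrete, here is a rough userspace sketch of
a bind function that returns an error instead of swallowing it. It is not
the patch: struct vma, view_pages(), the fail_alloc knob and the ERR_PTR
stand-ins are all hypothetical mocks, and only the control flow matters.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel's ERR_PTR helpers (assumption: LP64). */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct sg_table { int nents; };		/* mock, not the real struct */

enum view_type { VIEW_NORMAL, VIEW_OTHER };	/* hypothetical */

struct vma {
	enum view_type view;
	struct sg_table *obj_pages;	/* pages owned by the object */
	int fail_alloc;			/* test knob: simulate -ENOMEM */
	int bound;
};

/* Mock of the view-pages lookup: may return an ERR_PTR. */
static struct sg_table *view_pages(struct vma *vma)
{
	struct sg_table *st;

	if (vma->view == VIEW_NORMAL)
		return vma->obj_pages;
	if (vma->fail_alloc)
		return ERR_PTR(-ENOMEM);

	st = calloc(1, sizeof(*st));
	return st ? st : ERR_PTR(-ENOMEM);
}

/* Returns 0 or a negative errno; never pretends a failed bind succeeded. */
static int vma_bind(struct vma *vma)
{
	struct sg_table *pages = view_pages(vma);

	if (!pages)
		return -EINVAL;
	if (IS_ERR(pages))
		return (int)PTR_ERR(pages);

	vma->bound = 1;			/* stands in for vma->bind_vma(...) */

	if (vma->view != VIEW_NORMAL)
		free(pages);		/* temporary table, dropped after use */

	return 0;
}
```

Callers that can unwind simply pass the return value up. The resume path
is the one that cannot.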
I don't see how this is fixable other than by going back to having the
sgls stay around for the lifetime of their VMAs. That shouldn't be a big
deal: they are not that big, even with non-coalesced entries.
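
A minimal userspace sketch of the keep-the-sgl-around idea, with mock types
and hypothetical names throughout (vma_create(), restore_mappings() and the
simulate_enomem knob are not driver functions): if the table is built once
when the VMA is created and freed with it, the rebind path has no
allocation left that could fail.

```c
#include <errno.h>
#include <stdlib.h>

struct sg_table { int nents; };		/* mock, not the real struct */

struct vma {
	struct sg_table *pages;		/* built once, kept until vma_destroy() */
	int bind_count;
};

/* The only step that can fail: allocate the view's table at creation. */
static int vma_create(struct vma *vma, int simulate_enomem)
{
	vma->bind_count = 0;
	vma->pages = simulate_enomem ? NULL : calloc(1, sizeof(*vma->pages));
	return vma->pages ? 0 : -ENOMEM;
}

/* Binding reuses the cached table, so there is nothing to allocate. */
static void vma_bind(struct vma *vma)
{
	vma->bind_count++;		/* stands in for writing the PTEs */
}

/* Resume-time rebind of every VMA: no error path needed. */
static void restore_mappings(struct vma *vmas, size_t n)
{
	for (size_t i = 0; i < n; i++)
		vma_bind(&vmas[i]);
}

/* The table goes away together with its VMA. */
static void vma_destroy(struct vma *vma)
{
	free(vma->pages);
	vma->pages = NULL;
}
```

The cost is one table held per VMA for as long as the VMA exists.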
Thoughts?
Regards,
Tvrtko