[Intel-gfx] [PATCH] drm/i915: Convert WARNs during userptr revoke to SIGBUS
Tvrtko Ursulin
tvrtko.ursulin at linux.intel.com
Thu Sep 24 03:55:23 PDT 2015
On 09/24/2015 11:31 AM, Chris Wilson wrote:
> On Thu, Sep 24, 2015 at 11:23:48AM +0100, Tvrtko Ursulin wrote:
>>
>> On 09/23/2015 09:07 PM, Chris Wilson wrote:
>>> If the client revokes the virtual address it asked to be mapped into GPU
>>> space via userptr (by using anything along the lines of mmap, mprotect,
>>> madvise, munmap, ftruncate etc) the mmu notifier sends a range
>>> invalidate command to userptr. Upon receiving the invalidation signal
>>> for the revoked range, we try to release the struct pages we pinned into
>>> the GTT. However, this can fail if any of the GPU's VMA are pinned for
>>> use by the hardware (i.e. despite the user's intention we cannot
>>> relinquish the client's address range and keep up to date with whatever is
>>> placed in there). Currently we emit a few WARN so that we would notice
>>> if this ever occurred in the wild; it has. Sadly this means we need to
>>> replace those WARNs with the proper SIGBUS to the offending clients
>>> instead.
>>
>> How does it happen? Frame buffer?
>
> Ignoring the issue of -EIO since patches to fix that path also haven't
> landed, the primary cause is through binding the userptr to a scanout
> (framebuffer). This is not recommended usage for userptr since the CPU
> view is then incoherent, but not impossible.
>
>>> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
>>> Cc: Michał Winiarski <michal.winiarski at intel.com>
>>> ---
>>> drivers/gpu/drm/i915/i915_gem_userptr.c | 41 +++++++++++++++++++++++++++++----
>>> 1 file changed, 37 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
>>> index f75d90118888..efb404b9fe69 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_userptr.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
>>> @@ -81,11 +81,44 @@ static void __cancel_userptr__worker(struct work_struct *work)
>>
>> This line is a reminder that the previous series still hasn't landed. I
>> think it was all r-b-ed, with only my request to not rely on
>> release_pages (or something) handling negative and zero page counts.
>>
>>> was_interruptible = dev_priv->mm.interruptible;
>>> dev_priv->mm.interruptible = false;
>>>
>>> - list_for_each_entry_safe(vma, tmp, &obj->vma_list, obj_link) {
>>> - int ret = i915_vma_unbind(vma);
>>> - WARN_ON(ret && ret != -EIO);
>>> + list_for_each_entry_safe(vma, tmp, &obj->vma_list, obj_link)
>>> + i915_vma_unbind(vma);
>>> + if (i915_gem_object_put_pages(obj)) {
>>> + struct task_struct *p;
>>> +
>>> + DRM_ERROR("Unable to revoke ownership by userptr of"
>>> + " invalidated address range, sending SIGBUS"
>>> + " to attached clients.\n");
>>> +
>>> + rcu_read_lock();
>>> + for_each_process(p) {
>>
>> I don't think this is safe without holding the tasklist_lock.
>
> Hmm, it's the only lock taken in the oom-killer for sending the signal.
> The list will not change nor will tasks disappear whilst we hold the
> read-lock so it seems sane.
Then I'll say hmm as well, since I have now seen both patterns in use:
with and without holding the tasklist_lock.

I thought that with just rcu_read_lock, nothing prevents another CPU
from taking the tasklist_lock for writing and modifying the list. But
maybe we are talking about some complex locking scheme here? I don't
know. I did not find any documentation on the tasklist_lock.
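
To make the comparison concrete, the two patterns look roughly like the
sketch below. This is an illustration only, written for kernel context and
not compile-tested; the function names count_tasks_rcu() and
count_tasks_locked() are hypothetical and exist nowhere in the tree. My
understanding, which may be incomplete, is that for_each_process() walks the
list with RCU-aware accessors, so the first form only guarantees that a task
we are looking at is not freed under us, not that the list stays unchanged.

```c
/* Sketch only: kernel context, not compile-tested, helper names made up. */
#include <linux/sched.h>	/* for_each_process(), tasklist_lock */
#include <linux/rcupdate.h>	/* rcu_read_lock(), rcu_read_unlock() */
#include <linux/spinlock.h>	/* read_lock(), read_unlock() */

/* Pattern 1: RCU read-side critical section only. */
static unsigned int count_tasks_rcu(void)
{
	struct task_struct *p;
	unsigned int count = 0;

	rcu_read_lock();
	for_each_process(p) {
		/*
		 * p is not freed while we are inside the read-side
		 * section, but tasks may still be added to or unlinked
		 * from the list concurrently by a tasklist_lock writer.
		 */
		count++;
	}
	rcu_read_unlock();

	return count;
}

/* Pattern 2: hold tasklist_lock for reading, which excludes writers. */
static unsigned int count_tasks_locked(void)
{
	struct task_struct *p;
	unsigned int count = 0;

	read_lock(&tasklist_lock);
	for_each_process(p)
		count++;
	read_unlock(&tasklist_lock);

	return count;
}
```

If that reading is right, the question for this patch is whether signalling
every attached client needs the stable list of the second form, or whether
the weaker guarantee of the first form is enough.
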
Regards,
Tvrtko