[PATCH 2/2] drm/radeon: fix deadlock when bo is associated to different handle
Jerome Glisse
j.glisse at gmail.com
Wed Nov 28 07:38:20 PST 2012
On Wed, Nov 28, 2012 at 11:27:27AM +0100, Christian König wrote:
> On 27.11.2012 19:07, j.glisse at gmail.com wrote:
> >From: Jerome Glisse <jglisse at redhat.com>
> >
> >There is a rare case, that seems to only happen accross suspend/resume
> >cycle, where a bo is associated with several different handle. This
> >lead to a deadlock in ttm buffer reservation path. This could only
> >happen with flinked(globaly exported) object. Userspace should not
> >reopen multiple time a globaly exported object.
> >
> >However the kernel should handle gracefully this corner case and not
> >keep rejecting the userspace command stream. This is the object of
> >this patch.
> >
> >Fix suspend/resume issue where user see following message :
> >[drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -35!
> >
> >Signed-off-by: Jerome Glisse <jglisse at redhat.com>
>
> See comment below.
>
> >---
> > drivers/gpu/drm/radeon/radeon_cs.c | 53 ++++++++++++++++++++++----------------
> > 1 file changed, 31 insertions(+), 22 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
> >index 41672cc..064e64d 100644
> >--- a/drivers/gpu/drm/radeon/radeon_cs.c
> >+++ b/drivers/gpu/drm/radeon/radeon_cs.c
> >@@ -54,39 +54,48 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser *p)
> > return -ENOMEM;
> > }
> > for (i = 0; i < p->nrelocs; i++) {
> >- struct drm_radeon_cs_reloc *r;
> >-
> >+ struct drm_radeon_cs_reloc *reloc;
> >+
> >+ /* One bo could be associated with several different handle.
> >+ * Only happen for flinked bo that are open several time.
> >+ *
> >+ * FIXME:
> >+ * Maybe we should consider an alternative to idr for gem
> >+ * object to insure a 1:1 uniq mapping btw handle and gem
> >+ * object.
> >+ */
> > duplicate = false;
> >- r = (struct drm_radeon_cs_reloc *)&chunk->kdata[i*4];
> >+ reloc = (struct drm_radeon_cs_reloc *)&chunk->kdata[i*4];
> >+ p->relocs[i].handle = 0;
> >+ p->relocs[i].flags = reloc->flags;
> >+ p->relocs[i].gobj = drm_gem_object_lookup(ddev,
> >+ p->filp,
> >+ reloc->handle);
> >+ if (p->relocs[i].gobj == NULL) {
> >+ DRM_ERROR("gem object lookup failed 0x%x\n",
> >+ reloc->handle);
> >+ return -ENOENT;
> >+ }
> >+ p->relocs[i].robj = gem_to_radeon_bo(p->relocs[i].gobj);
> >+ p->relocs[i].lobj.bo = p->relocs[i].robj;
> >+ p->relocs[i].lobj.wdomain = reloc->write_domain;
> >+ p->relocs[i].lobj.rdomain = reloc->read_domains;
> >+ p->relocs[i].lobj.tv.bo = &p->relocs[i].robj->tbo;
> >+
> > for (j = 0; j < i; j++) {
> >- if (r->handle == p->relocs[j].handle) {
> >+ if (p->relocs[i].lobj.bo == p->relocs[j].lobj.bo) {
> > p->relocs_ptr[i] = &p->relocs[j];
> > duplicate = true;
> > break;
> > }
> > }
> >+
> > if (!duplicate) {
> >- p->relocs[i].gobj = drm_gem_object_lookup(ddev,
> >- p->filp,
> >- r->handle);
> >- if (p->relocs[i].gobj == NULL) {
> >- DRM_ERROR("gem object lookup failed 0x%x\n",
> >- r->handle);
> >- return -ENOENT;
> >- }
> > p->relocs_ptr[i] = &p->relocs[i];
> >- p->relocs[i].robj = gem_to_radeon_bo(p->relocs[i].gobj);
> >- p->relocs[i].lobj.bo = p->relocs[i].robj;
> >- p->relocs[i].lobj.wdomain = r->write_domain;
> >- p->relocs[i].lobj.rdomain = r->read_domains;
> >- p->relocs[i].lobj.tv.bo = &p->relocs[i].robj->tbo;
> >- p->relocs[i].handle = r->handle;
> >- p->relocs[i].flags = r->flags;
> >+ p->relocs[i].handle = reloc->handle;
> > radeon_bo_list_add_object(&p->relocs[i].lobj,
> > &p->validated);
> >-
> >- } else
> >- p->relocs[i].handle = 0;
>
> I'm not sure if the memory p->relocs is pointing to is zero
> initialized, so we should at least initialize whatever member we use
> to find the duplicates. Also I think we don't need the handle in
> this structure any more if we don't use it for comparison (but not
> 100% sure without testing it).
No need to initialize p->relocs[i].lobj.bo which is the one use to find
duplicate. When a duplicate is found p->relocs_ptr[i] points to first
relocation with the duplicate bo. p->relocs[i].lobj.bo is always
initialized before looking for duplicate.
I kept the handle around because its usefull for debuging. But it could
as well be removed and just added back whenever someone is doing debugging.
Cheers,
Jerome
>
> >+ }
> > }
> > return radeon_bo_list_validate(&p->validated);
> > }
>
More information about the dri-devel
mailing list