[Mesa-dev] [android-x86-devel] Re: need-help: how to change to newest mesa in android-x86?
Rob Clark
robdclark at gmail.com
Fri Jan 22 09:10:40 PST 2016
On Fri, Jan 22, 2016 at 10:29 AM, Rob Herring <robherring2 at gmail.com> wrote:
> On Mon, Jan 18, 2016 at 4:46 PM, Rob Clark <robdclark at gmail.com> wrote:
>> On Mon, Jan 18, 2016 at 5:18 PM, Rob Herring <robherring2 at gmail.com> wrote:
>>> On Fri, Jan 15, 2016 at 3:10 PM, Dave Airlie <airlied at gmail.com> wrote:
>>>>>
>>>>> well, nothing specific, but for example early on we had some confusion
>>>>> in drm_gralloc (when adding dmabuf fd support) about who close()d the
>>>>> fd's. Resulting in same fd getting closed twice (although some
>>>>> completely unrelated file handle might have snuck into that slot
>>>>> between the two close()s).
>>>>>
>>>>> Or you could end up w/ the same sort of thing for gem handles (which
>>>>> are unique to the drm device fd, so if you open() the device multiple
>>>>> times..). I did initially have some confusion about this, w/ multiple
>>>>> fd_bo's (or pipe_resource's) being created for a given gem handle, and
>>>>> first one that gets deleted takes away the backing store that both
>>>>> userspace references where using.
>>>>>
>>>>> You might want to look at freedreno_drm_winsys.c and the crazy things
>>>>> it does.. I suspect virgl might need similar hacks. That is really
>>>>> one part of two, that ensures that we share the same pipe_screen, and
>>>>> therefore (from libdrm) fd_device, for a given drm device fd. The
>>>>> other part is that fd_device has hashtables to deal w/ double import
>>>>> of gem buffers.
>>>>
>>>> I think Rob Clark might be on the right track here.
>>>>
>>>> it sounds like the host is sending double deletes for some objects, which
>>>> sounds like a possible reference counting bug in the host.
>>>>
>>>> So possibly something drm_gralloc, but it could be a bug in mesa, though
>>>> I haven't triggered anything like this yet.
>>>
>>> Some more details. If I comment out the freeing of imported buffers in
>>> gralloc_drm_handle_unregister, things work a lot better. I think there
>>> is a missing reference taken on an alloc with an existing dmabuf FD.
>>> I've been comparing freedreno vs. pipe alloc/free implementations for
>>> gralloc. The freedreno alloc function will call drmPrimeFDToHandle and
>>> increment the BO ref count (in lookup_bo). The equivalent for virgl is
>>> in virgl_drm_winsys_resource_create_handle. It calls
>>> drmPrimeFDToHandle, but there is no reference taken. The shared handle
>>> case in that function does take a reference with a
>>> virgl_drm_resource_reference call. I may be off as my attempted fix
>>> doesn't work...
>>
>> Hmm, so I guess I should have removed gralloc_drm_freedreno, but by
>> the time I actually got things working properly on ifc6410 I was using
>> gralloc_drm_pipe.c.. sorry if you had been looking at the wrong code.
>
> I was and should have realized that. Still, it got me looking at the
> right area. The virgl driver is missing a BO handle hash table for
> prime fd handles. It is only doing handle lookups for shared handles.
> I've fixed that, but have broken the shared handle hashing in the
> process. Looking at other drivers, it appears I need to have 2 hash
> tables. I'm guessing shared handles and prime handles are from
> different number spaces?
I suppose there are actually three name-spaces (handles, flink names,
and dmabuf/prime fd's)..
IIRC, you end up needing an extra table for flink name importing,
otherwise the kernel will happily hand you back multiple gem handles
for the same bo, which causes similar sorts of confusion.
>> I originally abandoned gralloc_drm_freedreno once I realized that for
>> the refcnt'ing to work out I needed gralloc to share the same
>> fd_device ptr as mesa/gl. And by using gralloc_drm_pipe (plus that
>> freedreno winsys code I pointed to earlier) both gralloc and the gl
>> context share the same pipe_screen (and fd_device).
>
> Looks like this was the main problem. With screen ref counting and
> prime handle lookups, virgl seems to be working now.
>
> The radeon driver also does the same ref counting. Shouldn't this be
> common code rather than duplicated in each driver?
Yeah, probably. I think radeon and nouveau ended up implementing
similar sort of code for buffer sharing between gl and vdpau, where
basically the same sort of issue comes up. And I copied from one of
the two once I realized what was going on.
Might even be some room for some re-use on the libdrm side, but things
kinda grew organically there too. In both cases it is just a little
bit of code you more or less write / copypaste once and then never
look at again, so I guess by the time things work people forget to go
back and refactor.. at least that's my excuse ;-)
BR,
-R
More information about the mesa-dev
mailing list