[Mesa-dev] Using the right context in st_texture_release_all_sampler_views()

Thu Jul 23 01:56:20 PDT 2015

On 23/07/15 01:08, Jose Fonseca wrote:
> On 23/07/15 01:00, Brian Paul wrote:
>> On 07/22/2015 05:31 PM, Jose Fonseca wrote:
>>> On 22/07/15 23:32, Brian Paul wrote:
>>>> Hi Marek,
>>>>
>>>> This is regarding your commit "st/mesa: use pipe_sampler_view_release
>>>> for releasing sampler views" from last October.
>>>>
>>>> Basically, we have:
>>>>
>>>> void
>>>> st_texture_release_all_sampler_views(struct st_context *st,
>>>>                                       struct st_texture_object *stObj)
>>>> {
>>>>     GLuint i;
>>>>
>>>>     /* XXX This should use sampler_views[i]->pipe, not st->pipe */
>>>>     for (i = 0; i < stObj->num_sampler_views; ++i)
>>>>        pipe_sampler_view_release(st->pipe, &stObj->sampler_views[i]);
>>>> }
>>>>
>>>> Our VMware/svga driver has an issue when
>>>> pipe_context::sampler_view_destroy() is called with one context and a
>>>> sampler view which was created with another context.  I can hack around
>>>> it in our driver code, but it would be nice to fix this in the state
>>>> tracker.
>>>>
>>>> Ideally, the above code should be something like:
>>>>
>>>>     for (i = 0; i < stObj->num_sampler_views; ++i)
>>>>        pipe_sampler_view_reference(&stObj->sampler_views[i], NULL);
>>>>
>>>> The current code which uses the st->pipe context came from the bug
>>>> https://bugs.freedesktop.org/show_bug.cgi?id=81680
>>>>
>>>> AFAICT, you were just working around an R600 driver issue.  Any chance
>>>> we could fix the state tracker and re-test Firefox on R600?
>>>
>>> Freeing a view from a different context with another context is wrong.
>>>
>>> But freeing a view with a context that might be current on another
>>> thread is also wrong, as pipe_context are not thread safe.
>>>
>>>
>>> If that was the previous behavior, then maybe Firefox crashed due to the
>>> race conditions.  Looking at the stack trace, I suspect that what
>>> happened was that one context was freeing the st_texture, and all its
>>> views, while the other context was destroying itself, and all its own
>>> views.  In short, st_texture_release_sampler_view was being called
>>> simultanouesly from two different threads.
>>>
>>>
>>> The proper fix IMO is to let the pipe context that owns the view to
>>> destroyed from whatever thread it is current (or whenever it is made
>>> current.)
>>
>> OK, in our off-list discussion it wasn't clear to me that
>> multi-threading was your main concern until your last message.  I see
>> what you're saying now.
>>
>> So, when we destroy a texture object which may be shared by multiple
>> contexts, we theoretically need to move the texture's sampler views to
>> new, per-context lists.  At some later time, when the context is used,
>> we'd check if the list of sampler views to be destroyed was non-empty
>> and free them.
>>
>> Unfortunately, that involves one context/thread reaching into another
>> context.  And that sounds messy.
>
> Yes.  We could have a sepeare mutex for this list.
>
> Still, it's hard to prevent dead-locks (e.g, two contexts trying to add
> a deferred view to each-others defferred list.)
>
>> The ultimate solution may be to get rid of the per-texture list of
>> sampler views and instead store sampler views in a context-private data
>> structure.  I'll have to think that through.
>
> Yes, that's another possibility. I haven't thought it through either.
>
> One way or another, you'll still want to notify a context to "garbagge
> collect" its views, even if it's just an atomic flag on each context,
> otherwise a texture that was shared by two contexts might end up
> lingering around indefinitely because contexts are holding to its views.

What about this:

- we keep the surface views in a LRU-ordered linked list in each context 
  (whenever a view is unbounded, it's moved to the end of the list the 
time-stamp/frame-sequence number noted)

- and we dereference the old views whenever they haven't been refered 
for a while -- either in time (e.g, 1 second) or frames (e.g, 1 or 2 
frames.)

This has several benefits:

- the context that destroys the texture object doesn't need to worry 
about views from other contexts -- those context will automatically 
destroy them on their time

- there's no need for any context to ever need to refer other context's 
data.

- drivers for which views are heavy-weight  (e.g, contain auxiliary 
resources with their own storage) will reclaim that storage earlier if 
those views stop being needed

The aspect I'm less clear is how to effeciently find the views -- in 
addition of the LRU list, we'll probably need an additional data 
structure (an hash, or a table of lists indexed by texture object) to 
quickly find the views.

Jose