[PATCH RFC 102/111] staging: etnaviv: separate GPU pipes from execution state
Christian Gmeiner
christian.gmeiner at gmail.com
Wed Apr 8 00:28:03 PDT 2015
2015-04-07 23:25 GMT+02:00 Russell King - ARM Linux <linux at arm.linux.org.uk>:
> On Tue, Apr 07, 2015 at 06:59:59PM +0200, Christian Gmeiner wrote:
>> Hi Lucas.
>>
>> 2015-04-07 17:29 GMT+02:00 Lucas Stach <l.stach at pengutronix.de>:
>> > And I don't get why each core needs to have a single device node. IMHO
>> > this is purely an implementation decision whether to have one device
>> > node for all cores or one device node per core.
>>
>> It is an important decision. And I think that one device node per core
>> reflects the hardware design 100%.
>
> Since when do the interfaces to userspace need to reflect the hardware
> design?
>
> Isn't the point of having a userspace interface, in part, to abstract
> the hardware design details and provide userspace with something that
> is relatively easy to use without needlessly exposing the variation
> of the underlying hardware?
>
> Please get away from the idea that userspace interfaces should reflect
> the hardware design.
>
I think that we are in a phase of heavy discussion and we should talk about
every aspect of the driver design - keep in mind that we could skip staging,
and then the interface needs to be future-proof.
>> What makes it harder to get it right? The needed changes to the kernel
>> driver are not that hard. The user space is another story, but that's
>> because of the render-only thing, where we need to pass (prime)
>> buffers around and do fence syncs etc. In the end I do not see a
>> showstopper in the user space.
>
> The fence syncs are an issue when you have multiple cores - that's
> something I started to sort out in my patch series, but when you
> appeared to refuse to accept some of the patches, I stopped...
>
I hope we can close this chapter soon. I am quite sorry about that, but if
you had only answered a single mail or a single IRC message at that time,
we could have sorted this out.
> The problem when you have multiple cores is that one global fence event
> counter, which gets compared to the fence values in each buffer
> object, no longer works.
>
> Consider this scenario:
>
> You have two threads, thread A making use of a 2D core, and thread B
> using the 3D core.
>
> Thread B submits a big long render operation, and the buffers get
> assigned fence number 1.
>
> Thread A submits a short render operation, and the buffers get assigned
> fence number 2.
>
> The 2D core finishes, and sends its interrupt. Etnaviv updates the
> completed fence position to 2.
>
> At this point, we believe that fence numbers 1 and 2 are now complete,
> despite the 3D core continuing to execute and operate on the buffers
> with fence number 1.
>
Yes, this _is_ a problem.
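To make the race concrete, here is a minimal sketch of the kind of check
that breaks - made-up names, not the actual etnaviv code, just the
single-counter idea:

#include <stdbool.h>
#include <stdint.h>

struct my_bo {
	uint32_t fence;			/* fence number assigned at submit */
};

static uint32_t completed_fence;	/* bumped from *any* core's IRQ */

/* wrap-safe "a >= b" for 32-bit fence numbers */
static bool fence_after_eq(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) >= 0;
}

static bool my_bo_is_idle(const struct my_bo *bo)
{
	/*
	 * Broken with two cores: once the 2D core retires fence 2,
	 * fence 1 also looks complete, although the 3D core is still
	 * working on the buffers that carry fence 1.
	 */
	return fence_after_eq(completed_fence, bo->fence);
}

With a single counter there is simply no way to tell which core a given
fence number belongs to.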
> I'm certain that the fence implementation we currently have can't be
> made to work with multiple cores with a few tweaks - we need something
> better to cater for what is essentially out-of-order completion amongst
> the cores.
>
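Agreed, a few tweaks will not do. One direction - only a sketch, reusing
fence_after_eq() from above, and again with made-up names - would be one
completion counter per core, plus a note in the buffer object of which
core it was last submitted to:

enum my_pipe { MY_PIPE_2D, MY_PIPE_3D, MY_NUM_PIPES };

static uint32_t completed_fence_per_core[MY_NUM_PIPES];	/* one per core */

struct my_bo2 {
	enum my_pipe pipe;	/* core the buffer was last submitted to */
	uint32_t fence;		/* fence number on that core */
};

static bool my_bo2_is_idle(const struct my_bo2 *bo)
{
	/*
	 * Compare only against the counter of the core that actually
	 * executes the work, so the 2D core retiring fence 2 can no
	 * longer mark the 3D core's fence 1 buffers as done.
	 */
	return fence_after_eq(completed_fence_per_core[bo->pipe], bo->fence);
}

This still hand-waves buffers shared between cores - those would need a
fence per core, or real fence objects - but whatever we pick, the
completion accounting has to become per-core in some form.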
> A simple resolution to that _would_ be your argument of exposing each
> GPU as a separate DRM node, because then we get completely separate
> accounting of each - but it needlessly adds an expense in userspace.
> Userspace would have to make multiple calls - to each GPU DRM node -
> to check whether the buffer is busy on any of the GPUs as it may not
> know which GPU could be using the buffer, especially if it got it via
> a dmabuf fd sent over the DRI3 protocol. To me, that sounds like a
> burden on userspace.
>
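True, with one node per core a dmabuf that arrives over DRI3 could belong
to any of them, so a busy check in userspace would look roughly like
this - my_bo_busy_on() is a made-up stand-in, not an existing ioctl:

#include <stdbool.h>

struct my_gpu_node {
	int fd;			/* one open DRM node per core */
};

/* Stand-in for whatever per-node "is this prime buffer busy" ioctl we
 * would end up with; nothing like it exists yet. */
static bool my_bo_busy_on(struct my_gpu_node *node, int prime_fd)
{
	(void)node;
	(void)prime_fd;
	return false;		/* placeholder */
}

static bool my_bo_busy(struct my_gpu_node *nodes, int num_nodes,
		       int prime_fd)
{
	/* Ask every node, since we cannot know which core touched
	 * the buffer last. */
	for (int i = 0; i < num_nodes; i++)
		if (my_bo_busy_on(&nodes[i], prime_fd))
			return true;
	return false;
}

That is one call per node per check, but with two or three cores I still
do not see a showstopper in the user space.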
greets
--
Christian Gmeiner, MSc
https://soundcloud.com/christian-gmeiner