[PATCH RFC 102/111] staging: etnaviv: separate GPU pipes from execution state

Christian Gmeiner christian.gmeiner at gmail.com
Tue Apr 7 09:59:59 PDT 2015


Hi Lucas.

2015-04-07 17:29 GMT+02:00 Lucas Stach <l.stach at pengutronix.de>:
> On Tuesday, 07.04.2015, 17:13 +0200, Christian Gmeiner wrote:
>> 2015-04-07 17:01 GMT+02:00 Lucas Stach <l.stach at pengutronix.de>:
>> > On Tuesday, 07.04.2015, 16:51 +0200, Jon Nettleton wrote:
>> >>
>> >>
>> >> On Tue, Apr 7, 2015 at 4:38 PM, Alex Deucher <alexdeucher at gmail.com> wrote:
>> >>         On Tue, Apr 7, 2015 at 3:46 AM, Lucas Stach <l.stach at pengutronix.de> wrote:
>> >>         > On Sunday, 05.04.2015, 21:41 +0200, Christian Gmeiner wrote:
>> >>         >> 2015-04-02 18:37 GMT+02:00 Russell King - ARM Linux <linux at arm.linux.org.uk>:
>> >>         >> > On Thu, Apr 02, 2015 at 05:30:44PM +0200, Lucas Stach wrote:
>> >>         >> >> While this isn't the case on i.MX6, a single GPU pipe can have
>> >>         >> >> multiple rendering backend states, which can be selected by the
>> >>         >> >> pipe switch command, so there is no strict mapping between the
>> >>         >> >> user "pipes" and the PIPE_2D/PIPE_3D execution states.
>> >>         >> >
>> >>         >> > This is good, because on Dove we have a single Vivante core which
>> >>         >> > supports both 2D and 3D together.  It's always bugged me that
>> >>         >> > etnadrm has not treated cores separately from their capabilities.
>> >>         >> >
>> >>         >>
>> >>         >> Today I finally got the idea of how this multiple pipe stuff
>> >>         >> should be done the right way - thanks Russell. So maybe you/we
>> >>         >> need to rework how the driver is designed regarding cores and
>> >>         >> pipes.
>> >>         >>
>> >>         >> On the imx6 we should get 3 device nodes, each supporting only
>> >>         >> one pipe type. On the dove we should get only one device node
>> >>         >> supporting 2 pipe types. What do you think?
>> >>         >>
>> >>         > Sorry, but I strongly object to the idea of having multiple DRM
>> >>         > device nodes for the different pipes.
>> >>         >
>> >>         > If we need the GPU2D and GPU3D to work together (and I can
>> >>         > already see use-cases where we need to use the GPU2D in MESA to
>> >>         > do things the GPU3D is incapable of) we would then need a lot
>> >>         > more DMA-BUFs to get buffers across the devices. This is a waste
>> >>         > of resources and complicates things a lot, as we would then have
>> >>         > to deal with DMA-BUF fences just to get the synchronization
>> >>         > right, which is a no-brainer if we are on the same DRM device.
>> >>         >
>> >>         > Also it does not allow us to make any simplifications to the
>> >>         > userspace API, so I can't really see any benefit.
>> >>         >
>> >>         > Also, on Dove I think one would expect to get a single pipe
>> >>         > capable of executing in both 2D and 3D state. If userspace takes
>> >>         > advantage of that, one could leave the sync between both engines
>> >>         > to the FE, which is a good thing as this allows the kernel to do
>> >>         > less work. I don't see why we should throw this away.
>> >>
>> >>         Just about all modern GPUs support varying combinations of
>> >>         independent pipelines, and we currently support this just fine
>> >>         via a single device node in other drm drivers.  E.g., modern
>> >>         radeons support one or more gfx, compute, dma, video decode and
>> >>         video encode engines.  What combination is present depends on
>> >>         the asic.
>> >>
>> >> That reminds me.  We should also keep in the back of our minds that
>> >> compute is supported by the newer Vivante chips.  We will also need to
>> >> support multiple independent 3d cores, as that support has shown up in
>> >> the V5 galcore drivers.
>> >>
>> > AFAIK compute is just another state of the 3D pipe where instead of
>> > issuing a draw command you would kick the thread walker.
>> >
>> > Multicore with a single FE is just a single pipe with chip selects set
>> > to the available backends and mirrored pagetables for the MMUs. With
>> > more than one FE you get more than one pipe, which is more like an SLI
>> > setup on the desktop, where userspace has to deal with splitting the
>> > render targets into portions for each GPU.
>> > One more reason to keep things in one DRM device, as I think no one
>> > wants to deal with syncing pagetables across different devices.
>> >
>>
>> I don't get your naming scheme - sorry.
>>
>> For me, one core has a single FE. This single FE can have one pipe or
>> multiple pipes. A pipe is the execution unit selected via the SELECT_PIPE
>> command (2d, 3d, ..).
>>
>> In the Dove use case we have:
>> - 1 core with one FE
>> - 2 pipelines
>>
>> In the imx6 case we have:
>> - 3 cores (each has only one FE)
>> - every FE only supports one type of pipeline.
>>
> Okay, let's keep it at this: a core is an entity with an FE at the front.
> A pipe is the backend fed by the FE, selected by the SELECT_PIPE command.
>
> This is currently confusing as I didn't change the naming in the API,
> but really the "pipe" parameter in the IOCTLs means core. I'll rename
> this for the next round.
>

The current driver was written only for the imx6 use case, so it combines
one pipe from each of the 3 GPU cores into one device node. And yes, the
pipe parameter could be seen as a core. But I think that this design is
wrong; I did not know better at the time I started working on it. I think
it would not be that hard to change the driver so that every core has its
own device node and the pipe parameter really is a pipe of that core.

>> And each Core(/FE) has its own device node. Does this make any sense?
>>
> And I don't get why each core needs to have its own device node. IMHO
> this is purely an implementation decision whether to have one device
> node for all cores or one device node per core.
>

It is an important decision. And I think that one device node per core
reflects the hardware design 100%.

> For now I can only see that one device node per core makes things
> harder to get right, while I don't see a single benefit.
>

What makes it harder to get right? The needed changes to the kernel
driver are not that hard. The user space is another story, but that's
because of the render-only thing, where we need to pass (prime) buffers
around, do fence syncs, etc. In the end I do not see a showstopper in
the user space.

What would you do if - I know/hope that this will never happen - there
is a SoC that integrates two 3d cores?

greets
--
Christian Gmeiner, MSc

https://soundcloud.com/christian-gmeiner


More information about the dri-devel mailing list