RFC: hardware accelerated bitblt using dma engine

Enrico Weigelt, metux IT consult enrico.weigelt at gr13.net
Wed Aug 3 23:32:57 UTC 2016


On 03.08.2016 13:47, Daniel Vetter wrote:

> Because for optimal performance you _must_ supply the commands to the
> kernel in an as close to the format/layout used by the hardware as
> possible. That means no shared command submission of any kind. And the
> other reason is that cache transfers and memory transfers are highly
> hardware specific, too. Which means no shared buffer management and
> mapping interfaces either.

Right, but I wonder whether that applies to my case.
Again, I'm talking about using auxiliary IP blocks (not the actual GPU)
for things like copying image regions, maybe even pixel format or
colorspace conversions. In the embedded world, those things usually
aren't done by the GPU, but by separate IPs.

> Of course having some common helper code to make drivers easier to type
> (like cma helpers, or ttm, or similar) is something entirely
> different, this is about the uapi.

Well, I'm actually talking about a uapi, as userland somehow needs to
call it :p

Doing it in specific drivers doesn't seem to be a good way, as sooner
or later we'd have to implement it in lots of different drivers
(plus corresponding userland support), even though it's pretty
orthogonal to the GPU, as well as to fbs/crtcs. In some cases it
**might** also be done via the GPU, if applicable (maybe only when it's
idle anyway), but that's not the usual case. Instead, the usual case
would be employing some DMA controller or IPU.
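
To make the idea a bit more concrete, here is a rough sketch of what a
driver-independent blit request could carry, together with a plain CPU
copy as the reference behaviour a DMA controller or IPU backend would
have to match. Everything here is hypothetical: the names blit_rect,
blit_surface and blit_copy are made up for illustration and are not an
existing kernel interface, and a real uapi would pass buffer handles
rather than raw pointers. It also assumes that source and destination
do not overlap.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical request layout, NOT an existing kernel uapi. */
struct blit_rect {
    uint32_t x, y, width, height;   /* in pixels */
};

struct blit_surface {
    uint8_t *data;    /* assumption: a real uapi would use a buffer handle */
    uint32_t width;   /* in pixels */
    uint32_t height;  /* in pixels */
    uint32_t pitch;   /* bytes per row */
    uint32_t cpp;     /* bytes per pixel */
};

/*
 * CPU reference for a plain region copy (no format conversion).
 * Copies rectangle r of src to position (dst_x, dst_y) in dst.
 * Returns 0 on success, -1 if the formats differ or the rectangle
 * does not fit into either surface.
 */
static int blit_copy(const struct blit_surface *src, struct blit_rect r,
                     struct blit_surface *dst,
                     uint32_t dst_x, uint32_t dst_y)
{
    if (src->cpp != dst->cpp)
        return -1;
    if (r.x > src->width || r.width > src->width - r.x)
        return -1;
    if (r.y > src->height || r.height > src->height - r.y)
        return -1;
    if (dst_x > dst->width || r.width > dst->width - dst_x)
        return -1;
    if (dst_y > dst->height || r.height > dst->height - dst_y)
        return -1;

    size_t row_bytes = (size_t)r.width * src->cpp;

    /* Copy row by row, since the two surfaces may have different pitches. */
    for (uint32_t row = 0; row < r.height; row++) {
        const uint8_t *s = src->data
                         + (size_t)(r.y + row) * src->pitch
                         + (size_t)r.x * src->cpp;
        uint8_t *d = dst->data
                   + (size_t)(dst_y + row) * dst->pitch
                   + (size_t)dst_x * dst->cpp;
        memcpy(d, s, row_bytes);
    }
    return 0;
}
```

The point of the sketch is only that such a request describes
rectangles, pitches and pixel sizes, and says nothing about a GPU
command stream, which is why I see it as orthogonal to the GPU drivers.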

> And please don't be discourage here, I just want to set clear expectations
> to avoid disappointment. Supporting blitter hardware is obviously a good
> idea, and I think the drm subsystem is the right place for that
> (especially if you have a display block or sometimes a real gpu connected
> to that blitter).

Okay, where else should we put it? Invent an entirely new device for
that?


--mtx


