A simple alternative to GEM

Thu Oct 3 19:31:44 CEST 2013

Hi,

DirectFB is a good example of "doing it all in userspace". It works, but at
the cost of ending up with pretty custom interfaces and non-standard ways
of handling things such as (physical) buffer addresses w.r.t. h/w
acceleration, IPC/RPC, buffer sharing for multi-process support, etc.
Memory management (and DMA) has to be the kernel's duty.
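
To make the bookkeeping argument concrete, here is a minimal userspace
sketch of handle-based tracking. It is only a toy model, not actual GEM
code: struct client, struct buffer, pinned_pages and all three functions
are hypothetical names made up for illustration, and malloc() stands in
for real page pinning. The point it tries to show is that once every
buffer is reachable through a per-client handle table, cleanup when a
client dies is a single loop over that table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_HANDLES 16

/* Hypothetical model of one buffer object; not the real GEM structure. */
struct buffer {
    size_t pages;   /* number of pages this buffer keeps pinned */
};

/* Hypothetical per-client state: handle -> buffer, NULL if the slot is free. */
struct client {
    struct buffer *handles[MAX_HANDLES];
};

/* Stand-in for the system-wide count of pinned pages. */
static size_t pinned_pages;

/* Create a buffer and return a small integer handle, or -1 on failure. */
int client_create_buffer(struct client *c, size_t pages)
{
    for (int h = 0; h < MAX_HANDLES; h++) {
        if (c->handles[h] == NULL) {
            struct buffer *b = malloc(sizeof(*b));
            if (b == NULL)
                return -1;
            b->pages = pages;
            pinned_pages += pages;
            c->handles[h] = b;
            return h;
        }
    }
    return -1; /* handle table is full */
}

/* Destroy one buffer by handle; returns 0 on success, -1 on a bad handle. */
int client_destroy_buffer(struct client *c, int h)
{
    if (h < 0 || h >= MAX_HANDLES || c->handles[h] == NULL)
        return -1;
    assert(pinned_pages >= c->handles[h]->pages);
    pinned_pages -= c->handles[h]->pages;
    free(c->handles[h]);
    c->handles[h] = NULL;
    return 0;
}

/* Called when the client exits or is killed: release everything it owned. */
void client_release(struct client *c)
{
    for (int h = 0; h < MAX_HANDLES; h++)
        if (c->handles[h] != NULL)
            client_destroy_buffer(c, h);
}
```

Userspace only ever sees the integer handle, so a buggy or hostile client
cannot leave pages pinned behind the kernel's back: whatever it forgot to
destroy is still in its table when client_release() runs.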

Ilyes

On Thu, Oct 3, 2013 at 6:00 PM, Rob Clark <robdclark at gmail.com> wrote:

> On Thu, Oct 3, 2013 at 7:48 AM, dm.leontiev7 <dm.leontiev7 at gmail.com>
> wrote:
> > Hello
> >
> > In my opinion, the graphics stack will benefit from moving memory
> > management to userspace because there are tons of features not available
> > in the kernel, like SIMD or C++.
>
> both of which bring no benefit to memory management code
>
> > Also, bugs in buffer management code will bite only one process, not the
> > whole system.
>
> As soon as you need to pin pages (which you need to do, except for the
> hw that Jerome is targeting with his proposal where the GPU can
> really support virtual memory), memory management becomes a whole
> system issue..  pinning pages can only be done from the kernel and it
> is pretty frowned upon to have a driver that lets userspace pin
> arbitrary pages without being able to keep track of those pages and
> clean up.
>
> Anyways, it is much better to trust the kernel than userspace.  In
> system design, you must assume userspace is untrusted.  If you have
> enough tracking for random pages that userspace asks the kernel to pin
> for the gpu in order to clean up when the userspace process dies, then you
> have *more* complexity than what you have in GEM.  Trust me, it is far
> easier for the kernel to deal with buffer handles than having to go
> figure out the pages backing a random vma (get_user_pages()) and
> keep track of things on a per-page basis.
>
> >
> > However, tile-based page flipping can be implemented without major
> > changes in the graphics stack, and it may improve double-buffered 2D
> > rendering performance by reducing the amount of blitted pixels through
> > reuse of unchanged pages. If the GPU's ROP units can take pixels from one
> > location (front buffer) and put results to another one (back buffer),
> > blitting may be completely avoided if only a small area of a
> > double-buffered window is updated.
> >
>
> Taking pixels from one location to another sounds like blitting to me.
> But anyways, a client GL app blitting (or otherwise) directly into the
> front buffer is basically defeating the purpose of dri2.
>
> And tile-based page flipping is an orthogonal topic to userspace vs
> kernel memory management.
>
> > As for security, there are thousands of ways to perform a DoS attack. In
> > Windows, one can eat so much RAM that the user will be unable to kill an
> > app because the task manager will not start. To avoid this, some memory
> > must be reserved for emergency situations, enough to perform 2D rendering
> > by a single client. Multiple clients will be able to render their GUI
> > without caching of window contents even under stress conditions. Also, the
> > kernel dri module must be able to warn a client if it must return memory
> > to the system and reset its context on task manager request.
> >
>
> With the current GEM design, buffers can be swapped out under memory
> pressure, or the appropriate cleanup done if OOM killer kills a
> userspace process.
>
> Doing the memory management in userspace, there are just so many ways
> that things can go wrong.  And once you've fixed those, you end up
> with something more complex.   Sorry, it is just a really bad idea.
>
> BR,
> -R
>
> > Regards, Dmitry.
> >
> >
> >
> > Rob Clark <robdclark at gmail.com> wrote:
> >
> >>right, but by the time you do that, you've implemented enough memory
> >>tracking/management in the kernel, so you don't really win on
> >>complexity.  Otherwise those pinned pages will remain pinned, and you
> >>are still out of memory.
> >>
> >>BR,
> >>-R
> >>
> >>
> >>On Fri, Sep 27, 2013 at 7:53 PM, dm.leontiev7 <dm.leontiev7 at gmail.com> wrote:
> >>> DoS from a client app is certainly a problem if we can't interrupt a
> >>> program. But we can.
> >>>
> >>> The program ate all gpu ram, ok. Let the wm cast the oom killer on the
> >>> gpu ram eater.
> >>>
> >>> Rob Clark <robdclark at gmail.com> wrote:
> >>>
> >>>>sure, but userspace memory management is not a good idea for gpu's
> >>>>which cannot support page fault & resume, as it requires pinning
> >>>>pages.  In the best case (ignoring other issues), it allows any
> >>>>userspace that can use the GPU to easily construct a DoS attack by pinning
> >>>>all available memory.
> >>>>
> >>>>BR,
> >>>>-R
> >>>>
> >>>>On Fri, Sep 27, 2013 at 6:54 PM, dm.leontiev7 <dm.leontiev7 at gmail.com> wrote:
> >>>>> My idea targets not only new gpus. It targets any GPU with an MMU.
> >>>>>
> >>>>>
> >>>>> I just want the idea to be not patentable.
> >>>>>
> >>>>> Rob Clark <robdclark at gmail.com> wrote:
> >>>>>
> >>>>>>new gpu's can support coherency.. this is the HSA stuff (latest
> >>>>>>generation of radeon can support, and I think latest nv stuff as
> >>>>>>well.. probably not any current intel hw, though).  What Jerome was
> >>>>>>talking about is a bit different from what you are trying to do.
> >>>>>>
> >>>>>>On Fri, Sep 27, 2013 at 6:41 PM, dm.leontiev7 <dm.leontiev7 at gmail.com> wrote:
> >>>>>>> Passing structures... well, maybe sometime in the future.
> >>>>>>>
> >>>>>>> But NOW we are not living in the future. Right now gpus don't
> >>>>>>> support cache snooping or memory coherence protocols like MESI or
> >>>>>>> MOESI. Radeon cache is read-only. And memory is NUMA. Just forget
> >>>>>>> about coherence.
> >>>>>>>
> >>>>>>> I see no point in fighting self-made problems. Really.
> >>>>>>>
> >>>>>>> Rob Clark <robdclark at gmail.com> wrote:
> >>>>>>>
> >>>>>>>>Jerome's talk was about something above and beyond opencl, where you
> >>>>>>>>can just pass data structures (which can include cpu userspace ptrs)
> >>>>>>>>to the gpu for more transparent cpu/gpu interoperability.. (ie.
> >>>>>>>>without explicit map step)
> >>>>>>>>
> >>>>>>>>BR,
> >>>>>>>>-R
> >>>>>>>>
> >>>>>>>>On Fri, Sep 27, 2013 at 5:54 PM, dm.leontiev7 <dm.leontiev7 at gmail.com> wrote:
> >>>>>>>>> In my opinion, GART support can be dropped because non-PCIe
> >>>>>>>>> hardware is just not usable with modern Linux distros. It is too
> >>>>>>>>> old and does not have enough RAM.
> >>>>>>>>>
> >>>>>>>>> About page faults: I don't really understand what the problem with
> >>>>>>>>> page faults is. All pages referenced by a memory map must be locked
> >>>>>>>>> before execution of a gpu operation. The memory map must be locked
> >>>>>>>>> (by an rwsem) while it is in use.
> >>>>>>>>>
> >>>>>>>>> Rob Clark <robdclark at gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>>>For GL yes (ignoring some important details like GART size
> >>>>>>>>>>limitations, alignment, etc)
> >>>>>>>>>>
> >>>>>>>>>>Jerome's talk was about doing things where an explicit map-to-gpu is
> >>>>>>>>>>not required... think of things like passing a pointer to a linked
> >>>>>>>>>>list to a shader.  For that you need to let the CPU intervene on page
> >>>>>>>>>>fault from GPU.
> >>>>>>>>>>
> >>>>>>>>>>BR,
> >>>>>>>>>>-R
> >>>>>>>>>>
> >>>>>>>>>>On Fri, Sep 27, 2013 at 4:48 PM, dm.leontiev7 <dm.leontiev7 at gmail.com> wrote:
> >>>>>>>>>>> Hello
> >>>>>>>>>>>
> >>>>>>>>>>> Page fault support is not required: the virtual address space can
> >>>>>>>>>>> be separated into 3 areas: read-only, write-only and read-write.
> >>>>>>>>>>> So, no read-write protection at the MMU level is required.
> >>>>>>>>>>>
> >>>>>>>>>>> Non-existent pages are not a problem because an application has
> >>>>>>>>>>> to allocate a page before mapping it. Pages must always exist.
> >>>>>>>>>>>
> >>>>>>>>>>> On page deallocation, the driver must invalidate all affected
> >>>>>>>>>>> memory maps.
> >>>>>>>>>>>
> >>>>>>>>>>> Regards,
> >>>>>>>>>>> Dmitry
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Rob Clark <robdclark at gmail.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>>On Fri, Sep 27, 2013 at 3:08 PM, Christian König
> >>>>>>>>>>>><deathsimple at vodafone.de> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> A different story is backing buffers with anonymous system memory.
> >>>>>>>>>>>>> I was told that Jerome just recently did a very interesting talk
> >>>>>>>>>>>>> at XDC about it (didn't have time to look at it myself).
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>note that this requires a gpu which can page fault (and more
> >>>>>>>>>>>>importantly, resume after cpu intervenes on page fault).. which I
> >>>>>>>>>>>>think means modern(ish) radeon or nv..
> >>>>>>>>>>>>
> >>>>>>>>>>>>BR,
> >>>>>>>>>>>>-R