[PATCH] RFC: omapdrm DRM/KMS driver for TI OMAP platforms

Wed Sep 14 21:37:06 PDT 2011

Hi, Rob.

> -----Original Message-----
> From: robdclark at gmail.com [mailto:robdclark at gmail.com] On Behalf Of Rob
> Clark
> Sent: Thursday, September 15, 2011 4:42 AM
> To: Inki Dae
> Cc: linaro-dev at lists.linaro.org; dri-devel at lists.freedesktop.org
> Subject: Re: [PATCH] RFC: omapdrm DRM/KMS driver for TI OMAP platforms
> 
> On Wed, Sep 14, 2011 at 12:44 AM, Inki Dae <inki.dae at samsung.com> wrote:
> > Hello Rob.
> > Sorry for being late. here was a national holiday.
> >
> >
> >> -----Original Message-----
> >> From: robdclark at gmail.com [mailto:robdclark at gmail.com] On Behalf Of Rob
> >> Clark
> >> Sent: Thursday, September 08, 2011 3:44 AM
> >> To: Inki Dae
> >> Cc: linaro-dev at lists.linaro.org; dri-devel at lists.freedesktop.org
> >> Subject: Re: [PATCH] RFC: omapdrm DRM/KMS driver for TI OMAP platforms
> >>
> >> On Wed, Sep 7, 2011 at 1:00 AM, Inki Dae <inki.dae at samsung.com> wrote:
> >> > Hello, Rob.
> >> >
> >> [snip]
> >> >> >> +static void page_flip_cb(void *arg)
> >> >> >> +{
> >> >> >> +     struct drm_crtc *crtc = arg;
> >> >> >> +     struct drm_device *dev = crtc->dev;
> >> >> >> +     struct omap_crtc *omap_crtc = to_omap_crtc(crtc);
> >> >> >> +     struct drm_pending_vblank_event *event = omap_crtc->event;
> >> >> >> +     struct timeval now;
> >> >> >> +     unsigned long flags;
> >> >> >> +
> >> >> >> +     WARN_ON(!event);
> >> >> >> +
> >> >> >> +     omap_crtc->event = NULL;
> >> >> >> +
> >> >> >> +     update_scanout(crtc);
> >> >> >> +     commit(crtc);
> >> >> >> +
> >> >> >> +     /* wakeup userspace */
> >> >> >> +     // TODO: this should happen *after* flip.. somehow..
> >> >> >> +     if (event) {
> >> >> >> +             spin_lock_irqsave(&dev->event_lock, flags);
> >> >> >> +             event->event.sequence =
> >> >> >> +                             drm_vblank_count_and_time(dev,
> >> >> > omap_crtc->id,
> >> >> >> &now);
> >> >> >> +             event->event.tv_sec = now.tv_sec;
> >> >> >> +             event->event.tv_usec = now.tv_usec;
> >> >> >> +             list_add_tail(&event->base.link,
> >> >> >> +                            
&event->base.file_priv->event_list);
> >> >> >> +
> >> > wake_up_interruptible(&event->base.file_priv->event_wait);
> >> >> >> +             spin_unlock_irqrestore(&dev->event_lock, flags);
> >> >> >> +     }
> >> >> >
> >> >> > How about moving codes above into interrupt handler for vblank?
> >> >> >  maybe there
> >> >>
> >> >> I should mention a couple of things:
> >> >>
> >> >> 1) drm vblank stuff isn't really used currently.. it is actually
> DSS2
> >> >> layer that is registering for the display related interrupt(s).  I'm
> >> >> not really sure how to handle this best.  I suppose the DRM driver
> >> >> could *also* register for these interrupts, but that seems a bit
> odd..
> >> >>
> >> >
> >> > DRM Framework supports only one interrupt handler. this issue should
> be
> >> > resolved. and currently I made our driver to use its own request_irq,
> >> not
> >> > DRM Framework side. we only sets drm_device->irq_enabled to 1 and
> >> interrupt
> >> > handler is registered at display controller or hdmi driver
> respectively.
> >> but
> >> > I'm not sure that this way is best so I will look over more. Anyway
> >> current
> >> > drm framework should be fixed to be considered with multiple irq.
> >>
> >> Or perhaps even callbacks (some other driver handling the irq's
> directly)?
> >>
> >
> > Yes.
> >
> >> >> Also, I guess it is also worth mentioning.. when it comes to vblank,
> >> >> there are two fundamentally different sorts of displays we deal
with.
> >> >> Something like DVI/HDMI/etc which need constant refreshing.  In
> these
> >> >> cases we constantly scan-out the buffer until the next page
> >> >> flip+vsync.  And smart panels, where they get scanned out once and
> >> >> then DSS is no longer reading the scanout buffer until we manually
> >> >> trigger an update.
> >> >>
> >> >
> >> > Is the Smart panel CPU interface based lcd panel that has its own
> >> > framebuffer internally.?
> >>
> >> yes
> >>
> >
> > CPU Interface based lcd panel needs manual updating once all contents of
> > internal framebuffer are transferred to panel. thus, the Smart Panel(cpu
> > interface based lcd panel) has advantage of power consumption with
> > user-requested update. this means application should trigger display
> > controller to be transferred to lcd panel at framedone signal of lcd
> panel
> > because only the application is aware of this moment. this way would be
> > getting confused, especially on general os, such as linux system, based
> > platform so in case of using Smart Panel, I think we should make the
> panel
> > driver to be worked like RGB interface.
> >
> > - framedone interrupt handler of the panel should be placed in the
> driver.
> > - at booting time, display controller driver triggers just one time
> > - the panel driver triggers every time framedone interrupt handler is
> > called.
> >
> > and the order above would be repeated like RGB interface. thus I think
> we
> > could deal with the Smart Panel like RGB way.
> 
> 
> fwiw, I saw this:
> 
> http://www.engadget.com/2011/09/14/idts-power-saving-panel-self-refresh-
> tech-coming-to-laptops-ul/
> 
> seems like similar concept coming to PC's..
> 

I would read the article above, thank you. and actually, I had developed it
like my concept and this driver entirely worked like RGB way.

> >> [snip]
> >> >>
> >> >> The main reason for the page-flip cb is actually not vsync
> >> >> synchronization, but rather synchronizing with other hw blocks that
> >> >> might be rendering to the buffer..  the page flip can be submitted
> >> >> from userspace while some other hw block (3d, 2d, etc) is still
> >> >> DMA'ing to the buffer.  The sync-obj is intended to give a way to
> >> >> defer the (in this case) page flip until other hw blocks are done
> >> >> writing to the buffer.
> >> >
> >> > I thought page-flip is used to change buffer register value of
> display
> >> > controller into another one like the Pan Display feature of linux
> >> > framebuffer. actually page flip interface of libdrm library,
> >> > page_flip_handler, is called with new framebuffer id that has its own
> >> > buffer. and then the controller using current crtc would be set to
> >> buffer
> >> > address of new framebuffer. and I understood that for other blocks
> such
> >> as
> >> > 2d/3d accelerators, rendering synchronization you already mentioned
> >> above,
> >> > is not depend on page flip action. It’s a sequence with my
> understanding
> >> > below.
> >> >
> >> > 1. allocate a new buffer through drm interface such as DUMB_* or
> >> SPECIFIC
> >> > IOCTLs.
> >> > 2. render something on this buffer through DRM interfaces that handle
> >> 2D/3D
> >> > Accelerators and also it would use their own interfaces for
> >> synchronization.
> >> > 3. allocate a new framebuffer and attach the buffer to this
> framebuffer.
> >> > 4. call page flip to change current framebuffer to the framebuffer
> above
> >> at
> >> > vblank moment. at this time, buffer address of the framebuffer would
> be
> >> set
> >> > to a controller.
> >> >
> >> > finally, we can see some image rendered on diplay. thus, I think page
> >> flip
> >> > and rendering synchronization have no any relationship. if I missed
> any
> >> > points, please give me your comments.
> >>
> >> Well, I guess it depends on whether synchronization of the 3d/2d
> >> render is done in userspace, or whether you just submit all the render
> >> commands and then immediately submit the page-flip.
> >>
> >> The actual page-flip should be a fairly light operation (writing a few
> >> registers) and could be done directly from some sort of
> >> render-complete interrupt from 2d/3d core.
> >>
> >
> > In case of 3d rendering, I understood like below.
> > - 3d application requests 3d rendering through OpenGL APIs
> > - the requests would be packed as command in a command buffer kernel
> driver
> > is aware of.
> > - kernel driver gets the command buffer and unpacks it.
> > - kernel driver renders geometries using the commands unpacked on the
> > specific memory address(destination buffer).
> > - while rendered, this application would be blocked.
> > - if render-complete, the application requests swap buffer to change
> current
> > buffer to rendered buffer.
> 
> well, really not kernel driver rendering or dealing with the command
> stream.  But instead the 3d core / gpu.. which you could think of as a
> specialized cpu running asynchronously..
> 

There is my missing point. the geometries are rendered by hw pipeline such
as pp, gp and so on.

> Part of the command-stream that the gpu is chewing thru is a command
> to flip (or perhaps generate an interrupt back to cpu to flip..
> depending on how tightly integrated your display and gpu are).
> 

I know that in case of SGX core, sgx driver has three buffers(one
framebuffer and two system memory) and vsync would be catched at
displayclass module in kernel side at least, for the sgx540 core. but in
case of MALI core, all things such as vsync and page flip(in fact, pan
display of linux framebuffer is used), are done in user side.

> So when glSwapBuffers() returns and application proceeds to start
> working on frame n+1, the gpu might still be rendering frame n and the
> display still showing frame n-1.  So the whole process is pipelined.
> (Or at least can be.)  You'd normally not want to block application
> unnecessarily, because this is a sort of pipeline stall.
> 
> 
> > If vsync feature was enabled, it would wait for vsync at last step above
> > before swapping buffers(page flip). thus I thought page-flip should be
> > requested by application. but in your case, it appears that page flip
> could
> > be requested at render context of 3d driver(kernel side) also. I'm not
> sure
> > that this is a reasonable way. if there is any missing points, please
> give
> > me your comments.
> 
> For "flipping" on a manual update display, you only need to wait if
> you haven't yet got the frame-done from previous flip.  Frame-done irq
> for manual update display is also not exactly like a vsync irq..
> because the gpu can start rendering to that buffer again without the
> user seeing tearing.. you can think of the framebuffer in the display
> as an extra buffer for your flipchain.  Maybe that is just an
> implementation detail.. not sure if it matters from a framework
> perspective.
> 

I think we don't need interested in manual update display if smart panel
works like rgb way and we are enough to deal with frame done interrupt of
display controller. with rgb way, smart panel would be refreshed repeatedly
without tearing issue because display controller would be triggered at
external interrupt of smart panel that means frame done signal of smart
panel. so also smart panel, it could be worked entirely like rgb way.

> Other thing to mention.. any sort of time calculations based on the
> assumption that vsync comes at a certain fixed rate (for a given set
> of monitor timings) is bogus.
> 
> BR,
> -R
> 
> >> [snip]
> >> >> >> +int omap_framebuffer_get_buffer(struct drm_framebuffer *fb, int
> x,
> >> int
> >> >> y,
> >> >> >> +             void **vaddr, unsigned long *paddr, int
> *screen_width)
> >> >> >> +{
> >> >> >> +     struct omap_framebuffer *omap_fb = to_omap_framebuffer(fb);
> >> >> >> +     int bpp = fb->depth / 8;
> >> >> >> +     unsigned long offset;
> >> >> >> +
> >> >> >> +     offset = (x * bpp) + (y * fb->pitch);
> >> >> >> +
> >> >> >> +     if (vaddr) {
> >> >> >> +             if (!omap_fb->vaddr) {
> >> >> >> +                     omap_fb->vaddr = ioremap_wc(omap_fb->paddr,
> >> >> omap_fb-
> >> >> >> >size);
> >> >> >> +             }
> >> >> >> +             *vaddr = omap_fb->vaddr + offset;
> >> >> >> +     }
> >> >> >
> >> >> > Did you use ioremap_wc() to map physical memory(reserved memory)
> that
> >> >> kernel
> >> >> > doesn't aware of to kernel space.? if not so, the memory region
> >> mapped
> >> >> to
> >> >> > kernel space as 1:1,  this way would be faced with duplicated
> cache
> >> >> > attribute issue mentioned by Russel King before. 1:1 mapping
> region
> >> is
> >> >> > mapped to kernel space with cachable attribute.
> >> >>
> >> >> Today the carveout memory does not have a kernel virtual
> mapping.  So
> >> >> we are ok.  And I think this should still be the case w/ CMA.
> >> >>
> >
> > "the carveout memory does not have a kernel virtual mapping." -> this
> means
> > carveout memory has no any page mapping such as 1 level page or 2 level
> page
> > mapping.?
> >
> >> >
> >> > I wonder what is the carveout and sacnout memory. carvout is
> physically
> >> non
> >> > continuous memory and scanout is physically continuous memory?
> >>
> >> today, scanout buffers are required to be contiguous.  There is the
> >> possibility to remap non-contiguous buffers in TILER/DMM (which you
> >> can think of as a sort of IOMMU).  Although adding support for this is
> >> still on my TODO list.
> >>
> >> So today, carveout is used to allocate scanout buffers to ensure
> >> physically contiguous.
> >>
> >> [snip]
> >> >> >
> >> >> > Remove device
> >  specific functions from linux/include/linux/omap_drm.h
> >> and
> >> >> > move them to driver folder. I was told that include/linux/*.h file
> >> >> should
> >> >> > contain only interfaces for user process from Dave.
> >> >>
> >> >>
> >> >> fwiw, the functions in this header are already only ones used by
> other
> >> >> kernel drivers.. everything else internal to omapdrm is already in
> >> >> omap_drv.h.  I'll remove these for now (see discussion about plugin
> >> >> API), although when it is re-introduced they need to be in a header
> >> >> accessible from other drivers.  Although should maybe still be split
> >> >> from what is needed by userspace?  (Something like
omapdrm_plugin.h?)
> >> >>
> >> >>
> >> >
> >> > I'm afraid I don't understand what you mean, "Although should maybe
> >> still be
> >> > split from what is needed by userspace?  (Something like
> >> > omapdrm_plugin.h?)", could you please clarify that again?
> >>
> >> I mean one header file for userspace facing API, and a separate one
> >> for kernel facing API used by other drivers/modules..
> >>
> >
> > Yes, At least, I think the header file in include/drm/ should include
> only
> > userspace APIs and structures.
> >
> >> (In the first pass, that second header would not exist because I'll
> >> remove the plugin API until we have one w/ open userspace)
> >>
> >> BR,
> >> -R
> >
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/dri-devel
> >