[PATCH] RFC: omapdrm DRM/KMS driver for TI OMAP platforms

Tue Sep 6 23:00:40 PDT 2011

Hello, Rob.

> -----Original Message-----
> From: robdclark at gmail.com [mailto:robdclark at gmail.com] On Behalf Of Rob
> Clark
> Sent: Tuesday, September 06, 2011 1:05 AM
> To: Inki Dae
> Cc: dri-devel at lists.freedesktop.org; linaro-dev at lists.linaro.org;
> Valkeinen, Tomi
> Subject: Re: [PATCH] RFC: omapdrm DRM/KMS driver for TI OMAP platforms
> 
> On Mon, Sep 5, 2011 at 4:58 AM, Inki Dae <inki.dae at samsung.com> wrote:
> > Hello, Rob.
> > I didn't look into any comments from other develops yet , so my comments
> > could be duplicated with other ones. below is my opinions and I
> commented
> > your codes also.
> >
> > We can discuss crtc, encoder and connector. in x86 world, my
> understanding,
> > each one means the following.
> 
> just fwiw, some of current implementation in the KMS code is result of
> how the DSS2 API works..  ie. a dssdev (panel driver) is really part
> encoder and part connector.
> 
> But on other hand, there are customers who write their own panel
> drivers in DSS2 so I'm not sure that we want to completely change
> this.  And the encoder/connector distinction gets a bit more blurry w/
> smart-panel type displays.
> 
> [snip]
> >> +static int omap_connector_mode_valid(struct drm_connector *connector,
> >> +                              struct drm_display_mode *mode)
> >> +{
> >> +     struct omap_connector *omap_connector =
> >> to_omap_connector(connector);
> >> +     struct omap_dss_device *dssdev = omap_connector->dssdev;
> >> +     struct omap_dss_driver *dssdrv = dssdev->driver;
> >> +     struct omap_video_timings timings = {0};
> >> +     struct drm_device *dev = connector->dev;
> >> +     struct drm_display_mode *new_mode;
> >> +     int ret = MODE_BAD;
> >> +
> >> +     copy_timings_drm_to_omap(&timings, mode);
> >> +     mode->vrefresh = drm_mode_vrefresh(mode);
> >> +
> >> +     if (!dssdrv->check_timings(dssdev, &timings)) {
> >> +             /* check if vrefresh is still valid */
> >> +             new_mode = drm_mode_duplicate(dev, mode);
> >> +             new_mode->clock = timings.pixel_clock;
> >> +             new_mode->vrefresh = 0;
> >> +             if (mode->vrefresh == drm_mode_vrefresh(new_mode))
> >> +                     ret = MODE_OK;
> >> +             drm_mode_destroy(dev, new_mode);
> >> +     }
> >> +
> >
> > is there any reason that you call drm_mode_duplicate() to get a clone of
> a
> > mode? I just wonder it.
> 
> yes, because the new_mode is modified and I didn't want this fxn to
> have unintended side-effects.
> 
> this works in a slightly funny way, because dssdev->check_timings() is
> actually a dssdev->check_and_modify_timings(), modifying the pixel
> clock to the closest thing that is supported.  We need to compare the
> resulting vrefresh to see if it is something matching..  otherwise we
> ask for some resolution @ 60Hz and maybe get 30Hz instead.
> 
> [snip]
> >> +/* initialize connector */
> >> +struct drm_connector * omap_connector_init(struct drm_device *dev,
> >> +             int connector_type, struct omap_dss_device *dssdev)
> >> +{
> >> +     struct drm_connector *connector = NULL;
> >
> > It appears that this connector is initialized with NULL. This is just
> minor
> > issue. :)
> 
> it is so it isn't uninitialized if kzalloc fails and we 'goto fail'..
> I guess in this particular case there is only a single fail point,
> although it is a coding pattern that I follow elsewhere where there
> are multiple fail points.  And at least this way if I added some other
> step later where it might fail after the kzalloc(), I won't
> accidentally miss the appropriate cleanup. I mostly prefer to have a
> single cleanup path when possible so I don't confuse myself later
> 
> [snip]
> >> +static void page_flip_cb(void *arg)
> >> +{
> >> +     struct drm_crtc *crtc = arg;
> >> +     struct drm_device *dev = crtc->dev;
> >> +     struct omap_crtc *omap_crtc = to_omap_crtc(crtc);
> >> +     struct drm_pending_vblank_event *event = omap_crtc->event;
> >> +     struct timeval now;
> >> +     unsigned long flags;
> >> +
> >> +     WARN_ON(!event);
> >> +
> >> +     omap_crtc->event = NULL;
> >> +
> >> +     update_scanout(crtc);
> >> +     commit(crtc);
> >> +
> >> +     /* wakeup userspace */
> >> +     // TODO: this should happen *after* flip.. somehow..
> >> +     if (event) {
> >> +             spin_lock_irqsave(&dev->event_lock, flags);
> >> +             event->event.sequence =
> >> +                             drm_vblank_count_and_time(dev,
> > omap_crtc->id,
> >> &now);
> >> +             event->event.tv_sec = now.tv_sec;
> >> +             event->event.tv_usec = now.tv_usec;
> >> +             list_add_tail(&event->base.link,
> >> +                             &event->base.file_priv->event_list);
> >> +            
wake_up_interruptible(&event->base.file_priv->event_wait);
> >> +             spin_unlock_irqrestore(&dev->event_lock, flags);
> >> +     }
> >
> > How about moving codes above into interrupt handler for vblank?
> >  maybe there
> 
> I should mention a couple of things:
> 
> 1) drm vblank stuff isn't really used currently.. it is actually DSS2
> layer that is registering for the display related interrupt(s).  I'm
> not really sure how to handle this best.  I suppose the DRM driver
> could *also* register for these interrupts, but that seems a bit odd..
> 

DRM Framework supports only one interrupt handler. this issue should be
resolved. and currently I made our driver to use its own request_irq, not
DRM Framework side. we only sets drm_device->irq_enabled to 1 and interrupt
handler is registered at display controller or hdmi driver respectively. but
I'm not sure that this way is best so I will look over more. Anyway current
drm framework should be fixed to be considered with multiple irq.

> Also, I guess it is also worth mentioning.. when it comes to vblank,
> there are two fundamentally different sorts of displays we deal with.
> Something like DVI/HDMI/etc which need constant refreshing.  In these
> cases we constantly scan-out the buffer until the next page
> flip+vsync.  And smart panels, where they get scanned out once and
> then DSS is no longer reading the scanout buffer until we manually
> trigger an update.
> 

Is the Smart panel CPU interface based lcd panel that has its own
framebuffer internally.?

> 2) updating things like scanout address for framebuffers, these are
> actually double-buffered registers in the DSS block.  So the update of
> actual scanout address is automatically synchronized by the hardware
> w/ the vblank.  The one thing I need to do still is defer the event
> back to userspace until after the flip so it doesn't start rendering
> onto the buffer that is still being scanned out.  I'm not quite sure
> the best way to do this yet.  DSS2 driver has an API that I can
> register for IRQ callbacks, although the knowledge about the right
> callback to register (VSYNC, VSYNC2, FRAMEDONE, EVSYNC_ODD,
> EVSYNC_EVEN, etc) is in the dssdev/connector.  I could use
> omap_connector_sync() to block until vsync/framedone but I'd prefer a
> non-blocking mechanism.
> 
> > would be some reason that spin_lock_* are used here.
> 
> dev->event_lock seems to be used to protect modifying event_list
> elsewhere..
> 
> >> +}
> >> +
> >> +static int omap_crtc_page_flip_locked(struct drm_crtc *crtc,
> >> +              struct drm_framebuffer *fb,
> >> +              struct drm_pending_vblank_event *event)
> >> +{
> >> +     struct drm_device *dev = crtc->dev;
> >> +     struct omap_crtc *omap_crtc = to_omap_crtc(crtc);
> >> +
> >> +     DBG("%d -> %d", crtc->fb ? crtc->fb->base.id : -1, fb->base.id);
> >> +
> >> +     if (omap_crtc->event) {
> >> +             dev_err(dev->dev, "already a pending flip\n");
> >> +             return -EINVAL;
> >> +     }
> >> +
> >> +     crtc->fb = fb;
> >> +     omap_crtc->event = event;
> >> +
> >> +     omap_gem_op_async(omap_framebuffer_bo(fb), OMAP_GEM_READ,
> >> +                     page_flip_cb, crtc);
> >> +
> >> +     return 0;
> >> +}
> >> +
> >
> > Where are drm_vblank_get() and drm_vblank_put() called to enable and
> disable
> > vsync interrupt? and does the this page flip need user-requested
> drm_control
> > ioctl to install/uninstall a handler?. I think the page_flip feature
> should
> > work it fine itself.
> >
> > like this:
> > at omap_crtc_page_flip_locked()
> > drm_vblank_get();   <- enable vsync interrupt.
> > update famebuffer. <- update overlay and set it to hw.
> >
> > at interrupt handler
> > handle page flip event.
> > wake_up_interruptible() <- notify  user of page flip event.
> > drm_vblank_put();  <- disable vsync interrupt.
> >
> > I think that page flip callback would be called after vsync interrupt
> > occurred. for this, you can refer to set_mode function of libdrm
library.
> a
> > event has user's apage flip handler and when user received any event
> from
> > kernel side, this handler would be called to update framebuffer
> repeatedly.
> 
> The main reason for the page-flip cb is actually not vsync
> synchronization, but rather synchronizing with other hw blocks that
> might be rendering to the buffer..  the page flip can be submitted
> from userspace while some other hw block (3d, 2d, etc) is still
> DMA'ing to the buffer.  The sync-obj is intended to give a way to
> defer the (in this case) page flip until other hw blocks are done
> writing to the buffer.

I thought page-flip is used to change buffer register value of display
controller into another one like the Pan Display feature of linux
framebuffer. actually page flip interface of libdrm library,
page_flip_handler, is called with new framebuffer id that has its own
buffer. and then the controller using current crtc would be set to buffer
address of new framebuffer. and I understood that for other blocks such as
2d/3d accelerators, rendering synchronization you already mentioned above,
is not depend on page flip action. It’s a sequence with my understanding
below.

1. allocate a new buffer through drm interface such as DUMB_* or SPECIFIC
IOCTLs.
2. render something on this buffer through DRM interfaces that handle 2D/3D
Accelerators and also it would use their own interfaces for synchronization.
3. allocate a new framebuffer and attach the buffer to this framebuffer.
4. call page flip to change current framebuffer to the framebuffer above at
vblank moment. at this time, buffer address of the framebuffer would be set
to a controller.

finally, we can see some image rendered on diplay. thus, I think page flip
and rendering synchronization have no any relationship. if I missed any
points, please give me your comments.

> 
> (I guess in the same way, I could use the sync-obj to prevent someone
> else writing to frame n-1 until the vsync irq.. but as mentioned
> earlier I still need to sort out a sane way to handle the different
> sorts of irq's we get back for different sorts of panels when they are
> done reading from a buffer.)
> 
> [snip]
> >> +static int omap_modeset_init(struct drm_device *dev)
> >> +{
> >> +     const struct omap_drm_platform_data *pdata = dev->dev-
> >> >platform_data;
> >> +     struct omap_drm_private *priv = dev->dev_private;
> >> +     struct omap_dss_device *dssdev = NULL;
> >> +     int i, j;
> >> +     unsigned int connected_connectors = 0;
> >> +
> >> +     /* create encoders for each manager */
> >> +     int create_encoder(int i) {
> >> +             struct omap_overlay_manager *mgr =
> >> +                             omap_dss_get_overlay_manager(i);
> >> +             struct drm_encoder *encoder = omap_encoder_init(dev,
mgr);
> >> +
> >> +             if (!encoder) {
> >> +                     dev_err(dev->dev, "could not create encoder\n");
> >> +                     return -ENOMEM;
> >> +             }
> >> +
> >> +             priv->encoders[priv->num_encoders++] = encoder;
> >> +
> >> +             return 0;
> >> +     }
> >> +
> >> +     /* create connectors for each display device */
> >> +     int create_connector(struct omap_dss_device *dssdev) {
> >> +             static struct notifier_block *notifier;
> >> +             struct drm_connector *connector;
> >> +
> >> +             if (!dssdev->driver) {
> >> +                     dev_warn(dev->dev, "%s has no driver.. skipping
> > it\n",
> >> +                                     dssdev->name);
> >> +                     return 0;
> >> +             }
> >> +
> >> +             if (!(dssdev->driver->get_timings ||
> >> +                                     dssdev->driver->read_edid)) {
> >> +                     dev_warn(dev->dev, "%s driver does not support "
> >> +                             "get_timings or read_edid.. skipping
it!\n",
> >> +                             dssdev->name);
> >> +                     return 0;
> >> +             }
> >> +
> >> +             connector = omap_connector_init(dev,
> >> +                             get_connector_type(dssdev), dssdev);
> >> +
> >> +             if (!connector) {
> >> +                     dev_err(dev->dev, "could not create
connector\n");
> >> +                     return -ENOMEM;
> >> +             }
> >> +
> >> +             /* track what is already connected.. rather than looping
> >> thru
> >> +              * all connectors twice later, first for connected then
> for
> >> +              * remainder (which could be a race condition if
connected
> >> +              * status changes)
> >> +              */
> >> +             if (omap_connector_detect(connector, true) ==
> >> +                             connector_status_connected) {
> >> +                     connected_connectors |= (1 <<
priv->num_connectors);
> >> +             }
> >> +
> >> +             priv->connectors[priv->num_connectors++] = connector;
> >> +
> >> +#if 0 /* enable when dss2 supports hotplug */
> >> +             notifier = kzalloc(sizeof(struct notifier_block),
> >> GFP_KERNEL);
> >> +             notifier->notifier_call = omap_drm_notifier;
> >> +             omap_dss_add_notify(dssdev, notifier);
> >> +#else
> >> +             notifier = NULL;
> >> +#endif
> >> +
> >> +             for (j = 0; j < priv->num_encoders; j++) {
> >> +                     struct omap_overlay_manager *mgr =
> >> +                            
omap_encoder_get_manager(priv->encoders[j]);
> >> +                     if (mgr->device == dssdev) {
> >> +                            
drm_mode_connector_attach_encoder(connector,
> >> +                                             priv->encoders[j]);
> >> +                     }
> >> +             }
> >> +
> >> +             return 0;
> >> +     }
> >> +
> >> +     /* create up to max_overlays CRTCs mapping to overlays.. by
> default,
> >> +      * connect the overlays to different managers/encoders, giving
> >> priority
> >> +      * to encoders connected to connectors with a detected connection
> >> +      */
> >> +     int create_crtc(int i) {
> >> +             struct omap_overlay *ovl = omap_dss_get_overlay(i);
> >> +             struct omap_overlay_manager *mgr = NULL;
> >> +             struct drm_crtc *crtc;
> >> +
> >> +             if (ovl->manager) {
> >> +                     DBG("disconnecting %s from %s", ovl->name,
> >> +                                             ovl->manager->name);
> >> +                     ovl->unset_manager(ovl);
> >> +             }
> >> +
> >> +             /* find next best connector, ones with detected
connection
> >> first
> >> +              */
> >> +             while (j < priv->num_connectors && !mgr) {
> >> +                     if (connected_connectors & (1 << j)) {
> >> +                             struct drm_encoder * encoder =
> >> +                                     omap_connector_attached_encoder(
> >> +
> > priv->connectors[j]);
> >> +                             if (encoder) {
> >> +                                     mgr =
> > omap_encoder_get_manager(encoder);
> >> +                             }
> >> +                     }
> >> +                     j++;
> >> +             }
> >> +
> >> +             /* if we couldn't find another connected connector, lets
> >> start
> >> +              * looking at the unconnected connectors:
> >> +              */
> >> +             while (j < 2 * priv->num_connectors && !mgr) {
> >> +                     int idx = j - priv->num_connectors;
> >> +                     if (!(connected_connectors & (1 << idx))) {
> >> +                             struct drm_encoder * encoder =
> >> +                                     omap_connector_attached_encoder(
> >> +
> > priv->connectors[idx]);
> >> +                             if (encoder) {
> >> +                                     mgr =
> > omap_encoder_get_manager(encoder);
> >> +                             }
> >> +                     }
> >> +                     j++;
> >> +             }
> >> +
> >> +             if (mgr) {
> >> +                     DBG("connecting %s to %s", ovl->name, mgr->name);
> >> +                     ovl->set_manager(ovl, mgr);
> >> +             }
> >> +
> >> +             crtc = omap_crtc_init(dev, ovl, priv->num_crtcs);
> >> +
> >> +             if (!crtc) {
> >> +                     dev_err(dev->dev, "could not create CRTC\n");
> >> +                     return -ENOMEM;
> >> +             }
> >> +
> >> +             priv->crtcs[priv->num_crtcs++] = crtc;
> >> +
> >> +             return 0;
> >> +     }
> >> +
> >> +     drm_mode_config_init(dev);
> >> +
> >> +     if (pdata) {
> >> +             /* if platform data is provided by the board file, use it
> to
> >> +              * control which overlays, managers, and devices we own.
> >> +              */
> >> +             for (i = 0; i < pdata->mgr_cnt; i++) {
> >> +                     create_encoder(pdata->mgr_ids[i]);
> >> +             }
> >> +
> >> +             for (i = 0; i < pdata->dev_cnt; i++) {
> >> +                     int m(struct omap_dss_device *dssdev, void *data)
{
> >> +                             return ! strcmp(dssdev->name, data);
> >> +                     }
> >> +                     struct omap_dss_device *dssdev =
> >> +                             omap_dss_find_device(
> >> +                                     (void *)pdata->dev_names[i], m);
> >> +                     if (!dssdev) {
> >> +                             dev_warn(dev->dev, "no such dssdev:
%s\n",
> >> +                                             pdata->dev_names[i]);
> >> +                             continue;
> >> +                     }
> >> +                     create_connector(dssdev);
> >> +             }
> >> +
> >> +             j = 0;
> >> +             for (i = 0; i < pdata->ovl_cnt; i++) {
> >> +                     create_crtc(pdata->ovl_ids[i]);
> >> +             }
> >> +     } else {
> >> +             /* otherwise just grab up to CONFIG_DRM_OMAP_NUM_CRTCS
and
> >> try
> >> +              * to make educated guesses about everything else
> >> +              */
> >> +             int max_overlays = min(omap_dss_get_num_overlays(),
> >> +                                     CONFIG_DRM_OMAP_NUM_CRTCS);
> >> +
> >> +             for (i = 0; i < omap_dss_get_num_overlay_managers(); i++)
> {
> >> +                     create_encoder(i);
> >> +             }
> >> +
> >> +             for_each_dss_dev(dssdev) {
> >> +                     create_connector(dssdev);
> >> +             }
> >> +
> >> +             j = 0;
> >> +             for (i = 0; i < max_overlays; i++) {
> >> +                     create_crtc(i);
> >> +             }
> >> +     }
> >> +
> >> +     /* for now keep the mapping of CRTCs and encoders static.. */
> >> +     for (i = 0; i < priv->num_encoders; i++) {
> >> +             struct drm_encoder *encoder = priv->encoders[i];
> >> +             struct omap_overlay_manager *mgr =
> >> +                             omap_encoder_get_manager(encoder);
> >> +
> >> +             encoder->possible_crtcs = 0;
> >> +
> >> +             for (j = 0; j < priv->num_crtcs; j++) {
> >> +                     struct omap_overlay *ovl =
> >> +
> > omap_crtc_get_overlay(priv->crtcs[j]);
> >> +                     if (ovl->manager == mgr) {
> >> +                             encoder->possible_crtcs |= (1 << j);
> >> +                     }
> >> +             }
> >> +
> >> +             DBG("%s: possible_crtcs=%08x", mgr->name,
> >> +                                     encoder->possible_crtcs);
> >> +     }
> >> +
> >> +     dump_video_chains();
> >> +
> >> +     dev->mode_config.min_width = 640;
> >> +     dev->mode_config.min_height = 480;
> >> +
> >> +     /* note: pvr can't currently handle dst surfaces larger than 2k
> by
> >> 2k */
> >> +     dev->mode_config.max_width = 2048;
> >> +     dev->mode_config.max_height = 2048;
> >> +
> >> +     dev->mode_config.funcs = &omap_mode_config_funcs;
> >> +
> >> +     return 0;
> >> +}
> >
> > For code clean, how about moving create_encoder(), create_connector()
> and
> > create_crtc() into omap_encoder.c, omap_connector.c and omap_crtc.c
> file?.
> > omap_modeset_init function contains so many codes. it looks like not
> good.
> 
> yeah, this is the way it is mainly because we are still allowing two
> cases:  (1) omapdrm uses all the DSS2 resources, and (2) the case of
> using some platform-data to tell the omapdrm to not use all the
> available resources in DSS.  This is mainly a transition thing, so you
> could (for example) use omapdrm to control some outputs or video
> pipes, and omapfb (fbdev) or omap_vout (v4l2) for others.  Have a look
> at the if (pdata) { ... } else { ... } part at the later part of the
> fxn.
> 
> I expect eventually, when we have overlay support in the DRM driver
> (drm_plane stuff) and enough time has elapsed for folks to move away
> from legacy fbdev/v4l2 interfaces, that this platform-data stuff can
> be removed and this part greatly simplified.  But I guess that will be
> a couple years.  For now I wanted to try and keep the ugly bits
> localized in one place ;-)
> 
> [snip]
> 
> >> +}
> >> +
> >> +struct drm_ioctl_desc ioctls[DRM_COMMAND_END - DRM_COMMAND_BASE] = {
> >> +     DRM_IOCTL_DEF_DRV(OMAP_GET_PARAM, ioctl_get_param,
> >> DRM_UNLOCKED|DRM_AUTH),
> >> +     DRM_IOCTL_DEF_DRV(OMAP_SET_PARAM, ioctl_set_param,
> >> DRM_UNLOCKED|DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
> >> +     DRM_IOCTL_DEF_DRV(OMAP_GET_BASE, ioctl_get_base,
> >> DRM_UNLOCKED|DRM_AUTH),
> >> +     DRM_IOCTL_DEF_DRV(OMAP_GEM_NEW, ioctl_gem_new,
> >> DRM_UNLOCKED|DRM_AUTH),
> >> +     DRM_IOCTL_DEF_DRV(OMAP_GEM_CPU_PREP, ioctl_gem_cpu_prep,
> >> DRM_UNLOCKED|DRM_AUTH),
> >> +     DRM_IOCTL_DEF_DRV(OMAP_GEM_CPU_FINI, ioctl_gem_cpu_fini,
> >> DRM_UNLOCKED|DRM_AUTH),
> >> +     DRM_IOCTL_DEF_DRV(OMAP_GEM_INFO, ioctl_gem_info,
> >> DRM_UNLOCKED|DRM_AUTH),
> >> +};
> >
> > You also should consider security issue. :)
> 
> 
> All the ioctls do have the DRM_AUTH bit set, so any userspace process
> that uses them needs to at least be authenticated with the DRM master.
>  I don't really see any way to avoid needing to support buffer
> allocating in an authenticated client process.
> 
> So I think this is sufficient, but always possible that I missed
> something..
> 

Oh, you are right, it seems sufficient.

> [snip]
> >> +
> >> +static void omap_framebuffer_destroy(struct drm_framebuffer *fb)
> >> +{
> >> +     struct drm_device *dev = fb->dev;
> >> +     struct omap_framebuffer *omap_fb = to_omap_framebuffer(fb);
> >> +
> >> +     DBG("destroy: FB ID: %d (%p)", fb->base.id, fb);
> >> +
> >> +     drm_framebuffer_cleanup(fb);
> >> +
> >> +     if (omap_fb->vaddr) {
> >> +             iounmap(omap_fb->vaddr);
> >> +     }
> >
> > You always use reserved memory for drm framebuffer.? If not so, for
> > instance, any physical memory was allocated by dma api such as
> dma_alloc_*
> > then you should call dma_free_*.
> 
> currently it is always reserved memory.. I've not moved to CMA stuff yet.
> 
> So currently we have no vaddr in the GEM parts, and only place where
> vaddr is created/mapped is in omap_framebuffer part.
> 
> I think this will have to change though..  partly because
> dma_alloc_coherent() also allocates a vaddr.  Partly because there
> will be some cases where kernel side plugin code might need access to
> the buffer.  (Not so much for pixel buffers, but for command-stream
> type buffers.)
> 
> Possibly I should pull this into the GEM code w/ a
> omap_gem_{get,put}_vaddr() which could do the right thing depending on
> where the memory for the buffer came from.  Whatever I do, I don't
> want to force to have a kernel virtual mapping for all pixel buffers
> when it can otherwise be avoided, because I think in a generation or
> two kernel address space will start being a limited resource.
> 
> 
> >> +
> >> +     if (omap_gem_put_paddr(omap_fb->bo)) {
> >> +             dev_err(dev->dev, "could not unmap!\n");
> >> +     }
> >> +
> >> +     if (omap_fb->bo) {
> >> +             drm_gem_object_unreference_unlocked(omap_fb->bo);
> >> +     }
> >> +
> >> +     kfree(omap_fb);
> >> +}
> >> +
> >> +static int omap_framebuffer_dirty(struct drm_framebuffer *fb,
> >> +             struct drm_file *file_priv, unsigned flags, unsigned
color,
> >> +             struct drm_clip_rect *clips, unsigned num_clips)
> >> +{
> >> +     int i;
> >> +
> >> +     for (i = 0; i < num_clips; i++) {
> >> +             omap_framebuffer_flush(fb, clips[i].x1, clips[i].y1,
> >> +                                     clips[i].x2 - clips[i].x1,
> >> +                                     clips[i].y2 - clips[i].y1);
> >> +     }
> >> +
> >> +     return 0;
> >> +}
> >> +
> >> +static const struct drm_framebuffer_funcs omap_framebuffer_funcs = {
> >> +     .create_handle = omap_framebuffer_create_handle,
> >> +     .destroy = omap_framebuffer_destroy,
> >> +     .dirty = omap_framebuffer_dirty,
> >> +};
> >> +
> >> +int omap_framebuffer_get_buffer(struct drm_framebuffer *fb, int x, int
> y,
> >> +             void **vaddr, unsigned long *paddr, int *screen_width)
> >> +{
> >> +     struct omap_framebuffer *omap_fb = to_omap_framebuffer(fb);
> >> +     int bpp = fb->depth / 8;
> >> +     unsigned long offset;
> >> +
> >> +     offset = (x * bpp) + (y * fb->pitch);
> >> +
> >> +     if (vaddr) {
> >> +             if (!omap_fb->vaddr) {
> >> +                     omap_fb->vaddr = ioremap_wc(omap_fb->paddr,
> omap_fb-
> >> >size);
> >> +             }
> >> +             *vaddr = omap_fb->vaddr + offset;
> >> +     }
> >
> > Did you use ioremap_wc() to map physical memory(reserved memory) that
> kernel
> > doesn't aware of to kernel space.? if not so, the memory region mapped
> to
> > kernel space as 1:1,  this way would be faced with duplicated cache
> > attribute issue mentioned by Russel King before. 1:1 mapping region is
> > mapped to kernel space with cachable attribute.
> 
> Today the carveout memory does not have a kernel virtual mapping.  So
> we are ok.  And I think this should still be the case w/ CMA.
> 

I wonder what is the carveout and sacnout memory. carvout is physically non
continuous memory and scanout is physically continuous memory?

> [snip]
> >> +
> >> +static int omap_fbdev_create(struct drm_fb_helper *helper,
> >> +             struct drm_fb_helper_surface_size *sizes)
> >> +{
> >> +     struct omap_fbdev *fbdev = to_omap_fbdev(helper);
> >> +     struct drm_device *dev = helper->dev;
> >> +     struct drm_framebuffer *fb;
> >> +     struct fb_info *fbi;
> >> +     struct drm_mode_fb_cmd mode_cmd = {0};
> >> +     unsigned long paddr;
> >> +     void __iomem *vaddr;
> >> +     int size, screen_width;
> >> +     int ret;
> >> +
> >> +     /* only doing ARGB32 since this is what is needed to alpha-blend
> >> +      * with video overlays:
> >> +      */
> >> +     sizes->surface_bpp = 32;
> >> +     sizes->surface_depth = 32;
> >> +
> >> +     DBG("create fbdev: %dx%d@%d", sizes->surface_width,
> >> +                     sizes->surface_height, sizes->surface_bpp);
> >> +
> >> +     mode_cmd.width = sizes->surface_width;
> >> +     mode_cmd.height = sizes->surface_height;
> >> +
> >> +     mode_cmd.bpp = sizes->surface_bpp;
> >> +     mode_cmd.depth = sizes->surface_depth;
> >> +
> >> +     mutex_lock(&dev->struct_mutex);
> >> +
> >> +     fbi = framebuffer_alloc(0, dev->dev);
> >> +     if (!fbi) {
> >> +             dev_err(dev->dev, "failed to allocate fb info\n");
> >> +             ret = -ENOMEM;
> >> +             goto fail;
> >> +     }
> >> +
> >> +     DBG("fbi=%p, dev=%p", fbi, dev);
> >> +
> >> +     fbdev->fb = omap_framebuffer_init(dev, &mode_cmd, NULL);
> >> +     if (!fbdev->fb) {
> >> +             dev_err(dev->dev, "failed to allocate fb\n");
> >> +             ret = -ENOMEM;
> >> +             goto fail;
> >
> > In case of fail, framebuffer_release should be called.
> 
> oh, yeah.. // TODO cleanup ;-)
> 
> >
> >> +     }
> >> +
> >> +     fb = fbdev->fb;
> >> +     helper->fb = fb;
> >> +     helper->fbdev = fbi;
> >> +
> >> +     fbi->par = helper;
> >> +     fbi->flags = FBINFO_DEFAULT;
> >> +     fbi->fbops = &omap_fb_ops;
> >> +
> >> +     strcpy(fbi->fix.id, MODULE_NAME);
> >> +
> >> +     ret = fb_alloc_cmap(&fbi->cmap, 256, 0);
> >> +     if (ret) {
> >> +             ret = -ENOMEM;
> >> +             goto fail;
> >
> > Also, omap_framebuffer_destroy() should be called to release resource.
> >
> >> +     }
> >> +
> >> +     drm_fb_helper_fill_fix(fbi, fb->pitch, fb->depth);
> >> +     drm_fb_helper_fill_var(fbi, helper, fb->width, fb->height);
> >> +
> >> +     size = omap_framebuffer_get_buffer(fb, 0, 0,
> >> +                     &vaddr, &paddr, &screen_width);
> >> +
> >> +     dev->mode_config.fb_base = paddr;
> >> +
> >> +     fbi->screen_base = vaddr;
> >> +     fbi->screen_size = size;
> >> +     fbi->fix.smem_start = paddr;
> >> +     fbi->fix.smem_len = size;
> >> +
> >> +     DBG("par=%p, %dx%d", fbi->par, fbi->var.xres, fbi->var.yres);
> >> +     DBG("allocated %dx%d fb", fbdev->fb->width, fbdev->fb->height);
> >> +
> >> +     mutex_unlock(&dev->struct_mutex);
> >> +
> >> +     return 0;
> >> +
> >> +fail:
> >> +     mutex_unlock(&dev->struct_mutex);
> >> +     // TODO cleanup?
> >> +     return ret;
> >> +}
> >> +
> >> +static void omap_crtc_fb_gamma_set(struct drm_crtc *crtc,
> >> +             u16 red, u16 green, u16 blue, int regno)
> >> +{
> >> +     DBG("fbdev: set gamma");
> >> +}
> >> +
> >> +static void omap_crtc_fb_gamma_get(struct drm_crtc *crtc,
> >> +             u16 *red, u16 *green, u16 *blue, int regno)
> >> +{
> >> +     DBG("fbdev: get gamma");
> >> +}
> >> +
> >> +static int omap_fbdev_probe(struct drm_fb_helper *helper,
> >> +             struct drm_fb_helper_surface_size *sizes)
> >> +{
> >> +     int new_fb = 0;
> >> +     int ret;
> >> +
> >> +     if (!helper->fb) {
> >> +             ret = omap_fbdev_create(helper, sizes);
> >> +             if (ret)
> >> +                     return ret;
> >> +             new_fb = 1;
> >> +     }
> >> +     return new_fb;
> >> +}
> >> +
> >
> > This strange code must be because of mainline code below.
> > fb_probe callback of drm_fb_helper_single_fb_probe function needs return
> > values of three types. :(  I think this mainline code should be modified
> > properly.
> >
> 
> yeah
> 
> [snip]
> >> +
> >> +struct omap_gem_object {
> >> +     struct drm_gem_object base;
> >> +
> >> +     uint32_t flags;
> >> +
> >> +     /**
> >> +      * if buffer is physically contiguous or remapped in TILER, the
> >> +      * OMAP_BO_DMA flag is set and the paddr is valid.
> >> +      *
> >> +      * Note that OMAP_BO_SCANOUT is a hint from userspace that DMA
> >> capable
> >> +      * buffer is requested, but doesn't mean that it is.  Use the
> >> +      * OMAP_BO_DMA flag to determine if the buffer has a DMA capable
> >> +      * physical address.
> >> +      */
> >> +     unsigned long paddr;
> >> +
> >> +     /**
> >> +      * Array of backing pages, if allocated.  Note that pages are
> never
> >> +      * allocated for buffers originally allocated from contiguous
> >> memory
> >> +      * (which will have obj->filp==NULL)
> >> +      */
> >> +     struct page **pages;
> >
> > Is this array of backing pages used for physically non-continuous
> memory?.
> > it appears that sgx core with iommu is considered.
> 
> 
> yes (and will be similar for 2d hw).  But even in case of scanout
> buffers, we can remap them into TILER (which you can think of as a
> sort of IOMMU on OMAP4+).  The code for this isn't in place yet, but
> the intention is to
> 
> a) if userspace knows it will be a scannout buffer, allocate
> physically contiguous to begin with.. if that fails, fall back to
> dis-contiguous.
> b) if we need to scan it out, but it isn't contiguous, fall back to
> remapping into TILER.
> 
> That is a bit of over-simplification.. for tiled buffers, and buffers
> for video encode/decode, these would always be allocated as pages and
> remapped into TILER.
> 
> [sync]
> >> +
> >> +static inline void sync_op(struct drm_gem_object *obj,
> >> +             enum omap_gem_op op, bool start)
> >> +{
> >> +     struct omap_gem_object *omap_obj = to_omap_bo(obj);
> >> +
> >> +     spin_lock(&sync_lock);
> >> +
> >> +     if (!omap_obj->sync) {
> >> +             omap_obj->sync = kzalloc(sizeof(*omap_obj->sync),
> >> GFP_KERNEL);
> >> +     }
> >
> > If sync_op function is called at interrupt context, use GFP_ATOMIC
> instead
> > of GFP_KERNEL. from what you use spin_lock/unlock here, it appears that
> this
> > function would be called at interrupt context.
> 
> yeah, that is an oversight.. I updated omap_gem_set_sync_object() but
> missed this one
> 
> 
> [snip]
> >> +
> >> +/* common constructor body */
> >> +static struct drm_gem_object * omap_gem_new_impl(struct drm_device
> *dev,
> >> +             size_t size, uint32_t flags, unsigned long paddr, struct
> >> page **pages,
> >> +             struct omap_gem_vm_ops *ops)
> >> +{
> >
> > so many arguments are used at this function. I encourage you to use
> maximum
> > four arguments for ARM system.
> >
> 
> I know the reasoning for this.. although the alternative is to stick
> extra param's in a struct on the stack, which roughly amounts to what
> the compiler is doing for you anyways for args above the 1st four (ie.
> pushing them to the stack).
> 
> anyways it is a static fxn so compiler has some flexibility to inline
> or optimize however it sees fit..
> 
> [snip]
> >> diff --git a/include/linux/omap_drm.h b/include/linux/omap_drm.h
> >> new file mode 100644
> >> index 0000000..81af200
> >> --- /dev/null
> >> +++ b/include/linux/omap_drm.h
> >> @@ -0,0 +1,191 @@
> >> +/*
> >> + * linux/include/linux/omap_drm.h
> >> + *
> >> + * Copyright (C) 2011 Texas Instruments
> >> + * Author: Rob Clark <rob at ti.com>
> >> + *
> >> + * This program is free software; you can redistribute it and/or
> modify
> >> it
> >> + * under the terms of the GNU General Public License version 2 as
> >> published by
> >> + * the Free Software Foundation.
> >> + *
> >> + * This program is distributed in the hope that it will be useful, but
> >> WITHOUT
> >> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
> or
> >> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
> License
> >> for
> >> + * more details.
> >> + *
> >> + * You should have received a copy of the GNU General Public License
> >> along with
> >> + * this program.  If not, see <http://www.gnu.org/licenses/>.
> >> + */
> >> +
> >> +#ifndef __OMAP_DRM_H__
> >> +#define __OMAP_DRM_H__
> >> +
> >> +#include <linux/module.h>
> >> +#include <drm/drmP.h>
> >> +
> >> +#define OMAP_PARAM_CHIPSET_ID        1       /* ie. 0x3430, 0x4430,
etc
> > */
> >> +
> >> +struct drm_omap_param {
> >> +     uint64_t param;                 /* in */
> >> +     uint64_t value;                 /* in (set_param), out
(get_param)
> >> */
> >> +};
> >> +
> >> +struct drm_omap_get_base {
> >> +     char plugin_name[64];           /* in */
> >> +     uint32_t ioctl_base;            /* out */
> >> +};
> >> +
> >> +#define OMAP_BO_SCANOUT              0x00000001      /* scanout
capable
> > (phys
> >> contiguous) */
> >> +#define OMAP_BO_CACHE_MASK   0x00000006      /* cache type mask, see
> >> cache modes */
> >> +#define OMAP_BO_TILED_MASK   0x00000f00      /* tiled mapping mask,
see
> >> tiled modes */
> >> +
> >> +/* cache modes */
> >> +#define OMAP_BO_CACHED               0x00000000      /* default */
> >> +#define OMAP_BO_WC           0x00000002      /* write-combine */
> >> +#define OMAP_BO_UNCACHED     0x00000004      /* strongly-ordered
> > (uncached) */
> >> +
> >> +/* tiled modes */
> >> +#define OMAP_BO_TILED_8              0x00000100
> >> +#define OMAP_BO_TILED_16     0x00000200
> >> +#define OMAP_BO_TILED_32     0x00000300
> >> +
> >> +struct drm_omap_gem_new {
> >> +     union {                         /* in */
> >> +             uint32_t bytes;         /* (for non-tiled formats) */
> >> +             struct {
> >> +                     uint16_t width;
> >> +                     uint16_t height;
> >> +             } tiled;                /* (for tiled formats) */
> >> +     } size;
> >> +     uint32_t flags;                 /* in */
> >> +     uint64_t offset;                /* out */
> >> +     uint32_t handle;                /* out */
> >> +};
> >> +
> >> +/* mask of operations: */
> >> +enum omap_gem_op {
> >> +     OMAP_GEM_READ = 0x01,
> >> +     OMAP_GEM_WRITE = 0x02,
> >> +};
> >> +
> >> +struct drm_omap_gem_cpu_prep {
> >> +     uint32_t handle;                /* buffer handle (in) */
> >> +     uint32_t op;                    /* mask of omap_gem_op (in) */
> >> +};
> >> +
> >> +struct drm_omap_gem_cpu_fini {
> >> +     uint32_t handle;                /* buffer handle (in) */
> >> +     uint32_t op;                    /* mask of omap_gem_op (in) */
> >> +     /* TODO maybe here we pass down info about what regions are
> touched
> >> +      * by sw so we can be clever about cache ops?  For now a
> > placeholder,
> >> +      * set to zero and we just do full buffer flush..
> >> +      */
> >> +     uint32_t nregions;
> >> +};
> >> +
> >> +struct drm_omap_gem_info {
> >> +     uint32_t handle;                /* buffer handle (in) */
> >> +     uint32_t pad;
> >> +     uint64_t offset;                /* out */
> >> +};
> >> +
> >> +#define DRM_OMAP_GET_PARAM           0x00
> >> +#define DRM_OMAP_SET_PARAM           0x01
> >> +#define DRM_OMAP_GET_BASE            0x02
> >> +#define DRM_OMAP_GEM_NEW             0x03
> >> +#define DRM_OMAP_GEM_CPU_PREP        0x04
> >> +#define DRM_OMAP_GEM_CPU_FINI        0x05
> >> +#define DRM_OMAP_GEM_INFO    0x06
> >> +#define DRM_OMAP_NUM_IOCTLS          0x07
> >> +
> >> +#define DRM_IOCTL_OMAP_GET_PARAM     DRM_IOWR(DRM_COMMAND_BASE +
> >> DRM_OMAP_GET_PARAM, struct drm_omap_param)
> >> +#define DRM_IOCTL_OMAP_SET_PARAM     DRM_IOW (DRM_COMMAND_BASE +
> >> DRM_OMAP_SET_PARAM, struct drm_omap_param)
> >> +#define DRM_IOCTL_OMAP_GET_BASE              DRM_IOWR(DRM_COMMAND_BASE
> +
> >> DRM_OMAP_GET_BASE, struct drm_omap_get_base)
> >> +#define DRM_IOCTL_OMAP_GEM_NEW               DRM_IOWR(DRM_COMMAND_BASE
+
> >> DRM_OMAP_GEM_NEW, struct drm_omap_gem_new)
> >> +#define DRM_IOCTL_OMAP_GEM_CPU_PREP  DRM_IOW (DRM_COMMAND_BASE +
> >> DRM_OMAP_GEM_CPU_PREP, struct drm_omap_gem_cpu_prep)
> >> +#define DRM_IOCTL_OMAP_GEM_CPU_FINI  DRM_IOW (DRM_COMMAND_BASE +
> >> DRM_OMAP_GEM_CPU_FINI, struct drm_omap_gem_cpu_fini)
> >> +#define DRM_IOCTL_OMAP_GEM_INFO              DRM_IOWR(DRM_COMMAND_BASE
> +
> >> DRM_OMAP_GEM_INFO, struct drm_omap_gem_info)
> >> +
> >> +/* interface that plug-in drivers can implement */
> >> +struct omap_drm_plugin {
> >> +     const char *name;
> >> +
> >> +     /* drm functions */
> >> +     int (*open)(struct drm_device *dev, struct drm_file *file);
> >> +     int (*load)(struct drm_device *dev, unsigned long flags);
> >> +     int (*unload)(struct drm_device *dev);
> >> +     int (*release)(struct drm_device *dev, struct drm_file *file);
> >> +
> >> +     struct drm_ioctl_desc *ioctls;
> >> +     int num_ioctls;
> >> +     int ioctl_base;
> >> +
> >> +     struct list_head list;  /* note, this means struct can't be
const..
> >> */
> >> +};
> >> +
> >> +int omap_drm_register_plugin(struct omap_drm_plugin *plugin);
> >> +int omap_drm_unregister_plugin(struct omap_drm_plugin *plugin);
> >> +
> >> +int omap_drm_register_mapper(void);
> >> +void omap_drm_unregister_mapper(int id);
> >> +
> >> +/* external mappers should get paddr or pages when it needs the pages
> >> pinned
> >> + * and put when done..
> >> + */
> >> +int omap_gem_get_paddr(struct drm_gem_object *obj,
> >> +             unsigned long *paddr, bool remap);
> >> +int omap_gem_put_paddr(struct drm_gem_object *obj);
> >> +int omap_gem_get_pages(struct drm_gem_object *obj, struct page
> ***pages);
> >> +int omap_gem_put_pages(struct drm_gem_object *obj);
> >> +
> >> +uint32_t omap_gem_flags(struct drm_gem_object *obj);
> >> +void * omap_gem_priv(struct drm_gem_object *obj, int mapper_id);
> >> +void omap_gem_set_priv(struct drm_gem_object *obj, int mapper_id, void
> >> *priv);
> >> +uint64_t omap_gem_mmap_offset(struct drm_gem_object *obj);
> >> +
> >> +/* for external plugin buffers wrapped as GEM object (via.
> >> omap_gem_new_ext())
> >> + * a vm_ops struct can be provided to get callback notification of
> >> various
> >> + * events..
> >> + */
> >> +struct omap_gem_vm_ops {
> >> +     void (*open)(struct vm_area_struct * area);
> >> +     void (*close)(struct vm_area_struct * area);
> >> +
> >> +     /* note: mmap is not expected to do anything.. it is just to
> allow
> >> buffer
> >> +      * allocate to update it's own internal state
> >> +      */
> >> +     void (*mmap)(struct file *, struct vm_area_struct *);
> >> +};
> >> +
> >> +struct drm_gem_object * omap_gem_new_ext(struct drm_device *dev,
> >> +             size_t size, uint32_t flags, unsigned long paddr, struct
> >> page **pages,
> >> +             struct omap_gem_vm_ops *ops);
> >> +struct drm_gem_object * omap_gem_new(struct drm_device *dev,
> >> +             size_t size, uint32_t flags);
> >> +
> >> +void omap_gem_op_update(void);
> >> +void omap_gem_op_start(struct drm_gem_object *obj, enum omap_gem_op
> op);
> >> +void omap_gem_op_finish(struct drm_gem_object *obj, enum omap_gem_op
> op);
> >> +int omap_gem_op_sync(struct drm_gem_object *obj, enum omap_gem_op op);
> >> +int omap_gem_op_async(struct drm_gem_object *obj, enum omap_gem_op op,
> >> +             void (*fxn)(void *arg), void *arg);
> >> +void omap_gem_set_sync_object(struct drm_gem_object *obj, void
> *syncobj);
> >> +
> >> +
> >> +/* optional platform data to configure the default configuration of
> which
> >> + * pipes/overlays/CRTCs are used.. if this is not provided, then
> instead
> >> the
> >> + * first CONFIG_DRM_OMAP_NUM_CRTCS are used, and they are each
> connected
> >> to
> >> + * one manager, with priority given to managers that are connected to
> >> + * detected devices.  This should be a good default behavior for most
> >> cases,
> >> + * but yet there still might be times when you wish to do something
> >> different.
> >> + */
> >> +struct omap_drm_platform_data {
> >> +     int ovl_cnt;
> >> +     const int *ovl_ids;
> >> +     int mgr_cnt;
> >> +     const int *mgr_ids;
> >> +     int dev_cnt;
> >> +     const char **dev_names;
> >> +};
> >> +
> >> +#endif /* __OMAP_DRM_H__ */
> >> --
> >
> > Remove device specific functions from linux/include/linux/omap_drm.h and
> > move them to driver folder. I was told that include/linux/*.h file
> should
> > contain only interfaces for user process from Dave.
> 
> 
> fwiw, the functions in this header are already only ones used by other
> kernel drivers.. everything else internal to omapdrm is already in
> omap_drv.h.  I'll remove these for now (see discussion about plugin
> API), although when it is re-introduced they need to be in a header
> accessible from other drivers.  Although should maybe still be split
> from what is needed by userspace?  (Something like omapdrm_plugin.h?)
> 
> 

I'm afraid I don't understand what you mean, "Although should maybe still be
split from what is needed by userspace?  (Something like
omapdrm_plugin.h?)", could you please clarify that again?

> BR,
> -R
> 
> >> 1.7.5.4
> >
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel at lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/dri-devel
> >