[RFC] Documentation: DRM framework documentation

Mon Jun 11 04:59:33 PDT 2012

On Thu, Jun 7, 2012 at 2:52 AM, Laurent Pinchart
<laurent.pinchart at ideasonboard.com> wrote:
> Hi Rob,
>
> On Tuesday 05 June 2012 04:31:54 Rob Clark wrote:
>
>> > +  - int (*firstopen) (struct drm_device *)
>> > +  - void (*lastclose) (struct drm_device *)
>> > +  - int (*open) (struct drm_device *, struct drm_file *)
>> > +  - void (*preclose) (struct drm_device *, struct drm_file *)
>> > +  - void (*postclose) (struct drm_device *, struct drm_file *)
>> > +
>> > +  Open and close handlers. None of those methods are mandatory.
>> > +
>> > +  The .firstopen() method is called by the DRM core when an application
>> > opens
>> > +  a device that has no other opened file handle. Similarly the
>> > .lastclose()
>> > +  method is called when the last application holding a file handle opened
>> > on
>> > +  the device closes it. Both methods are mostly used for UMS (User Mode
>> > +  Setting) drivers to acquire and release device resources which should
>> > be
>> > +  done in the .load() and .unload() methods for KMS drivers.
>> > +
>> > +  Note that the .lastclose() method is also called at module unload time
>> > or,
>> > +  for hot-pluggable devices, when the device is unplugged. The
>> > .firstopen()
>> > +  and .lastclose() calls can thus be unbalanced.
>> > +
>>
>> AFAIK lastclose() should also be drm_fb_helper_restore_fbdev_mode() to
>> restore fbcon mode.
>
> Good point. I haven't documented the fbdev helper yet, I'll keep that in mind
> when updating the documentation.
>
>> I'm also restoring crtc and plane properties to default values here, so a
>> subsequent open of the device isn't inheriting some state from the previous
>> user.
>
> That's very unlike V4L2, where we explicitly require state to be kept across
> opens. Is restoring properties a DRM standard behaviour, implemented by other
> drivers as well ?

I think it is in a way required, because on lastclose you are
restoring the state of the fbcon.. and at least some (probably all?)
of the properties are adjusting the mode, so the fbcon mode would be
incorrect if you didn't restore property state.

[snip]
>
>> > +- GEM Initialization
>> > +
>> > +  Drivers that use GEM must set the DRIVER_GEM bit in the struct
>> > drm_driver
>> > +  driver_features field. The DRM core will then automatically initialize
>> > the
>> > +  GEM core before calling the .load() operation.
>> > +
>> > +- GEM Objects Creation
>> > +
>> > +  GEM splits creation of GEM objects and allocation of the memory that
>> > backs
>> > +  them in two distinct operations.
>> > +
>> > +  GEM objects are represented by an instance of struct drm_gem_object.
>> > Drivers
>> > +  usually need to extend GEM objects with private information and thus
>> > create
>> > +  a driver-specific GEM object structure type that embeds an instance of
>> > +  struct drm_gem_object.
>> > +
>> > +  To create a GEM object, a driver allocates memory for an instance of
>> > its
>> > +  specific GEM object type and initializes the embedded struct
>> > drm_gem_object
>> > +  with a call to drm_gem_object_init(). The function takes a pointer to
>> > the
>> > +  DRM device, a pointer to the GEM object and the buffer object size in
>> > bytes.
>> > +
>> > +  GEM automatically allocate anonymous pageable memory through shmfs when
>> > an
>>
>> s/allocate/allocates/
>>
>> Also, maybe worth mentioning that actual 'struct page's are allocated
>> when drm_gem_get_pages() is called..
>
> I don't think that's in mainline yet, is it ?

oh, heh, you are right.. maybe I need to revisit that.   Well, for
now, s/drm_gem_get_pages()/shmem_read_mapping_page_gfp()/

[snip]
>
>> > +  To create a handle for a GEM object drivers call
>> > drm_gem_handle_create().
>> > +  The function takes a pointer to the DRM file and the GEM object and
>> > returns
>> > +  a locally unique handle. When the handle is no longer needed drivers
>> > delete
>> > +  it with a call to drm_gem_handle_delete(). Finally the GEM object
>> > associated
>> > +  with a handle can be retrieved by a call to drm_gem_object_lookup().
>>
>> I suppose it is easy enough seen from the code, but might be worth
>> mentioning that drm_gem_handle_create() increment's the refcnt of the gem
>> obj, rather than taking ownership of the gem obj.. so in some cases the
>> driver writer would want to drop that reference to the obj
>
> Do you really mean that driver sometimes need to drop the reference taken by
> the handle. My understanding is that they sometimes need to drop the reference
> they already hold. This of course makes no difference in the code, but from a
> documentation point of view that seems less confusing to me.

yes, drop the ref they already hold

>> > +  GEM names are similar in purpose to handles but are not local to DRM
>> > files.
>> > +  They can be passed between processes to reference a GEM object
>> > globally.
>> > +  Names can't be used directly to refer to objects in the DRM API,
>> > +  applications must convert handles to names and names to handles using
>> > the
>> > +  DRM_IOCTL_GEM_FLINK and DRM_IOCTL_GEM_OPEN ioctls respectively. The
>> > +  conversion is handled by the DRM core without any driver-specific
>> > support.
>> > +
>> > +  Similar to global names, GEM file descriptors are also used to share
>> > GEM
>> > +  objects across processes. They offer additional security: as file
>> > +  descriptors must be explictly sent over UNIX domain sockets to be
>> > shared
>> > +  between applications, they can't be guessed like the globally unique
>> > GEM
>> > +  names.
>> > +
>>
>> +  Drivers that support GEM file descriptors, also known as the DRM PRIME
>>
>> > API,
>> > +  must set the DRIVER_PRIME bit in the struct drm_driver driver_features
>> > field
>> > +  and implement the .prime_handle_to_fd() and .prime_fd_to_handle()
>> > +  operations.
>> > +
>> > +  int (*prime_handle_to_fd)(struct drm_device *dev,
>> > +                           struct drm_file *file_priv, uint32_t handle,
>> > +                           uint32_t flags, int *prime_fd)
>> > +  int (*prime_fd_to_handle)(struct drm_device *dev,
>> > +                           struct drm_file *file_priv, int prime_fd,
>> > +                           uint32_t *handle)
>> > +
>> > +  Those two operations convert a GEM handle to a PRIME file descriptor
>> > and
>> > +  vice versa. While the PRIME file descriptors can be specific to a
>> > device,
>> > +  their true power come from making them shareable between multiple
>> > devices
>> > +  using the cross-subsystem dma-buf buffer sharing framework. For that
>> > reason
>> > +  drivers are advised to use the drm_gem_prime_handle_to_fd() and
>> > +  drm_gem_prime_fd_to_handle() helper functions as their PRIME operations
>> > +  handlers.
>> > +
>>
>> editorial nag: maybe move the notes about dma-buf / cross-device /
>> cross-subsystem stuff above the prime_x_to_y fxn ptrs..  the real purpose
>> of drm_gem_prime_x_to_y() was to allow dma-buf/prime to be used by drivers
>> that do not use GEM.  Any driver using GEM in some form or another, either
>> by itself or in combination w/ TTM, probably wants to use
>> drm_gem_prime_x_to_y() for these two fxn ptrs and instead provide the
>> .gem_prime_{import,export} fxn ptr hooks which are used by
>> drm_gem_prime_x_to_y().
>
> I've rephrased the documentation a bit, but exchanging the two parts
> completely is difficult and confusing.
>
>> But if your buffer object handle is not a GEM handle, then you instead need
>> to implement .prime_x_to_y() fxn ptrs.. I think this applies only just to
>> one driver (vmwgfx?  I don't remember for sure, airlied mentioned it at one
>> point).  This was the reason for the split into two sets of fxn ptrs, so
>> that non GEM drivers don't get left out.
>
> So can I state in the documentation that the PRIME FD will always be a DMABUF
> FD, or is that non-mandatory ?

I guess it isn't strictly mandatory currently (if you implement your
own prime_x_to_y() fxns, although I can't think of why it would ever
be anything else

> [snip]
>
>> > +9. CRTC Operations
>> > +-------------------
>> > +
>> > +- void (*gamma_set)(struct drm_crtc *crtc, u16 *r, u16 *g, u16 *b,
>> > +                   uint32_t start, uint32_t size)
>> > +
>> > +  Apply a gamma table to the device. The operation is optional.
>> > +
>> > +- void (*destroy)(struct drm_crtc *crtc)
>> > +
>> > +  Destroy the CRTC when not needed anymore. See the KMS cleanup section.
>> > +
>> > +- int (*set_config)(struct drm_mode_set *set)
>> > +
>> > +  Apply a new CRTC configuration to the device. The configuration
>> > specifies a
>> > +  CRTC, a frame buffer to scan out from, a (x,y) position in the frame
>> > buffer,
>> > +  a display mode and an array of connectors to drive with the CRTC if
>> > +  possible.
>> > +
>> > +  If the frame buffer specified in the configuration is NULL, the driver
>> > must
>> > +  detach all encoders connected to the CRTC and all connectors attached
>> > to
>> > +  those encoders and disable them.
>> > +
>> > +  This operation is called with the mode config lock held.
>> > +
>> > +  (FIXME: How should set_config interact with DPMS? If the CRTC is
>> > suspended,
>> > +  should it be resumed?)
>>
>> DPMS calls from userspace are made on the connector, and internally
>> propagated to encoder/crtc/etc via drm_helper_connector_dpms().  When it is
>> resumed, it should (IIRC) keep it's old settings.
>
> Then what about the following ? :-)
>
> static void omap_crtc_commit(struct drm_crtc *crtc)
> {
>        struct omap_crtc *omap_crtc = to_omap_crtc(crtc);
>        DBG("%s", omap_crtc->name);
>        omap_crtc_dpms(crtc, DRM_MODE_DPMS_ON);
> }
>
> The CRTC is resumed when the mode is set.

hmm, I guess that is a bit of an implementation detail, since it is
the dpms path where the display state gets set.  But yeah, the result
is setting a mode (or setting a plane) will turn things on.

[snip]
>> > +- int (*page_flip)(struct drm_crtc *crtc, struct drm_framebuffer *fb,
>> > +                  struct drm_pending_vblank_event *event)
>> > +
>> > +  Schedule a page flip to the given frame buffer for the CRTC. This
>> > operation
>> > +  is called with the mode config mutex held.
>> > +
>> > +  Page flipping is a synchronization mechanism that replaces the frame
>> > buffer
>> > +  being scanned out by the CRTC with a new frame buffer during vertical
>> > +  blanking, avoiding tearing. When an application requests a page flip
>> > the DRM
>> > +  core verifies that the new frame buffer is large enough to be scanned
>> > out by
>> > +  the CRTC in the currently configured mode and then calls the CRTC
>> > +  .page_flip() operation with a pointer to the new frame buffer.
>> > +
>> > +  The .page_flip() operation schedules a page flip. Once any pending
>> > rendering
>> > +  targetting the new frame buffer has completed, the CRTC will be
>> > reprogrammed
>> > +  to display that frame buffer after the next vertical refresh. The
>> > operation
>> > +  must return immediately without waiting for rendering or page flip to
>> > +  complete and must block any new rendering to the frame buffer until the
>> > page
>> > +  flip completes.
>> > +
>> > +  If a page flip is already pending, the .page_flip() operation must
>> > return
>> > +  -EBUSY.
>> > +
>> > +  (FIXME: Should DRM allow queueing multiple page flips?)
>>
>> Queuing could be done in userspace (which gets a page_flip event when flip
>> completes).. this at least simplifies the driver somewhat.  And in cases
>> where UI interactivity is required you really don't want to let rendering
>> get too far ahead of scanout or the user will see this as laggy display.
>>  So I don't see a particularly good reason to support queuing.
>
> It could be useful to make up for system latency, but I suppose that if your
> system load peaks prevent you from handling the page flip fast enough compared
> to the vertical frequency, you're screwed anyway.

I guess if you can't handle an IRQ in ~16ms you are pretty screwed anyways ;-)

For UI you often care much more about latency than throughput.. ie.
time from user input to results on the screen.  At least this is what
the user is more likely to notice.  If you allow too many frames to be
queued up, what the user sees on screen will be the response to their
input several frames ago, which you really don't want because it feels
laggy.  Think about when you pick up a phone/tablet/whatever in the
shop.. at least for me the first thing I do is try and scroll back and
forth and see how closely the UI tracks my finger ;-)

It is perhaps a bit different from a camera or video playback which is
more about throughput and less about user interactiveness.  Mostly for
UI you will be double buffered or sometimes triple buffered.  But you
want to minimize the time between doing some rendering in response to
user input and when they see the result on the screen.

>> (Also, makes it easier to move page_flip event handling to the core if you
>> can have only at most one outstanding page flip.)
>>
>> > +
>> > +  To synchronize page flip to vertical blanking the driver will likely
>> > need to
>> > +  enable vertical blanking interrupts. It should call drm_vblank_get()
>> > for
>> > +  that purpose, and call drm_vblank_put() after the page flip completes.
>> > +
>> > +  If the application has requested to be notified when page flip
>> > completes the
>> > +  .page_flip() operation will be called with a non-NULL event argument
>> > +  pointing to a drm_pending_vblank_event instance. Upon page flip
>> > completion
>> > +  the driver must fill the event::event sequence, tv_sec and tv_usec
>> > fields
>> > +  with the associated vertical blanking count and timestamp, add the
>> > event to
>> > +  the drm_file list of events to be signaled, and wake up any waiting
>> > process.
>> > +  This can be performed with
>> > +
>> > +       struct timeval now;
>> > +
>> > +       event->event.sequence = drm_vblank_count_and_time(..., &now);
>> > +       event->event.tv_sec = now.tv_sec;
>> > +       event->event.tv_usec = now.tv_usec;
>> > +
>> > +       spin_lock_irqsave(&dev->event_lock, flags);
>> > +       list_add_tail(&event->base.link,
>> > &event->base.file_priv->event_list);
>> > +       wake_up_interruptible(&event->base.file_priv->event_wait);
>> > +       spin_unlock_irqrestore(&dev->event_lock, flags);
>> > +
>> > +  (FIXME: Could drivers that don't need to wait for rendering to complete
>> > just
>> > +   add the event to dev->vblank_event_list and let the DRM core handle
>> > +   everything, as for "normal" vertical blanking events?)
>>
>> Or just drm_crtc_handle_page_flip_event() which would also let us simplify
>> some of the cleanup when userspace closes the device, as mentioned earlier
>> ;-)  I think when I have a few spare minutes I'll try this.
>
> I think it's a good idea, I'm waiting for your patch ;-) For drivers that can
> process page flips immediately that should not be difficult. For drivers that
> must wait for rendering to complete before processing the page flip, this
> might be more difficult.

well, I think drm_crtc_handle_page_flip_event() would be called by
driver after rendering and after vblank event.. so that is ok.

but I'm thinking we need to start keeping a drm TODO list somewhere so
some of these cleanups don't get forgotten ;-)

>> > +
>> > +  While waiting for the page flip to complete, the event->base.link list
>> > head
>> > +  can be used freely by the driver to store the pending event in a
>> > +  driver-specific list.
>> > +
>> > +  If the file handle is closed before the event is signaled, drivers must
>> > take
>> > +  care to destroy the event in their .preclose() operation (and, if
>> > needed,
>> > +  call drm_vblank_put()).
>> > +
>> > +
>> > +10. Plane Operations
>> > +-------------------
>> > +
>> > +- int (*update_plane)(struct drm_plane *plane, struct drm_crtc *crtc,
>> > +                     struct drm_framebuffer *fb, int crtc_x, int crtc_y,
>> > +                     unsigned int crtc_w, unsigned int crtc_h,
>> > +                     uint32_t src_x, uint32_t src_y,
>> > +                     uint32_t src_w, uint32_t src_h)
>> > +
>> > +  Enable and configure the plane to use the given CRTC and frame buffer.
>> > +
>> > +  The source rectangle in frame buffer memory coordinates is given by the
>> > +  src_x, src_y, src_w and src_h parameters (as 16.16 fixed point values).
>> > +  Devices that don't support subpixel plane coordinates can ignore the
>> > +  fractional part.
>> > +
>> > +  The destination rectangle in CRTC coordinates is given by the crtc_x,
>> > +  crtc_y, crtc_w and crtc_h parameters (as integer values). Devices scale
>> > +  the source rectangle to the destination rectangle. If scaling is not
>> > +  supported, the src_w and src_h values can be ignored.
>>
>> hmm, ignored seems bad.. maybe driver should -EINVAL?
>
> What if scaling is supported but src_w and src_h values are out of range ? Or
> if they specify sub-pixel coordinates while the hardware only supports pixel
> coordinates ? Should we return an error in all those cases, or can some of
> them be ignored ?

in sub-pixel case, the driver can just ignore the fractional part.
The bigger problem is dealing w/ out of range.

> I'm also not sure to like -EINVAL, it's pretty broad. A more precise error
> code might be better. Maybe -EDOM ?

this is an open issue.. it is partly addressed by Jesse Barnes's
atomic mode set plane patch a while back, which would let you 'test' a
configuration.  Although I don't think it returned any status mask or
something like this to let the driver indicate to userspace *what* it
didn't like about the config.

BR,
-R

> [snip]
>
>> > +13. TODO
>> > +--------
>> > +
>> > +- Document the struct_mutex catch-all lock
>> > +- Document connector properties
>> > +
>> > +- crtc and encoder dpms helper operations are only mandatory if the
>> > disable
>> > +  operation isn't provided.
>> > +- crtc and connector .save and .restore operations are only used
>> > internally in
>> > +  drivers, should they be removed from the core?
>> > +- encoder mid-layer .save and .restore operations are only used
>> > internally in
>> > +  drivers, should they be removed from the core?
>> > +- encoder mid-layer .detect operation is only used internally in drivers,
>> > +  should it be removed from the core?
>> > +
>> > +- KMS drivers must call drm_vblank_pre_modeset() and
>> > drm_vblank_post_modeset()
>> > +  around mode setting. Should this be done in the DRM core?
>> > +- vblank_disable_allowed is set to 1 in the first
>> > drm_vblank_post_modeset()
>> > +  call and never set back to 0. It seems to be safe to permanently set it
>> > to 1
>> > +  in drm_vblank_init() for KMS driver, and it might be safe for UMS
>> > drivers as
>> > +  well. This should be investigated.
>>
>> cool, thx.. I may have missed some things since I was trying to hurry up
>> and finish reviewing before my battery died ;-)
>
> Thanks for the review. If you get a bit more time, you could give me your
> opinion on the questions in the TODO section :-)
>
> --
> Regards,
>
> Laurent Pinchart
>