[PATCH 45/51] drm/gm12u320: Simplify upload work
Daniel Vetter
daniel.vetter at ffwll.ch
Sat Feb 22 13:00:46 UTC 2020
On Sat, Feb 22, 2020 at 1:30 PM Hans de Goede <hdegoede at redhat.com> wrote:
>
> Hi,
>
> On 2/21/20 10:03 PM, Daniel Vetter wrote:
> > Instead of having a work item that never stops (which really should be
> > a kthread), with a dedicated workqueue to not upset anyone else, use a
> > delayed work. A bunch of changes:
> >
> > - We can throw out all the custom wakeup and requeue logic and state
> > tracking. If we schedule the work with a 0 delay it'll get
> > scheduled immediately.
>
> I'm afraid that that is not true, from the kdoc of
> queue_delayed_work_on() (which the other functions are wrappers of) :
>
> * Return: %false if @work was already on a queue, %true otherwise. If
> * @delay is zero and @dwork is idle, it will be scheduled for immediate
> * execution.
>
> And since the work gets scheduled with IDLE_TIMEOUT at the end of
> the (modified) gm12u320_fb_update_work, it will not be idle when
> gm12u320_fb_mark_dirty() does the schedule with 0 timeout, so it
> will stay scheduled at the old IDLE_TIMEOUT and we will get a
> very low framerate.
>
> Instead we could use mod_delayed_work_on in the case where we want
> 0 timeout, that will behave as queue_delayed_work_on() when the work
> has not been scheduled yet and it will modify the timeout otherwise.
>
> This will still allow us to get rid of the waitq.
Hm I missed that, will fix.
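To make the difference concrete, here is a minimal user-space sketch of the
semantics described in the kdoc above. It is a toy model, not the kernel
implementation: struct toy_dwork, the toy_* helpers and the millisecond
clock are all made up for illustration, and the 2000 ms value is taken from
the "every 2s" comment in the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed from the "every 2s" keep-alive comment in the driver. */
#define IDLE_TIMEOUT_MS 2000UL

/* Toy stand-in for struct delayed_work (hypothetical, user-space only). */
struct toy_dwork {
	bool pending;             /* true while a timer is armed */
	unsigned long expires_ms; /* absolute time at which the work would run */
};

/* Models queue_delayed_work(): does nothing if the item is already pending. */
static bool toy_queue_delayed_work(struct toy_dwork *w, unsigned long now_ms,
				   unsigned long delay_ms)
{
	if (w->pending)
		return false;
	w->pending = true;
	w->expires_ms = now_ms + delay_ms;
	return true;
}

/* Models mod_delayed_work(): always (re)arms; returns true if it was pending. */
static bool toy_mod_delayed_work(struct toy_dwork *w, unsigned long now_ms,
				 unsigned long delay_ms)
{
	bool was_pending = w->pending;

	w->pending = true;
	w->expires_ms = now_ms + delay_ms;
	return was_pending;
}

/*
 * Replays the sequence from the review: the work function re-arms itself
 * with IDLE_TIMEOUT at t=0, then mark_dirty asks for a 0 delay at t=100.
 * Returns the time at which the work would run next.
 */
static unsigned long toy_next_run_ms(bool use_mod)
{
	struct toy_dwork w = { .pending = false, .expires_ms = 0 };
	bool newly_queued = toy_queue_delayed_work(&w, 0, IDLE_TIMEOUT_MS);

	assert(newly_queued); /* the item was idle, so this arms the timer */
	if (use_mod)
		toy_mod_delayed_work(&w, 100, 0);
	else
		toy_queue_delayed_work(&w, 100, 0);
	return w.expires_ms;
}
```

In this model toy_next_run_ms(false) stays at 2000 (the dirty frame waits for
the old idle timeout), while toy_next_run_ms(true) moves to 100, which is the
behaviour we want from mark_dirty.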
> ###
>
> More in general though I'm not sure if getting rid of having our own
> workqueue is a good idea (getting rid of the waitq is still a nice
> cleanup). These projectors can be connected over USB2, and we send 20
> blocks for a frame update. For each block we send a command + data
> + readback status, the data part does not fit in a single USB 2 timeslice
> so that takes 2 ms + 1 ms for the command + 1ms of the status, so this
> takes approx. 80 ms on an idle USB-2 bus, if the bus is in use things get
> worse and this assumes instant turn around for all the commands from the
> projector.
>
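For reference, the arithmetic behind that estimate can be sketched as below.
The per-transfer times are the ones quoted above; the constant and function
names are made up for illustration, and the final draw command plus its
status read are left out, as in the estimate itself.

```c
#include <assert.h>

/* Figures quoted in the estimate above; names are hypothetical. */
enum {
	BLOCKS_PER_FRAME = 20, /* blocks sent per frame update */
	CMD_MS = 1,            /* command transfer */
	DATA_MS = 2,           /* data spans two USB 2 timeslices */
	STATUS_MS = 1,         /* status readback */
};

/* Time to push one frame on an otherwise idle USB 2 bus, in milliseconds. */
static int frame_time_ms(void)
{
	return BLOCKS_PER_FRAME * (CMD_MS + DATA_MS + STATUS_MS);
}

/* Upper bound on the frame rate if uploads run back to back. */
static double max_fps(int frame_ms)
{
	assert(frame_ms > 0);
	return 1000.0 / frame_ms;
}
```

That gives 80 ms per frame, so at most 12.5 frames per second over USB 2 even
before any queueing delay is added.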
> schedule_delayed_work() uses the system_wq and that is described in
> the docs as:
>
> * system_wq is the one used by schedule[_delayed]_work[_on]().
> * Multi-CPU multi-threaded. There are users which expect relatively
> * short queue flush time. Don't queue works which can run for too
> * long.
>
> Arguably 80 ms is way too long, which would bring us to:
tbh I have no idea what's considered "long" in this context.
> * system_long_wq is similar to system_wq but may host long running
> * works. Queue flushing might take relatively long.
>
> But when connected over USB-3 we can easily do 60 FPS and we really
> don't want frame updates to be delayed by other long running works.
This is not what happens: the workqueue subsystem spins up additional
worker threads in that case. If you're worried about latency then use
system_unbound_wq. The only reason you want your own workqueue is if
you need to flush the entire queue (instead of individual work items),
maybe because you don't want to deadlock with random other work items
that run there. As long as all you do is run a single work item, you
can just flush that, so no concern. Iirc the workqueue subsystem even
internally shares the actual worker threads, so your own wq is just
book-keeping for queue flushes.
> So neither of the standard available queues is really suitable
> and thus we really should keep using our own queue for this IMHO.
We can pick another one, but your own is imo still overkill. We don't
even do that in atomic helpers, and those hang out for at least a full
frame on the worker thread too. Thus far no screaming (but yeah it's
maybe not 80ms).
btw, can you give this a spin with your hw? Testing this stuff,
especially hotunplug and driver load would be really good.
Thanks, Daniel
>
> Regards,
>
> Hans
>
> >
> > - Persistent state (frame & draw_status_timeout) need to be moved out
> > of the work.
> >
> > - diff is bigger than the changes, biggest chunk is reindenting the
> > work fn because it lost its while loop.
> >
> > Lots of code deletion as a consequence all over. Specifically we can
> > delete the drm_driver.release code now!
> >
> > Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
> > Cc: Hans de Goede <hdegoede at redhat.com>
> > Cc: "Noralf Trønnes" <noralf at tronnes.org>
> > ---
> > drivers/gpu/drm/tiny/gm12u320.c | 170 +++++++++++++-------------------
> > 1 file changed, 67 insertions(+), 103 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/tiny/gm12u320.c b/drivers/gpu/drm/tiny/gm12u320.c
> > index c22b2ee470eb..46f5cea335a7 100644
> > --- a/drivers/gpu/drm/tiny/gm12u320.c
> > +++ b/drivers/gpu/drm/tiny/gm12u320.c
> > @@ -89,13 +89,12 @@ struct gm12u320_device {
> > unsigned char *cmd_buf;
> > unsigned char *data_buf[GM12U320_BLOCK_COUNT];
> > struct {
> > - bool run;
> > - struct workqueue_struct *workq;
> > - struct work_struct work;
> > - wait_queue_head_t waitq;
> > + struct delayed_work work;
> > struct mutex lock;
> > struct drm_framebuffer *fb;
> > struct drm_rect rect;
> > + int frame;
> > + int draw_status_timeout;
> > } fb_update;
> > };
> >
> > @@ -183,19 +182,9 @@ static int gm12u320_usb_alloc(struct gm12u320_device *gm12u320)
> > data_block_footer, DATA_BLOCK_FOOTER_SIZE);
> > }
> >
> > - gm12u320->fb_update.workq = create_singlethread_workqueue(DRIVER_NAME);
> > - if (!gm12u320->fb_update.workq)
> > - return -ENOMEM;
> > -
> > return 0;
> > }
> >
> > -static void gm12u320_usb_free(struct gm12u320_device *gm12u320)
> > -{
> > - if (gm12u320->fb_update.workq)
> > - destroy_workqueue(gm12u320->fb_update.workq);
> > -}
> > -
> > static int gm12u320_misc_request(struct gm12u320_device *gm12u320,
> > u8 req_a, u8 req_b,
> > u8 arg_a, u8 arg_b, u8 arg_c, u8 arg_d)
> > @@ -338,80 +327,76 @@ static void gm12u320_copy_fb_to_blocks(struct gm12u320_device *gm12u320)
> > static void gm12u320_fb_update_work(struct work_struct *work)
> > {
> > struct gm12u320_device *gm12u320 =
> > - container_of(work, struct gm12u320_device, fb_update.work);
> > - int draw_status_timeout = FIRST_FRAME_TIMEOUT;
> > + container_of(to_delayed_work(work), struct gm12u320_device,
> > + fb_update.work);
> > int block, block_size, len;
> > - int frame = 0;
> > int ret = 0;
> >
> > - while (gm12u320->fb_update.run) {
> > - gm12u320_copy_fb_to_blocks(gm12u320);
> > -
> > - for (block = 0; block < GM12U320_BLOCK_COUNT; block++) {
> > - if (block == GM12U320_BLOCK_COUNT - 1)
> > - block_size = DATA_LAST_BLOCK_SIZE;
> > - else
> > - block_size = DATA_BLOCK_SIZE;
> > -
> > - /* Send data command to device */
> > - memcpy(gm12u320->cmd_buf, cmd_data, CMD_SIZE);
> > - gm12u320->cmd_buf[8] = block_size & 0xff;
> > - gm12u320->cmd_buf[9] = block_size >> 8;
> > - gm12u320->cmd_buf[20] = 0xfc - block * 4;
> > - gm12u320->cmd_buf[21] = block | (frame << 7);
> > -
> > - ret = usb_bulk_msg(gm12u320->udev,
> > - usb_sndbulkpipe(gm12u320->udev, DATA_SND_EPT),
> > - gm12u320->cmd_buf, CMD_SIZE, &len,
> > - CMD_TIMEOUT);
> > - if (ret || len != CMD_SIZE)
> > - goto err;
> > -
> > - /* Send data block to device */
> > - ret = usb_bulk_msg(gm12u320->udev,
> > - usb_sndbulkpipe(gm12u320->udev, DATA_SND_EPT),
> > - gm12u320->data_buf[block], block_size,
> > - &len, DATA_TIMEOUT);
> > - if (ret || len != block_size)
> > - goto err;
> > -
> > - /* Read status */
> > - ret = usb_bulk_msg(gm12u320->udev,
> > - usb_rcvbulkpipe(gm12u320->udev, DATA_RCV_EPT),
> > - gm12u320->cmd_buf, READ_STATUS_SIZE, &len,
> > - CMD_TIMEOUT);
> > - if (ret || len != READ_STATUS_SIZE)
> > - goto err;
> > - }
> > + gm12u320_copy_fb_to_blocks(gm12u320);
> > +
> > + for (block = 0; block < GM12U320_BLOCK_COUNT; block++) {
> > + if (block == GM12U320_BLOCK_COUNT - 1)
> > + block_size = DATA_LAST_BLOCK_SIZE;
> > + else
> > + block_size = DATA_BLOCK_SIZE;
> > +
> > + /* Send data command to device */
> > + memcpy(gm12u320->cmd_buf, cmd_data, CMD_SIZE);
> > + gm12u320->cmd_buf[8] = block_size & 0xff;
> > + gm12u320->cmd_buf[9] = block_size >> 8;
> > + gm12u320->cmd_buf[20] = 0xfc - block * 4;
> > + gm12u320->cmd_buf[21] =
> > + block | (gm12u320->fb_update.frame << 7);
> >
> > - /* Send draw command to device */
> > - memcpy(gm12u320->cmd_buf, cmd_draw, CMD_SIZE);
> > ret = usb_bulk_msg(gm12u320->udev,
> > usb_sndbulkpipe(gm12u320->udev, DATA_SND_EPT),
> > - gm12u320->cmd_buf, CMD_SIZE, &len, CMD_TIMEOUT);
> > + gm12u320->cmd_buf, CMD_SIZE, &len,
> > + CMD_TIMEOUT);
> > if (ret || len != CMD_SIZE)
> > goto err;
> >
> > + /* Send data block to device */
> > + ret = usb_bulk_msg(gm12u320->udev,
> > + usb_sndbulkpipe(gm12u320->udev, DATA_SND_EPT),
> > + gm12u320->data_buf[block], block_size,
> > + &len, DATA_TIMEOUT);
> > + if (ret || len != block_size)
> > + goto err;
> > +
> > /* Read status */
> > ret = usb_bulk_msg(gm12u320->udev,
> > usb_rcvbulkpipe(gm12u320->udev, DATA_RCV_EPT),
> > gm12u320->cmd_buf, READ_STATUS_SIZE, &len,
> > - draw_status_timeout);
> > + CMD_TIMEOUT);
> > if (ret || len != READ_STATUS_SIZE)
> > goto err;
> > -
> > - draw_status_timeout = CMD_TIMEOUT;
> > - frame = !frame;
> > -
> > - /*
> > - * We must draw a frame every 2s otherwise the projector
> > - * switches back to showing its logo.
> > - */
> > - wait_event_timeout(gm12u320->fb_update.waitq,
> > - !gm12u320->fb_update.run ||
> > - gm12u320->fb_update.fb != NULL,
> > - IDLE_TIMEOUT);
> > }
> > +
> > + /* Send draw command to device */
> > + memcpy(gm12u320->cmd_buf, cmd_draw, CMD_SIZE);
> > + ret = usb_bulk_msg(gm12u320->udev,
> > + usb_sndbulkpipe(gm12u320->udev, DATA_SND_EPT),
> > + gm12u320->cmd_buf, CMD_SIZE, &len, CMD_TIMEOUT);
> > + if (ret || len != CMD_SIZE)
> > + goto err;
> > +
> > + /* Read status */
> > + ret = usb_bulk_msg(gm12u320->udev,
> > + usb_rcvbulkpipe(gm12u320->udev, DATA_RCV_EPT),
> > + gm12u320->cmd_buf, READ_STATUS_SIZE, &len,
> > + gm12u320->fb_update.draw_status_timeout);
> > + if (ret || len != READ_STATUS_SIZE)
> > + goto err;
> > +
> > + gm12u320->fb_update.draw_status_timeout = CMD_TIMEOUT;
> > + gm12u320->fb_update.frame = !gm12u320->fb_update.frame;
> > +
> > + /*
> > + * We must draw a frame every 2s otherwise the projector
> > + * switches back to showing its logo.
> > + */
> > + schedule_delayed_work(&gm12u320->fb_update.work, IDLE_TIMEOUT);
> > +
> > return;
> > err:
> > /* Do not log errors caused by module unload or device unplug */
> > @@ -446,36 +431,24 @@ static void gm12u320_fb_mark_dirty(struct drm_framebuffer *fb,
> > mutex_unlock(&gm12u320->fb_update.lock);
> >
> > if (wakeup)
> > - wake_up(&gm12u320->fb_update.waitq);
> > + schedule_delayed_work(&gm12u320->fb_update.work, 0);
> >
> > if (old_fb)
> > drm_framebuffer_put(old_fb);
> > }
> >
> > -static void gm12u320_start_fb_update(struct gm12u320_device *gm12u320)
> > -{
> > - mutex_lock(&gm12u320->fb_update.lock);
> > - gm12u320->fb_update.run = true;
> > - mutex_unlock(&gm12u320->fb_update.lock);
> > -
> > - queue_work(gm12u320->fb_update.workq, &gm12u320->fb_update.work);
> > -}
> > -
> > static void gm12u320_stop_fb_update(struct gm12u320_device *gm12u320)
> > {
> > - mutex_lock(&gm12u320->fb_update.lock);
> > - gm12u320->fb_update.run = false;
> > - mutex_unlock(&gm12u320->fb_update.lock);
> > + struct drm_framebuffer *old_fb;
> >
> > - wake_up(&gm12u320->fb_update.waitq);
> > - cancel_work_sync(&gm12u320->fb_update.work);
> > + cancel_delayed_work_sync(&gm12u320->fb_update.work);
> >
> > mutex_lock(&gm12u320->fb_update.lock);
> > - if (gm12u320->fb_update.fb) {
> > - drm_framebuffer_put(gm12u320->fb_update.fb);
> > - gm12u320->fb_update.fb = NULL;
> > - }
> > + old_fb = gm12u320->fb_update.fb;
> > + gm12u320->fb_update.fb = NULL;
> > mutex_unlock(&gm12u320->fb_update.lock);
> > +
> > + drm_framebuffer_put(old_fb);
> > }
> >
> > static int gm12u320_set_ecomode(struct gm12u320_device *gm12u320)
> > @@ -583,11 +556,11 @@ static void gm12u320_pipe_enable(struct drm_simple_display_pipe *pipe,
> > struct drm_crtc_state *crtc_state,
> > struct drm_plane_state *plane_state)
> > {
> > - struct gm12u320_device *gm12u320 = pipe->crtc.dev->dev_private;
> > struct drm_rect rect = { 0, 0, GM12U320_USER_WIDTH, GM12U320_HEIGHT };
> > + struct gm12u320_device *gm12u320 = pipe->crtc.dev->dev_private;
> >
> > + gm12u320->fb_update.draw_status_timeout = FIRST_FRAME_TIMEOUT;
> > gm12u320_fb_mark_dirty(plane_state->fb, &rect);
> > - gm12u320_start_fb_update(gm12u320);
> > }
> >
> > static void gm12u320_pipe_disable(struct drm_simple_display_pipe *pipe)
> > @@ -622,13 +595,6 @@ static const uint64_t gm12u320_pipe_modifiers[] = {
> > DRM_FORMAT_MOD_INVALID
> > };
> >
> > -static void gm12u320_driver_release(struct drm_device *dev)
> > -{
> > - struct gm12u320_device *gm12u320 = dev->dev_private;
> > -
> > - gm12u320_usb_free(gm12u320);
> > -}
> > -
> > DEFINE_DRM_GEM_FOPS(gm12u320_fops);
> >
> > static struct drm_driver gm12u320_drm_driver = {
> > @@ -640,7 +606,6 @@ static struct drm_driver gm12u320_drm_driver = {
> > .major = DRIVER_MAJOR,
> > .minor = DRIVER_MINOR,
> >
> > - .release = gm12u320_driver_release,
> > .fops = &gm12u320_fops,
> > DRM_GEM_SHMEM_DRIVER_OPS,
> > };
> > @@ -670,9 +635,8 @@ static int gm12u320_usb_probe(struct usb_interface *interface,
> > return -ENOMEM;
> >
> > gm12u320->udev = interface_to_usbdev(interface);
> > - INIT_WORK(&gm12u320->fb_update.work, gm12u320_fb_update_work);
> > + INIT_DELAYED_WORK(&gm12u320->fb_update.work, gm12u320_fb_update_work);
> > mutex_init(&gm12u320->fb_update.lock);
> > - init_waitqueue_head(&gm12u320->fb_update.waitq);
> >
> > dev = &gm12u320->dev;
> > ret = devm_drm_dev_init(&interface->dev, dev, &gm12u320_drm_driver);
> >
>
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch