<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <br>
    <br>
    <div class="moz-cite-prefix">On 09/02/2016 10:48 AM, Christian König
      wrote:<br>
    </div>
    <blockquote cite="mid:8f5965bb-ea89-dadc-f345-ff1904d6512a@amd.com"
      type="cite">
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <div class="moz-cite-prefix">On 02.09.2016 at 16:10, Leo Liu
        wrote:<br>
      </div>
      <blockquote cite="mid:57C98845.1050102@amd.com" type="cite"> <br>
        <br>
        On 09/02/2016 09:50 AM, Christian König wrote: <br>
        <blockquote type="cite">On 02.09.2016 at 15:27, Leo Liu wrote:
          <br>
          <blockquote type="cite"> <br>
            <br>
            On 09/02/2016 02:11 AM, Christian König wrote: <br>
            <blockquote type="cite">On 02.09.2016 at 04:03, Michel
              Dänzer wrote: <br>
              <blockquote type="cite">On 02/09/16 10:17 AM, Michel
                Dänzer wrote: <br>
                <blockquote type="cite">On 02/09/16 12:58 AM, Leo Liu
                  wrote: <br>
                  <blockquote type="cite">On 09/01/2016 11:54 AM, Nayan
                    Deshmukh wrote: <br>
                    <blockquote type="cite">I saw the code in dri3_glx.c
                      and I could somewhat relate some basic <br>
                      code structure to the vl_winsys_dri3.c. But I am
                      new to this and not aware of the <br>
                      terminology that you used about the buffers. Could
                      you please explain what needs <br>
                      to be done in more detail or point me to where I
                      can read about it? <br>
                    </blockquote>
                    I believe the relevant code is in loader_dri3_helper.c,
                    where the "is_different_gpu" <br>
                    condition is true; that path covers both the back buffer and
                    front buffer cases. <br>
                    You could try only the back buffer case for now. <br>
                  </blockquote>
                   From a high level, PRIME mainly affects presentation,
                  not so much the <br>
                  video decoding / rendering. The important thing is
                  that the buffer used <br>
                  for presentation via the Present extension is linear,
                  not tiled. I'm not <br>
                  sure whether it makes more sense to allocate a
                  separate linear buffer <br>
                  for this purpose, as is done for GLX, or for the vl
                  code to make the <br>
                  corresponding back (or front?) buffer linear in the
                  first place. <br>
                </blockquote>
                A separate linear buffer is probably better, actually,
                since it will <br>
                also be pinned to system memory while it's being shared
                with another GPU. <br>
              </blockquote>
              <br>
              Yes, I agree. Nayan should also work on avoiding the extra
              copy that currently occurs because we can't allocate
              output buffers directly in the format needed for
              presentation. <br>
              <br>
              The general idea should be to check during presentation
              whether the format of the output surface is directly
              displayable. <br>
            </blockquote>
            <br>
            We also have to consider the case where the drawable is resized. <br>
          </blockquote>
          <br>
          Actually we don't. Take a look at the VDPAU spec: the output
          surface should be sent for display regardless of its
          size. <br>
          <br>
          E.g. when the window is 256x256 pixels but the application
          allocated an output surface of 1024x768, we should still send
          the whole surface to the X server. <br>
          <br>
          Resizing the output surfaces is the job of the application,
          not of the VDPAU state tracker. <br>
        </blockquote>
        <br>
        I thought this was done by the vl compositor during presentation,
        which scales the output surface up or down into the back buffer
        to match the resize. <br>
      </blockquote>
      <br>
      No, that is incorrect. Take a look at the VDPAU spec:<br>
      <br>
      <blockquote type="cite">
        <p>Applications may choose to allow resizing of the presentation
          queue target (which may be e.g. a regular Window when using an
          X11-based implementation).</p>
        <p><b>clip_width</b> and <b>clip_height</b> may be used to
          limit the size of the displayed region of a surface, in order
          to match the specific region that was rendered to.</p>
        <p>In turn, this allows the application to allocate over-sized
          (e.g. screen-sized) surfaces, but render to a region that
          matches the current size of the video window.</p>
        <p>Using this technique, an application's response to window
          resizing may simply be to render to, and display, a different
          region of the surface, rather than de-/re-allocation of
          surfaces to match the updated window size.</p>
      </blockquote>
      <br>
      This means that we should send the output surface to X at its
      original size, regardless of that size or of the size of the
      window it is displayed in.<br>
      <br>
      That wasn't possible with DRI2, which is why we have the workaround
      with delayed rendering in the mixer.<br>
    </blockquote>
    <br>
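    For reference, the clip semantics quoted above could be sketched
    roughly as follows. This is not Mesa code: struct region,
    clip_dimension and displayed_region are made-up names for
    illustration, and clamping a clip value to the surface size is my
    own assumption. A clip value of zero is treated as "no limit", which
    is how I read the VdpPresentationQueueDisplay description.<br>
    <br>
<pre>
```c
/* Hypothetical helper types; not the Mesa or VDPAU definitions. */
struct region {
   unsigned width;
   unsigned height;
};

/* A clip value of zero means "no limit" (whole surface). A non-zero
 * value limits the displayed size; clamping it to the surface size is
 * an assumption made for this sketch. */
static unsigned
clip_dimension(unsigned surface_size, unsigned clip)
{
   if (clip == 0 || clip > surface_size)
      return surface_size;
   return clip;
}

/* Region of the output surface that gets displayed. The window size is
 * deliberately not a parameter: only the surface size and the clip
 * values requested by the application matter. */
static struct region
displayed_region(struct region surface, unsigned clip_width,
                 unsigned clip_height)
{
   struct region r;

   r.width = clip_dimension(surface.width, clip_width);
   r.height = clip_dimension(surface.height, clip_height);
   return r;
}
```
</pre>
    With a 1024x768 output surface and no clip, the whole surface is
    displayed whatever the window size is; with a 256x256 clip, only
    that region is.<br>
    <br>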
    I did a quick hack on a single GPU and tested it; it shows that the
    whole idea works, including resizing.<br>
    A linear buffer is still displayable, but playback looks somewhat
    sluggish.<br>
    <br>
    Here is the hack for reference. It removes the back buffer creation
    and the presentation rendering, and uses the output surface handle
    for X.<br>
    <br>
    diff --git a/src/gallium/auxiliary/vl/vl_winsys.h
    b/src/gallium/auxiliary/vl/vl_winsys.h<br>
    index 26db9f2..908ec3a 100644<br>
    --- a/src/gallium/auxiliary/vl/vl_winsys.h<br>
    +++ b/src/gallium/auxiliary/vl/vl_winsys.h<br>
    @@ -59,6 +59,9 @@ struct vl_screen<br>
        void *<br>
        (*get_private)(struct vl_screen *vscreen);<br>
     <br>
    +   void<br>
    +   (*set_output_handle)(struct vl_screen *vscreen, struct
    winsys_handle whandle);<br>
    +<br>
        struct pipe_screen *pscreen;<br>
        struct pipe_loader_device *dev;<br>
     };<br>
    diff --git a/src/gallium/auxiliary/vl/vl_winsys_dri3.c
    b/src/gallium/auxiliary/vl/vl_winsys_dri3.c<br>
    index 3d596a6..4908699 100644<br>
    --- a/src/gallium/auxiliary/vl/vl_winsys_dri3.c<br>
    +++ b/src/gallium/auxiliary/vl/vl_winsys_dri3.c<br>
    @@ -64,6 +64,9 @@ struct vl_dri3_screen<br>
        xcb_connection_t *conn;<br>
        xcb_drawable_t drawable;<br>
     <br>
    +   unsigned output_handle;<br>
    +   unsigned output_stride;<br>
    +<br>
        uint32_t width, height, depth;<br>
     <br>
        xcb_present_event_t eid;<br>
    @@ -225,6 +228,7 @@ dri3_alloc_back_buffer(struct vl_dri3_screen
    *scrn)<br>
        if (!shm_fence)<br>
           goto close_fd;<br>
     <br>
    +#if 0<br>
        memset(&templ, 0, sizeof(templ));<br>
        templ.bind = PIPE_BIND_RENDER_TARGET | PIPE_BIND_SAMPLER_VIEW |<br>
                     PIPE_BIND_SCANOUT | PIPE_BIND_SHARED;<br>
    @@ -248,6 +252,11 @@ dri3_alloc_back_buffer(struct vl_dri3_screen
    *scrn)<br>
                                                usage);<br>
        buffer_fd = whandle.handle;<br>
        buffer->pitch = whandle.stride;<br>
    +#endif<br>
    +<br>
    +   buffer_fd = scrn->output_handle;<br>
    +   buffer->pitch = scrn->output_stride;<br>
    +<br>
        xcb_dri3_pixmap_from_buffer(scrn->conn,<br>
                                    (pixmap =
    xcb_generate_id(scrn->conn)),<br>
                                    scrn->drawable,<br>
    @@ -591,6 +600,15 @@ vl_dri3_screen_get_private(struct vl_screen
    *vscreen)<br>
     }<br>
     <br>
     static void<br>
    +vl_dri3_set_output_handle(struct vl_screen *vscreen, struct
    winsys_handle whandle)<br>
    +{<br>
    +   struct vl_dri3_screen *scrn = (struct vl_dri3_screen *)vscreen;<br>
    +<br>
    +   scrn->output_handle = whandle.handle;<br>
    +   scrn->output_stride = whandle.stride;<br>
    +}<br>
    +<br>
    +static void<br>
     vl_dri3_screen_destroy(struct vl_screen *vscreen)<br>
     {<br>
        struct vl_dri3_screen *scrn = (struct vl_dri3_screen *)vscreen;<br>
    @@ -706,6 +724,7 @@ vl_dri3_screen_create(Display *display, int
    screen)<br>
        scrn->base.set_next_timestamp =
    vl_dri3_screen_set_next_timestamp;<br>
        scrn->base.get_private = vl_dri3_screen_get_private;<br>
        scrn->base.pscreen->flush_frontbuffer =
    vl_dri3_flush_frontbuffer;<br>
    +   scrn->base.set_output_handle = vl_dri3_set_output_handle;<br>
     <br>
        return &scrn->base;<br>
     <br>
    diff --git a/src/gallium/state_trackers/vdpau/presentation.c
    b/src/gallium/state_trackers/vdpau/presentation.c<br>
    index 2862eaf..d5e832e 100644<br>
    --- a/src/gallium/state_trackers/vdpau/presentation.c<br>
    +++ b/src/gallium/state_trackers/vdpau/presentation.c<br>
    @@ -30,6 +30,11 @@<br>
     <br>
     #include "util/u_debug.h"<br>
     #include "util/u_memory.h"<br>
    +#include "util/u_sampler.h"<br>
    +#include "util/u_format.h"<br>
    +#include "util/u_surface.h"<br>
    +<br>
    +#include "state_tracker/drm_driver.h"<br>
     <br>
     #include "vdpau_private.h"<br>
     <br>
    @@ -216,6 +221,7 @@
    vlVdpPresentationQueueDisplay(VdpPresentationQueue
    presentation_queue,<br>
        struct vl_compositor *compositor;<br>
        struct vl_compositor_state *cstate;<br>
        struct vl_screen *vscreen;<br>
    +   struct winsys_handle whandle;<br>
     <br>
        pq = vlGetDataHTAB(presentation_queue);<br>
        if (!pq)<br>
    @@ -231,14 +237,26 @@
    vlVdpPresentationQueueDisplay(VdpPresentationQueue
    presentation_queue,<br>
        vscreen = pq->device->vscreen;<br>
     <br>
        pipe_mutex_lock(pq->device->mutex);<br>
    +<br>
    +   memset(&whandle, 0, sizeof(struct winsys_handle));<br>
    +   whandle.type = DRM_API_HANDLE_TYPE_FD;<br>
    +<br>
    +   if
    (!vscreen->pscreen->resource_get_handle(vscreen->pscreen,
    surf->device->context,<br>
    +                                     surf->surface->texture,
    &whandle,<br>
    +                     PIPE_HANDLE_USAGE_READ_WRITE))<br>
    +      return VDP_STATUS_NO_IMPLEMENTATION;<br>
    +<br>
    +   vscreen->set_output_handle(vscreen, whandle);<br>
    +<br>
        tex = vscreen->texture_from_drawable(vscreen, (void
    *)pq->drawable);<br>
    -   if (!tex) {<br>
    -      pipe_mutex_unlock(pq->device->mutex);<br>
    -      return VDP_STATUS_INVALID_HANDLE;<br>
    -   }<br>
    +//   if (!tex) {<br>
    +//      pipe_mutex_unlock(pq->device->mutex);<br>
    +//      return VDP_STATUS_INVALID_HANDLE;<br>
    +//   }<br>
     <br>
        dirty_area = vscreen->get_dirty_area(vscreen);<br>
     <br>
    +#if 0<br>
        memset(&surf_templ, 0, sizeof(surf_templ));<br>
        surf_templ.format = tex->format;<br>
        surf_draw = pipe->create_surface(pipe, tex, &surf_templ);<br>
    @@ -269,6 +287,7 @@
    vlVdpPresentationQueueDisplay(VdpPresentationQueue
    presentation_queue,<br>
           vl_compositor_set_dst_clip(cstate, &dst_clip);<br>
           vl_compositor_render(cstate, compositor, surf_draw,
    dirty_area, true);<br>
        }<br>
    +#endif<br>
     <br>
        vscreen->set_next_timestamp(vscreen,
    earliest_presentation_time);<br>
        pipe->screen->flush_frontbuffer(pipe->screen, tex, 0,
    0,<br>
    @@ -294,8 +313,10 @@
    vlVdpPresentationQueueDisplay(VdpPresentationQueue
    presentation_queue,<br>
           framenum++;<br>
        }<br>
     <br>
    +#if 0<br>
        pipe_resource_reference(&tex, NULL);<br>
        pipe_surface_reference(&surf_draw, NULL);<br>
    +#endif<br>
        pipe_mutex_unlock(pq->device->mutex);<br>
     <br>
        return VDP_STATUS_OK;<br>
    <br>
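    The hack above always hands the output surface to X. The decision
    Christian described (present directly when the surface is
    displayable, otherwise copy) might look roughly like the sketch
    below. All names here (present_path, surface_desc,
    choose_present_path) are hypothetical and do not exist in Mesa, and
    the need_linear flag is only my stand-in for the different-GPU case
    where the shared buffer has to be linear.<br>
    <br>
<pre>
```c
/* Hypothetical names throughout; nothing here is an existing Mesa API. */
enum present_path {
   PRESENT_DIRECT, /* send the output surface handle to X as-is */
   PRESENT_COPY    /* copy into a separate displayable buffer first */
};

struct surface_desc {
   int format;    /* stand-in for a pipe_format value */
   int is_linear; /* non-zero when the layout is linear, not tiled */
};

/* Pick the presentation path for an output surface.
 * displayable_format: the format the drawable can show directly.
 * need_linear: non-zero when the buffer is shared with another GPU and
 * therefore has to be linear (an assumption of this sketch). */
static enum present_path
choose_present_path(struct surface_desc surf, int displayable_format,
                    int need_linear)
{
   if (surf.format != displayable_format)
      return PRESENT_COPY;

   if (need_linear) {
      if (!surf.is_linear)
         return PRESENT_COPY;
   }

   return PRESENT_DIRECT;
}
```
</pre>
    Only the PRESENT_COPY path would need the separate backing buffer
    and the extra copy; the direct path avoids both.<br>
    <br>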
    <br>
    Regards,<br>
    Leo<br>
    <br>
    <br>
    <blockquote cite="mid:8f5965bb-ea89-dadc-f345-ff1904d6512a@amd.com"
      type="cite"> <br>
      But no worries, it's only a minor issue and a good task for Nayan to
      get deeper into the graphics stack.<br>
      <br>
      Regards,<br>
      Christian.<br>
      <br>
      <blockquote cite="mid:57C98845.1050102@amd.com" type="cite"> <br>
        Regards, <br>
        Leo <br>
        <br>
        <blockquote type="cite"> <br>
          Regards, <br>
          Christian. <br>
          <br>
          <blockquote type="cite"> <br>
            Regards, <br>
            Leo <br>
            <br>
            <blockquote type="cite">If that is the case, then the handle of
              that surface should be sent directly to X. <br>
              If that isn't the case, we reallocate the backing buffer,
              copy the content of the output surface into it, and then
              send the new handle to X. <br>
              <br>
              Regards, <br>
              Christian. <br>
            </blockquote>
            <br>
            _______________________________________________ <br>
            mesa-dev mailing list <br>
            <a moz-do-not-send="true" class="moz-txt-link-abbreviated"
              href="mailto:mesa-dev@lists.freedesktop.org">mesa-dev@lists.freedesktop.org</a>
            <br>
            <a moz-do-not-send="true" class="moz-txt-link-freetext"
              href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev">https://lists.freedesktop.org/mailman/listinfo/mesa-dev</a>
            <br>
          </blockquote>
          <br>
          <br>
        </blockquote>
        <br>
      </blockquote>
      <p><br>
      </p>
    </blockquote>
    <br>
  </body>
</html>