<html> <head> <meta content="text/html; charset=utf-8" http-equiv="Content-Type"> </head> <body bgcolor="#FFFFFF" text="#000000"> <div class="moz-cite-prefix">On 09/02/2016 10:48 AM, Christian König wrote: </div> <blockquote cite="mid:8f5965bb-ea89-dadc-f345-ff1904d6512a@amd.com" type="cite"> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <div class="moz-cite-prefix">Am 02.09.2016 um 16:10 schrieb Leo Liu: </div> <blockquote cite="mid:57C98845.1050102@amd.com" type="cite"> On 09/02/2016 09:50 AM, Christian König wrote: <blockquote type="cite">Am 02.09.2016 um 15:27 schrieb Leo Liu: <blockquote type="cite"> On 09/02/2016 02:11 AM, Christian König wrote: <blockquote type="cite">Am 02.09.2016 um 04:03 schrieb Michel Dänzer: <blockquote type="cite">On 02/09/16 10:17 AM, Michel Dänzer wrote: <blockquote type="cite">On 02/09/16 12:58 AM, Leo Liu wrote: <blockquote type="cite">On 09/01/2016 11:54 AM, Nayan Deshmukh wrote: <blockquote type="cite">I looked at the code in dri3_glx.c and could relate some of its basic structure to vl_winsys_dri3.c. But I am new to this and not familiar with the terminology you used about the buffers. Could you please explain what needs to be done in more detail, or point me to where I can read about it? </blockquote> I believe it's in loader_dri3_helper.c, where the "is_different_gpu" condition is true; that covers both the back buffer and the front buffer case. You could try only the back buffer case for now. </blockquote> From a high level, PRIME mainly affects presentation, not so much the video decoding / rendering. The important thing is that the buffer used for presentation via the Present extension is linear, not tiled. I'm not sure whether it makes more sense to allocate a separate linear buffer for this purpose, as is done for GLX, or for the vl code to make the corresponding back (or front?) buffer linear in the first place. 
</blockquote> A separate linear buffer is probably better, actually, since it will also be pinned to system memory while it's being shared with another GPU. </blockquote> Yes, I agree. Nayan should also work on avoiding the extra copy, which currently occurs because we can't allocate output buffers directly in the format needed for presentation. The general idea should be to check during presentation whether the format of the output surface is directly displayable. </blockquote> We also have to consider the case where the drawable is resized. </blockquote> Actually we don't. Take a look at the VDPAU spec: the output surface should be sent for display regardless of its size. E.g. when the window is 256x256 pixels, but the application allocated an output surface of 1024x768, we should still send the whole surface to the X server. Resizing the output surfaces is the job of the application, not of the VDPAU state tracker. </blockquote> I thought this gets done by the vl compositor during presentation, scaling up or down from the output surface to the back buffer based on the resize. </blockquote> No, that is incorrect. Take a look at the VDPAU spec: <blockquote type="cite"> Applications may choose to allow resizing of the presentation queue target (which may be e.g. a regular Window when using an X11-based implementation). clip_width and clip_height may be used to limit the size of the displayed region of a surface, in order to match the specific region that was rendered to. In turn, this allows the application to allocate over-sized (e.g. screen-sized) surfaces, but render to a region that matches the current size of the video window. Using this technique, an application's response to window resizing may simply be to render to, and display, a different region of the surface, rather than de-/re-allocation of surfaces to match the updated window size. 
</blockquote> This means that we should send the output surface to X at its original size, regardless of the size of the window it is displayed in. That wasn't possible with DRI2, which is why we have the workaround with the delayed rendering in the mixer. </blockquote> I did a quick hack on a single GPU and tested it; it shows that the whole idea works, including resizing. A linear surface is still displayable, but playback looks somewhat sluggish. Here is the hack for reference: it removes the back buffer creation and the presentation rendering, and uses the output surface handle for X.
<pre>
diff --git a/src/gallium/auxiliary/vl/vl_winsys.h b/src/gallium/auxiliary/vl/vl_winsys.h
index 26db9f2..908ec3a 100644
--- a/src/gallium/auxiliary/vl/vl_winsys.h
+++ b/src/gallium/auxiliary/vl/vl_winsys.h
@@ -59,6 +59,9 @@ struct vl_screen
    void *
    (*get_private)(struct vl_screen *vscreen);
 
+   void *
+   (*set_output_handle)(struct vl_screen *vscreen, struct winsys_handle whandle);
+
    struct pipe_screen *pscreen;
    struct pipe_loader_device *dev;
 };
diff --git a/src/gallium/auxiliary/vl/vl_winsys_dri3.c b/src/gallium/auxiliary/vl/vl_winsys_dri3.c
index 3d596a6..4908699 100644
--- a/src/gallium/auxiliary/vl/vl_winsys_dri3.c
+++ b/src/gallium/auxiliary/vl/vl_winsys_dri3.c
@@ -64,6 +64,9 @@ struct vl_dri3_screen
    xcb_connection_t *conn;
    xcb_drawable_t drawable;
 
+   unsigned output_handle;
+   unsigned output_stride;
+
    uint32_t width, height, depth;
 
    xcb_present_event_t eid;
@@ -225,6 +228,7 @@ dri3_alloc_back_buffer(struct vl_dri3_screen *scrn)
    if (!shm_fence)
       goto close_fd;
 
+#if 0
    memset(&templ, 0, sizeof(templ));
    templ.bind = PIPE_BIND_RENDER_TARGET | PIPE_BIND_SAMPLER_VIEW |
                 PIPE_BIND_SCANOUT | PIPE_BIND_SHARED;
@@ -248,6 +252,11 @@ dri3_alloc_back_buffer(struct vl_dri3_screen *scrn)
                                            usage);
    buffer_fd = whandle.handle;
    buffer->pitch = whandle.stride;
+#endif
+
+   buffer_fd = scrn->output_handle;
+   buffer->pitch = scrn->output_stride;
+
    xcb_dri3_pixmap_from_buffer(scrn->conn,
                                (pixmap = xcb_generate_id(scrn->conn)),
                                scrn->drawable,
@@ -591,6 +600,15 @@ vl_dri3_screen_get_private(struct vl_screen *vscreen)
 }
 
 static void
+vl_dri3_set_output_handle(struct vl_screen *vscreen, struct winsys_handle whandle)
+{
+   struct vl_dri3_screen *scrn = (struct vl_dri3_screen *)vscreen;
+
+   scrn->output_handle = whandle.handle;
+   scrn->output_stride = whandle.stride;
+}
+
+static void
 vl_dri3_screen_destroy(struct vl_screen *vscreen)
 {
    struct vl_dri3_screen *scrn = (struct vl_dri3_screen *)vscreen;
@@ -706,6 +724,7 @@ vl_dri3_screen_create(Display *display, int screen)
    scrn->base.set_next_timestamp = vl_dri3_screen_set_next_timestamp;
    scrn->base.get_private = vl_dri3_screen_get_private;
    scrn->base.pscreen->flush_frontbuffer = vl_dri3_flush_frontbuffer;
+   scrn->base.set_output_handle = vl_dri3_set_output_handle;
 
    return &scrn->base;
 
diff --git a/src/gallium/state_trackers/vdpau/presentation.c b/src/gallium/state_trackers/vdpau/presentation.c
index 2862eaf..d5e832e 100644
--- a/src/gallium/state_trackers/vdpau/presentation.c
+++ b/src/gallium/state_trackers/vdpau/presentation.c
@@ -30,6 +30,11 @@
 
 #include "util/u_debug.h"
 #include "util/u_memory.h"
+#include "util/u_sampler.h"
+#include "util/u_format.h"
+#include "util/u_surface.h"
+
+#include "state_tracker/drm_driver.h"
 
 #include "vdpau_private.h"
 
@@ -216,6 +221,7 @@ vlVdpPresentationQueueDisplay(VdpPresentationQueue presentation_queue,
    struct vl_compositor *compositor;
    struct vl_compositor_state *cstate;
    struct vl_screen *vscreen;
+   struct winsys_handle whandle;
 
    pq = vlGetDataHTAB(presentation_queue);
    if (!pq)
@@ -231,14 +237,26 @@ vlVdpPresentationQueueDisplay(VdpPresentationQueue presentation_queue,
    vscreen = pq->device->vscreen;
 
    pipe_mutex_lock(pq->device->mutex);
+
+   memset(&whandle, 0, sizeof(struct winsys_handle));
+   whandle.type = DRM_API_HANDLE_TYPE_FD;
+
+   if (!vscreen->pscreen->resource_get_handle(vscreen->pscreen, surf->device->context,
+                                              surf->surface->texture, &whandle,
+                                              PIPE_HANDLE_USAGE_READ_WRITE))
+      return VDP_STATUS_NO_IMPLEMENTATION;
+
+   vscreen->set_output_handle(vscreen, whandle);
+
    tex = vscreen->texture_from_drawable(vscreen, (void *)pq->drawable);
-   if (!tex) {
-      pipe_mutex_unlock(pq->device->mutex);
-      return VDP_STATUS_INVALID_HANDLE;
-   }
+//   if (!tex) {
+//      pipe_mutex_unlock(pq->device->mutex);
+//      return VDP_STATUS_INVALID_HANDLE;
+//   }
 
    dirty_area = vscreen->get_dirty_area(vscreen);
 
+#if 0
    memset(&surf_templ, 0, sizeof(surf_templ));
    surf_templ.format = tex->format;
    surf_draw = pipe->create_surface(pipe, tex, &surf_templ);
@@ -269,6 +287,7 @@ vlVdpPresentationQueueDisplay(VdpPresentationQueue presentation_queue,
       vl_compositor_set_dst_clip(cstate, &dst_clip);
       vl_compositor_render(cstate, compositor, surf_draw, dirty_area, true);
    }
+#endif
 
    vscreen->set_next_timestamp(vscreen, earliest_presentation_time);
    pipe->screen->flush_frontbuffer(pipe->screen, tex, 0, 0,
@@ -294,8 +313,10 @@ vlVdpPresentationQueueDisplay(VdpPresentationQueue presentation_queue,
       framenum++;
    }
 
+#if 0
    pipe_resource_reference(&tex, NULL);
    pipe_surface_reference(&surf_draw, NULL);
+#endif
    pipe_mutex_unlock(pq->device->mutex);
 
    return VDP_STATUS_OK;
</pre>
Regards, Leo <blockquote cite="mid:8f5965bb-ea89-dadc-f345-ff1904d6512a@amd.com" type="cite"> But no worries, it's only a minor issue and a good task for Nayan to get deeper into the graphics stack. Regards, Christian. <blockquote cite="mid:57C98845.1050102@amd.com" type="cite"> Regards, Leo <blockquote type="cite"> Regards, Christian. <blockquote type="cite"> Regards, Leo <blockquote type="cite">If that is the case, then the handle of that surface should be sent directly to X. If that isn't the case, we reallocate the backing buffer, copy the content of the output surface into it and then send the new handle to X. Regards, Christian. 
</blockquote> </blockquote> </blockquote> </blockquote> </blockquote> </body> </html>