<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<br>
<br>
<div class="moz-cite-prefix">On 09/02/2016 10:48 AM, Christian König
wrote:<br>
</div>
<blockquote cite="mid:8f5965bb-ea89-dadc-f345-ff1904d6512a@amd.com"
type="cite">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<div class="moz-cite-prefix">On 02.09.2016 at 16:10, Leo
Liu wrote:<br>
</div>
<blockquote cite="mid:57C98845.1050102@amd.com" type="cite"> <br>
<br>
On 09/02/2016 09:50 AM, Christian König wrote: <br>
<blockquote type="cite">On 02.09.2016 at 15:27, Leo Liu wrote:
<br>
<blockquote type="cite"> <br>
<br>
On 09/02/2016 02:11 AM, Christian König wrote: <br>
<blockquote type="cite">On 02.09.2016 at 04:03, Michel
Dänzer wrote: <br>
<blockquote type="cite">On 02/09/16 10:17 AM, Michel
Dänzer wrote: <br>
<blockquote type="cite">On 02/09/16 12:58 AM, Leo Liu
wrote: <br>
<blockquote type="cite">On 09/01/2016 11:54 AM, Nayan
Deshmukh wrote: <br>
<blockquote type="cite">I saw the code in dri3_glx.c
and I could somewhat relate some basic <br>
code structure to the vl_winsys_dri3.c. But I am
new to this and not aware of the <br>
terminology that you used about the buffers. Could
you please explain what needs <br>
to be done in more detail or point me to where I
can read about it. <br>
</blockquote>
I believe it's from loader_dri3_helper.c, where the
"is_different_gpu" <br>
condition is true; that covers both the back buffer and
front buffer cases. <br>
You could try only the back buffer case for now. <br>
</blockquote>
From a high level, PRIME mainly affects presentation,
not so much the <br>
video decoding / rendering. The important thing is
that the buffer used <br>
for presentation via the Present extension is linear,
not tiled. I'm not <br>
sure whether it makes more sense to allocate a
separate linear buffer <br>
for this purpose, as is done for GLX, or for the vl
code to make the <br>
corresponding back (or front?) buffer linear in the
first place. <br>
</blockquote>
A separate linear buffer is probably better, actually,
since it will <br>
also be pinned to system memory while it's being shared
with another GPU. <br>
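<br>
A rough sketch of that choice, not taken from Mesa: the buffer handed to
the Present extension only needs to be linear when a second GPU is
involved. The enum values and the function name are hypothetical stand-ins
(gallium's real flags are the PIPE_BIND_* bits), and "is_different_gpu"
mirrors the condition from loader_dri3_helper.c mentioned earlier in the
thread. <br>
<pre>
```c
/* Hypothetical stand-ins for gallium's PIPE_BIND_* flags; the values are
 * illustrative only and do not match p_defines.h. */
enum sketch_bind {
   SKETCH_BIND_SCANOUT = 0x1,
   SKETCH_BIND_SHARED  = 0x2,
   SKETCH_BIND_LINEAR  = 0x4
};

/* Pick bind flags for the buffer that is handed to the Present extension.
 * With PRIME (render GPU != display GPU) the shared buffer has to be
 * linear, since the other GPU cannot be assumed to understand this GPU's
 * tiling layout. */
static unsigned
sketch_present_bind_flags(int is_different_gpu)
{
   unsigned bind = SKETCH_BIND_SCANOUT | SKETCH_BIND_SHARED;

   if (is_different_gpu)
      bind |= SKETCH_BIND_LINEAR;

   return bind;
}
```
</pre>
On a single GPU the flags stay as they are, so the tiled path is not
penalized. <br>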
</blockquote>
<br>
Yes, I agree. Nayan should also work on avoiding the extra
copy that currently occurs because we can't allocate
output buffers directly in the format needed for
presentation. <br>
<br>
The general idea should be to check during presentation
whether the format of the output surface is directly
displayable. <br>
</blockquote>
<br>
Also, we have to consider the case where the drawable is resized. <br>
</blockquote>
<br>
Actually we don't. Take a look at the VDPAU spec: the output
surface should be sent for display regardless of its
size. <br>
<br>
E.g. when the window is 256x256 pixels, but the application
allocated an output surface of 1024x768 we should still send
the whole surface to the X server. <br>
<br>
It's the job of the application to resize the output surfaces,
not that of the VDPAU state tracker. <br>
</blockquote>
<br>
I thought this gets done by the vl compositor during presentation,
scaling up or down from the output surface to the back buffer based on
the resize. <br>
</blockquote>
<br>
No, that is incorrect. Take a look at the VDPAU spec:<br>
<br>
<blockquote type="cite">
<p>Applications may choose to allow resizing of the presentation
queue target (which may be e.g. a regular Window when using an
X11-based implementation).</p>
<p><b>clip_width</b> and <b>clip_height</b> may be used to
limit the size of the displayed region of a surface, in order
to match the specific region that was rendered to.</p>
<p>In turn, this allows the application to allocate over-sized
(e.g. screen-sized) surfaces, but render to a region that
matches the current size of the video window.</p>
<p>Using this technique, an application's response to window
resizing may simply be to render to, and display, a different
region of the surface, rather than de-/re-allocation of
surfaces to match the updated window size.</p>
</blockquote>
<br>
This means that we should send the output surface to X at its
original size, regardless of how large it is or how large the window
is that it is displayed in.<br>
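<br>
To illustrate the spec text above, here is a minimal sketch of how the
displayed region follows from the surface size and the clip values. The
struct and function names are made up for this example, and it assumes
that a clip value of 0 means "no clipping" for that dimension.<br>
<pre>
```c
/* Hypothetical helper type, not part of VDPAU or Mesa. */
struct sketch_size {
   unsigned width;
   unsigned height;
};

/* Region of the output surface that ends up on screen.  The surface keeps
 * its allocated size; only the clip values change when the window is
 * resized.  Assumption: a clip value of 0 means "no clipping". */
static struct sketch_size
sketch_displayed_region(struct sketch_size surface,
                        unsigned clip_width, unsigned clip_height)
{
   struct sketch_size region = surface;

   if (clip_width != 0 && clip_width < surface.width)
      region.width = clip_width;
   if (clip_height != 0 && clip_height < surface.height)
      region.height = clip_height;

   return region;
}
```
</pre>
With clip values of 0 the whole 1024x768 surface from the example above
is sent; clip values of 256 would limit the displayed region to 256x256
without reallocating the surface.<br>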
<br>
That wasn't possible with DRI2, which is why we have the workaround
with the delayed rendering in the mixer.<br>
</blockquote>
<br>
I did a quick hack on a single GPU and tested it; it shows that the whole
idea works, including resizing.<br>
A linear buffer is still displayable, but playback looks somewhat
sluggish.<br>
<br>
Here is the hack for reference. It removes the back buffer
creation and the presentation rendering, and uses the output surface
handle for X:<br>
<br>
diff --git a/src/gallium/auxiliary/vl/vl_winsys.h
b/src/gallium/auxiliary/vl/vl_winsys.h<br>
index 26db9f2..908ec3a 100644<br>
--- a/src/gallium/auxiliary/vl/vl_winsys.h<br>
+++ b/src/gallium/auxiliary/vl/vl_winsys.h<br>
@@ -59,6 +59,9 @@ struct vl_screen<br>
void *<br>
(*get_private)(struct vl_screen *vscreen);<br>
<br>
+ void<br>
+ (*set_output_handle)(struct vl_screen *vscreen, struct
winsys_handle whandle);<br>
+<br>
struct pipe_screen *pscreen;<br>
struct pipe_loader_device *dev;<br>
};<br>
diff --git a/src/gallium/auxiliary/vl/vl_winsys_dri3.c
b/src/gallium/auxiliary/vl/vl_winsys_dri3.c<br>
index 3d596a6..4908699 100644<br>
--- a/src/gallium/auxiliary/vl/vl_winsys_dri3.c<br>
+++ b/src/gallium/auxiliary/vl/vl_winsys_dri3.c<br>
@@ -64,6 +64,9 @@ struct vl_dri3_screen<br>
xcb_connection_t *conn;<br>
xcb_drawable_t drawable;<br>
<br>
+ unsigned output_handle;<br>
+ unsigned output_stride;<br>
+<br>
uint32_t width, height, depth;<br>
<br>
xcb_present_event_t eid;<br>
@@ -225,6 +228,7 @@ dri3_alloc_back_buffer(struct vl_dri3_screen
*scrn)<br>
if (!shm_fence)<br>
goto close_fd;<br>
<br>
+#if 0<br>
memset(&templ, 0, sizeof(templ));<br>
templ.bind = PIPE_BIND_RENDER_TARGET | PIPE_BIND_SAMPLER_VIEW |<br>
PIPE_BIND_SCANOUT | PIPE_BIND_SHARED;<br>
@@ -248,6 +252,11 @@ dri3_alloc_back_buffer(struct vl_dri3_screen
*scrn)<br>
usage);<br>
buffer_fd = whandle.handle;<br>
buffer->pitch = whandle.stride;<br>
+#endif<br>
+<br>
+ buffer_fd = scrn->output_handle;<br>
+ buffer->pitch = scrn->output_stride;<br>
+<br>
xcb_dri3_pixmap_from_buffer(scrn->conn,<br>
(pixmap =
xcb_generate_id(scrn->conn)),<br>
scrn->drawable,<br>
@@ -591,6 +600,15 @@ vl_dri3_screen_get_private(struct vl_screen
*vscreen)<br>
}<br>
<br>
static void<br>
+vl_dri3_set_output_handle(struct vl_screen *vscreen, struct
winsys_handle whandle)<br>
+{<br>
+ struct vl_dri3_screen *scrn = (struct vl_dri3_screen *)vscreen;<br>
+<br>
+ scrn->output_handle = whandle.handle;<br>
+ scrn->output_stride = whandle.stride;<br>
+}<br>
+<br>
+static void<br>
vl_dri3_screen_destroy(struct vl_screen *vscreen)<br>
{<br>
struct vl_dri3_screen *scrn = (struct vl_dri3_screen *)vscreen;<br>
@@ -706,6 +724,7 @@ vl_dri3_screen_create(Display *display, int
screen)<br>
scrn->base.set_next_timestamp =
vl_dri3_screen_set_next_timestamp;<br>
scrn->base.get_private = vl_dri3_screen_get_private;<br>
scrn->base.pscreen->flush_frontbuffer =
vl_dri3_flush_frontbuffer;<br>
+ scrn->base.set_output_handle = vl_dri3_set_output_handle;<br>
<br>
return &scrn->base;<br>
<br>
diff --git a/src/gallium/state_trackers/vdpau/presentation.c
b/src/gallium/state_trackers/vdpau/presentation.c<br>
index 2862eaf..d5e832e 100644<br>
--- a/src/gallium/state_trackers/vdpau/presentation.c<br>
+++ b/src/gallium/state_trackers/vdpau/presentation.c<br>
@@ -30,6 +30,11 @@<br>
<br>
#include "util/u_debug.h"<br>
#include "util/u_memory.h"<br>
+#include "util/u_sampler.h"<br>
+#include "util/u_format.h"<br>
+#include "util/u_surface.h"<br>
+<br>
+#include "state_tracker/drm_driver.h"<br>
<br>
#include "vdpau_private.h"<br>
<br>
@@ -216,6 +221,7 @@
vlVdpPresentationQueueDisplay(VdpPresentationQueue
presentation_queue,<br>
struct vl_compositor *compositor;<br>
struct vl_compositor_state *cstate;<br>
struct vl_screen *vscreen;<br>
+ struct winsys_handle whandle;<br>
<br>
pq = vlGetDataHTAB(presentation_queue);<br>
if (!pq)<br>
@@ -231,14 +237,26 @@
vlVdpPresentationQueueDisplay(VdpPresentationQueue
presentation_queue,<br>
vscreen = pq->device->vscreen;<br>
<br>
pipe_mutex_lock(pq->device->mutex);<br>
+<br>
+ memset(&whandle, 0, sizeof(struct winsys_handle));<br>
+ whandle.type = DRM_API_HANDLE_TYPE_FD;<br>
+<br>
+ if
(!vscreen->pscreen->resource_get_handle(vscreen->pscreen,
surf->device->context,<br>
+ surf->surface->texture,
&whandle,<br>
+ PIPE_HANDLE_USAGE_READ_WRITE))<br>
+ return VDP_STATUS_NO_IMPLEMENTATION;<br>
+<br>
+ vscreen->set_output_handle(vscreen, whandle);<br>
+<br>
tex = vscreen->texture_from_drawable(vscreen, (void
*)pq->drawable);<br>
- if (!tex) {<br>
- pipe_mutex_unlock(pq->device->mutex);<br>
- return VDP_STATUS_INVALID_HANDLE;<br>
- }<br>
+// if (!tex) {<br>
+// pipe_mutex_unlock(pq->device->mutex);<br>
+// return VDP_STATUS_INVALID_HANDLE;<br>
+// }<br>
<br>
dirty_area = vscreen->get_dirty_area(vscreen);<br>
<br>
+#if 0<br>
memset(&surf_templ, 0, sizeof(surf_templ));<br>
surf_templ.format = tex->format;<br>
surf_draw = pipe->create_surface(pipe, tex, &surf_templ);<br>
@@ -269,6 +287,7 @@
vlVdpPresentationQueueDisplay(VdpPresentationQueue
presentation_queue,<br>
vl_compositor_set_dst_clip(cstate, &dst_clip);<br>
vl_compositor_render(cstate, compositor, surf_draw,
dirty_area, true);<br>
}<br>
+#endif<br>
<br>
vscreen->set_next_timestamp(vscreen,
earliest_presentation_time);<br>
pipe->screen->flush_frontbuffer(pipe->screen, tex, 0,
0,<br>
@@ -294,8 +313,10 @@
vlVdpPresentationQueueDisplay(VdpPresentationQueue
presentation_queue,<br>
framenum++;<br>
}<br>
<br>
+#if 0<br>
pipe_resource_reference(&tex, NULL);<br>
pipe_surface_reference(&surf_draw, NULL);<br>
+#endif<br>
pipe_mutex_unlock(pq->device->mutex);<br>
<br>
return VDP_STATUS_OK;<br>
<br>
<br>
Regards,<br>
Leo<br>
<br>
<br>
<blockquote cite="mid:8f5965bb-ea89-dadc-f345-ff1904d6512a@amd.com"
type="cite"> <br>
But no worries, it's only a minor issue and a good task for Nayan to
get deeper into the graphics stack.<br>
<br>
Regards,<br>
Christian.<br>
<br>
<blockquote cite="mid:57C98845.1050102@amd.com" type="cite"> <br>
Regards, <br>
Leo <br>
<br>
<blockquote type="cite"> <br>
Regards, <br>
Christian. <br>
<br>
<blockquote type="cite"> <br>
Regards, <br>
Leo <br>
<br>
<blockquote type="cite">If that is the case, then the handle of
that surface should be sent directly to X. <br>
If that isn't the case, we reallocate the backing buffer,
copy the content of the output surface into it and then
send the new handle to X. <br>
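<br>
The decision described above could be sketched roughly like this. All
types and function names are hypothetical, and the copy step is only a
stand-in; in Mesa the real check and the blit would go through the
pipe_screen and pipe_context. <br>
<pre>
```c
/* Hypothetical description of a surface, not a Mesa type. */
struct sketch_surface {
   int format_displayable;   /* nonzero if X can show this format as-is */
   int handle;               /* buffer handle, e.g. a dma-buf fd */
};

/* Stand-in for "reallocate the backing buffer and copy into it".  It only
 * returns the handle of the backing buffer; a real version would also
 * copy the content of the output surface into the backing buffer. */
static int
sketch_copy_to_backing(const struct sketch_surface *output,
                       const struct sketch_surface *backing)
{
   (void)output;
   return backing->handle;
}

/* Returns the handle that should be sent to X for this frame. */
static int
sketch_handle_for_present(const struct sketch_surface *output,
                          const struct sketch_surface *backing)
{
   if (output->format_displayable)
      return output->handle;   /* direct path: no extra copy */

   return sketch_copy_to_backing(output, backing);
}
```
</pre>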
<br>
Regards, <br>
Christian. <br>
</blockquote>
<br>
_______________________________________________ <br>
mesa-dev mailing list <br>
<a moz-do-not-send="true" class="moz-txt-link-abbreviated"
href="mailto:mesa-dev@lists.freedesktop.org">mesa-dev@lists.freedesktop.org</a>
<br>
<a moz-do-not-send="true" class="moz-txt-link-freetext"
href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev">https://lists.freedesktop.org/mailman/listinfo/mesa-dev</a>
<br>
</blockquote>
<br>
<br>
</blockquote>
<br>
</blockquote>
<p><br>
</p>
</blockquote>
<br>
</body>
</html>