<div dir="ltr">Hi Leo,<div><br></div><div>I thought so. As Michel suggested, the Present extension needs </div><div>a linear buffer, and he and Christian agreed that we should have</div><div>a separate linear buffer for this. But I still don't understand the code</div><div>in vl_winsys_dri3.c, so I am not sure how this could be implemented.</div><div><br></div><div>Regards,</div><div>Nayan.</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Sep 6, 2016 at 7:08 PM, Leo Liu <span dir="ltr"><<a href="mailto:leo.liu@amd.com" target="_blank">leo.liu@amd.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
Hi Nayan,<br>
<br>
This quick hack was only meant to prove Christian's idea and to serve
as a reference for you.<br>
I don't have a multi-GPU system, and I only tested it briefly on a
single GPU, <br>
so it may behave differently on your multi-GPU system.<br>
We have to dig into it further.<br>
<br>
Regards,<br>
Leo<div><div class="h5"><br>
<br>
<br>
<div>On 09/05/2016 03:51 AM, Nayan Deshmukh
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Hi Leo,
<div><br>
</div>
<div>I tested your patch with mplayer, and it fails when I
try to </div>
<div>increase the size of the window. It gives the following
errors:</div>
<div>
<div><br>
</div>
<div>X11 error: BadAlloc (insufficient resources for
operation)</div>
<div>X11 error: BadDrawable (invalid Pixmap or Window
parameter)</div>
<div>X11 error: BadPixmap (invalid Pixmap parameter)</div>
</div>
<div><br>
</div>
<div>Also, when I made the back buffer linear instead of
providing the handle myself, </div>
<div>it worked fine on my system. </div>
<div><br>
</div>
<div>Regards,</div>
<div>Nayan</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Fri, Sep 2, 2016 at 8:51 PM, Leo Liu
<span dir="ltr"><<a href="mailto:leo.liu@amd.com" target="_blank">leo.liu@amd.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<div>
<div> <br>
<br>
<div>On 09/02/2016 10:48 AM, Christian König wrote:<br>
</div>
<blockquote type="cite">
<div>Am 02.09.2016 um 16:10 schrieb Leo Liu:<br>
</div>
<blockquote type="cite"> <br>
<br>
On 09/02/2016 09:50 AM, Christian König wrote: <br>
<blockquote type="cite">Am 02.09.2016 um 15:27
schrieb Leo Liu: <br>
<blockquote type="cite"> <br>
<br>
On 09/02/2016 02:11 AM, Christian König wrote:
<br>
<blockquote type="cite">Am 02.09.2016 um 04:03
schrieb Michel Dänzer: <br>
<blockquote type="cite">On 02/09/16 10:17
AM, Michel Dänzer wrote: <br>
<blockquote type="cite">On 02/09/16 12:58
AM, Leo Liu wrote: <br>
<blockquote type="cite">On 09/01/2016
11:54 AM, Nayan Deshmukh wrote: <br>
<blockquote type="cite">I saw the code
in dri3_glx.c and I could somewhat
relate some basic <br>
code structure to the
vl_winsys_dri3.c. But I am new to
this and not aware of the <br>
terminology that you used about the
buffers. Could you please explain
what needs <br>
to be done in more detail or point
me to where I can read about it. <br>
</blockquote>
I believe it's in
loader_dri3_helper.c, in the paths where the
"is_different_gpu" <br>
condition is true; that covers both the back
buffer and the front buffer cases. <br>
You could try only the back buffer case
for now. <br>
</blockquote>
From a high level, PRIME mainly affects
presentation, not so much the <br>
video decoding / rendering. The
important thing is that the buffer used
<br>
for presentation via the Present
extension is linear, not tiled. I'm not
<br>
sure whether it makes more sense to
allocate a separate linear buffer <br>
for this purpose, as is done for GLX, or
for the vl code to make the <br>
corresponding back (or front?) buffer
linear in the first place. <br>
</blockquote>
A separate linear buffer is probably
better, actually, since it will <br>
also be pinned to system memory while it's
being shared with another GPU. <br>
</blockquote>
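A minimal sketch of why a linear buffer is the safe choice for sharing: its pixels can be located from the pitch alone, whereas a tiled layout is specific to the GPU that wrote it. The struct, the helper names and the 256-byte pitch alignment used below are hypothetical and are not taken from the vl or DRI3 code.<br>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a linear (non-tiled) buffer. */
struct linear_buffer {
   uint32_t width;   /* in pixels */
   uint32_t height;  /* in rows */
   uint32_t cpp;     /* bytes per pixel */
   uint32_t pitch;   /* bytes per row, at least width * cpp */
};

/* Round value up to a multiple of alignment (must be a power of two). */
static uint32_t
align_up(uint32_t value, uint32_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Fill in the description; the pitch alignment is an assumed requirement. */
static struct linear_buffer
linear_buffer_describe(uint32_t width, uint32_t height, uint32_t cpp,
                       uint32_t pitch_alignment)
{
   struct linear_buffer buf;

   buf.width = width;
   buf.height = height;
   buf.cpp = cpp;
   buf.pitch = align_up(width * cpp, pitch_alignment);
   return buf;
}

/* Byte offset of pixel (x, y): rows are simply pitch bytes apart. */
static size_t
linear_offset(const struct linear_buffer *buf, uint32_t x, uint32_t y)
{
   assert(x < buf->width && y < buf->height);
   return (size_t)y * buf->pitch + (size_t)x * buf->cpp;
}

/* Total number of bytes the buffer occupies. */
static size_t
linear_size(const struct linear_buffer *buf)
{
   return (size_t)buf->height * buf->pitch;
}
```

Any consumer that knows only the width, height, bytes per pixel and pitch can read such a buffer, which is what makes it usable by a second GPU or by the X server.<br>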
<br>
Yes, I agree. Nayan should also work on
avoiding the extra copy, which currently
occurs because we can't allocate output
buffers directly in the format needed for
presentation. <br>
<br>
The general idea should be to check
during presentation whether the format of the
output surface is directly displayable. <br>
</blockquote>
<br>
Also, we have to consider the case where the
drawable is resized. <br>
</blockquote>
<br>
Actually, we don't. Take a look at the VDPAU spec:
the output surface should be sent for display
regardless of its size. <br>
<br>
E.g. when the window is 256x256 pixels, but the
application allocated an output surface of
1024x768 we should still send the whole surface
to the X server. <br>
<br>
It's the application's job to resize the
output surfaces, not the VDPAU state
tracker's. <br>
</blockquote>
<br>
I thought this was done by the vl compositor during
presentation, scaling up or down from the output
surface to the back buffer based on the resize. <br>
</blockquote>
<br>
No, that is incorrect. Take a look at the VDPAU
spec:<br>
<br>
<blockquote type="cite">
<p>Applications may choose to allow resizing of
the presentation queue target (which may be e.g.
a regular Window when using an X11-based
implementation).</p>
<p><b>clip_width</b> and <b>clip_height</b> may
be used to limit the size of the displayed
region of a surface, in order to match the
specific region that was rendered to.</p>
<p>In turn, this allows the application to
allocate over-sized (e.g. screen-sized)
surfaces, but render to a region that matches
the current size of the video window.</p>
<p>Using this technique, an application's response
to window resizing may simply be to render to,
and display, a different region of the surface,
rather than de-/re-allocation of surfaces to
match the updated window size.</p>
</blockquote>
<br>
This means that we should send the output surface
to X at its original size, regardless of that size or
the size of the window it is displayed in.<br>
<br>
That wasn't possible with DRI2, which is why we have
the workaround with delayed rendering in the
mixer.<br>
</blockquote>
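A minimal sketch of the clipping rule quoted from the spec above, showing that the displayed region depends only on the surface size and the clip values and never on the window size. Treating a clip value of zero as "use the whole surface dimension" and clamping oversized clip values are assumptions made for this illustration; the struct and function names are hypothetical.<br>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical width/height pair. */
struct region {
   uint32_t width;
   uint32_t height;
};

/*
 * Region of the output surface that gets displayed.
 * Assumption: a clip value of 0 means "no limit in that dimension".
 * Assumption: a clip value larger than the surface is clamped to it.
 * The window size is deliberately not an input.
 */
static struct region
displayed_region(struct region surface, uint32_t clip_width,
                 uint32_t clip_height)
{
   struct region r = surface;

   if (clip_width != 0 && clip_width < surface.width)
      r.width = clip_width;
   if (clip_height != 0 && clip_height < surface.height)
      r.height = clip_height;

   return r;
}
```

With a 1024x768 output surface and a 256x256 clip, only the 256x256 region is shown; with no clip, the whole 1024x768 surface goes to X even if the window is smaller.<br>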
<br>
</div>
</div>
I did a quick hack on a single GPU and tested it; this shows
that the whole idea works, including resizing.<br>
A linear surface is still displayable, but playback looks somewhat
sluggish.<br>
<br>
Here is the hack for reference. It removes the back
buffer creation and the presentation rendering, and uses the output
surface handle for X:<br>
<br>
diff --git a/src/gallium/auxiliary/vl/vl_<wbr>winsys.h
b/src/gallium/auxiliary/vl/vl_<wbr>winsys.h<br>
index 26db9f2..908ec3a 100644<br>
--- a/src/gallium/auxiliary/vl/vl_<wbr>winsys.h<br>
+++ b/src/gallium/auxiliary/vl/vl_<wbr>winsys.h<br>
@@ -59,6 +59,9 @@ struct vl_screen<br>
void *<br>
(*get_private)(struct vl_screen *vscreen);<br>
<br>
+ void *<br>
+ (*set_output_handle)(struct vl_screen *vscreen, struct
winsys_handle whandle);<br>
+<br>
struct pipe_screen *pscreen;<br>
struct pipe_loader_device *dev;<br>
};<br>
diff --git a/src/gallium/auxiliary/vl/vl_<wbr>winsys_dri3.c
b/src/gallium/auxiliary/vl/vl_<wbr>winsys_dri3.c<br>
index 3d596a6..4908699 100644<br>
--- a/src/gallium/auxiliary/vl/vl_<wbr>winsys_dri3.c<br>
+++ b/src/gallium/auxiliary/vl/vl_<wbr>winsys_dri3.c<br>
@@ -64,6 +64,9 @@ struct vl_dri3_screen<br>
xcb_connection_t *conn;<br>
xcb_drawable_t drawable;<br>
<br>
+ unsigned output_handle;<br>
+ unsigned output_stride;<br>
+<br>
uint32_t width, height, depth;<br>
<br>
xcb_present_event_t eid;<br>
@@ -225,6 +228,7 @@ dri3_alloc_back_buffer(struct
vl_dri3_screen *scrn)<br>
if (!shm_fence)<br>
goto close_fd;<br>
<br>
+#if 0<br>
memset(&templ, 0, sizeof(templ));<br>
templ.bind = PIPE_BIND_RENDER_TARGET |
PIPE_BIND_SAMPLER_VIEW |<br>
PIPE_BIND_SCANOUT | PIPE_BIND_SHARED;<br>
@@ -248,6 +252,11 @@ dri3_alloc_back_buffer(struct
vl_dri3_screen *scrn)<br>
<wbr> usage);<br>
buffer_fd = whandle.handle;<br>
buffer->pitch = whandle.stride;<br>
+#endif<br>
+<br>
+ buffer_fd = scrn->output_handle;<br>
+ buffer->pitch = scrn->output_stride;<br>
+<br>
xcb_dri3_pixmap_from_buffer(sc<wbr>rn->conn,<br>
<wbr> (pixmap =
xcb_generate_id(scrn->conn)),<br>
<wbr> scrn->drawable,<br>
@@ -591,6 +600,15 @@ vl_dri3_screen_get_private(str<wbr>uct
vl_screen *vscreen)<br>
}<br>
<br>
static void<br>
+vl_dri3_set_output_handle(str<wbr>uct vl_screen *vscreen,
struct winsys_handle whandle)<br>
+{<br>
+ struct vl_dri3_screen *scrn = (struct vl_dri3_screen
*)vscreen;<br>
+<br>
+ scrn->output_handle = whandle.handle;<br>
+ scrn->output_stride = whandle.stride;<br>
+}<br>
+<br>
+static void<br>
vl_dri3_screen_destroy(struct vl_screen *vscreen)<br>
{<br>
struct vl_dri3_screen *scrn = (struct vl_dri3_screen
*)vscreen;<br>
@@ -706,6 +724,7 @@ vl_dri3_screen_create(Display
*display, int screen)<br>
scrn->base.set_next_timestamp =
vl_dri3_screen_set_next_timest<wbr>amp;<br>
scrn->base.get_private =
vl_dri3_screen_get_private;<br>
scrn->base.pscreen->flush_fron<wbr>tbuffer =
vl_dri3_flush_frontbuffer;<br>
+ scrn->base.set_output_handle =
vl_dri3_set_output_handle;<br>
<br>
return &scrn->base;<br>
<br>
diff --git a/src/gallium/state_trackers/v<wbr>dpau/presentation.c
b/src/gallium/state_trackers/v<wbr>dpau/presentation.c<br>
index 2862eaf..d5e832e 100644<br>
--- a/src/gallium/state_trackers/v<wbr>dpau/presentation.c<br>
+++ b/src/gallium/state_trackers/v<wbr>dpau/presentation.c<br>
@@ -30,6 +30,11 @@<br>
<br>
#include "util/u_debug.h"<br>
#include "util/u_memory.h"<br>
+#include "util/u_sampler.h"<br>
+#include "util/u_format.h"<br>
+#include "util/u_surface.h"<br>
+<br>
+#include "state_tracker/drm_driver.h"<br>
<br>
#include "vdpau_private.h"<br>
<br>
@@ -216,6 +221,7 @@ vlVdpPresentationQueueDisplay(<wbr>VdpPresentationQueue
presentation_queue,<br>
struct vl_compositor *compositor;<br>
struct vl_compositor_state *cstate;<br>
struct vl_screen *vscreen;<br>
+ struct winsys_handle whandle;<br>
<br>
pq = vlGetDataHTAB(presentation_que<wbr>ue);<br>
if (!pq)<br>
@@ -231,14 +237,26 @@ vlVdpPresentationQueueDisplay(<wbr>VdpPresentationQueue
presentation_queue,<br>
vscreen = pq->device->vscreen;<br>
<br>
pipe_mutex_lock(pq->device->mu<wbr>tex);<br>
+<br>
+ memset(&whandle, 0, sizeof(struct winsys_handle));<br>
+ whandle.type = DRM_API_HANDLE_TYPE_FD;<br>
+<br>
+ if (!vscreen->pscreen->resource_g<wbr>et_handle(vscreen->pscreen,
surf->device->context,<br>
+ <wbr>
surf->surface->texture, &whandle,<br>
+ PIPE_HANDLE_USAGE_READ_WRITE))<br>
+ return VDP_STATUS_NO_IMPLEMENTATION;<br>
+<br>
+ vscreen->set_output_handle(vsc<wbr>reen, whandle);<br>
+<br>
tex = vscreen->texture_from_drawable<wbr>(vscreen,
(void *)pq->drawable);<br>
- if (!tex) {<br>
- pipe_mutex_unlock(pq->device-><wbr>mutex);<br>
- return VDP_STATUS_INVALID_HANDLE;<br>
- }<br>
+// if (!tex) {<br>
+// pipe_mutex_unlock(pq->device-><wbr>mutex);<br>
+// return VDP_STATUS_INVALID_HANDLE;<br>
+// }<br>
<br>
dirty_area = vscreen->get_dirty_area(vscree<wbr>n);<br>
<br>
+#if 0<br>
memset(&surf_templ, 0, sizeof(surf_templ));<br>
surf_templ.format = tex->format;<br>
surf_draw = pipe->create_surface(pipe, tex,
&surf_templ);<br>
@@ -269,6 +287,7 @@ vlVdpPresentationQueueDisplay(<wbr>VdpPresentationQueue
presentation_queue,<br>
vl_compositor_set_dst_clip(cst<wbr>ate,
&dst_clip);<br>
vl_compositor_render(cstate, compositor, surf_draw,
dirty_area, true);<br>
}<br>
+#endif<br>
<br>
vscreen->set_next_timestamp(vs<wbr>creen,
earliest_presentation_time);<br>
pipe->screen->flush_frontbuffe<wbr>r(pipe->screen,
tex, 0, 0,<br>
@@ -294,8 +313,10 @@ vlVdpPresentationQueueDisplay(<wbr>VdpPresentationQueue
presentation_queue,<br>
framenum++;<br>
}<br>
<br>
+#if 0<br>
pipe_resource_reference(&tex, NULL);<br>
pipe_surface_reference(&surf_d<wbr>raw, NULL);<br>
+#endif<br>
pipe_mutex_unlock(pq->device-><wbr>mutex);<br>
<br>
return VDP_STATUS_OK;<br>
<br>
<br>
Regards,<br>
Leo<span><br>
<br>
<br>
<blockquote type="cite"> <br>
But no worries, it's only a minor issue and a good task
for Nayan to get deeper into the graphics stack.<br>
<br>
Regards,<br>
Christian.<br>
<br>
<blockquote type="cite"> <br>
Regards, <br>
Leo <br>
<br>
<blockquote type="cite"> <br>
Regards, <br>
Christian. <br>
<br>
<blockquote type="cite"> <br>
Regards, <br>
Leo <br>
<br>
<blockquote type="cite">If that is the case, then the
handle of that surface should be sent directly
to X. <br>
If that isn't the case, we reallocate the
backing buffer, copy the content of the output
surface into it, and then send the new handle
to X. <br>
<br>
Regards, <br>
Christian. <br>
</blockquote>
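A minimal sketch of the decision described above: present the output surface directly when it is displayable, otherwise copy it into a backing buffer that is reallocated if its size no longer matches. All types and names here are hypothetical. Requiring a linear layout when a different GPU does the presentation is an assumption that folds in Michel's earlier point about PRIME; it is not part of Christian's description.<br>

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical memory layouts. */
enum layout { LAYOUT_LINEAR, LAYOUT_TILED };

/* Hypothetical summary of a surface or backing buffer. */
struct surface_desc {
   uint32_t width;
   uint32_t height;
   enum layout layout;
   bool displayable_format;   /* format can be shown as-is */
};

/* What the presentation path has to do for this frame. */
enum present_action {
   PRESENT_DIRECT,            /* send the output surface handle to X */
   PRESENT_COPY,              /* copy into the existing backing buffer */
   PRESENT_REALLOC_AND_COPY   /* reallocate the backing buffer, then copy */
};

/* Assumption: sharing with another GPU additionally requires linear. */
static bool
directly_displayable(const struct surface_desc *out, bool different_gpu)
{
   if (!out->displayable_format)
      return false;
   if (different_gpu && out->layout != LAYOUT_LINEAR)
      return false;
   return true;
}

static enum present_action
choose_present_action(const struct surface_desc *out,
                      const struct surface_desc *back,
                      bool different_gpu)
{
   if (directly_displayable(out, different_gpu))
      return PRESENT_DIRECT;

   /* The backing buffer follows the output surface size, not the window. */
   if (back->width != out->width || back->height != out->height)
      return PRESENT_REALLOC_AND_COPY;

   return PRESENT_COPY;
}
```

The extra copy is then paid only on the fallback paths, and the backing buffer is reallocated only when the application switches to an output surface of a different size.<br>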
<br>
______________________________<wbr>_________________
<br>
mesa-dev mailing list <br>
<a href="mailto:mesa-dev@lists.freedesktop.org" target="_blank">mesa-dev@lists.freedesktop.org</a>
<br>
<a href="https://lists.freedesktop.org/mailman/listinfo/mesa-dev" target="_blank">https://lists.freedesktop.org/<wbr>mailman/listinfo/mesa-dev</a>
<br>
</blockquote>
<br>
<br>
</blockquote>
<br>
</blockquote>
<p><br>
</p>
</blockquote>
<br>
</span></div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div></div></div>
</blockquote></div><br></div>