[Mesa-dev] [PATCH 2/2] st/vdpau: use lanczos filter for scaling v4
Leo Liu
leo.liu at amd.com
Tue Sep 6 13:38:02 UTC 2016
Hi Nayan,
This quick hack was only meant to prove Christian's idea and to serve as
a reference for you.
I don't have a multi-GPU system and only ran a very brief test on a single
GPU, so the behavior may differ on your multi-GPU system.
We will have to dig into it further.
Regards,
Leo
On 09/05/2016 03:51 AM, Nayan Deshmukh wrote:
> Hi Leo,
>
> I have tested your patch with mplayer, and it fails when I try to
> increase the size of the window, with the following errors:
>
> X11 error: BadAlloc (insufficient resources for operation)
> X11 error: BadDrawable (invalid Pixmap or Window parameter)
> X11 error: BadPixmap (invalid Pixmap parameter)
>
> Also, when I made the back buffer linear instead of providing the
> handle myself, it worked fine on my system.
>
> Regards,
> Nayan
>
> On Fri, Sep 2, 2016 at 8:51 PM, Leo Liu <leo.liu at amd.com> wrote:
>
>
>
> On 09/02/2016 10:48 AM, Christian König wrote:
>> Am 02.09.2016 um 16:10 schrieb Leo Liu:
>>>
>>>
>>> On 09/02/2016 09:50 AM, Christian König wrote:
>>>> Am 02.09.2016 um 15:27 schrieb Leo Liu:
>>>>>
>>>>>
>>>>> On 09/02/2016 02:11 AM, Christian König wrote:
>>>>>> Am 02.09.2016 um 04:03 schrieb Michel Dänzer:
>>>>>>> On 02/09/16 10:17 AM, Michel Dänzer wrote:
>>>>>>>> On 02/09/16 12:58 AM, Leo Liu wrote:
>>>>>>>>> On 09/01/2016 11:54 AM, Nayan Deshmukh wrote:
>>>>>>>>>> I saw the code in dri3_glx.c and I could somewhat relate
>>>>>>>>>> some basic
>>>>>>>>>> code structure to the vl_winsys_dri3.c. But I am new to
>>>>>>>>>> this and not aware of the
>>>>>>>>>> terminology that you used about the buffers. Could you
>>>>>>>>>> please explain what needs
>>>>>>>>>> to be done in more detail or point me to where I can read
>>>>>>>>>> about it.
>>>>>>>>> I believe it's from loader_dri3_helper.c, in the paths where the
>>>>>>>>> "is_different_gpu" condition is true; that covers both the back
>>>>>>>>> buffer and the front buffer case.
>>>>>>>>> You could try only the back buffer case for now.
>>>>>>>> From a high level, PRIME mainly affects presentation, not
>>>>>>>> so much the
>>>>>>>> video decoding / rendering. The important thing is that the
>>>>>>>> buffer used
>>>>>>>> for presentation via the Present extension is linear, not
>>>>>>>> tiled. I'm not
>>>>>>>> sure whether it makes more sense to allocate a separate
>>>>>>>> linear buffer
>>>>>>>> for this purpose, as is done for GLX, or for the vl code to
>>>>>>>> make the
>>>>>>>> corresponding back (or front?) buffer linear in the first
>>>>>>>> place.
>>>>>>> A separate linear buffer is probably better, actually, since
>>>>>>> it will
>>>>>>> also be pinned to system memory while it's being shared with
>>>>>>> another GPU.
>>>>>>
>>>>>> Yes, I agree. Nayan should also work on avoiding the extra
>>>>>> copy which currently occurs because we can't allocate output
>>>>>> buffers directly in the format needed for presentation.
>>>>>>
>>>>>> The general idea should be to check during presentation whether
>>>>>> the format of the output surface is displayable directly.
>>>>>
>>>>> We also have to consider the case where the drawable is resized.
>>>>
>>>> Actually we don't. Take a look at the VDPAU spec: the output
>>>> surface should be sent for display without considering its
>>>> size.
>>>>
>>>> E.g. when the window is 256x256 pixels, but the application
>>>> allocated an output surface of 1024x768 we should still send
>>>> the whole surface to the X server.
>>>>
>>>> Resizing the output surfaces is the job of the application,
>>>> not of the VDPAU state tracker.
>>>
>>> I thought this was done by the vl compositor during presentation,
>>> scaling up or down from the output surface to the back buffer
>>> based on the resize.
>>
>> No, that is incorrect. Take a look at the VDPAU spec:
>>
>>> Applications may choose to allow resizing of the presentation
>>> queue target (which may be e.g. a regular Window when using an
>>> X11-based implementation).
>>>
>>> *clip_width* and *clip_height* may be used to limit the size of
>>> the displayed region of a surface, in order to match the
>>> specific region that was rendered to.
>>>
>>> In turn, this allows the application to allocate over-sized
>>> (e.g. screen-sized) surfaces, but render to a region that
>>> matches the current size of the video window.
>>>
>>> Using this technique, an application's response to window
>>> resizing may simply be to render to, and display, a different
>>> region of the surface, rather than de-/re-allocation of surfaces
>>> to match the updated window size.
>>>
>>
>> This means that we should send the output surface to X at its
>> original size, regardless of that size or the size of the window
>> it is displayed in.
>>
>> That wasn't possible with DRI2, that's why we have that
>> workaround with the delayed rendering in the mixer.
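
For reference, the clip technique from the spec text quoted above can be
sketched in a few lines of C. This is only an illustration under stated
assumptions: clip_region and clip_for_window are made-up names, not part of
VDPAU or Mesa. The resulting values are what an application would pass as
clip_width and clip_height when it displays an over-sized output surface.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type, not from VDPAU or Mesa: the region of an output
 * surface that the application renders to and asks to have displayed. */
struct clip_region {
   uint32_t width;
   uint32_t height;
};

static uint32_t
min_u32(uint32_t a, uint32_t b)
{
   return a < b ? a : b;
}

/* Hypothetical helper: pick the clip region for the current window size.
 * The surface is allocated once (e.g. screen-sized); on a resize only the
 * clip changes, the surface itself is not reallocated. The clip can never
 * be larger than the surface. */
static struct clip_region
clip_for_window(uint32_t surf_w, uint32_t surf_h,
                uint32_t win_w, uint32_t win_h)
{
   struct clip_region clip;

   assert(surf_w > 0 && surf_h > 0);

   clip.width = min_u32(surf_w, win_w);
   clip.height = min_u32(surf_h, win_h);
   return clip;
}
```

With a 1024x768 surface and a 256x256 window this gives a 256x256 clip; if
the window grows beyond the surface, the clip stops at 1024x768 and the
application would have to allocate a larger surface itself.
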
>
> I did a quick hack on a single GPU and tested it; this proves that
> the whole idea works, including resizing.
> A linear buffer is still displayable, it just looks somewhat sluggish
> during playback.
>
> Here is the hack for reference. It removes the back buffer creation
> and the presentation rendering, and uses the output surface handle
> for X.
>
> diff --git a/src/gallium/auxiliary/vl/vl_winsys.h
> b/src/gallium/auxiliary/vl/vl_winsys.h
> index 26db9f2..908ec3a 100644
> --- a/src/gallium/auxiliary/vl/vl_winsys.h
> +++ b/src/gallium/auxiliary/vl/vl_winsys.h
> @@ -59,6 +59,9 @@ struct vl_screen
> void *
> (*get_private)(struct vl_screen *vscreen);
>
> + void
> + (*set_output_handle)(struct vl_screen *vscreen, struct
> winsys_handle whandle);
> +
> struct pipe_screen *pscreen;
> struct pipe_loader_device *dev;
> };
> diff --git a/src/gallium/auxiliary/vl/vl_winsys_dri3.c
> b/src/gallium/auxiliary/vl/vl_winsys_dri3.c
> index 3d596a6..4908699 100644
> --- a/src/gallium/auxiliary/vl/vl_winsys_dri3.c
> +++ b/src/gallium/auxiliary/vl/vl_winsys_dri3.c
> @@ -64,6 +64,9 @@ struct vl_dri3_screen
> xcb_connection_t *conn;
> xcb_drawable_t drawable;
>
> + unsigned output_handle;
> + unsigned output_stride;
> +
> uint32_t width, height, depth;
>
> xcb_present_event_t eid;
> @@ -225,6 +228,7 @@ dri3_alloc_back_buffer(struct vl_dri3_screen
> *scrn)
> if (!shm_fence)
> goto close_fd;
>
> +#if 0
> memset(&templ, 0, sizeof(templ));
> templ.bind = PIPE_BIND_RENDER_TARGET | PIPE_BIND_SAMPLER_VIEW |
> PIPE_BIND_SCANOUT | PIPE_BIND_SHARED;
> @@ -248,6 +252,11 @@ dri3_alloc_back_buffer(struct vl_dri3_screen
> *scrn)
> usage);
> buffer_fd = whandle.handle;
> buffer->pitch = whandle.stride;
> +#endif
> +
> + buffer_fd = scrn->output_handle;
> + buffer->pitch = scrn->output_stride;
> +
> xcb_dri3_pixmap_from_buffer(scrn->conn,
> (pixmap = xcb_generate_id(scrn->conn)),
> scrn->drawable,
> @@ -591,6 +600,15 @@ vl_dri3_screen_get_private(struct vl_screen
> *vscreen)
> }
>
> static void
> +vl_dri3_set_output_handle(struct vl_screen *vscreen, struct
> winsys_handle whandle)
> +{
> + struct vl_dri3_screen *scrn = (struct vl_dri3_screen *)vscreen;
> +
> + scrn->output_handle = whandle.handle;
> + scrn->output_stride = whandle.stride;
> +}
> +
> +static void
> vl_dri3_screen_destroy(struct vl_screen *vscreen)
> {
> struct vl_dri3_screen *scrn = (struct vl_dri3_screen *)vscreen;
> @@ -706,6 +724,7 @@ vl_dri3_screen_create(Display *display, int
> screen)
> scrn->base.set_next_timestamp = vl_dri3_screen_set_next_timestamp;
> scrn->base.get_private = vl_dri3_screen_get_private;
> scrn->base.pscreen->flush_frontbuffer = vl_dri3_flush_frontbuffer;
> + scrn->base.set_output_handle = vl_dri3_set_output_handle;
>
> return &scrn->base;
>
> diff --git a/src/gallium/state_trackers/vdpau/presentation.c
> b/src/gallium/state_trackers/vdpau/presentation.c
> index 2862eaf..d5e832e 100644
> --- a/src/gallium/state_trackers/vdpau/presentation.c
> +++ b/src/gallium/state_trackers/vdpau/presentation.c
> @@ -30,6 +30,11 @@
>
> #include "util/u_debug.h"
> #include "util/u_memory.h"
> +#include "util/u_sampler.h"
> +#include "util/u_format.h"
> +#include "util/u_surface.h"
> +
> +#include "state_tracker/drm_driver.h"
>
> #include "vdpau_private.h"
>
> @@ -216,6 +221,7 @@
> vlVdpPresentationQueueDisplay(VdpPresentationQueue presentation_queue,
> struct vl_compositor *compositor;
> struct vl_compositor_state *cstate;
> struct vl_screen *vscreen;
> + struct winsys_handle whandle;
>
> pq = vlGetDataHTAB(presentation_queue);
> if (!pq)
> @@ -231,14 +237,26 @@
> vlVdpPresentationQueueDisplay(VdpPresentationQueue presentation_queue,
> vscreen = pq->device->vscreen;
>
> pipe_mutex_lock(pq->device->mutex);
> +
> + memset(&whandle, 0, sizeof(struct winsys_handle));
> + whandle.type = DRM_API_HANDLE_TYPE_FD;
> +
> + if (!vscreen->pscreen->resource_get_handle(vscreen->pscreen,
> surf->device->context,
> + surf->surface->texture, &whandle,
> + PIPE_HANDLE_USAGE_READ_WRITE))
> + return VDP_STATUS_NO_IMPLEMENTATION;
> +
> + vscreen->set_output_handle(vscreen, whandle);
> +
> tex = vscreen->texture_from_drawable(vscreen, (void
> *)pq->drawable);
> - if (!tex) {
> - pipe_mutex_unlock(pq->device->mutex);
> - return VDP_STATUS_INVALID_HANDLE;
> - }
> +// if (!tex) {
> +// pipe_mutex_unlock(pq->device->mutex);
> +// return VDP_STATUS_INVALID_HANDLE;
> +// }
>
> dirty_area = vscreen->get_dirty_area(vscreen);
>
> +#if 0
> memset(&surf_templ, 0, sizeof(surf_templ));
> surf_templ.format = tex->format;
> surf_draw = pipe->create_surface(pipe, tex, &surf_templ);
> @@ -269,6 +287,7 @@
> vlVdpPresentationQueueDisplay(VdpPresentationQueue presentation_queue,
> vl_compositor_set_dst_clip(cstate, &dst_clip);
> vl_compositor_render(cstate, compositor, surf_draw,
> dirty_area, true);
> }
> +#endif
>
> vscreen->set_next_timestamp(vscreen, earliest_presentation_time);
> pipe->screen->flush_frontbuffer(pipe->screen, tex, 0, 0,
> @@ -294,8 +313,10 @@
> vlVdpPresentationQueueDisplay(VdpPresentationQueue presentation_queue,
> framenum++;
> }
>
> +#if 0
> pipe_resource_reference(&tex, NULL);
> pipe_surface_reference(&surf_draw, NULL);
> +#endif
> pipe_mutex_unlock(pq->device->mutex);
>
> return VDP_STATUS_OK;
>
>
> Regards,
> Leo
>
>
>>
>> But no worries, it's only a minor issue and a good task for Nayan to
>> get deeper into the graphics stack.
>>
>> Regards,
>> Christian.
>>
>>>
>>> Regards,
>>> Leo
>>>
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>>
>>>>> Regards,
>>>>> Leo
>>>>>
>>>>>> If that is the case, then the handle of that surface should be
>>>>>> sent directly to X.
>>>>>> If that isn't the case, we reallocate the backing buffer, copy
>>>>>> the content of the output surface into it and then send the
>>>>>> new handle to X.
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>
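
The presentation decision described in the quoted discussion (send the
output surface handle to X directly when it is displayable, otherwise copy
into a separate buffer first) could look roughly like the sketch below. All
types and function names here are hypothetical and only model the decision;
none of them are gallium or vl_winsys APIs. The linear requirement for the
different-GPU case follows Michel's remark about PRIME.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names throughout; this models the decision only. */
enum present_path {
   PRESENT_DIRECT, /* hand the output surface's own handle to X */
   PRESENT_COPY    /* copy into a separately allocated buffer first */
};

struct surface_info {
   uint32_t format; /* opaque pixel format id */
   bool linear;     /* true for a linear layout, false for tiled */
};

struct present_target {
   uint32_t format;    /* format the drawable can display */
   bool different_gpu; /* PRIME: the buffer is shared with another GPU */
};

/* The surface can be shown as-is if its format matches the drawable and,
 * when it is shared with another GPU, its layout is linear. */
static bool
surface_is_displayable(const struct surface_info *surf,
                       const struct present_target *target)
{
   assert(surf && target);

   if (surf->format != target->format)
      return false;
   if (target->different_gpu && !surf->linear)
      return false;
   return true;
}

static enum present_path
choose_present_path(const struct surface_info *surf,
                    const struct present_target *target)
{
   return surface_is_displayable(surf, target) ? PRESENT_DIRECT
                                               : PRESENT_COPY;
}
```

In the copy case the caller would allocate (or reuse) a linear buffer in
the drawable's format, blit the output surface into it and send that
buffer's handle to X instead.
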