[Libva] Quality of the scaled H264 decoded image
Ratin
ratin3 at gmail.com
Thu Mar 28 12:43:50 PDT 2013
Is there a way to scale without displaying the video on the monitor? I
need to show a high-res version of the video on the display with
vaPutSurface and produce a lower-res version for further processing.
vaPutSurface can display the video at a lower resolution, but the surface
resolution itself is unmodified. Unfortunately, the GPU I am using does not
have the post-processing capability; otherwise I could copy the surface to
another post-processing surface, apply the filter, and then derive the image.
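
For reference, one possible fallback when no post-processing path is available is to downscale on the CPU after mapping the decoded image (for example via vaDeriveImage and vaMapBuffer). This is not something proposed in this thread, only a minimal sketch under assumptions: the helper name downscale_plane_2x is hypothetical, the input is a single 8-bit plane such as the luma plane of an NV12 image, and a plain 2x2 box filter is used rather than anything the driver implements.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: downscale one 8-bit plane by 2x in each direction
 * using a 2x2 box filter (average of four source pixels).
 * src_pitch and dst_pitch are row strides in bytes and may exceed the width.
 * A trailing odd row or column of the source is ignored.
 * Returns 0 on success, -1 on invalid arguments.
 */
int downscale_plane_2x(const uint8_t *src, int src_w, int src_h,
                       size_t src_pitch, uint8_t *dst, size_t dst_pitch)
{
    int dst_w = src_w / 2;
    int dst_h = src_h / 2;

    if (!src || !dst || dst_w <= 0 || dst_h <= 0 ||
        src_pitch < (size_t)src_w || dst_pitch < (size_t)dst_w)
        return -1;

    for (int y = 0; y < dst_h; y++) {
        const uint8_t *row0 = src + (size_t)(2 * y) * src_pitch;
        const uint8_t *row1 = row0 + src_pitch;
        uint8_t *out = dst + (size_t)y * dst_pitch;

        for (int x = 0; x < dst_w; x++) {
            /* Sum the 2x2 block and divide by 4, rounding to nearest. */
            unsigned sum = row0[2 * x] + row0[2 * x + 1] +
                           row1[2 * x] + row1[2 * x + 1];
            out[x] = (uint8_t)((sum + 2) / 4);
        }
    }
    return 0;
}
```

With a mapped VAImage, src would point at the mapped buffer plus the plane offset and src_pitch would be the plane pitch; the chroma plane of NV12 interleaves U and V and would need its own handling.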
On Mon, Feb 25, 2013 at 6:35 PM, Ratin <ratin3 at gmail.com> wrote:
> Just to let you know, I tried the VPP code for de-noise. I set the
> de-noise level all the way to max, but it doesn't seem to make any
> noticeable difference in quality. I noticed the fast path for setting the
> scaling type in VAProcPipelineParameterBuffer, so counting the putSurface
> flag as well as an additional filter, there seem to be three different ways
> to do this. Anyway, with VA_FILTER_SCALING_NL_ANAMORPHIC and de-noise, the
> GPU usage increases to 31% (first measurement below). With
> VA_FILTER_SCALING_NL_ANAMORPHIC as the putSurface flag and no de-noise
> filtering, GPU usage drops by 10 points. With VA_FILTER_SCALING_FAST as the
> putSurface flag, it drops another 4 points. The stream I am decoding and
> rendering is 1280x720 H.264 at 3 Mbps.
>
>
>
> render clock: unknown    sampler clock: unknown
> render busy:    31%: ██████▎    render space:    37/131072
> bitstream busy:  4%: ▉          bitstream space: 1/131072
> blitter busy:   24%: ████▉      blitter space:   10/131072
>
> task   percent busy
> GAM:   33%: ██████▋    vert fetch:     0 (0/sec)
> TSG:   19%: ███▉       prim fetch:     0 (0/sec)
> VFE:   19%: ███▉       VS invocations: 104062 (0/sec)
> VF:    19%: ███▉       GS invocations: 0 (0/sec)
> GAFS:  14%: ██▉        GS prims:       0 (0/sec)
> TDG:    0%:            CL invocations: 44604 (0/sec)
> GAFM:   0%:            CL prims:       47270 (0/sec)
> SOL:    0%:            PS invocations: 3011525360 (0/sec)
> GS:     0%:            PS depth pass:  3010388146 (0/sec)
>
> render clock: unknown    sampler clock: unknown
> render busy:    21%: ████▎      render space:    20/131072
> bitstream busy:  4%: ▉          bitstream space: 1/131072
> blitter busy:   21%: ████▎      blitter space:   8/131072
>
> task   percent busy
> GAM:   23%: ████▋      vert fetch:     0 (0/sec)
> TSG:   10%: ██         prim fetch:     0 (0/sec)
> VFE:   10%: ██         VS invocations: 104062 (0/sec)
> VF:    10%: ██         GS invocations: 0 (0/sec)
> GAFS:   9%: █▉         GS prims:       0 (0/sec)
> TDG:    0%:            CL invocations: 44604 (0/sec)
> GAFM:   0%:            CL prims:       47270 (0/sec)
> DS:     0%:            PS invocations: 3011525360 (0/sec)
> GS:     0%:            PS depth pass:  3010388146 (0/sec)
>
>
>
>
> render clock: unknown    sampler clock: unknown
> render busy:    17%: ███▎       render space:    10/131072
> bitstream busy:  4%: ▉          bitstream space: 1/131072
> blitter busy:   17%: ███▎       blitter space:   7/131072
>
> task   percent busy
> GAM:   17%: ███▌       vert fetch:     0 (0/sec)
> GAFS:   4%: ▉          prim fetch:     0 (0/sec)
> VS:     0%:            VS invocations: 104062 (0/sec)
> VF:     0%:            GS invocations: 0 (0/sec)
>                        GS prims:       0 (0/sec)
>                        CL invocations: 44604 (0/sec)
>                        CL prims:       47270 (0/sec)
>                        PS invocations: 3011525360 (0/sec)
>                        PS depth pass:  3010388146 (0/sec)
>
>
>
>
> On Fri, Feb 22, 2013 at 7:17 AM, Ratin <ratin3 at gmail.com> wrote:
>
>>
>>
>>
>> On Thu, Feb 21, 2013 at 5:26 PM, ykzhao <yakui.zhao at intel.com> wrote:
>>
>>> On Thu, 2013-02-21 at 06:30 -0700, Ratin wrote:
>>> > Awesome, I would like to see the result from HQ scaling sometime in the
>>> > future. I am just using putSurface and don't want to go through the Proc
>>> > pipeline if I don't have to. Is the performance penalty identical
>>> > both ways? Is there a way I can measure how much GPU processing (%
>>> > and such) is being utilized?
>>>
>>> They are implemented in different ways, so it is difficult to compare the
>>> performance penalty. putSurface is based on the 3D model, while the
>>> proc pipeline is based on the GPGPU model. (intel_gpu_top may help to show
>>> the GPU utilization; it can be downloaded from
>>> http://cgit.freedesktop.org/xorg/app/intel-gpu-tools/.)
>>>
>>> Could you please check whether it meets your requirements to use the
>>> proc VPP to do the upscaling conversion and then call vaPutSurface to
>>> display it?
>>>
>>> Thanks.
>>> Yakui
>>>
>>
>> Hi Yakui, thanks for your reply. I just started looking into this. The
>> total number of filters available to me seems to be only two; I am not sure
>> if that's normal. I am using an HD4000. My lspci output shows the following:
>>
>> d02788e046eb:/usr/local/bin# lspci
>> 00:00.0 Host bridge: Intel Corporation 3rd Gen Core processor DRAM Controller (rev 09)
>> 00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09)
>> 00:14.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB xHCI Host Controller (rev 04)
>> 00:16.0 Communication controller: Intel Corporation 7 Series/C210 Series Chipset Family MEI Controller #1 (rev 04)
>> 00:1a.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #2 (rev 04)
>> 00:1c.0 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 1 (rev c4)
>> 00:1c.3 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 4 (rev c4)
>> 00:1d.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #1 (rev 04)
>> 00:1f.0 ISA bridge: Intel Corporation HM76 Express Chipset LPC Controller (rev 04)
>> 00:1f.2 IDE interface: Intel Corporation 7 Series Chipset Family 4-port SATA Controller [IDE mode] (rev 04)
>> 00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family SMBus Controller (rev 04)
>> 00:1f.5 IDE interface: Intel Corporation 7 Series Chipset Family 2-port SATA Controller [IDE mode] (rev 04)
>> 01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 07)
>> 02:00.0 Network controller: Intel Corporation Centrino Wireless-N 2200 (rev c4)
>>
>> I will write more query code to find out what those two filters are and
>> will post the results here.
>>
>> Thanks
>>
>> Ratin
>>
>>
>> On Tue, Feb 19, 2013 at 7:24 PM, Xiang, Haihao
>> > <haihao.xiang at intel.com> wrote:
>> >
>> >
>> > > I am using the Intel driver from the staging branch, on a Gen 3 HD4000.
>> > > So other algorithms like bi-cubic are not supported?
>> >
>> >
>> > You can select a scaling method other than the default via the flag
>> > to vaPutSurface() or the filter_flags field in
>> > VAProcPipelineParameterBuffer.
>> >
>> > /* Scaling flags for vaPutSurface() */
>> > #define VA_FILTER_SCALING_DEFAULT       0x00000000
>> > #define VA_FILTER_SCALING_FAST          0x00000100
>> > #define VA_FILTER_SCALING_HQ            0x00000200
>> > #define VA_FILTER_SCALING_NL_ANAMORPHIC 0x00000300
>> > #define VA_FILTER_SCALING_MASK          0x00000f00
>> >
>> > In VAProcPipelineParameterBuffer:
>> >
>> >  * - Scaling: \c VA_FILTER_SCALING_DEFAULT, \c VA_FILTER_SCALING_FAST,
>> >  *   \c VA_FILTER_SCALING_HQ, \c VA_FILTER_SCALING_NL_ANAMORPHIC.
>> >  */
>> > unsigned int filter_flags;
>> >
>> > For the Intel driver, currently only VA_FILTER_SCALING_NL_ANAMORPHIC and
>> > VA_FILTER_SCALING_DEFAULT/VA_FILTER_SCALING_FAST are supported. We will
>> > add support for VA_FILTER_SCALING_HQ.
>> >
>> > Thanks
>> > Haihao
>> >
>> > >
>> > >
>> > >
>> > > On Mon, Feb 18, 2013 at 12:11 AM, Xiang, Haihao
>> > > <haihao.xiang at intel.com> wrote:
>> > > On Fri, 2013-02-15 at 16:18 -0800, Ratin wrote:
>> > > > I am decoding a 720p video stream from a camera to 1080p surfaces
>> > > > and displaying them on the screen. I am seeing noticeable noise and
>> > > > pulsating that is (apparently) directly related to the I-frame
>> > > > interval. The lowest I-frame interval I can specify for the camera
>> > > > is 1 second; selecting that together with a bitrate of 8192 kbps
>> > > > makes it slightly better, but there is still a lot of noise. A
>> > > > software decoded/scaled video looks all smooth.
>> > > >
>> > > > What I am wondering is: what is the default scaling algorithm
>> > > > used in the vaapi/Intel driver, how do I specify better scaling
>> > > > algorithms like bi-cubic, can I specify the strength of the
>> > > > deblocking filter as well, and what can I do to reduce the
>> > > > pulsating?
>> > >
>> > >
>> > > Which driver are you using? For Intel, it is bilinear.
>> > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > Any input would be much appreciated.
>> > > >
>> > > >
>> > > > Thanks
>> > > >
>> > > >
>> > > > Ratin
>> > > >
>> > > >
>> > >
>
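
As an illustration of the scaling flags quoted earlier in this thread, the sketch below shows one way to select a scaling mode in a flags word while leaving the other bits untouched. The constant values are copied from the header excerpt quoted above; the helper name with_scaling_mode is hypothetical and not part of libva, and whether a given mode has any effect depends on the driver, as discussed in the thread.

```c
/*
 * Values copied from the header excerpt quoted in this thread. The guard
 * keeps this sketch compilable both with and without the real va.h.
 */
#ifndef VA_FILTER_SCALING_MASK
#define VA_FILTER_SCALING_DEFAULT       0x00000000
#define VA_FILTER_SCALING_FAST          0x00000100
#define VA_FILTER_SCALING_HQ            0x00000200
#define VA_FILTER_SCALING_NL_ANAMORPHIC 0x00000300
#define VA_FILTER_SCALING_MASK          0x00000f00
#endif

/*
 * Hypothetical helper: replace the scaling bits of a flags word with the
 * requested scaling mode, preserving every bit outside the scaling mask.
 */
unsigned int with_scaling_mode(unsigned int flags, unsigned int scaling)
{
    return (flags & ~(unsigned int)VA_FILTER_SCALING_MASK) |
           (scaling & VA_FILTER_SCALING_MASK);
}
```

The result would then be passed as the flags argument to vaPutSurface() or stored in the filter_flags field of VAProcPipelineParameterBuffer, which are the two places the thread names for these flags.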