[Libva] [PATCH 10/13] VME shader keep for HEVC

Zhao Yakui yakui.zhao at intel.com
Wed Jul 13 01:29:19 UTC 2016


On 07/12/2016 03:45 AM, Matt Turner wrote:
> On Thu, Jul 7, 2016 at 2:18 AM, Pengfei Qu<Pengfei.Qu at intel.com>  wrote:
>> +/* Compare three word data to get the min value */
>> +word_imin:
>> +       cmp.le.f0.0 (1)         null:w          INPUT_ARG0.0<0,1,0>:w   INPUT_ARG0.4<0,1,0>:w {align1};
>> +       (f0.0) mov  (1)         TEMP_VAR0.0<1>:w INPUT_ARG0.0<0,1,0>:w                    {align1};
>> +       (-f0.0) mov (1)         TEMP_VAR0.0<1>:w INPUT_ARG0.4<0,1,0>:w                    {align1};
>> +       cmp.le.f0.0 (1)         null:w          TEMP_VAR0.0<0,1,0>:w    INPUT_ARG0.8<0,1,0>:w {align1};
>> +       (f0.0) mov  (1)         RET_ARG<1>:w TEMP_VAR0.0<0,1,0>:w                         {align1};
>> +       (-f0.0) mov (1)         RET_ARG<1>:w INPUT_ARG0.8<0,1,0>:w                        {align1};
>
> I think each of these groups of cmp/mov/mov can be replaced with a single sel.

Hi, Matt

     Thanks for your suggestion.

     The above cmp/mov/mov group can't be replaced with a single sel.
Only the mov/mov pair can be replaced with a single sel, because the
condition of the select instruction is based on the cmp instruction.
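
     To make the pattern concrete, here is a rough C model of what the
three helpers compute. It is only a sketch: the function names mirror the
shader labels, the parameters a, b and c stand for INPUT_ARG0.0,
INPUT_ARG0.4 and INPUT_ARG0.8, and the check function is hypothetical. The
flag variable plays the role of f0.0, and each ternary stands for one
(f0.0) mov / (-f0.0) mov pair.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative model only; not part of the patch. */
static int16_t word_imin(int16_t a, int16_t b, int16_t c)
{
    int flag = (a <= b);            /* cmp.le: sets the flag only   */
    int16_t tmp = flag ? a : b;     /* predicated mov/mov pair      */
    flag = (tmp <= c);              /* second cmp.le against c      */
    return flag ? tmp : c;          /* second mov/mov pair          */
}

static int16_t word_imax(int16_t a, int16_t b, int16_t c)
{
    int flag = (a >= b);            /* cmp.ge: sets the flag only   */
    int16_t tmp = flag ? a : b;
    flag = (tmp >= c);
    return flag ? tmp : c;
}

/* Follows the same branch order as word_imedian / cmp_a_ge_b. */
static int16_t word_imedian(int16_t a, int16_t b, int16_t c)
{
    if (a >= b) {                   /* cmp_a_ge_b path              */
        if (b >= c)
            return b;               /* a >= b >= c                  */
        return (a >= c) ? c : a;    /* b is smallest                */
    }
    if (a >= c)
        return a;                   /* b > a >= c                   */
    return (b >= c) ? c : b;        /* a is smallest                */
}

/* Hypothetical sanity check for the model above. */
void check_three_word_helpers(void)
{
    assert(word_imin(3, 1, 2) == 1);
    assert(word_imax(3, 1, 2) == 3);
    assert(word_imedian(3, 1, 2) == 2);
}
```

     In this model each compare and each selection is a separate step,
which corresponds to keeping the cmp and replacing only the two
predicated movs.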

     Another reason is that the current shader is derived from the one
for H264, which has already been verified to work well. As this code is
also not performance-critical, I don't think we need to spend much effort
on replacing mov/mov with sel.

Thanks
    Yakui

>
> sel.l.f0.0 (1) TEMP_VAR0.0<1>:w INPUT_ARG0.0<0,1,0>:w INPUT_ARG0.4<0,1,0>:w {align1};
> sel.l.f0.0 (1) RET_ARG<1>:w TEMP_VAR0.0<1>:w INPUT_ARG0.8<0,1,0>:w {align1};
>
>> +       RETURN          {align1};
>> +
>> +/* Compare three word data to get the max value */
>> +word_imax:
>> +       cmp.ge.f0.0 (1)         null:w          INPUT_ARG0.0<0,1,0>:w   INPUT_ARG0.4<0,1,0>:w {align1};
>> +       (f0.0) mov  (1)         TEMP_VAR0.0<1>:w INPUT_ARG0.0<0,1,0>:w                    {align1};
>> +       (-f0.0) mov (1)         TEMP_VAR0.0<1>:w INPUT_ARG0.4<0,1,0>:w                    {align1};
>> +       cmp.ge.f0.0 (1)         null:w          TEMP_VAR0.0<0,1,0>:w    INPUT_ARG0.8<0,1,0>:w {align1};
>> +       (f0.0) mov  (1)         RET_ARG<1>:w TEMP_VAR0.0<0,1,0>:w                         {align1};
>> +       (-f0.0) mov (1)         RET_ARG<1>:w INPUT_ARG0.8<0,1,0>:w                        {align1};
>
> Same here I expect.
>
>> +       RETURN          {align1};
>> +
>> +word_imedian:
>> +       cmp.ge.f0.0 (1) null:w INPUT_ARG0.0<0,1,0>:w INPUT_ARG0.4<0,1,0>:w {align1};
>> +       (f0.0)  jmpi (1) cmp_a_ge_b;
>> +       cmp.ge.f0.0 (1) null:w INPUT_ARG0.0<0,1,0>:w INPUT_ARG0.8<0,1,0>:w {align1};
>> +       (f0.0) mov (1) RET_ARG<1>:w INPUT_ARG0.0<0,1,0>:w {align1};
>> +       (f0.0) jmpi (1) cmp_end;
>> +       cmp.ge.f0.0 (1) null:w INPUT_ARG0.4<0,1,0>:w INPUT_ARG0.8<0,1,0>:w {align1};
>> +       (f0.0) mov (1) RET_ARG<1>:w INPUT_ARG0.8<0,1,0>:w {align1};
>> +       (-f0.0) mov (1) RET_ARG<1>:w INPUT_ARG0.4<0,1,0>:w {align1};
>> +       jmpi (1) cmp_end;
>> +cmp_a_ge_b:
>> +       cmp.ge.f0.0 (1) null:w INPUT_ARG0.4<0,1,0>:w INPUT_ARG0.8<0,1,0>:w {align1};
>> +       (f0.0) mov (1) RET_ARG<1>:w INPUT_ARG0.4<0,1,0>:w {align1};
>> +       (f0.0) jmpi (1) cmp_end;
>> +       cmp.ge.f0.0 (1) null:w INPUT_ARG0.0<0,1,0>:w INPUT_ARG0.8<0,1,0>:w {align1};
>> +       (f0.0) mov (1) RET_ARG<1>:w INPUT_ARG0.8<0,1,0>:w {align1};
>> +       (-f0.0) mov (1) RET_ARG<1>:w INPUT_ARG0.0<0,1,0>:w {align1};
>
> And here.
