<div dir="ltr">vanilla 1.4.3<div><div>Performance counter stats for 'gst-launch-1.0 videotestsrc ! video/x-raw,format=BGRx,width=1920,height=1080 ! imagefreeze ! queue ! videoconvert ! video/x-raw,format=NV12 ! fakesink num-buffers=6000':</div><div><br></div><div>      98360.636211      task-clock (msec)         #    1.001 CPUs utilized          </div><div>             6,403      context-switches          #    0.065 K/sec                  </div><div>               135      cpu-migrations            #    0.001 K/sec                  </div><div>             2,353      page-faults               #    0.024 K/sec                  </div><div>   313,125,266,258      cycles                    #    3.183 GHz                    </div><div>    75,629,048,758      stalled-cycles-frontend   #   24.15% frontend cycles idle   </div><div>   <not supported>      stalled-cycles-backend   </div><div>   903,876,411,765      instructions              #    2.89  insns per cycle        </div><div>                                                  #    0.08  stalled cycles per insn</div><div>    65,865,189,773      branches                  #  669.630 M/sec                  </div><div>        36,058,736      branch-misses             #    0.05% of all branches        </div><div><br></div><div>      98.274839849 seconds time elapsed</div></div><div><br></div><div><br></div><div>1.4.2 +SSE <a href="https://github.com/pontostroy/gstreamer-screenrecording/blob/master/patches/1.4.0/SSE_2_nv12%2Ci420_1.4.0.patch">https://github.com/pontostroy/gstreamer-screenrecording/blob/master/patches/1.4.0/SSE_2_nv12%2Ci420_1.4.0.patch</a></div><div><div> Performance counter stats for 'gst-launch-1.0 videotestsrc ! video/x-raw,format=BGRx,width=1920,height=1080 ! imagefreeze ! queue ! videoconvert ! video/x-raw,format=NV12 ! 
fakesink num-buffers=6000':</div><div><br></div><div>      33457.250442      task-clock (msec)         #    1.002 CPUs utilized          </div><div>             6,162      context-switches          #    0.184 K/sec                  </div><div>                90      cpu-migrations            #    0.003 K/sec                  </div><div>             4,367      page-faults               #    0.131 K/sec                  </div><div>   106,550,436,733      cycles                    #    3.185 GHz                    </div><div>    19,416,677,000      stalled-cycles-frontend   #   18.22% frontend cycles idle   </div><div>   <not supported>      stalled-cycles-backend   </div><div>   309,264,199,276      instructions              #    2.90  insns per cycle        </div><div>                                                  #    0.06  stalled cycles per insn</div><div>    16,465,186,680      branches                  #  492.126 M/sec                  </div><div>        29,833,144      branch-misses             #    0.18% of all branches        </div><div><br></div><div>      33.405062383 seconds time elapsed</div></div><div><br></div><div>1.4.2 + SimpleScreenRecorder patch <a href="https://github.com/pontostroy/gstreamer-screenrecording/blob/master/patches/1.4.0/SSR_i420_1.4.0.patch">https://github.com/pontostroy/gstreamer-screenrecording/blob/master/patches/1.4.0/SSR_i420_1.4.0.patch</a></div><div><br></div><div><div>Performance counter stats for 'gst-launch-1.0 videotestsrc ! video/x-raw,format=BGRx,width=1920,height=1080 ! imagefreeze ! queue ! videoconvert ! video/x-raw,format=I420 ! 
fakesink num-buffers=6000':</div><div><br></div><div>       7158.287710      task-clock (msec)         #    1.016 CPUs utilized          </div><div>             6,074      context-switches          #    0.849 K/sec                  </div><div>               121      cpu-migrations            #    0.017 K/sec                  </div><div>             2,838      page-faults               #    0.396 K/sec                  </div><div>    22,678,342,775      cycles                    #    3.168 GHz                    </div><div>     4,182,567,305      stalled-cycles-frontend   #   18.44% frontend cycles idle   </div><div>   <not supported>      stalled-cycles-backend   </div><div>    66,936,755,926      instructions              #    2.95  insns per cycle        </div><div>                                                  #    0.06  stalled cycles per insn</div><div>       447,701,888      branches                  #   62.543 M/sec                  </div><div>         5,987,845      branch-misses             #    1.34% of all branches        </div><div><br></div><div>       7.048563154 seconds time elapsed</div></div><div><br></div><div><br></div><div>1.5.0 orc</div><div><div>Performance counter stats for 'gst-launch-1.0 videotestsrc ! video/x-raw,format=BGRx,width=1920,height=1080 ! imagefreeze ! queue ! videoconvert ! video/x-raw,format=I420 ! 
fakesink num-buffers=6000':</div><div><br></div><div>      50316.822297      task-clock (msec)         #    1.001 CPUs utilized          </div><div>             6,589      context-switches          #    0.131 K/sec                  </div><div>                95      cpu-migrations            #    0.002 K/sec                  </div><div>             8,255      page-faults               #    0.164 K/sec                  </div><div>   160,235,388,591      cycles                    #    3.185 GHz                    </div><div>    34,241,769,176      stalled-cycles-frontend   #   21.37% frontend cycles idle   </div><div>   <not supported>      stalled-cycles-backend   </div><div>   468,313,611,472      instructions              #    2.92  insns per cycle        </div><div>                                                  #    0.07  stalled cycles per insn</div><div>    22,484,833,596      branches                  #  446.865 M/sec                  </div><div>        35,166,607      branch-misses             #    0.16% of all branches        </div><div><br></div><div>      50.261769163 seconds time elapsed</div><div><br></div></div><div><br></div><div>I have not tested the very simple table64 patch <a href="https://github.com/pontostroy/gstreamer-screenrecording/blob/master/patches/1.4.0/table64_nv12_i420_1.4.0.patch">https://github.com/pontostroy/gstreamer-screenrecording/blob/master/patches/1.4.0/table64_nv12_i420_1.4.0.patch</a>, but I expect it to be no worse than ORC.</div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">2014-10-05 16:46 GMT+03:00 Tim Müller <span dir="ltr"><<a href="mailto:tim@centricular.com" target="_blank">tim@centricular.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On Sun, 2014-10-05 at 16:23 +0300, Yaroslav Andrusyak wrote:<br>
<br>
> CPU time of this pipeline<br>
> gst-launch-1.0 -e ximagesrc display-name=:0 use-damage=0 !<br>
> multiqueue ! video/x-raw,format=BGRx ! videoconvert !<br>
> video/x-raw,format=I420,framerate=30/1 ! multiqueue !<br>
> vaapiencode_h264  ! vaapiparse_h264 ! multiqueue ! matroskamux<br>
> name=muxer muxer. ! progressreport name=Rec_time ! filesink<br>
> location=/disk/tmp//rec_2014-10-05_123757.mkv<br>
> and then<br>
> perf stat -p `pidof gst-launch-1.0`<br>
<br>
</span>It might make sense to try and condense the pipeline to the essential<br>
bits that you're trying to benchmark, e.g. something like:<br>
<br>
perf stat gst-launch-1.0 videotestsrc !<br>
video/x-raw,format=BGRx,width=1920,height=1080 ! imagefreeze ! queue !<br>
videoconvert ! video/x-raw,format=I420 ! fakesink num-buffers=1000<br>
<br>
or just<br>
<br>
time gst-launch-1.0 ...<br>
<br>
Cheers<br>
<span class="HOEnZb"><font color="#888888"> -Tim<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
--<br>
Tim Müller, Centricular Ltd - <a href="http://www.centricular.com" target="_blank">http://www.centricular.com</a><br>
<br>
</div></div></blockquote></div><br></div>
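<div><br></div>
<div>To compare the four runs at a glance, the elapsed times reported by perf stat can be reduced to a per-frame cost and a speedup ratio. The following is only a rough sketch: it assumes each run converted about 6000 frames (the num-buffers value on fakesink), and the names runs, ms_per_frame and speedup are made up for this illustration. Note that the NV12 runs and the I420 runs measure different conversions, and that the baseline is 1.4.3 while the patches were applied to 1.4.2, so only like-for-like pairs are compared.</div>

```python
# Elapsed wall-clock seconds copied from the perf stat output in this mail.
# Keys are illustrative labels, not official names.
runs = {
    "vanilla 1.4.3 (NV12)": 98.274839849,
    "1.4.2 + SSE patch (NV12)": 33.405062383,
    "1.4.2 + SSR patch (I420)": 7.048563154,
    "1.5.0 orc (I420)": 50.261769163,
}

# Assumption: num-buffers=6000 on fakesink means roughly 6000 converted frames.
NUM_BUFFERS = 6000


def ms_per_frame(seconds, frames=NUM_BUFFERS):
    """Average wall-clock cost of one 1920x1080 frame, in milliseconds."""
    return seconds * 1000.0 / frames


def speedup(baseline, candidate):
    """How many times faster `candidate` finished than `baseline`."""
    return runs[baseline] / runs[candidate]


if __name__ == "__main__":
    for name, seconds in runs.items():
        print(f"{name}: {ms_per_frame(seconds):.2f} ms/frame")

    # Compare only runs that produce the same output format.
    nv12 = speedup("vanilla 1.4.3 (NV12)", "1.4.2 + SSE patch (NV12)")
    i420 = speedup("1.5.0 orc (I420)", "1.4.2 + SSR patch (I420)")
    print(f"SSE patch vs vanilla (NV12): {nv12:.2f}x")
    print(f"SSR patch vs 1.5.0 orc (I420): {i420:.2f}x")
```

<div>Since task-clock shows about one CPU utilized in every run, the wall-clock ratio is close to the CPU-time ratio here; for a multi-threaded pipeline the task-clock values would be the better basis.</div>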