<div dir="ltr"><div><div><div><div><div>Hi<br><br></div>I'm observing a heavy performance penalty when using GStreamer 1.0 (1.2.0, 1.2.1, 1.2.2) compared with GStreamer 0.10 (0.10.36). Decoding a given video clip consumes roughly 3-4 times as much CPU time. Is this a known issue?<br>
<br></div>To reproduce this with a common test clip, run the following pipeline for about 60 seconds and then press Ctrl-C:<br><br>gst-launch-1.0 -e -v videotestsrc pattern=18 is-live=true ! 'video/x-raw, width=1024, height=576, format=I420' ! videoconvert ! x264enc ! 'video/x-h264, profile=main' ! avimux ! filesink location=video1024x576.mp4<br>
<br></div>The two nearly identical pipelines used for testing are shown further down in this email. CPU usage is estimated using top. Here is what I see:<br><br> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND <br>
5757 stream 20 0 403m 43m 7272 S 28.6 1.1 0:05.76 gst-launch-0.10 <br><br> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND <br> 5763 stream 20 0 1068m 61m 5232 S 85.5 1.6 0:07.22 gst-launch-1.0 <br>
<br></div>So 0.10 is using roughly 28.6% CPU while 1.0 is using 85.5%, about three times as much.<br><br></div>The pipelines used for playing the clip are these:<br><div><br>/usr/bin/gst-launch-0.10 -v filesrc location=./video1024x576.mp4 do-timestamp=true ! decodebin2 name=decoder ! ffmpegcolorspace ! videorate ! videoscale ! ffmpegcolorspace ! 'video/x-raw-rgb, bpp=(int)32, depth=(int)32, endianness=(int)4321, red_mask=(int)65280, green_mask=(int)16711680, blue_mask=(int)-16777216, pixel-aspect-ratio=(fraction)1/1, interlaced=(boolean)false, width=(int)1280, height=(int)720, framerate=(fraction)25/1' ! queue ! fakesink silent=true sync=true<br>
<br>/usr/local/bin/gst-launch-1.0 -v filesrc location=./video1024x576.mp4 do-timestamp=true ! decodebin name=decoder ! videoconvert ! videorate ! videoscale ! videoconvert ! 'video/x-raw, format=(string)BGRA, pixel-aspect-ratio=(fraction)1/1, interlace-mode=(string)progressive, width=(int)1280, height=(int)720, framerate=(fraction)25/1' ! identity silent=true ! queue ! fakesink silent=true sync=true<br>
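<br>As a cross-check on the top readings, the total CPU time of each pipeline could also be measured directly. The following is only a minimal sketch, assuming Python 3 on Linux; the helper name cpu_seconds is hypothetical and not part of GStreamer. Because both pipelines end in fakesink sync=true, they should run for about the clip's duration, so total user+system CPU seconds is the comparable figure.<br>

```python
import resource
import subprocess
import sys


def cpu_seconds(cmd):
    """Run cmd to completion and return the user+system CPU seconds it consumed.

    cpu_seconds is a hypothetical helper for illustration. It relies on
    RUSAGE_CHILDREN, which accumulates usage of children that have been waited for.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    # Discard the pipeline's own output; only its resource usage matters here.
    subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=False)
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    user = after.ru_utime - before.ru_utime
    system = after.ru_stime - before.ru_stime
    return user + system


if __name__ == "__main__":
    # Example: python cpu_seconds.py gst-launch-1.0 filesrc location=... ! ...
    if len(sys.argv) > 1:
        print(f"{cpu_seconds(sys.argv[1:]):.2f} CPU seconds")
```

Running this once with the 0.10 pipeline and once with the 1.0 pipeline, then dividing the two results, would give a ratio that does not depend on when top happened to sample.<br>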
<div><br></div><div>I used to be able to easily live-decode up to 10 720p video clips concurrently on my 8-core server while keeping enough spare CPU to also mix video and, most importantly, encode the video again in 720p. That is not really possible any more with the heavy performance penalty I now have to pay.<br>
<br></div><div>Kind regards<br></div><div>Peter Maersk-Moller<br></div></div></div>