nvcodec: low resolutions transcode faster with host memory, high resolutions faster with GL memory

Sid Sethupathi sid.sethupathi at gmail.com
Tue Apr 21 18:03:28 UTC 2020


Hello,

I noticed that when transcoding using nvh264dec and nvh264enc, lower
resolutions perform better when using system/host memory instead of GL
memory. Higher resolutions perform better when using GL memory instead of
system/host memory.

If you profile the pipelines using nvprof, the memory copy operations seem
in line with what you'd expect: device-to-host memory copies are slower
than device-to-device copies. Since the memory copy performance is as
expected, what could be causing the slower GL memory path at low
resolution, and why does it only affect lower resolutions?

This gist has the results of my testing:
https://gist.github.com/sidsethupathi/b464a6dc30907768a074d8dc526b2b66

I created 10-minute test sources, one at 320x240 and another at 3840x2160,
and ran them through a "filesrc ! nvh264dec ! nvh264enc ! fakesink"
pipeline, similar to Seungha's benchmarks in this MR:
https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/539

The results are better formatted in the gist, but they are copied below:

320x240, host memory. Execution time = 0:00:11.252269640

gst-launch-1.0 filesrc location=low_res.ts ! tsdemux ! h264parse ! nvh264dec ! "video/x-raw" ! nvh264enc ! fakesink

            Type  Time(%)      Time     Calls       Avg       Min       Max  Name
 GPU activities:   49.35%  368.89ms     36000  10.246us  5.9200us  19.872us  [CUDA memcpy HtoD]
                   40.12%  299.85ms     36000  8.3290us  5.3120us  24.256us  [CUDA memcpy DtoH]
                    6.34%  47.353ms     36000  1.3150us  1.2150us  9.3440us  Convert_PL2BL
                    4.19%  31.291ms     18000  1.7380us  1.6640us  2.2400us  ConvertNV24toNV12
                    0.01%  77.632us        68  1.1410us     704ns  2.6240us  [CUDA memset]

320x240, GL memory. Execution time = 0:00:20.584277338

gst-launch-1.0 filesrc location=low_res.ts ! tsdemux ! h264parse ! nvh264dec ! "video/x-raw(memory:GLMemory)" ! nvh264enc ! fakesink

            Type  Time(%)      Time     Calls       Avg       Min       Max  Name
 GPU activities:   50.84%  79.086ms     72000  1.0980us     864ns  14.208us  [CUDA memcpy DtoD]
                   27.69%  43.070ms     36000  1.1960us     991ns  14.016us  Convert_PL2BL
                   21.41%  33.308ms     18000  1.8500us  1.5030us  2.8480us  ConvertNV24toNV12
                    0.05%  78.944us        68  1.1600us     672ns  2.6560us  [CUDA memset]

3840x2160, host memory. Execution time = 0:03:20.462018560

gst-launch-1.0 filesrc location=hi_res.ts ! tsdemux ! h264parse ! nvh264dec ! "video/x-raw" ! nvh264enc ! fakesink

            Type  Time(%)      Time     Calls       Avg       Min       Max  Name
 GPU activities:   54.40%  46.3980s     36000  1.2888ms  738.27us  2.7441ms  [CUDA memcpy HtoD]
                   42.74%  36.4568s     36000  1.0127ms  599.52us  3.0454ms  [CUDA memcpy DtoH]
                    1.47%  1.25313s     18000  69.618us  67.584us  72.192us  ConvertNV24toNV12
                    1.39%  1.18157s     36000  32.821us  23.328us  45.856us  Convert_PL2BL
                    0.00%  81.504us        66  1.2340us     704ns  2.6560us  [CUDA memset]

3840x2160, GL memory. Execution time = 0:02:18.106101429

gst-launch-1.0 filesrc location=hi_res.ts ! tsdemux ! h264parse ! nvh264dec ! "video/x-raw(memory:GLMemory)" ! nvh264enc ! fakesink

            Type  Time(%)      Time     Calls       Avg       Min       Max  Name
 GPU activities:   50.48%  2.63958s     72000  36.660us  22.976us  58.976us  [CUDA memcpy DtoD]
                   25.11%  1.31285s     36000  36.468us  23.744us  49.024us  Convert_PL2BL
                   24.41%  1.27668s     18000  70.926us  67.585us  71.872us  ConvertNV24toNV12
                    0.00%  81.536us        66  1.2350us     704ns  2.9120us  [CUDA memset]
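
One rough way to put the four runs side by side is the Python sketch below.
It converts each execution time to seconds, derives a wall-clock cost per
frame, and reports what share of the wall-clock time the listed GPU
activities add up to. The frame count of 18000 is an assumption (it is the
ConvertNV24toNV12 call count, taken here as one call per frame), and the
names to_seconds, RUNS and summarize are made up for this illustration. All
timings are copied from the results above.

```python
# Assumption: one ConvertNV24toNV12 call per frame, so 18000 frames per run.
FRAMES = 18000


def to_seconds(hms: str) -> float:
    """Convert an 'H:MM:SS.nnnnnnnnn' execution time to seconds."""
    hours, minutes, seconds = hms.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


# (resolution, memory) -> (execution time, "Time" column of each nvprof row in seconds)
RUNS = {
    ("320x240", "host"): ("0:00:11.252269640",
                          [368.89e-3, 299.85e-3, 47.353e-3, 31.291e-3, 77.632e-6]),
    ("320x240", "GL"): ("0:00:20.584277338",
                        [79.086e-3, 43.070e-3, 33.308e-3, 78.944e-6]),
    ("3840x2160", "host"): ("0:03:20.462018560",
                            [46.3980, 36.4568, 1.25313, 1.18157, 81.504e-6]),
    ("3840x2160", "GL"): ("0:02:18.106101429",
                          [2.63958, 1.31285, 1.27668, 81.536e-6]),
}


def summarize(runs=RUNS, frames=FRAMES):
    """Return wall time, per-frame time and GPU-activity share for each run."""
    summary = {}
    for key, (wall, gpu_times) in runs.items():
        wall_s = to_seconds(wall)
        gpu_s = sum(gpu_times)
        summary[key] = {
            "wall_s": wall_s,
            "ms_per_frame": wall_s / frames * 1e3,
            "gpu_s": gpu_s,
            "gpu_share": gpu_s / wall_s,
        }
    return summary


if __name__ == "__main__":
    for (resolution, memory), row in summarize().items():
        print(f"{resolution:>9} {memory:<4} "
              f"wall={row['wall_s']:8.2f}s "
              f"per-frame={row['ms_per_frame']:6.3f}ms "
              f"gpu-activities={row['gpu_s']:7.3f}s "
              f"({row['gpu_share']:.1%} of wall time)")
```

At 320x240 the listed GPU activities sum to roughly 0.75 s of the 11.25 s
host memory run and roughly 0.16 s of the 20.58 s GL memory run, so the extra
time in the GL memory run does not show up in these rows. The "GPU activities"
summary may not capture everything the pipeline does, so this is only a rough
cross-check.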


Sid