Slow memory access when using OpenCL without X11
Lauri Ehrenpreis
laurioma at gmail.com
Sun Mar 10 19:13:21 UTC 2019
It seems the sysbench cpu test does not slow down:
1) run sysbench cpu on idle machine:
sysbench cpu run
...
General statistics:
total time: 10.0033s
total number of events: 19052
2) start ./cl_slow_test 1 10000 in background
3) run sysbench again
sysbench cpu run
...
General statistics:
total time: 10.0036s
total number of events: 18979
So it did not slow down considerably: the event count dropped by only about 0.4%.
If I do a similar test with the sysbench memory test, I get the following results:
1) run sysbench memory on idle machine:
sysbench memory --memory-block-size=32M --memory-total-size=100G run
...
66432.00 MiB transferred (6638.95 MiB/sec)
2) start ./cl_slow_test 1 10000 in background
3) run sysbench again:
sysbench memory --memory-block-size=32M --memory-total-size=100G run
...
672.00 MiB transferred (66.40 MiB/sec)
This confirms that memory throughput is reduced by about 100x (6638.95 -> 66.40 MiB/sec) :(
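
For anyone who wants to reproduce the memory measurement without sysbench, here is a minimal sketch of a memcpy throughput loop. This is not the attached cl_slow_test.cpp: the function names memcpy_mib_per_s and slowdown_factor are hypothetical, the buffer size and repetition count are arbitrary, and the OpenCL part is deliberately left out (in the real test, clCreateContext is called once before the copy loop when the first argument is 1).

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper (not taken from cl_slow_test.cpp): copy `bytes` bytes
// `reps` times with memcpy and return the average throughput in MiB/s.
double memcpy_mib_per_s(std::size_t bytes, int reps) {
    assert(bytes > 0 && reps > 0);
    std::vector<char> src(bytes, 1), dst(bytes, 0);

    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < reps; ++i) {
        src[0] = static_cast<char>(i);      // vary the source a little each pass
        std::memcpy(dst.data(), src.data(), bytes);
        volatile char sink = dst[bytes - 1]; // read back so the copy is not optimized away
        (void)sink;
    }
    auto stop = std::chrono::steady_clock::now();

    double secs = std::chrono::duration<double>(stop - start).count();
    double mib = static_cast<double>(bytes) * reps / (1024.0 * 1024.0);
    return mib / secs;
}

// Hypothetical helper: ratio between a baseline and a degraded throughput.
double slowdown_factor(double baseline, double degraded) {
    assert(degraded > 0.0);
    return baseline / degraded;
}
```

Calling memcpy_mib_per_s(32u << 20, 100) once on an idle machine and once while ./cl_slow_test 1 10000 runs in the background should show the same effect as sysbench, assuming the slowdown is system-wide as the numbers above suggest. With the sysbench figures above, slowdown_factor(6638.95, 66.40) comes out just under 100.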
--
Lauri
On Sat, Mar 9, 2019 at 9:22 PM Jan Vesely <jan.vesely at rutgers.edu> wrote:
> On Sat, Mar 9, 2019 at 1:54 AM Lauri Ehrenpreis <laurioma at gmail.com>
> wrote:
> >
> > Even if it's using CPU for OCL (I know it's not doing this), why does
> > memcpy on CPU slow down permanently, if I'm not doing anything with OpenCL
> > after clCreateContext?
> >
> > As you can see from the test program, it just calls clCreateContext and
> > then runs a loop of memcpy calls on the CPU.
> >
> > Also I found out that writing different values to
> > /sys/class/drm/card0/device/power_dpm_force_performance_level changes my
> > max memcpy speed on CPU:
> >
> > echo "low" > /sys/class/drm/card0/device/power_dpm_force_performance_level
> > ./cl_slow_test 1 5
> > got 1 platforms 1 devices
> > speed 731.810425 avg 731.810425 mbytes/s
> > speed 163.425583 avg 447.618011 mbytes/s
> > speed 123.441612 avg 339.559235 mbytes/s
> > speed 121.655266 avg 285.083252 mbytes/s
> > speed 123.806801 avg 252.827972 mbytes/s
> >
> > echo "high" > /sys/class/drm/card0/device/power_dpm_force_performance_level
> > ./cl_slow_test 1 5
> > got 1 platforms 1 devices
> > speed 3742.063721 avg 3742.063721 mbytes/s
> > speed 836.148987 avg 2289.106445 mbytes/s
> > speed 189.379166 avg 1589.197266 mbytes/s
> > speed 189.271393 avg 1239.215820 mbytes/s
> > speed 188.290451 avg 1029.030762 mbytes/s
> >
> > echo "profile_standard" > /sys/class/drm/card0/device/power_dpm_force_performance_level
> > ./cl_slow_test 1 5
> > got 1 platforms 1 devices
> > speed 2303.955566 avg 2303.955566 mbytes/s
> > speed 2298.224121 avg 2301.089844 mbytes/s
> > speed 2295.585205 avg 2299.254883 mbytes/s
> > speed 2295.762939 avg 2298.381836 mbytes/s
> > speed 2288.766602 avg 2296.458740 mbytes/s
> >
> > echo "profile_peak" > /sys/class/drm/card0/device/power_dpm_force_performance_level
> > ./cl_slow_test 1 5
> > got 1 platforms 1 devices
> > speed 3710.360352 avg 3710.360352 mbytes/s
> > speed 3713.660400 avg 3712.010254 mbytes/s
> > speed 3797.630859 avg 3740.550537 mbytes/s
> > speed 3708.004883 avg 3732.414062 mbytes/s
> > speed 3796.403076 avg 3745.211914 mbytes/s
> >
> > However, none of those is close to the memcpy speed I get when I don't
> > call clCreateContext (my test program run with first argument 0):
> > ./cl_slow_test 0 5
> > speed 7299.201660 avg 7299.201660 mbytes/s
> > speed 9298.841797 avg 8299.021484 mbytes/s
> > speed 9360.181641 avg 8652.742188 mbytes/s
> > speed 9004.759766 avg 8740.746094 mbytes/s
> > speed 9414.607422 avg 8875.518555 mbytes/s
> >
> > I have also attached clinfo.txt. It shows that OpenCL is using the GPU, so
> > device node permissions are probably not the issue.
>
> Is it only memory accesses or does overall CPU performance degrade
> (including compute - say sysbench) as well?
>
> Jan
>
> > --
> > Lauri
> >
> > On Fri, Mar 8, 2019 at 10:35 PM Alex Deucher <alexdeucher at gmail.com>
> > wrote:
> >>
> >> I think you are probably using the CPU for OCL in the remote login
> >> case. When you log into the desktop, the permissions on the device
> >> nodes get changed dynamically to support accelerated rendering. You
> >> probably need to change the permissions on the device nodes manually
> >> if you are not logging into the desktop.
> >>
> >> Alex
> >>
> >> On Fri, Mar 8, 2019 at 2:43 PM Lauri Ehrenpreis <laurioma at gmail.com>
> >> wrote:
> >> >
> >> > Hi!
> >> >
> >> > I am using a Ryzen 2400G with a Gigabyte AMD B450 AORUS board. I have the
> >> > latest BIOS, Ubuntu 18.04 and the latest mainline kernel
> >> > (5.0.0-050000-generic) installed. I also have rocm-dev 2.1.96 installed,
> >> > but not rock-dkms.
> >> >
> >> > I found that when I log in over ssh and try to use OpenCL (calling
> >> > clCreateContext is enough), CPU memory accesses after that slow down
> >> > by 100x.
> >> > If I connect an HDMI cable and log in to desktop mode, this does not
> >> > happen. Also, if I don't call clCreateContext, everything works properly.
> >> >
> >> > I have also attached the test program and the kernel log. The test works like this:
> >> > g++ cl_slow_test.cpp -o cl_slow_test -I /opt/rocm/opencl/include/ -L /opt/rocm/opencl/lib/x86_64/ -lOpenCL
> >> > lauri@rv:~$ ./cl_slow_test 0 5
> >> > speed 7003.145508 avg 7003.145508 mbytes/s
> >> > speed 8427.357422 avg 7715.251465 mbytes/s
> >> > speed 9203.049805 avg 8211.184570 mbytes/s
> >> > speed 9845.956055 avg 8619.877930 mbytes/s
> >> > speed 9882.748047 avg 8872.452148 mbytes/s
> >> > lauri@rv:~$ ./cl_slow_test 1 5
> >> > got 1 platforms 1 devices
> >> > speed 1599.803589 avg 1599.803589 mbytes/s
> >> > speed 1665.426392 avg 1632.614990 mbytes/s
> >> > speed 146.137253 avg 1137.122437 mbytes/s
> >> > speed 121.056877 avg 883.106018 mbytes/s
> >> > speed 122.428970 avg 730.970581 mbytes/s
> >> >
> >> > I also tried the latest amd-staging kernel
> >> > (https://github.com/M-Bab/linux-kernel-amdgpu-binaries) and it had the
> >> > same issue.
> >> >
> >> > Can anyone point me in the right direction?
> >> >
> >> > Br,
> >> > Lauri
> >> > _______________________________________________
> >> > amd-gfx mailing list
> >> > amd-gfx at lists.freedesktop.org
> >> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
> >
>