[Beignet] Performances of beignet ... macbook pro 13"

Jérôme Kieffer intel at terre-adelie.org
Wed Oct 15 12:50:33 PDT 2014


Dear Beignet developers,

I am pleased to report that the Beignet OpenCL driver works well on
a MacBook Pro 13" with an Iris 5100 GPU integrated into the Haswell
processor (i5-4308U).

The code used is described on pages 7-14 of this document:
http://pdebuyl.be/tmp/esp2014_draft.pdf

It consists of a map operation (cast and multiplications/divisions)
followed by a sparse matrix by dense vector multiplication, implemented
either as an array of structs (method called LUT, better suited to CPU)
or as a struct of arrays (called CSR, better suited to GPU). CSR is
implemented using a parallel reduction within a workgroup. All OpenCL
methods use single precision floating point arithmetic with Kahan
summation, while the OpenMP code uses double precision arithmetic.
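
As an illustration of the two storage schemes and of Kahan summation
described above, here is a minimal serial sketch in plain Python. It is
not the benchmarked code: the function names, the list-based layouts
and the tiny 2x3 matrix are assumptions made for the example, and the
loop over each row stands in for what the OpenCL kernels do in parallel.

```python
# Illustrative sketch only; names and layouts are assumptions,
# not taken from the benchmarked OpenCL/OpenMP code.

def kahan_sum(values):
    """Compensated (Kahan) summation of an iterable of floats."""
    total = 0.0
    comp = 0.0  # compensation for low-order bits lost so far
    for v in values:
        y = v - comp
        t = total + y
        comp = (t - total) - y
        total = t
    return total


def csr_spmv(data, indices, indptr, x):
    """Struct of arrays (CSR): three flat arrays describe the matrix.

    data    : non-zero coefficients, row after row
    indices : column index of each coefficient
    indptr  : row i spans data[indptr[i]:indptr[i + 1]]
    """
    out = []
    for row in range(len(indptr) - 1):
        span = range(indptr[row], indptr[row + 1])
        out.append(kahan_sum(data[k] * x[indices[k]] for k in span))
    return out


def lut_spmv(lut, x):
    """Array of structs (LUT): one list of (column, coefficient) per row."""
    return [kahan_sum(coef * x[col] for col, coef in row) for row in lut]


# Matrix [[1, 0, 2], [0, 3, 0]] stored both ways
data, indices, indptr = [1.0, 2.0, 3.0], [0, 2, 1], [0, 2, 3]
lut = [[(0, 1.0), (2, 2.0)], [(1, 3.0)]]
x = [1.0, 1.0, 1.0]

y_csr = csr_spmv(data, indices, indptr, x)
y_lut = lut_spmv(lut, x)
```

Both functions return the same vector for the same matrix; only the
memory layout differs.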

This benchmark reports the execution time in milliseconds of the
complete treatment for input images of various sizes (from 1 to
16 Mpixel). Each value is the best timing out of 3 repetitions, each
averaged over 10 runs.
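
The timing scheme above can be sketched with the standard timeit
module. This assumes that "best out of 3, averaged over 10" means 3
repetitions of 10 runs each; best_of_ms and dummy_processing are
hypothetical names standing in for the real benchmark harness and
image treatment.

```python
import timeit


def best_of_ms(func, repeat=3, number=10):
    """Best mean time per call, in milliseconds.

    timeit.repeat returns one total time (in seconds) per repetition,
    each covering `number` calls of func.
    """
    totals = timeit.repeat(func, repeat=repeat, number=number)
    return min(totals) / number * 1000.0


def dummy_processing():
    # Placeholder workload standing in for the image treatment
    return sum(i * i for i in range(1000))


elapsed_ms = best_of_ms(dummy_processing)
```

Taking the minimum over repetitions reduces the influence of other
activity on the machine, while averaging inside each repetition
smooths out timer resolution.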

As the LUT implementation needs a lot of memory, larger images could not
be processed with Beignet (limited to 256 MByte). Apple allows up to
1.5 GByte.

1D_GPU_LUT_OpenCL
Img size        Beignet         Apple
   1.02         7.50            10.066
   2.10         14.44           16.345
   4.19         28.91           34.538
   6.22         -----           37.570
  11.90         -----           68.443
  16.78         -----           78.333

1D_GPU_CSR_OpenCL
Img size        Beignet         Apple
   1.02         3.95            6.0475
   2.10         7.55            13.324
   4.19         15.62           23.255
   6.22         23.88           33.352
  11.90         45.63           55.099
  16.78         68.78           82.569

I also compared the same code on the CPU side.
The OpenCL CPU drivers tested under Linux are those of Intel, AMD and POCL.

1D_CPU_LUT_OpenMP
Img size        Linux/gcc       Apple/clang
   1.02         12.12           13.451
   2.10         30.14           35.307
   4.19         63.79           87.110
   6.22         96.17           130.77
  11.90         222.15          265.94
  16.78         270.42          359.93
        
1D_CPU_CSR_OpenMP
Img size        Linux/gcc       Apple/clang
   1.02         12.31           12.256
   2.10         30.20           33.220
   4.19         64.34           76.948
   6.22         88.82           111.60
  11.90         206.82          218.81
  16.78         280.03          443.35

1D_CPU_LUT_OpenCL
Img size        AMD             Intel           Apple           POCL
   1.02         13.11           8.25            9.7813          8.47
   2.10         29.85           15.20           20.563          17.85
   4.19         58.08           32.77           47.877          47.19
   6.22         97.88           53.04           80.372          62.53
  11.90         184.29          125.52          149.33          135.89
  16.78         261.21          149.31          205.81          190.14

1D_CPU_CSR_OpenCL
Img size        AMD             Intel           Apple           POCL
   1.02         16.96           10.05           9.8027          10.02
   2.10         37.12           18.46           21.904          21.35
   4.19         82.78           42.24           46.961          59.89
   6.22         133.41          70.17           68.312          73.87
  11.90         271.61          182.41          143.57          178.77
  16.78         346.55          222.82          212.17          260.62

I am really impressed by the two open-source drivers: Beignet and POCL.

Cheers,

-- 
Jérôme Kieffer <intel at terre-adelie.org>

