[Mesa-dev] [PATCH 2/3] radeon/compute: Implement PIPE_COMPUTE_CAP_MAX_COMPUTE_UNITS

Bruno Jimenez brunojimen at gmail.com
Thu Jun 5 11:39:29 PDT 2014


On Tue, 2014-06-03 at 08:55 -0400, Alex Deucher wrote:
> On Mon, Jun 2, 2014 at 7:34 PM, Bruno Jimenez <brunojimen at gmail.com> wrote:
> > On Mon, 2014-06-02 at 16:16 -0400, Alex Deucher wrote:
> >> On Sat, May 31, 2014 at 7:13 AM, Bruno Jimenez <brunojimen at gmail.com> wrote:
> >> > On Fri, 2014-05-30 at 19:33 -0400, Alex Deucher wrote:
> >> >> On Fri, May 30, 2014 at 11:31 AM, Bruno Jiménez <brunojimen at gmail.com> wrote:
> >> >> > The data has been extracted from:
> >> >> > AMD Accelerated Parallel Processing OpenCL Programming Guide (rev 2.7)
> >> >> > Appendix D: Device Parameters
> >> >>
> >> >> You should add a query for the number of compute units to the
> >> >> RADEON_INFO ioctl and then just ask the kernel how many CUs/SIMDs the
> >> >> hw has.  This will properly handle all boards (harvest, etc.) since we
> >> >> can read the actual number of CUs off the GPU.
> >> >>
> >> >> Alex
> >> >
> >> > Hi,
> >> >
> >> > At first I tried to do so (as for the maximum clock frequency), but I
> >> > couldn't find how to query that value, nor many docs about what I could
> >> > ask the kernel for.
> >> >
> >> > I think I have now found the appropriate docs, and I will try again to
> >> > query the kernel later.
> >>
> >> You'd need to add a new query.  It doesn't look like we expose this
> >> yet.  The attached untested patch should mostly do the trick.
> >>
> >> Alex
> >>
> >
> > Honestly, I would never have been able to come up with this. I
> > tried querying for MAX_PIPES, MAX_SE and MAX_SH_PER_SE (only for SI) and
> > multiplying them together. It did work for my little CEDAR, but
> > getting a 2 is easy. Looking at what it would return for other cards,
> > it didn't look so good.
> >
> > Should I try this patch on top of kernel 3.14.4, or should I use another
> > version?

With a couple of changes, it applied cleanly to 3.14.5 (Arch's stable
kernel). With the attached patch as #2 of my series, I get the correct
number of compute units for my CEDAR.

However, I don't know how or where I should add this new query parameter,
given that it hasn't been added to the kernel yet. For now I have
hardcoded its value, '0x20'.
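
One possible way to cope with kernels that lack the query is to ask the
kernel first and fall back to the per-family table from the original
patch only when the request is rejected. The sketch below is an
illustration under assumptions, not the attached patch: every
`sketch_*` / `SKETCH_*` name is hypothetical, the callback stands in for
whatever winsys helper issues the RADEON_INFO ioctl, and only the value
0x20 comes from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name: the thread only gives the raw request value 0x20. */
#define SKETCH_INFO_ACTIVE_CU_COUNT 0x20

/* Hypothetical stand-in for the winsys helper that issues the RADEON_INFO
 * ioctl. Returns 0 on success, non-zero if the kernel rejects the request. */
typedef int (*sketch_info_query)(unsigned request, uint32_t *value);

/* Ask the kernel first; use the per-family table value only as a fallback. */
static uint32_t sketch_max_compute_units(sketch_info_query query,
                                         uint32_t table_value)
{
    uint32_t cus = 0;

    if (query != NULL &&
        query(SKETCH_INFO_ACTIVE_CU_COUNT, &cus) == 0 && cus > 0)
        return cus;

    return table_value;
}

/* Fake kernel that knows the new request and reports 2 CUs (a CEDAR). */
static int sketch_kernel_with_query(unsigned request, uint32_t *value)
{
    assert(value != NULL);
    if (request != SKETCH_INFO_ACTIVE_CU_COUNT)
        return -1;
    *value = 2;
    return 0;
}

/* Fake kernel that predates the request and rejects it. */
static int sketch_kernel_without_query(unsigned request, uint32_t *value)
{
    (void)request;
    (void)value;
    return -1;
}
```

With the first fake, `sketch_max_compute_units(sketch_kernel_with_query, 4)`
returns the kernel's answer; with the second, the table value passed in
is returned unchanged.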

Thanks for everything, Alex!
Bruno


> It was against Dave's drm-next, but it may apply to 3.14 as well.
> 
> Alex
> 
> >
> > Thanks in advance and sorry for any inconvenience.
> > Bruno
> >
> >>
> >> >
> >> > Sorry for any inconvenience.
> >> > Bruno
> >> >
> >> >>
> >> >> > ---
> >> >> >  src/gallium/drivers/radeon/r600_pipe_common.c | 90 +++++++++++++++++++++++++++
> >> >> >  1 file changed, 90 insertions(+)
> >> >> >
> >> >> > diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c b/src/gallium/drivers/radeon/r600_pipe_common.c
> >> >> > index 70c4d1a..c4abacd 100644
> >> >> > --- a/src/gallium/drivers/radeon/r600_pipe_common.c
> >> >> > +++ b/src/gallium/drivers/radeon/r600_pipe_common.c
> >> >> > @@ -422,6 +422,89 @@ const char *r600_get_llvm_processor_name(enum radeon_family family)
> >> >> >         }
> >> >> >  }
> >> >> >
> >> >> > +static uint32_t radeon_max_compute_units(enum radeon_family family)
> >> >> > +{
> >> >> > +       switch (family) {
> >> >> > +       case CHIP_CEDAR:
> >> >> > +               return 2;
> >> >> > +
> >> >> > +       /* Redwood PRO2: 4
> >> >> > +        * Redwood PRO:  5
> >> >> > +        * Redwood XT:   5 */
> >> >> > +       case CHIP_REDWOOD:
> >> >> > +               return 4;
> >> >> > +
> >> >> > +       /* Juniper LE:  9
> >> >> > +        * Juniper XT: 10 */
> >> >> > +       case CHIP_JUNIPER:
> >> >> > +               return 9;
> >> >> > +
> >> >> > +       /* Cypress LE:  14
> >> >> > +        * Cypress PRO: 18
> >> >> > +        * Cypress XT:  20 */
> >> >> > +       case CHIP_CYPRESS:
> >> >> > +               return 14;
> >> >> > +
> >> >> > +       case CHIP_HEMLOCK:
> >> >> > +               return 40;
> >> >> > +
> >> >> > +       /* XXX: is Zacate really equal to Ontario?
> >> >> > +        * Zacate E-350: 2
> >> >> > +        * Zacate E-240: 2
> >> >> > +        * Ontario C-50: 2
> >> >> > +        * Ontario C-30: 2 */
> >> >> > +       case CHIP_PALM:
> >> >> > +               return 2;
> >> >> > +
> >> >> > +       /* Caicos:      2
> >> >> > +        * Seymour LP:  2
> >> >> > +        * Seymour PRO: 2
> >> >> > +        * Seymour XT:  2
> >> >> > +        * Seymour XTX: 2 */
> >> >> > +       case CHIP_CAICOS:
> >> >> > +               return 2;
> >> >> > +
> >> >> > +       /* Turks PRO:    6
> >> >> > +        * Turks XT:     6
> >> >> > +        * Whistler LP:  6
> >> >> > +        * Whistler PRO: 6
> >> >> > +        * Whistler XT:  6 */
> >> >> > +       case CHIP_TURKS:
> >> >> > +               return 6;
> >> >> > +
> >> >> > +       /* Barts LE:      10
> >> >> > +        * Barts PRO:     12
> >> >> > +        * Barts XT:      14
> >> >> > +        * Blackcomb PRO: 12 */
> >> >> > +       case CHIP_BARTS:
> >> >> > +               return 10;
> >> >> > +
> >> >> > +       /* Cayman PRO: 22
> >> >> > +        * Cayman XT:  24
> >> >> > +        * Cayman Gemini: 48 */
> >> >> > +       case CHIP_CAYMAN:
> >> >> > +               return 22;
> >> >> > +
> >> >> > +       /* Verde PRO:  8
> >> >> > +        * Verde XT:  10 */
> >> >> > +       case CHIP_VERDE:
> >> >> > +               return 8;
> >> >> > +
> >> >> > +       /* Pitcairn PRO: 16
> >> >> > +        * Pitcairn XT:  20 */
> >> >> > +       case CHIP_PITCAIRN:
> >> >> > +               return 16;
> >> >> > +
> >> >> > +       /* Tahiti PRO: 28
> >> >> > +        * Tahiti XT:  32 */
> >> >> > +       case CHIP_TAHITI:
> >> >> > +               return 28;
> >> >> > +
> >> >> > +       default:
> >> >> > +               return 1;
> >> >> > +       }
> >> >> > +}
> >> >> > +
> >> >> >  static int r600_get_compute_param(struct pipe_screen *screen,
> >> >> >          enum pipe_compute_cap param,
> >> >> >          void *ret)
> >> >> > @@ -519,6 +602,13 @@ static int r600_get_compute_param(struct pipe_screen *screen,
> >> >> >                 }
> >> >> >                 return sizeof(uint32_t);
> >> >> >
> >> >> > +       case PIPE_COMPUTE_CAP_MAX_COMPUTE_UNITS:
> >> >> > +               if (ret) {
> >> >> > +                       uint32_t *max_compute_units = ret;
> >> >> > +                       *max_compute_units = radeon_max_compute_units(rscreen->family);
> >> >> > +               }
> >> >> > +               return sizeof(uint32_t);
> >> >> > +
> >> >> >         default:
> >> >> >                 fprintf(stderr, "unknown PIPE_COMPUTE_CAP %d\n", param);
> >> >> >                 return 0;
> >> >> > --
> >> >> > 1.9.3
> >> >> >
> >> >
> >> >
> >
> >

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-radeon-compute-Implement-PIPE_COMPUTE_CAP_MAX_COMPUT.patch
Type: text/x-patch
Size: 2483 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20140605/fc91a1d6/attachment-0001.bin>

