[RFC PATCH 0/8] Qualcomm Cloud AI 100 driver

Daniel Vetter daniel at ffwll.ch
Wed May 20 08:34:19 UTC 2020


On Wed, May 20, 2020 at 7:15 AM Greg Kroah-Hartman
<gregkh at linuxfoundation.org> wrote:
>
> On Tue, May 19, 2020 at 10:41:15PM +0200, Daniel Vetter wrote:
> > On Tue, May 19, 2020 at 07:41:20PM +0200, Greg Kroah-Hartman wrote:
> > > On Tue, May 19, 2020 at 08:57:38AM -0600, Jeffrey Hugo wrote:
> > > > On 5/18/2020 11:08 PM, Dave Airlie wrote:
> > > > > On Fri, 15 May 2020 at 00:12, Jeffrey Hugo <jhugo at codeaurora.org> wrote:
> > > > > >
> > > > > > Introduction:
> > > > > > Qualcomm Cloud AI 100 is a PCIe adapter card which contains a dedicated
> > > > > > > SoC ASIC for the purpose of efficiently running Deep Learning inference
> > > > > > workloads in a data center environment.
> > > > > >
> > > > > > > The official press release can be found at -
> > > > > > https://www.qualcomm.com/news/releases/2019/04/09/qualcomm-brings-power-efficient-artificial-intelligence-inference
> > > > > >
> > > > > > > The official product website is -
> > > > > > https://www.qualcomm.com/products/datacenter-artificial-intelligence
> > > > > >
> > > > > > > At the time of the official press release, numerous technology news sites
> > > > > > also covered the product.  Doing a search of your favorite site is likely
> > > > > > to find their coverage of it.
> > > > > >
> > > > > > It is our goal to have the kernel driver for the product fully upstream.
> > > > > > The purpose of this RFC is to start that process.  We are still doing
> > > > > > > development (see below), and thus not looking to gain acceptance quite
> > > > > > > yet, but now that we have a working driver we believe we are at the stage
> > > > > > where meaningful conversation with the community can occur.
> > > > >
> > > > >
> > > > > > Hi Jeffrey,
> > > > >
> > > > > > Just wondering what the userspace/testing plans are for this driver.
> > > > >
> > > > > This introduces a new user facing API for a device without pointers to
> > > > > users or tests for that API.
> > > >
> > > > We have daily internal testing, although I don't expect you to take my word
> > > > for that.
> > > >
> > > > I would like to get one of these devices into the hands of Linaro, so that
> > > > it can be put into KernelCI.  Similar to other Qualcomm products. I'm trying
> > > > to convince the powers that be to make this happen.
> > > >
> > > > Regarding what the community could do on its own, everything but the Linux
> > > > driver is considered proprietary - that includes the on device firmware and
> > > > the entire userspace stack.  This is a decision above my pay grade.
> > >
> > > Ok, that's a decision you are going to have to push upward on, as we
> > > really can't take this without a working, open, userspace.
> >
> > Uh wut.
> >
> > So the merge criteria for drivers/accel (atm still drivers/misc but I
> > thought that was interim until more drivers showed up) isn't actually
> > "totally-not-a-gpu accel driver without open source userspace".
> >
> > Instead it's "totally-not-a-gpu accel driver without open source
> > userspace" _and_ you have to be best buddies with Greg. Or at least
> > not be on the naughty company list. Since for habanalabs all you
> > wanted is a few test cases to exercise the ioctls. Not the entire
> > userspace.
>
> Also, to be fair, I have changed my mind after seeing the mess of
> complexity that the "ioctls for everyone!" type of pass-through in
> these kinds of drivers is creating.  You were right, we need open
> userspace code in order to be able to properly evaluate and figure out
> whether what they are doing is right or not, and be able to maintain
> things over time correctly.
>
> So I was wrong, and you were right, my apologies for my previous
> stubbornness.

Awesome and don't worry, I'm pretty sure we've all been stubborn
occasionally :-)

From a drivers/gpu pov I think still not quite there since we also
want to see the compiler for these programmable accelerator thingies.
But just having a fairly good consensus that "userspace library with
all the runtime stuff excluding compiler must be open" is a huge step
forward. Next step may be that we (kernel overall, drivers/gpu will
still ask for the full thing) have ISA docs for these programmable
things, so that we can also evaluate that aspect and gauge how many
security issues there might be. Plus have a fighting chance to fix up
the security leaks when (post smeltdown I don't really want to
consider this an if) someone finds a hole in the hw security wall. At
least in drivers/gpu we historically have a ton of drivers with
command checkers to validate what userspace wants to run on the
accelerator thingie. Both in cases where the hw was accidentally too
strict, and not strict enough.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

