[PATCH 00/83] AMD HSA kernel driver

Daniel Vetter daniel at ffwll.ch
Thu Jul 24 00:16:01 PDT 2014


On Wed, Jul 23, 2014 at 04:57:48PM -0700, Jesse Barnes wrote:
> On Sun, 13 Jul 2014 12:40:32 -0400
> j.glisse at gmail.com (Jerome Glisse) wrote:
> 
> > On Sun, Jul 13, 2014 at 11:42:58AM +0200, Daniel Vetter wrote:
> > > On Sat, Jul 12, 2014 at 6:49 PM, Jerome Glisse <j.glisse at gmail.com> wrote:
> > > >> Hm, so the hsa part is a completely new driver/subsystem, not just an
> > > >> additional ioctl tacked onto radeon? The history of drm is littered with
> > > >> "generic" ioctls that turned out to be useful for exactly one driver.
> > > >> Which is why _all_ the command submission is now done with driver-private
> > > >> ioctls.
> > > >>
> > > >> I'd be quite a bit surprised if that suddenly works differently, so before
> > > >> we bless a generic hsa interface I really want to see some implementation
> > > >> from a different vendor (i.e. nvidia or intel) using the same ioctls.
> > > >> Otherwise we just repeat history and I'm not terribly inclined to keep on
> > > >> cleaning up cruft forever - one drm legacy is enough ;-)
> > > >>
> > > >> Jesse is the guy from our side to talk to about this.
> > > >> -Daniel
> > > >
> > > > I am not worried about that side; the HSA foundation has pretty strict
> > > > guidelines on what counts as HSA-compliant hardware, i.e. the hw needs to
> > > > understand the PM4 packet format of radeon (well, a small subset of it). But
> > > > of course this requires HSA-compliant hardware, and from the member list I am
> > > > guessing ARM Mali, ImgTech, Qualcomm, ... so unless Intel and NVidia join HSA
> > > > you will not see it for their hardware.
> > > >
> > > > So yes, for once the same ioctls would apply to different hardware. The only
> > > > thing that differs is the shader ISA. The hsafoundation site has some PDFs
> > > > explaining all that, but someone thought that slideshare would be a good idea;
> > > > personally I would not register with any website just to get the PDF.
> > > >
> > > > So to sum up, I am ok with having a new device file that presents a uniform
> > > > set of ioctls. It would actually be a lot easier for userspace: just open this
> > > > fixed device file and ask for the list of compliant hardware.
> > > >
> > > > Then the radeon kernel driver would register itself as a provider, so all the
> > > > ioctl decoding and marshalling would be shared, which makes sense.
> > > 
> > > There's also the other side namely that preparing the cp ring in
> > > userspace and submitting the entire pile through a doorbell to the hw
> > > scheduler isn't really hsa exclusive. And for a solid platform with
> > > seamless gpu/cpu integration that means we need standard ways to set
> > > gpu context priorities and get at useful stats like gpu time used by a
> > > given context.
> > > 
> > > To get there I guess intel/nvidia need to reuse the hsa subsystem with
> > > the command submission adjusted a bit. Kinda like drm, where kms and
> > > buffer sharing are common and cs is driver-specific.
> > 
> > The HSA module would be for HSA-compliant hardware, and thus hardware would
> > need to follow the HSA specification, which again is pretty clear on what
> > the hardware needs to provide. So if Intel and NVidia want to join HSA,
> > I am sure they would be welcome; the more the merrier :)
> > 
> > So I would not block the HSA kernel ioctl design in order to please non-HSA
> > hardware, especially since at this point neither Intel nor NVidia can
> > share anything concrete on the design and how these things should be set up
> > for their hardware.
> > 
> > When Intel or NVidia present their own API, they should provide their
> > own set of ioctls through their own platform.
> 
> Yeah things are different enough that a uniform ioctl doesn't make
> sense.  If/when all the vendors decide on a single standard, we can use
> that, but until then I don't see a nice way to share our doorbell &
> submission scheme with HSA, and I assume nvidia is the same.
> 
> Using HSA as a basis for non-HSA systems seems like it would add a lot
> of complexity, since non-HSA hardware would have to intercept the queue
> writes and manage the submission requests etc as bytecodes in the
> kernel driver, or maybe as a shim layer library that wraps that stuff.
> 
> Probably not worth the effort given that the command sets themselves
> are all custom as well, driven by specific user level drivers like GL,
> CL, and libva.

Well I know that - drm also has the split between shared management stuff
like prime and driver private cmd submission. I still think that some
common interfaces would benefit us. I want things like a gputop (and also
perf counters and all that) to work the same way with the same tooling on
all svm/gpgpu stuff. So a shared namespace/create for svm contexts or
something like that.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch