[PATCH v2 00/25] AMDKFD kernel driver

Jerome Glisse j.glisse at gmail.com
Mon Jul 21 08:58:52 PDT 2014


On Mon, Jul 21, 2014 at 05:25:11PM +0200, Daniel Vetter wrote:
> On Mon, Jul 21, 2014 at 03:39:09PM +0200, Christian König wrote:
> > Am 21.07.2014 14:36, schrieb Oded Gabbay:
> > >On 20/07/14 20:46, Jerome Glisse wrote:
> > >>On Thu, Jul 17, 2014 at 04:57:25PM +0300, Oded Gabbay wrote:
> > >>>Forgot to cc mailing list on cover letter. Sorry.
> > >>>
> > >>>As a continuation of the existing discussion, here is a v2 patch
> > >>>series, restructured with a cleaner history and without the
> > >>>totally-different early versions of the code.
> > >>>
> > >>>Instead of 83 patches, there are now a total of 25 patches, where
> > >>>5 of them are modifications to the radeon driver and 18 of them
> > >>>contain only amdkfd code. No code goes away or is even modified
> > >>>between patches; code is only added.
> > >>>
> > >>>The driver was renamed from radeon_kfd to amdkfd and moved to
> > >>>reside under drm/radeon/amdkfd. This move was done to emphasize
> > >>>the fact that this driver is an AMD-only driver at this point.
> > >>>Having said that, we do foresee a generic hsa framework being
> > >>>implemented in the future and, in that case, we will adjust amdkfd
> > >>>to work within that framework.
> > >>>
> > >>>As the amdkfd driver should support multiple AMD gfx drivers, we
> > >>>want to keep it as a separate driver from radeon. Therefore, the
> > >>>amdkfd code is contained in its own folder. The amdkfd folder was
> > >>>put under the radeon folder because the only AMD gfx driver in the
> > >>>Linux kernel at this point is the radeon driver. Having said that,
> > >>>we will probably need to move it (maybe to directly under drm)
> > >>>after we integrate with additional AMD gfx drivers.
> > >>>
> > >>>For people who like to review using git, the v2 patch set is located at:
> > >>>http://cgit.freedesktop.org/~gabbayo/linux/log/?h=kfd-next-3.17-v2
> > >>>
> > >>>Written by Oded Gabbay <oded.gabbay at amd.com>
> > >>
> > >>So, quick comments before I finish going over all the patches.
> > >>There are many things that need more documentation, especially as
> > >>right now there is no userspace I can go look at.
> > >So, quick comments on some of your questions, but first of all,
> > >thanks for the time you dedicated to reviewing the code.
> > >>
> > >>There are a few show stoppers. The biggest one is gpu memory
> > >>pinning; that is a big no, and it would need serious arguments for
> > >>any hope of convincing me on that side.
> > >We only do gpu memory pinning for kernel objects. There are no
> > >userspace objects that are pinned in gpu memory in our driver. Given
> > >that, is it still a show stopper?
> > >
> > >The kernel objects are:
> > >- pipelines (4 per device)
> > >- mqd per hiq (only 1 per device)
> > >- mqd per userspace queue. On KV, we support up to 1K queues per
> > >process, for a total of 512K queues. Each mqd is 151 bytes, but the
> > >allocation is done with 256-byte alignment, so the total *possible*
> > >memory is 128MB (multiplied out in the sketch below this list)
> > >- kernel queue (only 1 per device)
> > >- fence address for the kernel queue
> > >- runlists for the CP (1 or 2 per device)
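> > >
> > >For reference, the 128MB figure is just the worst case multiplied
> > >out. A back-of-envelope sketch, not actual driver code (the names
> > >are made up):
> > >
> > >	#define KFD_MAX_PROCESSES	512	/* implied by 512K total / 1K per process */
> > >	#define KFD_QUEUES_PER_PROCESS	1024	/* 1K queues per process on KV */
> > >	#define KFD_MQD_SLOT_SIZE	256	/* mqd is 151 bytes, 256-byte aligned */
> > >
> > >	/* 512 * 1024 * 256 = 134217728 bytes = 128MB */
> > >	unsigned long mqd_worst_case = (unsigned long)KFD_MAX_PROCESSES *
> > >		KFD_QUEUES_PER_PROCESS * KFD_MQD_SLOT_SIZE;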
> > 
> > The main questions here are whether it's avoidable to pin down the
> > memory, and whether the memory is pinned down at driver load, by
> > request from userspace, or by anything else.
> > 
> > As far as I can see, only the "mqd per userspace queue" might be a
> > bit questionable; everything else sounds reasonable.
> 
> Aside, the i915 perspective again (i.e. how we solved this): when
> scheduling away from contexts, we unpin them and put them into the
> lru. And in the shrinker we have a last-ditch callback that switches
> to a default context (since you can't ever have no context once you've
> started), which means we can evict any context object if it's getting
> in the way.
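> 
> In rough pseudo-C (a hand-waved sketch only; none of these names are
> the real i915 functions):
> 
> 	/* Last-ditch shrinker path: the current context's object can
> 	 * only be freed once the ring stops using it, so switch to the
> 	 * default context first, then unpin and put it on the lru. */
> 	static int evict_hw_context(struct ring *ring, struct hw_ctx *ctx)
> 	{
> 		if (ring->current_ctx == ctx) {
> 			int ret = do_switch(ring, ring->default_ctx);
> 			if (ret)
> 				return ret;
> 		}
> 		unpin(ctx->obj);
> 		list_move_tail(&ctx->obj->lru_link, &ring->lru);
> 		return 0;
> 	}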

So Intel hardware reports, through some interrupt or some channel, when
it is not using a context? I.e., the kernel side gets a notification
when some user context is done executing?

The issue with radeon hardware, AFAICT, is that the hardware does not
report anything about the userspace context that is running, i.e. you
do not get a notification when a context is no longer in use. Well,
AFAICT; maybe the hardware does provide that.

For instance, VMIDs are a limited resource, so you have to bind them
dynamically. Maybe we could allocate a pinned buffer only per VMID, and
then, when binding a PASID to a VMID, also copy the pinned buffer back
to the PASID's unpinned copy.
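
Very roughly, something like this (all names invented for the sketch;
SLOT_SIZE stands for whatever per-context state has to stay resident):

	#include <string.h>

	#define SLOT_SIZE 256

	struct pasid_state {
		int id;
		void *unpinned_copy;		/* evictable backing store */
	};

	struct vmid_slot {
		void *pinned_buf;		/* only pinned buffer, one per VMID */
		struct pasid_state *owner;	/* NULL if the VMID is free */
	};

	/* Rebinding a VMID spills the old owner's state back to its
	 * unpinned copy, then loads the new PASID's state into the
	 * pinned slot. */
	static void bind_pasid_to_vmid(struct vmid_slot *slot,
				       struct pasid_state *p)
	{
		if (slot->owner)
			memcpy(slot->owner->unpinned_copy,
			       slot->pinned_buf, SLOT_SIZE);
		memcpy(slot->pinned_buf, p->unpinned_copy, SLOT_SIZE);
		slot->owner = p;
	}

That way only a handful of small buffers, one per VMID, stay pinned,
instead of one per userspace queue.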

Cheers,
Jérôme

> 
> We must do that since the contexts have to be in global gtt, which is
> shared for scanouts. So fragmenting that badly with lots of context
> objects and other stuff is a no-go, since that means we'll start to fail
> pageflips.
> 
> I don't know whether ttm has a ready-made concept for such
> opportunistically pinned stuff. I guess you could wire up the "switch to
> dflt context" action to the evict/move function if ttm wants to get rid of
> the currently used hw context.
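> 
> Sketched out, that might look something like this (hypothetical code;
> I haven't checked which hook ttm actually offers for it):
> 
> 	/* Called when ttm decides to evict/move a buffer; if it backs
> 	 * the live hw context, switch to the default context first so
> 	 * the buffer is idle and safe to move away. */
> 	static void hw_ctx_evict_notify(struct my_bo *bo)
> 	{
> 		struct hw_ctx *ctx = bo->hw_ctx;
> 
> 		if (ctx && ctx->ring->current_ctx == ctx)
> 			do_switch(ctx->ring, ctx->ring->default_ctx);
> 	}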
> 
> Oh and: This is another reason for letting the kernel schedule contexts,
> since you can't do this defrag trick if the gpu does all the scheduling
> itself.
> -Daniel
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
