[Intel-gfx] [PATCH v2 1/2] i965/gen9: Pass alignment as function parameter in drm_intel_gem_bo_alloc_internal()

Fri Jun 26 13:43:18 PDT 2015

On Thu, Jun 25, 2015 at 12:17 PM, Ben Widawsky <ben at bwidawsk.net> wrote:
> On Thu, Jun 25, 2015 at 07:11:21PM +0100, Chris Wilson wrote:
>> On Thu, Jun 25, 2015 at 11:01:44AM -0700, Ben Widawsky wrote:
>> > On Wed, Jun 24, 2015 at 08:28:13AM +0100, Chris Wilson wrote:
>> > > On Tue, Jun 23, 2015 at 04:44:52PM -0700, Anuj Phogat wrote:
>> > > > On Mon, Jun 22, 2015 at 1:04 PM, Chris Wilson <chris at chris-wilson.co.uk> wrote:
>> > > > > On Mon, Jun 22, 2015 at 09:51:08PM +0200, Daniel Vetter wrote:
>> > > > >> On Mon, Jun 22, 2015 at 11:47:02AM -0700, Anuj Phogat wrote:
>> > > > >> > and use it to initialize the align variable in drm_intel_bo.
>> > > > >> >
>> > > > >> > In case of YF/YS tiled buffers libdrm need not know about the tiling
>> > > > >> > format because these buffers don't have hardware support to be tiled
>> > > > >> > or detiled through a fenced region. But, libdrm still need to know
>> > > > >> > about buffer alignment restrictions because kernel uses it when
>> > > > >> > resolving the relocation.
>> > > > >> >
>> > > > >> > Mesa uses drm_intel_gem_bo_alloc_for_render() to allocate Yf/Ys buffers.
>> > > > >> > So, use the passed alignment value in this function. Note that we continue
>> > > > >> > ignoring the alignment value passed to drm_intel_gem_bo_alloc() to follow
>> > > > >> > the previous behavior.
>> > > > >> >
>> > > > >> > V2: Add a condition to avoid allocation from cache. (Ben)
>> > > > >>
>> > > > >> This will hurt badly since some programs love to cycle through textures.
>> > > > >> You can still allocate from the cache, you only need to update the
>> > > > >> alignement constraint. The kernel will move the buffer on the next execbuf
>> > > > >> if it's misplaced.
>> > > > >
>> > > > > For llc, using fresh pages just puts memory and aperture pressure (plus
>> > > > > a small amount of interface pressure) on the system by allocating more bo.
>> > > > >
>> > > > > For !llc, it is far better to move an object in the GTT to match a
>> > > > > change in alignment than it is to allocate fresh pages (and deallocate
>> > > > > stale pages).
>> > > > Could you please explain this and point me to what you want to be
>> > > > changed in this patch?
>> > >
>> > > You can reuse the cached bo if the alignment mismatches. You can
>> > > experiment with searching for a better match, but it's unlikely to be
>> > > useful (numbers talk here).
>> >
>> > There's no "better" match, there's only match or no match. It seems pretty
>> > intuitive to me that trying to find a match would be better though, I'm
>> > curious why you think it's unlikely. Even though we don't specifically get the
>> > alignment back from the kernel after the relocation, we can certainly use the
>> > presumed offset as a guess when trying to find a buffer in the cache.
>>
>> I was thinking in terms of cycles saved, the cost of walking the cache
>> for an exact match vs returning the first available buffer + the cost of
>> any relocation, and whether there would be a measurable difference. I
>> would have thought two patches, do the naive search that doesn't require
>> any changes to the cached allocation strategy then a second
>> demonstrating the perf improvement from being a bit smarter in buffer
>> reuse would have been your ideal anyway.
>> -Chris
>
> That's exactly what I suggested Anuj do :-)
Thanks for helping me to find the right fix. I'll send out an updated patch
doing the naive search with passed alignment and a new patch adding
up the alignment in aperture size.
>>
>> --
>> Chris Wilson, Intel Open Source Technology Centre