[Mesa-dev] [PATCH v2 1/2] anv: Add support for 48-bit addresses
Jason Ekstrand
jason at jlekstrand.net
Wed Mar 29 15:36:36 UTC 2017
On Wed, Mar 29, 2017 at 1:51 AM, Chris Wilson <chris at chris-wilson.co.uk>
wrote:
> On Tue, Mar 28, 2017 at 05:41:12PM -0700, Jason Ekstrand wrote:
> > This commit adds support for using the full 48-bit address space on
> > Broadwell and newer hardware. Thanks to certain limitations, not all
> > objects can be placed above the 32-bit boundary. In particular, general
> > and state base address need to live within 32 bits. (See also
> > Wa32bitGeneralStateOffset and Wa32bitInstructionBaseOffset.) In order
> > to handle this, we add a supports_48bit_address field to anv_bo and only
> > set EXEC_OBJECT_SUPPORTS_48B_ADDRESS if that bit is set. We set the bit
> > for all client-allocated memory objects but leave it false for
> > driver-allocated objects. While this is more conservative than needed,
> > all driver allocations should easily fit in the first 32 bits of address
> > space and keeps things simple because we don't have to think about
> > whether or not any given one of our allocation data structures will be
> > used in a 48-bit-unsafe way.
> > ---
> > static VkResult
> > -anv_execbuf_add_bo(struct anv_execbuf *exec,
> > +anv_execbuf_add_bo(struct anv_device *device,
> > + struct anv_execbuf *exec,
> > struct anv_bo *bo,
> > struct anv_reloc_list *relocs,
> > const VkAllocationCallbacks *alloc)
> > @@ -1039,6 +1040,10 @@ anv_execbuf_add_bo(struct anv_execbuf *exec,
> > obj->flags = bo->is_winsys_bo ? EXEC_OBJECT_WRITE : 0;
> > obj->rsvd1 = 0;
> > obj->rsvd2 = 0;
> > +
> > + if (device->instance->physicalDevice.supports_48bit_addresses &&
> > + bo->supports_48bit_address)
> > + obj->flags |= EXEC_OBJECT_SUPPORTS_48B_ADDRESS;
>
> Don't set bo->supports_48bit_address when
> !device->instance->physicalDevice.supports_48bit_addresses? My guess is
> that flagging bo is a rarer task than add_bo(), and it looks like you
> already have device available in the callers of bo_init(true).
>
I thought a bout making that change right before I sent it but decided I
marginally liked this better. I'm happy to change it if you'd like.
> > }
> >
> > if (relocs != NULL && obj->relocation_count == 0) {
> > @@ -1052,7 +1057,7 @@ anv_execbuf_add_bo(struct anv_execbuf *exec,
> > for (size_t i = 0; i < relocs->num_relocs; i++) {
> > /* A quick sanity check on relocations */
> > assert(relocs->relocs[i].offset < bo->size);
> > - anv_execbuf_add_bo(exec, relocs->reloc_bos[i], NULL, alloc);
> > + anv_execbuf_add_bo(device, exec, relocs->reloc_bos[i], NULL,
> alloc);
> > }
> > }
> >
> > @@ -1264,7 +1269,8 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
> > adjust_relocations_from_state_pool(ss_pool,
> &cmd_buffer->surface_relocs,
> > cmd_buffer->last_ss_pool_center);
> > VkResult result =
> > - anv_execbuf_add_bo(&execbuf, &ss_pool->bo,
> &cmd_buffer->surface_relocs,
> > + anv_execbuf_add_bo(device, &execbuf, &ss_pool->bo,
> > + &cmd_buffer->surface_relocs,
> > &cmd_buffer->pool->alloc);
> > if (result != VK_SUCCESS)
> > return result;
> > @@ -1277,7 +1283,7 @@ anv_cmd_buffer_execbuf(struct anv_device *device,
> > adjust_relocations_to_state_pool(ss_pool, &(*bbo)->bo,
> &(*bbo)->relocs,
> > cmd_buffer->last_ss_pool_
> center);
> >
> > - anv_execbuf_add_bo(&execbuf, &(*bbo)->bo, &(*bbo)->relocs,
> > + anv_execbuf_add_bo(device, &execbuf, &(*bbo)->bo, &(*bbo)->relocs,
> > &cmd_buffer->pool->alloc);
> > }
> >
> > diff --git a/src/intel/vulkan/anv_device.c
> b/src/intel/vulkan/anv_device.c
> > index 4e4fa19..f9d04ee 100644
> > --- a/src/intel/vulkan/anv_device.c
> > +++ b/src/intel/vulkan/anv_device.c
> > @@ -149,6 +149,8 @@ anv_physical_device_init(struct anv_physical_device
> *device,
> > goto fail;
> > }
> >
> > + device->supports_48bit_addresses = anv_gem_supports_48b_
> addresses(fd);
> > +
> > if (!anv_device_get_cache_uuid(device->uuid)) {
> > result = vk_errorf(VK_ERROR_INITIALIZATION_FAILED,
> > "cannot generate UUID");
> > @@ -1396,7 +1398,7 @@ anv_bo_init_new(struct anv_bo *bo, struct
> anv_device *device, uint64_t size)
> > if (!gem_handle)
> > return vk_error(VK_ERROR_OUT_OF_DEVICE_MEMORY);
> >
> > - anv_bo_init(bo, gem_handle, size);
> > + anv_bo_init(bo, gem_handle, size, true);
> >
> > return VK_SUCCESS;
> > }
> > diff --git a/src/intel/vulkan/anv_gem.c b/src/intel/vulkan/anv_gem.c
> > index 0dde6d9..3d45243 100644
> > --- a/src/intel/vulkan/anv_gem.c
> > +++ b/src/intel/vulkan/anv_gem.c
> > @@ -301,6 +301,24 @@ anv_gem_get_aperture(int fd, uint64_t *size)
> > return 0;
> > }
> >
> > +bool
> > +anv_gem_supports_48b_addresses(int fd)
> > +{
> > + struct drm_i915_gem_exec_object2 obj = {
> > + .flags = EXEC_OBJECT_SUPPORTS_48B_ADDRESS,
> > + };
> > +
> > + struct drm_i915_gem_execbuffer2 execbuf = {
> > + .buffers_ptr = (uintptr_t)&obj,
> > + .buffer_count = 1,
> > + .rsvd1 = 0xffffffu,
> > + };
> > +
> > + int ret = anv_ioctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, &execbuf);
> > +
> > + return ret == -1 && errno == ENOENT;
> > +}
> > +
> > int
> > anv_gem_handle_to_fd(struct anv_device *device, uint32_t gem_handle)
> > {
> > diff --git a/src/intel/vulkan/anv_intel.c b/src/intel/vulkan/anv_intel.c
> > index c356e84..d97fbf7 100644
> > --- a/src/intel/vulkan/anv_intel.c
> > +++ b/src/intel/vulkan/anv_intel.c
> > @@ -57,7 +57,7 @@ VkResult anv_CreateDmaBufImageINTEL(
> >
> > uint64_t size = (uint64_t)pCreateInfo->strideInBytes *
> pCreateInfo->extent.height;
> >
> > - anv_bo_init(&mem->bo, gem_handle, size);
> > + anv_bo_init(&mem->bo, gem_handle, size, true);
> >
> > anv_image_create(_device,
> > &(struct anv_image_create_info) {
> > diff --git a/src/intel/vulkan/anv_private.h b/src/intel/vulkan/anv_
> private.h
> > index 27c887c..425e376 100644
> > --- a/src/intel/vulkan/anv_private.h
> > +++ b/src/intel/vulkan/anv_private.h
> > @@ -299,11 +299,34 @@ struct anv_bo {
> > * writing to them and synchronize uses on other rings (eg if the
> display
> > * server uses the blitter ring).
> > */
> > - bool is_winsys_bo;
> > + bool is_winsys_bo:1;
> > +
> > + /* Whether or not this BO supports having a 48-bit address. Not all
> > + * buffers support arbitrary 48-bit addresses. In particular, we
> need to
> > + * be careful with general and instruction state buffers because we
> set the
> > + * size in STATE_BASE_ADDRESS to 0xfffff (the maximum) even though
> the BO
> > + * is most likely significantly smaller. If we let the kernel place
> it
> > + * anywhere it wants, it will default to placing it as high up the
> address
> > + * space as possible, the range specified by STATE_BASE_ADDRESS will
> > + * over-flow the 48-bit address range, and the GPU will hang. In
> order to
> > + * avoid this problem, we tell the kernel that the buffer does not
> support
> > + * 48-bit addresses, and it places the buffer at a 32-bit address.
> While
> > + * this solution is probably overkill, it is effective.
>
> How about just setting the field to the bo->size? You must know the bo
> already at that point so that you can set the relocation target.
>
Actually, we don't. We have a pointer to a thing that claims to be a BO
but the actual GEM handle and size aren't known until execbuf time. (Yes,
that's a bit weird but there are good reasons for it and it's not likely to
change. When we stop doing relocations, there's a separate plan for how to
handle that.)
> Or you could allocate the sparse trtt in the high range and the kernel
> will then never be able to place objects high enough to overflow. I
> presume the overflow of the bo base + limit into the trtt will not cause
> similar issues...
>
As far as I can tell, overflowing the GTT is the only issue. However, I
can't claim my testing was exhaustive.
--Jason
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170329/5a12b56b/attachment-0001.html>
More information about the mesa-dev
mailing list