[Mesa-dev] [PATCH] anv: don't crash when creating a huge image

Jason Ekstrand jason at jlekstrand.net
Fri Nov 10 00:34:20 UTC 2017


On Thu, Nov 9, 2017 at 4:23 PM, Chad Versace <chadversary at chromium.org>
wrote:

> On Wed 08 Nov 2017, Jason Ekstrand wrote:
> > On Wed, Nov 8, 2017 at 1:34 AM, Samuel Iglesias Gonsálvez <[1]
> > siglesias at igalia.com> wrote:
> >
> >     The HW has some limits but, according to the spec, we can create
> >     the image as it has not yet any memory backing it. When we allocate
> >     that memory, then we fail following what Vulkan spec, "10.2. Device
> >     Memory" says when talking about vkAllocateMemory():
> >
> >     "Some platforms may have a limit on the maximum size of a single
> >      allocation. For example, certain systems may fail to create
> >      allocations with a size greater than or equal to 4GB. Such a limit
> is
> >      implementation-dependent, and if such a failure occurs then the
> error
> >      VK_ERROR_OUT_OF_DEVICE_MEMORY must be returned."
> >
> >     Fixes the crashes on BDW for the following tests:
> >
> >     dEQP-VK.pipeline.render_to_image.core.2d_array.huge.*
> >     dEQP-VK.pipeline.render_to_image.core.cube_array.huge.*
> >
> >     Signed-off-by: Samuel Iglesias Gonsálvez <[2]siglesias at igalia.com>
> >     ---
> >
> >     Jason, I was tempted to move this piece of code to
> anv_AllocateMemory()
> >     but then I found the kernel relocation limitation of 32-bit... Is
> that
> >     limitation still applicable? Or was it from the BDW age and we forgot
> >     to update that limitation for gen9+?
> >
> >
> > We're still using relocations on all hardware so it applies to everything
> > today.  One of my 2018 projects is to fix that and get rid of
> relocations on
> > gen8+.
> >
> >
> >     Sam
> >
> >      src/intel/isl/isl.c | 22 ----------------------
> >      1 file changed, 22 deletions(-)
> >
> >     diff --git a/src/intel/isl/isl.c b/src/intel/isl/isl.c
> >     index 59f512fc050..aaadcbaf991 100644
> >     --- a/src/intel/isl/isl.c
> >     +++ b/src/intel/isl/isl.c
> >     @@ -1472,28 +1472,6 @@ isl_surf_init_s(const struct isl_device *dev,
> >            base_alignment = MAX(info->min_alignment, tile_size);
> >         }
> >
> >     -   if (ISL_DEV_GEN(dev) < 9) {
> >     -      /* From the Broadwell PRM Vol 5, Surface Layout:
> >     -       *
> >     -       *    "In addition to restrictions on maximum height, width,
> and
> >     depth,
> >     -       *     surfaces are also restricted to a maximum size in
> bytes. This
> >     -       *     maximum is 2 GB for all products and all surface
> types."
> >     -       *
> >     -       * This comment is applicable to all Pre-gen9 platforms.
> >     -       */
> >     -      if (size > (uint64_t) 1 << 31)
> >     -         return false;
> >     -   } else {
> >     -      /* From the Skylake PRM Vol 5, Maximum Surface Size in Bytes:
> >     -       *    "In addition to restrictions on maximum height, width,
> and
> >     depth,
> >     -       *     surfaces are also restricted to a maximum size of 2^38
> bytes.
> >     -       *     All pixels within the surface must be contained within
> 2^38
> >     bytes
> >     -       *     of the base address."
> >     -       */
> >     -      if (size > (uint64_t) 1 << 38)
> >     -         return false;
> >     -   }
>
> I think it very unwise to delete code that enforces requirements defined
> by the hardware spec. Deleting the code doesn't make the hardware
> requirements go away :)
>
> > I'm not sure how I feel about removing this from ISL.  There are really
> two
> > limitations going on here.  One is a limitation imposed by relocations,
> and the
> > other is some sort of fundamental hardware surface size limitation.  Most
> > likely, the surface size limitation has to do with how many bits they
> use for
> > image address computations in the hardware.  Most likely, on gen8, they
> do all
> > of the internal calculations in 32 bits and only convert to 48 at the
> end when
> > they need to add it to Surface Base Address.
> >
> > If my understanding is correct then we will still have this limitation
> on gen8
> > even after we get rid of relocations and remove the BO size limitation.
> I see
> > a couple of options, neither of which I like very much:
> >
> >  1) Take something like this patch and then keep the BO size limitation
> on BDW
> > to 2GiB when we get rid of relocations even though it's artificial.
> >  2) "Gracefully" handle isl_surf_init failure by doing a debug_log and
> > succeeding but setting the image size (that will be returned by
> > vkGetImageMemoryRequirements) to UINT64_MAX so that the client won't
> ever be
> > able to find memory for it.
> >
> > My feeling is that 1) above is probably the better of the two as 2)
> seems to be
> > a twisting of the spec.  That said, I would like to keep the restriction
> in ISL
> > somehow and we need to make sure it still gets applied in GL.
>
> I dislike both. I originally designed isl to mimic the VkImage API, so
> let's continue that trend.
>
>   Option 3) Change isl_surf_init() to return a meaningful result code:
>
>             ISL_SUCCESS = 0
>             ISL_ERROR_SOMETHING_SOMETHING_THE_USUAL_FAILURES = -1
>             ISL_ERROR_SURFACE_SIZE_TOO_LARGE = -2
>
> I like option 3 because it avoids secret implicit contracts between isl
> and anvil, and thus avoids hidden hacks.
>

I mostly agree but that still doesn't answer the question of what do we do
with that return code?  We can't fail the vkCreateImage because our only
options there are the two OOM errors both of which are lies.  The only real
option we have is to go ahead and create the VkImage but give it a size
that doesn't fit in any allocatable memory object.  So, we're back to
options 1 and 2...
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20171109/b25a3988/attachment-0001.html>


More information about the mesa-dev mailing list