[Intel-gfx] [PATCH 1/5] shmemfs: Report ENOMEM for page allocation failures outside of tmpfs faults
Hugh Dickins
hughd at google.com
Tue Mar 11 11:09:48 CET 2014
On Thu, 6 Mar 2014, Chris Wilson wrote:
> On Wed, Mar 05, 2014 at 07:14:49PM -0800, Hugh Dickins wrote:
> > On Wed, 5 Mar 2014, Chris Wilson wrote:
> >
> > > The intention of returning ENOSPC for a page allocation failure due to
> > > memory exhaustion in shmem_getpage_gfp() is purely "so that a failure
> > > on a sparse tmpfs mapping will give SIGBUS not OOM." However, for other
> > > callers, for example i915.ko, we want to distinguish the error message
> > > reported to userspace between ENOSPC (meaning that we were unable to fit
> > > the object/execbuffer into the aperture) and ENOMEM (meaning that we
> > > were unable to allocate pages for the object).
> > >
> > > Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> > > Cc: Hugh Dickins <hughd at google.com>
> >
> > I'm not keen on this: perhaps because it looks like a hack of yours -
> > and draws attention to what might be thought a hack of mine ;)
> >
> > Nor am I thrilled by what was there before (we have three cases which
> > ought to be distinguished, but only ENOMEM and ENOSPC to distinguish
> > between them); but would rather it stay as is, than change what's
> > reported to the user after all these years.
> >
> > But I do see your point, and asking you to convert ENOSPC to ENOMEM
> > in all your drivers/gpu calls might be tiresome.
> >
> > I think we have a reasonable compromise: shmem_read_mapping_page_gfp()
> > is already the wrapper provided just for you guys, how about posting
> > a patch to map ENOSPC to ENOMEM there?
> >
> > (You're using the MS_KERNMOUNT, so won't hit a max_blocks limit.)
>
> Actually, I only need to do the conversion along a single error path. So if
> you are happy that the only reason shmem_get_page() would report
> -ENOSPC would be for a hard memory failure, it seems permissible to
> simply fix it up in our driver:
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index f2efed9955bd..cfc4fcd90b6b 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2099,7 +2099,19 @@ err_pages:
> page_cache_release(sg_page_iter_page(&sg_iter));
> sg_free_table(st);
> kfree(st);
> - return PTR_ERR(page);
> +
> + /* shmemfs first checks if there is enough memory to allocate the page
> + * and reports ENOSPC should there be insufficient, along with the usual
> + * ENOMEM for a genuine allocation failure.
> + *
> + * We use ENOSPC in our driver to mean that we have run out of GTT
> + * space and so want to translate the error from shmemfs back to our
> + * usual understanding of ENOMEM.
> + */
> + if (PTR_ERR(page) == -ENOSPC)
> + return -ENOMEM;
> + else
> + return PTR_ERR(page);
> }
Yes, that's fine by me: it's your territory, and shmem is not suddenly
going to use ENOSPC for something else altogether. I assumed you were
wanting to do it once for all the GPU drivers, but if the others don't
care at present, yes, do it here for i915 - we can always move the
fixup to shmem_read_mapping_page_gfp() later on if there's a need.
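For reference, that fixup would only be a couple of lines in
shmem_read_mapping_page_gfp() - an untested sketch, with the BUG_ON and the
CONFIG_SHMEM=n branch of the real function omitted:

	struct page *shmem_read_mapping_page_gfp(struct address_space *mapping,
						 pgoff_t index, gfp_t gfp)
	{
		struct inode *inode = mapping->host;
		struct page *page;
		int error;

		error = shmem_getpage_gfp(inode, index, &page, SGP_CACHE, gfp, NULL);
		if (error == -ENOSPC)
			error = -ENOMEM;	/* hide tmpfs accounting from drivers */
		if (error)
			page = ERR_PTR(error);
		else
			unlock_page(page);
		return page;
	}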
>
> /* Ensure that the associated pages are gathered from the backing storage
>
> Whether the other drivers (gma500, msm, omapdrm, udl, armada) care about
> the strict distinction between ENOSPC and ENOMEM, I do not know.
>
> Would it be possible for us to use sbinfo->max_blocks to reject large
> objects in drm_gem_object_init()?
No, not without using your own separate shm mount: shmem_file_setup()
uses the internal shm_mnt (used for anon shared mmappings and SysV SHM),
and (rightly or wrongly) we have never imposed a max_blocks limit on
that mount.
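To spell out the three cases: the allocation path in shmem_getpage_gfp()
goes roughly like this (abridged from mm/shmem.c, not a patch):

	if (shmem_acct_block(info->flags)) {	/* VM_NORESERVE: !vm_enough_memory */
		error = -ENOSPC;
		goto failed;
	}
	if (sbinfo->max_blocks) {		/* tmpfs size= limit: 0 on shm_mnt */
		if (percpu_counter_compare(&sbinfo->used_blocks,
					   sbinfo->max_blocks) >= 0) {
			error = -ENOSPC;
			goto unacct;
		}
		percpu_counter_inc(&sbinfo->used_blocks);
	}

	page = shmem_alloc_page(gfp, info, index);
	if (!page) {
		error = -ENOMEM;		/* genuine allocation failure */
		goto decused;
	}

so with VM_NORESERVE on the internal mount, where max_blocks is never set,
the only ENOSPC you can see from here is the !vm_enough_memory case.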
But your question does suggest one possibility. drm_gem.c is doing
the shmem_file_setup() with the VM_NORESERVE flag, which gives you the
behaviour of tmpfs files (page by page accounting, perhaps ENOSPC
if !vm_enough_memory for a page) rather than SysV SHM accounting
(succeed or fail according to the initial shmem_file_setup size,
without page by page accounting thereafter).
You might want to consider whether the SHM-style accounting would
suit you better: I don't know what dictated the choice of VM_NORESERVE
originally, and I don't know if it would be a good idea to change over
or not - it would be a terrible idea if you use large sparse objects.
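If you did want to try the SHM-style accounting, the change would be
confined to drm_gem_object_init() - roughly this, untested, just dropping
VM_NORESERVE from the shmem_file_setup() call:

	int drm_gem_object_init(struct drm_device *dev,
				struct drm_gem_object *obj, size_t size)
	{
		struct file *filp;

		drm_gem_private_object_init(dev, obj, size);

		/*
		 * Without VM_NORESERVE the whole size is charged up front,
		 * so an oversized object fails here rather than hitting
		 * ENOSPC page by page later on - but every object is then
		 * accounted in full, sparse or not.
		 */
		filp = shmem_file_setup("drm mm object", size, 0);
		if (IS_ERR(filp))
			return PTR_ERR(filp);

		obj->filp = filp;

		return 0;
	}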
Hugh