[PATCH] mm/hugetlb: Don't crash when allocating a folio if there are no resv

Wed Jun 25 14:18:28 UTC 2025

Hi Andrew,

> Subject: Re: [PATCH] mm/hugetlb: Don't crash when allocating a folio if
> there are no resv
> 
> On Thu, 19 Jun 2025 05:30:52 +0000 "Kasireddy, Vivek"
> <vivek.kasireddy at intel.com> wrote:
> 
> > Hi Andrew, Anshuman,
> >
> > > Subject: Re: [PATCH] mm/hugetlb: Don't crash when allocating a folio if
> > > there are no resv
> > >
> > > On Wed, 18 Jun 2025 12:14:49 +0530 Anshuman Khandual
> > > <anshuman.khandual at arm.com> wrote:
> > >
> > > > > Therefore, prevent the above crash by replacing the VM_BUG_ON()
> > > > > with WARN_ON_ONCE() as there is no need to crash the system in
> > > > > this situation and instead we could just warn and fail the
> > > > > allocation.
> > > >
> > > > Why are there no reserved huge pages in such situations, and how
> > > > likely is this to happen? Is it recoverable?
> > As described in the commit message above, the specific situation where this
> > happens is when we try to pin memfd folios before they are faulted-in.
> > Although this is a valid thing to do, it is not the regular or common
> > use-case. Let me explain this further with the following scenarios:
> > 1) hugetlbfs_file_mmap()
> >     memfd_alloc_folio()
> >     hugetlb_fault()
> >
> > 2) memfd_alloc_folio()
> >     hugetlbfs_file_mmap()
> >     hugetlb_fault()
> >
> > 3) hugetlbfs_file_mmap()
> >     hugetlb_fault()
> >         alloc_hugetlb_folio()
> >
> > 3) is the most common use-case, where a memfd is first allocated, followed
> > by mmap() and user writes/updates, and then the relevant folios are pinned
> > (memfd_pin_folios()). The BUG this patch is fixing occurs in 2) because we
> > try to pin the folios before hugetlbfs_file_mmap() is called. So, in this
> > situation we try to allocate the folios before pinning them, but since we did
> > not make any reservations, resv_huge_pages would be 0, leading to this issue.
> 
> Cool, thanks, I'll paste that into the changelog ;)
> 
> So if this code path is rare but expected and normal, should we be
> emitting this warning at all?
I think it would be OK to drop the warning. Otherwise, Syzbot would continue
to flag this issue.

Thanks,
Vivek

> 
> > > I can't find any mailing list report/discussion of this.  The Closes: takes
> > > us to the syzkaller report which is a bit of a dead end.
> > My understanding is that the Closes tag can be associated with a URL for
> > a public bugtracker like Syzkaller. Would the following be a better Closes
> > link:
> >
> > https://lore.kernel.org/all/677928b5.050a0220.3b53b0.004d.GAE@google.com/T/
> 
> I'll add that - the more the merrier.
> 
> > >
> > > I agree with the patch - converting a BUG into a WARN+recover is a good
> > > thing but as far as I can tell, we don't know what's causing this
> > > situation.
> > >
> > > syzkaller has a C reproducer, if anyone is feeling brave.
> > The udmabuf selftest added in patch #3 of the other series can also reproduce
> > this issue and is a lot simpler.
> >
> > Thanks,
> > Vivek
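
For readers following the thread, the reservation accounting behind
scenario 2) can be illustrated with a small userspace toy model. This is
only a sketch and not kernel code: struct toy_hstate, toy_mmap_reserve()
and toy_alloc_from_resv() are hypothetical names invented for this
illustration, and the model assumes that the allocation path simply fails
when resv_huge_pages is 0, which is the "fail the allocation without
crashing" behaviour discussed above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in (hypothetical) for the two pool counters discussed above. */
struct toy_hstate {
    long free_huge_pages;   /* pages currently sitting in the pool */
    long resv_huge_pages;   /* pages promised to existing mappings */
};

/*
 * Hypothetical model of the mmap step: reserve npages up front.
 * Fails if the pool cannot back the new reservation.
 */
static bool toy_mmap_reserve(struct toy_hstate *h, long npages)
{
    if (h->free_huge_pages - h->resv_huge_pages < npages)
        return false;
    h->resv_huge_pages += npages;
    return true;
}

/*
 * Hypothetical model of an allocation that consumes one reservation.
 * With no reservation it returns false and leaves the counters alone,
 * instead of asserting that a reservation must exist.
 */
static bool toy_alloc_from_resv(struct toy_hstate *h)
{
    /* Sanity check of the model itself: never more reserved than free. */
    assert(h->free_huge_pages >= h->resv_huge_pages);

    if (h->resv_huge_pages == 0)
        return false;           /* scenario 2): allocate before any mmap */

    h->resv_huge_pages--;
    h->free_huge_pages--;
    return true;                /* reservation existed: hand out a page */
}
```

In this model, calling toy_alloc_from_resv() on a pool that has free pages
but no reservations (the ordering in scenario 2) fails cleanly, while
calling toy_mmap_reserve() first makes the same allocation succeed.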