[PATCH] mm/hugetlb: Don't crash when allocating a folio if there are no resv
Kasireddy, Vivek
vivek.kasireddy at intel.com
Thu Jun 19 05:30:52 UTC 2025
Hi Andrew, Anshuman,
> Subject: Re: [PATCH] mm/hugetlb: Don't crash when allocating a folio if there
> are no resv
>
> On Wed, 18 Jun 2025 12:14:49 +0530 Anshuman Khandual
> <anshuman.khandual at arm.com> wrote:
>
> > > Therefore, prevent the above crash by replacing the VM_BUG_ON()
> > > with WARN_ON_ONCE() as there is no need to crash the system in
> > > this situation and instead we could just warn and fail the
> > > allocation.
> >
> > Why there are no reserved huge pages in such situations and also how
> > likely this might happen ? Is it recoverable ?
As described in the commit message above, this happens specifically when we
try to pin memfd folios before they are faulted in. Although this is a valid
thing to do, it is not the regular or common use-case. Let me explain this
further with the following scenarios:
1) hugetlbfs_file_mmap()
    memfd_alloc_folio()
    hugetlb_fault()

2) memfd_alloc_folio()
    hugetlbfs_file_mmap()
    hugetlb_fault()

3) hugetlbfs_file_mmap()
    hugetlb_fault()
    alloc_hugetlb_folio()

Scenario 3) is the most common use-case: a memfd is allocated first, followed
by mmap() and user writes/updates, and only then are the relevant folios pinned
(memfd_pin_folios()). The BUG this patch fixes occurs in 2), because we try to
pin the folios before hugetlbfs_file_mmap() is called. In this situation we
have to allocate the folios before pinning them, but since no reservations
were made, resv_huge_pages is 0, which triggers the VM_BUG_ON().
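
To make the accounting easier to follow, here is a small userspace toy model
of the situation. It is only a sketch and not kernel code: struct toy_hstate,
toy_reserve() and toy_alloc_reserved() are hypothetical names invented for
this illustration, and the model assumes that the reservation is made on the
mmap path, as in scenarios 1) and 3). It shows an allocation that finds no
reservation warning once and failing instead of crashing.

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for the per-hstate counters; NOT the kernel's struct hstate. */
struct toy_hstate {
    unsigned long free_huge_pages;
    unsigned long resv_huge_pages;
};

/*
 * Models a reservation being made (assumed here to happen on the mmap
 * path). Fails if every free page is already reserved.
 */
bool toy_reserve(struct toy_hstate *h)
{
    if (h->resv_huge_pages >= h->free_huge_pages)
        return false;
    h->resv_huge_pages++;
    return true;
}

/*
 * Models allocating one folio against an existing reservation.
 * With no reservation, warn once and fail the allocation; the counters
 * are left untouched so the caller can recover.
 */
bool toy_alloc_reserved(struct toy_hstate *h)
{
    static bool warned; /* emulates "warn only once" behaviour */

    if (h->free_huge_pages == 0)
        return false;

    if (h->resv_huge_pages == 0) {
        if (!warned) {
            warned = true;
            fprintf(stderr, "toy warning: allocation without reservation\n");
        }
        return false;
    }

    h->free_huge_pages--;
    h->resv_huge_pages--;
    return true;
}
```

In this model, calling toy_alloc_reserved() on a pool with free pages but
resv_huge_pages == 0 corresponds to scenario 2) and simply fails, while
calling toy_reserve() first corresponds to scenarios 1) and 3) and the
allocation succeeds.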
>
> I'm suspecting we don't know.
>
> > >
> > > Fixes: 26a8ea80929c ("mm/hugetlb: fix memfd_pin_folios
> resv_huge_pages leak")
>
> How was this arrived at? This is merely the patch which added the assertion.
Right, 26a8ea80929c is indeed the commit that introduced the code that leads
to this BUG/crash. Would this not qualify for a Fixes: tag?
>
> > > Reported-by: syzbot+a504cb5bae4fe117ba94 at syzkaller.appspotmail.com
> > > Closes: https://syzkaller.appspot.com/bug?extid=a504cb5bae4fe117ba94
>
> I can't find any mailing report/discussion of this. The Closes: takes
> us to the syzkaller report which is a bit of a dead end.
My understanding is that the Closes: tag can be associated with a URL for
a public bug tracker like syzkaller. Would the following be a better Closes:
link?
https://lore.kernel.org/all/677928b5.050a0220.3b53b0.004d.GAE@google.com/T/
>
> I agree with the patch - converting a BUG into a WARN+recover is a good
> thing but as far as I can tell, we don't know what's causing this
> situation.
>
> syzkaller has a C reproducer, if anyone is feeling brave.
The udmabuf selftest added in patch #3 of the other series can also reproduce
this issue and is a lot simpler.
Thanks,
Vivek