Regression on linux-next (next-20240228)

Borah, Chaitanya Kumar chaitanya.kumar.borah at intel.com
Wed Mar 6 16:32:26 UTC 2024


Hello Matthew,

> -----Original Message-----
> From: Matthew Wilcox <willy at infradead.org>
> Sent: Tuesday, March 5, 2024 11:19 PM
> To: Borah, Chaitanya Kumar <chaitanya.kumar.borah at intel.com>
> Cc: intel-gfx at lists.freedesktop.org; Kurmi, Suresh Kumar
> <suresh.kumar.kurmi at intel.com>; Saarinen, Jani <jani.saarinen at intel.com>
> Subject: Re: Regression on linux-next (next-20240228)
> 
> On Tue, Mar 05, 2024 at 06:49:16AM +0000, Borah, Chaitanya Kumar wrote:
> > Issue is still seen with the following changes
> >
> > void put_pages_list(struct list_head *pages)
> >
> >         folio_batch_init(&fbatch);
> >         list_for_each_entry(folio, pages, lru) {
> > -               if (!folio_put_testzero(folio))
> > +               if (!folio_put_testzero(folio)) {
> > +                       list_del(&folio->lru);
> >                         continue;
> > +               }
> >                 if (folio_test_large(folio)) {
> >                         __folio_put_large(folio);
> > +                       list_del(&folio->lru);
> >                         continue;
> >                 }
> 
> Thanks for testing.  Sorry about this.  I think I figured out what the problem
> actually is.  I switched from list_for_each_entry_safe() to list_for_each_entry()
> since I was no longer deleting the entries from the list.  Unfortunately, I was
> still freeing the entries as I walked the list!  So it would dereference
> folio->lru.next after giving folio back to the page allocator (which probably
> put it on the PCP list, where it would point to another free folio?)
> 
> Anyway, this should do the job, without the change I asked you to test above.
> If this doesn't do the job by itself, you could try combining the two changes,
> but I don't think that will be necessary.
> 
> diff --git a/mm/swap.c b/mm/swap.c
> index a910af21ba68..1d4b7713605d 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -139,10 +139,10 @@ EXPORT_SYMBOL(__folio_put);
>  void put_pages_list(struct list_head *pages)
>  {
>  	struct folio_batch fbatch;
> -	struct folio *folio;
> +	struct folio *folio, *next;
> 
>  	folio_batch_init(&fbatch);
> -	list_for_each_entry(folio, pages, lru) {
> +	list_for_each_entry_safe(folio, next, pages, lru) {
>  		if (!folio_put_testzero(folio))
>  			continue;
>  		if (folio_test_hugetlb(folio)) {

The following change works for us.

void put_pages_list(struct list_head *pages)
{
        struct folio_batch fbatch;
-       struct folio *folio;
+       struct folio *folio, *next;
 
        folio_batch_init(&fbatch);
-       list_for_each_entry(folio, pages, lru) {
+       list_for_each_entry_safe(folio, next, pages, lru) {
                if (!folio_put_testzero(folio))
                        continue;
                if (folio_test_large(folio)) {
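
For anyone following the thread, here is a minimal userspace sketch of why the _safe iterator matters when entries are freed during the walk. It is not the kernel code: struct item, demo_for_each_entry_safe() and the demo_* helpers are hypothetical stand-ins written for illustration, and the macro takes an explicit type argument instead of relying on typeof. The point it shows is that the successor is read before the loop body runs, so the body may free the current entry without the iterator touching freed memory afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Same idea as list_for_each_entry_safe(): 'n' caches the successor
 * before the body runs, so the body is allowed to free 'pos'.
 */
#define demo_for_each_entry_safe(pos, n, head, type, member)           \
	for (pos = demo_container_of((head)->next, type, member),       \
	     n = demo_container_of(pos->member.next, type, member);     \
	     &pos->member != (head);                                    \
	     pos = n, n = demo_container_of(n->member.next, type, member))

/* Hypothetical payload; 'lru' plays the role of folio->lru. */
struct item {
	int id;
	struct list_head lru;
};

static void demo_list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void demo_list_add_tail(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Free every entry while walking the list; returns how many were freed. */
int demo_free_all(struct list_head *head)
{
	struct item *it, *next;
	int freed = 0;

	demo_for_each_entry_safe(it, next, head, struct item, lru) {
		/* 'next' was loaded already, so it->lru is not read after this. */
		free(it);
		freed++;
	}
	demo_list_init(head);	/* the old links now point at freed memory */
	return freed;
}

/* Build a list of 'count' entries, then free them all during one walk. */
int demo_build_and_free(int count)
{
	struct list_head pages;

	demo_list_init(&pages);
	for (int i = 0; i < count; i++) {
		struct item *it = malloc(sizeof(*it));

		assert(it != NULL);
		it->id = i;
		demo_list_add_tail(&it->lru, &pages);
	}
	return demo_free_all(&pages);
}
```

With a plain (non-safe) iterator, the step to the next entry would read it->lru.next after free(it), which is the same use-after-free pattern described above.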

Thank you for the resolution. When can we expect a patch?

Regards

Chaitanya

