[PATCH v1 2/2] mm: remove extra ZONE_DEVICE struct page refcount

Joao Martins joao.m.martins at oracle.com
Tue Oct 19 15:13:34 UTC 2021


On 10/19/21 00:06, Jason Gunthorpe wrote:
> On Mon, Oct 18, 2021 at 12:37:30PM -0700, Dan Williams wrote:
> 
>>> device-dax uses PUD, along with TTM, they are the only places. I'm not
>>> sure TTM is a real place though.
>>
>> I was setting device-dax aside because it can use Joao's changes to
>> get compound-page support.
> 
> Ideally, but the ideas in that patch series have been floating around
> for a long time now..
>  
As it stands, the series is still missing an Rb on patches 6, 7, 10 and 12-14.
Patch 8 should also drop its tag now, considering the latest
discussion.

If it helps move things forward, I could split my series further into:

1) the compound page introduction (patches 1-7) of my aforementioned series
2) vmemmap deduplication for memory gains (patches 9-14)
3) gup improvements (patch 8 and gup-slow improvements)

The reason being that item 1) is the main dependency listed below, and
it allows 2) and 3) to be parallelized. FWIW, it is almost fully reviewed
by Dan (as of v3->v4). For (1), patches 6 & 7 change the
device-dax subsystem (drivers/dax/*) and still need his Ack.

>>> Here I imagine the thing that creates the pgmap would specify the
>>> policy it wants. In most cases the policy is tightly coupled to what
>>> the free function in the provided dev_pagemap_ops does..
>>
>> The thing that creates the pgmap is the device-driver, and
>> device-driver does not implement truncate or reclaim. It's not until
>> the FS mounts that the pgmap needs to start enforcing pin lifetime
>> guarantees.
> 
> I am explaining this wrong, the immediate need is really 'should
> foll_longterm fail fast-gup to the slow path' and something like the
> nvdimm driver can just set that to 1 and rely on VMA flags to control
> what the slow path does - as is today.
> 
> It is not as elegant as more flags in the pgmap, but it would get the
> job done with minimal fuss.
> 
> Might be nice to either rely fully on VMA flags or fully on pgmap
> holder flags for FOLL_LONGTERM?
>

What's the benefit of preventing longterm pins from the start
versus only after mounting the filesystem? Or is the intended future purpose
to pass more context into a potential future holder callback, e.g. to nack
longterm pins on a per-page basis?

Maybe we can start by at least not adding any flags and just preventing
FOLL_LONGTERM on fsdax -- which I guess was the original purpose of
commit 7af75561e171 ("mm/gup: add FOLL_LONGTERM capability to GUP fast").
This patch (which I can formally send) has a sketch of that (below the scissors mark):

https://lore.kernel.org/linux-mm/6a18179e-65f7-367d-89a9-d5162f10fef0@oracle.com/

It uses pgmap->type rather than adding further fields into pgmap, given this
restriction applies only to fsdax.

... and then we could improve devmap_longterm_available(pgmap) to look at
holder::flags or pgmap::flags, should we decide that an explicit flag is required
from the holder/pgmap .. as a further improvement?
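
To make the idea concrete, here is a minimal standalone (userspace) model of a
type-based check. It is not the contents of the linked patch: the reduced
struct dev_pagemap, the placeholder FOLL_LONGTERM value, the NULL-pgmap
handling and the gup_fast_may_pin() caller are all assumptions made for
illustration. Only the helper name devmap_longterm_available() and the use of
pgmap->type come from the text above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in mirroring the names in the kernel's enum memory_type. */
enum memory_type {
	MEMORY_DEVICE_PRIVATE = 1,
	MEMORY_DEVICE_FS_DAX,
	MEMORY_DEVICE_GENERIC,
	MEMORY_DEVICE_PCI_P2PDMA,
};

/* Reduced mock of struct dev_pagemap: only the field this check reads. */
struct dev_pagemap {
	enum memory_type type;
};

/* Placeholder bit for this model; the real value lives in the kernel headers. */
#define FOLL_LONGTERM 0x10000u

/*
 * Assumed behaviour: longterm pins are refused only for fsdax pagemaps.
 * Treating a NULL pgmap as "no restriction" is also an assumption.
 */
static bool devmap_longterm_available(const struct dev_pagemap *pgmap)
{
	if (!pgmap)
		return true;

	return pgmap->type != MEMORY_DEVICE_FS_DAX;
}

/*
 * Hypothetical fast-path caller: returning false means "do not pin here,
 * fall back to the slow path", which then decides based on VMA flags.
 */
static bool gup_fast_may_pin(const struct dev_pagemap *pgmap,
			     unsigned int gup_flags)
{
	if ((gup_flags & FOLL_LONGTERM) && !devmap_longterm_available(pgmap))
		return false;

	return true;
}
```

In this model only the body of devmap_longterm_available() would need to
change if an explicit holder or pgmap flag were introduced later; callers such
as the hypothetical gup_fast_may_pin() would stay as they are.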

	Joao

