<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
Let's do this in fdinfo.<br>
<br>
This way we can easily extend whatever the kernel wants to display
as statistics in the userspace HUD.<br>
<br>
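A rough sketch of why fdinfo is easy to extend: it is plain
"key: value" text, so a reader can pick the keys it knows and ignore
the rest. The key name amd-evicted-vram and the sample values below
are assumptions for illustration, not taken from the patch.<br>
<pre>
```python
# Sketch only: parse fdinfo-style "key: value" text into a dict.
# The key "amd-evicted-vram" is a hypothetical name for illustration.
UNITS = {"KiB": 1024, "MiB": 1024 * 1024}


def parse_fdinfo(text):
    """Return a dict of the 'key: value' lines found in text."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a key/value line, skip it
        stats[key.strip()] = value.strip()
    return stats


def to_bytes(value):
    """Convert '512 KiB' style values to bytes; bare numbers are bytes."""
    parts = value.split()
    number = int(parts[0])
    if len(parts) == 2:
        number *= UNITS[parts[1]]
    return number


# Made-up sample content, shaped like /proc/PID/fdinfo/FD output.
SAMPLE = "\n".join([
    "pos:\t0",
    "flags:\t02100002",
    "drm-driver:\tamdgpu",
    "drm-memory-vram:\t2048 KiB",
    "amd-evicted-vram:\t512 KiB",
])

stats = parse_fdinfo(SAMPLE)
print(to_bytes(stats["amd-evicted-vram"]))
```
</pre>
A HUD written this way keeps working when the kernel adds new keys,
because unknown keys simply end up unused in the dict.<br>
<br>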
Regards,<br>
Christian.<br>
<br>
<div class="moz-cite-prefix">On 21.01.23 at 01:45, Marek Olšák
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAAxE2A6JcREmKKmh1n0xSgkOZq77kpnzC-27-srunLKduyAwiw@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">
<div>We badly need a way to query evicted memory usage. It's
essential for investigating performance problems and it
uncovered the buddy allocator disaster. Please either suggest
an alternative, suggest changes, or review. We need it ASAP.<br>
</div>
<div><br>
</div>
<div>Thanks,</div>
<div>Marek<br>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Tue, Jan 10, 2023 at 11:55
AM Marek Olšák &lt;<a href="mailto:maraeo@gmail.com"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">maraeo@gmail.com</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Tue, Jan 10, 2023 at
11:23 AM Christian König &lt;<a
href="mailto:ckoenig.leichtzumerken@gmail.com"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">ckoenig.leichtzumerken@gmail.com</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div> On 10.01.23 at 16:28, Marek Olšák wrote:<br>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Wed, Jan 4,
2023 at 9:51 AM Christian König &lt;<a
href="mailto:ckoenig.leichtzumerken@gmail.com"
target="_blank" moz-do-not-send="true"
class="moz-txt-link-freetext">ckoenig.leichtzumerken@gmail.com</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div> On 04.01.23 at 00:08, Marek
Olšák wrote:<br>
<blockquote type="cite">
<div dir="ltr">
<div>I see about the access now, but did
you even look at the patch?</div>
</div>
</blockquote>
<br>
I did look at the patch, but I haven't fully
understood yet what you are trying to do
here.<br>
</div>
</blockquote>
<div><br>
</div>
<div>First and foremost, it returns the evicted
size of VRAM and visible VRAM, and returns
visible VRAM usage. It should be obvious which
stat includes the size of another.<br>
</div>
<div><br>
</div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div> <br>
<blockquote type="cite">
<div dir="ltr">
<div> Because what the patch does isn't
even exposed to common drm code, such
as the preferred domain and visible
VRAM placement, so it can't be in
fdinfo right now.<br>
</div>
<div><br>
</div>
<div>Or do you even know what fdinfo
contains? Because it contains nothing
useful. It only has VRAM and GTT
usage, which we already have in the
INFO ioctl, so it has nothing that we
need. We mainly need the eviction
information and visible VRAM
information now. Everything else is a
bonus.<br>
</div>
</div>
</blockquote>
<br>
Well the main question is what are you
trying to get from that information? The
eviction list for example is completely
meaningless to userspace, that stuff is only
temporary and will be cleared on the next CS
again.<br>
</div>
</blockquote>
<div><br>
</div>
<div>I don't know what you mean. The returned
eviction stats look correct and are stable
(they don't change much). You can suggest
changes if you think some numbers are not
reported correctly.<br>
</div>
<div> </div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div> <br>
What we could expose is the VRAM overcommit
value, i.e. the total size of the BOs which were
supposed to be in VRAM but are in GTT now. I
think that's what you are looking for here,
right?<br>
</div>
</blockquote>
<div><br>
</div>
<div>The VRAM overcommit value is
"evicted_vram".<br>
</div>
<div> </div>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div> <br>
<blockquote type="cite">
<div dir="ltr">
<div>
<div>Also, it's undesirable to open
and parse a text file if we can just
call an ioctl.</div>
</div>
</div>
</blockquote>
<br>
Well I see the reasoning for that, but I
also see why other drivers do a lot of the
stuff we have as IOCTL as separate files in
sysfs, fdinfo or debugfs.<br>
<br>
Especially repeating all the static
information which was already available
under sysfs in the INFO IOCTL was a design
mistake as far as I can see. Just compare
what AMDGPU and the KFD code are doing to
what, for example, i915 is doing.<br>
<br>
Same for things like debug information about
a process. The fdinfo stuff can also be queried
from external tools (gdb, gputop, umr,
etc.), which makes that interface
preferable.<br>
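<br>
A minimal sketch of how an external tool could find the DRM
fdinfo entries of another process. It relies only on the
/proc/PID/fdinfo layout and the drm-driver key; the function
name and the proc_root parameter are assumptions made for
illustration.<br>
<pre>
```python
# Sketch only: collect fdinfo texts of a process that belong to DRM fds.
import os


def drm_fdinfo_texts(pid, proc_root="/proc"):
    """Return {fd_name: text} for fdinfo files that carry a drm-driver key."""
    fdinfo_dir = os.path.join(proc_root, str(pid), "fdinfo")
    found = {}
    for name in sorted(os.listdir(fdinfo_dir)):
        try:
            with open(os.path.join(fdinfo_dir, name)) as f:
                text = f.read()
        except OSError:
            continue  # the fd may have been closed in the meantime
        lines = text.splitlines()
        if any(line.startswith("drm-driver:") for line in lines):
            found[name] = text
    return found
```
</pre>
Reading another user's process this way usually needs
matching permissions, which is the access question raised
earlier in the thread.<br>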
</div>
</blockquote>
<div><br>
</div>
<div>Nothing uses fdinfo in Mesa. No driver uses
sysfs in Mesa except drm shims, noop drivers,
and Intel for perf metrics. sysfs itself is an
unusable mess for the PCIe query and is
missing information.</div>
<div><br>
</div>
<div>I'm not against exposing more stuff through
sysfs and fdinfo for tools, but I don't see
any reason why drivers should use it (other
than for slowing down queries and
initialization).</div>
</div>
</div>
</blockquote>
<br>
That's what I'm asking: Is this for some tool or to
make some driver decision based on it?<br>
<br>
If you just want the numbers for display, then
I think it would be better to put this into fdinfo
together with the other existing stuff there.<br>
</div>
</blockquote>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div> <br>
If you want to make allocation decisions based on this
then we should have that as an IOCTL or, even better, as an
mmap() page shared between kernel and userspace. But in this
case I would also calculate the numbers completely
differently.<br>
<br>
See we have at least the following things in the
kernel:<br>
1. The eviction list in the VM.<br>
Those are the BOs which are currently evicted and
which the kernel tries to move back in on the next CS.<br>
<br>
2. The VRAM overcommit value.<br>
In other words, how much more VRAM than is available
has the application tried to allocate?<br>
<br>
3. The visible VRAM usage by this application.<br>
<br>
The end goal is that the eviction list will go away,
i.e. we will always have stable allocations based on
allocations of other applications and not constantly
swap things in and out.<br>
<br>
When you now expose the eviction list to userspace we
will be stuck with this interface forever.<br>
</div>
</blockquote>
<div><br>
</div>
<div>It's for the GALLIUM HUD.</div>
<div><br>
</div>
<div>The only missing thing is the size of all evicted
VRAM allocations, and the size of all evicted visible
VRAM allocations.<br>
</div>
<div><br>
</div>
<div>1. No list is exposed. Only sums of buffer sizes are
exposed. Also, the eviction list has no meaning here.
All lists are treated equally, and mem_type is compared
with preferred_domains to determine where buffers are
and where they should be.<br>
</div>
<div><br>
</div>
<div>2. I'm not interested in the overcommit value. I'm
only interested in knowing the number of bytes of
evicted VRAM right now. It can be as variable as the CPU
load, but in practice it shouldn't be because PCIe
doesn't have the bandwidth to move things quickly.<br>
</div>
<div><br>
</div>
<div>3. Yes, that's true.</div>
<div><br>
</div>
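<div>The summing described in point 1 could be sketched roughly
as below. This is an illustration in Python, not the kernel
code: the Buffer class, the cpu_visible flag and the domain
names are assumptions, and only mem_type, preferred_domains,
and the evicted_vram stat name come from the thread.<br>
<pre>
```python
# Sketch only: sum sizes of buffers that prefer VRAM but live elsewhere.
from dataclasses import dataclass

VRAM = "vram"  # hypothetical domain names
GTT = "gtt"


@dataclass
class Buffer:
    size: int                     # bytes
    preferred_domains: frozenset  # where the buffer should be
    mem_type: str                 # where the buffer is right now
    cpu_visible: bool = False     # hypothetical: wants visible VRAM


def eviction_stats(buffers):
    """Return byte sums; no list of buffers is exposed."""
    stats = {"evicted_vram": 0, "evicted_visible_vram": 0}
    for bo in buffers:
        wants_vram = VRAM in bo.preferred_domains
        if wants_vram and bo.mem_type != VRAM:
            stats["evicted_vram"] += bo.size
            if bo.cpu_visible:
                # counted in both sums: evicted_vram includes this one
                stats["evicted_visible_vram"] += bo.size
    return stats
```
</pre>
Every buffer is treated the same regardless of which kernel
list it sits on; only its placement and preference matter.</div>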
<div>Marek</div>
<br>
</div>
</div>
</blockquote>
</div>
</blockquote>
<br>
</body>
</html>