<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<p style="font-family:Arial;font-size:11pt;color:#0078D7;margin:5pt;" align="Left">
[AMD Official Use Only - Internal Distribution Only]<br>
</p>
<br>
<div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
We had entertained the idea of exposing the processes as sysfs nodes, as you proposed, but we had concerns about exposing process info there, especially since /proc already exists for that purpose.</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
I think if you were to follow that approach, we could have tools like top(1) that expose GPU engine usage per process as well.</div>
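<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
A very rough sketch of the building block such a tool would need, assuming the fdinfo approach; the field names are driver-specific, so nothing in particular is assumed here, it just dumps whatever key/value pairs a known DRM client fd exposes next to the process name from procfs:</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
<br>
</div>
<div style="font-family: Calibri, Arial, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0); background-color: rgb(255, 255, 255);">
#!/usr/bin/env python3<br>
# Illustrative sketch only: print the fdinfo key/value pairs of a DRM client<br>
# fd alongside the owning process name. Field names are whatever the driver<br>
# chose to expose; none are assumed here.<br>
import sys<br>
<br>
def read_client(pid, fd):<br>
    with open(f"/proc/{pid}/comm") as f:<br>
        name = f.read().strip()<br>
    fields = {}<br>
    with open(f"/proc/{pid}/fdinfo/{fd}") as f:<br>
        for line in f:<br>
            key, sep, value = line.partition(":")<br>
            if sep:<br>
                fields[key.strip()] = value.strip()<br>
    return name, fields<br>
<br>
if __name__ == "__main__":<br>
    pid, fd = sys.argv[1], sys.argv[2]<br>
    name, fields = read_client(pid, fd)<br>
    print(pid, name)<br>
    for key, value in fields.items():<br>
        print(" ", key, "=", value)<br>
</div>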
<div id="appendonsend"></div>
<hr style="display:inline-block;width:98%" tabindex="-1">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" style="font-size:11pt" color="#000000"><b>From:</b> Alex Deucher <alexdeucher@gmail.com><br>
<b>Sent:</b> Thursday, May 13, 2021 10:58 PM<br>
<b>To:</b> Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>; Nieto, David M <David.Nieto@amd.com>; Koenig, Christian <Christian.Koenig@amd.com><br>
<b>Cc:</b> Intel Graphics Development <Intel-gfx@lists.freedesktop.org>; Maling list - DRI developers <dri-devel@lists.freedesktop.org>; Daniel Vetter <daniel@ffwll.ch><br>
<b>Subject:</b> Re: [PATCH 0/7] Per client engine busyness</font>
<div> </div>
</div>
<div class="BodyFragment"><font size="2"><span style="font-size:11pt;">
<div class="PlainText">+ David, Christian<br>
<br>
On Thu, May 13, 2021 at 12:41 PM Tvrtko Ursulin<br>
<tvrtko.ursulin@linux.intel.com> wrote:<br>
><br>
><br>
> Hi,<br>
><br>
> On 13/05/2021 16:48, Alex Deucher wrote:<br>
> > On Thu, May 13, 2021 at 7:00 AM Tvrtko Ursulin<br>
> > <tvrtko.ursulin@linux.intel.com> wrote:<br>
> >><br>
> >> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com><br>
> >><br>
> >> Resurrection of the previously merged per-client engine busyness patches. In a<br>
> >> nutshell, it enables intel_gpu_top to be more useful in a top(1)-like way, showing<br>
> >> not only physical GPU engine usage but a per-process view as well.<br>
> >><br>
> >> Example screen capture:<br>
> >> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br>
> >> intel-gpu-top - 906/ 955 MHz; 0% RC6; 5.30 Watts; 933 irqs/s<br>
> >><br>
> >> IMC reads: 4414 MiB/s<br>
> >> IMC writes: 3805 MiB/s<br>
> >><br>
> >> ENGINE BUSY MI_SEMA MI_WAIT<br>
> >> Render/3D/0 93.46% |████████████████████████████████▋ | 0% 0%<br>
> >> Blitter/0 0.00% | | 0% 0%<br>
> >> Video/0 0.00% | | 0% 0%<br>
> >> VideoEnhance/0 0.00% | | 0% 0%<br>
> >><br>
> >> PID NAME Render/3D Blitter Video VideoEnhance<br>
> >> 2733 neverball |██████▋ || || || |<br>
> >> 2047 Xorg |███▊ || || || |<br>
> >> 2737 glxgears |█▍ || || || |<br>
> >> 2128 xfwm4 | || || || |<br>
> >> 2047 Xorg | || || || |<br>
> >> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br>
> >><br>
> >> Internally we track time spent on engines for each struct intel_context, both<br>
> >> for current and past contexts belonging to each open DRM file.<br>
> >><br>
> >> This can serve as a building block for several features from the wanted list:<br>
> >> smarter scheduler decisions, getrusage(2)-like per-GEM-context functionality<br>
> >> wanted by some customers, setrlimit(2) like controls, cgroups controller,<br>
> >> dynamic SSEU tuning, ...<br>
> >><br>
> >> To enable userspace access to the tracked data, we expose time spent on GPU per<br>
> >> client and per engine class in sysfs with a hierarchy like the one below:<br>
> >><br>
> >> # cd /sys/class/drm/card0/clients/<br>
> >> # tree<br>
> >> .<br>
> >> ├── 7<br>
> >> │   ├── busy<br>
> >> │   │   ├── 0<br>
> >> │   │   ├── 1<br>
> >> │   │   ├── 2<br>
> >> │   │   └── 3<br>
> >> │   ├── name<br>
> >> │   └── pid<br>
> >> ├── 8<br>
> >> │   ├── busy<br>
> >> │   │   ├── 0<br>
> >> │   │   ├── 1<br>
> >> │   │   ├── 2<br>
> >> │   │   └── 3<br>
> >> │   ├── name<br>
> >> │   └── pid<br>
> >> └── 9<br>
> >>     ├── busy<br>
> >>     │   ├── 0<br>
> >>     │   ├── 1<br>
> >>     │   ├── 2<br>
> >>     │   └── 3<br>
> >>     ├── name<br>
> >>     └── pid<br>
> >><br>
> >> Files in the 'busy' directories are numbered using the engine class ABI values,<br>
> >> and they contain the accumulated nanoseconds each client spent on engines of<br>
> >> the respective class.<br>
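<br>
A minimal sketch of how a userspace tool might consume that layout, assuming<br>
the hierarchy above: sample each accumulated-nanoseconds file twice and turn<br>
the delta into a per-class utilisation percentage.<br>
<br>
#!/usr/bin/env python3<br>
# Illustrative only: read the clients/ hierarchy described in the cover<br>
# letter and print per-client busyness per engine class over a 2s window.<br>
import os<br>
import time<br>
<br>
ROOT = "/sys/class/drm/card0/clients"<br>
<br>
def sample():<br>
    clients = {}<br>
    for cid in os.listdir(ROOT):<br>
        with open(os.path.join(ROOT, cid, "name")) as f:<br>
            name = f.read().strip()<br>
        with open(os.path.join(ROOT, cid, "pid")) as f:<br>
            pid = f.read().strip()<br>
        busy = {}<br>
        busy_dir = os.path.join(ROOT, cid, "busy")<br>
        for klass in os.listdir(busy_dir):<br>
            with open(os.path.join(busy_dir, klass)) as f:<br>
                busy[klass] = int(f.read())<br>
        clients[cid] = (name, pid, busy)<br>
    return clients<br>
<br>
period = 2.0<br>
before = sample()<br>
time.sleep(period)<br>
after = sample()<br>
for cid, (name, pid, busy) in sorted(after.items()):<br>
    if cid not in before:<br>
        continue  # client appeared mid-sample, skip it<br>
    for klass in sorted(busy):<br>
        delta = busy[klass] - before[cid][2].get(klass, 0)<br>
        pct = 100.0 * delta / (period * 1e9)<br>
        print(pid, name, "class", klass, round(pct, 1), "%")<br>
<br>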
> ><br>
> > We did something similar in amdgpu using the gpu scheduler. We then<br>
> > expose the data via fdinfo. See<br>
> > <a href="https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcgit.freedesktop.org%2Fdrm%2Fdrm-misc%2Fcommit%2F%3Fid%3D1774baa64f9395fa884ea9ed494bcb043f3b83f5&data=04%7C01%7CDavid.Nieto%40amd.com%7C5e3c05578ef14be3692508d9169d55bf%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637565687273144615%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=mt1EIL%2Fc9pHCXR%2FYSd%2BTr1e64XHoeYcdQ2cYufJ%2FcYQ%3D&reserved=0">
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcgit.freedesktop.org%2Fdrm%2Fdrm-misc%2Fcommit%2F%3Fid%3D1774baa64f9395fa884ea9ed494bcb043f3b83f5&data=04%7C01%7CDavid.Nieto%40amd.com%7C5e3c05578ef14be3692508d9169d55bf%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637565687273144615%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=mt1EIL%2Fc9pHCXR%2FYSd%2BTr1e64XHoeYcdQ2cYufJ%2FcYQ%3D&reserved=0</a><br>
> > <a href="https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcgit.freedesktop.org%2Fdrm%2Fdrm-misc%2Fcommit%2F%3Fid%3D874442541133f78c78b6880b8cc495bab5c61704&data=04%7C01%7CDavid.Nieto%40amd.com%7C5e3c05578ef14be3692508d9169d55bf%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637565687273144615%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=%2F3zMGw0LPTC1kG4NebTwUPTx7QCtEyw%2B4JToXDK5QXI%3D&reserved=0">
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcgit.freedesktop.org%2Fdrm%2Fdrm-misc%2Fcommit%2F%3Fid%3D874442541133f78c78b6880b8cc495bab5c61704&data=04%7C01%7CDavid.Nieto%40amd.com%7C5e3c05578ef14be3692508d9169d55bf%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637565687273144615%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=%2F3zMGw0LPTC1kG4NebTwUPTx7QCtEyw%2B4JToXDK5QXI%3D&reserved=0</a><br>
><br>
> Interesting!<br>
><br>
> Is yours wall time or actual GPU time, taking preemption and such into<br>
> account? Do you have some userspace tools parsing this data, and how do<br>
> you do client discovery? Presumably there has to be a better way than<br>
> going through all open file descriptors?<br>
<br>
Wall time. It uses the fences in the scheduler to calculate engine<br>
time. We have some python scripts to make it look pretty, but mainly<br>
just reading the files directly. If you know the process, you can<br>
look it up in procfs.<br>
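<br>
For discovery in the other direction (the "going through all open file<br>
descriptors" you mention), a sketch of what that walk could look like, purely<br>
as illustration: it just collects processes holding a DRM node open.<br>
<br>
#!/usr/bin/env python3<br>
# Walk /proc, look at each process's open fds and keep the ones that are<br>
# symlinks into /dev/dri/ - those are the DRM clients.<br>
import os<br>
<br>
def drm_clients():<br>
    clients = []<br>
    for pid in filter(str.isdigit, os.listdir("/proc")):<br>
        fd_dir = os.path.join("/proc", pid, "fd")<br>
        try:<br>
            fds = os.listdir(fd_dir)<br>
        except OSError:  # permission denied or process already gone<br>
            continue<br>
        for fd in fds:<br>
            try:<br>
                target = os.readlink(os.path.join(fd_dir, fd))<br>
            except OSError:<br>
                continue<br>
            if target.startswith("/dev/dri/"):<br>
                clients.append((int(pid), int(fd), target))<br>
    return clients<br>
<br>
for pid, fd, node in drm_clients():<br>
    print(pid, fd, node)<br>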
<br>
><br>
> Our implementation was merged in January but Daniel took it out recently<br>
> because he wanted to have a discussion about a common vendor framework for<br>
> this whole story on dri-devel, I think. +Daniel to comment.<br>
><br>
> I couldn't find the patch you pasted on the mailing list to see if there<br>
> was any such discussion around your version.<br>
<br>
It was on the amd-gfx mailing list.<br>
<br>
Alex<br>
<br>
><br>
> Regards,<br>
><br>
> Tvrtko<br>
><br>
> ><br>
> > Alex<br>
> ><br>
> ><br>
> >><br>
> >> Tvrtko Ursulin (7):<br>
> >> drm/i915: Expose list of clients in sysfs<br>
> >> drm/i915: Update client name on context create<br>
> >> drm/i915: Make GEM contexts track DRM clients<br>
> >> drm/i915: Track runtime spent in closed and unreachable GEM contexts<br>
> >> drm/i915: Track all user contexts per client<br>
> >> drm/i915: Track context current active time<br>
> >> drm/i915: Expose per-engine client busyness<br>
> >><br>
> >> drivers/gpu/drm/i915/Makefile | 5 +-<br>
> >> drivers/gpu/drm/i915/gem/i915_gem_context.c | 61 ++-<br>
> >> .../gpu/drm/i915/gem/i915_gem_context_types.h | 16 +-<br>
> >> drivers/gpu/drm/i915/gt/intel_context.c | 27 +-<br>
> >> drivers/gpu/drm/i915/gt/intel_context.h | 15 +-<br>
> >> drivers/gpu/drm/i915/gt/intel_context_types.h | 24 +-<br>
> >> .../drm/i915/gt/intel_execlists_submission.c | 23 +-<br>
> >> .../gpu/drm/i915/gt/intel_gt_clock_utils.c | 4 +<br>
> >> drivers/gpu/drm/i915/gt/intel_lrc.c | 27 +-<br>
> >> drivers/gpu/drm/i915/gt/intel_lrc.h | 24 ++<br>
> >> drivers/gpu/drm/i915/gt/selftest_lrc.c | 10 +-<br>
> >> drivers/gpu/drm/i915/i915_drm_client.c | 365 ++++++++++++++++++<br>
> >> drivers/gpu/drm/i915/i915_drm_client.h | 123 ++++++<br>
> >> drivers/gpu/drm/i915/i915_drv.c | 6 +<br>
> >> drivers/gpu/drm/i915/i915_drv.h | 5 +<br>
> >> drivers/gpu/drm/i915/i915_gem.c | 21 +-<br>
> >> drivers/gpu/drm/i915/i915_gpu_error.c | 31 +-<br>
> >> drivers/gpu/drm/i915/i915_gpu_error.h | 2 +-<br>
> >> drivers/gpu/drm/i915/i915_sysfs.c | 8 +<br>
> >> 19 files changed, 716 insertions(+), 81 deletions(-)<br>
> >> create mode 100644 drivers/gpu/drm/i915/i915_drm_client.c<br>
> >> create mode 100644 drivers/gpu/drm/i915/i915_drm_client.h<br>
> >><br>
> >> --<br>
> >> 2.30.2<br>
> >><br>
</div>
</span></font></div>
</div>
</body>
</html>