<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body>
    Well, in my opinion exposing it through fdinfo turned out to be a
    really clean approach.<br>
    <br>
    It describes exactly the per file descriptor information we need.<br>
    <br>
    Making that device-driver independent is potentially useful as well.<br>
    <br>
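    As a rough sketch of how such driver-independent per-fd data could be
    consumed from userspace (the "drm-" key prefix, the engine key names and
    the pid/fd values below are illustrative assumptions, not the exact
    format merged in amdgpu):<br>
    <pre>
# Hypothetical example: collect the DRM-specific keys a driver printed
# into a process's /proc fdinfo entry.  Key names and pid/fd values are
# illustrative only.
def read_drm_fdinfo(pid, fd):
    info = {}
    with open(f"/proc/{pid}/fdinfo/{fd}") as f:
        for line in f:
            key, _, value = line.partition(":")
            if key.startswith("drm-"):      # assumed driver-independent prefix
                info[key.strip()] = value.strip()
    return info

for key, value in read_drm_fdinfo(2733, 5).items():
    print(key, value)                       # e.g. per-engine accumulated time keys
</pre>
    <br>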
    Regards,<br>
    Christian.<br>
    <br>
    <div class="moz-cite-prefix">Am 14.05.21 um 09:22 schrieb Nieto,
      David M:<br>
    </div>
    <blockquote type="cite" cite="mid:BYAPR12MB2840AA68BCAEBD9279C6184FF4509@BYAPR12MB2840.namprd12.prod.outlook.com">
      
      <div>
        <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
          font-size: 12pt; color: rgb(0, 0, 0); background-color:
          rgb(255, 255, 255);">
          We had entertained the idea of exposing the processes as sysfs
          nodes, as you proposed, but we had concerns about exposing
          process information there, especially since /proc already exists
          for that purpose.</div>
        <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
          font-size: 12pt; color: rgb(0, 0, 0); background-color:
          rgb(255, 255, 255);">
          <br>
        </div>
        <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
          font-size: 12pt; color: rgb(0, 0, 0); background-color:
          rgb(255, 255, 255);">
          I think if you were to follow that approach, we could have
          top-like tools that expose GPU engine usage.</div>
        <hr style="display:inline-block;width:98%" tabindex="-1">
        <div id="divRplyFwdMsg" dir="ltr"><font style="font-size:11pt" face="Calibri, sans-serif" color="#000000"><b>From:</b> Alex
            Deucher <a class="moz-txt-link-rfc2396E" href="mailto:alexdeucher@gmail.com"><alexdeucher@gmail.com></a><br>
            <b>Sent:</b> Thursday, May 13, 2021 10:58 PM<br>
            <b>To:</b> Tvrtko Ursulin
            <a class="moz-txt-link-rfc2396E" href="mailto:tvrtko.ursulin@linux.intel.com"><tvrtko.ursulin@linux.intel.com></a>; Nieto, David M
            <a class="moz-txt-link-rfc2396E" href="mailto:David.Nieto@amd.com"><David.Nieto@amd.com></a>; Koenig, Christian
            <a class="moz-txt-link-rfc2396E" href="mailto:Christian.Koenig@amd.com"><Christian.Koenig@amd.com></a><br>
            <b>Cc:</b> Intel Graphics Development
            <a class="moz-txt-link-rfc2396E" href="mailto:Intel-gfx@lists.freedesktop.org"><Intel-gfx@lists.freedesktop.org></a>; Maling list - DRI
            developers <a class="moz-txt-link-rfc2396E" href="mailto:dri-devel@lists.freedesktop.org"><dri-devel@lists.freedesktop.org></a>; Daniel
            Vetter <a class="moz-txt-link-rfc2396E" href="mailto:daniel@ffwll.ch"><daniel@ffwll.ch></a><br>
            <b>Subject:</b> Re: [PATCH 0/7] Per client engine busyness</font>
          <div> </div>
        </div>
        <div class="BodyFragment"><font size="2"><span style="font-size:11pt;">
              <div class="PlainText">+ David, Christian<br>
                <br>
                On Thu, May 13, 2021 at 12:41 PM Tvrtko Ursulin<br>
                <a class="moz-txt-link-rfc2396E" href="mailto:tvrtko.ursulin@linux.intel.com"><tvrtko.ursulin@linux.intel.com></a> wrote:<br>
                ><br>
                ><br>
                > Hi,<br>
                ><br>
                > On 13/05/2021 16:48, Alex Deucher wrote:<br>
                > > On Thu, May 13, 2021 at 7:00 AM Tvrtko Ursulin<br>
                > > <a class="moz-txt-link-rfc2396E" href="mailto:tvrtko.ursulin@linux.intel.com"><tvrtko.ursulin@linux.intel.com></a> wrote:<br>
                > >><br>
                > >> From: Tvrtko Ursulin
                <a class="moz-txt-link-rfc2396E" href="mailto:tvrtko.ursulin@intel.com"><tvrtko.ursulin@intel.com></a><br>
                > >><br>
                > >> Resurrection of the previously merged per-client engine busyness patches. In a<br>
                > >> nutshell it enables intel_gpu_top to be useful in a more top(1)-like way, showing not<br>
                > >> only physical GPU engine usage but a per-process view as well.<br>
                > >><br>
                > >> Example screen capture:<br>
                > >>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br>
                > >> intel-gpu-top -  906/ 955 MHz;    0% RC6; 
                5.30 Watts;      933 irqs/s<br>
                > >><br>
                > >>        IMC reads:     4414 MiB/s<br>
                > >>       IMC writes:     3805 MiB/s<br>
                > >><br>
                > >>            ENGINE     
                BUSY                                      MI_SEMA
                MI_WAIT<br>
                > >>       Render/3D/0   93.46%
                |████████████████████████████████▋  |      0%      0%<br>
                > >>         Blitter/0    0.00%
                |                                   |      0%      0%<br>
                > >>           Video/0    0.00%
                |                                   |      0%      0%<br>
                > >>    VideoEnhance/0    0.00%
                |                                   |      0%      0%<br>
                > >><br>
                > >>    PID            NAME  Render/3D     
                Blitter        Video      VideoEnhance<br>
                > >>   2733       neverball |██████▌    
                ||            ||            ||            |<br>
                > >>   2047            Xorg |███▊       
                ||            ||            ||            |<br>
                > >>   2737        glxgears |█▍         
                ||            ||            ||            |<br>
                > >>   2128           xfwm4 |           
                ||            ||            ||            |<br>
                > >>   2047            Xorg |           
                ||            ||            ||            |<br>
                > >>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~<br>
                > >><br>
                > >> Internally we track time spent on engines
                for each struct intel_context, both<br>
                > >> for current and past contexts belonging to
                each open DRM file.<br>
                > >><br>
                > >> This can serve as a building block for
                several features from the wanted list:<br>
                > >> smarter scheduler decisions,
                getrusage(2)-like per-GEM-context functionality<br>
                > >> wanted by some customers, setrlimit(2)-like controls, a cgroups controller,<br>
                > >> dynamic SSEU tuning, ...<br>
                > >><br>
                > >> To enable userspace access to the tracked
                data, we expose time spent on GPU per<br>
                > >> client and per engine class in sysfs with a hierarchy like the one below:<br>
                > >><br>
                > >>          # cd
                /sys/class/drm/card0/clients/<br>
                > >>          # tree<br>
                > >>          .<br>
                > >>          ├── 7<br>
                > >>          │   ├── busy<br>
                > >>          │   │   ├── 0<br>
                > >>          │   │   ├── 1<br>
                > >>          │   │   ├── 2<br>
                > >>          │   │   └── 3<br>
                > >>          │   ├── name<br>
                > >>          │   └── pid<br>
                > >>          ├── 8<br>
                > >>          │   ├── busy<br>
                > >>          │   │   ├── 0<br>
                > >>          │   │   ├── 1<br>
                > >>          │   │   ├── 2<br>
                > >>          │   │   └── 3<br>
                > >>          │   ├── name<br>
                > >>          │   └── pid<br>
                > >>          └── 9<br>
                > >>              ├── busy<br>
                > >>              │   ├── 0<br>
                > >>              │   ├── 1<br>
                > >>              │   ├── 2<br>
                > >>              │   └── 3<br>
                > >>              ├── name<br>
                > >>              └── pid<br>
                > >><br>
                > >> Files in 'busy' directories are numbered using the engine class ABI values and<br>
                > >> they contain accumulated nanoseconds each client spent on engines of the<br>
                > >> respective class.<br>
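                <br>
                A sketch of how a monitoring tool could consume that hierarchy, sampling the
                per-class busy counters twice and reporting utilisation (the card0 path and
                the one-second interval here are assumptions):<br>
<pre>
# Rough sketch: per-client engine utilisation from the proposed sysfs layout.
import glob
import time

def sample():
    busy = {}
    for path in glob.glob("/sys/class/drm/card0/clients/*/busy/*"):
        with open(path) as f:
            busy[path] = int(f.read())   # accumulated nanoseconds
    return busy

before = sample()
time.sleep(1)
after = sample()

for path, ns in sorted(after.items()):
    client = path.split("/")[-3]         # client directory, e.g. "7"
    engine_class = path.split("/")[-1]   # engine class ABI number
    delta_ns = ns - before.get(path, ns)
    print(f"client {client} class {engine_class}: {delta_ns / 1e7:.1f}%")
</pre>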
                > ><br>
                > > We did something similar in amdgpu using the
                gpu scheduler.  We then<br>
                > > expose the data via fdinfo.  See<br>
                > > <a href="https://cgit.freedesktop.org/drm/drm-misc/commit/?id=1774baa64f9395fa884ea9ed494bcb043f3b83f5">https://cgit.freedesktop.org/drm/drm-misc/commit/?id=1774baa64f9395fa884ea9ed494bcb043f3b83f5</a><br>
                > > <a href="https://cgit.freedesktop.org/drm/drm-misc/commit/?id=874442541133f78c78b6880b8cc495bab5c61704">https://cgit.freedesktop.org/drm/drm-misc/commit/?id=874442541133f78c78b6880b8cc495bab5c61704</a><br>
                ><br>
                > Interesting!<br>
                ><br>
                > Is yours wall time or actual GPU time taking
                preemption and such into<br>
                > account? Do you have some userspace tools parsing this data, and how do<br>
                > you do client discovery? Presumably there has to be a better way than<br>
                > going through all open file descriptors?<br>
                <br>
                Wall time.  It uses the fences in the scheduler to
                calculate engine<br>
                time.  We have some Python scripts to make it look pretty, but mostly<br>
                we just read the files directly.  If you know the process, you can<br>
                look it up in procfs.<br>
                <br>
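                For reference, a minimal sketch of that discovery path (essentially the walk
                over open file descriptors mentioned above): match each process's open fds
                against the DRM device nodes and dump the corresponding fdinfo. The /dev/dri
                device-number match is an assumption about how a tool might do it.<br>
<pre>
# Sketch: find DRM clients by scanning /proc, then read their fdinfo entries.
import os
import stat

# Device numbers of the DRM character devices (card*/renderD*) on this system.
drm_devs = set()
for node in os.listdir("/dev/dri"):
    st = os.stat(f"/dev/dri/{node}")
    if stat.S_ISCHR(st.st_mode):
        drm_devs.add(st.st_rdev)

for pid in filter(str.isdigit, os.listdir("/proc")):
    try:
        for fd in os.listdir(f"/proc/{pid}/fd"):
            st = os.stat(f"/proc/{pid}/fd/{fd}")
            if stat.S_ISCHR(st.st_mode) and st.st_rdev in drm_devs:
                with open(f"/proc/{pid}/fdinfo/{fd}") as f:
                    print(pid, fd, f.read())
    except (PermissionError, FileNotFoundError):
        continue  # processes we may not inspect, or fds gone mid-scan
</pre>
                <br>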
                ><br>
                > Our implementation was merged in January but Daniel
                took it out recently<br>
                > because he wanted to have a discussion about a common vendor framework for<br>
                > this whole story on dri-devel. I think. +Daniel to
                comment.<br>
                ><br>
                > I couldn't find the patch you pasted on the mailing
                list to see if there<br>
                > was any such discussion around your version.<br>
                <br>
                It was on the amd-gfx mailing list.<br>
                <br>
                Alex<br>
                <br>
                ><br>
                > Regards,<br>
                ><br>
                > Tvrtko<br>
                ><br>
                > ><br>
                > > Alex<br>
                > ><br>
                > ><br>
                > >><br>
                > >> Tvrtko Ursulin (7):<br>
                > >>    drm/i915: Expose list of clients in
                sysfs<br>
                > >>    drm/i915: Update client name on context
                create<br>
                > >>    drm/i915: Make GEM contexts track DRM
                clients<br>
                > >>    drm/i915: Track runtime spent in closed
                and unreachable GEM contexts<br>
                > >>    drm/i915: Track all user contexts per
                client<br>
                > >>    drm/i915: Track context current active
                time<br>
                > >>    drm/i915: Expose per-engine client
                busyness<br>
                > >><br>
                > >>  
                drivers/gpu/drm/i915/Makefile                 |   5 +-<br>
                > >>  
                drivers/gpu/drm/i915/gem/i915_gem_context.c   |  61 ++-<br>
                > >>  
                .../gpu/drm/i915/gem/i915_gem_context_types.h |  16 +-<br>
                > >>  
                drivers/gpu/drm/i915/gt/intel_context.c       |  27 +-<br>
                > >>  
                drivers/gpu/drm/i915/gt/intel_context.h       |  15 +-<br>
                > >>  
                drivers/gpu/drm/i915/gt/intel_context_types.h |  24 +-<br>
                > >>  
                .../drm/i915/gt/intel_execlists_submission.c  |  23 +-<br>
                > >>  
                .../gpu/drm/i915/gt/intel_gt_clock_utils.c    |   4 +<br>
                > >>  
                drivers/gpu/drm/i915/gt/intel_lrc.c           |  27 +-<br>
                > >>  
                drivers/gpu/drm/i915/gt/intel_lrc.h           |  24 ++<br>
                > >>  
                drivers/gpu/drm/i915/gt/selftest_lrc.c        |  10 +-<br>
                > >>  
                drivers/gpu/drm/i915/i915_drm_client.c        | 365
                ++++++++++++++++++<br>
                > >>  
                drivers/gpu/drm/i915/i915_drm_client.h        | 123
                ++++++<br>
                > >>  
                drivers/gpu/drm/i915/i915_drv.c               |   6 +<br>
                > >>  
                drivers/gpu/drm/i915/i915_drv.h               |   5 +<br>
                > >>  
                drivers/gpu/drm/i915/i915_gem.c               |  21 +-<br>
                > >>  
                drivers/gpu/drm/i915/i915_gpu_error.c         |  31 +-<br>
                > >>  
                drivers/gpu/drm/i915/i915_gpu_error.h         |   2 +-<br>
                > >>  
                drivers/gpu/drm/i915/i915_sysfs.c             |   8 +<br>
                > >>   19 files changed, 716 insertions(+), 81
                deletions(-)<br>
                > >>   create mode 100644
                drivers/gpu/drm/i915/i915_drm_client.c<br>
                > >>   create mode 100644
                drivers/gpu/drm/i915/i915_drm_client.h<br>
                > >><br>
                > >> --<br>
                > >> 2.30.2<br>
                > >><br>
              </div>
            </span></font></div>
      </div>
    </blockquote>
    <br>
  </body>
</html>