[PATCH 0/7] Per client engine busyness

Christian König christian.koenig at amd.com
Fri May 14 08:04:49 UTC 2021


Well, in my opinion exposing it through fdinfo turned out to be a really 
clean approach.

It describes exactly the per-file-descriptor information we need.

Making that device-driver-independent is potentially useful as well.
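
As a rough illustration of why fdinfo is convenient for tools, here is a
minimal Python sketch that parses the "key: value" lines of an fdinfo file
into a dictionary. The pos/flags/mnt_id fields are the standard ones; the
"engine-*" keys in the sample are hypothetical and only stand in for
whatever a driver chooses to print.

```python
def parse_fdinfo(text):
    """Parse the 'key: value' lines of an fdinfo file into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # ignore lines that have no colon at all
            fields[key.strip()] = value.strip()
    return fields


def read_fdinfo(pid, fd):
    """Read and parse /proc/<pid>/fdinfo/<fd> (Linux only)."""
    with open(f"/proc/{pid}/fdinfo/{fd}") as f:
        return parse_fdinfo(f.read())


# Hypothetical sample: the "engine-*" keys are made up for illustration.
SAMPLE = (
    "pos:\t0\n"
    "flags:\t02100002\n"
    "mnt_id:\t26\n"
    "engine-render:\t123456789 ns\n"
    "engine-video:\t0 ns\n"
)

if __name__ == "__main__":
    for key, value in parse_fdinfo(SAMPLE).items():
        print(f"{key} = {value}")
```

Values are kept as strings on purpose, since the thread does not fix a
format or unit for driver-specific fields.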

Regards,
Christian.

On 14.05.21 at 09:22, Nieto, David M wrote:
>
> We had entertained the idea of exposing the processes as sysfs nodes 
> as you proposed, but we had concerns about exposing process info in 
> there, especially since /proc already exists for that purpose.
>
> I think if you were to follow that approach, we could have tools like 
> top that support exposing GPU engine usage.
> ------------------------------------------------------------------------
> *From:* Alex Deucher <alexdeucher at gmail.com>
> *Sent:* Thursday, May 13, 2021 10:58 PM
> *To:* Tvrtko Ursulin <tvrtko.ursulin at linux.intel.com>; Nieto, David M 
> <David.Nieto at amd.com>; Koenig, Christian <Christian.Koenig at amd.com>
> *Cc:* Intel Graphics Development <Intel-gfx at lists.freedesktop.org>; 
> Mailing list - DRI developers <dri-devel at lists.freedesktop.org>; Daniel 
> Vetter <daniel at ffwll.ch>
> *Subject:* Re: [PATCH 0/7] Per client engine busyness
> + David, Christian
>
> On Thu, May 13, 2021 at 12:41 PM Tvrtko Ursulin
> <tvrtko.ursulin at linux.intel.com> wrote:
> >
> >
> > Hi,
> >
> > On 13/05/2021 16:48, Alex Deucher wrote:
> > > On Thu, May 13, 2021 at 7:00 AM Tvrtko Ursulin
> > > <tvrtko.ursulin at linux.intel.com> wrote:
> > >>
> > >> From: Tvrtko Ursulin <tvrtko.ursulin at intel.com>
> > >>
> > > >> Resurrection of the previously merged per-client engine busyness patches.
> > > >> In a nutshell, it makes intel_gpu_top more top(1)-like by showing not only
> > > >> physical GPU engine usage but a per-process view as well.
> > >>
> > >> Example screen capture:
> > >> 
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > >> intel-gpu-top -  906/ 955 MHz;    0% RC6; 5.30 Watts;      933 irqs/s
> > >>
> > >>        IMC reads:     4414 MiB/s
> > >>       IMC writes:     3805 MiB/s
> > >>
> > >>            ENGINE BUSY                                      
> MI_SEMA MI_WAIT
> > >>       Render/3D/0   93.46% |████████████████████████████████▋  
> |      0%      0%
> > >>         Blitter/0    0.00% |                                   
> |      0%      0%
> > >>           Video/0    0.00% |                                   
> |      0%      0%
> > >>    VideoEnhance/0    0.00% |                                   
> |      0%      0%
> > >>
> > >>    PID            NAME  Render/3D Blitter        Video      
> VideoEnhance
> > >>   2733       neverball |██████▌ ||            ||            
> ||            |
> > >>   2047            Xorg |███▊ ||            ||            
> ||            |
> > >>   2737        glxgears |█▍ ||            ||            
> ||            |
> > >>   2128           xfwm4 | ||            ||            ||            |
> > >>   2047            Xorg | ||            ||            ||            |
> > >> 
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > >>
> > > >> Internally we track time spent on engines for each struct intel_context,
> > > >> both for current and past contexts belonging to each open DRM file.
> > >>
> > > >> This can serve as a building block for several features from the wanted
> > > >> list: smarter scheduler decisions, getrusage(2)-like per-GEM-context
> > > >> functionality wanted by some customers, setrlimit(2)-like controls, a
> > > >> cgroups controller, dynamic SSEU tuning, ...
> > >>
> > > >> To enable userspace access to the tracked data, we expose time spent on
> > > >> the GPU per client and per engine class in sysfs, with a hierarchy like
> > > >> the one below:
> > >>
> > >>          # cd /sys/class/drm/card0/clients/
> > >>          # tree
> > >>          .
> > >>          ├── 7
> > >>          │   ├── busy
> > >>          │   │   ├── 0
> > >>          │   │   ├── 1
> > >>          │   │   ├── 2
> > >>          │   │   └── 3
> > >>          │   ├── name
> > >>          │   └── pid
> > >>          ├── 8
> > >>          │   ├── busy
> > >>          │   │   ├── 0
> > >>          │   │   ├── 1
> > >>          │   │   ├── 2
> > >>          │   │   └── 3
> > >>          │   ├── name
> > >>          │   └── pid
> > >>          └── 9
> > >>              ├── busy
> > >>              │   ├── 0
> > >>              │   ├── 1
> > >>              │   ├── 2
> > >>              │   └── 3
> > >>              ├── name
> > >>              └── pid
> > >>
> > > >> Files in the 'busy' directories are numbered using the engine class ABI
> > > >> values, and each contains the accumulated nanoseconds the client has
> > > >> spent on engines of the respective class.
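
To make the accumulated-nanoseconds scheme concrete, here is a minimal
Python sketch of how a tool could turn two samples into a busy percentage.
It assumes each file under 'busy' holds a single decimal integer, which the
text above implies but does not spell out; the helper names are made up for
illustration.

```python
import os
import time

# Path taken from the example tree above; the interface was only proposed in
# this series, so it may not exist on a given kernel.
CLIENTS_DIR = "/sys/class/drm/card0/clients"


def read_busy_ns(client_dir):
    """Return {engine_class: accumulated_ns} for one client directory."""
    busy_dir = os.path.join(client_dir, "busy")
    busy = {}
    for name in os.listdir(busy_dir):
        with open(os.path.join(busy_dir, name)) as f:
            # Assumption: one decimal integer per file.
            busy[int(name)] = int(f.read().strip())
    return busy


def busy_percent(before, after, elapsed_ns):
    """Turn two accumulated-ns samples into percent busy per engine class."""
    if elapsed_ns <= 0:
        raise ValueError("elapsed_ns must be positive")
    return {
        cls: 100.0 * (after[cls] - before.get(cls, 0)) / elapsed_ns
        for cls in after
    }


def sample_client(client_dir, interval_s=1.0):
    """Sample one client twice and return its busyness over the interval."""
    t0 = time.monotonic_ns()
    before = read_busy_ns(client_dir)
    time.sleep(interval_s)
    after = read_busy_ns(client_dir)
    return busy_percent(before, after, time.monotonic_ns() - t0)
```

Because the counters are per engine class, a class with several engines
could presumably report more than 100% over an interval.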
> > >
> > > We did something similar in amdgpu using the gpu scheduler.  We then
> > > expose the data via fdinfo.  See
> > > 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcgit.freedesktop.org%2Fdrm%2Fdrm-misc%2Fcommit%2F%3Fid%3D1774baa64f9395fa884ea9ed494bcb043f3b83f5&data=04%7C01%7CDavid.Nieto%40amd.com%7C5e3c05578ef14be3692508d9169d55bf%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637565687273144615%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=mt1EIL%2Fc9pHCXR%2FYSd%2BTr1e64XHoeYcdQ2cYufJ%2FcYQ%3D&reserved=0 
> <https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcgit.freedesktop.org%2Fdrm%2Fdrm-misc%2Fcommit%2F%3Fid%3D1774baa64f9395fa884ea9ed494bcb043f3b83f5&data=04%7C01%7CDavid.Nieto%40amd.com%7C5e3c05578ef14be3692508d9169d55bf%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637565687273144615%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=mt1EIL%2Fc9pHCXR%2FYSd%2BTr1e64XHoeYcdQ2cYufJ%2FcYQ%3D&reserved=0>
> > > 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcgit.freedesktop.org%2Fdrm%2Fdrm-misc%2Fcommit%2F%3Fid%3D874442541133f78c78b6880b8cc495bab5c61704&data=04%7C01%7CDavid.Nieto%40amd.com%7C5e3c05578ef14be3692508d9169d55bf%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637565687273144615%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=%2F3zMGw0LPTC1kG4NebTwUPTx7QCtEyw%2B4JToXDK5QXI%3D&reserved=0 
> <https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcgit.freedesktop.org%2Fdrm%2Fdrm-misc%2Fcommit%2F%3Fid%3D874442541133f78c78b6880b8cc495bab5c61704&data=04%7C01%7CDavid.Nieto%40amd.com%7C5e3c05578ef14be3692508d9169d55bf%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637565687273144615%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=%2F3zMGw0LPTC1kG4NebTwUPTx7QCtEyw%2B4JToXDK5QXI%3D&reserved=0>
> >
> > Interesting!
> >
> > Is yours wall time or actual GPU time, taking preemption and such into
> > account? Do you have some userspace tools parsing this data, and how do
> > you do client discovery? Presumably there has to be a better way than
> > going through all open file descriptors?
>
> Wall time.  It uses the fences in the scheduler to calculate engine
> time.  We have some Python scripts to make it look pretty, but mainly
> we just read the files directly.  If you know the process, you can
> look it up in procfs.
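
For the "if you know the process" case, a minimal Python sketch of the
procfs lookup could look like the following. It lists the file descriptors
of one process whose link target is under /dev/dri/; matching on the path
is an assumption made here for simplicity, and the function name and the
proc_root parameter are hypothetical.

```python
import os


def drm_fds(pid, proc_root="/proc"):
    """Return the fd numbers of <pid> whose link target is under /dev/dri/."""
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    found = []
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # the fd may have been closed while we were scanning
        if target.startswith("/dev/dri/"):
            found.append(int(name))
    return sorted(found)


if __name__ == "__main__":
    # Inspect the current process; most likely prints an empty list.
    print(drm_fds(os.getpid()))
```

Each returned number can then be read back as /proc/<pid>/fdinfo/<fd>.
Listing another user's process normally needs matching privileges, so a
system-wide tool would have to handle permission errors from os.listdir().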
>
> >
> > Our implementation was merged in January, but Daniel took it out recently
> > because he wanted to have a discussion on dri-devel about a common vendor
> > framework for this whole story. I think. +Daniel to comment.
> >
> > I couldn't find the patch you pasted on the mailing list to see if there
> > was any such discussion around your version.
>
> It was on the amd-gfx mailing list.
>
> Alex
>
> >
> > Regards,
> >
> > Tvrtko
> >
> > >
> > > Alex
> > >
> > >
> > >>
> > >> Tvrtko Ursulin (7):
> > >>    drm/i915: Expose list of clients in sysfs
> > >>    drm/i915: Update client name on context create
> > >>    drm/i915: Make GEM contexts track DRM clients
> > > >>    drm/i915: Track runtime spent in closed and unreachable GEM contexts
> > >>    drm/i915: Track all user contexts per client
> > >>    drm/i915: Track context current active time
> > >>    drm/i915: Expose per-engine client busyness
> > >>
> > >> drivers/gpu/drm/i915/Makefile                 |   5 +-
> > >> drivers/gpu/drm/i915/gem/i915_gem_context.c   |  61 ++-
> > >> .../gpu/drm/i915/gem/i915_gem_context_types.h |  16 +-
> > >> drivers/gpu/drm/i915/gt/intel_context.c       |  27 +-
> > >> drivers/gpu/drm/i915/gt/intel_context.h       |  15 +-
> > >> drivers/gpu/drm/i915/gt/intel_context_types.h |  24 +-
> > >> .../drm/i915/gt/intel_execlists_submission.c  |  23 +-
> > >> .../gpu/drm/i915/gt/intel_gt_clock_utils.c    |   4 +
> > >> drivers/gpu/drm/i915/gt/intel_lrc.c           |  27 +-
> > >> drivers/gpu/drm/i915/gt/intel_lrc.h           |  24 ++
> > >> drivers/gpu/drm/i915/gt/selftest_lrc.c        |  10 +-
> > > >> drivers/gpu/drm/i915/i915_drm_client.c        | 365 ++++++++++++++++++
> > >> drivers/gpu/drm/i915/i915_drm_client.h        | 123 ++++++
> > >> drivers/gpu/drm/i915/i915_drv.c               |   6 +
> > >> drivers/gpu/drm/i915/i915_drv.h               |   5 +
> > >> drivers/gpu/drm/i915/i915_gem.c               |  21 +-
> > >> drivers/gpu/drm/i915/i915_gpu_error.c         |  31 +-
> > >> drivers/gpu/drm/i915/i915_gpu_error.h         |   2 +-
> > >> drivers/gpu/drm/i915/i915_sysfs.c             |   8 +
> > >>   19 files changed, 716 insertions(+), 81 deletions(-)
> > >>   create mode 100644 drivers/gpu/drm/i915/i915_drm_client.c
> > >>   create mode 100644 drivers/gpu/drm/i915/i915_drm_client.h
> > >>
> > >> --
> > >> 2.30.2
> > >>
