<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    Am 24.08.21 um 14:42 schrieb Tom St Denis:<br>
    <blockquote type="cite"
cite="mid:CAAzXoRKWy74Wst_1br+N37dkO+5Uode4+zwQpEtM9su+JA9oLg@mail.gmail.com">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      <div dir="ltr">The IOCTL data is in the debugfs header as it is. 
        I could move that to the amdgpu_drm.h and include it from
        amdgpu_debugfs.h.</div>
    </blockquote>
    <br>
    Na, keep it like that and just add a comment.<br>
    <br>
    On second thought I don't want to raise any discussion on the
    mailing list if we should have a debugfs structure in the UAPI
    headers.<br>
    <br>
    Christian.<br>
    <br>
    <blockquote type="cite"
cite="mid:CAAzXoRKWy74Wst_1br+N37dkO+5Uode4+zwQpEtM9su+JA9oLg@mail.gmail.com">
      <div dir="ltr">
        <div><br>
        </div>
        <div>I'll re-write the STATE IOCTL to use a struct and then test
          against what I have in umr.</div>
        <div><br>
        </div>
        <div>Refactoring the read/write is trivial and I'll do that no
          problem (with style fixes).</div>
        <div><br>
        </div>
        <div>Cheers,</div>
        <div>Tom</div>
      </div>
      <br>
      <div class="gmail_quote">
        <div dir="ltr" class="gmail_attr">On Tue, Aug 24, 2021 at 8:34
          AM Christian König <<a
            href="mailto:ckoenig.leichtzumerken@gmail.com"
            moz-do-not-send="true">ckoenig.leichtzumerken@gmail.com</a>>
          wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0px 0px 0px
          0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Am
          24.08.21 um 14:27 schrieb StDenis, Tom:<br>
          > [AMD Official Use Only]<br>
          ><br>
          > What do you mean a "shared header?"  How would they be
          shared between kernel and user?<br>
          <br>
          Somewhere in the include/uapi/drm/ folder I think. Either add
          that to <br>
          amdgpu_drm.h or maybe amdgpu_debugfs.h?<br>
          <br>
          Or just keep it as a structure in amdgpu_debugfs.h with a
          comment that <br>
          it is used for an IOCTL.<br>
          <br>
          Just not something hard coded like first byte is this, second
          that etc....<br>
          <br>
          > As for why not read/write.  Jus wanted to keep it
          simple.  It's not really performance bound.  umr never does
          reads/writes larger than 32-bits anyways.  It doesn't have to
          be this way though but I figured the fewer LOC the better.<br>
          <br>
          I think it would be cleaner if we would have separate
          functions for this.<br>
          <br>
          As far as I can see you also don't need the dance with the
          power <br>
          managerment etc for the IOCTL. It's just get_user() into your
          structure.<br>
          <br>
          BTW: In the kernel coding style put "switch" and "case" on the
          same <br>
          column, otherwise <a href="http://checkpatch.pl"
            rel="noreferrer" target="_blank" moz-do-not-send="true">checkpatch.pl</a>
          might complain.<br>
          <br>
          Christian.<br>
          <br>
          ><br>
          > Tom<br>
          ><br>
          > ________________________________________<br>
          > From: Christian König <<a
            href="mailto:ckoenig.leichtzumerken@gmail.com"
            target="_blank" moz-do-not-send="true">ckoenig.leichtzumerken@gmail.com</a>><br>
          > Sent: Tuesday, August 24, 2021 08:20<br>
          > To: StDenis, Tom; <a
            href="mailto:amd-gfx@lists.freedesktop.org" target="_blank"
            moz-do-not-send="true">amd-gfx@lists.freedesktop.org</a><br>
          > Subject: Re: [PATCH] drm/amd/amdgpu: New debugfs
          interface for MMIO registers<br>
          ><br>
          ><br>
          ><br>
          > Am 24.08.21 um 14:16 schrieb Tom St Denis:<br>
          >> This new debugfs interface uses an IOCTL interface in
          order to pass<br>
          >> along state information like SRBM and GRBM bank
          switching.  This<br>
          >> new interface also allows a full 32-bit MMIO address
          range which<br>
          >> the previous didn't.  With this new design we have
          room to grow<br>
          >> the flexibility of the file as need be.<br>
          >><br>
          >> Signed-off-by: Tom St Denis <<a
            href="mailto:tom.stdenis@amd.com" target="_blank"
            moz-do-not-send="true">tom.stdenis@amd.com</a>><br>
          >> ---<br>
          >>    drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 177
          ++++++++++++++++++++<br>
          >>    drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h |  32
          ++++<br>
          >>    2 files changed, 209 insertions(+)<br>
          >><br>
          >> diff --git
          a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
          b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c<br>
          >> index 277128846dd1..ab2d92f84da5 100644<br>
          >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c<br>
          >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c<br>
          >> @@ -279,6 +279,173 @@ static ssize_t
          amdgpu_debugfs_regs_write(struct file *f, const char __user
          *buf,<br>
          >>        return amdgpu_debugfs_process_reg_op(false, f,
          (char __user *)buf, size, pos);<br>
          >>    }<br>
          >><br>
          >> +static int amdgpu_debugfs_regs2_open(struct inode
          *inode, struct file *file)<br>
          >> +{<br>
          >> +     struct amdgpu_debugfs_regs2_data *rd;<br>
          >> +<br>
          >> +     rd = kzalloc(sizeof *rd, GFP_KERNEL);<br>
          >> +     if (!rd)<br>
          >> +             return -ENOMEM;<br>
          >> +     rd->adev = file_inode(file)->i_private;<br>
          >> +     file->private_data = rd;<br>
          >> +<br>
          >> +     return 0;<br>
          >> +}<br>
          >> +<br>
          >> +static int amdgpu_debugfs_regs2_release(struct inode
          *inode, struct file *file)<br>
          >> +{<br>
          >> +     kfree(file->private_data);<br>
          >> +     return 0;<br>
          >> +}<br>
          >> +<br>
          >> +static int amdgpu_debugfs_regs2_op(struct file *f,
          char __user *buf, int write_en)<br>
          >> +{<br>
          >> +     struct amdgpu_debugfs_regs2_data *rd =
          f->private_data;<br>
          >> +     struct amdgpu_device *adev = rd->adev;<br>
          >> +     int result = 0, r;<br>
          >> +     uint32_t value;<br>
          >> +<br>
          >> +     if (rd->state.offset & 0x3)<br>
          >> +             return -EINVAL;<br>
          >> +<br>
          >> +     if (rd->state.use_grbm) {<br>
          >> +             if (rd-><a
            href="http://state.grbm.se" rel="noreferrer" target="_blank"
            moz-do-not-send="true">state.grbm.se</a> == 0x3FF)<br>
          >> +                     rd-><a
            href="http://state.grbm.se" rel="noreferrer" target="_blank"
            moz-do-not-send="true">state.grbm.se</a> = 0xFFFFFFFF;<br>
          >> +             if (rd-><a
            href="http://state.grbm.sh" rel="noreferrer" target="_blank"
            moz-do-not-send="true">state.grbm.sh</a> == 0x3FF)<br>
          >> +                     rd-><a
            href="http://state.grbm.sh" rel="noreferrer" target="_blank"
            moz-do-not-send="true">state.grbm.sh</a> = 0xFFFFFFFF;<br>
          >> +             if (rd->state.grbm.instance ==
          0x3FF)<br>
          >> +                     rd->state.grbm.instance =
          0xFFFFFFFF;<br>
          >> +     }<br>
          >> +<br>
          >> +     r =
          pm_runtime_get_sync(adev_to_drm(adev)->dev);<br>
          >> +     if (r < 0) {<br>
          >> +           
           pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);<br>
          >> +             return r;<br>
          >> +     }<br>
          >> +<br>
          >> +     r = amdgpu_virt_enable_access_debugfs(adev);<br>
          >> +     if (r < 0) {<br>
          >> +           
           pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);<br>
          >> +             return r;<br>
          >> +     }<br>
          >> +<br>
          >> +     if (rd->state.use_grbm) {<br>
          >> +             if ((rd-><a
            href="http://state.grbm.sh" rel="noreferrer" target="_blank"
            moz-do-not-send="true">state.grbm.sh</a> != 0xFFFFFFFF
          && rd-><a href="http://state.grbm.sh"
            rel="noreferrer" target="_blank" moz-do-not-send="true">state.grbm.sh</a>
          >= adev->gfx.config.max_sh_per_se) ||<br>
          >> +                 (rd-><a
            href="http://state.grbm.se" rel="noreferrer" target="_blank"
            moz-do-not-send="true">state.grbm.se</a> != 0xFFFFFFFF
          && rd-><a href="http://state.grbm.se"
            rel="noreferrer" target="_blank" moz-do-not-send="true">state.grbm.se</a>
          >= adev->gfx.config.max_shader_engines)) {<br>
          >> +                   
           pm_runtime_mark_last_busy(adev_to_drm(adev)->dev);<br>
          >> +                   
           pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);<br>
          >> +                   
           amdgpu_virt_disable_access_debugfs(adev);<br>
          >> +                     return -EINVAL;<br>
          >> +             }<br>
          >> +           
           mutex_lock(&adev->grbm_idx_mutex);<br>
          >> +             amdgpu_gfx_select_se_sh(adev, rd-><a
            href="http://state.grbm.se" rel="noreferrer" target="_blank"
            moz-do-not-send="true">state.grbm.se</a>,<br>
          >> +                                                   
                   rd-><a href="http://state.grbm.sh"
            rel="noreferrer" target="_blank" moz-do-not-send="true">state.grbm.sh</a>,<br>
          >> +                                                   
                   rd->state.grbm.instance);<br>
          >> +     } else if (rd->state.use_grbm) {<br>
          >> +             mutex_lock(&adev->srbm_mutex);<br>
          >> +             amdgpu_gfx_select_me_pipe_q(adev,
          rd-><a href="http://state.srbm.me" rel="noreferrer"
            target="_blank" moz-do-not-send="true">state.srbm.me</a>,
          rd->state.srbm.pipe,<br>
          >> +                                                   
                           rd->state.srbm.queue,
          rd->state.srbm.vmid);<br>
          >> +     }<br>
          >> +<br>
          >> +     if (rd->state.pg_lock)<br>
          >> +             mutex_lock(&adev->pm.mutex);<br>
          >> +<br>
          >> +     if (!write_en) {<br>
          >> +             value = RREG32(rd->state.offset
          >> 2);<br>
          >> +             r = put_user(value, (uint32_t *)buf);<br>
          >> +     } else {<br>
          >> +             r = get_user(value, (uint32_t *)buf);<br>
          >> +             if (!r)<br>
          >> +                     amdgpu_mm_wreg_mmio_rlc(adev,
          rd->state.offset >> 2, value);<br>
          >> +     }<br>
          >> +     if (r) {<br>
          >> +             result = r;<br>
          >> +             goto end;<br>
          >> +     }<br>
          >> +     result = 0;<br>
          >> +end:<br>
          >> +     if (rd->state.use_grbm) {<br>
          >> +             amdgpu_gfx_select_se_sh(adev,
          0xffffffff, 0xffffffff, 0xffffffff);<br>
          >> +           
           mutex_unlock(&adev->grbm_idx_mutex);<br>
          >> +     } else if (rd->state.use_srbm) {<br>
          >> +             amdgpu_gfx_select_me_pipe_q(adev, 0, 0,
          0, 0);<br>
          >> +             mutex_unlock(&adev->srbm_mutex);<br>
          >> +     }<br>
          >> +<br>
          >> +     if (rd->state.pg_lock)<br>
          >> +             mutex_unlock(&adev->pm.mutex);<br>
          >> +<br>
          >> +     // in umr (the likely user of this) flags are
          set per file operation<br>
          >> +     // which means they're never "unset"
          explicitly.  To avoid breaking<br>
          >> +     // this convention we unset the flags after
          each operation<br>
          >> +     // flags are for a single call (need to be set
          for every read/write)<br>
          >> +     rd->state.use_grbm = 0;<br>
          >> +     rd->state.use_srbm = 0;<br>
          >> +     rd->state.pg_lock  = 0;<br>
          >> +<br>
          >> +   
           pm_runtime_mark_last_busy(adev_to_drm(adev)->dev);<br>
          >> +   
           pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);<br>
          >> +<br>
          >> +     amdgpu_virt_disable_access_debugfs(adev);<br>
          >> +     return result;<br>
          >> +}<br>
          >> +<br>
          >> +static long amdgpu_debugfs_regs2_ioctl(struct file
          *f, unsigned int cmd, unsigned long data)<br>
          >> +{<br>
          >> +     struct amdgpu_debugfs_regs2_data *rd =
          f->private_data;<br>
          >> +     unsigned char st[32], v;<br>
          >> +     int r, x;<br>
          >> +<br>
          >> +     // always read first 4 bytes of data<br>
          >> +     for (x = 0; x < 4; x++) {<br>
          >> +             if ((r = get_user(v, (unsigned char
          *)data))) {<br>
          >> +                     return r;<br>
          >> +             }<br>
          >> +             ++data;<br>
          >> +             st[x] = v;<br>
          >> +     }<br>
          >> +<br>
          >> +     // first 4 bytes are offset<br>
          >> +     rd->state.offset = ((u32)st[0]) |
          ((u32)st[1] << 8) |<br>
          >> +                                        ((u32)st[2]
          << 16) | ((u32)st[3] << 24);<br>
          >> +<br>
          >> +     switch (cmd) {<br>
          >> +             case
          AMDGPU_DEBUGFS_REGS2_IOC_SET_STATE:<br>
          >> +                     for (x = 4; x < 32; x++) {<br>
          >> +                             if ((r = get_user(v,
          (unsigned char *)data))) {<br>
          >> +                                     return r;<br>
          >> +                             }<br>
          >> +                             ++data;<br>
          >> +                             st[x] = v;<br>
          >> +                     }<br>
          >> +<br>
          >> +                     // next byte contains the flag<br>
          >> +                     // we only consume the
          remainder of the state if bit 1 is set<br>
          >> +                     // this allows the subsequent
          read/write<br>
          >> +                     rd->state.use_grbm = (st[4]
          & 1) ? 1 : 0;<br>
          >> +                     rd->state.use_srbm = (st[4]
          & 2) ? 1 : 0;<br>
          >> +                     rd->state.pg_lock  = (st[4]
          & 4) ? 1 : 0;<br>
          >> +<br>
          >> +                     // next 6 bytes are grbm data<br>
          >> +                     rd-><a
            href="http://state.grbm.se" rel="noreferrer" target="_blank"
            moz-do-not-send="true">state.grbm.se</a>       = 
          ((u32)st[5]) | ((u32)st[6] << 8);<br>
          >> +                     rd-><a
            href="http://state.grbm.sh" rel="noreferrer" target="_blank"
            moz-do-not-send="true">state.grbm.sh</a>       = 
          ((u32)st[7]) | ((u32)st[8] << 8);<br>
          >> +                     rd->state.grbm.instance = 
          ((u32)st[9]) | ((u32)st[10] << 8);<br>
          >> +<br>
          >> +                     // next 8 are srbm data<br>
          >> +                     rd-><a
            href="http://state.srbm.me" rel="noreferrer" target="_blank"
            moz-do-not-send="true">state.srbm.me</a>       = 
          ((u32)st[11]) | ((u32)st[12] << 8);<br>
          >> +                     rd->state.srbm.pipe     = 
          ((u32)st[13]) | ((u32)st[14] << 8);<br>
          >> +                     rd->state.srbm.queue    = 
          ((u32)st[15]) | ((u32)st[16] << 8);<br>
          >> +                     rd->state.srbm.vmid     = 
          ((u32)st[17]) | ((u32)st[18] << 8);<br>
          >> +                     break;<br>
          >> +             case AMDGPU_DEBUGFS_REGS2_IOC_READ:<br>
          >> +                     return
          amdgpu_debugfs_regs2_op(f, (char __user *)data, 0);<br>
          >> +             case AMDGPU_DEBUGFS_REGS2_IOC_WRITE:<br>
          >> +                     return
          amdgpu_debugfs_regs2_op(f, (char __user *)data, 1);<br>
          >> +             default:<br>
          >> +                     return -EINVAL;<br>
          >> +     }<br>
          >> +     return 0;<br>
          >> +}<br>
          >><br>
          >>    /**<br>
          >>     * amdgpu_debugfs_regs_pcie_read - Read from a
          PCIE register<br>
          >> @@ -1091,6 +1258,14 @@ static ssize_t
          amdgpu_debugfs_gfxoff_read(struct file *f, char __user *buf,<br>
          >>        return result;<br>
          >>    }<br>
          >><br>
          >> +static const struct file_operations
          amdgpu_debugfs_regs2_fops = {<br>
          >> +     .owner = THIS_MODULE,<br>
          >> +     .unlocked_ioctl = amdgpu_debugfs_regs2_ioctl,<br>
          >> +     .open = amdgpu_debugfs_regs2_open,<br>
          >> +     .release = amdgpu_debugfs_regs2_release,<br>
          >> +     .llseek = default_llseek<br>
          >> +};<br>
          >> +<br>
          >>    static const struct file_operations
          amdgpu_debugfs_regs_fops = {<br>
          >>        .owner = THIS_MODULE,<br>
          >>        .read = amdgpu_debugfs_regs_read,<br>
          >> @@ -1148,6 +1323,7 @@ static const struct
          file_operations amdgpu_debugfs_gfxoff_fops = {<br>
          >><br>
          >>    static const struct file_operations
          *debugfs_regs[] = {<br>
          >>        &amdgpu_debugfs_regs_fops,<br>
          >> +     &amdgpu_debugfs_regs2_fops,<br>
          >>        &amdgpu_debugfs_regs_didt_fops,<br>
          >>        &amdgpu_debugfs_regs_pcie_fops,<br>
          >>        &amdgpu_debugfs_regs_smc_fops,<br>
          >> @@ -1160,6 +1336,7 @@ static const struct
          file_operations *debugfs_regs[] = {<br>
          >><br>
          >>    static const char *debugfs_regs_names[] = {<br>
          >>        "amdgpu_regs",<br>
          >> +     "amdgpu_regs2",<br>
          >>        "amdgpu_regs_didt",<br>
          >>        "amdgpu_regs_pcie",<br>
          >>        "amdgpu_regs_smc",<br>
          >> diff --git
          a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h
          b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h<br>
          >> index 141a8474e24f..04c81cd4bcc7 100644<br>
          >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h<br>
          >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h<br>
          >> @@ -22,6 +22,7 @@<br>
          >>     * OTHER DEALINGS IN THE SOFTWARE.<br>
          >>     *<br>
          >>     */<br>
          >> +#include <linux/ioctl.h><br>
          >><br>
          >>    /*<br>
          >>     * Debugfs<br>
          >> @@ -38,3 +39,34 @@ void
          amdgpu_debugfs_fence_init(struct amdgpu_device *adev);<br>
          >>    void amdgpu_debugfs_firmware_init(struct
          amdgpu_device *adev);<br>
          >>    void amdgpu_debugfs_gem_init(struct amdgpu_device
          *adev);<br>
          >>    int amdgpu_debugfs_wait_dump(struct amdgpu_device
          *adev);<br>
          >> +<br>
          >> +struct amdgpu_debugfs_regs2_data {<br>
          >> +     struct amdgpu_device *adev;<br>
          >> +     struct {<br>
          >> +             // regs state<br>
          >> +             int use_srbm,<br>
          >> +                 use_grbm,<br>
          >> +                 pg_lock;<br>
          >> +             struct {<br>
          >> +                     u32 se, sh, instance;<br>
          >> +             } grbm;<br>
          >> +             struct {<br>
          >> +                     u32 me, pipe, queue, vmid;<br>
          >> +             } srbm;<br>
          >> +             u32 offset;<br>
          >> +     } state;<br>
          >> +};<br>
          > This stuff needs to be in some shared header then.<br>
          ><br>
          >> +<br>
          >> +enum AMDGPU_DEBUGFS_REGS2_CMDS {<br>
          >> +     AMDGPU_DEBUGFS_REGS2_CMD_SET_STATE=0,<br>
          >> +     AMDGPU_DEBUGFS_REGS2_CMD_READ,<br>
          >> +     AMDGPU_DEBUGFS_REGS2_CMD_WRITE,<br>
          > Why not just using the normal read and write functions?<br>
          ><br>
          > Christian.<br>
          ><br>
          >> +};<br>
          >> +<br>
          >> +struct amdgpu_debugfs_regs2_iocdata {<br>
          >> +     unsigned char t[32];<br>
          >> +};<br>
          >> +<br>
          >> +#define AMDGPU_DEBUGFS_REGS2_IOC_SET_STATE
          _IOWR(0x20, AMDGPU_DEBUGFS_REGS2_CMD_SET_STATE, struct
          amdgpu_debugfs_regs2_iocdata)<br>
          >> +#define AMDGPU_DEBUGFS_REGS2_IOC_READ _IOWR(0x20,
          AMDGPU_DEBUGFS_REGS2_CMD_WRITE, struct
          amdgpu_debugfs_regs2_iocdata)<br>
          >> +#define AMDGPU_DEBUGFS_REGS2_IOC_WRITE _IOWR(0x20,
          AMDGPU_DEBUGFS_REGS2_CMD_READ, struct
          amdgpu_debugfs_regs2_iocdata)<br>
          <br>
        </blockquote>
      </div>
    </blockquote>
    <br>
  </body>
</html>