<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body>
    I suggest at least giving this command a try (I remembered the
    name after a moment):<br>
    <br>
    pahole drivers/gpu/drm/amd/amdgpu/amdgpu.o -C
    amdgpu_debugfs_regs2_iocdata<br>
    <br>
    It prints a rather nifty layout of your structure, with padding
    holes, byte offsets, cache line boundaries, etc.<br>
    <br>
    Christian.<br>
    <br>
    <div class="moz-cite-prefix">On 25.08.21 at 13:04, Tom St Denis
      wrote:<br>
    </div>
    <blockquote type="cite"
cite="mid:CAAzXoRJsSxm7UOt++H9Ko8pnnNdO0Zp=+hPgCh8TwUaQhv-e_w@mail.gmail.com">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      <div dir="ltr">I tested it by forcing bit patterns into the ioctl
        data and printing it out in the kernel log.  I'm not tied to
        it one way or the other.  I'll just change it to u32.</div>
      <br>
      <div class="gmail_quote">
        <div dir="ltr" class="gmail_attr">On Wed, Aug 25, 2021 at 7:03
          AM Christian König <<a
            href="mailto:ckoenig.leichtzumerken@gmail.com"
            moz-do-not-send="true">ckoenig.leichtzumerken@gmail.com</a>>
          wrote:<br>
        </div>
        <blockquote class="gmail_quote" style="margin:0px 0px 0px
          0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
          <div> Using u8 is OK as well, just make sure that you don't
            have any hidden padding.<br>
            <br>
            Nirmoy had a tool to double-check for padding, whose name I
            have once again forgotten.<br>
            <br>
            Christian.<br>
            <br>
            <div>On 25.08.21 at 12:40, Tom St Denis wrote:<br>
            </div>
            <blockquote type="cite">
              <div dir="ltr">The struct works as is, but I'll change the
                fields to u32.  The offset is an artefact of this having
                originally been an IOCTL.  I'm working on both ends in
                parallel and making the changes at the same time, because
                I only submit the kernel patch once I've tested it in
                userspace.
                <div><br>
                </div>
                <div>I'll send a v4 in a bit this morning....</div>
                <div><br>
                </div>
                <div>Tom</div>
              </div>
              <br>
              <div class="gmail_quote">
                <div dir="ltr" class="gmail_attr">On Wed, Aug 25, 2021
                  at 2:35 AM Christian König <<a
                    href="mailto:ckoenig.leichtzumerken@gmail.com"
                    target="_blank" moz-do-not-send="true">ckoenig.leichtzumerken@gmail.com</a>>
                  wrote:<br>
                </div>
                <blockquote class="gmail_quote" style="margin:0px 0px
                  0px 0.8ex;border-left:1px solid
                  rgb(204,204,204);padding-left:1ex"><br>
                  <br>
                  On 24.08.21 at 15:36, Tom St Denis wrote:<br>
                  > This new debugfs interface uses an IOCTL
                  interface in order to pass<br>
                  > along state information like SRBM and GRBM bank
                  switching.  This<br>
                  > new interface also allows a full 32-bit MMIO
                  address range which<br>
                  > the previous didn't.  With this new design we
                  have room to grow<br>
                  > the flexibility of the file as need be.<br>
                  ><br>
                  > (v2): Move read/write to .read/.write, fix style,
                  add comment<br>
                  >        for IOCTL data structure<br>
                  ><br>
                  > Signed-off-by: Tom St Denis <<a
                    href="mailto:tom.stdenis@amd.com" target="_blank"
                    moz-do-not-send="true">tom.stdenis@amd.com</a>><br>
                  > ---<br>
                  >   drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c |
                  162 ++++++++++++++++++++<br>
                  >   drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h | 
                  32 ++++<br>
                  >   2 files changed, 194 insertions(+)<br>
                  ><br>
                  > diff --git
                  a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
                  b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c<br>
                  > index 277128846dd1..8e8f5743c8f5 100644<br>
                  > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c<br>
                  > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c<br>
                  > @@ -279,6 +279,156 @@ static ssize_t
                  amdgpu_debugfs_regs_write(struct file *f, const char
                  __user *buf,<br>
                  >       return amdgpu_debugfs_process_reg_op(false,
                  f, (char __user *)buf, size, pos);<br>
                  >   }<br>
                  >   <br>
                  > +static int amdgpu_debugfs_regs2_open(struct
                  inode *inode, struct file *file)<br>
                  > +{<br>
                  > +     struct amdgpu_debugfs_regs2_data *rd;<br>
                  > +<br>
                  > +     rd = kzalloc(sizeof *rd, GFP_KERNEL);<br>
                  > +     if (!rd)<br>
                  > +             return -ENOMEM;<br>
                  > +     rd->adev =
                  file_inode(file)->i_private;<br>
                  > +     file->private_data = rd;<br>
                  > +<br>
                  > +     return 0;<br>
                  > +}<br>
                  > +<br>
                  > +static int amdgpu_debugfs_regs2_release(struct
                  inode *inode, struct file *file)<br>
                  > +{<br>
                  > +     kfree(file->private_data);<br>
                  > +     return 0;<br>
                  > +}<br>
                  > +<br>
                  > +static ssize_t amdgpu_debugfs_regs2_op(struct
                  file *f, char __user *buf, size_t size, int write_en)<br>
                  > +{<br>
                  > +     struct amdgpu_debugfs_regs2_data *rd =
                  f->private_data;<br>
                  > +     struct amdgpu_device *adev = rd->adev;<br>
                  > +     ssize_t result = 0;<br>
                  > +     int r;<br>
                  > +     uint32_t value;<br>
                  > +<br>
                  > +     if (size & 0x3 || rd->state.offset
                  & 0x3)<br>
                  > +             return -EINVAL;<br>
                  > +<br>
                  > +     if (rd->state.id.use_grbm) {<br>
                  > +             if (rd->state.id.grbm.se == 0x3FF)<br>
                  > +                     rd->state.id.grbm.se = 0xFFFFFFFF;<br>
                  > +             if (rd->state.id.grbm.sh == 0x3FF)<br>
                  > +                     rd->state.id.grbm.sh = 0xFFFFFFFF;<br>
                  > +             if (rd->state.id.grbm.instance
                  == 0x3FF)<br>
                  > +                   
                   rd->state.id.grbm.instance = 0xFFFFFFFF;<br>
                  > +     }<br>
                  > +<br>
                  > +     r =
                  pm_runtime_get_sync(adev_to_drm(adev)->dev);<br>
                  > +     if (r < 0) {<br>
                  > +           
                   pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);<br>
                  > +             return r;<br>
                  > +     }<br>
                  > +<br>
                  > +     r =
                  amdgpu_virt_enable_access_debugfs(adev);<br>
                  > +     if (r < 0) {<br>
                  > +           
                   pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);<br>
                  > +             return r;<br>
                  > +     }<br>
                  > +<br>
                  > +     if (rd->state.id.use_grbm) {<br>
                  > +             if ((rd->state.id.grbm.sh != 0xFFFFFFFF && rd->state.id.grbm.sh
                  >= adev->gfx.config.max_sh_per_se) ||<br>
                  > +                 (rd->state.id.grbm.se != 0xFFFFFFFF && rd->state.id.grbm.se
                  >= adev->gfx.config.max_shader_engines)) {<br>
                  > +                   
                   pm_runtime_mark_last_busy(adev_to_drm(adev)->dev);<br>
                  > +                   
                   pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);<br>
                  > +                   
                   amdgpu_virt_disable_access_debugfs(adev);<br>
                  > +                     return -EINVAL;<br>
                  > +             }<br>
                  > +           
                   mutex_lock(&adev->grbm_idx_mutex);<br>
                  > +             amdgpu_gfx_select_se_sh(adev,
                  rd->state.id.grbm.se,<br>
                  > +                                               
                               rd->state.id.grbm.sh,<br>
                  > +                                               
                               rd->state.id.grbm.instance);<br>
                  > +     }<br>
                  > +<br>
                  > +     if (rd->state.id.use_srbm) {<br>
                  > +           
                   mutex_lock(&adev->srbm_mutex);<br>
                  > +             amdgpu_gfx_select_me_pipe_q(adev,
                  rd->state.id.srbm.me,
                  rd->state.id.srbm.pipe,<br>
                  > +                                               
                                       rd->state.id.srbm.queue,
                  rd->state.id.srbm.vmid);<br>
                  > +     }<br>
                  > +<br>
                  > +     if (rd->state.id.pg_lock)<br>
                  > +             mutex_lock(&adev->pm.mutex);<br>
                  > +<br>
                  > +     while (size) {<br>
                  > +             if (!write_en) {<br>
                  > +                     value =
                  RREG32(rd->state.offset >> 2);<br>
                  > +                     r = put_user(value,
                  (uint32_t *)buf);<br>
                  > +             } else {<br>
                  > +                     r = get_user(value,
                  (uint32_t *)buf);<br>
                  > +                     if (!r)<br>
                  > +                           
                   amdgpu_mm_wreg_mmio_rlc(adev, rd->state.offset
                  >> 2, value);<br>
                  > +             }<br>
                  > +             if (r) {<br>
                  > +                     result = r;<br>
                  > +                     goto end;<br>
                  > +             }<br>
                  > +             rd->state.offset += 4;<br>
                  > +             size -= 4;<br>
                  > +             result += 4;<br>
                  > +             buf += 4;<br>
                  > +     }<br>
                  > +end:<br>
                  > +     if (rd->state.id.use_grbm) {<br>
                  > +             amdgpu_gfx_select_se_sh(adev,
                  0xffffffff, 0xffffffff, 0xffffffff);<br>
                  > +           
                   mutex_unlock(&adev->grbm_idx_mutex);<br>
                  > +     }<br>
                  > +<br>
                  > +     if (rd->state.id.use_srbm) {<br>
                  > +             amdgpu_gfx_select_me_pipe_q(adev,
                  0, 0, 0, 0);<br>
                  > +           
                   mutex_unlock(&adev->srbm_mutex);<br>
                  > +     }<br>
                  > +<br>
                  > +     if (rd->state.id.pg_lock)<br>
                  > +           
                   mutex_unlock(&adev->pm.mutex);<br>
                  > +<br>
                  > +     // in umr (the likely user of this) flags
                  are set per file operation<br>
                  > +     // which means they're never "unset"
                  explicitly.  To avoid breaking<br>
                  > +     // this convention we unset the flags after
                  each operation<br>
                  > +     // flags are for a single call (need to be
                  set for every read/write)<br>
                  > +     rd->state.id.use_grbm = 0;<br>
                  > +     rd->state.id.use_srbm = 0;<br>
                  > +     rd->state.id.pg_lock  = 0;<br>
                  > +<br>
                  > +   
                   pm_runtime_mark_last_busy(adev_to_drm(adev)->dev);<br>
                  > +   
                   pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);<br>
                  > +<br>
                  > +     amdgpu_virt_disable_access_debugfs(adev);<br>
                  > +     return result;<br>
                  > +}<br>
                  > +<br>
                  > +static long amdgpu_debugfs_regs2_ioctl(struct
                  file *f, unsigned int cmd, unsigned long data)<br>
                  > +{<br>
                  > +     struct amdgpu_debugfs_regs2_data *rd =
                  f->private_data;<br>
                  > +<br>
                  > +     switch (cmd) {<br>
                  > +     case AMDGPU_DEBUGFS_REGS2_IOC_SET_STATE:<br>
                  > +             if (copy_from_user(&rd->state.id,
                  (struct amdgpu_debugfs_regs2_iocdata *)data, sizeof
                  rd->state.id))<br>
                  > +                     return -EINVAL;<br>
                  > +             break;<br>
                  > +     default:<br>
                  > +             return -EINVAL;<br>
                  > +     }<br>
                  > +     return 0;<br>
                  > +}<br>
                  > +<br>
                  > +static ssize_t amdgpu_debugfs_regs2_read(struct
                  file *f, char __user *buf, size_t size, loff_t *pos)<br>
                  > +{<br>
                  > +     struct amdgpu_debugfs_regs2_data *rd =
                  f->private_data;<br>
                  > +     rd->state.offset = *pos;<br>
                  > +     return amdgpu_debugfs_regs2_op(f, buf,
                  size, 0);<br>
                  > +}<br>
                  > +<br>
                  > +static ssize_t amdgpu_debugfs_regs2_write(struct
                  file *f, const char __user *buf, size_t size, loff_t
                  *pos)<br>
                  > +{<br>
                  > +     struct amdgpu_debugfs_regs2_data *rd =
                  f->private_data;<br>
                  > +     rd->state.offset = *pos;<br>
                  > +     return amdgpu_debugfs_regs2_op(f, (char
                  __user *)buf, size, 1);<br>
                  > +}<br>
                  > +<br>
                  >   <br>
                  >   /**<br>
                  >    * amdgpu_debugfs_regs_pcie_read - Read from a
                  PCIE register<br>
                  > @@ -1091,6 +1241,16 @@ static ssize_t
                  amdgpu_debugfs_gfxoff_read(struct file *f, char __user
                  *buf,<br>
                  >       return result;<br>
                  >   }<br>
                  >   <br>
                  > +static const struct file_operations
                  amdgpu_debugfs_regs2_fops = {<br>
                  > +     .owner = THIS_MODULE,<br>
                  > +     .unlocked_ioctl =
                  amdgpu_debugfs_regs2_ioctl,<br>
                  > +     .read = amdgpu_debugfs_regs2_read,<br>
                  > +     .write = amdgpu_debugfs_regs2_write,<br>
                  > +     .open = amdgpu_debugfs_regs2_open,<br>
                  > +     .release = amdgpu_debugfs_regs2_release,<br>
                  > +     .llseek = default_llseek<br>
                  > +};<br>
                  > +<br>
                  >   static const struct file_operations
                  amdgpu_debugfs_regs_fops = {<br>
                  >       .owner = THIS_MODULE,<br>
                  >       .read = amdgpu_debugfs_regs_read,<br>
                  > @@ -1148,6 +1308,7 @@ static const struct
                  file_operations amdgpu_debugfs_gfxoff_fops = {<br>
                  >   <br>
                  >   static const struct file_operations
                  *debugfs_regs[] = {<br>
                  >       &amdgpu_debugfs_regs_fops,<br>
                  > +     &amdgpu_debugfs_regs2_fops,<br>
                  >       &amdgpu_debugfs_regs_didt_fops,<br>
                  >       &amdgpu_debugfs_regs_pcie_fops,<br>
                  >       &amdgpu_debugfs_regs_smc_fops,<br>
                  > @@ -1160,6 +1321,7 @@ static const struct
                  file_operations *debugfs_regs[] = {<br>
                  >   <br>
                  >   static const char *debugfs_regs_names[] = {<br>
                  >       "amdgpu_regs",<br>
                  > +     "amdgpu_regs2",<br>
                  >       "amdgpu_regs_didt",<br>
                  >       "amdgpu_regs_pcie",<br>
                  >       "amdgpu_regs_smc",<br>
                  > diff --git
                  a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h
                  b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h<br>
                  > index 141a8474e24f..ec044df5d428 100644<br>
                  > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h<br>
                  > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h<br>
                  > @@ -22,6 +22,8 @@<br>
                  >    * OTHER DEALINGS IN THE SOFTWARE.<br>
                  >    *<br>
                  >    */<br>
                  > +#include <linux/ioctl.h><br>
                  > +#include <uapi/drm/amdgpu_drm.h><br>
                  >   <br>
                  >   /*<br>
                  >    * Debugfs<br>
                  > @@ -38,3 +40,33 @@ void
                  amdgpu_debugfs_fence_init(struct amdgpu_device *adev);<br>
                  >   void amdgpu_debugfs_firmware_init(struct
                  amdgpu_device *adev);<br>
                  >   void amdgpu_debugfs_gem_init(struct
                  amdgpu_device *adev);<br>
                  >   int amdgpu_debugfs_wait_dump(struct
                  amdgpu_device *adev);<br>
                  > +<br>
                  > +/*<br>
                  > + * MMIO debugfs IOCTL structure<br>
                  > + */<br>
                  > +struct amdgpu_debugfs_regs2_iocdata {<br>
                  > +     __u8 use_srbm, use_grbm, pg_lock;<br>
                  <br>
                  You should consider using u32 here as well, or add
                  explicit padding.<br>
                  <br>
                  > +     struct {<br>
                  > +             __u32 se, sh, instance;<br>
                  > +     } grbm;<br>
                  > +     struct {<br>
                  > +             __u32 me, pipe, queue, vmid;<br>
                  > +     } srbm;<br>
                  > +};<br>
                  > +<br>
                  > +/*<br>
                  > + * MMIO debugfs state data (per file* handle)<br>
                  > + */<br>
                  > +struct amdgpu_debugfs_regs2_data {<br>
                  > +     struct amdgpu_device *adev;<br>
                  > +     struct {<br>
                  > +             struct amdgpu_debugfs_regs2_iocdata
                  id;<br>
                  > +             __u32 offset;<br>
                  <br>
                  What is the offset good for here?<br>
                  <br>
                  Regards,<br>
                  Christian.<br>
                  <br>
                  > +     } state;<br>
                  > +};<br>
                  > +<br>
                  > +enum AMDGPU_DEBUGFS_REGS2_CMDS {<br>
                  > +     AMDGPU_DEBUGFS_REGS2_CMD_SET_STATE=0,<br>
                  > +};<br>
                  > +<br>
                  > +#define AMDGPU_DEBUGFS_REGS2_IOC_SET_STATE
                  _IOWR(0x20, AMDGPU_DEBUGFS_REGS2_CMD_SET_STATE, struct
                  amdgpu_debugfs_regs2_iocdata)<br>
                  <br>
                </blockquote>
              </div>
            </blockquote>
            <br>
          </div>
        </blockquote>
      </div>
    </blockquote>
    <br>
  </body>
</html>