<div dir="ltr">hehehe I just moved it to uapi... No worries, you're the maintainer, I'll move it back before posting v2.<div><br></div><div>Cheers,</div><div>Tom</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Aug 24, 2021 at 9:22 AM Christian König <<a href="mailto:ckoenig.leichtzumerken@gmail.com">ckoenig.leichtzumerken@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
On 24.08.21 at 14:42, Tom St Denis wrote:<br>
<blockquote type="cite">
<div dir="ltr">The IOCTL data is in the debugfs header as it is.
I could move that to the amdgpu_drm.h and include it from
amdgpu_debugfs.h.</div>
</blockquote>
<br>
Na, keep it like that and just add a comment.<br>
<br>
On second thought, I don't want to raise any discussion on the
mailing list about whether we should have a debugfs structure in the
UAPI headers.<br>
<br>
Christian.<br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div><br>
</div>
<div>I'll re-write the STATE IOCTL to use a struct and then test
against what I have in umr.</div>
<div><br>
</div>
<div>Refactoring the read/write is trivial and I'll do that no
problem (with style fixes).</div>
<div><br>
</div>
<div>Cheers,</div>
<div>Tom</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Tue, Aug 24, 2021 at 8:34
AM Christian König <<a href="mailto:ckoenig.leichtzumerken@gmail.com" target="_blank">ckoenig.leichtzumerken@gmail.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 24.08.21 at 14:27, StDenis, Tom wrote:<br>
> [AMD Official Use Only]<br>
><br>
> What do you mean by a "shared header"? How would it be shared between kernel and user?<br>
<br>
Somewhere in the include/uapi/drm/ folder I think. Either add
that to <br>
amdgpu_drm.h or maybe amdgpu_debugfs.h?<br>
<br>
Or just keep it as a structure in amdgpu_debugfs.h with a
comment that <br>
it is used for an IOCTL.<br>
<br>
Just not something hard-coded like "the first byte is this, the
second byte is that", etc.<br>
<br>
> As for why not read/write: I just wanted to keep it simple. It's
not really performance bound, and umr never does reads/writes larger
than 32 bits anyway. It doesn't have to be this way, though; I figured
the fewer LOC the better.<br>
<br>
I think it would be cleaner if we had separate functions for this.<br>
<br>
As far as I can see you also don't need the dance with the power
management etc. for the IOCTL. It's just get_user() into your
structure.<br>
<br>
BTW: In the kernel coding style, put "switch" and "case" in the same
column, otherwise checkpatch.pl might complain.<br>
<br>
Christian.<br>
<br>
><br>
> Tom<br>
><br>
> ________________________________________<br>
> From: Christian König <<a href="mailto:ckoenig.leichtzumerken@gmail.com" target="_blank">ckoenig.leichtzumerken@gmail.com</a>><br>
> Sent: Tuesday, August 24, 2021 08:20<br>
> To: StDenis, Tom; <a href="mailto:amd-gfx@lists.freedesktop.org" target="_blank">amd-gfx@lists.freedesktop.org</a><br>
> Subject: Re: [PATCH] drm/amd/amdgpu: New debugfs
interface for MMIO registers<br>
><br>
><br>
><br>
> On 24.08.21 at 14:16, Tom St Denis wrote:<br>
>> This new debugfs interface uses an IOCTL interface in order to pass<br>
>> along state information like SRBM and GRBM bank switching. This<br>
>> new interface also allows a full 32-bit MMIO address range which<br>
>> the previous didn't. With this new design we have room to grow<br>
>> the flexibility of the file as need be.<br>
>><br>
>> Signed-off-by: Tom St Denis <<a href="mailto:tom.stdenis@amd.com" target="_blank">tom.stdenis@amd.com</a>><br>
>> ---<br>
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 177 ++++++++++++++++++++<br>
>>  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h |  32 ++++<br>
>> 2 files changed, 209 insertions(+)<br>
>><br>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c<br>
>> index 277128846dd1..ab2d92f84da5 100644<br>
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c<br>
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c<br>
>> @@ -279,6 +279,173 @@ static ssize_t amdgpu_debugfs_regs_write(struct file *f, const char __user *buf,<br>
>>       return amdgpu_debugfs_process_reg_op(false, f, (char __user *)buf, size, pos);<br>
>>  }<br>
>><br>
>> +static int amdgpu_debugfs_regs2_open(struct inode *inode, struct file *file)<br>
>> +{<br>
>> +     struct amdgpu_debugfs_regs2_data *rd;<br>
>> +<br>
>> +     rd = kzalloc(sizeof *rd, GFP_KERNEL);<br>
>> +     if (!rd)<br>
>> +             return -ENOMEM;<br>
>> +     rd->adev = file_inode(file)->i_private;<br>
>> +     file->private_data = rd;<br>
>> +<br>
>> +     return 0;<br>
>> +}<br>
>> +<br>
>> +static int amdgpu_debugfs_regs2_release(struct inode *inode, struct file *file)<br>
>> +{<br>
>> +     kfree(file->private_data);<br>
>> +     return 0;<br>
>> +}<br>
>> +<br>
>> +static int amdgpu_debugfs_regs2_op(struct file *f, char __user *buf, int write_en)<br>
>> +{<br>
>> +     struct amdgpu_debugfs_regs2_data *rd = f->private_data;<br>
>> +     struct amdgpu_device *adev = rd->adev;<br>
>> +     int result = 0, r;<br>
>> + uint32_t value;<br>
>> +<br>
>> + if (rd->state.offset & 0x3)<br>
>> + return -EINVAL;<br>
>> +<br>
>> +     if (rd->state.use_grbm) {<br>
>> +             if (rd->state.grbm.se == 0x3FF)<br>
>> +                     rd->state.grbm.se = 0xFFFFFFFF;<br>
>> +             if (rd->state.grbm.sh == 0x3FF)<br>
>> +                     rd->state.grbm.sh = 0xFFFFFFFF;<br>
>> +             if (rd->state.grbm.instance == 0x3FF)<br>
>> +                     rd->state.grbm.instance = 0xFFFFFFFF;<br>
>> +     }<br>
>> +<br>
>> +     r = pm_runtime_get_sync(adev_to_drm(adev)->dev);<br>
>> +     if (r < 0) {<br>
>> +             pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);<br>
>> +             return r;<br>
>> +     }<br>
>> +<br>
>> +     r = amdgpu_virt_enable_access_debugfs(adev);<br>
>> +     if (r < 0) {<br>
>> +             pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);<br>
>> +             return r;<br>
>> +     }<br>
>> +<br>
>> +     if (rd->state.use_grbm) {<br>
>> +             if ((rd->state.grbm.sh != 0xFFFFFFFF && rd->state.grbm.sh >= adev->gfx.config.max_sh_per_se) ||<br>
>> +                 (rd->state.grbm.se != 0xFFFFFFFF && rd->state.grbm.se >= adev->gfx.config.max_shader_engines)) {<br>
>> +                     pm_runtime_mark_last_busy(adev_to_drm(adev)->dev);<br>
>> +                     pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);<br>
>> +                     amdgpu_virt_disable_access_debugfs(adev);<br>
>> +                     return -EINVAL;<br>
>> +             }<br>
>> +             mutex_lock(&adev->grbm_idx_mutex);<br>
>> +             amdgpu_gfx_select_se_sh(adev, rd->state.grbm.se,<br>
>> +                                     rd->state.grbm.sh,<br>
>> +                                     rd->state.grbm.instance);<br>
>> +     } else if (rd->state.use_srbm) {<br>
>> +             mutex_lock(&adev->srbm_mutex);<br>
>> +             amdgpu_gfx_select_me_pipe_q(adev, rd->state.srbm.me, rd->state.srbm.pipe,<br>
>> +                                         rd->state.srbm.queue, rd->state.srbm.vmid);<br>
>> +     }<br>
>> +<br>
>> + if (rd->state.pg_lock)<br>
>> + mutex_lock(&adev->pm.mutex);<br>
>> +<br>
>> +     if (!write_en) {<br>
>> +             value = RREG32(rd->state.offset >> 2);<br>
>> +             r = put_user(value, (uint32_t *)buf);<br>
>> +     } else {<br>
>> +             r = get_user(value, (uint32_t *)buf);<br>
>> +             if (!r)<br>
>> +                     amdgpu_mm_wreg_mmio_rlc(adev, rd->state.offset >> 2, value);<br>
>> +     }<br>
>> + if (r) {<br>
>> + result = r;<br>
>> + goto end;<br>
>> + }<br>
>> + result = 0;<br>
>> +end:<br>
>> +     if (rd->state.use_grbm) {<br>
>> +             amdgpu_gfx_select_se_sh(adev, 0xffffffff, 0xffffffff, 0xffffffff);<br>
>> +             mutex_unlock(&adev->grbm_idx_mutex);<br>
>> +     } else if (rd->state.use_srbm) {<br>
>> +             amdgpu_gfx_select_me_pipe_q(adev, 0, 0, 0, 0);<br>
>> +             mutex_unlock(&adev->srbm_mutex);<br>
>> +     }<br>
>> + }<br>
>> +<br>
>> + if (rd->state.pg_lock)<br>
>> + mutex_unlock(&adev->pm.mutex);<br>
>> +<br>
>> +     // in umr (the likely user of this) flags are set per file operation<br>
>> +     // which means they're never "unset" explicitly. To avoid breaking<br>
>> +     // this convention we unset the flags after each operation<br>
>> +     // flags are for a single call (need to be set for every read/write)<br>
>> +     rd->state.use_grbm = 0;<br>
>> +     rd->state.use_srbm = 0;<br>
>> +     rd->state.pg_lock = 0;<br>
>> +<br>
>> +     pm_runtime_mark_last_busy(adev_to_drm(adev)->dev);<br>
>> +     pm_runtime_put_autosuspend(adev_to_drm(adev)->dev);<br>
>> +<br>
>> + amdgpu_virt_disable_access_debugfs(adev);<br>
>> + return result;<br>
>> +}<br>
>> +<br>
>> +static long amdgpu_debugfs_regs2_ioctl(struct file *f, unsigned int cmd, unsigned long data)<br>
>> +{<br>
>> +     struct amdgpu_debugfs_regs2_data *rd = f->private_data;<br>
>> +     unsigned char st[32], v;<br>
>> +     int r, x;<br>
>> +<br>
>> +     // always read first 4 bytes of data<br>
>> +     for (x = 0; x < 4; x++) {<br>
>> +             if ((r = get_user(v, (unsigned char *)data))) {<br>
>> +                     return r;<br>
>> +             }<br>
>> +             ++data;<br>
>> +             st[x] = v;<br>
>> +     }<br>
>> +<br>
>> +     // first 4 bytes are offset<br>
>> +     rd->state.offset = ((u32)st[0]) | ((u32)st[1] << 8) |<br>
>> +                        ((u32)st[2] << 16) | ((u32)st[3] << 24);<br>
>> +<br>
>> +     switch (cmd) {<br>
>> +     case AMDGPU_DEBUGFS_REGS2_IOC_SET_STATE:<br>
>> +             for (x = 4; x < 32; x++) {<br>
>> +                     if ((r = get_user(v, (unsigned char *)data))) {<br>
>> +                             return r;<br>
>> +                     }<br>
>> +                     ++data;<br>
>> +                     st[x] = v;<br>
>> +             }<br>
>> +<br>
>> +             // next byte contains the flag<br>
>> +             // we only consume the remainder of the state if bit 1 is set<br>
>> +             // this allows the subsequent read/write<br>
>> +             rd->state.use_grbm = (st[4] & 1) ? 1 : 0;<br>
>> +             rd->state.use_srbm = (st[4] & 2) ? 1 : 0;<br>
>> +             rd->state.pg_lock = (st[4] & 4) ? 1 : 0;<br>
>> +<br>
>> +             // next 6 bytes are grbm data<br>
>> +             rd->state.grbm.se = ((u32)st[5]) | ((u32)st[6] << 8);<br>
>> +             rd->state.grbm.sh = ((u32)st[7]) | ((u32)st[8] << 8);<br>
>> +             rd->state.grbm.instance = ((u32)st[9]) | ((u32)st[10] << 8);<br>
>> +<br>
>> +             // next 8 are srbm data<br>
>> +             rd->state.srbm.me = ((u32)st[11]) | ((u32)st[12] << 8);<br>
>> +             rd->state.srbm.pipe = ((u32)st[13]) | ((u32)st[14] << 8);<br>
>> +             rd->state.srbm.queue = ((u32)st[15]) | ((u32)st[16] << 8);<br>
>> +             rd->state.srbm.vmid = ((u32)st[17]) | ((u32)st[18] << 8);<br>
>> +             break;<br>
>> +     case AMDGPU_DEBUGFS_REGS2_IOC_READ:<br>
>> +             return amdgpu_debugfs_regs2_op(f, (char __user *)data, 0);<br>
>> +     case AMDGPU_DEBUGFS_REGS2_IOC_WRITE:<br>
>> +             return amdgpu_debugfs_regs2_op(f, (char __user *)data, 1);<br>
>> +     default:<br>
>> +             return -EINVAL;<br>
>> +     }<br>
>> +     return 0;<br>
>> +}<br>
>><br>
>> /**<br>
>> * amdgpu_debugfs_regs_pcie_read - Read from a PCIE register<br>
>> @@ -1091,6 +1258,14 @@ static ssize_t amdgpu_debugfs_gfxoff_read(struct file *f, char __user *buf,<br>
>> return result;<br>
>> }<br>
>><br>
>> +static const struct file_operations amdgpu_debugfs_regs2_fops = {<br>
>> + .owner = THIS_MODULE,<br>
>> + .unlocked_ioctl = amdgpu_debugfs_regs2_ioctl,<br>
>> + .open = amdgpu_debugfs_regs2_open,<br>
>> + .release = amdgpu_debugfs_regs2_release,<br>
>> + .llseek = default_llseek<br>
>> +};<br>
>> +<br>
>>  static const struct file_operations amdgpu_debugfs_regs_fops = {<br>
>>       .owner = THIS_MODULE,<br>
>>       .read = amdgpu_debugfs_regs_read,<br>
>> @@ -1148,6 +1323,7 @@ static const struct file_operations amdgpu_debugfs_gfxoff_fops = {<br>
>><br>
>>  static const struct file_operations *debugfs_regs[] = {<br>
>>       &amdgpu_debugfs_regs_fops,<br>
>> +     &amdgpu_debugfs_regs2_fops,<br>
>>       &amdgpu_debugfs_regs_didt_fops,<br>
>>       &amdgpu_debugfs_regs_pcie_fops,<br>
>>       &amdgpu_debugfs_regs_smc_fops,<br>
>> @@ -1160,6 +1336,7 @@ static const struct file_operations *debugfs_regs[] = {<br>
>><br>
>> static const char *debugfs_regs_names[] = {<br>
>> "amdgpu_regs",<br>
>> + "amdgpu_regs2",<br>
>> "amdgpu_regs_didt",<br>
>> "amdgpu_regs_pcie",<br>
>> "amdgpu_regs_smc",<br>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h<br>
>> index 141a8474e24f..04c81cd4bcc7 100644<br>
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h<br>
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.h<br>
>> @@ -22,6 +22,7 @@<br>
>> * OTHER DEALINGS IN THE SOFTWARE.<br>
>> *<br>
>> */<br>
>> +#include <linux/ioctl.h><br>
>><br>
>> /*<br>
>> * Debugfs<br>
>> @@ -38,3 +39,34 @@ void amdgpu_debugfs_fence_init(struct amdgpu_device *adev);<br>
>>  void amdgpu_debugfs_firmware_init(struct amdgpu_device *adev);<br>
>>  void amdgpu_debugfs_gem_init(struct amdgpu_device *adev);<br>
>>  int amdgpu_debugfs_wait_dump(struct amdgpu_device *adev);<br>
>> +<br>
>> +struct amdgpu_debugfs_regs2_data {<br>
>> + struct amdgpu_device *adev;<br>
>> + struct {<br>
>> + // regs state<br>
>> + int use_srbm,<br>
>> + use_grbm,<br>
>> + pg_lock;<br>
>> + struct {<br>
>> + u32 se, sh, instance;<br>
>> + } grbm;<br>
>> + struct {<br>
>> + u32 me, pipe, queue, vmid;<br>
>> + } srbm;<br>
>> + u32 offset;<br>
>> + } state;<br>
>> +};<br>
> This stuff needs to be in some shared header then.<br>
><br>
>> +<br>
>> +enum AMDGPU_DEBUGFS_REGS2_CMDS {<br>
>> + AMDGPU_DEBUGFS_REGS2_CMD_SET_STATE=0,<br>
>> + AMDGPU_DEBUGFS_REGS2_CMD_READ,<br>
>> + AMDGPU_DEBUGFS_REGS2_CMD_WRITE,<br>
> Why not just use the normal read and write functions?<br>
> Christian.<br>
><br>
>> +};<br>
>> +<br>
>> +struct amdgpu_debugfs_regs2_iocdata {<br>
>> + unsigned char t[32];<br>
>> +};<br>
>> +<br>
>> +#define AMDGPU_DEBUGFS_REGS2_IOC_SET_STATE _IOWR(0x20, AMDGPU_DEBUGFS_REGS2_CMD_SET_STATE, struct amdgpu_debugfs_regs2_iocdata)<br>
>> +#define AMDGPU_DEBUGFS_REGS2_IOC_READ _IOWR(0x20, AMDGPU_DEBUGFS_REGS2_CMD_READ, struct amdgpu_debugfs_regs2_iocdata)<br>
>> +#define AMDGPU_DEBUGFS_REGS2_IOC_WRITE _IOWR(0x20, AMDGPU_DEBUGFS_REGS2_CMD_WRITE, struct amdgpu_debugfs_regs2_iocdata)<br>
<br>
</blockquote>
</div>
</blockquote>
<br>
</div>
</blockquote></div>