<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<p style="font-family:Arial;font-size:10pt;color:#0000FF;margin:5pt;font-style:normal;font-weight:normal;text-decoration:none;" align="Left">
[AMD Official Use Only - General]<br>
</p>
<br>
<div>
<p style="font-family: Aptos, Aptos_EmbeddedFont, Aptos_MSFontService, Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</p>
<div>On 2023-11-22 17:21, Felix Kuehling wrote:</div>
<blockquote><br>
On 2023-11-03 09:11, James Zhu wrote:<br>
<blockquote>Add trace_id return for new pc sampling creation per device,<br>
Use IDR to quickly locate pc_sampling_entry for reference.<br>
<br>
Signed-off-by: James Zhu <a href="mailto:James.Zhu@amd.com" id="OWAf3384bc8-297e-17a4-1946-8e0dcf6d030b" class="moz-txt-link-rfc2396E OWAAutoLink" data-loopstyle="linkonly">
<James.Zhu@amd.com></a><br>
---<br>
drivers/gpu/drm/amd/amdkfd/kfd_device.c | 2 ++<br>
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 20 +++++++++++++++++++-<br>
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 6 ++++++<br>
3 files changed, 27 insertions(+), 1 deletion(-)<br>
<br>
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c<br>
index 0e24e011f66b..bcaeedac8fe0 100644<br>
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c<br>
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c<br>
@@ -536,10 +536,12 @@ static void kfd_smi_init(struct kfd_node *dev)<br>
static void kfd_pc_sampling_init(struct kfd_node *dev)<br>
{<br>
mutex_init(&dev->pcs_data.mutex);<br>
+ idr_init_base(&dev->pcs_data.hosttrap_entry.base.pc_sampling_idr, 1);<br>
}<br>
static void kfd_pc_sampling_exit(struct kfd_node *dev)<br>
{<br>
+ idr_destroy(&dev->pcs_data.hosttrap_entry.base.pc_sampling_idr);<br>
mutex_destroy(&dev->pcs_data.mutex);<br>
}<br>
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c<br>
index f0d910ee730c..4c9fc48e1a6a 100644<br>
--- a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c<br>
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c<br>
@@ -99,6 +99,7 @@ static int kfd_pc_sample_create(struct kfd_process_device *pdd,<br>
{<br>
struct kfd_pc_sample_info *supported_format = NULL;<br>
struct kfd_pc_sample_info user_info;<br>
+ struct pc_sampling_entry *pcs_entry;<br>
int ret;<br>
int i;<br>
@@ -140,7 +141,19 @@ static int kfd_pc_sample_create(struct kfd_process_device *pdd,<br>
return ret ? ret : -EEXIST;<br>
}<br>
- /* TODO: add trace_id return */<br>
+ pcs_entry = kvzalloc(sizeof(*pcs_entry), GFP_KERNEL);<br>
+ if (!pcs_entry) {<br>
+ mutex_unlock(&pdd->dev->pcs_data.mutex);<br>
+ return -ENOMEM;<br>
+ }<br>
+<br>
+ i = idr_alloc_cyclic(&pdd->dev->pcs_data.hosttrap_entry.base.pc_sampling_idr,<br>
+ pcs_entry, 1, 0, GFP_KERNEL);<br>
+ if (i < 0) {<br>
+ mutex_unlock(&pdd->dev->pcs_data.mutex);<br>
+ kvfree(pcs_entry);<br>
+ return i;<br>
+ }<br>
if (!pdd->dev->pcs_data.hosttrap_entry.base.use_count)<br>
memcpy(&pdd->dev->pcs_data.hosttrap_entry.base.pc_sample_info,<br>
@@ -149,6 +162,11 @@ static int kfd_pc_sample_create(struct kfd_process_device *pdd,<br>
pdd->dev->pcs_data.hosttrap_entry.base.use_count++;<br>
mutex_unlock(&pdd->dev->pcs_data.mutex);<br>
+ pcs_entry->pdd = pdd;<br>
</blockquote>
<br>
One more thought: You have a bunch of pcs_entries pointing to pdd. What mechanism guarantees, that the pcs_entries are destroyed before the pdd on process termination? I think kfd_pc_sampling_exit doesn't run during process termination, because it deals with
per-device data structures, not per-process data structures.<br>
<br>
Should the IDR be in the PDD rather than the device? In that case you wouldn't even need the pdd pointer in struct pcs_entry.<br>
</blockquote>
<div class="elementToProof">[JZ] the IDR here is mainly for generating trace_id with this device. I am not sure if ROCr/ROCprofiler are fine with this change which means same process has same trace_id value for different nodes. <a href="mailto:David.YatSin@amd.com" id="OWAAM742798" class="tWKOu mention ms-bgc-nlr ms-fcl-b">@Yat
Sin, David</a> would you mind give your comments here?</div>
<blockquote><br>
I see you have a patch much later in the series "drm/amdkfd: add pc sampling release when process release". I'd prefer this squashed here and in the patches that add the start/stop functions.<br>
<br>
Regards,<br>
Felix<br>
<br>
<br>
<blockquote>+ user_args->trace_id = (uint32_t)i;<br>
+<br>
+ pr_debug("alloc pcs_entry = %p, trace_id = 0x%x on gpu 0x%x", pcs_entry, i, pdd->dev->id);<br>
+<br>
return 0;<br>
}<br>
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h<br>
index 81c925fb2952..642558026d16 100644<br>
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h<br>
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h<br>
@@ -258,6 +258,7 @@ struct kfd_dev;<br>
struct kfd_dev_pc_sampling_data {<br>
uint32_t use_count; /* Num of PC sampling sessions */<br>
+ struct idr pc_sampling_idr;<br>
struct kfd_pc_sample_info pc_sample_info;<br>
};<br>
@@ -743,6 +744,11 @@ enum kfd_pdd_bound {<br>
*/<br>
#define SDMA_ACTIVITY_DIVISOR 100<br>
+struct pc_sampling_entry {<br>
+ bool enabled;<br>
+ struct kfd_process_device *pdd;<br>
+};<br>
+<br>
/* Data that is per-process-per device. */<br>
struct kfd_process_device {<br>
/* The device that owns this data. */<br>
</blockquote>
</blockquote>
</div>
</body>
</html>