[PATCH 2/2] drm/xe/pf: Expose access to the VF GGTT PTEs over debugfs
Michal Wajdeczko
michal.wajdeczko at intel.com
Tue Nov 5 16:41:40 UTC 2024
On 05.11.2024 02:14, Matthew Brost wrote:
> On Sun, Nov 03, 2024 at 09:16:33PM +0100, Michal Wajdeczko wrote:
>> For feature enabling and testing purposes, allow to capture and
>> replace VF's GGTT PTEs data using debugfs blob file.
>>
>> Signed-off-by: Michal Wajdeczko <michal.wajdeczko at intel.com>
>> ---
>> drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c | 62 +++++++++++++++++++++
>> 1 file changed, 62 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
>> index 05df4ab3514b..69ba830d9e8d 100644
>> --- a/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
>> +++ b/drivers/gpu/drm/xe/xe_gt_sriov_pf_debugfs.c
>> @@ -11,6 +11,7 @@
>> #include "xe_bo.h"
>> #include "xe_debugfs.h"
>> #include "xe_device.h"
>> +#include "xe_ggtt.h"
>> #include "xe_gt.h"
>> #include "xe_gt_debugfs.h"
>> #include "xe_gt_sriov_pf_config.h"
>> @@ -497,6 +498,64 @@ static const struct file_operations config_blob_ops = {
>> .llseek = default_llseek,
>> };
>>
>> +/*
>> + * /sys/kernel/debug/dri/0/
>> + * ├── gt0
>> + * │ ├── vf1
>> + * │ │ ├── ggtt_raw
>> + */
>> +
>> +static ssize_t ggtt_raw_read(struct file *file, char __user *buf,
>> + size_t count, loff_t *pos)
>> +{
>> + struct dentry *dent = file_dentry(file);
>> + struct dentry *parent = dent->d_parent;
>> + unsigned int vfid = extract_vfid(parent);
>> + struct xe_gt *gt = extract_gt(parent);
>> + struct xe_device *xe = gt_to_xe(gt);
>> + ssize_t ret;
>> +
>> + xe_pm_runtime_get(xe);
>> + mutex_lock(xe_gt_sriov_pf_master_mutex(gt));
>
> + Thomas to confirm I'm making sense here.
>
> So this relates to this patch [1] / Thomas comment [2].
>
> You are adding memory allocations here under the
> xe_gt_sriov_pf_master_mutex which renders [1] incomplete.
I was assuming that using GFP_NOWAIT, with a fallback to a fixed 64B
local chunk on failure, would be fine, no?
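To make the question concrete, below is a minimal userspace sketch of that fallback pattern. It is not the actual xe code: try_alloc_nowait(), copy_bounced() and the simulate_failure switch are hypothetical names used only for illustration, and malloc() merely stands in for a non-blocking GFP_NOWAIT allocation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define FALLBACK_CHUNK 64 /* fixed local chunk size, as mentioned above */

/*
 * Hypothetical stand-in for a GFP_NOWAIT allocation: it may fail, but it
 * never blocks. The failure is simulated so both paths can be exercised.
 */
static void *try_alloc_nowait(size_t size, bool simulate_failure)
{
	return simulate_failure ? NULL : malloc(size);
}

/*
 * Copy 'count' bytes from 'src' to 'dst' through a bounce buffer.
 * Prefer one buffer covering the whole request; if the opportunistic
 * allocation fails, fall back to a small on-stack chunk and loop.
 * Returns the number of copy rounds that were needed.
 */
static size_t copy_bounced(void *dst, const void *src, size_t count,
			   bool simulate_failure)
{
	unsigned char local[FALLBACK_CHUNK];
	unsigned char *bounce = NULL;
	size_t chunk = count;
	size_t done = 0;
	size_t rounds = 0;
	bool on_heap;

	if (count)
		bounce = try_alloc_nowait(count, simulate_failure);

	on_heap = bounce != NULL;
	if (!on_heap) {
		/* Allocation failed (or nothing to copy): use the local chunk. */
		bounce = local;
		chunk = sizeof(local);
	}

	while (done < count) {
		size_t n = count - done < chunk ? count - done : chunk;

		memcpy(bounce, (const unsigned char *)src + done, n);
		memcpy((unsigned char *)dst + done, bounce, n);
		done += n;
		rounds++;
	}

	if (on_heap)
		free(bounce);

	return rounds;
}
```

With a 200-byte request the sketch needs one round when the allocation succeeds and four rounds (64 + 64 + 64 + 8) when it falls back to the local chunk. The point of the pattern is that the slow path never has to wait for memory while the lock is held.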
>
> So you need to do one of two things:
>
> 1. Never do any memory allocations under xe_gt_sriov_pf_master_mutex. If
> you choose this option, taint this mutex with reclaim when loading the
> PF. It is then safe to take xe_gt_sriov_pf_master_mutex in suspend /
> resume / reset flows.
Well, due to the lack of [1] there are still some allocations done while
sending a VF config to the GuC, but hopefully we can mitigate that soon.

However, I found recently that, due to the recent GGTT refactoring, the
xe_ggtt_node is now allocated (with the GFP_NOFS flag) under that mutex,
which may require another round of fixes.
>
> 2. Remove xe_gt_sriov_pf_master_mutex from suspend / resume / reset
> flows.
Reprovisioning (sending VF configs to the GuC) is only done as one of the
final reset steps, and as long as it's there it will require that mutex.

An alternate option would be to decouple reprovisioning into an async
worker triggered from the reset; I will take a look at this.
>
> In addition to above, also never allocate memory in suspend / resume /
> reset flows.
>
> Not blocker here but just using this as an example to explain the
> current SRIOV locking problems. Hope this helps.
>
> Matt
>
> [1] https://patchwork.freedesktop.org/patch/619024/?series=139801&rev=1
> [2] https://lore.kernel.org/intel-xe/3e13401972fd49240f486fd7d47580e576794c78.camel@intel.com/
>
>> +
>> + ret = xe_ggtt_node_read(gt->sriov.pf.vfs[vfid].config.ggtt_region,
>> + buf, count, pos);
>> +
>> + mutex_unlock(xe_gt_sriov_pf_master_mutex(gt));
>> + xe_pm_runtime_put(xe);
>> +
>> + return ret;
>> +}
>> +
>> +static ssize_t ggtt_raw_write(struct file *file, const char __user *buf,
>> + size_t count, loff_t *pos)
>> +{
>> + struct dentry *dent = file_dentry(file);
>> + struct dentry *parent = dent->d_parent;
>> + unsigned int vfid = extract_vfid(parent);
>> + struct xe_gt *gt = extract_gt(parent);
>> + struct xe_device *xe = gt_to_xe(gt);
>> + ssize_t ret;
>> +
>> + xe_pm_runtime_get(xe);
>> + mutex_lock(xe_gt_sriov_pf_master_mutex(gt));
>> +
>> + ret = xe_ggtt_node_write(gt->sriov.pf.vfs[vfid].config.ggtt_region,
>> + buf, count, pos);
>> +
>> + mutex_unlock(xe_gt_sriov_pf_master_mutex(gt));
>> + xe_pm_runtime_put(xe);
>> +
>> + return ret;
>> +}
>> +
>> +static const struct file_operations ggtt_raw_ops = {
>> + .owner = THIS_MODULE,
>> + .read = ggtt_raw_read,
>> + .write = ggtt_raw_write,
>> + .llseek = default_llseek,
>> +};
>> +
>> /**
>> * xe_gt_sriov_pf_debugfs_register - Register SR-IOV PF specific entries in GT debugfs.
>> * @gt: the &xe_gt to register
>> @@ -554,6 +613,9 @@ void xe_gt_sriov_pf_debugfs_register(struct xe_gt *gt, struct dentry *root)
>> debugfs_create_file("config_blob",
>> IS_ENABLED(CONFIG_DRM_XE_DEBUG_SRIOV) ? 0600 : 0400,
>> vfdentry, NULL, &config_blob_ops);
>> + debugfs_create_file("ggtt_raw",
>> + IS_ENABLED(CONFIG_DRM_XE_DEBUG_SRIOV) ? 0600 : 0400,
>> + vfdentry, NULL, &ggtt_raw_ops);
>> }
>> }
>> }
>> --
>> 2.43.0
>>