[PATCH] drm/xe: Add page queue multiplier
Lucas De Marchi
lucas.demarchi at intel.com
Fri Apr 18 02:00:50 UTC 2025
On Tue, Apr 08, 2025 at 08:59:15AM -0700, Matthew Brost wrote:
>For an unknown reason the math to determine the PF queue size is not
>correct - compute UMD applications are overflowing the PF queue, which
>is fatal. A multiplier of 8 fixes the problem.
Sorry, but that seems like a terrible justification, and I'm sure having it
"fixed" here is just a recipe for leaving it like that forever. Could we
spend a bit more time understanding the root cause? This showed up
in my queue for drm-xe-fixes, but I'm dropping it, at least until next
week, in the hope that we have more clarity by then.
>
>Fixes: 3338e4f90c14 ("drm/xe: Use topology to determine page fault queue size")
>Cc: stable at vger.kernel.org
>Signed-off-by: Matthew Brost <matthew.brost at intel.com>
>---
> drivers/gpu/drm/xe/xe_gt_pagefault.c | 11 +++++++++--
> 1 file changed, 9 insertions(+), 2 deletions(-)
>
>diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
>index 9fa11e837dd1..10622ca471a2 100644
>--- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
>+++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
>@@ -435,9 +435,16 @@ static int xe_alloc_pf_queue(struct xe_gt *gt, struct pf_queue *pf_queue)
> num_eus = bitmap_weight(gt->fuse_topo.eu_mask_per_dss,
> XE_MAX_EU_FUSE_BITS) * num_dss;
So... for Xe2 we have eu_type == XE_GT_EU_TYPE_SIMD16. Could that be
coming into play here? You said 8 fixes the problem, but I don't see
anything about whether 2 fixes it, or what about 4.
Another factor to consider is the "threads per EU" count, as documented
in e.g. bspec 67165.
>
>- /* user can issue separate page faults per EU and per CS */
>+ /*
>+ * user can issue separate page faults per EU and per CS
>+ *
>+ * XXX: Multiplier required as compute UMD are getting PF queue errors
>+ * without it. Follow on why this multiplier is required.
>+ */
>+#define PF_MULTIPLIER 8
> pf_queue->num_dw =
>- (num_eus + XE_NUM_HW_ENGINES) * PF_MSG_LEN_DW;
>+ (num_eus + XE_NUM_HW_ENGINES) * PF_MSG_LEN_DW * PF_MULTIPLIER;
>+#undef PF_MULTIPLIER
First of all, that XE_NUM_HW_ENGINES seems weird, as we are counting
engines that are not necessarily enabled. The OR between
fuse_topo.c_dss_mask and fuse_topo.g_dss_mask also looks odd when the error
you mentioned is in the compute UMD and we document num_dw as
"Dynamically calculated based on the number of **compute** resources".
Lucas De Marchi
>
> pf_queue->gt = gt;
> pf_queue->data = devm_kcalloc(xe->drm.dev, pf_queue->num_dw,
>--
>2.34.1
>
More information about the Intel-xe mailing list