[Intel-gfx] [PATCH 3/3] drm/i915/guc: Increase size of CTB buffers

Matthew Brost matthew.brost at intel.com
Thu Nov 21 16:11:29 UTC 2019


On Thu, Nov 21, 2019 at 01:25:05PM +0100, Michal Wajdeczko wrote:
>On Thu, 21 Nov 2019 00:56:04 +0100, <John.C.Harrison at intel.com> wrote:
>
>>From: Matthew Brost <matthew.brost at intel.com>
>>
>>With the introduction of non-blocking CTBs more than one CTB can be in
>>flight at a time. Increasing the size of the CTBs should reduce how
>>often software hits the case where no space is available in the CTB
>>buffer.
>>
>>Cc: John Harrison <john.c.harrison at intel.com>
>>Signed-off-by: Matthew Brost <matthew.brost at intel.com>
>>---
>> drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c | 50 +++++++++++++++--------
>> 1 file changed, 32 insertions(+), 18 deletions(-)
>>
>>diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>index 4d8a4c6afd71..31c512e7ecc2 100644
>>--- a/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>+++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_ct.c
>>@@ -14,6 +14,10 @@
>> #define CT_DEBUG_DRIVER(...)	do { } while (0)
>> #endif
>>+#define CTB_DESC_SIZE		(PAGE_SIZE / 2)
>
>CTB descriptors are now 64B each, not sure why we want to waste a
>whole page for them. Maybe to better use the space (and be ready for
>upcoming changes) we can place the buffer right after the descriptor:
>
> *       <------ DESCRIPTOR ------> <--- BUFFER ------------->
> *
> *      +--------------------------+--------------------------+
> *      | addr | head | tail | ... |                          |
> *      +--------------------------+--------------------------+
> *          \                      ^
> *           \____________________/
> *
> *       <------------ 64B -------> <---- n * PAGE - 64B ---->
>

I agree that is wasteful, but if linux/circ_buf.h is to be used to determine
the space available in the buffer, the buffer size has to be a power of 2. I
suspect the GuC determines the space in a similar way and likewise requires the
size to be a power of 2. I can reach out and find out whether that is the case.
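
For reference, a minimal sketch of the constraint I mean (the helper name and
the head/tail convention are made up for illustration, assuming the driver
produces at tail and the GuC consumes at head):

#include <linux/circ_buf.h>
#include <linux/log2.h>

/*
 * CIRC_SPACE()/CIRC_CNT() do the wrap-around with "& (size - 1)" rather
 * than a modulo, so they only give correct results when size is a power
 * of 2.
 */
static u32 ctb_free_space(u32 head, u32 tail, u32 size)
{
	GEM_BUG_ON(!is_power_of_2(size));
	return CIRC_SPACE(tail, head, size);
}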

>>+#define CTB_H2G_BUFFER_SIZE	(PAGE_SIZE)
>>+#define CTB_G2H_BUFFER_SIZE	(CTB_H2G_BUFFER_SIZE * 2)
>>+
>> #define MAX_RETRY		0x1000000
>>struct ct_request {
>>@@ -143,30 +147,35 @@ static int ctch_init(struct intel_guc *guc,
>>	GEM_BUG_ON(ctch->vma);
>>-	/* We allocate 1 page to hold both descriptors and both buffers.
>>+	/* We allocate 3 pages to hold both descriptors and both buffers.
>> 	 *       ___________.....................
>> 	 *      |desc (SEND)|                   :
>>-	 *      |___________|                   PAGE/4
>>+	 *      |___________|                   PAGE/2
>> 	 *      :___________....................:
>> 	 *      |desc (RECV)|                   :
>>-	 *      |___________|                   PAGE/4
>>+	 *      |___________|                   PAGE/2
>> 	 *      :_______________________________:
>> 	 *      |cmds (SEND)                    |
>>-	 *      |                               PAGE/4
>>+	 *      |                               PAGE
>> 	 *      |_______________________________|
>> 	 *      |cmds (RECV)                    |
>>-	 *      |                               PAGE/4
>>+	 *      |                               PAGE * 2
>> 	 *      |_______________________________|
>> 	 *
>> 	 * Each message can use a maximum of 32 dwords and we don't expect to
>>-	 * have more than 1 in flight at any time, so we have enough space.
>>-	 * Some logic further ahead will rely on the fact that there is only 1
>>-	 * page and that it is always mapped, so if the size is changed the
>>-	 * other code will need updating as well.
>>+	 * have more than 1 in flight at any time, unless we are using the GuC
>>+	 * submission. In that case each request requires a minimum 8 bytes
>>+	 * which gives us a maximum 512 queue'd requests. Hopefully this enough
>
>hmm, do we really expect to have 512 messages in flight ?
>

Potentially, yes. In fact we may expect more than that. Part of the reason for
this patch is to add the defines CTB_H2G_BUFFER_SIZE and CTB_G2H_BUFFER_SIZE so
that the CTB sizes can easily be changed when profiling the code. We are going
to want to size these buffers in such a way that flow control (no space in the
buffer) is rarely hit.

BTW - this comment is wrong. The CTB header is 8 bytes and I think the minimum
payload is 8 bytes as well, so this should actually be 256 requests in flight.
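
i.e. roughly (the define names here are made up, just to show the arithmetic,
with 4K pages):

/* hypothetical worst case: 8 byte CTB header + 8 byte minimum payload */
#define CTB_MSG_MIN_SIZE	16	/* bytes */
#define MAX_H2G_IN_FLIGHT	(CTB_H2G_BUFFER_SIZE / CTB_MSG_MIN_SIZE)
					/* = 4096 / 16 = 256 */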

>>+	 * space to avoid backpressure on the driver. We also double the size of
>>+	 * the receive buffer (relative to the send) to ensure a g2h response
>>+	 * CTB has a landing spot.
>
>We do plan to send non-blocking messages that might generate higher
>traffic, but do we expect a matching increase in incoming traffic?
>What kind of data will be there? And do we expect that the driver will
>be unable to consume it in a timely manner?
>

Yes, as part of the new interface there are asynchronous G2H messages received
in response to an H2G. In addition, there are several spontaneous G2H messages
generated by the GuC.

>> 	 */
>>	/* allocate vma */
>>-	vma = intel_guc_allocate_vma(guc, PAGE_SIZE);
>>+	vma = intel_guc_allocate_vma(guc, CTB_DESC_SIZE *
>>+				     ARRAY_SIZE(ctch->ctbs) +
>>+				     CTB_H2G_BUFFER_SIZE +
>>+				     CTB_G2H_BUFFER_SIZE);
>> 	if (IS_ERR(vma)) {
>> 		err = PTR_ERR(vma);
>> 		goto err_out;
>>@@ -185,8 +194,9 @@ static int ctch_init(struct intel_guc *guc,
>> 	/* store pointers to desc and cmds */
>> 	for (i = 0; i < ARRAY_SIZE(ctch->ctbs); i++) {
>> 		GEM_BUG_ON((i != CTB_SEND) && (i != CTB_RECV));
>>-		ctch->ctbs[i].desc = blob + PAGE_SIZE/4 * i;
>>-		ctch->ctbs[i].cmds = blob + PAGE_SIZE/4 * i + PAGE_SIZE/2;
>>+		ctch->ctbs[i].desc = blob + CTB_DESC_SIZE * i;
>>+		ctch->ctbs[i].cmds = blob + CTB_H2G_BUFFER_SIZE * i +
>>+			CTB_DESC_SIZE * ARRAY_SIZE(ctch->ctbs);
>> 	}
>>	return 0;
>>@@ -210,7 +220,7 @@ static void ctch_fini(struct intel_guc *guc,
>> static int ctch_enable(struct intel_guc *guc,
>> 		       struct intel_guc_ct_channel *ctch)
>> {
>>-	u32 base;
>>+	u32 base, size;
>> 	int err;
>> 	int i;
>>@@ -226,9 +236,12 @@ static int ctch_enable(struct intel_guc *guc,
>> 	 */
>> 	for (i = 0; i < ARRAY_SIZE(ctch->ctbs); i++) {
>> 		GEM_BUG_ON((i != CTB_SEND) && (i != CTB_RECV));
>>+		size = (i == CTB_SEND) ? CTB_H2G_BUFFER_SIZE :
>>+			CTB_G2H_BUFFER_SIZE;
>> 		guc_ct_buffer_desc_init(&ctch->ctbs[i],
>>-					base + PAGE_SIZE/4 * i + PAGE_SIZE/2,
>>-					PAGE_SIZE/4,
>>+					base + CTB_H2G_BUFFER_SIZE * i +
>>+					CTB_DESC_SIZE * ARRAY_SIZE(ctch->ctbs),
>>+					size,
>> 					ctch->owner);
>> 	}
>>@@ -236,13 +249,13 @@ static int ctch_enable(struct intel_guc *guc,
>> 	 * descriptors are in first half of the blob
>> 	 */
>> 	err = guc_action_register_ct_buffer(guc,
>>-					    base + PAGE_SIZE/4 * CTB_RECV,
>>+					    base + CTB_DESC_SIZE * CTB_RECV,
>> 					    INTEL_GUC_CT_BUFFER_TYPE_RECV);
>> 	if (unlikely(err))
>> 		goto err_out;
>>	err = guc_action_register_ct_buffer(guc,
>>-					    base + PAGE_SIZE/4 * CTB_SEND,
>>+					    base + CTB_DESC_SIZE * CTB_SEND,
>> 					    INTEL_GUC_CT_BUFFER_TYPE_SEND);
>> 	if (unlikely(err))
>> 		goto err_deregister;
>>@@ -635,7 +648,8 @@ static int ctb_read(struct intel_guc_ct_buffer *ctb, u32 *data)
>> 	/* beware of buffer wrap case */
>> 	if (unlikely(available < 0))
>> 		available += size;
>>-	CT_DEBUG_DRIVER("CT: available %d (%u:%u)\n", available, head, tail);
>>+	CT_DEBUG_DRIVER("CT: available %d (%u:%u:%d)\n", available, head, tail,
>>+			size);
>
>size will not change, not sure if it is worth repeating that in every log
>

This is a debug mechanism that is not turned on in normal operation. I find the
extra information helpful when debugging problems.
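
For context, the switch is compile-time, roughly like this (quoting from
memory, the exact Kconfig guard may differ):

#ifdef CONFIG_DRM_I915_DEBUG_GUC
#define CT_DEBUG_DRIVER(...)	DRM_DEBUG_DRIVER(__VA_ARGS__)
#else
#define CT_DEBUG_DRIVER(...)	do { } while (0)
#endif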

>> 	GEM_BUG_ON(available < 0);
>>	data[0] = cmds[head];

