[Intel-xe] [PATCH] drm/xe: limit GGTT size to GUC_GGTT_TOP

Thu Jun 15 16:35:55 UTC 2023

On 6/15/2023 1:55 AM, Matthew Auld wrote:
> On Thu, 15 Jun 2023 at 01:25, Daniele Ceraolo Spurio
> <daniele.ceraolospurio at intel.com> wrote:
>> The GuC can't access addresses above GUC_GGTT_TOP, so any GuC-accessible
>> objects can't be mapped above that offset. Instead of checking each
>> object to see if GuC may access it or not before mapping it, we just
>> limit the GGTT size to GUC_GGTT_TOP. This wastes a bit of address space
>> (about ~18 MBs, which is in addition to what already removed at the bottom
>> of the GGTT), but it is a good tradeoff to keep the code simple.
>>
>> The in-code comment has also been updated to explain the limitation.
>>
>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio at intel.com>
>> ---
>>   drivers/gpu/drm/xe/xe_ggtt.c | 24 ++++++++++++++++++------
>>   1 file changed, 18 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xe/xe_ggtt.c b/drivers/gpu/drm/xe/xe_ggtt.c
>> index bae9f66a33cb..7588fbc2f278 100644
>> --- a/drivers/gpu/drm/xe/xe_ggtt.c
>> +++ b/drivers/gpu/drm/xe/xe_ggtt.c
>> @@ -24,6 +24,9 @@
>>   #define MTL_GGTT_PTE_PAT0      BIT_ULL(52)
>>   #define MTL_GGTT_PTE_PAT1      BIT_ULL(53)
>>
>> +/* GuC addresses above GUC_GGTT_TOP also don't map through the GTT */
>> +#define GUC_GGTT_TOP   0xFEE00000
>> +
>>   u64 xe_ggtt_pte_encode(struct xe_bo *bo, u64 bo_offset)
>>   {
>>          struct xe_device *xe = xe_bo_device(bo);
>> @@ -111,12 +114,18 @@ int xe_ggtt_init_noalloc(struct xe_ggtt *ggtt)
>>          /*
>>           * 8B per entry, each points to a 4KB page.
>>           *
>> -        * The GuC owns the WOPCM space, thus we can't allocate GGTT address in
>> -        * this area. Even though we likely configure the WOPCM to less than the
>> -        * maximum value, to simplify the driver load (no need to fetch HuC +
>> -        * GuC firmwares and determine there sizes before initializing the GGTT)
>> -        * just start the GGTT allocation above the max WOPCM size. This might
>> -        * waste space in the GGTT (WOPCM is 2MB on modern platforms) but we can
>> +        * The GuC address space is limited on both ends of the GGTT, because
>> +        * the GuC shim HW redirects accesses to those addresses to other HW
>> +        * areas instead of going through the GGTT. On the bottom end, the GuC
>> +        * can't access offsets below the WOPCM size, while on the top side the
>> +        * limit is fixed at GUC_GGTT_TOP. To keep things simple, instead of
>> +        * checking each object to see if they are accessed by GuC or not, we
>> +        * just exclude those areas from the allocator. Additionally, to
>> +        * simplify the driver load, we use the maximum WOPCM size in this logic
>> +        * instead of the programmed one, so we don't need to wait until the
>> +        * actual size to be programmed is determined (which requires FW fetch)
>> +        * before initializing the GGTT. These simplifications might waste space
>> +        * in the GGTT (about 20-25 MBs depending on the platform) but we can
>>           * live with this.
>>           *
>>           * Another benifit of this is the GuC bootrom can't access anything
>> @@ -125,6 +134,9 @@ int xe_ggtt_init_noalloc(struct xe_ggtt *ggtt)
>>           * Starting the GGTT allocations above the WOPCM max give us the correct
>>           * placement for free.
>>           */
>> +       if (ggtt->size > GUC_GGTT_TOP)
>> +               ggtt->size = GUC_GGTT_TOP;
>> +
>>          drm_mm_init(&ggtt->mm, xe_wopcm_size(xe),
>>                      ggtt->size - xe_wopcm_size(xe));
> We also subtract the xe_wopcm_size(). Do you know if that is still
> needed with GUC_GGTT_TOP?

It is. The GGTT range is reduced from [0, 4GB] to [wopcm_size, 
GUC_GGTT_TOP], so we need to subtract wopcm_size to account for the 
chunk removed at the bottom.

Daniele

>
>>          mutex_init(&ggtt->lock);
>> --
>> 2.40.0
>>