[PATCH] drm/xe/sa: Drop hardcoded 4K guard in sub-allocator
Matthew Brost
matthew.brost at intel.com
Thu Dec 19 04:16:01 UTC 2024
On Wed, Dec 18, 2024 at 08:47:37PM +0100, Michal Wajdeczko wrote:
>
>
> On 18.12.2024 10:15, Matthew Auld wrote:
> > On 17/12/2024 22:39, Matthew Brost wrote:
> >> On Tue, Dec 17, 2024 at 11:22:46PM +0100, Michal Wajdeczko wrote:
> >>> Any required prefetch guards are added during batch buffer
> >>> allocations anyway.
> >>>
> >>
> >> This should work, but I think we actually want to do the opposite of
> >> this - drop the prefetch pad in BB allocation. This would enable more
> >> optimal usage of each suballocation. I think that would work unless we
> >> have an odd caching issue - if caching is a problem then maybe pad the
> >> BB to a cacheline.
> >
> > Also would be good to update bb_prefetch(), since current prefetch value
> > is too small for xe2+ on some engines, so the hardcoded 4K here was
> > maybe saving the day.
>
> since I don't know what would be a good value for xe2+, I'll hold off on
> this patch until someone fixes bb_prefetch()
>
Let me post and test a quick patch dropping the BB pad and see if it works.
I don't know enough about the GPU caching structure to know whether this
creates problems, though.
e.g.
Job prefetches unused memory
CPU writes unused memory
Subsequent job executes unused memory - will this read the updated memory?
I think we have cache flushes in the jobs at the end (?) but I get a little
lost on exactly how our hardware works.
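To make the trade-off concrete, here is a minimal userspace sketch of the
sizing question being discussed: instead of a hardcoded 4K guard in the
sub-allocator, each BB allocation would be padded by an engine-dependent
prefetch guard. The function and constant names (bb_prefetch_guess,
bb_alloc_size) and the xe2+ threshold are illustrative assumptions, not
the actual xe driver API or values.

```c
#include <assert.h>
#include <stddef.h>

#define SZ_512 512u
#define SZ_4K  4096u

/* Round x up to a power-of-two alignment a. */
static size_t align_up(size_t x, size_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Hypothetical per-engine prefetch guard: newer (xe2+) non-media
 * engines are assumed to prefetch further ahead, so they get the
 * larger pad. The real bb_prefetch() in the driver may differ.
 */
static size_t bb_prefetch_guess(int graphics_ver, int is_media)
{
	if (graphics_ver >= 20 && !is_media)
		return SZ_4K;	/* assumption for xe2+ render/compute */
	return SZ_512;
}

/*
 * Size a BB suballocation so the engine's prefetcher never walks
 * past the end of the allocation; 64 here stands in for the
 * suballocator granule.
 */
static size_t bb_alloc_size(size_t payload, int graphics_ver, int is_media)
{
	return align_up(payload + bb_prefetch_guess(graphics_ver, is_media), 64);
}
```

The point of folding the pad into the per-BB size rather than the
sub-allocator is that older platforms and media engines only pay for the
512-byte guard instead of a blanket 4K.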
Michal - don't consider any of this blocking for your GuC cache series.
Easy enough to fix on top of that series.
Matt
> >
> >>
> >> I haven't had time to try this out yet, but I think we should explore
> >> the above option first. If I'm missing something and the above does
> >> not work, then I agree with this patch.
> >>
> >> Matt
> >>
More information about the Intel-xe mailing list