[Intel-gfx] [RFC PATCH v2 0/1] Splitting up platform-specific calls

Fri Feb 11 13:51:23 UTC 2022

On 11/02/2022 11:55, Jani Nikula wrote:
> On Thu, 10 Feb 2022, Casey Bowman <casey.g.bowman at intel.com> wrote:
>> In this RFC I would like to ask the community their thoughts
>> on how we can best handle splitting architecture-specific
>> calls.
>>
>> I would like to address the following:
>>
>> 1. How do we want to split architecture calls? Different object files
>> per platform? Separate function calls within the same object file?
>>
>> 2. How do we address dummy functions? If we have a function call that is
>> used for one or more platforms, but is not used in another, what should
>> we do for this case?
>>
>> I've given an example of splitting an architecture call
>> in my patch with run_as_guest() being split into different
>> implementations for x86 and arm64 in separate object files, sharing
>> a single header.
>>
>> Another suggestion from Michael (michael.cheng at intel.com) involved
>> using a single object file, a single header, and splitting various
>> function calls via ifdefs in the header file.
>>
>> I would appreciate any input on how we can avoid scaling issues when
>> including multiple architectures and multiple functions (as the number
>> of function calls will inevitably increase with more architectures).
>>
>> v2: Revised to use kernel's platform-splitting scheme.
> 
> I think this is overengineering.
> 
> Just add different implementations of the functions per architecture
> next to where they are now, like I suggested before.
> 
> If we need to split them better later, it'll be a trivial undertaking,
> and we'll be in a better position to do it because we'll know how many
> functions there'll be and where they are and what they do.
> 
> Adding a bunch of overhead from the start seems like the wrong thing to
> do.

I don't see that it adds real complexity, which is what would normally be 
associated with over-engineering. As a benefit, I see it helping to drive 
a clean re-design (during the porting effort), in a way that makes it 
easy to spot if something is overly hacky, split at the wrong level, or 
incorrectly placed.

And it moves run_as_guest outside of intel_vtd.[hc] which IMO shows 
immediate benefit, since it has nothing to do with intel_vtd.
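
For illustration only, here is a rough, self-contained sketch of the 
pattern being discussed: one shared declaration with a separate 
implementation per architecture. It is not the code from the patch. 
The preprocessor branches stand in for what would be separate object 
files, fake_hypervisor_detected is a hypothetical placeholder for the 
real platform query, and both function bodies are assumptions.

```c
#include <stdbool.h>

/*
 * Shared header part: a single declaration that every caller sees,
 * regardless of architecture.
 */
bool run_as_guest(void);

#if defined(__x86_64__) || defined(__i386__)
/*
 * x86 part: in a real tree this would sit in its own source file that
 * the build system selects only for x86.
 */

/* Hypothetical stand-in for the real hypervisor query; not a kernel API. */
static bool fake_hypervisor_detected = false;

bool run_as_guest(void)
{
	return fake_hypervisor_detected;
}
#else
/*
 * Any other architecture: a dummy implementation, assuming (for this
 * sketch only) that the check has no meaning there.
 */
bool run_as_guest(void)
{
	return false;
}
#endif
```

Callers only ever include the shared declaration, so a helper that is 
split at the wrong level tends to show up as an awkward or empty 
per-architecture body.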

I suggested adding clflush as well, since I think going for 
drm_flush_virt_range everywhere is a bit lazy given that it is a clear 
regression for older platforms.

But beyond that I indeed don't have a crystal ball to show me how many 
more low-level primitives would be appropriate candidates for the pattern.

So my vote would be to go with it, although the main thing is probably 
to resolve the conflicting asks and let people focus on the port. Put it 
to a vote then? :)

Regards,

Tvrtko