[PATCH 1/2] drm/xe/gsc: Initialize GSC proxy

Tue Jan 9 18:44:11 UTC 2024

I think we are ready for a rb after the next rev:

On Mon, 2024-01-08 at 10:58 -0800, Daniele Ceraolo Spurio wrote:
> On 1/4/2024 11:21 PM, Teres Alexis, Alan Previn wrote:
> > On Mon, 2023-12-11 at 17:05 -0800, Daniele Ceraolo Spurio wrote:
> > 
> > alan: per offline chat, apologies on my delay causing to warrant a rebase.
> > alan: (nit?/not-nit?): for Xe driver, are we enforcing to ensure all
> > functions require documentation? (description, input, output, return...).
> > If so, i think many functions in this patch is lacking this.
alan:snip

> > 
> > >   
> > > +err_fini_gt:
> > > +	for_each_gt(gt, xe, id) {
> > > +		if (id < last_gt)
> > > +			xe_gt_remove(gt);
> > alan: since xe_gt_init does a drmm_add_action_or_reset(... gt_fini ...),
> > is there a reason we don't want to place the new xe_uc_remove under there?
> > (which in turn calls the new xe_gsc_remove).
> 
> Copying from the commit message:
> 
> Note that the component must be removed before the pci_remove call 
> completes, so we can't use a drmm helper for it and we need to instead 
> perform the cleanup as part of the removal flow.
alan: oh my bad, got it, thanks.
alan:snip

> 
> > > +/* shorthand define for code compactness */
> > > +#define PROXY_HDR_SIZE (sizeof(struct xe_gsc_proxy_header))
> > alan: nit: "GSC_PROXY_HDR_SIZE"? (for no reason other than consistency).
> 
> I'd prefer to keep it as short as possible. This is defined in a .c 
> file, so there shouldn't be risk of confusion as it can't be used in 
> another non-gsc file.
alan: was just a nit - so its good. just nice read-consistency in the header.

> > > +static inline struct xe_device *kdev_to_xe(struct device *kdev)
> > > +{
> > > +	return dev_get_drvdata(kdev);
> > alan: seeing some other xe codes that convert from dev to xe, should this rather be:
> > 	return to_xe_device(dev_get_drvdata(kdev);
> 
> AFAICS we set the drvdata to the xe structure (via pci_set_drvdata). The 
> drm device is the first structure inside the xe_device struct, so a 
> pointer to the drm struct is also a pointer to the xe struct, which is 
> why the code you suggested works as well (with an extra unneeded step).
alan: sounds good. thanks
> 
> > > +}
> > > +
> > > +static bool gsc_proxy_init_done(struct xe_gsc *gsc)
> > alan: not blocking: as i compared simple hw/fw facing helpers like
> > this against i915 (logically the same), I notice that in xe, we dont
> > have the equivalent for i915's __wait_gsc_proxy_completed which
> > is called from selftest code. In i915 that helper calls a gsc-helper
> > same as this function after also checking CONFIG_INTEL_MEI_GSC_PROXY
> > and the fw loading state. Do we not need this type of checking for
> > xe BISTs that excercise all engines? To be fair that can be a separate
> > patch. Probably an offline to-do.
> 
> AFAICT we don't have xe selftests yet, so no user for that function.
alan: sounds good. thanks for checking.
alan:snip
> > 

> > > +u32 emit_proxy_header(struct xe_device *xe, struct iosys_map *map, u32 offset)
> > > +{
> > > +	xe_map_memset(xe, map, offset, 0, PROXY_HDR_SIZE);
> > > +
> > > +	proxy_header_wr(xe, map, offset, hdr,
> > > +			FIELD_PREP(GSC_PROXY_TYPE, GSC_PROXY_MSG_TYPE_PROXY_QUERY) |
> > > +			FIELD_PREP(GSC_PROXY_PAYLOAD_LENGTH, 0));
> > alan: thinking of discrete devices, i assume proxy messages are occasional and in between
> > (not sure about how occasional for hdcp case), so as a low priority feedback:
> > perhaps its more efficient (at the hw pci bus transaction level) if we create a local
> > structure filled up and then use iosysmap for a memcpy as opposed to writing dwords
> > one at a time? or am missing something about how we are cache/flush these buffers leading
> > up  to submission?
> 
> I didn't think about optimizing this since, as you said, it's relatively 
> rare. I can switch to a memcpy if you think it works better.
alan: i leave this to you. Based on current platforms and feature, there is no
immediate motivation to pursue with corner case optimization. Perhaps we send an
offline fyi to the hdcp folks so they can look at it later if they choose to.
alan:snip

> > 
> > > +static int proxy_query(struct xe_gsc *gsc)
> > > +{
> > > +	struct xe_gt *gt = gsc_to_gt(gsc);
> > > +	struct xe_device *xe = gt_to_xe(gt);
> > > +	struct xe_gsc_proxy_header *to_csme_hdr = gsc->proxy.to_csme;
> > > +	void *to_csme_payload = gsc->proxy.to_csme + PROXY_HDR_SIZE;
> > > +	u32 wr_offset;
> > > +	u32 rd_offset;
> > > +	u32 size;
> > > +	int ret;
> > > +
> > > +	wr_offset = xe_gsc_emit_header(xe, &gsc->proxy.to_gsc, 0,
> > > +				       HECI_MEADDRESS_PROXY, 0, PROXY_HDR_SIZE);
> > > +	wr_offset = emit_proxy_header(xe, &gsc->proxy.to_gsc, wr_offset);
> > > +
> > > +	size = wr_offset;
> > > +
> > > +	while (1) {
> > > +		/* clear the GSC response header space */
> > > +		xe_map_memset(xe, &gsc->proxy.from_gsc, 0, 0, SZ_4K);
> > alan: any reason we are clearing out a whole page as opposed to
> > just the sizeof(gsc-cmd-header) or sizeof(gsc-cmd-header) + sizeof(proxy-header)?
> 
> This function has no visibility of how big the GSC cmd header is (this 
> is on purpose). In theory the only thing we need to clear is the first 
> dword (where the header marker is), but IMO clearing the whole page is 
> better so we're sure all header are cleared (including whatever is at 
> the beginning of the payload, which is a black box to us).
alan: i think its better to clear just the most important parts of the header
(magic, dest, size)... 4K is a big chunk of transaction size. As per spec, that
should work just fine. Whatdya think?
alan:snip

> > > +		/* check the reply from GSC */
> > > +		ret = xe_gsc_read_out_header(xe, &gsc->proxy.from_gsc, 0,
> > > +					     PROXY_HDR_SIZE, &rd_offset);
> > alan: nit: could we rename "rd_offset" to "gsc_payload_offset"? (i believe this
> > would be more descriptive of how it's used in the subsequent places in this loop)
> 
> "payload" would be incorrect, since this is proxy_header + 
> proxy_payload. I can change it to "reply_offset".
alan: sounds good - it was a nit feedback - but reply offset is way more descriptive - thanks.

> 
> > 
> > > +		if (ret) {
> > > +			xe_gt_err(gt, "Invalid gsc header in proxy reply (%pe)\n",
> > > +				  ERR_PTR(ret));
> > alan: why the ERR_PTR conversion here?
> because %pe takes an error pointer and converts it to a readable errno 
> string.
alan: got it - thanks.
alan:snip

> > > +void xe_gsc_proxy_remove(struct xe_gsc *gsc)
> > > +{
> > > +	struct xe_gt *gt = gsc_to_gt(gsc);
> > > +	struct xe_device *xe = gt_to_xe(gt);
> > > +
> > > +	if (gsc->proxy.component_added) {
> > > +		component_del(xe->drm.dev, &xe_gsc_proxy_component_ops);
> > alan: we ought to put the wait for the worker to complete since that worker
> > could take long time for the component to get added - we could get racy no?
> 
> The wait is the first thing inside the unbind() action, so the unbind 
> can't complete until the worker is done.
alan: got it - thanks.