[Freedreno] [PATCH 12/12] drm/msm: gpu: Use the zap shader on 5XX if we can

Tue Dec 6 17:18:04 UTC 2016

On Tue 06 Dec 07:35 PST 2016, Jordan Crouse wrote:

> On Mon, Dec 05, 2016 at 11:57:12AM -0800, Bjorn Andersson wrote:
> > On Mon 28 Nov 11:28 PST 2016, Jordan Crouse wrote:
> > 
> > > The A5XX GPU powers on in "secure" mode. In secure mode the GPU can
> > > only render to buffers that are marked as secure and inaccessible
> > > to the kernel and user through a series of hardware protections. In
> > > practice secure mode is used to draw things like a UI on a secure
> > > video frame.
> > > 
> > > In order to switch out of secure mode the GPU executes a special
> > > shader that clears out the GMEM and other sensitve registers and
> > > then writes a register. Because the kernel can't be trusted the
> > > shader binary is signed and verified and programmed by the
> > > secure world. To do this we need to read the MDT header and the
> > > segments from the firmware location and put them in memory and
> > > present them for approval.
> > > 
> > > For targets without secure support there is an out: if the
> > > secure world doesn't support secure then there are no hardware
> > > protections and we can freely write the SECVID_TRUST register from
> > > the CPU. We don't have 100% confidence that we can query the
> > > secure capabilities at run time but we have enough calls that
> > > need to go right to give us some confidence that we're at least doing
> > > something useful.
> > > 
> > > Of course if we guess wrong you trigger a permissions violation
> > > which usually ends up in a system crash but thats a problem
> > > that shows up immediately.
> > > 
> > > Signed-off-by: Jordan Crouse <jcrouse at codeaurora.org>
> > > ---
> > >  drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 72 ++++++++++++++++++++++++++++++++++-
> > >  1 file changed, 70 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > index eefe197..a7a58ec 100644
> > > --- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > +++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
> > > @@ -469,6 +469,55 @@ static int a5xx_ucode_init(struct msm_gpu *gpu)
> > >  	return 0;
> > >  }
> > >  
> > > +static int a5xx_zap_shader_resume(struct msm_gpu *gpu)
> > > +{
> > > +	int ret;
> > > +
> > > +	ret = qcom_scm_gpu_zap_resume();
> > > +	if (ret)
> > > +		DRM_ERROR("%s: zap-shader resume failed: %d\n",
> > > +			gpu->name, ret);
> > > +
> > > +	return ret;
> > > +}
> > > +
> > > +static int a5xx_zap_shader_init(struct msm_gpu *gpu)
> > > +{
> > > +	static bool loaded;
> > > +	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
> > > +	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
> > > +	struct platform_device *pdev = a5xx_gpu->pdev;
> > > +	struct device_node *node;
> > > +	int ret;
> > > +
> > > +	/*
> > > +	 * If the zap shader is already loaded into memory we just need to kick
> > > +	 * the remote processor to reinitialize it
> > > +	 */
> > > +	if (loaded)
> > 
> > Why is this handling needed? Why can init be called multiple times?
> 
> This is for resume - if we suspend and resume the device without
> losing state the secure zone we can't load it again, so we have to
> call a different operation to "resume" it. This will be much more
> heavily used when we have more aggressive power management.
> 

I presume then that the rest of the resume path is the same as the
initialization.

> > > +		return a5xx_zap_shader_resume(gpu);
> > > +
> > > +	/* Populate the sub-nodes if they haven't already been done */
> > > +	of_platform_populate(pdev->dev.of_node, NULL, NULL, &pdev->dev);
> > 
> > I haven't been able to find the qcom,zap-shader platform driver, but I
> > presume you have something like:
> > 
> > adreno {
> > 	qcom,zap-shader {
> > 		compatible = "qcom,zap-shader";
> > 
> > 		firmware = "zapfw";
> > 		memory-region = <&zap_region>;
> > 	};
> > };
> > 
> > I presume this is done to not "taint" the adreno device's with the zap
> > memory region, but I don't think you should (ab)use a platform driver
> > for this.
> > 
> > You should rather add a struct device zap_dev to your adreno context, do
> > minimal initialization (name and a parent I think is enough), call
> > device_register(&zap_dev);, of_reserved_mem_device_init() and then use
> > that for your dma allocation.
> > 
> > This saves you from creating a platform_driver, instantiating a
> > platform_device and the worry of the race between the creation of that
> > device and the of_find_device_by_node() below.
> 
> As far as I know, of_platform_populate() just fleshed out the platform
> devices for the sub-nodes.

Correct, it will scan all subnodes and try to match their compatible
with platform_drivers. For each match it finds it probes that driver.

> We are not creating a new platform driver
> or doing any sort of probe code, we're just setting up the useful
> memory to attach the memory-region to. As far as I can tell, using
> of_platform_populate() + of_find_device_by_node() does a lot of the
> heavy lifting of what you describe.

But as far as I know the compatible of node that you name
"qcom,zap-shader" must actually match a platform_driver for there to be
a device that of_find_device_by_node() will be able to match against.

The heavy lifting I'm talking about is (untested):

	struct device zap_dev;

	zap_dev.of_node = of_get_child_by_name("zap-shader");
	zap_dev.parent = &pdev->dev;
	dev_set_name(&zap_dev, "zap-shader");

	ret = device_register(&zap_dev);
	if (ret < 0) {
		dev_err();
		return ret;
	}

	of_reserved_mem_device_init(&zap_dev);

And then you need a device_put(&zap_dev) when you're done with the
device and if you set zap_dev.release you can have that clean things
up...

Regards,
Bjorn