[Intel-gfx] [PATCH 05/15] drm/i915: GuC-specific firmware loader

Thu Jun 18 11:54:42 PDT 2015

On 15/06/15 21:30, Chris Wilson wrote:
> On Mon, Jun 15, 2015 at 07:36:23PM +0100, Dave Gordon wrote:
>> +	/* We can't enable contexts until all firmware is loaded */
>> +	ret = intel_guc_ucode_load(dev, false);
> 
> Pardon. I know context initialisation is broken, but adding to that
> breakage is not pleasant.

Sorry, but that's just the way it works. If you want to use the GuC for
batch submission, then you cannot submit any commands to any engine via
the GuC before its firmware is loaded, nor can you submit anything at
all directly to the ELSPs.

However in /this/ patch the 'false' above should have been 'true' to
give synchronous load semantics; and then ignoring the return is
intentional, because either it's worked and we're going to use the GuC,
or it hasn't and we're not (and it's already printed a message). Then
there's a later patch that tries to decouple engine MMIO setup from
engine setup using batches & contexts, at which point we can make use of
the return code.

>>  	ret = i915_gem_context_enable(dev_priv);
>>  	if (ret && ret != -EIO) {
>>  		DRM_ERROR("Context enable failed %d\n", ret);
> 
>> diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
>> index 82367c9..0b44265 100644
>> --- a/drivers/gpu/drm/i915/intel_guc.h
>> +++ b/drivers/gpu/drm/i915/intel_guc.h
>> @@ -166,4 +166,9 @@ struct intel_guc {
>>  #define GUC_WD_VECS_IER		0xC558
>>  #define GUC_PM_P24C_IER		0xC55C
>>  
>> +/* intel_guc_loader.c */
>> +extern void intel_guc_ucode_init(struct drm_device *dev);
>> +extern int intel_guc_ucode_load(struct drm_device *dev, bool wait);
>> +extern void intel_guc_ucode_fini(struct drm_device *dev);
>> +
>>  #endif
>> diff --git a/drivers/gpu/drm/i915/intel_guc_loader.c b/drivers/gpu/drm/i915/intel_guc_loader.c
>> new file mode 100644
>> index 0000000..16eef4c
>> --- /dev/null
>> +++ b/drivers/gpu/drm/i915/intel_guc_loader.c
>> @@ -0,0 +1,416 @@
>> +/*
>> + * Copyright © 2014 Intel Corporation
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a
>> + * copy of this software and associated documentation files (the "Software"),
>> + * to deal in the Software without restriction, including without limitation
>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>> + * and/or sell copies of the Software, and to permit persons to whom the
>> + * Software is furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice (including the next
>> + * paragraph) shall be included in all copies or substantial portions of the
>> + * Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
>> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
>> + * IN THE SOFTWARE.
>> + *
>> + * Authors:
>> + *    Vinit Azad <vinit.azad at intel.com>
>> + *    Ben Widawsky <ben at bwidawsk.net>
>> + *    Dave Gordon <david.s.gordon at intel.com>
>> + *    Alex Dai <yu.dai at intel.com>
>> + */
>> +#include <linux/firmware.h>
>> +#include "i915_drv.h"
>> +#include "intel_guc.h"
>> +
>> +/**
>> + * DOC: GuC
>> + *
>> + * intel_guc:
>> + * Top level structure of guc. It handles firmware loading and manages client
>> + * pool and doorbells. intel_guc owns a i915_guc_client to replace the legacy
>> + * ExecList submission.
>> + *
>> + * Firmware versioning:
>> + * The firmware build process will generate a version header file with major and
>> + * minor version defined. The versions are built into CSS header of firmware.
>> + * i915 kernel driver set the minimal firmware version required per platform.
>> + * The firmware installation package will install (symbolic link) proper version
>> + * of firmware.
>> + *
>> + * GuC address space:
>> + * GuC does not allow any gfx GGTT address that falls into range [0, WOPCM_TOP),
>> + * which is reserved for Boot ROM, SRAM and WOPCM. Currently this top address is
>> + * 512K. In order to exclude 0-512K address space from GGTT, all gfx objects
>> + * used by GuC is pinned with PIN_OFFSET_BIAS along with size of WOPCM.
>> + *
>> + * Firmware log:
>> + * Firmware log is enabled by setting i915.guc_log_level to non-negative level.
>> + * Log data is printed out via reading debugfs i915_guc_log_dump. Reading from
>> + * i915_guc_load_status will print out firmware loading status and scratch
>> + * registers value.
>> + *
>> + */
>> +
>> +#define I915_SKL_GUC_UCODE "i915/skl_guc_ver3.bin"
>> +MODULE_FIRMWARE(I915_SKL_GUC_UCODE);
>> +
>> +static u32 get_gttype(struct drm_device *dev)
>> +{
>> +	/* XXX: GT type based on PCI device ID? field seems unused by fw */
>> +	return 0;
>> +}
>> +
>> +static u32 get_core_family(struct drm_device *dev)
> 
> For new code we really should be in the habit of passing around the
> right pointer, not dev.

Good idea :) Especially as the caller actually passes dev_priv->dev!!

>> +{
>> +	switch (INTEL_INFO(dev)->gen) {
>> +	case 8:
>> +		return GFXCORE_FAMILY_GEN8;
>> +	case 9:
>> +		return GFXCORE_FAMILY_GEN9;
>> +	default:
>> +		DRM_ERROR("GUC: unknown gen for scheduler init\n");
>> +		return GFXCORE_FAMILY_FORCE_ULONG;
>> +	}
>> +}
>> +
>> +static void set_guc_init_params(struct drm_i915_private *dev_priv)
>> +{
>> +	struct intel_guc *guc = &dev_priv->guc;
>> +	u32 params[GUC_CTL_MAX_DWORDS];
>> +	int i;
>> +
>> +	memset(&params, 0, sizeof(params));
>> +
>> +	params[GUC_CTL_DEVICE_INFO] |=
>> +		(get_gttype(dev_priv->dev) << GUC_CTL_GTTYPE_SHIFT) |
>> +		(get_core_family(dev_priv->dev) << GUC_CTL_COREFAMILY_SHIFT);
>> +
>> +	/* GuC ARAT increment is 10 ns. GuC default scheduler quantum is one
>> +	 * second. This ARAR is calculated by:
>> +	 * Scheduler-Quantum-in-ns / ARAT-increment-in-ns = 1000000000 / 10
>> +	 */
>> +	params[GUC_CTL_ARAT_HIGH] = 0;
>> +	params[GUC_CTL_ARAT_LOW] = 100000000;
>> +
>> +	params[GUC_CTL_WA] |= GUC_CTL_WA_UK_BY_DRIVER;
>> +
>> +	params[GUC_CTL_FEATURE] |= GUC_CTL_DISABLE_SCHEDULER |
>> +			GUC_CTL_VCS2_ENABLED;
>> +
>> +	if (i915.guc_log_level >= 0) {
>> +		params[GUC_CTL_LOG_PARAMS] = guc->log_flags;
>> +		params[GUC_CTL_DEBUG] =
>> +			i915.guc_log_level << GUC_LOG_VERBOSITY_SHIFT;
>> +	}
>> +
>> +	I915_WRITE(SOFT_SCRATCH(0), 0);
>> +
>> +	for (i = 0; i < GUC_CTL_MAX_DWORDS; i++)
>> +		I915_WRITE(SOFT_SCRATCH(1 + i), params[i]);
>> +}
>> +
>> +/* Read GuC status register (GUC_STATUS)
>> + * Return true if get a success code from normal boot or RC6 boot
>> + */
>> +static inline bool i915_guc_get_status(struct drm_i915_private *dev_priv,
>> +					u32 *status)
>> +{
>> +	*status = I915_READ(GUC_STATUS);
>> +	return (((*status) & GS_UKERNEL_MASK) == GS_UKERNEL_READY ||
>> +		((*status) & GS_UKERNEL_MASK) == GS_UKERNEL_LAPIC_DONE);
> 
> Weird function. Does two things, only one of those is get_status. Maybe
> you would like to split this up better and use a switch when you mean a
> switch. Or rename it to reflect it's use only as a condition.

The weirdness is down to the fact that it's passed as an argument to the
MACRO "wait_for_atomic()". The "caller" of wait_for_atomic() also wants
to see the status value that caused the MACRO to exit so it has to save
that indirectly via the pointer. We can't break the "status = READ()"
and "classify the result" stages into two separate functions because we
have to pass a single expression to the MACRO; both have to be inside
the generated loop.

So it may be weird, but at least it's simple; and the comment above does
tell you that it does two things. We could call it
i915_read_guc_status_and_test_whether_ready() if you like, but I think
that'll make the line where it's used more than 80 characters ;-(
Other (shorter) suggestions happily accepted.

Macros that repeatedly evaluate the text of their arguments are ugly :(

>> +}
>> +
>> +/* Transfers the firmware image to RAM for execution by the microcontroller.
>> + *
>> + * GuC Firmware layout:
>> + * +-------------------------------+  ----
>> + * |          CSS header           |  128B
>> + * +-------------------------------+  ----
>> + * |             uCode             |
>> + * +-------------------------------+  ----
>> + * |         RSA signature         |  256B
>> + * +-------------------------------+  ----
>> + * |         RSA public Key        |  256B
>> + * +-------------------------------+  ----
>> + * |       Public key modulus      |    4B
>> + * +-------------------------------+  ----
>> + *
>> + * Architecturally, the DMA engine is bidirectional, and in can potentially
>> + * even transfer between GTT locations. This functionality is left out of the
>> + * API for now as there is no need for it.
>> + *
>> + * Be note that GuC need the CSS header plus uKernel code to be copied as one
>> + * chunk of data. RSA sig data is loaded via MMIO.
>> + */
>> +static int guc_ucode_xfer_dma(struct drm_i915_private *dev_priv)
>> +{
>> +	struct intel_uc_fw *guc_fw = &dev_priv->guc.guc_fw;
>> +	struct drm_i915_gem_object *fw_obj = guc_fw->uc_fw_obj;
>> +	unsigned long offset;
>> +	struct sg_table *sg = fw_obj->pages;
>> +	u32 status, ucode_size, rsa[UOS_RSA_SIG_SIZE / sizeof(u32)];
>> +	int i, ret = 0;
>> +
>> +	/* uCode size, also is where RSA signature starts */
>> +	offset = ucode_size = guc_fw->uc_fw_size - UOS_CSS_SIGNING_SIZE;
>> +
>> +	/* Copy RSA signature from the fw image to HW for verification */
>> +	sg_pcopy_to_buffer(sg->sgl, sg->nents, rsa, UOS_RSA_SIG_SIZE, offset);
>> +	for (i = 0; i < UOS_RSA_SIG_SIZE / sizeof(u32); i++)
>> +		I915_WRITE(UOS_RSA_SCRATCH_0 + i * sizeof(u32), rsa[i]);
>> +
>> +	/* Set the source address for the new blob */
>> +	offset = i915_gem_obj_ggtt_offset(fw_obj);
> 
> Why would it even have a GGTT vma? There's no precondition here to
> assert that it should.

The (only) caller already did:

        ret = i915_gem_obj_ggtt_pin(guc_fw->uc_fw_obj, 0, 0);

and also deals with unpinning it after use.

>> +	I915_WRITE(DMA_ADDR_0_LOW, lower_32_bits(offset));
>> +	I915_WRITE(DMA_ADDR_0_HIGH, upper_32_bits(offset) & 0xFFFF);
>> +
>> +	/* Set the destination. Current uCode expects an 8k stack starting from
>> +	 * offset 0. */
>> +	I915_WRITE(DMA_ADDR_1_LOW, 0x2000);
>> +
>> +	/* XXX: The image is automatically transfered to SRAM after the RSA
>> +	 * verification. This is why the address space is chosen as such. */
>> +	I915_WRITE(DMA_ADDR_1_HIGH, DMA_ADDRESS_SPACE_WOPCM);
>> +
>> +	I915_WRITE(DMA_COPY_SIZE, ucode_size);
>> +
>> +	/* Finally start the DMA */
>> +	I915_WRITE(DMA_CTRL, _MASKED_BIT_ENABLE(UOS_MOVE | START_DMA));
>> +
> 
> Just assuming that the writes land and in the order you expect?

Yes. If they don't then the mapping of the MMIO registers is set up
wrong. No one should ever map h/w registers as writeback or
write-combining; or in fact anything other than uncached and strongly
ordered w.r.t. each other for both reads and writes.

Sometimes we need a POSTING_READ() to ensure that a WRITE has reached
the h/w before touching something other than a device register -- s/w
state or shared memory -- but not between consecutive writes to
registers of the same device.

The next operation is going to be a READ (inside i915_guc_get_status()
above, so that will flush (in order) any of the above writes that
haven't actually reached the h/w yet ...

>> +	/*
>> +	 * Spin-wait for the DMA to complete & the GuC to start up.
>> +	 * NB: Docs recommend not using the interrupt for completion.
>> +	 * FIXME: what's a valid timeout?
>> +	 */
>> +	ret = wait_for_atomic(i915_guc_get_status(dev_priv, &status), 10);
> 
> FIXME, error handling is too hard.

I got a new timeout value from the MinuteIA team, so I'll update that.
The error code is passed back for the caller to handle.

>> +	DRM_DEBUG_DRIVER("DMA status = 0x%x, GuC status 0x%x\n",
>> +			I915_READ(DMA_CTRL), status);
>> +
>> +	if ((status & GS_BOOTROM_MASK) == GS_BOOTROM_RSA_FAILED) {
>> +		DRM_ERROR("%s firmware signature verification failed\n",
>> +			guc_fw->uc_name);
>> +		ret = -ENOEXEC;
>> +	}
>> +
>> +	DRM_DEBUG_DRIVER("GuC fw load status %s %d\n",
>> +			ret ? "FAIL" : "SUCCESS", ret);
>> +
>> +	return ret;
>> +}
> 
> I'm guessing the other functions are basically more of the same...
> -Chris

?
.Dave.