[PATCH 4/6] drm/msm/a7xx: Initialize a750 "software fuse"
Dmitry Baryshkov
dmitry.baryshkov at linaro.org
Fri Apr 26 15:38:25 UTC 2024
On Fri, 26 Apr 2024 at 18:36, Connor Abbott <cwabbott0 at gmail.com> wrote:
>
> On Fri, Apr 26, 2024 at 4:24 PM Dmitry Baryshkov
> <dmitry.baryshkov at linaro.org> wrote:
> >
> > On Fri, 26 Apr 2024 at 18:08, Connor Abbott <cwabbott0 at gmail.com> wrote:
> > >
> > > On Fri, Apr 26, 2024 at 3:53 PM Dmitry Baryshkov
> > > <dmitry.baryshkov at linaro.org> wrote:
> > > >
> > > > On Fri, 26 Apr 2024 at 17:05, Connor Abbott <cwabbott0 at gmail.com> wrote:
> > > > >
> > > > > On Fri, Apr 26, 2024 at 2:31 PM Dmitry Baryshkov
> > > > > <dmitry.baryshkov at linaro.org> wrote:
> > > > > >
> > > > > > On Fri, 26 Apr 2024 at 15:35, Connor Abbott <cwabbott0 at gmail.com> wrote:
> > > > > > >
> > > > > > > On Fri, Apr 26, 2024 at 12:02 AM Dmitry Baryshkov
> > > > > > > <dmitry.baryshkov at linaro.org> wrote:
> > > > > > > >
> > > > > > > > On Thu, 25 Apr 2024 at 16:44, Connor Abbott <cwabbott0 at gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > On all Qualcomm platforms with a7xx GPUs, qcom_scm provides a method to
> > > > > > > > > initialize cx_mem. Copy this from downstream (minus BCL which we
> > > > > > > > > currently don't support). On a750, this includes a new "fuse" register
> > > > > > > > > which can be used by qcom_scm to fuse off certain features like
> > > > > > > > > raytracing in software. The fuse is default off, and is initialized by
> > > > > > > > > calling the method. Afterwards we have to read it to find out which
> > > > > > > > > features were enabled.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Connor Abbott <cwabbott0 at gmail.com>
> > > > > > > > > ---
> > > > > > > > > drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 89 ++++++++++++++++++++++++-
> > > > > > > > > drivers/gpu/drm/msm/adreno/adreno_gpu.h | 2 +
> > > > > > > > > 2 files changed, 90 insertions(+), 1 deletion(-)
> > > > > > > > >
> > > > > > > > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > > > > > > index cf0b1de1c071..fb2722574ae5 100644
> > > > > > > > > --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > > > > > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
> > > > > > > > > @@ -10,6 +10,7 @@
> > > > > > > > >
> > > > > > > > > #include <linux/bitfield.h>
> > > > > > > > > #include <linux/devfreq.h>
> > > > > > > > > +#include <linux/firmware/qcom/qcom_scm.h>
> > > > > > > > > #include <linux/pm_domain.h>
> > > > > > > > > #include <linux/soc/qcom/llcc-qcom.h>
> > > > > > > > >
> > > > > > > > > @@ -1686,7 +1687,8 @@ static int a6xx_zap_shader_init(struct msm_gpu *gpu)
> > > > > > > > > A6XX_RBBM_INT_0_MASK_RBBM_HANG_DETECT | \
> > > > > > > > > A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS | \
> > > > > > > > > A6XX_RBBM_INT_0_MASK_UCHE_TRAP_INTR | \
> > > > > > > > > - A6XX_RBBM_INT_0_MASK_TSBWRITEERROR)
> > > > > > > > > + A6XX_RBBM_INT_0_MASK_TSBWRITEERROR | \
> > > > > > > > > + A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> > > > > > > > >
> > > > > > > > > #define A7XX_APRIV_MASK (A6XX_CP_APRIV_CNTL_ICACHE | \
> > > > > > > > > A6XX_CP_APRIV_CNTL_RBFETCH | \
> > > > > > > > > @@ -2356,6 +2358,26 @@ static void a6xx_fault_detect_irq(struct msm_gpu *gpu)
> > > > > > > > > kthread_queue_work(gpu->worker, &gpu->recover_work);
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > +static void a7xx_sw_fuse_violation_irq(struct msm_gpu *gpu)
> > > > > > > > > +{
> > > > > > > > > + u32 status;
> > > > > > > > > +
> > > > > > > > > + status = gpu_read(gpu, REG_A7XX_RBBM_SW_FUSE_INT_STATUS);
> > > > > > > > > + gpu_write(gpu, REG_A7XX_RBBM_SW_FUSE_INT_MASK, 0);
> > > > > > > > > +
> > > > > > > > > + dev_err_ratelimited(&gpu->pdev->dev, "SW fuse violation status=%8.8x\n", status);
> > > > > > > > > +
> > > > > > > > > + /* Ignore FASTBLEND violations, because the HW will silently fall back
> > > > > > > > > + * to legacy blending.
> > > > > > > > > + */
> > > > > > > > > + if (status & (A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> > > > > > > > > + A7XX_CX_MISC_SW_FUSE_VALUE_LPAC)) {
> > > > > > > > > + del_timer(&gpu->hangcheck_timer);
> > > > > > > > > +
> > > > > > > > > + kthread_queue_work(gpu->worker, &gpu->recover_work);
> > > > > > > > > + }
> > > > > > > > > +}
> > > > > > > > > +
> > > > > > > > > static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> > > > > > > > > {
> > > > > > > > > struct msm_drm_private *priv = gpu->dev->dev_private;
> > > > > > > > > @@ -2384,6 +2406,9 @@ static irqreturn_t a6xx_irq(struct msm_gpu *gpu)
> > > > > > > > > if (status & A6XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS)
> > > > > > > > > dev_err_ratelimited(&gpu->pdev->dev, "UCHE | Out of bounds access\n");
> > > > > > > > >
> > > > > > > > > + if (status & A6XX_RBBM_INT_0_MASK_SWFUSEVIOLATION)
> > > > > > > > > + a7xx_sw_fuse_violation_irq(gpu);
> > > > > > > > > +
> > > > > > > > > if (status & A6XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS)
> > > > > > > > > msm_gpu_retire(gpu);
> > > > > > > > >
> > > > > > > > > @@ -2525,6 +2550,60 @@ static void a6xx_llc_slices_init(struct platform_device *pdev,
> > > > > > > > > a6xx_gpu->llc_mmio = ERR_PTR(-EINVAL);
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > +static int a7xx_cx_mem_init(struct a6xx_gpu *a6xx_gpu)
> > > > > > > > > +{
> > > > > > > > > + struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
> > > > > > > > > + struct msm_gpu *gpu = &adreno_gpu->base;
> > > > > > > > > + u32 gpu_req = QCOM_SCM_GPU_ALWAYS_EN_REQ;
> > > > > > > > > + u32 fuse_val;
> > > > > > > > > + int ret;
> > > > > > > > > +
> > > > > > > > > + if (adreno_is_a740(adreno_gpu)) {
> > > > > > > > > + /* Raytracing is always enabled on a740 */
> > > > > > > > > + adreno_gpu->has_ray_tracing = true;
> > > > > > > > > + }
> > > > > > > > > +
> > > > > > > > > + if (!qcom_scm_is_available()) {
> > > > > > > > > + /* Assume that if qcom scm isn't available, that whatever
> > > > > > > > > + * replacement allows writing the fuse register ourselves.
> > > > > > > > > + * Users of alternative firmware need to make sure this
> > > > > > > > > + * register is writeable or indicate that it's not somehow.
> > > > > > > > > + * Print a warning because if you mess this up you're about to
> > > > > > > > > + * crash horribly.
> > > > > > > > > + */
> > > > > > > > > + if (adreno_is_a750(adreno_gpu)) {
> > > > > > > > > + dev_warn_once(gpu->dev->dev,
> > > > > > > > > + "SCM is not available, poking fuse register\n");
> > > > > > > > > + a6xx_llc_write(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE,
> > > > > > > > > + A7XX_CX_MISC_SW_FUSE_VALUE_RAYTRACING |
> > > > > > > > > + A7XX_CX_MISC_SW_FUSE_VALUE_FASTBLEND |
> > > > > > > > > + A7XX_CX_MISC_SW_FUSE_VALUE_LPAC);
> > > > > > > > > + adreno_gpu->has_ray_tracing = true;
> > > > > > > > > + }
> > > > > > > > > +
> > > > > > > > > + return 0;
> > > > > > > > > + }
> > > > > > > > > +
> > > > > > > > > + if (adreno_is_a750(adreno_gpu))
> > > > > > > >
> > > > > > > > Most of the function is under the if (adreno_is_a750) conditions. Can
> > > > > > > > we invert the logic and add a single block of if(adreno_is_a750) and
> > > > > > > > then place all the code underneath?
> > > > > > >
> > > > > > > You mean to duplicate the qcom_scm_is_available check and qcom_scm_
> > > > > > >
> > > > > > > >
> > > > > > > > > + gpu_req |= QCOM_SCM_GPU_TSENSE_EN_REQ;
> > > > > > > > > +
> > > > > > > > > + ret = qcom_scm_gpu_init_regs(gpu_req);
> > > > > > > > > + if (ret)
> > > > > > > > > + return ret;
> > > > > > > > > +
> > > > > > > > > + /* On a750 raytracing may be disabled by the firmware, find out whether
> > > > > > > > > + * that's the case. The scm call above sets the fuse register.
> > > > > > > > > + */
> > > > > > > > > + if (adreno_is_a750(adreno_gpu)) {
> > > > > > > > > + fuse_val = a6xx_llc_read(a6xx_gpu, REG_A7XX_CX_MISC_SW_FUSE_VALUE);
> > > > > > > >
> > > > > > > > This register isn't accessible with the current sm8650.dtsi. Since DT
> > > > > > > > and driver are going through different trees, please add safety guards
> > > > > > > > here, so that the driver doesn't crash if used with older dtsi
> > > > > > >
> > > > > > > I don't see how this is an issue. msm-next is currently based on 6.9,
> > > > > > > which doesn't have the GPU defined in sm8650.dtsi. AFAIK patches 1 and
> > > > > > > 2 will have to go through the linux-arm-msm tree, which will have to
> > > > > > > be merged into msm-next before this patch lands there, so there will
> > > > > > > never be any breakage.
> > > > > >
> > > > > > linux-arm-msm isn't going to be merged into msm-next. If we do not ask
> > > > > > for ack for the fix to go through msm-next, they will get these
> > > > > > patches in parallel.
> > > > >
> > > > > I'm not familiar with how complicated cross-tree changes like this get
> > > > > merged, but why would we merge these in parallel given that this patch
> > > > > depends on the previous patch that introduces
> > > > > qcom_scm_gpu_init_regs(), and that would (I assume?) normally go
> > > > > through the same tree as patch 1? Even if patch 1 gets merged in
> > > > > parallel in linux-arm-msm, in what scenario would we have a broken
> > > > > boot? You won't have a devicetree with a working sm8650 GPU and
> > > > > drm/msm with raytracing until linux-arm-msm is merged into msm-next at
> > > > > which point patch 1 will have landed somehow.
> > > >
> > > > arch/arm64/boot/dts/qcom and drivers/firmware/qcom are two separate trees.
> > > > So yes, this needs a lot of coordination.
> > >
> > >
> > >
> > > >
> > > > >
> > > > > >
> > > > > > > Another option is to get the dtsi fix into 6.9 and delay the raytracing
> > > > > > > until 6.10-rc (which doesn't make a lot of sense from my POV).
> > > > > >
> > > > > > >
> > > > > > > > (not to mention that dts is considered to be an ABI and newer kernels
> > > > > > > > are supposed not to break with older DT files).
> > > > > > >
> > > > > > > That policy only applies to released kernels, so that's irrelevant here.
> > > > > >
> > > > > > It applies to all kernels, the reason being pretty simple: git-bisect
> > > > > > should not be broken.
> > > > >
> > > > > As I wrote above, this is not an issue. The point I was making is that
> > > > > mixing and matching dtb's from one unmerged subsystem tree and a
> > > > > kernel from another isn't supported AFAIK, and that's the only
> > > > > scenario where this could break.
> > > >
> > > > And it can happen if somebody running a bisect ends up in the branch
> > > > with these patches in, but with the dtsi bits not being picked up.
> > >
> > > That wouldn't be possible unless we merged the "bad" commit
> > > introducing the GPU node to sm8650.dtsi into msm-next but not the fix.
> > > So yeah, it's going to require a lot of careful cooperation but it
> > > should be possible to avoid that happening.
> >
> > Well, the GPU node is already there in linux-next.
>
> And? As long as the devicetree fix lands first, linux-next will never be broken.
So we need to land dtsi for 6.10 and delay the drm/msm changes for
6.11. If that's fine with you and Bjorn, I'm ok with that.
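
For illustration only, the kind of safety guard asked for earlier in the
thread could look roughly like the userspace sketch below. It is not the
driver code. It assumes the driver can tell how large the mapped cx_misc
region is, and every name in it (sketch_gpu, sketch_read_sw_fuse,
SKETCH_SW_FUSE_RAYTRACING, the bit position) is hypothetical. The idea is
simply to skip the fuse read and report raytracing as disabled when the
register offset lies outside what the devicetree mapped.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit position, for illustration only. The real value comes
 * from the generated register headers.
 */
#define SKETCH_SW_FUSE_RAYTRACING (1u << 0)

/* Minimal stand-in for the GPU state this sketch needs. */
struct sketch_gpu {
	const volatile uint32_t *cx_misc; /* NULL if the region was not mapped */
	size_t cx_misc_size;              /* size of the mapped region in bytes */
	bool has_ray_tracing;
};

/* Read the software fuse at byte offset fuse_offset, but only if the whole
 * 32-bit register lies inside the mapped region. Returns true if the fuse
 * was read, false if the read was skipped. When the read is skipped,
 * raytracing is conservatively reported as disabled.
 */
static bool sketch_read_sw_fuse(struct sketch_gpu *gpu, size_t fuse_offset)
{
	uint32_t fuse_val;

	gpu->has_ray_tracing = false;

	if (gpu->cx_misc == NULL)
		return false;

	/* Written to avoid overflow in offset + size arithmetic. */
	if (gpu->cx_misc_size < sizeof(uint32_t) ||
	    fuse_offset > gpu->cx_misc_size - sizeof(uint32_t))
		return false;

	fuse_val = gpu->cx_misc[fuse_offset / sizeof(uint32_t)];
	gpu->has_ray_tracing = (fuse_val & SKETCH_SW_FUSE_RAYTRACING) != 0;

	return true;
}
```

With a guard of this shape, an older dtsi that maps a smaller region would
lose the raytracing feature instead of faulting on the register access.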
>
> > Anyway. Please. Don't break compat with old DTS. That is a rule of thumb.
>
> It's exactly that, a rule of thumb. This is obviously a bit of an
> exceptional case, and you haven't articulated any reason why we should
> follow it in this case when there's an obvious reason not to.
--
With best wishes
Dmitry