[PATCH v5 2/7] drm/msm: adreno: add plumbing to generate bandwidth vote table for GMU
Neil Armstrong
neil.armstrong at linaro.org
Thu Dec 12 21:45:08 UTC 2024
On 12/12/2024 20:55, Konrad Dybcio wrote:
> On 11.12.2024 9:29 AM, Neil Armstrong wrote:
>> The Adreno GPU Management Unit (GMU) can also scale DDR Bandwidth along
>> the Frequency and Power Domain level, but by default we let the
>> OPP core scale the interconnect DDR path.
>>
>> While scaling via the interconnect path was sufficient, newer GPUs
>> like the A750 require specific vote parameters and bandwidth to
>> achieve full functionality.
>>
>> In order to calculate vote values used by the GPU Management
>> Unit (GMU), we need to parse all the possible OPP Bandwidths and
>> create a vote value to be sent to the appropriate Bus Control
>> Modules (BCMs) declared in the GPU info struct.
>>
>> This vote value is called IB, while on the other side the GMU also
>> takes another vote called AB which is a 16bit quantized value
>> of the floor bandwidth against the maximum supported bandwidth.
>> The AB vote will be calculated later when setting the frequency.
>>
>> The vote array will then be used to dynamically generate the GMU
>> bw_table sent during the GMU power-up.
>>
>> Reviewed-by: Akhil P Oommen <quic_akhilpo at quicinc.com>
>> Signed-off-by: Neil Armstrong <neil.armstrong at linaro.org>
>> ---
>> drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 144 ++++++++++++++++++++++++++++++++++
>> drivers/gpu/drm/msm/adreno/a6xx_gmu.h | 13 +++
>> drivers/gpu/drm/msm/adreno/a6xx_gpu.h | 1 +
>> 3 files changed, 158 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
>> index 14db7376c712d19446b38152e480bd5a1e0a5198..36696d372a42a27b26a018b19e73bc6d8a4a5235 100644
>> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
>> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
>> @@ -9,6 +9,7 @@
>> #include <linux/pm_domain.h>
>> #include <linux/pm_opp.h>
>> #include <soc/qcom/cmd-db.h>
>> +#include <soc/qcom/tcs.h>
>> #include <drm/drm_gem.h>
>>
>> #include "a6xx_gpu.h"
>> @@ -1287,6 +1288,101 @@ static int a6xx_gmu_memory_probe(struct a6xx_gmu *gmu)
>> return 0;
>> }
>>
>> +/**
>> + * struct bcm_db - Auxiliary data pertaining to each Bus Clock Manager (BCM)
>> + * @unit: divisor used to convert bytes/sec bw value to an RPMh msg
>> + * @width: multiplier used to convert bytes/sec bw value to an RPMh msg
>> + * @vcd: virtual clock domain that this bcm belongs to
>> + * @reserved: reserved field
>> + */
>> +struct bcm_db {
>> + __le32 unit;
>> + __le16 width;
>> + u8 vcd;
>> + u8 reserved;
>> +};
>
> No. This is a direct copypasta of drivers/interconnect/qcom/icc-rpmh.h
> You cannot just randomly duplicate things..
>
> Move it out to a shared header in include/ (and remove the duplicate from
> clk-rpmh.c while at it)
>
Not sure if this is a good idea
>
> I'd also really prefer if you took
>
> drivers/interconnect/qcom/bcm-voter.c : tcs_list_gen()
>
> and abstracted it to operate on struct bcm_db with any additional
> required parameters passed as arguments.. Still left some comments
> on this version if you decide to go with it
They are still very different, look closely: tcs_list_gen() is designed to
operate on BW aggregations + scaling, so it would make no sense to unify them.
The calculation is simple enough, I made it explicitly easy to read and
maintain, but honestly there's nothing special.
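
For illustration, the per-BCM scaling from the patch can be sketched as
standalone C. This is only a minimal sketch, not the driver code:
sketch_bcm_vote() and SKETCH_VOTE_MASK are hypothetical names, the mask is
assumed to match BCM_TCS_CMD_VOTE_MASK, plain 64-bit division stands in for
do_div(), and the cmd-db width/unit fields are assumed to be already
converted to CPU endianness.

```c
#include <stdint.h>

/* Assumption: mirrors the 14-bit BCM_TCS_CMD_VOTE_MASK from soc/qcom/tcs.h */
#define SKETCH_VOTE_MASK 0x3fffu

/*
 * Hypothetical helper: turn a bandwidth in KBps into a BCM vote.
 * width and unit come from the BCM's cmd-db aux data, buswidth from
 * the per-GPU BCM description. The bw == 0 case is handled by the
 * caller in the patch and is not covered here.
 */
static uint32_t sketch_bcm_vote(uint32_t bw_kbps, uint16_t width,
				uint32_t unit, uint32_t buswidth)
{
	/* Multiply the bandwidth by the width of the connection */
	uint64_t peak = (uint64_t)bw_kbps * width;

	peak /= buswidth;

	/* Input is in KBps, scale it to the BCM unit */
	peak *= 1000;
	peak /= unit;

	/* Clamp to [1, mask] so a non-zero bandwidth never yields a zero vote */
	if (peak < 1)
		peak = 1;
	if (peak > SKETCH_VOTE_MASK)
		peak = SKETCH_VOTE_MASK;

	return (uint32_t)peak;
}
```

With width == buswidth, the vote is simply the bandwidth in bytes/sec divided
by the BCM unit, clamped to the vote field.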
>
>> +
>> +static int a6xx_gmu_rpmh_bw_votes_init(const struct a6xx_info *info,
>> + struct a6xx_gmu *gmu)
>> +{
>> + const struct bcm_db *bcm_data[GMU_MAX_BCMS] = { 0 };
>> + unsigned int bcm_index, bw_index, bcm_count = 0;
>> +
>> + if (!info->bcms)
>> + return 0;
>
> You already checked that from the caller
Good catch
>
>> +
>> + /* Retrieve BCM data from cmd-db */
>> + for (bcm_index = 0; bcm_index < GMU_MAX_BCMS; bcm_index++) {
>> + size_t count;
>> +
>> + /* Stop at first unconfigured bcm */
>> + if (!info->bcms[bcm_index].name)
>> + break;
>
> Unconfigured doesn't really fit here.. Maybe just mention the list is NULL
> -terminated
Ack
>
>> +
>> + bcm_data[bcm_index] = cmd_db_read_aux_data(
>> + info->bcms[bcm_index].name,
>> + &count);
>> + if (IS_ERR(bcm_data[bcm_index]))
>> + return PTR_ERR(bcm_data[bcm_index]);
>> +
>> + if (!count)
>> + return -EINVAL;
>
> If this condition ever happens, it'll be impossible to track down,
> please add an err message
Hmm sure
>
>> +
>> + ++bcm_count;
>
> I've heard somewhere that prefixed increments are discouraged for
> "reasons" and my OCD would like to support that
Never got this memo...
>
>> + }
>> +
>> + /* Generate BCM votes values for each bandwidth & BCM */
>> + for (bw_index = 0; bw_index < gmu->nr_gpu_bws; bw_index++) {
>> + u32 *data = gmu->gpu_ib_votes[bw_index];
>> + u32 bw = gmu->gpu_bw_table[bw_index];
>> +
>> + /* Calculations loosely copied from bcm_aggregate() & tcs_cmd_gen() */
>> + for (bcm_index = 0; bcm_index < bcm_count; bcm_index++) {
>> + bool commit = false;
>> + u64 peak;
>> + u32 vote;
>> +
>> + /* Skip unconfigured BCM */
>> + if (!bcm_data[bcm_index])
>> + continue;
>
> I don't see how this is useful here
It's a leftover, will drop
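
Once the NULL check is gone, every entry below bcm_count is known to be
valid, so the commit decision should reduce to something like the sketch
below. This is standalone C for illustration only: sketch_bcm_commit() is a
hypothetical helper, and the vcd array stands in for bcm_data[i]->vcd.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: the commit bit is set on the last BCM overall,
 * or on the last BCM before the virtual clock domain (VCD) changes.
 * Assumes count >= 1 and i < count.
 */
static bool sketch_bcm_commit(const uint8_t *vcd, size_t count, size_t i)
{
	return i == count - 1 || vcd[i] != vcd[i + 1];
}
```

For VCDs { 0, 0, 1 } this commits on the second and third entries only.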
>
>> +
>> + if (bcm_index == bcm_count - 1 ||
>> + (bcm_data[bcm_index + 1] &&
>> + bcm_data[bcm_index]->vcd != bcm_data[bcm_index + 1]->vcd))
>> + commit = true;
>> +
>> + if (!bw) {
>> + data[bcm_index] = BCM_TCS_CMD(commit, false, 0, 0);
>> + continue;
>> + }
>> +
>> + if (info->bcms[bcm_index].fixed) {
>
> You may want to take a pointer to info->bcms[bcm_index]
Sure, will help
>
>> + u32 perfmode = 0;
>> +
>> + if (bw >= info->bcms[bcm_index].perfmode_bw)
>> + perfmode = info->bcms[bcm_index].perfmode;
>> +
>> + data[bcm_index] = BCM_TCS_CMD(commit, true, 0, perfmode);
>> + continue;
>> + }
>> +
>> + /* Multiply the bandwidth by the width of the connection */
>> + peak = (u64)bw * le16_to_cpu(bcm_data[bcm_index]->width);
>> + do_div(peak, info->bcms[bcm_index].buswidth);
>> +
>> + /* Input bandwidth value is in KBps, scale the value to BCM unit */
>> + peak *= 1000ULL;
>
> I don't think this needs to be ULL since the other argument is an u64
>
>> + do_div(peak, le32_to_cpu(bcm_data[bcm_index]->unit));
>> +
>> + vote = clamp(peak, 1, BCM_TCS_CMD_VOTE_MASK);
>> +
>> + data[bcm_index] = BCM_TCS_CMD(commit, true, vote, vote);
>
> x is the avg vote, y is the peak vote
downstream sets both calculated from the exact same value and the same way...
>
> Just noting down for my future self I guess, a6xx sets ab=0,
> a7xx sets ab=ib like you did here
Probably, I'll need to check on that, but it can be done in a second step when enabling it on a6xx
>
> Konrad