[Mesa-dev] [PATCH 11/12] i965/cnl: Properly handle l3 configuration
Francisco Jerez
currojerez at riseup.net
Tue May 2 18:51:28 UTC 2017
Anuj Phogat <anuj.phogat at gmail.com> writes:
> On Mon, Apr 24, 2017 at 9:15 PM, Ben Widawsky <ben at bwidawsk.net> wrote:
>
>> On 17-04-18 18:18:39, Francisco Jerez wrote:
>>
>> Most, if not all of the unrelated changes that snuck in were due to rebase.
>> Anuj, would you mind fixing those? I tried my best to address the rest,
>> but I'm
>> admittedly stumbling my way through some of the l3 programming.
>>
>> Anuj Phogat <anuj.phogat at gmail.com> writes:
>>>
>>> From: Ben Widawsky <benjamin.widawsky at intel.com>
>>>>
>>>> V2: Squash the changes in one patch and rebased on master (Anuj).
>>>>
>>>> Signed-off-by: Ben Widawsky <ben at bwidawsk.net>
>>>> Signed-off-by: Anuj Phogat <anuj.phogat at gmail.com>
>>>> ---
>>>> src/intel/common/gen_l3_config.c | 43 ++++++++++++++++++++++++++++++
>>>> ++++------
>>>> 1 file changed, 37 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/src/intel/common/gen_l3_config.c
>>>> b/src/intel/common/gen_l3_config.c
>>>> index 4fe3503..f3e8793 100644
>>>> --- a/src/intel/common/gen_l3_config.c
>>>> +++ b/src/intel/common/gen_l3_config.c
>>>> @@ -102,6 +102,26 @@ static const struct gen_l3_config chv_l3_configs[]
>>>> = {
>>>> };
>>>>
>>>> /**
>>>> + * On CNL, RO clients are merged and shared with read/write space. As a
>>>> result
>>>> + * we have fewer allocation parameters.
>>>>
>>>
>>> The two sentences above make it sound like RO clients haven't been part
>>> of the same partition until CNL. They have. I'd drop this.
>>>
>>>
>> So the difference I was trying to spell out is that the previous "IS" "C"
>> and
>> "T" fields do not exist in a programmable way.
>>
>> Also, programming does not require any
>>>> + * back scaling. Programming simply works in 2k increments and is
>>>> scaled by the
>>>> + * hardware.
>>>>
>>>
>>> That's basically the case (up to the specific scale factor) on all
>>> hardware, I'd drop this too.
>>>
>>>
>> I personally think the existing code isn't as self-documenting to me as it
>> is to
>> you, and so I was trying to spell it out. I was trying to document, not
>> show
>> differentiation. In either event, I don't care if we keep this or leave it.
>>
>> + */
>>>> +static const struct gen_l3_config cnl_l3_configs[] = {
>>>> + /* SLM URB Rest DC RO */
>>>>
>>>
>>> s/Rest/ALL/ (these are L3 partition enum labels), and align to the
>>> column boundaries below.
>>>
>>>
>> Sure.
>>
>>
>> + {{ 0, 64, 64, 0, 0 }},
>>>> + {{ 0, 64, 0, 16, 48 }},
>>>> + {{ 0, 48, 0, 16, 64 }},
>>>> + {{ 0, 32, 0, 0, 96 }},
>>>> + {{ 0, 32, 96, 0, 0 }},
>>>> + {{ 0, 32, 0, 16, 80 }},
>>>> + {{ 32, 16, 80, 0, 0 }},
>>>> + {{ 32, 16, 0, 64, 16 }},
>>>> + {{ 32, 0, 96, 0, 0 }},
>>>> + {{ 0 }}
>>>> +};
>>>> +
>>>> +/**
>>>> * Return a zero-terminated array of validated L3 configurations for the
>>>> * specified device.
>>>> */
>>>> @@ -116,9 +136,11 @@ get_l3_configs(const struct gen_device_info
>>>> *devinfo)
>>>> return (devinfo->is_cherryview ? chv_l3_configs : bdw_l3_configs);
>>>>
>>>> case 9:
>>>> - case 10:
>>>> return chv_l3_configs;
>>>>
>>>> + case 10:
>>>> + return cnl_l3_configs;
>>>> +
>>>> default:
>>>> unreachable("Not implemented");
>>>> }
>>>> @@ -258,13 +280,19 @@ get_l3_way_size(const struct gen_device_info
>>>> *devinfo)
>>>> if (devinfo->is_baytrail)
>>>> return 2;
>>>>
>>>> - else if (devinfo->gt == 1 ||
>>>> - devinfo->is_cherryview ||
>>>> - devinfo->is_broxton)
>>>>
>>>
>>> Unrelated change sneaked in.
>>>
>>>
>> See above reply (not sure how this got in other than rebase).
>>
>> + /* Way size is actually 6 * num_slices, because it's 2k per bank, and
>>>> + * normally 3 banks per slice. However, on CNL+ this information
>>>> isn't
>>>> + * needed to setup the URB/l3 configuration. We fudge the answer here
>>>> + * and then use the scaling to fix it up later.
>>>> + */
>>>>
>>>
>>> The comment makes it sound like you're lying to the caller and returning
>>> a bogus way size you're going to fix up later. That's not the case
>>> though, the value you're returning below is accurate for all CNL
>>> configs. 6 * num_slices OTOH *would* be inaccurate. I'd drop the
>>> comment.
>>>
>>>
>> Anuj, would you mind doing what Curro asks?
>>
>> + if (devinfo->gen >= 10)
>>>> + return 2 * devinfo->l3_banks;
>>>> +
>>>>
>>>
>>> It would be nice if we could use the 'l3_banks' devinfo field you just
>>> added on previous gens too in order to simplify things. I'm okay if you
>>> don't feel like doing the clean up right away but maybe add a short
>>> FINISHME comment next to its definition in gen_device_info.h so we don't
>>> forget about this (hopefully temporary) inconsistency.
>>>
>>> + /* XXX: Cherryview and Broxton are always gt1 */
>>>> + if (devinfo->gt == 1)
>>>>
>>>
>>> Unrelated change.
>>>
>>> return 4;
>>>>
>>>> - else
>>>> - return 8 * devinfo->num_slices;
>>>> + return 8 * devinfo->num_slices;
>>>>
>>>
>>> Unrelated change.
>>>
>>>
>> I think here too it must have been a rebase change.
>>
>> }
>>>>
>>>> /**
>>>> @@ -274,6 +302,9 @@ get_l3_way_size(const struct gen_device_info
>>>> *devinfo)
>>>> static unsigned
>>>> get_urb_size_scale(const struct gen_device_info *devinfo)
>>>> {
>>>> + if (devinfo->gen == 10)
>>>> + return devinfo->l3_banks;
>>>> +
>>>>
>>>
>>> This seems bogus, AFAICT URB programming is done in size per slice units
>>> (just like it was the case on previous gens), not in size per L3 bank
>>> units.
>>>
>>>
>> We have parts which have different l3 banks per slice, and those seemed to
>> need
>> to be programmed differently which is how I got to this. Tell us what you'd
>> recommend instead, and Anuj can try to run with that.
>>
> I tried to run glxgears with above hunk removed as suggested by Curro. I'm
> seeing
> glxgears rendering with missing triangles. It's the similar out put we've
> seen during
> power on.
>
I'd recommend investigating the issue in some depth, since according to
the hardware docs this change will cause the URB to be programmed to
num_slices / l3_banks times the actual L3 size allocated for it. This
means that roughly 66% of your URB memory will be wasted, potentially
leading to reduced performance of the geometry pipeline. On top of that
this is likely to be hiding a serious hardware programming issue
elsewhere (e.g. URB setup) which may be responsible for some of the
hangs you said you were seeing with this series. I've spent some time
this morning comparing this with the behaviour of the windows driver
hoping it would shed some light, but it seems to program the URB exactly
as the docs claim, so this hunk is almost certainly wrong.
>>
>> Thanks.
>>
>>
>> return (devinfo->gen >= 8 ? devinfo->num_slices : 1);
>>>> }
>>>>
>>>> --
>>>> 2.9.3
>>>>
>>>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 212 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170502/885336aa/attachment.sig>
More information about the mesa-dev
mailing list