[Mesa-dev] [PATCH 11/12] i965/cnl: Properly handle l3 configuration
Ben Widawsky
ben at bwidawsk.net
Thu May 11 15:31:39 UTC 2017
On 17-05-02 11:51:28, Francisco Jerez wrote:
>Anuj Phogat <anuj.phogat at gmail.com> writes:
>
>> On Mon, Apr 24, 2017 at 9:15 PM, Ben Widawsky <ben at bwidawsk.net> wrote:
>>
>>> On 17-04-18 18:18:39, Francisco Jerez wrote:
>>>
>>> Most, if not all of the unrelated changes that snuck in were due to rebase.
>>> Anuj, would you mind fixing those? I tried my best to address the rest,
>>> but I'm
>>> admittedly stumbling my way through some of the l3 programming.
>>>
>>> Anuj Phogat <anuj.phogat at gmail.com> writes:
>>>>
>>>> From: Ben Widawsky <benjamin.widawsky at intel.com>
>>>>>
>>>>> V2: Squash the changes in one patch and rebased on master (Anuj).
>>>>>
>>>>> Signed-off-by: Ben Widawsky <ben at bwidawsk.net>
>>>>> Signed-off-by: Anuj Phogat <anuj.phogat at gmail.com>
>>>>> ---
>>>>> src/intel/common/gen_l3_config.c | 43 ++++++++++++++++++++++++++++++
>>>>> ++++------
>>>>> 1 file changed, 37 insertions(+), 6 deletions(-)
>>>>>
>>>>> diff --git a/src/intel/common/gen_l3_config.c
>>>>> b/src/intel/common/gen_l3_config.c
>>>>> index 4fe3503..f3e8793 100644
>>>>> --- a/src/intel/common/gen_l3_config.c
>>>>> +++ b/src/intel/common/gen_l3_config.c
>>>>> @@ -102,6 +102,26 @@ static const struct gen_l3_config chv_l3_configs[]
>>>>> = {
>>>>> };
>>>>>
>>>>> /**
>>>>> + * On CNL, RO clients are merged and shared with read/write space. As a
>>>>> result
>>>>> + * we have fewer allocation parameters.
>>>>>
>>>>
>>>> The two sentences above make it sound like RO clients haven't been part
>>>> of the same partition until CNL. They have. I'd drop this.
>>>>
>>>>
>>> So the difference I was trying to spell out is that the previous "IS" "C"
>>> and
>>> "T" fields do not exist in a programmable way.
>>>
>>> Also, programming does not require any
>>>>> + * back scaling. Programming simply works in 2k increments and is
>>>>> scaled by the
>>>>> + * hardware.
>>>>>
>>>>
>>>> That's basically the case (up to the specific scale factor) on all
>>>> hardware, I'd drop this too.
>>>>
>>>>
>>> I personally think the existing code isn't as self-documenting to me as it
>>> is to
>>> you, and so I was trying to spell it out. I was trying to document, not
>>> show
>>> differentiation. In either event, I don't care if we keep this or leave it.
>>>
>>> + */
>>>>> +static const struct gen_l3_config cnl_l3_configs[] = {
>>>>> + /* SLM URB Rest DC RO */
>>>>>
>>>>
>>>> s/Rest/ALL/ (these are L3 partition enum labels), and align to the
>>>> column boundaries below.
>>>>
>>>>
>>> Sure.
>>>
>>>
>>> + {{ 0, 64, 64, 0, 0 }},
>>>>> + {{ 0, 64, 0, 16, 48 }},
>>>>> + {{ 0, 48, 0, 16, 64 }},
>>>>> + {{ 0, 32, 0, 0, 96 }},
>>>>> + {{ 0, 32, 96, 0, 0 }},
>>>>> + {{ 0, 32, 0, 16, 80 }},
>>>>> + {{ 32, 16, 80, 0, 0 }},
>>>>> + {{ 32, 16, 0, 64, 16 }},
>>>>> + {{ 32, 0, 96, 0, 0 }},
>>>>> + {{ 0 }}
>>>>> +};
>>>>> +
>>>>> +/**
>>>>> * Return a zero-terminated array of validated L3 configurations for the
>>>>> * specified device.
>>>>> */
>>>>> @@ -116,9 +136,11 @@ get_l3_configs(const struct gen_device_info
>>>>> *devinfo)
>>>>> return (devinfo->is_cherryview ? chv_l3_configs : bdw_l3_configs);
>>>>>
>>>>> case 9:
>>>>> - case 10:
>>>>> return chv_l3_configs;
>>>>>
>>>>> + case 10:
>>>>> + return cnl_l3_configs;
>>>>> +
>>>>> default:
>>>>> unreachable("Not implemented");
>>>>> }
>>>>> @@ -258,13 +280,19 @@ get_l3_way_size(const struct gen_device_info
>>>>> *devinfo)
>>>>> if (devinfo->is_baytrail)
>>>>> return 2;
>>>>>
>>>>> - else if (devinfo->gt == 1 ||
>>>>> - devinfo->is_cherryview ||
>>>>> - devinfo->is_broxton)
>>>>>
>>>>
>>>> Unrelated change sneaked in.
>>>>
>>>>
>>> See above reply (not sure how this got in other than rebase).
>>>
>>> + /* Way size is actually 6 * num_slices, because it's 2k per bank, and
>>>>> + * normally 3 banks per slice. However, on CNL+ this information
>>>>> isn't
>>>>> + * needed to setup the URB/l3 configuration. We fudge the answer here
>>>>> + * and then use the scaling to fix it up later.
>>>>> + */
>>>>>
>>>>
>>>> The comment makes it sound like you're lying to the caller and returning
>>>> a bogus way size you're going to fix up later. That's not the case
>>>> though, the value you're returning below is accurate for all CNL
>>>> configs. 6 * num_slices OTOH *would* be inaccurate. I'd drop the
>>>> comment.
>>>>
>>>>
>>> Anuj, would you mind doing what Curro asks?
>>>
>>> + if (devinfo->gen >= 10)
>>>>> + return 2 * devinfo->l3_banks;
>>>>> +
>>>>>
>>>>
>>>> It would be nice if we could use the 'l3_banks' devinfo field you just
>>>> added on previous gens too in order to simplify things. I'm okay if you
>>>> don't feel like doing the clean up right away but maybe add a short
>>>> FINISHME comment next to its definition in gen_device_info.h so we don't
>>>> forget about this (hopefully temporary) inconsistency.
>>>>
>>>> + /* XXX: Cherryview and Broxton are always gt1 */
>>>>> + if (devinfo->gt == 1)
>>>>>
>>>>
>>>> Unrelated change.
>>>>
>>>> return 4;
>>>>>
>>>>> - else
>>>>> - return 8 * devinfo->num_slices;
>>>>> + return 8 * devinfo->num_slices;
>>>>>
>>>>
>>>> Unrelated change.
>>>>
>>>>
>>> I think here too it must have been a rebase change.
>>>
>>> }
>>>>>
>>>>> /**
>>>>> @@ -274,6 +302,9 @@ get_l3_way_size(const struct gen_device_info
>>>>> *devinfo)
>>>>> static unsigned
>>>>> get_urb_size_scale(const struct gen_device_info *devinfo)
>>>>> {
>>>>> + if (devinfo->gen == 10)
>>>>> + return devinfo->l3_banks;
>>>>> +
>>>>>
>>>>
>>>> This seems bogus, AFAICT URB programming is done in size per slice units
>>>> (just like it was the case on previous gens), not in size per L3 bank
>>>> units.
>>>>
>>>>
>>> We have parts which have different l3 banks per slice, and those seemed to
>>> need
>>> to be programmed differently which is how I got to this. Tell us what you'd
>>> recommend instead, and Anuj can try to run with that.
>>>
>> I tried to run glxgears with above hunk removed as suggested by Curro. I'm
>> seeing
>> glxgears rendering with missing triangles. It's the similar out put we've
>> seen during
>> power on.
>>
>
>I'd recommend investigating the issue in some depth, since according to
>the hardware docs this change will cause the URB to be programmed to
>num_slices / l3_banks times the actual L3 size allocated for it. This
>means that roughly 66% of your URB memory will be wasted, potentially
>leading to reduced performance of the geometry pipeline. On top of that
>this is likely to be hiding a serious hardware programming issue
>elsewhere (e.g. URB setup) which may be responsible for some of the
>hangs you said you were seeing with this series. I've spent some time
>this morning comparing this with the behaviour of the windows driver
>hoping it would shed some light, but it seems to program the URB exactly
>as the docs claim, so this hunk is almost certainly wrong.
>
Any progress made here?
>>>
>>> Thanks.
>>>
>>>
>>> return (devinfo->gen >= 8 ? devinfo->num_slices : 1);
>>>>> }
>>>>>
>>>>> --
>>>>> 2.9.3
>>>>>
>>>>>
More information about the mesa-dev
mailing list