[Mesa-dev] [PATCH 11/12] i965/cnl: Properly handle l3 configuration

Ben Widawsky ben at bwidawsk.net
Thu May 11 15:31:39 UTC 2017


On 17-05-02 11:51:28, Francisco Jerez wrote:
>Anuj Phogat <anuj.phogat at gmail.com> writes:
>
>> On Mon, Apr 24, 2017 at 9:15 PM, Ben Widawsky <ben at bwidawsk.net> wrote:
>>
>>> On 17-04-18 18:18:39, Francisco Jerez wrote:
>>>
>>> Most, if not all of the unrelated changes that snuck in were due to rebase.
>>> Anuj, would you mind fixing those? I tried my best to address the rest,
>>> but I'm
>>> admittedly stumbling my way through some of the l3 programming.
>>>
>>> Anuj Phogat <anuj.phogat at gmail.com> writes:
>>>>
>>>> From: Ben Widawsky <benjamin.widawsky at intel.com>
>>>>>
>>>>> V2: Squash the changes in one patch and rebased on master (Anuj).
>>>>>
>>>>> Signed-off-by: Ben Widawsky <ben at bwidawsk.net>
>>>>> Signed-off-by: Anuj Phogat <anuj.phogat at gmail.com>
>>>>> ---
>>>>>  src/intel/common/gen_l3_config.c | 43 ++++++++++++++++++++++++++++++
>>>>> ++++------
>>>>>  1 file changed, 37 insertions(+), 6 deletions(-)
>>>>>
>>>>> diff --git a/src/intel/common/gen_l3_config.c
>>>>> b/src/intel/common/gen_l3_config.c
>>>>> index 4fe3503..f3e8793 100644
>>>>> --- a/src/intel/common/gen_l3_config.c
>>>>> +++ b/src/intel/common/gen_l3_config.c
>>>>> @@ -102,6 +102,26 @@ static const struct gen_l3_config chv_l3_configs[]
>>>>> = {
>>>>>  };
>>>>>
>>>>>  /**
>>>>> + * On CNL, RO clients are merged and shared with read/write space. As a
>>>>> result
>>>>> + * we have fewer allocation parameters.
>>>>>
>>>>
>>>> The two sentences above make it sound like RO clients haven't been part
>>>> of the same partition until CNL.  They have.  I'd drop this.
>>>>
>>>>
>>> So the difference I was trying to spell out is that the previous "IS" "C"
>>> and
>>> "T" fields do not exist in a programmable way.
>>>
>>> Also, programming does not require any
>>>>> + * back scaling. Programming simply works in 2k increments and is
>>>>> scaled by the
>>>>> + * hardware.
>>>>>
>>>>
>>>> That's basically the case (up to the specific scale factor) on all
>>>> hardware, I'd drop this too.
>>>>
>>>>
>>> I personally think the existing code isn't as self-documenting to me as it
>>> is to
>>> you, and so I was trying to spell it out. I was trying to document, not
>>> show
>>> differentiation. In either event, I don't care if we keep this or leave it.
>>>
>>> + */
>>>>> +static const struct gen_l3_config cnl_l3_configs[] = {
>>>>> +   /* SLM URB Rest  DC  RO */
>>>>>
>>>>
>>>> s/Rest/ALL/ (these are L3 partition enum labels), and align to the
>>>> column boundaries below.
>>>>
>>>>
>>> Sure.
>>>
>>>
>>> +   {{  0, 64, 64,  0,  0 }},
>>>>> +   {{  0, 64,  0, 16, 48 }},
>>>>> +   {{  0, 48,  0, 16, 64 }},
>>>>> +   {{  0, 32,  0,  0, 96 }},
>>>>> +   {{  0, 32, 96,  0,  0 }},
>>>>> +   {{  0, 32,  0, 16, 80 }},
>>>>> +   {{ 32, 16, 80,  0,  0 }},
>>>>> +   {{ 32, 16,  0, 64, 16 }},
>>>>> +   {{ 32,  0, 96,  0,  0 }},
>>>>> +   {{ 0 }}
>>>>> +};
>>>>> +
>>>>> +/**
>>>>>   * Return a zero-terminated array of validated L3 configurations for the
>>>>>   * specified device.
>>>>>   */
>>>>> @@ -116,9 +136,11 @@ get_l3_configs(const struct gen_device_info
>>>>> *devinfo)
>>>>>        return (devinfo->is_cherryview ? chv_l3_configs : bdw_l3_configs);
>>>>>
>>>>>     case 9:
>>>>> -   case 10:
>>>>>        return chv_l3_configs;
>>>>>
>>>>> +   case 10:
>>>>> +      return cnl_l3_configs;
>>>>> +
>>>>>     default:
>>>>>        unreachable("Not implemented");
>>>>>     }
>>>>> @@ -258,13 +280,19 @@ get_l3_way_size(const struct gen_device_info
>>>>> *devinfo)
>>>>>     if (devinfo->is_baytrail)
>>>>>        return 2;
>>>>>
>>>>> -   else if (devinfo->gt == 1 ||
>>>>> -            devinfo->is_cherryview ||
>>>>> -            devinfo->is_broxton)
>>>>>
>>>>
>>>> Unrelated change sneaked in.
>>>>
>>>>
>>> See above reply (not sure how this got in other than rebase).
>>>
>>> +   /* Way size is actually 6 * num_slices, because it's 2k per bank, and
>>>>> +    * normally 3 banks per slice. However, on CNL+ this information
>>>>> isn't
>>>>> +    * needed to setup the URB/l3 configuration. We fudge the answer here
>>>>> +    * and then use the scaling to fix it up later.
>>>>> +    */
>>>>>
>>>>
>>>> The comment makes it sound like you're lying to the caller and returning
>>>> a bogus way size you're going to fix up later.  That's not the case
>>>> though, the value you're returning below is accurate for all CNL
>>>> configs.  6 * num_slices OTOH *would* be inaccurate.  I'd drop the
>>>> comment.
>>>>
>>>>
>>> Anuj, would you mind doing what Curro asks?
>>>
>>> +   if (devinfo->gen >= 10)
>>>>> +      return 2 * devinfo->l3_banks;
>>>>> +
>>>>>
>>>>
>>>> It would be nice if we could use the 'l3_banks' devinfo field you just
>>>> added on previous gens too in order to simplify things.  I'm okay if you
>>>> don't feel like doing the clean up right away but maybe add a short
>>>> FINISHME comment next to its definition in gen_device_info.h so we don't
>>>> forget about this (hopefully temporary) inconsistency.
>>>>
>>>> +   /* XXX: Cherryview and Broxton are always gt1 */
>>>>> +   if (devinfo->gt == 1)
>>>>>
>>>>
>>>> Unrelated change.
>>>>
>>>>        return 4;
>>>>>
>>>>> -   else
>>>>> -      return 8 * devinfo->num_slices;
>>>>> +   return 8 * devinfo->num_slices;
>>>>>
>>>>
>>>> Unrelated change.
>>>>
>>>>
>>> I think here too it must have been a rebase change.
>>>
>>>  }
>>>>>
>>>>>  /**
>>>>> @@ -274,6 +302,9 @@ get_l3_way_size(const struct gen_device_info
>>>>> *devinfo)
>>>>>  static unsigned
>>>>>  get_urb_size_scale(const struct gen_device_info *devinfo)
>>>>>  {
>>>>> +   if (devinfo->gen == 10)
>>>>> +      return devinfo->l3_banks;
>>>>> +
>>>>>
>>>>
>>>> This seems bogus, AFAICT URB programming is done in size per slice units
>>>> (just like it was the case on previous gens), not in size per L3 bank
>>>> units.
>>>>
>>>>
>>> We have parts which have different l3 banks per slice, and those seemed to
>>> need
>>> to be programmed differently which is how I got to this. Tell us what you'd
>>> recommend instead, and Anuj can try to run with that.
>>>
>> ​I tried to run glxgears with above hunk removed as suggested by Curro. I'm
>> seeing
>> glxgears rendering with missing triangles. It's the similar out put we've
>> seen during
>> power on.​
>>
>
>I'd recommend investigating the issue in some depth, since according to
>the hardware docs this change will cause the URB to be programmed to
>num_slices / l3_banks times the actual L3 size allocated for it.  This
>means that roughly 66% of your URB memory will be wasted, potentially
>leading to reduced performance of the geometry pipeline.  On top of that
>this is likely to be hiding a serious hardware programming issue
>elsewhere (e.g. URB setup) which may be responsible for some of the
>hangs you said you were seeing with this series.  I've spent some time
>this morning comparing this with the behaviour of the windows driver
>hoping it would shed some light, but it seems to program the URB exactly
>as the docs claim, so this hunk is almost certainly wrong.
>

Any progress made here?

>>>
>>> Thanks.
>>>
>>>
>>>     return (devinfo->gen >= 8 ? devinfo->num_slices : 1);
>>>>>  }
>>>>>
>>>>> --
>>>>> 2.9.3
>>>>>
>>>>>





More information about the mesa-dev mailing list