[Mesa-dev] [PATCH 11/12] i965/cnl: Properly handle l3 configuration

Anuj Phogat anuj.phogat at gmail.com
Thu Jun 1 18:52:44 UTC 2017


On Thu, May 11, 2017 at 8:31 AM, Ben Widawsky <ben at bwidawsk.net> wrote:
> On 17-05-02 11:51:28, Francisco Jerez wrote:
>>
>> Anuj Phogat <anuj.phogat at gmail.com> writes:
>>
>>> On Mon, Apr 24, 2017 at 9:15 PM, Ben Widawsky <ben at bwidawsk.net> wrote:
>>>
>>>> On 17-04-18 18:18:39, Francisco Jerez wrote:
>>>>
>>>> Most, if not all of the unrelated changes that snuck in were due to
>>>> rebase.
>>>> Anuj, would you mind fixing those? I tried my best to address the rest,
>>>> but I'm
>>>> admittedly stumbling my way through some of the l3 programming.
>>>>
>>>> Anuj Phogat <anuj.phogat at gmail.com> writes:
>>>>>
>>>>>
>>>>> From: Ben Widawsky <benjamin.widawsky at intel.com>
>>>>>>
>>>>>>
>>>>>> V2: Squash the changes in one patch and rebased on master (Anuj).
>>>>>>
>>>>>> Signed-off-by: Ben Widawsky <ben at bwidawsk.net>
>>>>>> Signed-off-by: Anuj Phogat <anuj.phogat at gmail.com>
>>>>>> ---
>>>>>>  src/intel/common/gen_l3_config.c | 43 ++++++++++++++++++++++++++++++
>>>>>> ++++------
>>>>>>  1 file changed, 37 insertions(+), 6 deletions(-)
>>>>>>
>>>>>> diff --git a/src/intel/common/gen_l3_config.c
>>>>>> b/src/intel/common/gen_l3_config.c
>>>>>> index 4fe3503..f3e8793 100644
>>>>>> --- a/src/intel/common/gen_l3_config.c
>>>>>> +++ b/src/intel/common/gen_l3_config.c
>>>>>> @@ -102,6 +102,26 @@ static const struct gen_l3_config
>>>>>> chv_l3_configs[]
>>>>>> = {
>>>>>>  };
>>>>>>
>>>>>>  /**
>>>>>> + * On CNL, RO clients are merged and shared with read/write space. As
>>>>>> a
>>>>>> result
>>>>>> + * we have fewer allocation parameters.
>>>>>>
>>>>>
>>>>> The two sentences above make it sound like RO clients haven't been part
>>>>> of the same partition until CNL.  They have.  I'd drop this.
>>>>>
>>>>>
>>>> So the difference I was trying to spell out is that the previous "IS"
>>>> "C"
>>>> and
>>>> "T" fields do not exist in a programmable way.
>>>>
>>>> Also, programming does not require any
>>>>>>
>>>>>> + * back scaling. Programming simply works in 2k increments and is
>>>>>> scaled by the
>>>>>> + * hardware.
>>>>>>
>>>>>
>>>>> That's basically the case (up to the specific scale factor) on all
>>>>> hardware, I'd drop this too.
>>>>>
>>>>>
>>>> I personally think the existing code isn't as self-documenting to me as
>>>> it
>>>> is to
>>>> you, and so I was trying to spell it out. I was trying to document, not
>>>> show
>>>> differentiation. In either event, I don't care if we keep this or leave
>>>> it.
>>>>
>>>> + */
>>>>>>
>>>>>> +static const struct gen_l3_config cnl_l3_configs[] = {
>>>>>> +   /* SLM URB Rest  DC  RO */
>>>>>>
>>>>>
>>>>> s/Rest/ALL/ (these are L3 partition enum labels), and align to the
>>>>> column boundaries below.
>>>>>
>>>>>
>>>> Sure.
>>>>
>>>>
>>>> +   {{  0, 64, 64,  0,  0 }},
>>>>>>
>>>>>> +   {{  0, 64,  0, 16, 48 }},
>>>>>> +   {{  0, 48,  0, 16, 64 }},
>>>>>> +   {{  0, 32,  0,  0, 96 }},
>>>>>> +   {{  0, 32, 96,  0,  0 }},
>>>>>> +   {{  0, 32,  0, 16, 80 }},
>>>>>> +   {{ 32, 16, 80,  0,  0 }},
>>>>>> +   {{ 32, 16,  0, 64, 16 }},
>>>>>> +   {{ 32,  0, 96,  0,  0 }},
>>>>>> +   {{ 0 }}
>>>>>> +};
>>>>>> +
>>>>>> +/**
>>>>>>   * Return a zero-terminated array of validated L3 configurations for
>>>>>> the
>>>>>>   * specified device.
>>>>>>   */
>>>>>> @@ -116,9 +136,11 @@ get_l3_configs(const struct gen_device_info
>>>>>> *devinfo)
>>>>>>        return (devinfo->is_cherryview ? chv_l3_configs :
>>>>>> bdw_l3_configs);
>>>>>>
>>>>>>     case 9:
>>>>>> -   case 10:
>>>>>>        return chv_l3_configs;
>>>>>>
>>>>>> +   case 10:
>>>>>> +      return cnl_l3_configs;
>>>>>> +
>>>>>>     default:
>>>>>>        unreachable("Not implemented");
>>>>>>     }
>>>>>> @@ -258,13 +280,19 @@ get_l3_way_size(const struct gen_device_info
>>>>>> *devinfo)
>>>>>>     if (devinfo->is_baytrail)
>>>>>>        return 2;
>>>>>>
>>>>>> -   else if (devinfo->gt == 1 ||
>>>>>> -            devinfo->is_cherryview ||
>>>>>> -            devinfo->is_broxton)
>>>>>>
>>>>>
>>>>> Unrelated change sneaked in.
>>>>>
>>>>>
>>>> See above reply (not sure how this got in other than rebase).
>>>>
>>>> +   /* Way size is actually 6 * num_slices, because it's 2k per bank,
>>>> and
>>>>>>
>>>>>> +    * normally 3 banks per slice. However, on CNL+ this information
>>>>>> isn't
>>>>>> +    * needed to setup the URB/l3 configuration. We fudge the answer
>>>>>> here
>>>>>> +    * and then use the scaling to fix it up later.
>>>>>> +    */
>>>>>>
>>>>>
>>>>> The comment makes it sound like you're lying to the caller and
>>>>> returning
>>>>> a bogus way size you're going to fix up later.  That's not the case
>>>>> though, the value you're returning below is accurate for all CNL
>>>>> configs.  6 * num_slices OTOH *would* be inaccurate.  I'd drop the
>>>>> comment.
>>>>>
>>>>>
>>>> Anuj, would you mind doing what Curro asks?
>>>>
>>>> +   if (devinfo->gen >= 10)
>>>>>>
>>>>>> +      return 2 * devinfo->l3_banks;
>>>>>> +
>>>>>>
>>>>>
>>>>> It would be nice if we could use the 'l3_banks' devinfo field you just
>>>>> added on previous gens too in order to simplify things.  I'm okay if
>>>>> you
>>>>> don't feel like doing the clean up right away but maybe add a short
>>>>> FINISHME comment next to its definition in gen_device_info.h so we
>>>>> don't
>>>>> forget about this (hopefully temporary) inconsistency.
>>>>>
>>>>> +   /* XXX: Cherryview and Broxton are always gt1 */
>>>>>>
>>>>>> +   if (devinfo->gt == 1)
>>>>>>
>>>>>
>>>>> Unrelated change.
>>>>>
>>>>>        return 4;
>>>>>>
>>>>>>
>>>>>> -   else
>>>>>> -      return 8 * devinfo->num_slices;
>>>>>> +   return 8 * devinfo->num_slices;
>>>>>>
>>>>>
>>>>> Unrelated change.
>>>>>
>>>>>
>>>> I think here too it must have been a rebase change.
>>>>
>>>>  }
>>>>>>
>>>>>>
>>>>>>  /**
>>>>>> @@ -274,6 +302,9 @@ get_l3_way_size(const struct gen_device_info
>>>>>> *devinfo)
>>>>>>  static unsigned
>>>>>>  get_urb_size_scale(const struct gen_device_info *devinfo)
>>>>>>  {
>>>>>> +   if (devinfo->gen == 10)
>>>>>> +      return devinfo->l3_banks;
>>>>>> +
>>>>>>
>>>>>
>>>>> This seems bogus, AFAICT URB programming is done in size per slice
>>>>> units
>>>>> (just like it was the case on previous gens), not in size per L3 bank
>>>>> units.
>>>>>
>>>>>
>>>> We have parts which have different l3 banks per slice, and those seemed
>>>> to
>>>> need
>>>> to be programmed differently which is how I got to this. Tell us what
>>>> you'd
>>>> recommend instead, and Anuj can try to run with that.
>>>>
>>> I tried to run glxgears with above hunk removed as suggested by Curro.
>>> I'm
>>> seeing
>>> glxgears rendering with missing triangles. It's the similar out put we've
>>> seen during
>>> power on.
>>>
>>
>> I'd recommend investigating the issue in some depth, since according to
>> the hardware docs this change will cause the URB to be programmed to
>> num_slices / l3_banks times the actual L3 size allocated for it.  This
>> means that roughly 66% of your URB memory will be wasted, potentially
>> leading to reduced performance of the geometry pipeline.  On top of that
>> this is likely to be hiding a serious hardware programming issue
>> elsewhere (e.g. URB setup) which may be responsible for some of the
>> hangs you said you were seeing with this series.  I've spent some time
>> this morning comparing this with the behaviour of the windows driver
>> hoping it would shed some light, but it seems to program the URB exactly
>> as the docs claim, so this hunk is almost certainly wrong.
>>
>
> Any progress made here?
>
I searched the gfxspecs and nothing seems to suggest that we should
program urb any differently than skl. glxgears and basic piglit tests continues
to misrender with per slice urb allocation. I'm going through workarounds
to see if we are missing anything.
>
>>>>
>>>> Thanks.
>>>>
>>>>
>>>>     return (devinfo->gen >= 8 ? devinfo->num_slices : 1);
>>>>>>
>>>>>>  }
>>>>>>
>>>>>> --
>>>>>> 2.9.3
>>>>>>
>>>>>>
>
>
>


More information about the mesa-dev mailing list