[Mesa-dev] [PATCH 4/5] i965: Handle Y-tiled modifier

Jason Ekstrand jason at jlekstrand.net
Mon Mar 20 22:36:37 UTC 2017


On Mon, Mar 20, 2017 at 3:25 PM, Ben Widawsky <ben at bwidawsk.net> wrote:

> On 17-03-20 12:00:44, Jason Ekstrand wrote:
>
>> On Fri, Mar 17, 2017 at 5:34 PM, Ben Widawsky <ben at bwidawsk.net> wrote:
>>
>> This patch begins introducing how we'll actually handle the potentially
>>> many modifiers coming in from the API, how we'll store them, and the
>>> structure in the code to support it.
>>>
>>> Prior to this patch, the Y-tiled modifier would be entirely ignored. It
>>> shouldn't actually be used until this point because we've not bumped the
>>> DRIimage extension version (which is a requirement to use modifiers).
>>>
>>> With X-tiling:
>>> Writes:          6,583.58 MiB
>>> Reads:           6,540.93 MiB
>>>
>>> With Y-tiling:
>>> Writes:          5,361.78 MiB
>>> Reads            6,052.45 MiB
>>>
>>> Savings per frame
>>> Writes:  2 MiB
>>> Reads:  .8 MiB
>>>
>>> Similar functionality was introduced and then reverted here:
>>>
>>> commit 6a0d036483caf87d43ebe2edd1905873446c9589
>>> Author: Ben Widawsky <ben at bwidawsk.net>
>>> Date:   Thu Apr 21 20:14:58 2016 -0700
>>>
>>>     i965: Always use Y-tiled buffers on SKL+
>>>
>>> v2: Use last set bit instead of first set bit in modifiers to address
>>> bug found by Daniel Stone.
>>>
>>> Signed-off-by: Ben Widawsky <benjamin.widawsky at intel.com>
>>> Reviewed-by: Eric Engestrom <eric.engestrom at imgtec.com>
>>> Acked-by: Daniel Stone <daniels at collabora.com>
>>> ---
>>>  src/mesa/drivers/dri/i965/intel_screen.c | 55
>>> ++++++++++++++++++++++++++++----
>>>  1 file changed, 49 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/intel_screen.c
>>> b/src/mesa/drivers/dri/i965/intel_screen.c
>>> index 22ab3a30b6..1954757d1e 100644
>>> --- a/src/mesa/drivers/dri/i965/intel_screen.c
>>> +++ b/src/mesa/drivers/dri/i965/intel_screen.c
>>> @@ -23,6 +23,7 @@
>>>   * SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
>>>   */
>>>
>>> +#include <drm_fourcc.h>
>>>  #include <errno.h>
>>>  #include <time.h>
>>>  #include <unistd.h>
>>> @@ -520,16 +521,35 @@ select_best_modifier(struct gen_device_info
>>> *devinfo,
>>>                       const uint64_t *modifiers,
>>>                       const unsigned count)
>>>  {
>>> -   uint64_t modifier = DRM_FORMAT_MOD_INVALID;
>>> +#define YTILE  (1 << 1)
>>> +#define LINEAR (1 << 0)
>>> +
>>> +   const uint64_t prio_modifiers[] = { I915_FORMAT_MOD_Y_TILED,
>>> DRM_FORMAT_MOD_LINEAR };
>>> +   uint32_t modifier_bitmask = 0; /* API only allows 32 */
>>>
>>>
>> The bitfield thing is still confusing to me.  Here's an idea on how we
>> could maybe make it better.
>>
>> enum modifier_priority {
>>   MODIFIER_PRIORITY_LINEAR,
>>   MODIFIER_PRIORITY_X,
>>   MODIFIER_PRIORITY_Y,
>>   MODIFIER_PRIORITY_Y_CCS,
>> };
>>
>> uint32_t priority_to_modifier[] = {
>>   [MODIFIER_PRIORITY_LINEAR] = DRM_FORMAT_MOD_LINEAR,
>>   [MODIFIER_PRIORITY_X] = I915_FORMAT_MOD_X_TILED,
>>   [MODIFIER_PRIORITY_Y] = I915_FORMAT_MOD_Y_TILED,
>>   [MODIFIER_PRIORITY_Y_CCS] = I915_FORMAT_MOD_Y_TILED_CCS,
>> }
>>
>> enum modier_priority prio = 0;
>> for (int i = 0; i < count; i++) {
>>   switch (modifiers[i]) {
>>   case DRM_FORMAT_MOD_LINEAR:
>>      prio = MAX2(prio, MODIFIER_PRIORITY_LINEAR);
>>      break;
>>
>>   case DRM_FORMAT_MOD_X_TILED:
>>      prio = MAX2(prio, MODIFIER_PRIORITY_X);
>>      break;
>>
>>   case DRM_FORMAT_MOD_Y_TILED:
>>      prio = MAX2(prio, MODIFIER_PRIORITY_Y);
>>      break;
>>
>>   case DRM_FORMAT_MOD_Y_TILIED_CCS:
>>      prio = MAX2(prio, MODIFIER_PRIORITY_Y_CCS);
>>      break;
>> }
>>
>> return priority_to_modifier[prio];
>>
>> How does this strike your fancy?  I'm ok with the bit set approach if you
>> really prefer it but I find it hard to reason about.
>>
>>
> I don't really prefer. This looks pretty good. Seems no less complex to
> me, but
> I wrote the first one, so perhaps I'm partial.
>
> Originally, I had some code in the equivalent function (before select_best
> was
> separate) which would try fallbacks, ie. if Y tiled allocation failed,
> it'd go
> down to the next modifier just walking down the bits, but logic is now
> gone, so
> there isn't really a point in the bitmask.
>
> Will respin with this and the fixes meant below.
>
>
>>
>>>     for (int i = 0; i < count; i++) {
>>>        switch (modifiers[i]) {
>>>        case DRM_FORMAT_MOD_LINEAR:
>>> -         return modifiers[i];
>>> +         modifier_bitmask |= LINEAR;
>>> +         break;
>>> +      case I915_FORMAT_MOD_Y_TILED:
>>> +         if (devinfo->gen < 9) {
>>> +            _mesa_warning(NULL, "Invalid Y-tiling parameter\n");
>>> +            continue;
>>>
>>>
>> This isn't invalid.  It's just invalid for scanout.  If you wanted to
>> create an image to share between two GL implementations, Y-tiling works
>> fine on everything.
>>
>>
>>
> As a general function to support modifiers, you are correct, however since
> this
> is only called for image creation, I believe the existing warning is
> correct.
>

But what if I want to create an image to share between two userspace
processes with no scanout involved?  While the GBM portion of the API is
mostly intended for scanout, the EGL extension will be something that can
and will be used for GL <-> GL.  I guess we can always flip it on when we
add support for the EGL extension but I see no reason why it shouldn't work
through GBM.

Part of the problem may be that I really don't understand why GBM exists.
It's like a linux-specific half-of-EGL thing. :-/


> +         }
>>> +
>>> +         modifier_bitmask |= YTILE;
>>> +         break;
>>>        }
>>>     }
>>>
>>> -   return modifier;
>>> +   if (modifier_bitmask)
>>> +      return prio_modifiers[util_last_bit64(modifier_bitmask)-1];
>>> +
>>> +   return DRM_FORMAT_MOD_INVALID;
>>> +
>>> +#undef LINEAR
>>> +#undef YTILE
>>>  }
>>>
>>>  static __DRIimage *
>>> @@ -560,6 +580,9 @@ intel_create_image_common(__DRIscreen *dri_screen,
>>>     case DRM_FORMAT_MOD_LINEAR:
>>>        tiling = I915_TILING_NONE;
>>>        break;
>>> +   case I915_FORMAT_MOD_Y_TILED:
>>> +      tiling = I915_TILING_Y;
>>> +      break;
>>>     case DRM_FORMAT_MOD_INVALID:
>>>     default:
>>>           break;
>>> @@ -611,8 +634,26 @@ intel_create_image_with_modifiers(__DRIscreen
>>> *dri_screen,
>>>                                    const unsigned count,
>>>                                    void *loaderPrivate)
>>>  {
>>> -   return intel_create_image_common(dri_screen, width, height, format,
>>> 0, NULL,
>>> -                                    0, loaderPrivate);
>>> +   uint64_t local_mods[count];
>>> +   int local_count = 0;
>>> +
>>> +   /* This compacts the actual modifiers to the ones we know how to
>>> handle */
>>> +   for (int i = 0; i < count; i++) {
>>> +      switch (modifiers[i]) {
>>> +      case I915_FORMAT_MOD_Y_TILED:
>>> +      case DRM_FORMAT_MOD_LINEAR:
>>> +         local_mods[local_count++] = modifiers[i];
>>> +         break;
>>> +      case DRM_FORMAT_MOD_INVALID:
>>> +         unreachable("Invalid modifiers specified\n");
>>>
>>>
>> I'm not sure this is truely unreachable.  What prevents someone from
>> calling create_image_with_modifiers with INVALID?  Do we have a check
>> somewhere higher up that that prevents it?  If this is something that's
>> actually triggerable from client code, then I think we probably want to
>> just let it fall through.
>>
>>
>>
> Nothing. It was meant to be handled in the entry points, but I missed it
> apparently.


If you want to add code to the entrypoints to handle it, and keep the
unreachable, that's fine with me.


>
> +      default:
>>> +         /* Modifiers from other vendors would land here. */
>>> +         break;
>>> +      }
>>> +   }
>>> +
>>> +   return intel_create_image_common(dri_screen, width, height, format,
>>> 0,
>>> +                                    local_mods, local_count,
>>> loaderPrivate);
>>>  }
>>>
>>>  static GLboolean
>>> @@ -1867,7 +1908,9 @@ intelAllocateBuffer(__DRIscreen *dri_screen,
>>>     if (intelBuffer == NULL)
>>>        return NULL;
>>>
>>> -   /* The front and back buffers are color buffers, which are X tiled.
>>> */
>>> +   /* The front and back buffers are color buffers, which are X tiled.
>>> GEN9+
>>> +    * supports Y tiled and compressed buffers, but there is no way to
>>> plumb that
>>> +    * through to here. */
>>>     uint32_t tiling = I915_TILING_X;
>>>     unsigned long pitch;
>>>     int cpp = format / 8;
>>> --
>>> 2.12.0
>>>
>>> _______________________________________________
>>> mesa-dev mailing list
>>> mesa-dev at lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>>
>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/mesa-dev/attachments/20170320/75c67ef0/attachment.html>


More information about the mesa-dev mailing list