[Mesa-dev] [PATCH] mesa/mtypes: repack gl_texture_object.

Nicolai Hähnle nhaehnle at gmail.com
Wed Sep 6 15:47:43 UTC 2017


On 06.09.2017 11:47, Marek Olšák wrote:
> On Wed, Sep 6, 2017 at 4:44 AM, Dave Airlie <airlied at gmail.com> wrote:
>> On 6 September 2017 at 03:11, Marek Olšák <maraeo at gmail.com> wrote:
>>> On Tue, Sep 5, 2017 at 5:50 PM, Brian Paul <brianp at vmware.com> wrote:
>>>> On 09/04/2017 05:29 AM, Marek Olšák wrote:
>>>>>
>>>>> On Sun, Sep 3, 2017 at 1:18 PM, Dave Airlie <airlied at gmail.com> wrote:
>>>>>>
>>>>>> From: Dave Airlie <airlied at redhat.com>
>>>>>>
>>>>>> reduces size from 1144 to 1128.
>>>>>>
>>>>>> Signed-off-by: Dave Airlie <airlied at redhat.com>
>>>>>> ---
>>>>>>    src/mesa/main/mtypes.h | 10 +++++-----
>>>>>>    1 file changed, 5 insertions(+), 5 deletions(-)
>>>>>>
>>>>>> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
>>>>>> index d44897b..3d68a6d 100644
>>>>>> --- a/src/mesa/main/mtypes.h
>>>>>> +++ b/src/mesa/main/mtypes.h
>>>>>> @@ -1012,7 +1012,6 @@ struct gl_texture_object
>>>>>>       struct gl_sampler_object Sampler;
>>>>>>
>>>>>>       GLenum DepthMode;           /**< GL_ARB_depth_texture */
>>>>>
>>>>>
>>>>> The patch looks good, but here are some ideas for future improvements:
>>>>>
>>>>> GLenum can be uint16_t everywhere, because GL doesn't set higher bits:
>>>>>
>>>>> typedef uint16_t GLenum16.
>>>>> s/GLenum/GLenum16/
>>>>>
>>>>>> -   bool StencilSampling;       /**< Should we sample stencil instead of
>>>>>> depth? */
>>>>>>
>>>>>>       GLfloat Priority;           /**< in [0,1] */
>>>>>>       GLint BaseLevel;            /**< min mipmap level, OpenGL 1.2 */
>>>>>> @@ -1033,12 +1032,17 @@ struct gl_texture_object
>>>>>>       GLboolean Immutable;        /**< GL_ARB_texture_storage */
>>>>>>       GLboolean _IsFloat;         /**< GL_OES_float_texture */
>>>>>>       GLboolean _IsHalfFloat;     /**< GL_OES_half_float_texture */
>>>>>> +   bool StencilSampling;       /**< Should we sample stencil instead of
>>>>>> depth? */
>>>>>> +   bool HandleAllocated;       /**< GL_ARB_bindless_texture */
>>>>>
>>>>>
>>>>> All bools can be 1 bit:
>>>>>
>>>>> bool x:1;
>>>>> GLboolean y:1;
>>>>>
>>>>> etc.
>>>>>
>>>>>>
>>>>>>       GLuint MinLevel;            /**< GL_ARB_texture_view */
>>>>>>       GLuint MinLayer;            /**< GL_ARB_texture_view */
>>>>>>       GLuint NumLevels;           /**< GL_ARB_texture_view */
>>>>>>       GLuint NumLayers;           /**< GL_ARB_texture_view */
>>>>>
>>>>>
>>>>> MinLevel, NumLevels can be ubyte (uint8_t). MinLayer, NumLayers can be
>>>>> ushort (uint16_t)... simply by considering the range of possible
>>>>> values.
>>>>
>>>>
>>>> There's lots of opportunities along these lines in gl_texture_image. And
>>>> since we often have many gl_texture_images per gl_texture_object, and we
>>>> often have many textures, it'll probably have considerable impact.  I've
>>>> suggested this in the past but never got around to working on it.
>>>>
>>>> I recall Eric Anholt mentioning a memory profiling tool that was helpful for
>>>> finding wasted space in structures, etc.  I don't recall the name right now.
>>>> Eric?
>>>
>>> Dave used pahole for this patch series too. It can't obviously suggest
>>> what I suggested above (like changing the types and bits).
>>
>> Yup this was pahole, doing what Marek describes is definitely something
>> that can be done, but needs a lot more care and attention.
>>
>> Replacing bool with unsigned :1 fields isn't always a win, as you then
>> have a mask/shift on the accesses so overall may end up slowing things
>> down, and increasing instruction count etc.
> 
> bool:1 is better than unsigned:1, because it's has the behavior of
> bool. (and the 1-byte alignment might also matter)
> 
> Testing bool:1 doesn't need a shift, only a mask, i.e. the compiler should do:
> if (structure->somebyte & 0x04)
> for testing a bool on the 3rd bit.
> 
> The OR operator has the same complexity:
> if (structure->bool_val & 0x24)
> for testing 2 bools on the 3rd and 6th bit.
> 
> And the AND operator:
> if ((structure->bool_val & 0x24) == 0x24)
 >
> Setting a bool:1 to 1 is done by OR'ing the bit. Setting a bool to 0
> is done by AND'ing the bit. No shifts anywhere.

In fairness, most of these do end up involving more instructions because 
x86 ops don't have memory/immediate forms. So for a simple if 
(structure->bool_bit) you end up with a MOV+TEST instead of a CMP.

I still think bitfields are worth it in many cases, though, and 
gl_texture_image may be a good candidate.

Cheers,
Nicolai
-- 
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.


More information about the mesa-dev mailing list