[Mesa-dev] [PATCH] mesa: Remove the ralloc canary on release builds.
Brian Paul
brianp at vmware.com
Sat Nov 23 15:40:37 PST 2013
On 11/22/2013 10:30 AM, Eric Anholt wrote:
> Kenneth Graunke <kenneth at whitecape.org> writes:
>
>> On 11/22/2013 12:21 AM, Eric Anholt wrote:
>>> The canary is basically just to give a better debugging message when you
>>> ralloc_free() something that wasn't rallocated. Reduces maximum memory
>>> usage of apitrace replay of the dota2 demo by 60MB on my 64-bit system (so
>>> half that on a real 32-bit dota2 environment).
>>
>> Really, half? It's an unsigned...that's 4 bytes regardless of 64-bit
>> vs. 32-bit. I think this should be 60MB of savings, end of story.
>
> Scalar types get aligned to their size, so since it's followed by a
> pointer, there's 4 bytes of pad in between.
>
> For anyone that hasn't seen this tool before, check out pahole from the
> dwarves package. Run it on a .o file you think might be sucking up a
> bunch of memory, and see your structs like:
>
> class fs_inst : public backend_instruction {
> public:
>
> /* class backend_instruction <ancestor>; */ /* 0 32 */
>
> /* XXX last struct has 7 bytes of padding */
>
> class fs_reg dst; /* 32 48 */
> /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */
> class fs_reg src[3]; /* 80 144 */
> /* --- cacheline 3 boundary (192 bytes) was 32 bytes ago --- */
> bool saturate; /* 224 1 */
>
> /* XXX 3 bytes hole, try to pack */
>
> int conditional_mod; /* 228 4 */
> uint8_t flag_subreg; /* 232 1 */
>
> /* XXX 3 bytes hole, try to pack */
>
> int mlen; /* 236 4 */
> int regs_written; /* 240 4 */
> int base_mrf; /* 244 4 */
> uint32_t texture_offset; /* 248 4 */
> int sampler; /* 252 4 */
> /* --- cacheline 4 boundary (256 bytes) --- */
> int target; /* 256 4 */
> bool eot; /* 260 1 */
> bool header_present; /* 261 1 */
> bool shadow_compare; /* 262 1 */
> bool force_uncompressed; /* 263 1 */
> bool force_sechalf; /* 264 1 */
> bool force_writemask_all; /* 265 1 */
>
> ...
>
> /* size: 288, cachelines: 5, members: 21 */
> /* sum members: 280, holes: 3, sum holes: 8 */
> /* paddings: 1, sum paddings: 7 */
> /* last cacheline: 32 bytes */
> };
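
The alignment point Eric makes above (a 4-byte unsigned followed by a
pointer picks up 4 bytes of pad on a 64-bit build) can be checked with
offsetof. This is only a sketch: the two structs are hypothetical
stand-ins for an allocation header with and without a canary, not the
real ralloc header, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical header: a 4-byte canary followed by pointer members.
 * Not the actual ralloc layout. */
struct header_with_canary {
    unsigned canary;
    struct header_with_canary *parent;
    struct header_with_canary *child;
};

/* The same hypothetical header with the canary removed. */
struct header_without_canary {
    struct header_without_canary *parent;
    struct header_without_canary *child;
};

/* Pad bytes the compiler inserts between the canary and the pointer
 * that follows it. */
size_t canary_padding(void)
{
    size_t next = offsetof(struct header_with_canary, parent);
    assert(next >= sizeof(unsigned));
    return next - sizeof(unsigned);
}

/* Total bytes per header attributable to the canary, pad included. */
size_t canary_cost(void)
{
    return sizeof(struct header_with_canary) -
           sizeof(struct header_without_canary);
}
```

On a typical 64-bit build, canary_padding() should come out as 4 and
canary_cost() as 8; on a typical 32-bit build, 0 and 4. That is the
"half" in the original patch description.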

Getting a bit OT, but I'm sure some Mesa structs could be compacted
quite a bit. In gl_texture_image, for example, a number of the fields
could be reduced to GLubyte (like Face, Level, Border, NumSamples, etc.)
and rearranged to reduce the memory used for such objects.

We could potentially reduce gl_texture_image from 80 bytes to 44 bytes,
which would save 324 bytes for a 256x256 mipmapped texture (9 mipmap
levels at 36 bytes each). It would start to add up with a thousand
textures or so.
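
To make the arithmetic above concrete, here is a rough sketch. The two
structs are hypothetical and far smaller than the real gl_texture_image;
they only show the narrow-and-reorder idea. The helpers mip_levels() and
chain_savings() are likewise invented for this example; the 80 and 44
byte figures come from the estimate above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "before" layout: every small field is a full int. */
struct image_wide {
    int face;
    int level;
    int border;
    int num_samples;
    void *data;
};

/* Hypothetical "after" layout: pointer first, small fields as bytes. */
struct image_narrow {
    void *data;
    uint8_t face;
    uint8_t level;
    uint8_t border;
    uint8_t num_samples;
};

/* Number of levels in a full mipmap chain down to 1x1. */
unsigned mip_levels(unsigned width, unsigned height)
{
    assert(width > 0 && height > 0);
    unsigned largest = width > height ? width : height;
    unsigned levels = 1;
    while (largest > 1) {
        largest /= 2;
        levels++;
    }
    return levels;
}

/* Bytes saved over one full mipmap chain when the per-level image
 * struct shrinks from old_size to new_size. */
size_t chain_savings(unsigned width, unsigned height,
                     size_t old_size, size_t new_size)
{
    assert(old_size >= new_size);
    return (size_t)mip_levels(width, height) * (old_size - new_size);
}
```

chain_savings(256, 256, 80, 44) gives 9 * 36 = 324, matching the number
above. Running pahole on the real struct would be the way to confirm the
actual sizes.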

There might be some debate about how worthwhile that is. I'm not too
concerned right now.

However, pahole says gl_debug_state is fairly huge: 292712 bytes!
sizeof(gl_context) = 384208, so that's a big piece. At the very least,
maybe gl_debug_state could be pulled out and allocated on first use...

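
One way "allocated on first use" might look, as a minimal sketch only.
The struct and function names here (debug_state, context,
context_get_debug, context_destroy_debug) are hypothetical and are not
Mesa's API; the array size just reuses the 292712 figure reported above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the large debug state. */
struct debug_state {
    char storage[292712];
};

/* Hypothetical context that holds a pointer rather than the state
 * itself, so contexts that never use debug output stay small. */
struct context {
    struct debug_state *debug;   /* NULL until first use */
};

/* Return the debug state, allocating it zeroed on first call.
 * May return NULL if the allocation fails. */
struct debug_state *context_get_debug(struct context *ctx)
{
    assert(ctx != NULL);
    if (ctx->debug == NULL)
        ctx->debug = calloc(1, sizeof(*ctx->debug));
    return ctx->debug;
}

/* Release the debug state, if it was ever allocated. */
void context_destroy_debug(struct context *ctx)
{
    assert(ctx != NULL);
    free(ctx->debug);
    ctx->debug = NULL;
}
```

Every access would have to go through the getter (and handle a NULL
return), which is the main cost of this approach.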
-Brian