[Mesa-dev] [PATCH 2/2] i965: Optimize intel_batchbuffer_emit_dword().

Chris Wilson chris at chris-wilson.co.uk
Wed Jul 8 17:01:12 PDT 2015


On Thu, Jul 09, 2015 at 12:53:23AM +0100, Chris Wilson wrote:
> This is what I expected to see
>    0x000000000000025e <+62>:	movl   $0x780d1c02,(%rcx)
>    0x0000000000000264 <+68>:	mov    0x22f08(%rdi),%rax
>    0x000000000000026b <+75>:	mov    0x24320(%rdi),%edx
>    0x0000000000000271 <+81>:	mov    %edx,0x4(%rax)
>    0x0000000000000274 <+84>:	mov    0x22f08(%rdi),%rax
>    0x000000000000027b <+91>:	mov    0x24338(%rdi),%edx
>    0x0000000000000281 <+97>:	mov    %edx,0x8(%rax)
>    0x0000000000000284 <+100>:	mov    0x22f08(%rdi),%rax
>    0x000000000000028b <+107>:	mov    0x24814(%rdi),%edx
>    0x0000000000000291 <+113>:	mov    %edx,0xc(%rax)
>    0x0000000000000294 <+116>:	addq   $0x10,0x22f08(%rdi)
>    0x000000000000029c <+124>:	pop    %rbp
>    0x000000000000029d <+125>:	retq   
> with the pointer increments coalesced to the end. Generated by
> open-coding emit_dwords as
> 
> static void upload_viewport_state_pointers(struct brw_context *brw)
> {
>    BEGIN_BATCH(4);
>    brw->batch.map[0] = (_3DSTATE_VIEWPORT_STATE_POINTERS << 16 | (4 - 2) |
>                         GEN6_CC_VIEWPORT_MODIFY |
>                         GEN6_SF_VIEWPORT_MODIFY |
>                         GEN6_CLIP_VIEWPORT_MODIFY);
>    brw->batch.map[1] = (brw->clip.vp_offset);
>    brw->batch.map[2] = (brw->sf.vp_offset);
>    brw->batch.map[3] = (brw->cc.vp_offset);
>    brw->batch.map += 4;
>    ADVANCE_BATCH();
> }

But it is still reloading brw->batch.map every time!
One manual local variable later,
   0x000000000000025e <+62>:	movl   $0x780d1c02,(%rdx)
   0x0000000000000264 <+68>:	mov    0x24320(%rdi),%eax
   0x000000000000026a <+74>:	mov    %eax,0x4(%rdx)
   0x000000000000026d <+77>:	mov    0x24338(%rdi),%eax
   0x0000000000000273 <+83>:	mov    %eax,0x8(%rdx)
   0x0000000000000276 <+86>:	mov    0x24814(%rdi),%eax
   0x000000000000027c <+92>:	mov    %eax,0xc(%rdx)
   0x000000000000027f <+95>:	addq   $0x10,0x22f08(%rdi)
   0x0000000000000287 <+103>:	pop    %rbp
   0x0000000000000288 <+104>:	retq   
(I hope telling gcc to tune for Atom does better.)
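
For reference, the local-variable form looks roughly like the sketch below. It
is standalone and not the actual patch: struct batch, its end field and
emit_4_dwords() are hypothetical stand-ins for brw->batch and the viewport
upload. One plausible reason for the reloads is that gcc cannot prove the
stores through the map pointer leave the pointer field itself untouched (for
example when building with -fno-strict-aliasing), so copying it to a local
removes the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the batch state; not the real struct brw_context. */
struct batch {
   uint32_t *map;   /* next free dword in the batch buffer */
   uint32_t *end;   /* one past the last usable dword (assumed field) */
};

/* Emit four dwords through a local copy of the write pointer. */
static void emit_4_dwords(struct batch *batch, uint32_t header,
                          uint32_t clip, uint32_t sf, uint32_t cc)
{
   uint32_t *out = batch->map;   /* load the pointer from memory once */

   assert(batch->end - out >= 4);
   out[0] = header;
   out[1] = clip;
   out[2] = sf;
   out[3] = cc;
   batch->map = out + 4;         /* single write-back of the advanced pointer */
}
```

Whether this produces the shorter sequence above depends on the compiler and
flags, so it is worth checking the disassembly rather than assuming.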
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

