[Nouveau] [RFC] drm/nouveau: optimize code emission of inline functions
Younes Manton
younes.m at gmail.com
Mon Aug 10 17:14:16 PDT 2009
On Mon, Aug 10, 2009 at 1:40 PM, Pekka Paalanen<pq at iki.fi> wrote:
> Before this patch:
>
> $ objdump -t nouveau.ko --section=.text | cut -f2 | sort -k2 | uniq -d -c
> 4
> 9 0000000000000010 BEGIN_RING
> 5 0000000000000051 FIRE_RING
> 2 00000000000000b3 NVLockVgaCrtcs
> 4 000000000000008b NVReadVgaCrtc
> 2 000000000000008c NVReadVgaCrtc
> 2 0000000000000011 NVVgaSeqReset
> 2 000000000000006b NVWriteCRTC
> 2 0000000000000066 NVWriteRAMDAC
> 4 0000000000000081 NVWriteVgaCrtc
> 3 0000000000000082 NVWriteVgaCrtc
> 11 000000000000001a OUT_RING
> 9 0000000000000028 RING_SPACE
> 2 0000000000000019 crtc_wr_cio_state
> 3 0000000000000012 drm_gem_object_unreference
> 2 0000000000000005 kmalloc
> 3 000000000000000b kzalloc
> 4 0000000000000051 nouveau_bo_ref
> 2 0000000000000050 nvReadMC
> 2 0000000000000052 nvWriteMC
> 3 0000000000000029 nv_gf4_disp_arch
> 4 000000000000001b nv_rd08
> 3 000000000000001c nv_rd08
> 29 0000000000000012 nv_rd32
> 2 0000000000000012 nv_ri32
> 5 000000000000001c nv_ro32
> 4 000000000000008b nv_two_heads
> 11 0000000000000022 nv_wo32
> 8 0000000000000015 nv_wr08
> 29 0000000000000014 nv_wr32
> 2 0000000000000013 pci_read_config_dword
>
> After this patch:
>
> $ objdump -t nouveau.ko --section=.text | cut -f2 | sort -k2 | uniq -d -c
> 4
> 9 0000000000000010 BEGIN_RING
> 5 0000000000000051 FIRE_RING
> 2 00000000000000b3 NVLockVgaCrtcs
> 5 00000000000000a7 NVReadVgaCrtc
> 2 0000000000000011 NVVgaSeqReset
> 2 0000000000000073 NVWriteCRTC
> 3 0000000000000072 NVWriteRAMDAC
> 4 0000000000000091 NVWriteVgaCrtc
> 3 0000000000000092 NVWriteVgaCrtc
> 11 000000000000001a OUT_RING
> 9 0000000000000028 RING_SPACE
> 2 0000000000000019 crtc_wr_cio_state
> 3 0000000000000012 drm_gem_object_unreference
> 2 0000000000000005 kmalloc
> 3 000000000000000b kzalloc
> 3 0000000000000051 nouveau_bo_ref
> 2 0000000000000052 nouveau_bo_ref
> 3 000000000000005d nvReadMC
> 2 000000000000005c nvWriteMC
> 3 0000000000000029 nv_gf4_disp_arch
> 4 000000000000008b nv_two_heads
> 2 0000000000000013 pci_read_config_dword
>
> As you can see, the static inline functions that were changed to
> extern inline no longer appear multiple times in the final kernel
> module. At the same time, however, the nouveau.ko file size grew:
> before: 583683 B (.text size 0x000312c8)
> after: 681075 B (.text size 0x00039474)
> That is a .text size increase of about 32 kB.
>
> So something is definitely inlined a lot more. This was tested on
> x86_64, gcc 4.1.2, CONFIG_OPTIMIZE_INLINING=y,
> CONFIG_CC_OPTIMIZE_FOR_SIZE=y.
>
> Now, I'm not sure if this patch would be a good thing or not.
> Comments?
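For reference, the quoted sizes can be cross-checked with a few lines of C. This is only a sketch: the constants are copied from the message above, and the helper names (growth, print_report) are made up for illustration. It works out to 33196 bytes of extra .text against 97392 bytes of extra file size, so about two thirds of the file growth is outside .text.

```c
#include <assert.h>
#include <stdio.h>

/* Figures copied from the quoted message. */
#define TEXT_BEFORE 0x000312c8UL
#define TEXT_AFTER  0x00039474UL
#define FILE_BEFORE 583683UL
#define FILE_AFTER  681075UL

/* Growth in bytes; callers must pass after >= before. */
unsigned long growth(unsigned long before, unsigned long after)
{
    assert(after >= before);
    return after - before;
}

/* Print both deltas so they can be compared side by side. */
void print_report(void)
{
    unsigned long text = growth(TEXT_BEFORE, TEXT_AFTER);
    unsigned long file = growth(FILE_BEFORE, FILE_AFTER);

    printf(".text growth: %lu bytes (%.1f KiB)\n", text, text / 1024.0);
    printf("file growth:  %lu bytes (%.1f KiB)\n", file, file / 1024.0);
    printf("growth outside .text: %lu bytes\n", file - text);
}
```

Calling print_report() from a small main() prints the three figures; the numbers say nothing about which sections account for the rest of the growth.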
Well, if the goal is a small module then I guess it's not a good idea,
but in that case we should also be disabling some other optimizations
that excessively bloat the module. I don't think it's a bad idea, but
I'd be curious where all the extra .text comes from. I'm guessing more
inlining and/or loop unrolling.
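
To make the "more inlining" guess concrete, here is a minimal sketch of the two styles under discussion, assuming GNU89 inline semantics (the default for the gcc 4.1.2 mentioned above). The function names are hypothetical and only loosely modelled on nv_rd32; this is not the driver's code, and the always_inline attribute is an assumption added so the sketch links without a separate out-of-line definition. The patch itself may handle that case differently.

```c
#include <assert.h>

/* Style 1: static inline. If the compiler declines to inline a call, it
 * emits a private out-of-line copy in this object file. Several object
 * files can each end up with their own copy under the same name, which
 * is what the duplicate symbols in the "before" listing look like. */
static inline unsigned int rd32_static(const unsigned int *regs,
                                       unsigned int off)
{
    return regs[off / 4];
}

/* Style 2: GNU89 "extern inline". The gnu_inline attribute requests the
 * old semantics on newer compilers: the body is used only for inlining
 * and no out-of-line copy is emitted here. always_inline (an assumption
 * of this sketch) forces the body into every direct call site. */
extern inline __attribute__((gnu_inline, always_inline))
unsigned int rd32_extern(const unsigned int *regs, unsigned int off)
{
    return regs[off / 4];
}

/* Both styles return the same value; only code emission differs. */
unsigned int rd32_checked(const unsigned int *regs, unsigned int off)
{
    unsigned int a = rd32_static(regs, off);
    unsigned int b = rd32_extern(regs, off);

    assert(a == b);
    return a;
}
```

If every call site receives its own copy of the body, a function with dozens of callers can plausibly add more code than the single shared copy it replaces, which would be one way for the duplicate symbols to disappear while .text grows.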