[Intel-gfx] [PATCH 4/4] [v4] drm/i915: Convert execbuf code to use vmas

Ben Widawsky ben at bwidawsk.net
Thu Aug 15 01:22:42 CEST 2013


On Wed, Aug 14, 2013 at 11:43:58PM +0100, Chris Wilson wrote:
> These are my numbers for a beefy haswell box (note the really
> interesting numbers will be on Baytrail):
> 
> unpatched:
> 
> relocation: buffers=   1: old=  21945 + 34.4*reloc, lut=  21814 + 34.0*reloc (ns)
> relocation: buffers=   2: old=  15947 + 36.4*reloc, lut=  16169 + 35.4*reloc (ns)
> relocation: buffers=   4: old=  12711 + 37.6*reloc, lut=  13039 + 36.7*reloc (ns)
> relocation: buffers=   8: old=   6154 + 40.9*reloc, lut=   7201 + 38.9*reloc (ns)
> relocation: buffers=  16: old=   4846 + 41.6*reloc, lut=   5337 + 40.6*reloc (ns)
> relocation: buffers=  32: old=   7097 + 41.9*reloc, lut=   6943 + 41.0*reloc (ns)
> relocation: buffers=  64: old=  13318 + 41.9*reloc, lut=  12748 + 41.2*reloc (ns)
> relocation: buffers= 128: old=  27282 + 43.0*reloc, lut=  25778 + 41.7*reloc (ns)
> relocation: buffers= 256: old=  54535 + 45.2*reloc, lut=  51912 + 43.7*reloc (ns)
> relocation: buffers= 512: old= 137447 + 53.2*reloc, lut= 129333 + 45.5*reloc (ns)
> relocation: buffers=1024: old= 307347 + 66.5*reloc, lut= 291487 + 48.1*reloc (ns)
> relocation: buffers=2048: old= 606300 + 92.1*reloc, lut= 574774 + 51.6*reloc (ns)
> skip-relocs: buffers=   1: old=   1583 + 15.6*reloc, lut=   1516 + 14.5*reloc (ns)
> skip-relocs: buffers=   2: old=   1621 + 15.6*reloc, lut=   1603 + 14.5*reloc (ns)
> skip-relocs: buffers=   4: old=   1791 + 15.6*reloc, lut=   1777 + 14.5*reloc (ns)
> skip-relocs: buffers=   8: old=   2009 + 15.6*reloc, lut=   2024 + 14.6*reloc (ns)
> skip-relocs: buffers=  16: old=   2637 + 15.7*reloc, lut=   2564 + 14.6*reloc (ns)
> skip-relocs: buffers=  32: old=   3835 + 15.8*reloc, lut=   3785 + 14.7*reloc (ns)
> skip-relocs: buffers=  64: old=   6996 + 15.8*reloc, lut=   6681 + 14.7*reloc (ns)
> skip-relocs: buffers= 128: old=  14333 + 16.4*reloc, lut=  13560 + 15.2*reloc (ns)
> skip-relocs: buffers= 256: old=  28092 + 17.7*reloc, lut=  26759 + 16.2*reloc (ns)
> skip-relocs: buffers= 512: old=  70885 + 25.2*reloc, lut=  66713 + 17.9*reloc (ns)
> skip-relocs: buffers=1024: old= 158520 + 35.2*reloc, lut= 150828 + 20.1*reloc (ns)
> skip-relocs: buffers=2048: old= 314208 + 54.3*reloc, lut= 298343 + 22.1*reloc (ns)
> no-relocs: buffers=   1: old=   1533 + 5.2*reloc, lut=   1498 + 4.9*reloc (ns)
> no-relocs: buffers=   2: old=   1518 + 5.2*reloc, lut=   1505 + 4.9*reloc (ns)
> no-relocs: buffers=   4: old=   1647 + 5.2*reloc, lut=   1593 + 4.9*reloc (ns)
> no-relocs: buffers=   8: old=   1882 + 5.3*reloc, lut=   1874 + 5.0*reloc (ns)
> no-relocs: buffers=  16: old=   2399 + 5.3*reloc, lut=   2341 + 5.0*reloc (ns)
> no-relocs: buffers=  32: old=   3638 + 5.3*reloc, lut=   3554 + 5.0*reloc (ns)
> no-relocs: buffers=  64: old=   6622 + 5.3*reloc, lut=   6308 + 5.1*reloc (ns)
> no-relocs: buffers= 128: old=  13584 + 5.3*reloc, lut=  12872 + 5.1*reloc (ns)
> no-relocs: buffers= 256: old=  26519 + 5.8*reloc, lut=  25234 + 5.5*reloc (ns)
> no-relocs: buffers= 512: old=  67128 + 5.4*reloc, lut=  63054 + 5.2*reloc (ns)
> no-relocs: buffers=1024: old= 146705 + 5.2*reloc, lut= 139020 + 5.1*reloc (ns)
> no-relocs: buffers=2048: old= 290319 + 5.4*reloc, lut= 274705 + 5.4*reloc (ns)
> 
> vma(execbuffer):
> 
> relocation: buffers=   1: old=  21922 + 34.6*reloc, lut=  21510 + 34.0*reloc (ns)
> relocation: buffers=   2: old=  16851 + 37.4*reloc, lut=  17123 + 35.4*reloc (ns)
> relocation: buffers=   4: old=  13234 + 37.8*reloc, lut=  13436 + 36.9*reloc (ns)
> relocation: buffers=   8: old=   6549 + 40.8*reloc, lut=   6512 + 39.8*reloc (ns)
> relocation: buffers=  16: old=   5012 + 41.8*reloc, lut=   4883 + 41.0*reloc (ns)
> relocation: buffers=  32: old=   8591 + 42.2*reloc, lut=   8377 + 41.1*reloc (ns)
> relocation: buffers=  64: old=  16051 + 42.8*reloc, lut=  15658 + 41.7*reloc (ns)
> relocation: buffers= 128: old=  33397 + 44.5*reloc, lut=  32705 + 43.3*reloc (ns)
> relocation: buffers= 256: old=  68012 + 46.8*reloc, lut=  66904 + 45.5*reloc (ns)
> relocation: buffers= 512: old= 160162 + 56.4*reloc, lut= 155586 + 49.1*reloc (ns)
> relocation: buffers=1024: old= 348728 + 71.8*reloc, lut= 338113 + 55.1*reloc (ns)
> relocation: buffers=2048: old= 699331 + 98.7*reloc, lut= 675969 + 62.2*reloc (ns)
> skip-relocs: buffers=   1: old=   1642 + 16.5*reloc, lut=   1588 + 15.6*reloc (ns)
> skip-relocs: buffers=   2: old=   1676 + 16.4*reloc, lut=   1663 + 15.6*reloc (ns)
> skip-relocs: buffers=   4: old=   1926 + 16.4*reloc, lut=   1891 + 15.6*reloc (ns)
> skip-relocs: buffers=   8: old=   2218 + 16.6*reloc, lut=   2212 + 15.7*reloc (ns)
> skip-relocs: buffers=  16: old=   2933 + 16.6*reloc, lut=   2880 + 15.7*reloc (ns)
> skip-relocs: buffers=  32: old=   4594 + 16.6*reloc, lut=   4523 + 15.8*reloc (ns)
> skip-relocs: buffers=  64: old=   8414 + 16.8*reloc, lut=   8210 + 15.9*reloc (ns)
> skip-relocs: buffers= 128: old=  17429 + 17.9*reloc, lut=  17062 + 16.8*reloc (ns)
> skip-relocs: buffers= 256: old=  34794 + 19.8*reloc, lut=  34144 + 18.4*reloc (ns)
> skip-relocs: buffers= 512: old=  82287 + 27.6*reloc, lut=  80002 + 20.8*reloc (ns)
> skip-relocs: buffers=1024: old= 179851 + 38.0*reloc, lut= 174574 + 23.9*reloc (ns)
> skip-relocs: buffers=2048: old= 361511 + 57.2*reloc, lut= 350132 + 26.8*reloc (ns)
> no-relocs: buffers=   1: old=   1581 + 5.2*reloc, lut=   1579 + 4.9*reloc (ns)
> no-relocs: buffers=   2: old=   1609 + 5.2*reloc, lut=   1572 + 4.9*reloc (ns)
> no-relocs: buffers=   4: old=   1701 + 5.3*reloc, lut=   1685 + 4.9*reloc (ns)
> no-relocs: buffers=   8: old=   2084 + 5.3*reloc, lut=   2033 + 5.0*reloc (ns)
> no-relocs: buffers=  16: old=   2747 + 5.3*reloc, lut=   2686 + 5.0*reloc (ns)
> no-relocs: buffers=  32: old=   4379 + 5.3*reloc, lut=   4285 + 5.0*reloc (ns)
> no-relocs: buffers=  64: old=   8049 + 5.3*reloc, lut=   7850 + 5.1*reloc (ns)
> no-relocs: buffers= 128: old=  16641 + 5.4*reloc, lut=  16301 + 5.2*reloc (ns)
> no-relocs: buffers= 256: old=  33111 + 5.7*reloc, lut=  32539 + 5.5*reloc (ns)
> no-relocs: buffers= 512: old=  79898 + 5.4*reloc, lut=  77517 + 5.2*reloc (ns)
> no-relocs: buffers=1024: old= 172199 + 5.2*reloc, lut= 166907 + 5.1*reloc (ns)
> no-relocs: buffers=2048: old= 345542 + 5.2*reloc, lut= 334300 + 5.3*reloc (ns)
> 
> So there is measurable degradation from the extra indirections, both for
> looking up the execbuffers and for performing the relocations, though it
> doesn't merit anything more than a footnote in the changelog.
> -Chris
> 
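
Each line of the quoted output is a linear fit: time in ns = fixed cost +
per-relocation cost * number of relocations. The following is a minimal
sketch, not part of the benchmark itself, of how two runs could be compared.
The names parse_line, predicted_ns and pct_change are hypothetical, and the
regular expression assumes exactly the line layout shown above.

```python
import re
from typing import NamedTuple


class Fit(NamedTuple):
    """One benchmark line: fixed cost (ns) and per-relocation cost (ns)."""
    mode: str
    buffers: int
    old_base: float
    old_slope: float
    lut_base: float
    lut_slope: float


# Assumed layout, copied from the quoted output, e.g.
# "relocation: buffers=2048: old= 606300 + 92.1*reloc, lut= 574774 + 51.6*reloc (ns)"
_LINE = re.compile(
    r"([\w-]+): buffers=\s*(\d+): "
    r"old=\s*(\d+) \+ ([\d.]+)\*reloc, "
    r"lut=\s*(\d+) \+ ([\d.]+)\*reloc \(ns\)"
)


def parse_line(line: str) -> Fit:
    """Parse one result line; a leading '> ' quote marker is ignored."""
    m = _LINE.search(line)
    if m is None:
        raise ValueError(f"unrecognised benchmark line: {line!r}")
    mode, buffers, ob, os_, lb, ls = m.groups()
    return Fit(mode, int(buffers), float(ob), float(os_), float(lb), float(ls))


def predicted_ns(base: float, slope: float, relocs: int) -> float:
    """Evaluate the linear model base + slope * relocs."""
    return base + slope * relocs


def pct_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return 100.0 * (after - before) / before


if __name__ == "__main__":
    # The two buffers=2048 "relocation" lines from the runs quoted above.
    unpatched = parse_line(
        "> relocation: buffers=2048: old= 606300 + 92.1*reloc, "
        "lut= 574774 + 51.6*reloc (ns)"
    )
    vma = parse_line(
        "> relocation: buffers=2048: old= 699331 + 98.7*reloc, "
        "lut= 675969 + 62.2*reloc (ns)"
    )
    print("lut fixed cost change (%):",
          round(pct_change(unpatched.lut_base, vma.lut_base), 1))
    print("lut per-reloc change (%):",
          round(pct_change(unpatched.lut_slope, vma.lut_slope), 1))
```

Comparing the fixed cost and the per-relocation slope separately keeps the
two effects apart: the first reflects the execbuffer lookup, the second the
relocation processing.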

I'm sad that I can't reproduce it. I think I already amended the commit
message; I can do more if you want.

-- 
Ben Widawsky, Intel Open Source Technology Center


