[Intel-gfx] [PATCH 4/4] [v4] drm/i915: Convert execbuf code to use vmas

Ben Widawsky ben at bwidawsk.net
Wed Aug 14 03:11:59 CEST 2013


On Tue, Aug 13, 2013 at 06:09:09PM -0700, Ben Widawsky wrote:
> From: Ben Widawsky <ben at bwidawsk.net>
> 
> In order to transition more of our code over to using a VMA instead of
> an <OBJ, VM> pair - we must have the vma accessible at execbuf time. Up
> until now, we've only had a VMA when actually binding an object.
> 
> The previous patch helped handle the distinction on bound vs. unbound.
> This patch will help us catch leaks, and other issues before we actually
> shuffle a bunch of stuff around.
> 
> This attempts to convert all the execbuf code to speak in vmas. Since
> the execbuf code is very self-contained it was a nice isolated
> conversion.
> 
> The meat of the code is about turning eb_objects into eb_vma, and then
> wiring up the rest of the code to use vmas instead of obj, vm pairs.
> 
> Unfortunately, to do this, we must move the exec_list link from the obj
> structure. This list is reused in the eviction code, so we must also
> modify the eviction code to make this work.
> 
> WARNING: This patch makes an already hotly profiled path slower. The cost is
> unavoidable. In reply to this mail, I will attach the extra data.
> 

[snip]

Here is the output from gem_exec_lut_handle both before and after this
patch. The results honestly don't make sense to me, but I'll let Chris
parse it before scratching my head harder.

Before patch
============
relocation: buffers=   1: old=   8060 + 165.3*reloc, lut=   7816 + 164.8*reloc (ns)
relocation: buffers=   2: old=   6748 + 166.6*reloc, lut=   6952 + 165.4*reloc (ns)
relocation: buffers=   4: old=   8140 + 165.9*reloc, lut=   8216 + 165.4*reloc (ns)
relocation: buffers=   8: old=  10732 + 166.0*reloc, lut=  10615 + 165.2*reloc (ns)
relocation: buffers=  16: old=  15099 + 167.8*reloc, lut=  15337 + 165.3*reloc (ns)
relocation: buffers=  32: old=  26140 + 166.0*reloc, lut=  25488 + 165.5*reloc (ns)
relocation: buffers=  64: old=  46300 + 170.5*reloc, lut=  44279 + 166.7*reloc (ns)
relocation: buffers= 128: old=  84056 + 176.9*reloc, lut=  85379 + 166.3*reloc (ns)
relocation: buffers= 256: old= 174398 + 167.9*reloc, lut= 167744 + 167.0*reloc (ns)
relocation: buffers= 512: old= 349688 + 175.7*reloc, lut= 348590 + 170.8*reloc (ns)
relocation: buffers=1024: old= 726265 + 191.2*reloc, lut= 719774 + 180.2*reloc (ns)
relocation: buffers=2048: old=1456866 + 224.3*reloc, lut=1442087 + 173.0*reloc (ns)
skip-relocs: buffers=   1: old=   4445 + 16.0*reloc, lut=   4433 + 15.6*reloc (ns)
skip-relocs: buffers=   2: old=   4585 + 16.0*reloc, lut=   4571 + 15.6*reloc (ns)
skip-relocs: buffers=   4: old=   5667 + 16.0*reloc, lut=   5340 + 15.6*reloc (ns)
skip-relocs: buffers=   8: old=   6051 + 16.1*reloc, lut=   6026 + 15.6*reloc (ns)
skip-relocs: buffers=  16: old=   7953 + 16.1*reloc, lut=   7914 + 15.6*reloc (ns)
skip-relocs: buffers=  32: old=  11972 + 16.2*reloc, lut=  11875 + 15.7*reloc (ns)
skip-relocs: buffers=  64: old=  19999 + 16.5*reloc, lut=  19832 + 15.7*reloc (ns)
skip-relocs: buffers= 128: old=  37796 + 16.9*reloc, lut=  36539 + 15.9*reloc (ns)
skip-relocs: buffers= 256: old=  71604 + 18.1*reloc, lut=  71313 + 16.5*reloc (ns)
skip-relocs: buffers= 512: old= 152682 + 24.3*reloc, lut= 141379 + 27.9*reloc (ns)
skip-relocs: buffers=1024: old= 314116 + 41.7*reloc, lut= 303019 + 20.1*reloc (ns)
skip-relocs: buffers=2048: old= 619784 + 54.1*reloc, lut= 603931 + 20.0*reloc (ns)
no-relocs: buffers=   1: old=   4194 + 5.1*reloc, lut=   4206 + 4.8*reloc (ns)
no-relocs: buffers=   2: old=   4404 + 5.1*reloc, lut=   4381 + 4.8*reloc (ns)
no-relocs: buffers=   4: old=   4926 + 5.1*reloc, lut=   4921 + 4.8*reloc (ns)
no-relocs: buffers=   8: old=   5901 + 5.1*reloc, lut=   5822 + 4.9*reloc (ns)
no-relocs: buffers=  16: old=   7840 + 5.1*reloc, lut=   7737 + 4.9*reloc (ns)
no-relocs: buffers=  32: old=  11842 + 5.1*reloc, lut=  11681 + 4.9*reloc (ns)
no-relocs: buffers=  64: old=  19741 + 5.1*reloc, lut=  19542 + 4.8*reloc (ns)
no-relocs: buffers= 128: old=  36479 + 5.2*reloc, lut=  35958 + 4.9*reloc (ns)
no-relocs: buffers= 256: old=  70171 + 5.4*reloc, lut=  69390 + 5.2*reloc (ns)
no-relocs: buffers= 512: old= 147213 + 3.5*reloc, lut= 137953 + 13.0*reloc (ns)
no-relocs: buffers=1024: old= 300165 + 4.8*reloc, lut= 293852 + 4.9*reloc (ns)
no-relocs: buffers=2048: old= 597992 + 8.3*reloc, lut= 590185 + 2.1*reloc (ns)


After patch
===========
relocation: buffers=   1: old=   8075 + 81.4*reloc, lut=   7592 + 80.6*reloc (ns)
relocation: buffers=   2: old=   5744 + 82.3*reloc, lut=   5837 + 81.1*reloc (ns)
relocation: buffers=   4: old=   4875 + 82.7*reloc, lut=   4871 + 81.6*reloc (ns)
relocation: buffers=   8: old=   5729 + 82.7*reloc, lut=   5698 + 81.5*reloc (ns)
relocation: buffers=  16: old=   7952 + 83.0*reloc, lut=   7809 + 81.9*reloc (ns)
relocation: buffers=  32: old=  11884 + 82.9*reloc, lut=  11702 + 81.6*reloc (ns)
relocation: buffers=  64: old=  20388 + 83.4*reloc, lut=  19995 + 82.2*reloc (ns)
relocation: buffers= 128: old=  38057 + 85.0*reloc, lut=  37675 + 83.4*reloc (ns)
relocation: buffers= 256: old=  74912 + 87.0*reloc, lut=  74064 + 85.4*reloc (ns)
relocation: buffers= 512: old= 161136 + 94.8*reloc, lut= 157046 + 87.5*reloc (ns)
relocation: buffers=1024: old= 349443 + 107.0*reloc, lut= 342081 + 91.2*reloc (ns)
relocation: buffers=2048: old= 707951 + 131.8*reloc, lut= 690754 + 96.9*reloc (ns)
skip-relocs: buffers=   1: old=   2966 + 16.6*reloc, lut=   2963 + 15.6*reloc (ns)
skip-relocs: buffers=   2: old=   3083 + 16.5*reloc, lut=   3056 + 15.5*reloc (ns)
skip-relocs: buffers=   4: old=   3279 + 16.6*reloc, lut=   3242 + 15.6*reloc (ns)
skip-relocs: buffers=   8: old=   3692 + 16.7*reloc, lut=   3654 + 15.6*reloc (ns)
skip-relocs: buffers=  16: old=   4522 + 16.7*reloc, lut=   4461 + 15.5*reloc (ns)
skip-relocs: buffers=  32: old=   6254 + 16.7*reloc, lut=   6138 + 15.7*reloc (ns)
skip-relocs: buffers=  64: old=  10098 + 16.8*reloc, lut=   9939 + 15.7*reloc (ns)
skip-relocs: buffers= 128: old=  17983 + 17.6*reloc, lut=  17729 + 16.3*reloc (ns)
skip-relocs: buffers= 256: old=  34388 + 18.8*reloc, lut=  33981 + 17.6*reloc (ns)
skip-relocs: buffers= 512: old=  74211 + 25.2*reloc, lut=  72185 + 18.6*reloc (ns)
skip-relocs: buffers=1024: old= 160514 + 34.1*reloc, lut= 157086 + 20.3*reloc (ns)
skip-relocs: buffers=2048: old= 323954 + 51.5*reloc, lut= 315928 + 22.5*reloc (ns)
no-relocs: buffers=   1: old=   2840 + 5.1*reloc, lut=   2834 + 4.8*reloc (ns)
no-relocs: buffers=   2: old=   2938 + 5.1*reloc, lut=   2917 + 4.8*reloc (ns)
no-relocs: buffers=   4: old=   3220 + 5.1*reloc, lut=   3201 + 4.8*reloc (ns)
no-relocs: buffers=   8: old=   3614 + 5.1*reloc, lut=   3545 + 4.8*reloc (ns)
no-relocs: buffers=  16: old=   4437 + 5.1*reloc, lut=   4368 + 4.8*reloc (ns)
no-relocs: buffers=  32: old=   6105 + 5.1*reloc, lut=   6024 + 4.9*reloc (ns)
no-relocs: buffers=  64: old=   9864 + 5.1*reloc, lut=   9652 + 4.9*reloc (ns)
no-relocs: buffers= 128: old=  17388 + 5.1*reloc, lut=  17126 + 4.9*reloc (ns)
no-relocs: buffers= 256: old=  33087 + 5.4*reloc, lut=  32668 + 5.3*reloc (ns)
no-relocs: buffers= 512: old=  71476 + 5.0*reloc, lut=  69464 + 4.9*reloc (ns)
no-relocs: buffers=1024: old= 154379 + 4.9*reloc, lut= 152796 + 4.3*reloc (ns)
no-relocs: buffers=2048: old= 309435 + 5.0*reloc, lut= 301095 + 4.9*reloc (ns)
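
To compare the two tables row by row, the output lines can be parsed mechanically. The following is a minimal sketch, not part of gem_exec_lut_handle: the function names are hypothetical, and it assumes each line reads as a fixed cost plus a per-relocation cost in nanoseconds, which is how the "base + slope*reloc (ns)" format appears to be laid out.

```python
import re

# Matches one result line in the layout shown in this mail, e.g.
# "relocation: buffers=  64: old=  46300 + 170.5*reloc, lut=  44279 + 166.7*reloc (ns)"
LINE_RE = re.compile(
    r"^(?P<mode>[\w-]+): buffers=\s*(?P<buffers>\d+): "
    r"old=\s*(?P<old_base>\d+) \+ (?P<old_slope>[\d.]+)\*reloc, "
    r"lut=\s*(?P<lut_base>\d+) \+ (?P<lut_slope>[\d.]+)\*reloc \(ns\)$"
)


def parse_line(line):
    """Return the fields of one result line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "mode": m["mode"],
        "buffers": int(m["buffers"]),
        # Each entry is (base_ns, ns_per_reloc); this reading is an assumption.
        "old": (int(m["old_base"]), float(m["old_slope"])),
        "lut": (int(m["lut_base"]), float(m["lut_slope"])),
    }


def cost_ns(fit, relocs):
    """Evaluate base + slope * relocs for one (base, slope) pair."""
    base, slope = fit
    return base + slope * relocs


# The buffers=64 "relocation" rows, copied from the two tables in this mail.
before = parse_line(
    "relocation: buffers=  64: old=  46300 + 170.5*reloc, lut=  44279 + 166.7*reloc (ns)"
)
after = parse_line(
    "relocation: buffers=  64: old=  20388 + 83.4*reloc, lut=  19995 + 82.2*reloc (ns)"
)

# Ratio of the fixed costs for the "old" path, after relative to before.
base_ratio = after["old"][0] / before["old"][0]
print(f"old base cost ratio at 64 buffers: {base_ratio:.2f}")
```

Feeding every line of both tables through parse_line and keying the results on (mode, buffers) gives a direct before/after comparison for each row.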


-- 
Ben Widawsky, Intel Open Source Technology Center


