[PATCH 0/4] [RFC] Sometimes opt for wbinvd over clflush

Ben Widawsky benjamin.widawsky at intel.com
Sat Dec 13 19:08:20 PST 2014


While looking at a recent benchmark on a non-LLC platform, Kristian noticed that
the time spent simply clflushing buffers was not only measurable, but dominated
the profile. It's possible I'm oversimplifying the problem, but it seems that
when the CPU is slow and you know the set of BOs is using all or most of the
cache, wbinvd is the optimal solution. With these patches, the micro-benchmark
gains about 3.5x in FPS.
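
To make the idea concrete, here is a rough userspace sketch of the kind of
decision described above. It is not the code from these patches: the names
(flush_method, choose_flush_method) and the threshold (switch once the bytes to
flush reach the cache size) are assumptions made purely for illustration.

```c
#include <stddef.h>

/* Hypothetical names, for illustration only; not taken from the patches. */
enum flush_method {
	FLUSH_CLFLUSH,	/* flush the buffers one cache line at a time */
	FLUSH_WBINVD	/* write back and invalidate the whole cache */
};

/*
 * Assumed heuristic: once the buffers to be flushed cover at least the
 * whole cache, a single whole-cache flush is likely cheaper than walking
 * every line. A cache size of 0 means "unknown", so stay with clflush.
 */
static enum flush_method choose_flush_method(size_t bytes_to_flush,
					     size_t cache_bytes)
{
	if (cache_bytes != 0 && bytes_to_flush >= cache_bytes)
		return FLUSH_WBINVD;

	return FLUSH_CLFLUSH;
}
```

Where exactly the threshold should sit is the open question; the comparison
against the full cache size is only a placeholder.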

These patches attempt to make a generic solution which could potentially be used
by other drivers. It can just as easily be implemented solely in i915, and if
that's what people find more desirable and safe, I am happy to do that as well.

I wouldn't say these patches are ready for inclusion, as I haven't spent much
time testing or polishing them. I would like feedback on what people think of
the general idea, thoughts on figuring out when to switch over to wbinvd, and in
particular [as mentioned in patch 3] whether I even need to do the synchronized
wbinvd. (For the time being, I have convinced myself we can avoid it on i915,
but I am quite often wrong about such things; more details in the relevant
patch.)

The PPC-specific code is only compile-tested.

Thanks.

Ben Widawsky (4):
  drm/cache: Use wbinvd helpers
  drm/cache: Try to be smarter about clflushing on x86
  drm/cache: Return what type of cache flush occurred
  drm/i915: Opportunistically reduce flushing at execbuf

 drivers/gpu/drm/drm_cache.c                | 54 +++++++++++++++++++++---------
 drivers/gpu/drm/i915/i915_drv.h            |  3 +-
 drivers/gpu/drm/i915/i915_gem.c            | 12 +++----
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  8 +++--
 drivers/gpu/drm/i915/intel_lrc.c           |  8 +++--
 include/drm/drmP.h                         | 13 +++++--
 6 files changed, 66 insertions(+), 32 deletions(-)

-- 
2.1.3


