[Intel-gfx] [PATCH] drm/i915: Update MOCS settings for gen 9

Arkadiusz Hiler arkadiusz.hiler at intel.com
Thu May 4 08:35:33 UTC 2017


On Thu, Apr 27, 2017 at 05:23:16PM +0100, Chris Wilson wrote:
> On Thu, Apr 27, 2017 at 06:30:42PM +0300, David Weinehall wrote:
> > On Thu, Apr 27, 2017 at 04:55:20PM +0200, Arkadiusz Hiler wrote:
> > > On Wed, Apr 26, 2017 at 06:00:41PM +0300, David Weinehall wrote:
> > > > Add a bunch of MOCS entries for gen 9 that were missing from intel_mocs.
> > > > Some of these are used by media-sdk; if these entries are missing
> > > > the default will instead be to do everything uncached.
> > > > 
> > > > This patch improves media-sdk performance by up to 60%
> > > > with the (admittedly synthetic) benchmarks we use in our nightly
> > > > testing, without regressing any other benchmarks.
> > > 
> > > Hey David,
> > > 
> > > I am testing some of the extended MOCS with Mesa and the differences I
> > > see fit in the margins of statistical error.
> > > 
> > > Odd, I thought, so to make sure I haven't messed up anything in the
> > > process of compiling, setting LD_LIBRARY_PATH and benchmarking I turned
> > > everything to UNCACHED - and I saw severe performance drop.
> > > 
> > > So here is the question it induced:
> > > 
> > > Have you used the "closest neighbour" from entries available or did you
> > > default to the UNCACHED ones? That could be the culprit.
> > > 
> > > Note: I have tested MOCS for VB and Render Target only, and only in a
> > > few synthetic cases - it will require much more fine-tuning and
> > > benchmarking before any final conclusions.
> > 
> > As I mentioned in the commit message, the improvements only manifest
> > themselves for media-sdk workloads (and presumably other workloads
> > that use the same hardware); if you see any performance regressions
> > with these additional entries I'd be interested to know.
> 
> But what is being counter-suggested is that there is no reason for these
> mocs entries. If the sdk is just using mocs registers without first
> programming them outside of the kernel abi, then it will be hitting
> uncached memory - and then the only benefit is from simply enabling
> cached access. The kernel ABI is minimalist for a reason, and we want to
> know why we should be adding tables that we need to maintain forever
> (bonus points for making that a consistent interface for hardware for
> years to come).
> -Chris

Thanks for rephrasing - that's exactly what I am concerned with.

Did you use the MediaSDK as is - meaning that it naively used MOCS
entries beyond the set of 3 we have defined?

If that's the case it is probably the cause of the performance
difference - everything beyond "the 3" means UNCACHED.
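
To make the concern concrete, here is a rough sketch in C of the
behaviour I have in mind. It is only an illustration, not the actual
intel_mocs code: the names (cache_policy, defined_entries,
effective_policy) and the policy labels are made up for this example,
and the only thing taken from this thread is that three entries are
defined and anything beyond them ends up UNCACHED.

```c
/*
 * Illustrative model only: hypothetical names, not the i915 table
 * or its register encodings.
 */
enum cache_policy {
	POLICY_UNCACHED = 0,
	POLICY_PTE,
	POLICY_CACHED
};

/* The three entries the kernel defines (assumed ordering). */
#define NUM_DEFINED_ENTRIES 3u

static const enum cache_policy defined_entries[NUM_DEFINED_ENTRIES] = {
	POLICY_UNCACHED,
	POLICY_PTE,
	POLICY_CACHED,
};

/*
 * Policy that userspace effectively gets for a given MOCS index.
 * Indices beyond the defined set were never programmed, so they
 * behave as uncached.
 */
static enum cache_policy effective_policy(unsigned int index)
{
	if (index < NUM_DEFINED_ENTRIES)
		return defined_entries[index];

	return POLICY_UNCACHED;
}
```

With this model, a library asking for index 2 gets cached access,
while one asking for, say, index 17 silently gets uncached access,
which would be enough to explain a large gap without needing any new
table entries.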

Can you try changing MediaSDK to use only the entries that are already
defined? How does the performance differ in that case?

-- 
Cheers,
Arek



