[Mesa-dev] [PATCH] i965: Don't check for draw-time errors that cannot occur in core profile

Ilia Mirkin imirkin at alum.mit.edu
Mon Aug 31 16:21:43 PDT 2015


On Mon, Aug 31, 2015 at 7:06 PM, Ian Romanick <idr at freedesktop.org> wrote:
> ping. :)
>
> On 08/10/2015 11:48 AM, Matt Turner wrote:
>> On Mon, Aug 10, 2015 at 10:12 AM, Ian Romanick <idr at freedesktop.org> wrote:
>>> From: Ian Romanick <ian.d.romanick at intel.com>
>>>
>>> On many CPU-limited applications, this is *the* hot path.  The idea is
>>> to generate per-API versions of brw_draw_prims that elide some checks.
>>> This patch removes render-mode and "is everything in VBOs" checks from
>>> core-profile contexts.
>>>
>>> On my IVB laptop (which may have experienced thermal throttling):
>>>
>>> Gl32Batch7:     3.70955% +/- 1.11344%
>>
>> I'm getting 3.18414% +/- 0.587956% (n=113) on my IVB, which probably
>> matches your numbers depending on your value of n.
>>
>>> OglBatch7:      1.04398% +/- 0.772788%
>>
>> I'm getting 1.15377% +/- 1.05898% (n=34) on my IVB, which probably
>> matches your numbers depending on your value of n.
>
> This is another thing that makes me feel a little uncomfortable with
> the way we've done performance measurements in the past.  If I run my
> test before and after this patch for 121 iterations, which I have done,
> I can cut the data at any point and oscillate between "no difference"
> and X% +/- some-large-fraction-of-X%.  Since the before and after code
> for the compatibility profile path should be identical, "no difference"
> is the only believable result.
>
> Using a higher confidence threshold (e.g., -c 98) results in "no
> difference" throughout, as expected.  I feel like 90% isn't a strict
> enough confidence level for a lot of what we do, but I'm unsure how to
> determine what confidence level we should use.  We could experimentally
> determine it by running a test some number of times and finding the
> level that detects no change across random partitionings of the test
> results.  Ugh.
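
To make the oscillation you describe concrete: a ministat-style
comparison (assuming that's what the -c flag above refers to) just asks
whether the confidence interval on the difference of means excludes
zero, and that interval widens as the confidence level goes up, so a
small delta can be flagged at -c 90 and disappear at -c 98.  A rough
sketch of that comparison in Python (scipy assumed; the per-run numbers
below are invented for illustration, not real benchmark output):

# Welch confidence interval on mean(after) - mean(before), roughly what
# a ministat-style comparison reports at a given confidence level.
import math
from scipy import stats

def diff_ci(before, after, confidence):
    """Welch CI for mean(after) - mean(before) at the given confidence."""
    n1, n2 = len(before), len(after)
    m1, m2 = sum(before) / n1, sum(after) / n2
    v1 = sum((x - m1) ** 2 for x in before) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in after) / (n2 - 1)
    se = math.sqrt(v1 / n1 + v2 / n2)
    # Welch-Satterthwaite degrees of freedom.
    df = (v1 / n1 + v2 / n2) ** 2 / (
        (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    t = stats.t.ppf(0.5 + confidence / 2, df)
    d = m2 - m1
    return d - t * se, d + t * se

# Hypothetical per-run scores: a small shift buried in run-to-run noise.
before = [99.0, 100.5, 101.0, 99.5, 100.0, 99.2, 100.8, 100.0]
after  = [100.2, 101.5, 100.0, 101.8, 100.6, 101.2, 99.8, 100.9]

for conf in (0.90, 0.98):
    lo, hi = diff_ci(before, after, conf)
    verdict = "no difference" if lo <= 0.0 <= hi else "difference"
    print(f"{conf:.0%}: [{lo:+.3f}, {hi:+.3f}] -> {verdict}")

With these made-up numbers the 90% interval excludes zero and the 98%
interval doesn't, which is exactly the kind of flip described above.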

(sorry, statistics rant below, can't help myself)

AFAIK the standard in statistics is to use a 95% confidence interval.
Unless you have 'statistician' in your job title [or someone with that
job title has indicated otherwise], that's what you should probably
use. Using anything lower than that is a way of saying "This
scientific study isn't turning out the way I wanted, I'm going to have
to take matters into my own hands".

Of course, note that if you run the same experiment 20 times with a
95% interval, you should expect about one of those 20 runs to yield a
confidence interval that does not include the true mean. And in
general I'd be very suspicious of results where the change is near the
confidence interval boundary.
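
A quick way to see that in numbers is to simulate the 20 repeats
directly (numpy and scipy assumed; the distribution parameters are
arbitrary):

# Draw 20 independent samples from a normal distribution with a known
# mean and count how many of the 95% confidence intervals miss it.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
true_mean, sigma, n, trials = 100.0, 1.0, 30, 20

misses = 0
for _ in range(trials):
    sample = rng.normal(true_mean, sigma, n)
    m, s = sample.mean(), sample.std(ddof=1)
    half = stats.t.ppf(0.975, n - 1) * s / np.sqrt(n)
    if not (m - half <= true_mean <= m + half):
        misses += 1

print(f"{misses} of {trials} intervals missed the true mean")

The average is one miss per 20, but any single batch of 20 can easily
come out as 0, 1, or 2, which is another reason to distrust a result
that sits right at the boundary.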

And lastly, all this statistics stuff assumes that you're sampling the
same normal distribution repeatedly. This isn't exactly true, but it's
true enough. However, you can try to get more accurate results by
experimentally determining a fudge factor for the CI width. You could
run (literal) no-op experiments lots of times and fudge the output of
the CI width calculation until it matches up with empirical results,
e.g. if you run the same experiment 100x with a 95% CI, adjust the
factor until you end up with about 5 significant results and 95
insignificant ones. Ideally such a fudge factor should not be too
different from 1; otherwise you have a very non-normal distribution,
and fudge factors ain't gonna help you. Note that any computed fudge
factor couldn't be shared among machine groups that weren't used to
determine it, so I'm not seriously recommending this approach, but
thought I'd mention it.
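
That said, if someone did want to try it, the calibration loop is
simple to sketch (numpy and scipy assumed; the no-op "runs" here are
simulated normal data standing in for real benchmark output):

# Compare pairs of no-op A/A runs at a 95% level, scaling the interval
# width by a candidate fudge factor, and check how many comparisons
# come out "significant".  Real use would replace rng.normal(...) with
# actual benchmark runs.
import numpy as np
from scipy import stats

def significant(a, b, fudge, confidence=0.95):
    """True if the fudged CI on mean(b) - mean(a) excludes zero."""
    na, nb = len(a), len(b)
    va, vb = a.var(ddof=1) / na, b.var(ddof=1) / nb
    se = np.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    half = fudge * stats.t.ppf(0.5 + confidence / 2, df) * se
    return abs(b.mean() - a.mean()) > half

rng = np.random.default_rng(1)
runs, n = 1000, 30
for fudge in (1.0, 1.1, 1.25):
    hits = sum(significant(rng.normal(100, 1, n),
                           rng.normal(100, 1, n), fudge)
               for _ in range(runs))
    print(f"fudge {fudge}: {hits}/{runs} significant (want ~{runs // 20})")

With genuinely normal data the factor that lands near the nominal 5%
should be very close to 1.0; if real runs need something much larger,
that's the badly non-normal case where, as above, a fudge factor isn't
going to save you.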

  -ilia

