[Mesa-dev] [PATCH] i965: Enable disk shader cache by default

Thu Dec 7 23:50:03 UTC 2017

On 2017-12-07 09:57:48, Matt Turner wrote:
> On Tue, Dec 5, 2017 at 3:40 PM, Jordan Justen <jordan.l.justen at intel.com> wrote:
> > On 2017-12-05 14:49:28, Mark Janes wrote:
> >> Jordan Justen <jordan.l.justen at intel.com> writes:
> >> > On 2017-12-05 09:13:11, Mark Janes wrote:
> >> >> Adding a dimension to the test matrix has high cost, especially when
> >> >> combined with other dimensions of the test matrix (does shader cache
> >> >> need to be tested for 32/64 bit builds? For hswgt1/hswgt2/hswgt3e?).
> >> >
> >> > Are you saying this is too high cost to run per check-in? Do you need
> >> > to disable it for the health of CI? I think I proposed that daily, or
> >> > perhaps even weekly would be adequate.
> >>
> >> Certainly, the test time per line of shader cache code is massively
> >> higher than any other feature, even if you run it just once a month.
> >> Other features have tests that run in milliseconds, not 30min * 20
> >> machines.
> >
> > The scope of this feature is nearly the entire API. It is justified to
> > throw the entire GL suite of tests at it on a regular basis. The cost
> > of running this once per week ought to be reasonable.
> 
> But the entire API boils down to a comparatively small set of
> non-orthogonal state. The configuration of those knobs seems to me like
> the place things are most likely to break.
> 
> I think there's value in testing that we're hitting the cache, but if
> we're not it's not a functional regression. I'm more concerned about
> ensuring we don't have bugs that affect functionality and cause things
> to break.
> 
> The program key for fragment shaders looks like:
> 
> /** The program key for Fragment/Pixel Shaders. */
> struct brw_wm_prog_key {
>    /* Some collection of BRW_WM_IZ_* */
>    uint8_t iz_lookup;
>    bool stats_wm:1;
>    bool flat_shade:1;
>    unsigned nr_color_regions:5;
>    bool replicate_alpha:1;
>    bool clamp_fragment_color:1;
>    bool persample_interp:1;
>    bool multisample_fbo:1;
>    bool frag_coord_adds_sample_pos:1;
>    enum brw_wm_aa_enable line_aa:2;
>    bool high_quality_derivatives:1;
>    bool force_dual_color_blend:1;
>    bool coherent_fb_fetch:1;
>
>    uint64_t input_slots_valid;
>    unsigned program_string_id;
>    GLenum alpha_test_func;          /* < For Gen4/5 MRT alpha test */
>    float alpha_test_ref;
>
>    struct brw_sampler_prog_key_data tex;
> };
> 
> and it's the most complex of the shader stages. Wouldn't you feel a
> lot safer if we just had a piglit test for each of those knobs that
> compiled a shader, then changed some non-orthogonal state, and then
> rendered with it, thus confirming that it didn't get the wrong shader
> program out of the cache?

What would that be testing? That the disk_cache returns the same
program if we give it the same hash? Shouldn't disk_cache unit tests
cover this?
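
To make concrete what such a unit test would check, here is a minimal
sketch. It assumes a toy in-memory cache; the toy_* names are
hypothetical and this is not Mesa's disk_cache API. It only shows the
property in question: the same key returns the same blob, and a
different key misses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy cache for illustration; NOT Mesa's disk_cache. */
#define TOY_SLOTS 16
#define TOY_BLOB_MAX 32

struct toy_entry {
   int used;
   uint64_t key;
   size_t size;
   unsigned char blob[TOY_BLOB_MAX];
};

struct toy_cache {
   struct toy_entry slots[TOY_SLOTS];
};

static void toy_cache_init(struct toy_cache *c)
{
   memset(c, 0, sizeof(*c));
}

/* Store a blob under key. Returns 0 on success, -1 if the blob is too
 * large or the table is full. Uses linear probing, no eviction. */
static int toy_cache_put(struct toy_cache *c, uint64_t key,
                         const void *data, size_t size)
{
   if (size > TOY_BLOB_MAX)
      return -1;
   for (size_t i = 0; i < TOY_SLOTS; i++) {
      struct toy_entry *e = &c->slots[(size_t)((key + i) % TOY_SLOTS)];
      if (!e->used || e->key == key) {
         e->used = 1;
         e->key = key;
         e->size = size;
         memcpy(e->blob, data, size);
         return 0;
      }
   }
   return -1;
}

/* Look up key. Returns the stored blob and its size, or NULL on a miss. */
static const void *toy_cache_get(const struct toy_cache *c, uint64_t key,
                                 size_t *size)
{
   for (size_t i = 0; i < TOY_SLOTS; i++) {
      const struct toy_entry *e = &c->slots[(size_t)((key + i) % TOY_SLOTS)];
      if (!e->used)
         return NULL;
      if (e->key == key) {
         *size = e->size;
         return e->blob;
      }
   }
   return NULL;
}

/* Returns 1 if a stored blob comes back intact and another key misses. */
static int toy_cache_roundtrip_ok(void)
{
   struct toy_cache c;
   const char blob[] = "compiled-shader";
   size_t size = 0;

   toy_cache_init(&c);
   if (toy_cache_put(&c, 42, blob, sizeof(blob)) != 0)
      return 0;

   const void *hit = toy_cache_get(&c, 42, &size);
   if (!hit || size != sizeof(blob) || memcmp(hit, blob, size) != 0)
      return 0;

   return toy_cache_get(&c, 43, &size) == NULL;
}
```

A test at this level says nothing about whether the driver builds the
right key in the first place, which is the separate concern below.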

The scope is also more than just covering the various GL states that
might change the i965 program keys. We also need one or more programs
that are actually affected by that key change. We also need to test
that glsl program serialization works. We also need to test that nir
serialization works.

How about maintainability? Once we develop these 50~100 tests, how do
we make sure we update them if we change the i965 program keys?
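
One possible way to tie such tests to the key layout is sketched
below. Everything here is an assumption for illustration:
toy_wm_key is a cut-down, hypothetical stand-in for brw_wm_prog_key
(plain bytes instead of bitfields so each field has an offset), and
the hash is FNV-1a rather than whatever the driver really uses. A
size check breaks the build when the struct grows, and a table-driven
loop flips each listed knob and confirms the hash changes. The size
check only catches changes that alter sizeof, so a new bitfield that
fits in existing space would still slip through.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for brw_wm_prog_key. */
struct toy_wm_key {
   uint8_t iz_lookup;
   uint8_t flat_shade;
   uint8_t persample_interp;
   uint8_t multisample_fbo;
   uint32_t program_string_id;
};

/* Growing the struct changes its size and fails the build until the
 * knob table below is revisited. */
_Static_assert(sizeof(struct toy_wm_key) == 8,
               "toy_wm_key changed: update toy_knobs[]");

struct toy_knob {
   const char *name;
   size_t offset;
};

/* One entry per key field that must influence the cache key. */
static const struct toy_knob toy_knobs[] = {
   { "iz_lookup",         offsetof(struct toy_wm_key, iz_lookup) },
   { "flat_shade",        offsetof(struct toy_wm_key, flat_shade) },
   { "persample_interp",  offsetof(struct toy_wm_key, persample_interp) },
   { "multisample_fbo",   offsetof(struct toy_wm_key, multisample_fbo) },
   { "program_string_id", offsetof(struct toy_wm_key, program_string_id) },
};

#define TOY_NUM_KNOBS (sizeof(toy_knobs) / sizeof(toy_knobs[0]))

/* 64-bit FNV-1a over a byte range. */
static uint64_t toy_hash(const void *data, size_t size)
{
   const unsigned char *p = data;
   uint64_t h = 0xcbf29ce484222325ULL;

   for (size_t i = 0; i < size; i++) {
      h ^= p[i];
      h *= 0x100000001b3ULL;
   }
   return h;
}

/* Returns how many knobs can be changed without changing the hash.
 * Zero means every listed knob feeds into the cache key. */
static size_t toy_count_ignored_knobs(void)
{
   struct toy_wm_key base;
   size_t ignored = 0;

   memset(&base, 0, sizeof(base));
   const uint64_t base_hash = toy_hash(&base, sizeof(base));

   for (size_t i = 0; i < TOY_NUM_KNOBS; i++) {
      struct toy_wm_key k = base;

      /* Flip the low bit of the first byte of this field. */
      ((unsigned char *)&k)[toy_knobs[i].offset] ^= 1;
      if (toy_hash(&k, sizeof(k)) == base_hash)
         ignored++;
   }
   return ignored;
}
```

Even with something like this, somebody still has to write a program
that is actually affected by each knob, which is where most of the
50~100 tests' cost would be.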

> I know I've run into cases numerous times where piglit doesn't
> actually test something, or only one seemingly unrelated test in all
> 50 thousand tickles a real bug in my code. I feel uncomfortable
> assuming that piglit's existing coverage is good enough.

This is a reasonable point. Piglit has coverage holes. The CTS has
coverage holes. dEQP has coverage holes. I think the union of all 3
probably still has coverage holes. (But hopefully a fairly small set.)

> In general, I'm still bothered by the idea of just running piglit
> twice. That's a very untargeted approach that doesn't fit well with
> our existing testing model, as Mark has described.

Are you saying that running all of our CI tests twice is a waste of
time? I think that the 'dumb run everything twice' plan actually gets
us much more coverage than we'll likely get if we try to write 50~100
tests focusing on the i965 program keys. So, I don't agree that it is
a waste of time.

I think that the 'dumb run everything twice' plan gets us almost
everything we need, and is more maintainable. If it misses something,
then I think that means we don't have good test coverage of the GL feature,
and we should add that test to one of our 3 possible GL test suites.
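
As a rough sketch of what the 'run everything twice' check amounts
to, assuming a toy suite and a toy cache (all toy_* names are
hypothetical, and toy_compile stands in for a real shader compile):
the first pass runs cold and fills the cache, the second pass runs
warm, and the two sets of results are compared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_NUM_TESTS 4

/* Hypothetical cache with one slot per test's shader. */
struct toy_shader_cache {
   int valid[TOY_NUM_TESTS];
   uint32_t binary[TOY_NUM_TESTS];
};

struct toy_run_stats {
   unsigned hits;
   unsigned misses;
};

/* Stand-in for compiling the shader used by a test. */
static uint32_t toy_compile(unsigned test_id)
{
   return 0x1000u + test_id * 7u;
}

/* Run every test once, compiling on a miss and reusing on a hit. */
static void toy_run_suite(struct toy_shader_cache *cache,
                          uint32_t results[TOY_NUM_TESTS],
                          struct toy_run_stats *stats)
{
   stats->hits = 0;
   stats->misses = 0;
   for (unsigned i = 0; i < TOY_NUM_TESTS; i++) {
      if (cache->valid[i]) {
         stats->hits++;
      } else {
         cache->binary[i] = toy_compile(i);
         cache->valid[i] = 1;
         stats->misses++;
      }
      results[i] = cache->binary[i];
   }
}

/* Returns 1 if the warm pass hit the cache for every test and
 * produced the same results as the cold pass. */
static int toy_run_twice_matches(void)
{
   struct toy_shader_cache cache;
   uint32_t cold[TOY_NUM_TESTS], warm[TOY_NUM_TESTS];
   struct toy_run_stats cold_stats, warm_stats;

   memset(&cache, 0, sizeof(cache));
   toy_run_suite(&cache, cold, &cold_stats);
   toy_run_suite(&cache, warm, &warm_stats);

   if (cold_stats.misses != TOY_NUM_TESTS ||
       warm_stats.hits != TOY_NUM_TESTS)
      return 0;

   for (unsigned i = 0; i < TOY_NUM_TESTS; i++) {
      if (cold[i] != warm[i])
         return 0;
   }
   return 1;
}
```

The point of the second pass is that any test which passes cold but
fails warm points at the cache, without anyone having written a
cache-specific test.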

If you are arguing that the 'dumb run everything twice' plan is
valuable but inadequate, fine. I concede that maybe we could try to
add more specific focused tests. I'm not sure that these tests will
actually be added anytime soon, which means that we don't have a plan
for getting the i965 shader cache enabled.

-Jordan