<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Tue, Nov 21, 2017 at 9:37 PM, Timothy Arceri <<a href="mailto:tarceri@itsqueeze.com" target="_blank">tarceri@itsqueeze.com</a>> wrote: <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On 21/11/17 03:18, Jason Ekstrand wrote:<div><div class="h5"> <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> On Mon, Nov 20, 2017 at 3:03 AM, Timothy Arceri <<a href="mailto:tarceri@itsqueeze.com" target="_blank">tarceri@itsqueeze.com</a> <mailto:<a href="mailto:tarceri@itsqueeze.com" target="_blank">tarceri@itsqueeze.com</a>>> wrote: On 29/10/17 05:36, Jason Ekstrand wrote: Instead of hashing each stage separately (and TES and TCS together), we hash the entire pipeline. This means we'll get fewer cache hits if they, for instance, re-use the same VS over and over again but it also means we can now safely do cross-stage optimizations. --- src/intel/vulkan/anv_pipeline.c | 151 +++++++++++++++++++++++++--------------- 1 file changed, 94 insertions(+), 57 deletions(-) diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c index 0ebc90b..e6c4955 100644 --- a/src/intel/vulkan/anv_pipeline.c +++ b/src/intel/vulkan/anv_pipeline.c @@ -367,35 +367,70 @@ struct anv_pipeline_stage { const VkSpecializationInfo *spec_info; union brw_any_prog_key key; + + struct { + gl_shader_stage stage; + unsigned char sha1[20]; + } cache_key; }; static void -anv_pipeline_hash_shader(struct anv_pipeline *pipeline, - struct anv_pipeline_stage *stage, - unsigned char *sha1_out) +anv_pipeline_hash_shader(struct mesa_sha1 *ctx, + struct anv_pipeline_stage *stage) { - struct mesa_sha1 ctx; + _mesa_sha1_update(ctx, &stage->stage, sizeof(stage->stage)); + _mesa_sha1_update(ctx, stage->module->sha1, sizeof(stage->module->sha1)); + _mesa_sha1_update(ctx, stage->entrypoint, strlen(stage->entrypoint)); + _mesa_sha1_update(ctx, &stage->stage, sizeof(stage->stage)); + if (stage->spec_info) { + _mesa_sha1_update(ctx, stage->spec_info->pMapEntries, + stage->spec_info->mapEntryCount * + sizeof(*stage->spec_info->pMapEntries)); + _mesa_sha1_update(ctx, stage->spec_info->pData, + stage->spec_info->dataSize); + } + _mesa_sha1_update(ctx, &stage->key, brw_prog_key_size(stage->stage)); +} +static void +anv_pipeline_hash_graphics(struct anv_pipeline *pipeline, + struct anv_pipeline_stage *stages, + unsigned char *sha1_out) +{ + struct mesa_sha1 ctx; _mesa_sha1_init(&ctx); - if (stage->stage != MESA_SHADER_COMPUTE) { - _mesa_sha1_update(&ctx, &pipeline->subpass->view_mask, - sizeof(pipeline->subpass->view_mask)); - } + + _mesa_sha1_update(&ctx, &pipeline->subpass->view_mask, + sizeof(pipeline->subpass->view_mask)); + if (pipeline->layout) { _mesa_sha1_update(&ctx, pipeline->layout->sha1, sizeof(pipeline->layout->sha1)); } - _mesa_sha1_update(&ctx, stage->module->sha1, sizeof(stage->module->sha1)); - _mesa_sha1_update(&ctx, stage->entrypoint, strlen(stage->entrypoint)); - _mesa_sha1_update(&ctx, &stage->stage, sizeof(stage->stage)); - if (stage->spec_info) { - _mesa_sha1_update(&ctx, stage->spec_info->pMapEntries, - stage->spec_info->mapEntryCount * - sizeof(*stage->spec_info->pMapEntries)); - _mesa_sha1_update(&ctx, stage->spec_info->pData, - stage->spec_info->dataSize); + + for (unsigned s = 0; s < MESA_SHADER_STAGES; s++) { + if (stages[s].entrypoint) + anv_pipeline_hash_shader(&ctx, &stages[s]); + } + + _mesa_sha1_final(&ctx, sha1_out); +} + +static void +anv_pipeline_hash_compute(struct anv_pipeline *pipeline, + struct anv_pipeline_stage *stage, + unsigned char *sha1_out) +{ + struct mesa_sha1 ctx; + _mesa_sha1_init(&ctx); + + if (pipeline->layout) { + _mesa_sha1_update(&ctx, pipeline->layout->sha1, + sizeof(pipeline->layout->sha1)); } - _mesa_sha1_update(&ctx, &stage->key, brw_prog_key_size(stage->stage)); + + anv_pipeline_hash_shader(&ctx, stage); + _mesa_sha1_final(&ctx, sha1_out); } @@ -515,12 +550,6 @@ anv_pipeline_compile_vs(struct anv_pipeline *pipeline, const struct brw_compiler *compiler = pipeline->device->instance->physicalDevice.compiler; struct anv_shader_bin *bin = NULL; - unsigned char sha1[20]; - - if (cache) { - anv_pipeline_hash_shader(pipeline, stage, sha1); - bin = anv_pipeline_cache_search(cache, sha1, 20); - } if (bin == NULL) { This is always true now so the if can be removed. Smae goes for the other stages. With that 5-12 are: True, and it does get removed by the end of the series. I just didn't remove it here because I didn't want to permute these functions any more than needed and blow up indentation. I was trying to be nice to the reviewers. :-) </blockquote> </div></div> Ok, no problem :) Once you push this can I temp you into reviewing this series [1]? It should be straight forward adding those passes to ANV also. [1] <a href="https://patchwork.freedesktop.org/series/34131/" rel="noreferrer" target="_blank">https://patchwork.freedesktop.org/series/34131/</a> </blockquote><div> </div><div>I'd like to but I can't guarantee when. Cross-stage optimization isn't exactly on the top of my priority list right now. I'm trying to rework a bunch of stuff so we can finally get this modifiers stuff landed. I do plan to take a look at it though.</div><div> </div><div>--Jason </div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"> Reviewed-by: Timothy Arceri <<a href="mailto:tarceri@itsqueeze.com" target="_blank">tarceri@itsqueeze.com</a> <mailto:<a href="mailto:tarceri@itsqueeze.com" target="_blank">tarceri@itsqueeze.com</a>>><div><div class="h5"> struct brw_vs_prog_data prog_data = {}; @@ -557,7 +586,9 @@ anv_pipeline_compile_vs(struct anv_pipeline *pipeline, } unsigned code_size = prog_data.base.base.program_size; - bin = anv_pipeline_upload_kernel(pipeline, cache, sha1, 20, + bin = anv_pipeline_upload_kernel(pipeline, cache, + &stage->cache_key, + sizeof(stage->cache_key), shader_code, code_size, &prog_data.base.base, sizeof(prog_data), &map); @@ -624,17 +655,6 @@ anv_pipeline_compile_tcs_tes(struct anv_pipeline *pipeline, pipeline->device->instance->physicalDevice.compiler; struct anv_shader_bin *tcs_bin = NULL; struct anv_shader_bin *tes_bin = NULL; - unsigned char tcs_sha1[40]; - unsigned char tes_sha1[40]; - - if (cache) { - anv_pipeline_hash_shader(pipeline, tcs_stage, tcs_sha1); - anv_pipeline_hash_shader(pipeline, tes_stage, tes_sha1); - memcpy(&tcs_sha1[20], tes_sha1, 20); - memcpy(&tes_sha1[20], tcs_sha1, 20); - tcs_bin = anv_pipeline_cache_search(cache, tcs_sha1, sizeof(tcs_sha1)); - tes_bin = anv_pipeline_cache_search(cache, tes_sha1, sizeof(tes_sha1)); - } if (tcs_bin == NULL || tes_bin == NULL) { struct brw_tcs_prog_data tcs_prog_data = {}; @@ -705,7 +725,8 @@ anv_pipeline_compile_tcs_tes(struct anv_pipeline *pipeline, unsigned code_size = tcs_prog_data.base.base.program_size; tcs_bin = anv_pipeline_upload_kernel(pipeline, cache, - tcs_sha1, sizeof(tcs_sha1), + &tcs_stage->cache_key, + sizeof(tcs_stage->cache_key), shader_code, code_size, &tcs_prog_data.base.base, sizeof(tcs_prog_data), @@ -726,7 +747,8 @@ anv_pipeline_compile_tcs_tes(struct anv_pipeline *pipeline, code_size = tes_prog_data.base.base.program_size; tes_bin = anv_pipeline_upload_kernel(pipeline, cache, - tes_sha1, sizeof(tes_sha1), + &tes_stage->cache_key, + sizeof(tes_stage->cache_key), shader_code, code_size, &tes_prog_data.base.base, sizeof(tes_prog_data), @@ -753,12 +775,6 @@ anv_pipeline_compile_gs(struct anv_pipeline *pipeline, const struct brw_compiler *compiler = pipeline->device->instance->physicalDevice.compiler; struct anv_shader_bin *bin = NULL; - unsigned char sha1[20]; - - if (cache) { - anv_pipeline_hash_shader(pipeline, stage, sha1); - bin = anv_pipeline_cache_search(cache, sha1, 20); - } if (bin == NULL) { struct brw_gs_prog_data prog_data = {}; @@ -796,7 +812,9 @@ anv_pipeline_compile_gs(struct anv_pipeline *pipeline, /* TODO: SIMD8 GS */ const unsigned code_size = prog_data.base.base.program_size; - bin = anv_pipeline_upload_kernel(pipeline, cache, sha1, 20, + bin = anv_pipeline_upload_kernel(pipeline, cache, + &stage->cache_key, + sizeof(stage->cache_key), shader_code, code_size, &prog_data.base.base, sizeof(prog_data), &map); @@ -821,7 +839,6 @@ anv_pipeline_compile_fs(struct anv_pipeline *pipeline, const struct brw_compiler *compiler = pipeline->device->instance->physicalDevice.compiler; struct anv_shader_bin *bin = NULL; - unsigned char sha1[20]; /* TODO: we could set this to 0 based on the information in nir_shader, but * we need this before we call spirv_to_nir. @@ -830,11 +847,6 @@ anv_pipeline_compile_fs(struct anv_pipeline *pipeline, &anv_pipeline_get_last_vue_prog_data(pipeline)->vue_map; stage->key.wm.input_slots_valid = vue_map->slots_valid; - if (cache) { - anv_pipeline_hash_shader(pipeline, stage, sha1); - bin = anv_pipeline_cache_search(cache, sha1, 20); - } - if (bin == NULL) { struct brw_wm_prog_data prog_data = {}; struct anv_pipeline_binding surface_to_descriptor[256]; @@ -916,7 +928,9 @@ anv_pipeline_compile_fs(struct anv_pipeline *pipeline, } unsigned code_size = prog_data.base.program_size; - bin = anv_pipeline_upload_kernel(pipeline, cache, sha1, 20, + bin = anv_pipeline_upload_kernel(pipeline, cache, + &stage->cache_key, + sizeof(stage->cache_key), shader_code, code_size, &prog_data.base, sizeof(prog_data), &map); @@ -957,7 +971,7 @@ anv_pipeline_compile_cs(struct anv_pipeline *pipeline, unsigned char sha1[20]; if (cache) { - anv_pipeline_hash_shader(pipeline, &stage, sha1); + anv_pipeline_hash_compute(pipeline, &stage, sha1); bin = anv_pipeline_cache_search(cache, sha1, 20); } @@ -1299,27 +1313,50 @@ anv_pipeline_init(struct anv_pipeline *pipeline, } } - if (stages[MESA_SHADER_VERTEX].entrypoint) { + if (cache) { + unsigned char sha1[20]; + anv_pipeline_hash_graphics(pipeline, stages, sha1); + + for (unsigned s = 0; s < MESA_SHADER_STAGES; s++) { + if (!stages[s].entrypoint) + continue; + + stages[s].cache_key.stage = s; + memcpy(stages[s].cache_key.sha1, sha1, sizeof(sha1)); + + struct anv_shader_bin *bin = + anv_pipeline_cache_search(cache, &stages[s].cache_key, + sizeof(stages[s].cache_key)); + if (bin) + anv_pipeline_add_compiled_stage(pipeline, s, bin); + } + } + + if (stages[MESA_SHADER_VERTEX].entrypoint && + !pipeline->shaders[MESA_SHADER_VERTEX]) { result = anv_pipeline_compile_vs(pipeline, cache, &stages[MESA_SHADER_VERTEX]); if (result != VK_SUCCESS) goto compile_fail; } - if (stages[MESA_SHADER_TESS_EVAL].entrypoint) { + if (stages[MESA_SHADER_TESS_EVAL].entrypoint && + !pipeline->shaders[MESA_SHADER_TESS_EVAL]) { anv_pipeline_compile_tcs_tes(pipeline, cache, &stages[MESA_SHADER_TESS_CTRL], &stages[MESA_SHADER_TESS_EVAL]); } - if (stages[MESA_SHADER_GEOMETRY].entrypoint) { + if (stages[MESA_SHADER_GEOMETRY].entrypoint && + !pipeline->shaders[MESA_SHADER_GEOMETRY]) { result = anv_pipeline_compile_gs(pipeline, cache, &stages[MESA_SHADER_GEOMETRY]); if (result != VK_SUCCESS) goto compile_fail; } - if (stages[MESA_SHADER_FRAGMENT].entrypoint) { + if (stages[MESA_SHADER_FRAGMENT].entrypoint && + !pipeline->shaders[MESA_SHADER_FRAGMENT]) { result = anv_pipeline_compile_fs(pipeline, cache, &stages[MESA_SHADER_FRAGMENT]); if (result != VK_SUCCESS) </div></div></blockquote> </blockquote></div> </div></div>