[Mesa-dev] [RFC] i965: Resolve color for all active shader images in intel_update_state().
Francisco Jerez
currojerez at riseup.net
Thu Oct 29 05:16:53 PDT 2015
Francisco Jerez <currojerez at riseup.net> writes:
> Chris Wilson <chris at chris-wilson.co.uk> writes:
>
>> On Sat, Oct 03, 2015 at 05:57:05PM +0300, Francisco Jerez wrote:
>>> Jordan Justen <jordan.l.justen at intel.com> writes:
>>>
>>> > From: Francisco Jerez <currojerez at riseup.net>
>>> >
>>> > Fixes arb_shader_image_load_store/execution/load-from-cleared-image.shader_test
>>> >
>>> > Cc: Chris Wilson <chris at chris-wilson.co.uk>
>>> > Cc: Jason Ekstrand <jason.ekstrand at intel.com>
>>> > Tested-by: Jordan Justen <jordan.l.justen at intel.com>
>>> > ---
>>> > RE: i965: Perform an explicit flush after doing _mesa_meta_pbo_TexSubImage
>>> >
>>> > curro has some concerns about potential perf impact by this and
>>> > wanted it to be checked on small-core w/CPU bound apps.
>>> > Unfortunately, he is on vacation now.
>>>
>>> I've benchmarked this on VLV and none of the CPU-bound tests in the
>>> Finnish benchmarking system regress significantly, with n=6 and 95%
>>> confidence level, so s/RFC/PATCH/. I'll CC mesa-stable so it probably
>>> makes sense to keep this independent from Chris' VBO resolve series.
>>
>> I ran this patch on bsw (and repeated it afresh just to be sure) using synmark:Ogl*:
>>
>> 6994ca2 glsl: fix whitespace
>> synmark:OglBatch3: 277.02 (+0.00%): min/p50/90/95/99/max/std = 271.902 / 277.024 / 277.952 / 278.117 / 278.233 / 278.409 / 1.27475 n=30
>> synmark:OglBatch3:cpu: 434.92 (+0.00%): min/p50/90/95/99/max/std = 429.869 / 434.755 / 437.818 / 438.236 / 439.022 / 439.251 / 2.19871 n=30
>> synmark:OglBatch4: 154.76 (+0.00%): min/p50/90/95/99/max/std = 153.394 / 154.75 / 155.636 / 155.643 / 156.089 / 156.172 / 0.721397 n=30
>> synmark:OglBatch4:cpu: 176.84 (+0.00%): min/p50/90/95/99/max/std = 176.239 / 176.838 / 177.068 / 177.394 / 177.451 / 177.46 / 0.273547 n=30
>> synmark:OglBatch5: 46.59 (+0.00%): min/p50/90/95/99/max/std = 45.9918 / 46.6053 / 46.819 / 46.842 / 46.8459 / 46.8538 / 0.26363 n=30
>> synmark:OglBatch5:cpu: 52.79 (+0.00%): min/p50/90/95/99/max/std = 52.1812 / 52.6714 / 53.3544 / 53.3605 / 53.3726 / 53.4059 / 0.402148 n=30
>> synmark:OglBatch6: 11.95 (+0.00%): min/p50/90/95/99/max/std = 11.7449 / 11.9523 / 12.0025 / 12.0026 / 12.0097 / 12.0304 / 0.0771611 n=30
>> synmark:OglBatch6:cpu: 14.17 (+0.00%): min/p50/90/95/99/max/std = 14.0292 / 14.169 / 14.1863 / 14.1889 / 14.1963 / 14.1999 / 0.0387371 n=30
>> synmark:OglBatch7: 3.04 (+0.00%): min/p50/90/95/99/max/std = 3.00939 / 3.03578 / 3.05513 / 3.05555 / 3.05582 / 3.05847 / 0.0148493 n=30
>> synmark:OglBatch7:cpu: 3.66 (+0.00%): min/p50/90/95/99/max/std = 3.63355 / 3.66219 / 3.66685 / 3.66706 / 3.66789 / 3.66943 / 0.00683119 n=30
>> Patched
>> synmark:OglBatch3: 276.06 (-0.35%): min/p50/90/95/99/max/std = 269.608 / 276.098 / 277.25 / 277.354 / 277.967 / 278.013 / 2.0933 n=30
>> synmark:OglBatch3:cpu: 415.49 (-4.47%): min/p50/90/95/99/max/std = 412.316 / 415.554 / 417.828 / 417.9 / 418.471 / 419.22 / 1.84629 n=30
>> synmark:OglBatch4: 144.26 (-6.78%): min/p50/90/95/99/max/std = 143.126 / 144.188 / 145.026 / 145.114 / 145.126 / 145.356 / 0.527859 n=30
>> synmark:OglBatch4:cpu: 161.82 (-8.49%): min/p50/90/95/99/max/std = 161.247 / 161.82 / 162.12 / 162.169 / 162.172 / 162.222 / 0.254633 n=30
>> synmark:OglBatch5: 42.44 (-8.91%): min/p50/90/95/99/max/std = 41.856 / 42.4856 / 42.7209 / 42.8424 / 42.8436 / 42.8441 / 0.287101 n=30
>> synmark:OglBatch5:cpu: 47.90 (-9.27%): min/p50/90/95/99/max/std = 47.4268 / 47.7758 / 48.4164 / 48.4775 / 48.5086 / 48.5284 / 0.341355 n=30
>> synmark:OglBatch6: 10.86 (-9.09%): min/p50/90/95/99/max/std = 10.7564 / 10.8818 / 10.9238 / 10.926 / 10.9279 / 10.9535 / 0.0619808 n=30
>> synmark:OglBatch6:cpu: 12.80 (-9.62%): min/p50/90/95/99/max/std = 12.7149 / 12.8064 / 12.8179 / 12.8228 / 12.8249 / 12.8255 / 0.0235037 n=30
>> synmark:OglBatch7: 2.76 (-9.02%): min/p50/90/95/99/max/std = 2.74078 / 2.7634 / 2.77936 / 2.78025 / 2.78254 / 2.7827 / 0.0126239 n=30
>> synmark:OglBatch7:cpu: 3.29 (-10.12%): min/p50/90/95/99/max/std = 3.26676 / 3.29127 / 3.29445 / 3.29464 / 3.29472 / 3.29616 / 0.0072781 n=30
>>
>>
>> 6994ca2 glsl: fix whitespace
>> synmark:OglBatch3: 276.90 (+0.00%): min/p50/90/95/99/max/std = 274.104 / 276.81 / 277.697 / 278.063 / 278.067 / 278.505 / 0.914328 n=30
>> synmark:OglBatch3:cpu: 434.96 (+0.00%): min/p50/90/95/99/max/std = 429.492 / 434.784 / 437.174 / 437.482 / 437.548 / 439.812 / 2.09205 n=30
>> synmark:OglBatch4: 154.06 (+0.00%): min/p50/90/95/99/max/std = 152.336 / 153.995 / 155.37 / 155.446 / 155.544 / 155.636 / 0.919322 n=30
>> synmark:OglBatch4:cpu: 176.45 (+0.00%): min/p50/90/95/99/max/std = 175.959 / 176.435 / 176.686 / 176.718 / 176.892 / 176.9 / 0.247188 n=30
>> synmark:OglBatch5: 45.88 (+0.00%): min/p50/90/95/99/max/std = 45.2706 / 45.7631 / 46.6576 / 46.6662 / 46.6929 / 46.7339 / 0.485474 n=30
>> synmark:OglBatch5:cpu: 52.65 (+0.00%): min/p50/90/95/99/max/std = 52.025 / 52.5863 / 53.0849 / 53.0974 / 53.1337 / 53.1717 / 0.306497 n=30
>> synmark:OglBatch6: 11.88 (+0.00%): min/p50/90/95/99/max/std = 11.7448 / 11.8725 / 11.9216 / 11.9337 / 11.935 / 11.9444 / 0.0534648 n=30
>> synmark:OglBatch6:cpu: 14.13 (+0.00%): min/p50/90/95/99/max/std = 14.0528 / 14.1313 / 14.1533 / 14.1584 / 14.1585 / 14.1721 / 0.0227182 n=30
>> synmark:OglBatch7: 3.02 (+0.00%): min/p50/90/95/99/max/std = 2.99852 / 3.02145 / 3.04142 / 3.04273 / 3.04422 / 3.04584 / 0.0141605 n=30
>> synmark:OglBatch7:cpu: 3.66 (+0.00%): min/p50/90/95/99/max/std = 3.65238 / 3.66158 / 3.66565 / 3.66662 / 3.66722 / 3.66726 / 0.00423393 n=30
>> Patched
>> synmark:OglBatch3: 275.73 (-0.42%): min/p50/90/95/99/max/std = 269.989 / 275.696 / 277.249 / 277.262 / 277.306 / 277.577 / 1.59707 n=30
>> synmark:OglBatch3:cpu: 412.83 (-5.09%): min/p50/90/95/99/max/std = 410.283 / 412.828 / 415.52 / 415.523 / 416.087 / 416.527 / 1.66123 n=30
>> synmark:OglBatch4: 144.29 (-6.34%): min/p50/90/95/99/max/std = 142.661 / 144.363 / 145.052 / 145.105 / 145.147 / 145.276 / 0.763492 n=30
>> synmark:OglBatch4:cpu: 161.53 (-8.45%): min/p50/90/95/99/max/std = 160.928 / 161.522 / 161.847 / 162.003 / 162.048 / 162.116 / 0.263058 n=30
>> synmark:OglBatch5: 41.75 (-9.01%): min/p50/90/95/99/max/std = 41.3497 / 41.7404 / 41.9044 / 41.9791 / 42.0902 / 42.1262 / 0.166136 n=30
>> synmark:OglBatch5:cpu: 47.91 (-9.02%): min/p50/90/95/99/max/std = 47.53 / 47.8143 / 48.3434 / 48.3721 / 48.4509 / 48.4631 / 0.292788 n=30
>> synmark:OglBatch6: 10.89 (-8.27%): min/p50/90/95/99/max/std = 10.7444 / 10.8922 / 10.9323 / 10.9384 / 10.9477 / 10.9482 / 0.0535149 n=30
>> synmark:OglBatch6:cpu: 12.78 (-9.56%): min/p50/90/95/99/max/std = 12.6762 / 12.7798 / 12.7961 / 12.7967 / 12.7981 / 12.8059 / 0.0355064 n=30
>> synmark:OglBatch7: 2.76 (-8.65%): min/p50/90/95/99/max/std = 2.74121 / 2.76015 / 2.77461 / 2.77479 / 2.77892 / 2.77992 / 0.0111074 n=30
>> synmark:OglBatch7:cpu: 3.29 (-10.07%): min/p50/90/95/99/max/std = 3.26976 / 3.29293 / 3.29758 / 3.29878 / 3.29917 / 3.29936 / 0.00586701 n=30
>>
>> Nothing else stood out from the noise. (The cpu variants are run with INTEL_NO_HW=1.)
>> -Chris
>
> I don't see anything like that on VLV even after going up to n=30. What
> compiler options did you use to build mesa? Does the attached change
> have any effect on the results? Do you see a comparable regression
> after moving the image resolves to the VBO hook you added in your other
> series?
>
> BTW please send any absolute BSW FPS results to me in private rather
> than to the public mailing list...
>
I've had the chance to test this on BSW and couldn't reproduce any
regression. Could you send me a branch and SHA hashes of the exact
revisions you tested, and the compiler options you used to build them?
>>
>> --
>> Chris Wilson, Intel Open Source Technology Centre
>
> diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
> index b6b8262..50788fd 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.c
> +++ b/src/mesa/drivers/dri/i965/brw_context.c
> @@ -191,17 +191,20 @@ intel_update_state(struct gl_context * ctx, GLuint new_state)
>
>     /* Resolve color for each active shader image. */
>     for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
> -      const struct gl_shader *shader = ctx->_Shader->CurrentProgram[i] ?
> -         ctx->_Shader->CurrentProgram[i]->_LinkedShaders[i] : NULL;
> +      const struct gl_shader_program *prog = ctx->_Shader->CurrentProgram[i];
>
> -      if (unlikely(shader && shader->NumImages)) {
> -         for (unsigned j = 0; j < shader->NumImages; j++) {
> -            struct gl_image_unit *u = &ctx->ImageUnits[shader->ImageUnits[j]];
> -            tex_obj = intel_texture_object(u->TexObj);
> +      if (unlikely(prog)) {
> +         const struct gl_shader *shader = prog->_LinkedShaders[i];
>
> -            if (tex_obj && tex_obj->mt) {
> -               intel_miptree_resolve_color(brw, tex_obj->mt);
> -               brw_render_cache_set_check_flush(brw, tex_obj->mt->bo);
> +         if (unlikely(shader && shader->NumImages)) {
> +            for (unsigned j = 0; j < shader->NumImages; j++) {
> +               struct gl_image_unit *u = &ctx->ImageUnits[shader->ImageUnits[j]];
> +               tex_obj = intel_texture_object(u->TexObj);
> +
> +               if (tex_obj && tex_obj->mt) {
> +                  intel_miptree_resolve_color(brw, tex_obj->mt);
> +                  brw_render_cache_set_check_flush(brw, tex_obj->mt->bo);
> +               }
>              }
>           }
>        }
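
The control flow of the change quoted above can be tried outside of the
driver with a small stand-alone sketch. Everything named sketch_* below is
a hypothetical stand-in and not a Mesa type: the real code walks
ctx->_Shader->CurrentProgram and calls intel_miptree_resolve_color(), while
this version only counts how many image units would be resolved. The
unlikely() fallback for non-GCC-compatible compilers is likewise an
assumption.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define unlikely(x) (x)
#endif

#define SKETCH_STAGES      6  /* hypothetical stand-in for MESA_SHADER_STAGES */
#define SKETCH_MAX_IMAGES  8  /* hypothetical per-shader image limit */
#define SKETCH_IMAGE_UNITS 8  /* hypothetical number of image units */

struct sketch_shader {
   unsigned num_images;
   unsigned image_units[SKETCH_MAX_IMAGES];
};

struct sketch_program {
   struct sketch_shader *linked[SKETCH_STAGES];
};

/* Count the image units that would need a color resolve.  The program
 * pointer is tested first, so stages without a bound program fall through
 * after a single NULL check.
 */
static unsigned
count_image_resolves(struct sketch_program *current[SKETCH_STAGES],
                     const int unit_has_miptree[SKETCH_IMAGE_UNITS])
{
   unsigned resolves = 0;

   for (unsigned i = 0; i < SKETCH_STAGES; i++) {
      const struct sketch_program *prog = current[i];

      if (unlikely(prog)) {
         const struct sketch_shader *shader = prog->linked[i];

         if (unlikely(shader && shader->num_images)) {
            for (unsigned j = 0; j < shader->num_images; j++) {
               const unsigned unit = shader->image_units[j];

               assert(unit < SKETCH_IMAGE_UNITS);
               if (unit_has_miptree[unit])
                  resolves++;
            }
         }
      }
   }

   return resolves;
}
```

With no programs bound the function returns 0 after one pointer test per
stage; with a single stage whose shader references two image units, of
which one has a backing miptree, it returns 1.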