[PATCH i-g-t 05/23] lib/rendercopy: Always setup clear color for TGL

Ville Syrjälä ville.syrjala at linux.intel.com
Thu Sep 12 11:34:21 UTC 2024


On Sat, Sep 07, 2024 at 11:26:52AM +0300, Juha-Pekka Heikkila wrote:
> On 6.9.2024 16.45, Ville Syrjälä wrote:
> > On Fri, Sep 06, 2024 at 04:09:19PM +0300, Juha-Pekka Heikkila wrote:
> >> On 2.9.2024 17.37, Ville Syrjala wrote:
> >>> From: Ville Syrjälä <ville.syrjala at linux.intel.com>
> >>>
> >>> TGL apparently automagically converts regular output to fast
> >>> clears when the output matches the configured clear color.
> >>> And if we don't enable the clear color packet at all then we
> >>> just get some rainbow gibberish on all black parts of the
> >>> output.
> >>>
> >>> To avoid always set up the clear color packet when using
> >>> using a non-clear color modifier. We'll just stick a bunch
> >>> of NaNs into the clear value so it'll never match any
> >>> legitimate output, and thus automagic fast clear should not
> >>> happen.
> >>>
> >>> TODO: Hide this better inside rendercopy_gen9.c without
> >>>         requring extra allocation in the FB BO
> >>> TODO: Figure out if other platforms need this sort stuff
> >>>
> >>> Signed-off-by: Ville Syrjälä <ville.syrjala at linux.intel.com>
> >>> ---
> >>>    lib/igt_fb.c          | 24 ++++++++++++++++++++++++
> >>>    lib/intel_bufops.h    |  1 +
> >>>    lib/rendercopy_gen9.c | 14 +++++++++++---
> >>>    3 files changed, 36 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/lib/igt_fb.c b/lib/igt_fb.c
> >>> index 21c56a454c5a..0eb3897d0f8b 100644
> >>> --- a/lib/igt_fb.c
> >>> +++ b/lib/igt_fb.c
> >>> @@ -965,6 +965,16 @@ void igt_calc_fb_size(struct igt_fb *fb)
> >>>    		size += calc_plane_size(fb, plane);
> >>>    	}
> >>>    
> >>> +	/*
> >>> +	 * We always need a clear color on TGL, make some extra
> >>> +	 * room for one it if it's not explicit in the modifier.
> >>> +	 *
> >>> +	 * TODO: probably better to allocate this as part of the
> >>> +	 * batch instead so the fb size doesn't need to change...
> >>> +	 */
> >>> +	if (fb->modifier == I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS)
> >>> +		size = ALIGN(size + 64, 64);
> >>> +
> >>>    	if (is_xe_device(fb->fd)) {
> >>>    		size = ALIGN(size, xe_get_default_alignment(fb->fd));
> >>>    		if (fb->modifier == I915_FORMAT_MOD_4_TILED_BMG_CCS)
> >>> @@ -2670,6 +2680,20 @@ igt_fb_create_intel_buf(int fd, struct buf_ops *bops,
> >>>    	if (fb->modifier == I915_FORMAT_MOD_4_TILED_DG2_RC_CCS_CC)
> >>>    		buf->cc.offset = fb->offsets[1];
> >>>    
> >>> +	/*
> >>> +	 * TGL appears to do automagic fast clear when rendering
> >>> +	 * black and the clear color isn't specified, or when the
> >>> +	 * output matches the specified clear color. Force a
> >>> +	 * non-sensical clear color to prevent it from doing this
> >>> +	 * when using a non-clear color modifier.
> >>> +	 *
> >>> +	 * TODO: figure out if other platforms are affected...
> >>> +	 */
> >>> +	if (fb->modifier == I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS) {
> >>
> >> What about adlp on this? I believe it will now get same treatment as tgl
> >> and ci doesn't seem to have adlp on shards so there's no good results(?)
> > 
> > I *think* adl wasn't affected, but I'll need to double check that...
> > 
> > Although apart from allocating a bit of extra memory this should
> > be completely harmeless on unaffected platforms.
> 
> ah yes, true. At the worst this will reveal something that was not 
> handled for adlp.

Mesa has a wa 1607794140 for this stuff which they limit to
TGL:A0-C0 and DG1.

Digging a bit more based on that there is apparently a new
bit in 3DSTATE_3DMODE ("Fast clear optimization (FCV)" (what the
"V" stands for I have no idea). However that new bit only seems to
exist on DG2+ only, but not for TGL/DG1/ADL.

I re-ran the tests:
- My TGL is definitely affected (can't remember what stepping it is).
  This fake clear color thing seems to cure it 100%.
- Didn't test DG1 since I don't have a suitable machine to plug it into
- ADL never seems to do this optimization, so the fake clear color
  would be redundant.
- for DG2 I can enable this via 3DSTATE_3DMODE, after which it somewhat
  matches the TGL behaviour. Though using the fake clear color trick
  doesn't 100% cure it on DG2 for some unknown reason. Looks mostly
  good but there are a few blocks of incorrect pixels sprinkled here
  and there.

I'll update the comment a bit when pushing this, but based on that
I think we should be OK to do this unconditonally for
I915_FORMAT_MOD_Y_TILED_GEN12_RC_CCS.

> 
> Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila at gmail.com>
> 
> > 
> >>
> >> /Juha-Pekka
> >>
> >>> +		buf->cc.disable = true;
> >>> +		buf->cc.offset = fb->size - 64;
> >>> +	}
> >>> +
> >>>    	return buf;
> >>>    }
> >>>    
> >>> diff --git a/lib/intel_bufops.h b/lib/intel_bufops.h
> >>> index 84e71d41a2c2..06e72ba4ba93 100644
> >>> --- a/lib/intel_bufops.h
> >>> +++ b/lib/intel_bufops.h
> >>> @@ -38,6 +38,7 @@ struct intel_buf {
> >>>    	} ccs[2];
> >>>    	struct {
> >>>    		uint32_t offset;
> >>> +		bool disable;
> >>>    	} cc;
> >>>    	struct {
> >>>    		uint64_t offset;
> >>> diff --git a/lib/rendercopy_gen9.c b/lib/rendercopy_gen9.c
> >>> index 5ee4c89f5cdb..f0757a8e6b78 100644
> >>> --- a/lib/rendercopy_gen9.c
> >>> +++ b/lib/rendercopy_gen9.c
> >>> @@ -1133,15 +1133,23 @@ void _gen9_render_op(struct intel_bb *ibb,
> >>>    
> >>>    	gen12_emit_aux_pgtable_state(ibb, aux_pgtable_state, true);
> >>>    
> >>> -	if (fast_clear) {
> >>> +	if (fast_clear || dst->cc.disable) {
> >>>    		for (int i = 0; i < 4; i++) {
> >>>    			intel_bb_out(ibb, MI_STORE_DWORD_IMM_GEN4);
> >>>    			intel_bb_emit_reloc(ibb, dst->handle,
> >>>    					    I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
> >>>                                                dst->cc.offset + i*sizeof(float),
> >>>    					    dst->addr.offset);
> >>> -			intel_bb_out(ibb, *(uint32_t*)&clear_color[i]);
> >>> -               }
> >>> +			if (fast_clear) {
> >>> +				intel_bb_out(ibb, *(uint32_t*)&clear_color[i]);
> >>> +			} else {
> >>> +				/*
> >>> +				 * Emit NaN so it'll match nothing and thus prevent
> >>> +				 * TGL from doing its automagic fast clear tricks.
> >>> +				 */
> >>> +				intel_bb_out(ibb, 0xffffffff);
> >>> +			}
> >>> +		}
> >>>           }
> >>>    
> >>>    
> > 

-- 
Ville Syrjälä
Intel


More information about the igt-dev mailing list