On 26 April 2012 16:40, Chad Versace <span dir="ltr"><<a href="mailto:chad.versace@linux.intel.com" target="_blank">chad.versace@linux.intel.com</a>></span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
A few comments below.<br>
<div><div class="h5"><br>
On 04/24/2012 12:23 PM, Paul Berry wrote:<br>
> When rendering to a miplevel other than 0 within a color, depth,<br>
> stencil, or HiZ buffer, we need to tell the GPU to render to an offset<br>
> within the buffer, so that the data is written into the correct<br>
> miplevel. We do this using a coarse offset (in pages), and a fine<br>
> adjustment (the so-called "tile_x" and "tile_y" values, which are<br>
> measured in pixels).<br>
><br>
> We have always computed the coarse offset and fine adjustment using<br>
> intel_renderbuffer_tile_offsets() function. This worked fine for<br>
> color and combined depth/stencil buffers, but failed to work properly<br>
> when HiZ and separate stencil were in use. It failed to work because<br>
> there is only one set of fine adjustment controls shared by the HiZ,<br>
> depth, and stencil buffers, so we need to choose tile_x and tile_y<br>
> values that are compatible with the tiling of all three buffers, and<br>
> then compute separate coarse offsets for each buffer.<br>
><br>
> This patch fixes the HiZ and separate stencil case by replacing the<br>
> call to intel_renderbuffer_tile_offsets() with calls to two functions:<br>
> intel_region_get_tile_masks(), which determines how much of the<br>
> adjustment can be performed using offsets and how much can be<br>
> performed using tile_x and tile_y, and<br>
> intel_region_get_aligned_offset(), which computes the coarse offset.<br>
><br>
> intel_region_get_tile_offsets() is still used for color renderbuffers,<br>
> so to avoid code duplication, I've re-worked it to use<br>
> intel_region_get_tile_masks() and intel_region_get_aligned_offset().<br>
><br>
> On i965 Gen6, fixes piglit tests<br>
> "texturing/depthstencil-render-miplevels 1024 X" where X is one of<br>
> (depth, depth_and_stencil, depth_stencil_single_binding, depth_x,<br>
> depth_x_and_stencil, stencil, stencil_and_depth, stencil_and_depth_x).<br>
><br>
> On i965 Gen7, the variants of<br>
> "texturing/depthstencil-render-miplevels" that contain a stencil<br>
> buffer still fail, due to another problem: Gen7 seems to ignore the 3<br>
> LSB's of the tile_y adjustment (and possibly also tile_x).<br>
> ---<br>
> src/mesa/drivers/dri/i965/brw_misc_state.c | 96 ++++++++++++++++++++++++--<br>
> src/mesa/drivers/dri/i965/gen6_hiz.c | 54 +++++++++++----<br>
> src/mesa/drivers/dri/i965/gen7_hiz.c | 54 +++++++++++----<br>
> src/mesa/drivers/dri/i965/gen7_misc_state.c | 79 ++++++++++++++++++++--<br>
> src/mesa/drivers/dri/intel/intel_fbo.c | 27 ++-----<br>
> src/mesa/drivers/dri/intel/intel_fbo.h | 28 ++++++++<br>
> src/mesa/drivers/dri/intel/intel_regions.c | 52 ++++++++++++++<br>
> src/mesa/drivers/dri/intel/intel_regions.h | 8 ++<br>
> 8 files changed, 338 insertions(+), 60 deletions(-)<br>
<br>
</div></div>[snip]<br>
<div><div class="h5"><br>
> diff --git a/src/mesa/drivers/dri/intel/intel_fbo.h b/src/mesa/drivers/dri/intel/intel_fbo.h<br>
> index 724f141..503b006 100644<br>
> --- a/src/mesa/drivers/dri/intel/intel_fbo.h<br>
> +++ b/src/mesa/drivers/dri/intel/intel_fbo.h<br>
> @@ -153,6 +153,34 @@ intel_flip_renderbuffers(struct gl_framebuffer *fb);<br>
> void<br>
> intel_renderbuffer_set_draw_offset(struct intel_renderbuffer *irb);<br>
><br>
> +void<br>
> +intel_renderbuffer_fine_offset_masks(struct intel_renderbuffer *irb,<br>
> + uint32_t *fine_offset_mask_x,<br>
> + uint32_t *fine_offset_mask_y);<br>
> +<br>
> +/**<br>
> + * When rendering to a texture with multiple miplevels, depth planes, or cube<br>
> + * faces, we need to instruct the GPU to render to an offset within the<br>
> + * texture image corresponding to the appropriate miplevel/plane/cubeface. We<br>
> + * do this by a combination of two techniques: by offsetting the base address<br>
> + * of the texture image, and by supplying additional X and Y coordinate<br>
> + * offsets to the GPU in the SURFACE_STATE structure.<br>
> + *<br>
> + * This function computes the additional Y coordinate offset.<br>
> + */<br>
> +<br>
> +/**<br>
> + * When rendering to a texture with multiple miplevels, depth planes, or cube<br>
> + * faces, we need to instruct the GPU to render to an offset within the<br>
> + * texture image corresponding to the appropriate miplevel/plane/cubeface. We<br>
> + * do this by a combination of two techniques: by offsetting the base address<br>
> + * of the texture image, and by supplying additional X and Y coordinate<br>
> + * offsets to the GPU in the SURFACE_STATE structure.<br>
> + *<br>
> + * This function computes the additional X coordinate offset.<br>
> + */<br>
> +<br>
> +<br>
> uint32_t<br>
> intel_renderbuffer_tile_offsets(struct intel_renderbuffer *irb,<br>
> uint32_t *tile_x,<br>
<br>
</div></div>Dangling comments above.<br></blockquote><div><br>Oh, yeah. Those were left over from an old draft of the code. Removed.<br> </div><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
[snip]<br>
<div class="im"><br>
> +/**<br>
> + * Compute the offset (in bytes) from the start of the region to the given x<br>
> + * and y coordinate. For tiled regions, caller must ensure that x and y are<br>
> + * multiples of the tile size.<br>
> + */<br>
<br>
</div>I think this function needs to assert the preconditions for x and y. It's<br>
just too easy for a caller to forget the precondition and hence introduce<br>
an easily avoided bug.<br></blockquote><div><br>Good point.<br> </div><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div class="im"><br>
> +uint32_t<br>
> +intel_region_get_aligned_offset(struct intel_region *region, uint32_t x,<br>
> + uint32_t y)<br>
> +{<br>
> + int cpp = region->cpp;<br>
> + uint32_t pitch = region->pitch * cpp;<br>
> +<br>
> + switch (region->tiling) {<br>
> + default:<br>
> + assert(false);<br>
> + case I915_TILING_NONE:<br>
> + return y * pitch + x * cpp;<br>
> + case I915_TILING_X:<br>
> + return y * pitch + x / (512 / cpp) * 4096;<br>
> + case I915_TILING_Y:<br>
> + return y * pitch + x / (128 / cpp) * 4096;<br>
> + }<br>
> +}<br>
<br>
</div>One more thing. It would be nice if the duplicate offset code in<br>
gen6_hiz.c and gen7_hiz.c were consolidated into a helper function...<br>
but, meh. I don't care that much.<br>
</blockquote></div><br>Yeah, good point. This is somewhat improved after all of the refactoring I did for MSAA. Tell you what, once the dust settles from that, let's have another look and see if it's still worth consolidating more of the code.<br>
<br>Updated patch to follow.<br>