On 11 July 2012 09:50, Chad Versace <span dir="ltr"><<a href="mailto:chad.versace@linux.intel.com" target="_blank">chad.versace@linux.intel.com</a>></span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="im">On 07/06/2012 03:29 PM, Paul Berry wrote:<br>
> When a buffer using Gen7's CMS MSAA layout is bound to a texture or a<br>
> render target, the SURFACE_STATE structure needs to point to the MCS<br>
> buffer and to indicate its pitch. This patch updates the functions<br>
> that emit SURFACE_STATE to handle CMS layout properly.<br>
> ---<br>
> src/mesa/drivers/dri/i965/brw_state.h | 5 ++<br>
> src/mesa/drivers/dri/i965/gen7_blorp.cpp | 4 ++<br>
> src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 45 +++++++++++++++++++++<br>
> 3 files changed, 54 insertions(+), 0 deletions(-)<br>
<br>
<br>
<br>
</div><div class="im">> void<br>
> +gen7_set_surface_mcs_info(struct brw_context *brw,<br>
> + struct gen7_surface_state *surf,<br>
> + uint32_t surf_offset,<br>
> + const struct intel_mipmap_tree *mcs_mt,<br>
> + bool is_render_target)<br>
> +{<br>
> + /* From the Ivy Bridge PRM, Vol4 Part1 p76, "MCS Base Address":<br>
> + *<br>
> + * "The MCS surface must be stored as Tile Y."<br>
> + */<br>
> + assert(mcs_mt->region->tiling == I915_TILING_Y);<br>
> +<br>
> + /* Compute the pitch in units of tiles. To do this we need to divide the<br>
> + * pitch in bytes by 128, since a single Y-tile is 128 bytes wide.<br>
> + */<br>
> + unsigned pitch_bytes = mcs_mt->region->pitch * mcs_mt->cpp;<br>
> + unsigned pitch_tiles = pitch_bytes / 128;<br>
<br>
</div>Here, I first thought "A rounddown error will occur if pitch_bytes is not<br>
128-aligned!" But I convinced myself that can't happen.<br>
<div class="im"><br>
<br>
<br>
> +<br>
> + /* The upper 20 bits of surface state DWORD 6 are the upper 20 bits of the<br>
> + * GPU address of the MCS buffer; the lower 12 bits contain other control<br>
> + * information. Since buffer addresses are always on 4k boundaries (and<br>
> + * thus have their lower 12 bits zero), we can use an ordinary reloc to do<br>
> + * the necessary address translation.<br>
> + */<br>
> + assert ((mcs_mt->region->bo->offset & 0xfff) == 0);<br>
> + surf->ss6.mcs_enabled.mcs_enable = 1;<br>
> + surf->ss6.mcs_enabled.mcs_surface_pitch = pitch_tiles - 1;<br>
> + surf->ss6.mcs_enabled.mcs_base_address = mcs_mt->region->bo->offset >> 12;<br>
> + drm_intel_bo_emit_reloc(brw-><a href="http://intel.batch.bo" target="_blank">intel.batch.bo</a>,<br>
> + surf_offset +<br>
> + offsetof(struct gen7_surface_state, ss6),<br>
> + mcs_mt->region->bo,<br>
> + surf->ss6.raw_data - mcs_mt->region->bo->offset,<br>
> + is_render_target ? I915_GEM_DOMAIN_RENDER<br>
> + : I915_GEM_DOMAIN_SAMPLER,<br>
> + is_render_target ? I915_GEM_DOMAIN_RENDER : 0);<br>
> +}<br>
<br>
</div>I'm confused. ss6.raw_data is not an address nor an offset; it's lower bits are<br>
non-addressy things. So I see a unit conflict in the target_offset arg,<br>
`surf->ss6.raw_data - mcs_mt->region->bo->offset`.<br></blockquote><div><br>Yeah, the hardware folks have made things difficult for us by squeezing both an address and control information into a single dword. They way they got away with it is that the address is known by both Mesa and the GPU to always lie on a page boundary; hence its lower 12 bits are always zero. Therefore, we replace the lower 12 bits with control information. So it looks as though our address is some sort of crazy offset within the MCS buffer, even though that "offset" is actually not referring to a memory location.<br>
<br>If the MCS buffer doesn't get relocated, then everything works fine. If the MCS buffer *does* get relocated, then we need to tell the kernel relocator to apply the same crazy offset to the new location of the MCS buffer. Fortunately the kernel always relocates things to page boundaries, so the relocated address of the MCS buffer will still have its lower 12 bits zero. That means that when the relocator applies the offset, the lower 12 bits will still contain the same control information they had before the relocation.<br>
<br>Would it be helpful to walk through an example?<br> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
I tried to reconstruct what you may have intended, and I arrived at<br>
`(surf->ss6.mcs_enabled.mcs_base_address << 12) - mcs_mt->region->bo->offset`.<br>
But that just evaluates to a 0 offset, and I became confused again as to why you<br>
would want to calculate 0 in such roundabout way.<br></blockquote><div><br>This value is the "crazy offset" I was referring to earlier; the one that secretly contains the control information. Would it help if I used the alternative formulation "surf->ss6.raw_data & 0xfff"? That might clarify what I'm trying to achieve, and why the value is not expected to be zero.<br>
</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
What's happening there?<br></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
----<br>
Chad Versace<br>
<a href="mailto:chad.versace@linux.intel.com">chad.versace@linux.intel.com</a><br>
</blockquote></div><br>