[Mesa-dev] [Mesa-stable] [PATCH] i965: Fix buffer overruns in MSAA MCS buffer clearing.

Tue Apr 15 13:16:43 PDT 2014

Courtney Goeltzenleuchter <courtney at lunarg.com> writes:

> On Tue, Apr 15, 2014 at 1:18 PM, Eric Anholt <eric at anholt.net> wrote:
>
>> Kenneth Graunke <kenneth at whitecape.org> writes:
>>
>> > On 04/14/2014 05:33 PM, Eric Anholt wrote:
>> >> This manifested as rendering failures or sometimes GPU hangs in
>> >> compositors when they accidentally got MSAA visuals due to a bug in the
>> X
>> >> Server.  Today we decided that the problem in compositors was equivalent
>> >> to a corruption bug we'd noticed recently in resizing MSAA-visual
>> >> glxgears, and debugging got a lot easier.
>> >>
>> >> When we allocate our MCS MT, libdrm takes the size we request, aligns it
>> >> to Y tile size (blowing it up from 300x300=900000 bytes to
>> 384*320=122880
>> >> bytes, 30 pages), then puts it into a power-of-two-sized BO (131072
>> bytes,
>> >> 32 pages).  Because it's Y tiled, we attach a 384-byte-stride fence to
>> it.
>> >> When we memset by the BO size in Mesa, between bytes 122880 and 131072
>> the
>> >> data gets stored to the first 20 or so scanlines of each of the 3 tiled
>> >> pages in that row, even though only 2 of those pages were allocated by
>> >> libdrm.
>> >
>> > What?
>> >
>> > I get that drm_intel_bo_alloc/drm_intel_bo_alloc_tiled might return a
>> > drm_intel_bo where bo->size is larger than what you asked for, due to
>> > the BO cache.  But...what you're saying is, it doesn't actually allocate
>> > enough pages to back the whole bo->size it gives you?  So, if you write
>> > bytes 0..(bo->size - 1), you'll randomly clobber memory in a way that's
>> > really difficult to detect?
>>
>> You have that many pages, really.  But you've attached a fence to it, so
>> your allocated pages are structured as:
>>
>> +---+---+---+
>> |   |   |   |
>> +---+---+---+
>> |   |   |   |
>> +---+---+---+
>> |   |   |   |
>> +---+---+---+
>> |   |   |
>> +---+---+
>>
>> (except taller in this specific example).  If you hit the pixels in
>> those quads, you'll be fine.
>>
>> >
>> > There are other places where we memset an entire BO using bo->size.  For
>> > example, your INTEL_DEBUG=shader_time code does exactly that (though it
>> > isn't tiled).
>> >
>> > Could we change libdrm to set bo->size to the actual usable size of the
>> > buffer, rather than the bucket size?
>>
>> The pages containing pixels you asked for go to 122880, and the BO is
>> 131072, but the pixels you asked for have a maximum linear address of
>> 384*320=115200.  Which value are you thinking is the "actual usable
>> size"?  We certainly shouldn't have been memsetting more pixels than
>> 115200.
>>
>
> Why not? I understand that it's not useful to touch pixels beyond 115200,
> but from the data structure, we were given 131072 bytes to work with. Why
> shouldn't memsetting the entire allocated space be a safe operation?

If you drm_intel_bo_map() and write 131072, you'd be fine.  If you
drm_intel_bo_map_gtt() on the tiled buffer and your fence readdresses
your writes beyond 131072, you're not fine.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 818 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/mesa-dev/attachments/20140415/d5c36708/attachment.sig>