[Mesa-dev] [PATCH] r600: fix FMASK allocation on r600/700

Marek Olšák maraeo at gmail.com
Wed Dec 10 12:45:38 PST 2014


It uses the libdrm surface allocator and FMASK is 2D tiled. Maybe the
rounding of bpp affects the pitch in a bad way.

Marek
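
For reference, the bpe arithmetic quoted below can be written out as a small C sketch. The helper names (ilog2_u32, next_pow2, fmask_bpe_bytes, fmask_pitch_align_factor) are hypothetical and are not Mesa or libdrm functions. The pitch factors are copied from Alex's description rather than derived, so treat them as an assumption about the hardware rule, not a verified formula.

```c
#include <assert.h>

/* Integer log2 for the power-of-two sample counts used here (2, 4, 8). */
static unsigned ilog2_u32(unsigned v)
{
    unsigned r = 0;

    assert(v != 0);
    while (v >>= 1)
        r++;
    return r;
}

/* Round up to the next power of two (returns v if it already is one). */
static unsigned next_pow2(unsigned v)
{
    unsigned p = 1;

    while (p < v)
        p <<= 1;
    return p;
}

/* Hypothetical helper: bytes per element handed to the allocator.
 * bits  = sample_count * log2(sample_count)
 * bytes = bits / 8, rounded up, then rounded up to a power of two. */
static unsigned fmask_bpe_bytes(unsigned sample_count)
{
    unsigned bits = sample_count * ilog2_u32(sample_count);
    unsigned bytes = (bits + 7) / 8;

    return next_pow2(bytes);
}

/* Hypothetical helper: pitch alignment multiplier as stated in the
 * thread (16 for 2-sample, 4 for 4-sample or 8-sample).
 * Returns 0 for unsupported sample counts. */
static unsigned fmask_pitch_align_factor(unsigned sample_count)
{
    switch (sample_count) {
    case 2:
        return 16;
    case 4:
    case 8:
        return 4;
    default:
        return 0;
    }
}
```

With these definitions, fmask_bpe_bytes() gives 1, 1 and 4 for 2x, 4x and 8x MSAA, which matches the values in the original switch statement.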

On Wed, Dec 10, 2014 at 4:06 PM, Alex Deucher <alexdeucher at gmail.com> wrote:
> On Wed, Dec 10, 2014 at 5:40 AM, Marek Olšák <maraeo at gmail.com> wrote:
>> bpe is in bytes, not bits, so you're overallocating by a factor of 8.
>> No wonder it works.
>>
>> As far as I can see, FMASK allocation is the same on R600 and
>> Evergreen. Since the allocator accepts bpe in bytes, I multiplied
>> nr_samples with bpp and divided by 8.
>>
>> 2xMSAA:
>> bpp = 2 bits
>> nr_samples = log2(bpp) = 1
>> final bpe = (2*1)/8 rounded up = 1
>>
>> 4xMSAA:
>> bpp = 4 bits
>> nr_samples = log2(bpp) = 2
>> final bpe = (4*2)/8 = 1
>>
>> 8xMSAA:
>> bpp = 8 bits
>> nr_samples = log2(bpp) = 3
>> final bpe = (8*3)/8 rounded up to next pow2 = 4
>>
>> So you can see the code was correct. It's the tiling or alignment that
>> must have changed.
>
> I haven't looked at the code recently, but are we aligning pitch to
> groupsize (256 bytes)?
>
> The pitch padding alignment must be multiplied by the number of 8x8
> tiles required to exactly fill the GroupSize.  The pitch alignment is
> increased by a factor of 16 for 2-sample and 4 for 4-sample or
> 8-sample.
>
> Alex
>
>>
>> Marek
>>
>>
>> On Wed, Dec 10, 2014 at 3:11 AM, Dave Airlie <airlied at gmail.com> wrote:
>>> From: Dave Airlie <airlied at redhat.com>
>>>
>>> According to NDA docs:
>>> FMASK surfaces are addressed identically to a surface with num_samples bits per element, and
>>> log2(num_samples) samples. For example, an FMASK for an 8-sample surface would be addressed
>>> identically to a color surface with 8 bits per element and 3 samples.
>>>
>>> Separate the r600 fmask allocation out, and work around
>>> a bug in the libdrm surface allocation which blocks a
>>> 3-sample surface by rounding it up to 4.
>>>
>>> This fixes hangs with ext_framebuffer_multisample-clip-and-scissor-blit 8 msaa
>>> and destination clipping on my rv635.
>>>
>>> Signed-off-by: Dave Airlie <airlied at redhat.com>
>>> ---
>>>  src/gallium/drivers/radeon/r600_texture.c | 49 +++++++++++++++++++------------
>>>  1 file changed, 30 insertions(+), 19 deletions(-)
>>>
>>> diff --git a/src/gallium/drivers/radeon/r600_texture.c b/src/gallium/drivers/radeon/r600_texture.c
>>> index fdf4d76..3ca460a 100644
>>> --- a/src/gallium/drivers/radeon/r600_texture.c
>>> +++ b/src/gallium/drivers/radeon/r600_texture.c
>>> @@ -299,27 +299,38 @@ void r600_texture_get_fmask_info(struct r600_common_screen *rscreen,
>>>                 fmask.flags |= RADEON_SURF_HAS_TILE_MODE_INDEX;
>>>         }
>>>
>>> -       switch (nr_samples) {
>>> -       case 2:
>>> -       case 4:
>>> -               fmask.bpe = 1;
>>> -               if (rscreen->chip_class <= CAYMAN) {
>>> -                       fmask.bankh = 4;
>>> +       if (rscreen->chip_class <= R700) {
>>> +               /*
>>> +                * R600/700 -
>>> +                * FMASK surfaces are addressed identically to a surface with num_samples
>>> +                * bits per element, and log2(num_samples) samples.
>>> +                */
>>> +               if (nr_samples != 2 && nr_samples != 4 && nr_samples != 8) {
>>> +                       R600_ERR("Invalid sample count for FMASK allocation.\n");
>>> +                       return;
>>>                 }
>>> -               break;
>>> -       case 8:
>>> -               fmask.bpe = 4;
>>> -               break;
>>> -       default:
>>> -               R600_ERR("Invalid sample count for FMASK allocation.\n");
>>> -               return;
>>> -       }
>>> +               fmask.bpe = nr_samples;
>>> +               fmask.nsamples = log2(nr_samples);
>>> +               /* surface allocator won't do 3 samples */
>>> +               if (fmask.nsamples == 3)
>>> +                       fmask.nsamples = 4;
>>>
>>> -       /* Overallocate FMASK on R600-R700 to fix colorbuffer corruption.
>>> -        * This can be fixed by writing a separate FMASK allocator specifically
>>> -        * for R600-R700 asics. */
>>> -       if (rscreen->chip_class <= R700) {
>>> -               fmask.bpe *= 2;
>>> +       } else {
>>> +               switch (nr_samples) {
>>> +               case 2:
>>> +               case 4:
>>> +                       fmask.bpe = 1;
>>> +                       if (rscreen->chip_class <= CAYMAN) {
>>> +                               fmask.bankh = 4;
>>> +                       }
>>> +                       break;
>>> +               case 8:
>>> +                       fmask.bpe = 4;
>>> +                       break;
>>> +               default:
>>> +                       R600_ERR("Invalid sample count for FMASK allocation.\n");
>>> +                       return;
>>> +               }
>>>         }
>>>
>>>         if (rscreen->ws->surface_init(rscreen->ws, &fmask)) {
>>> --
>>> 2.1.0
>>>
>>> _______________________________________________
>>> mesa-dev mailing list
>>> mesa-dev at lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
