SDMA out-of-bounds write access of tiled surface (was: Re: [amd-gfx] AMD Carrizo - GPU fault detected: 146 0x0842b714)

Alex Deucher alexdeucher at gmail.com
Wed Jun 22 14:46:04 UTC 2016


On Wed, Jun 22, 2016 at 10:23 AM, Marek Olšák <maraeo at gmail.com> wrote:
> On Wed, Jun 22, 2016 at 3:33 PM, Alex Deucher <alexdeucher at gmail.com> wrote:
>> On Wed, Jun 22, 2016 at 8:21 AM, Marek Olšák <maraeo at gmail.com> wrote:
>>> I don't think so.
>>>
>>> The VM faults can only occur when accessing the linear texture, and
>>> the Mesa code should use the correct workarounds already.
>>>
>>> The tiled texture is just a collection of 1D tiles (8x8 pixels) and
>>> SDMA operates on those 1D tiles. It doesn't access memory outside the
>>> boundaries of the 1D tiles it's supposed to access. 2D tiling is just a
>>> different ordering of 1D tiles with greater alignment requirements.
>>> The 2D tile parameters such as bank_height and macro_tile_aspect only
>>> affect that ordering. 1D tiles are always the same regardless of the
>>> higher tile mode. Given that, I don't see how SDMA can behave
>>> differently here.
>>>
>>> There are 2 possible explanations for VM faults from tiled access:
>>> - The tile parameters passed to SDMA don't agree with the parameters
>>> determined by addrlib. (or there can be a bug in passing those between
>>> processes)
>>> - Unknown or undiscovered SDMA bug.
>>>
>>> Note that no docs describe the VM fault bug from linear access.
>>>
>>> If you both have Carrizo, you should get the same 2D tile parameters.
>>> If you don't, it's weird.
>>
>> The row size varies based on the memory configuration and the number
>> of banks populated.  It might be worth adjusting the row size in
>> gfx_v8_0_gpu_early_init() to see if that helps reproduce the issue.
>
> That's interesting. Note that all micro (1D) and macro tile parameters
> are the same on all Carrizos regardless of the memory configuration.
> That's determined by the tile mode arrays. Internal docs don't list
> any other tile mode configurations for Carrizo.

The row size affects the value programmed in the GB_ADDR_CONFIG and
related registers.
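
As a rough illustration of that dependency, here is a minimal sketch of
how a row size in KB could be folded into a GB_ADDR_CONFIG value. The
field position (a 2-bit field at bit 28), the 1/2/4 KB to 0/1/2 encoding,
the fallback for other sizes, and the helper names are all assumptions
made for the example, not taken from this thread; check the register
headers for the actual layout.

```c
#include <stdint.h>

/* Assumed position of the ROW_SIZE field within GB_ADDR_CONFIG
 * (illustrative only; verify against the real register headers). */
#define ROW_SIZE_SHIFT 28u
#define ROW_SIZE_MASK  (0x3u << ROW_SIZE_SHIFT)

/* Hypothetical helper: map a DRAM row size in KB to the 2-bit field
 * encoding. Assumed encoding: 1 KB -> 0, 2 KB -> 1, 4 KB -> 2.
 * Any other value falls back to 0 (also an assumption). */
static uint32_t row_size_field(unsigned int row_size_kb)
{
    switch (row_size_kb) {
    case 2:
        return 1;
    case 4:
        return 2;
    case 1:
    default:
        return 0;
    }
}

/* Hypothetical helper: clear the ROW_SIZE field in an existing
 * GB_ADDR_CONFIG value and write the encoding for row_size_kb,
 * leaving all other bits untouched. */
static uint32_t set_row_size(uint32_t gb_addr_config, unsigned int row_size_kb)
{
    gb_addr_config &= ~ROW_SIZE_MASK;
    gb_addr_config |= (row_size_field(row_size_kb) << ROW_SIZE_SHIFT) & ROW_SIZE_MASK;
    return gb_addr_config;
}
```

Under these assumptions, two boards with identical tile mode arrays but
different row sizes end up with different GB_ADDR_CONFIG values, which is
why overriding the row size is a cheap thing to try when attempting to
reproduce the fault.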

Alex

