[PATCH] drm/radeon: Fix screen corruption (v2)

Thu Dec 15 08:07:19 UTC 2022

Am 15.12.22 um 00:08 schrieb Robin Murphy:
> On 2022-12-14 22:02, Alex Deucher wrote:
>> On Wed, Dec 14, 2022 at 4:54 PM Robin Murphy <robin.murphy at arm.com> 
>> wrote:
>>>
>>> On 2022-12-12 02:08, Luben Tuikov wrote:
>>>> Fix screen corruption on older 32-bit systems using AGP chips.
>>>>
>>>> On older systems with little memory, for instance 1.5 GiB, using an 
>>>> AGP chip,
>>>> the device's DMA mask is 0xFFFFFFFF, but the memory mask is 
>>>> 0x7FFFFFFF, and
>>>> subsequently dma_addressing_limited() returns 0xFFFFFFFF < 0x7FFFFFFF,
>>>> false. As such the result of this static inline isn't suitable for 
>>>> the last
>>>> argument to ttm_device_init()--it simply needs to know whether to 
>>>> use GFP_DMA32
>>>> when allocating DMA buffers.
>>>
>>> This sounds wrong to me. If the issues happen on systems without PAE it
>>> clearly can't have anything to do with the actual DMA address size. Not to
>>> mention that AFAICS 32-bit x86 doesn't even have ZONE_DMA32, so
>>> GFP_DMA32 would be functionally meaningless anyway. Although the
>>> reported symptoms initially sounded like they could be caused by DMA
>>> going to the wrong place, that is also equally consistent with a 
>>> loss of
>>> cache coherency.
>>>
>>> My (limited) understanding of AGP is that the GART can effectively 
>>> alias
>>> memory to a second physical address, so I could well believe that
>>> something somewhere in the driver stack needs to perform some cache
>>> maintenance to avoid coherency issues, and that in these particular
>>> setups whatever that is might be assuming the memory is direct-mapped
>>> and thus going wrong for highmem pages.
>>>
>>> So as I said before, I really think this is not about using 
>>> GFP_DMA32 at
>>> all, but about *not* using GFP_HIGHUSER.
>>
>> One of the wonderful features of AGP is that it has to be used with
>> uncached memory.  The aperture basically just provides a remapping of
>> physical pages into a linear aperture that you point the GPU at.  TTM
>> has to jump through quite a few hoops to get uncached memory in the
>> first place, so it's likely that that somehow isn't compatible with
>> HIGHMEM.  Can you get uncached HIGHMEM?
>
> I guess in principle yes, if you're careful not to use regular 
> kmap()/kmap_atomic(), and always use pgprot_noncached() for 
> userspace/vmalloc mappings, but clearly that leaves lots of scope for 
> slipping up.

In theory we should do exactly that in TTM, but we have very few users 
who actually still exercise that functionality.

>
> Working backwards from primitives like set_memory_uc(), I see various 
> paths in TTM where manipulating the caching state is skipped for 
> highmem pages, but I wouldn't even know where to start looking for 
> whether the right state is propagated to all the places where they 
> might eventually be mapped somewhere.

The tt object has the caching state for the pages and 
ttm_prot_from_caching() then uses pgprot_noncached() and co for the 
userspace/vmalloc mappings.
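
Roughly, and only as a userspace model rather than the actual kernel code, 
the idea looks like the sketch below. The three states mirror TTM's 
caching enum, but every name here (model_caching, MODEL_PROT_*, 
model_prot_from_caching(), model_can_update_linear_map()) is made up for 
illustration, and the protection bits are placeholders for a real pgprot_t. 
The second helper models the point raised above: a highmem page has no 
permanent kernel mapping, so there is nothing for a set_memory_uc()-style 
call to change, and correctness then depends on every later mapping 
picking up the right protection.

```c
#include <assert.h>
#include <stdbool.h>

/* Same three states as TTM's caching enum, redeclared for this model. */
enum model_caching { MODEL_UNCACHED, MODEL_WRITE_COMBINED, MODEL_CACHED };

/* Hypothetical protection bits standing in for a real pgprot_t. */
#define MODEL_PROT_NONCACHED    (1u << 0)
#define MODEL_PROT_WRITECOMBINE (1u << 1)

/*
 * Derive the protection for a new userspace/vmalloc mapping from the
 * caching state recorded for the pages. Cached pages keep the default.
 */
unsigned int model_prot_from_caching(enum model_caching caching,
				     unsigned int prot)
{
	switch (caching) {
	case MODEL_CACHED:
		return prot;
	case MODEL_WRITE_COMBINED:
		return prot | MODEL_PROT_WRITECOMBINE;
	case MODEL_UNCACHED:
		return prot | MODEL_PROT_NONCACHED;
	}
	assert(!"unknown caching state");
	return prot;
}

/*
 * Only lowmem pages have a permanent kernel (linear) mapping whose
 * attributes can be changed up front. For highmem pages the caching
 * state must instead be applied at every later mapping.
 */
bool model_can_update_linear_map(bool page_is_highmem)
{
	return !page_is_highmem;
}
```

In this model an uncached highmem page skips the linear-map update but 
still gets MODEL_PROT_NONCACHED whenever it is mapped, which is the 
property that has to hold in every mapping path.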

Regards,
Christian.

>
> Cheers,
> Robin.
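
For reference, the mask comparison from the original patch description 
can be reproduced with a small userspace model. This is a sketch under 
assumptions, not the kernel implementation: model_required_mask() and 
model_addressing_limited() are hypothetical names, the required mask is 
assumed to be the smallest all-ones mask covering the highest physical 
address, and 0x5FFFFFFF is assumed as the top address of a 1.5 GiB system.

```c
#include <stdbool.h>
#include <stdint.h>

/* Smallest all-ones mask that covers max_addr (assumed rounding rule). */
uint64_t model_required_mask(uint64_t max_addr)
{
	uint64_t m = max_addr;

	/* Smear the highest set bit into every lower bit position. */
	m |= m >> 1;
	m |= m >> 2;
	m |= m >> 4;
	m |= m >> 8;
	m |= m >> 16;
	m |= m >> 32;
	return m;
}

/* Minimum of two limits, where 0 means "no limit set". */
static uint64_t min_not_zero_u64(uint64_t a, uint64_t b)
{
	if (a == 0)
		return b;
	if (b == 0)
		return a;
	return a < b ? a : b;
}

/*
 * The device counts as "limited" only when its effective mask is smaller
 * than the mask needed to reach all of memory.
 */
bool model_addressing_limited(uint64_t dma_mask, uint64_t bus_dma_limit,
			      uint64_t max_addr)
{
	return min_not_zero_u64(dma_mask, bus_dma_limit) <
	       model_required_mask(max_addr);
}
```

With a 32-bit DMA mask and 1.5 GiB of memory the model evaluates 
0xFFFFFFFF < 0x7FFFFFFF, which is false, so the device is not reported as 
limited. That matches the behaviour described in the patch and is why the 
result says nothing about whether highmem pages are in play.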