Helping Wine use 64-bit Mesa OGL drivers for 32-bit Windows applications

Thu Oct 24 12:56:02 UTC 2024

I haven't tested it, but as far as I know that isn't correct.

As far as I know, you can map the same VMA at a different location even 
without MREMAP_DONTUNMAP. And yes, MREMAP_DONTUNMAP only works with 
private mappings, but that isn't needed here.

Give me a moment to test this.

Regards,
Christian.
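
A minimal sketch of the mremap() idea discussed in this thread, for 
illustration only. It assumes a Linux host and uses a MAP_SHARED anonymous 
mapping as a stand-in for a driver mapping; whether real GPU mappings accept 
this is exactly what the thread leaves open. The helper name alias_low() is 
hypothetical. Per the mremap(2) man page, MREMAP_FIXED must be combined with 
MREMAP_MAYMOVE, and old_size 0 is only accepted for shareable mappings. 
MAP_32BIT is x86-64 specific, so it is guarded here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: create a second mapping of the shared mapping at
 * `src` inside address space reserved low in the address space.
 * Returns the alias address, or MAP_FAILED on error. */
static void *alias_low(void *src, size_t length)
{
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;
#ifdef MAP_32BIT
    flags |= MAP_32BIT; /* x86-64 only: place the reservation below 2 GiB */
#endif
    /* Step 1: reserve a target range without making it accessible. */
    void *reserved = mmap(NULL, length, PROT_NONE, flags, -1, 0);
    if (reserved == MAP_FAILED)
        return MAP_FAILED;

    /* Step 2: old_size == 0 asks for a new mapping of the same pages;
     * the original mapping at `src` stays in place. MREMAP_FIXED replaces
     * the PROT_NONE reservation with the alias. */
    void *alias = mremap(src, 0, length,
                         MREMAP_MAYMOVE | MREMAP_FIXED, reserved);
    if (alias == MAP_FAILED)
        munmap(reserved, length); /* drop the unused reservation */
    return alias;
}
```

A caller would pass the address returned by the driver and later munmap() 
the alias when the buffer is unmapped.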

On 24.10.24 at 10:03, Derek Lesho wrote:
> In my last mail I responded to this approach all the way at the 
> bottom, so it probably got lost: mremap on Linux as it exists now 
> won't work as it only supports private anonymous mappings (in 
> conjunction with MREMAP_DONTUNMAP), which GPU mappings are not.
>
> On 10/24/24 at 01:06, James Jones wrote:
>> That makes sense. Reading the man page myself, it does seem like:
>>
>> - If the drivers can guarantee they set MAP_SHARED when creating their 
>>   initial mapping.
>>
>> - If WINE is fine rounding down to page boundaries to deal with 
>>   mappings of suballocations and either using some lookup structure to 
>>   avoid duplicate remappings (probably needed to handle unmap anyway 
>>   per below) or just living with the perf cost and address space 
>>   overconsumption for duplicate remappings.
>>
>> - If mremap() preserves the cache attributes of the original mapping.
>>
>> Then no GL API change would be needed. WINE would just have to do an 
>> if (addrAbove4G) { mremapStuff() } on map and presumably add some 
>> tracking to perform an equivalent munmap() when unmapping. I assume 
>> WINE already has a bunch of vaddr tracking logic in use to manage the 
>> <4G address space as described elsewhere in the thread. That would be 
>> pretty ideal from a driver vendor perspective.
>>
>> Does that work?
>>
>> Thanks,
>> -James
>>
>> On 10/23/24 06:12, Christian König wrote:
>>> I haven't read through the whole mail thread, but if you manage the 
>>> address space using mmap() then you always run into this issue.
>>>
>>> If Wine manages the whole 4GiB address space itself then you never 
>>> run into this issue. You would just allocate some address range 
>>> internally and mremap() into that.
>>>
>>> Regards,
>>> Christian.
>>>
>>> On 22.10.24 at 19:32, James Jones wrote:
>>>> This sounds interesting, but does it come with the same "Only gets 
>>>> 2GB VA" downside Derek pointed out in the thread fork where he was 
>>>> responding to Michel?
>>>>
>>>> Thanks,
>>>> -James
>>>>
>>>> On 10/22/24 07:14, Christian König wrote:
>>>>> Hi guys,
>>>>>
>>>>> one theoretical alternative not mentioned in this thread is the 
>>>>> use of mremap().
>>>>>
>>>>> In other words, you reserve some address space below 2G by using 
>>>>> mmap(NULL, length, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS | 
>>>>> MAP_32BIT, -1, 0) and then use mremap(addr64bit, 0, length, 
>>>>> MREMAP_MAYMOVE | MREMAP_FIXED, reserved_addr).
>>>>>
>>>>> I haven't tested this but at least in theory it should give you a 
>>>>> duplicate of the 64bit mapping in the lower 2G of the address space.
>>>>>
>>>>> It is important that you pass 0 as old_size to mremap(), so that 
>>>>> the old mapping isn't unmapped and a new mapping of the existing 
>>>>> VMA is created instead.
>>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>
>>>>> On 18.10.24 at 23:55, Derek Lesho wrote:
>>>>>> Hey everyone 👋,
>>>>>>
>>>>>> I'm Derek from the Wine project, and wanted to start a discussion 
>>>>>> with y'all about potentially extending the Mesa OGL drivers to 
>>>>>> help us with a functionality gap we're facing.
>>>>>>
>>>>>> Problem Space:
>>>>>>
>>>>>> In the last few years Wine's support for running 32-bit Windows 
>>>>>> apps in a 64-bit host environment (wow64) has almost reached 
>>>>>> feature completion, but there remains a pain point with OpenGL 
>>>>>> applications: namely, that Wine can't return a 64-bit GL 
>>>>>> implementation's buffer mappings to a 32-bit application when the 
>>>>>> address is outside of the 32-bit range.
>>>>>>
>>>>>> Currently, we have a workaround that will copy any changes to the 
>>>>>> mapping back to the host upon glUnmapBuffer, but this of course 
>>>>>> is slow when the implementation directly returns mapped memory, 
>>>>>> and doesn't work for GL_MAP_PERSISTENT_BIT, where directly mapped 
>>>>>> memory is required.
>>>>>>
>>>>>> A few years ago we also faced this problem with Vulkan, which 
>>>>>> was solved through the VK_EXT_map_memory_placed extension Faith 
>>>>>> drafted, allowing us to use our Wine-internal allocator to 
>>>>>> provide the pages the driver maps to. I'm now wondering if a GL 
>>>>>> equivalent would also be seen as feasible amongst the devs here.
>>>>>>
>>>>>> Proposed solution:
>>>>>>
>>>>>> As the GL backend handles host mapping in its own code, only 
>>>>>> giving suballocations from its mappings back to the App, the 
>>>>>> problem is a little bit less straightforward in comparison to 
>>>>>> our Vulkan solution: If we just allowed the application to set 
>>>>>> its own placed mapping when calling glMapBuffer, the driver might 
>>>>>> then have to handle moving buffers out of already mapped ranges, 
>>>>>> and would lose control over its own memory management schemes.
>>>>>>
>>>>>> Therefore, I propose a GL extension that allows the GL client to 
>>>>>> provide a mapping and unmapping callback to the implementation, 
>>>>>> to be used whenever the driver needs to perform such operations. 
>>>>>> This way the driver remains in full control of its memory 
>>>>>> management affairs, and the amount of work for an implementation 
>>>>>> as well as potential for bugs is kept minimal. I've written a 
>>>>>> draft implementation in Zink using map_memory_placed [1] and a 
>>>>>> corresponding Wine MR utilizing it [2], and would be curious to 
>>>>>> hear your thoughts. I don't have experience in the Mesa codebase, 
>>>>>> so I apologize if the branch is a tad messy.
>>>>>>
>>>>>> In theory, the only requirement the extension would place on 
>>>>>> drivers is that glMapBuffer always return a pointer from within a 
>>>>>> page allocated through the provided callbacks, so that it can be 
>>>>>> guaranteed to be positioned within the required address space. 
>>>>>> Wine would then use its existing workaround for other types of 
>>>>>> buffers, but as Mesa seems to often return directly mapped 
>>>>>> buffers in other cases too, Wine could also avoid the slowdown 
>>>>>> that comes with copying in those cases.
>>>>>>
>>>>>> Why not use Zink?:
>>>>>>
>>>>>> There's also a proposal to use a 32-bit PE build of Zink in Wine 
>>>>>> bypassing the need for an extension; I brought this to discussion 
>>>>>> in this Wine-Devel thread last week [3], which has some arguments 
>>>>>> against this approach.
>>>>>>
>>>>>>
>>>>>> If any of you have thoughts, concerns, or questions about this 
>>>>>> potential approach, please let me know, thanks!
>>>>>>
>>>>>> 1: https://gitlab.freedesktop.org/Guy1524/mesa/-/commits/placed_allocation
>>>>>>
>>>>>> 2: https://gitlab.winehq.org/wine/wine/-/merge_requests/6663
>>>>>>
>>>>>> 3: https://marc.info/?t=172883260300002&r=1&w=2
>>>>>>
>>>>>
>>>
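
The bookkeeping James describes above (rounding suballocation pointers down 
to page boundaries and using a lookup structure to avoid duplicate 
remappings) could look roughly like the sketch below. All names 
(remap_entry, remap_acquire, remap_release, needs_remap) are hypothetical 
and not taken from Wine or Mesa. It assumes a 64-bit host, a power-of-two 
page size, and for brevity keys each alias by a single page; real code 
would also track lengths and perform the actual mremap()/munmap() calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_REMAPS 64 /* arbitrary capacity for this sketch */

struct remap_entry {
    uint64_t src_page; /* page-aligned address of the driver mapping */
    uint64_t low_page; /* page-aligned address of the low alias */
    unsigned refs;     /* outstanding maps sharing this alias; 0 = free */
};

static struct remap_entry remap_table[MAX_REMAPS];

/* Round an address down to its page boundary (page_size is a power of 2). */
static uint64_t page_floor(uint64_t addr, uint64_t page_size)
{
    return addr & ~(page_size - 1);
}

/* True if any byte of [addr, addr + len) lies above the 32-bit range. */
static int needs_remap(uint64_t addr, uint64_t len)
{
    return addr + len - 1 > UINT32_MAX;
}

static struct remap_entry *remap_find(uint64_t src_page)
{
    for (size_t i = 0; i < MAX_REMAPS; i++)
        if (remap_table[i].refs && remap_table[i].src_page == src_page)
            return &remap_table[i];
    return NULL;
}

/* Record one map of `addr`. `low_page` is only used when no alias exists
 * yet for that page. Returns the translated address, or 0 if the table
 * is full. */
static uint64_t remap_acquire(uint64_t addr, uint64_t low_page,
                              uint64_t page_size)
{
    uint64_t src_page = page_floor(addr, page_size);
    struct remap_entry *e = remap_find(src_page);
    if (!e) {
        for (size_t i = 0; i < MAX_REMAPS && !e; i++)
            if (!remap_table[i].refs)
                e = &remap_table[i];
        if (!e)
            return 0;
        e->src_page = src_page;
        e->low_page = low_page;
    }
    e->refs++;
    return e->low_page + (addr - src_page); /* keep the in-page offset */
}

/* Drop one map of `addr`. Returns 1 if this was the last user (the caller
 * should munmap the alias), 0 if others remain, -1 if addr is untracked. */
static int remap_release(uint64_t addr, uint64_t page_size)
{
    struct remap_entry *e = remap_find(page_floor(addr, page_size));
    if (!e)
        return -1;
    return --e->refs == 0;
}
```

Two suballocations that share a page then share one alias, and the alias 
is only torn down when the last of them is unmapped.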