[Mesa-dev] [PATCH] mesa/marshal: add custom BufferData/BufferSubData marshalling

Thu Mar 23 13:07:08 UTC 2017

On 23/03/17 23:51, Nicolai Hähnle wrote:
> On 23.03.2017 13:41, Timothy Arceri wrote:
>>
>>
>> On 23/03/17 23:19, Nicolai Hähnle wrote:
>>> On 23.03.2017 13:06, Timothy Arceri wrote:
>>>>
>>>>
>>>> On 23/03/17 22:52, Nicolai Hähnle wrote:
>>>>> On 23.03.2017 07:32, Timothy Arceri wrote:
>>>>>> +void GLAPIENTRY
>>>>>> +_mesa_marshal_BufferData(GLenum target, GLsizeiptr size, const
>>>>>> GLvoid
>>>>>> * data,
>>>>>> +                         GLenum usage)
>>>>>> +{
>>>>>> +   GET_CURRENT_CONTEXT(ctx);
>>>>>> +   size_t cmd_size =
>>>>>> +      sizeof(struct marshal_cmd_BufferData) + (data ? size *2 : 0);
>>>>>
>>>>> Why the *2?
>>>>
>>>> That's a a typo I thought, I fixed it before sending.
>>>>
>>>>>
>>>>> Also, I didn't notice that before, but I'm concerned that this can
>>>>> break
>>>>> the alignment of subsequent marshal_cmd structs. It's probably best to
>>>>> force the alignment in _mesa_glthread_allocate_command, so that this
>>>>> problem cannot be missed later on in similar situations.
>>>>
>>>> You mean as a follow up patch right? We don't want to make a copy here.
>>>
>>> Well, better as a preparatory patch, so that the alignment problem is
>>> never introduced in the first place, but yes.
>>
>> This code (and I assume other similar code) is already generated
>> automatically by gl_marshal.py so there is no regression.
>
> I have a feeling we're talking past each other. At least I don't see how
> what you write relates to what I wrote :/
>
> The issue that concerns me is when size is not a multiple of 8 bytes (or
> perhaps sizeof(void *)? -- but there are commands with 8 byte-arguments
> even on 32-bit platforms).
>
> In that case, cmd_size % 8 != 0, and _mesa_glthread_allocate_command
> will happily advance glthread->batch->used to an amount that is not a
> multiple of 8. As a result, subsequent command structs will no longer be
> aligned.
>
> I suppose this isn't a noticeable issue on x86 (perhaps some memory
> accesses will be slower when crossing a cache line), but (1) sanitizers
> _will_ complain about this, and (2) it's likely to break other
> architectures.
>
> My suggested solution would be to round size up in
> _mesa_glthread_allocate_command, to catch this problem in a single
> place. Alternatively (e.g. if that degrades the assembly generated by
> the compiler), we could simply add an assert in
> _mesa_glthread_allocate_command and do the alignment in the custom-coded
> mesa_marshal_* functions that need it.
>

Right, I understand the problem, and your suggestion. :)

I'm just saying that the marshal generation code in master generates 
code with this problem already, so there is no way to create "a 
preparatory patch, so that the alignment problem is never introduced in 
the first place".

This patch simply skips that code path for AMD_pinned_memory. It's 
possible we could avoid this manual code and even keep this path 
threaded by tracking the use of EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD but 
for now this stops the failures.