[Mesa-dev] [PATCH] mesa/marshal: add custom BufferData/BufferSubData marshalling

Thu Mar 23 13:18:17 UTC 2017

On 23.03.2017 14:07, Timothy Arceri wrote:
>
>
> On 23/03/17 23:51, Nicolai Hähnle wrote:
>> On 23.03.2017 13:41, Timothy Arceri wrote:
>>>
>>>
>>> On 23/03/17 23:19, Nicolai Hähnle wrote:
>>>> On 23.03.2017 13:06, Timothy Arceri wrote:
>>>>>
>>>>>
>>>>> On 23/03/17 22:52, Nicolai Hähnle wrote:
>>>>>> On 23.03.2017 07:32, Timothy Arceri wrote:
>>>>>>> +void GLAPIENTRY
>>>>>>> +_mesa_marshal_BufferData(GLenum target, GLsizeiptr size, const
>>>>>>> GLvoid
>>>>>>> * data,
>>>>>>> +                         GLenum usage)
>>>>>>> +{
>>>>>>> +   GET_CURRENT_CONTEXT(ctx);
>>>>>>> +   size_t cmd_size =
>>>>>>> +      sizeof(struct marshal_cmd_BufferData) + (data ? size *2 : 0);
>>>>>>
>>>>>> Why the *2?
>>>>>
>>>>> That's a a typo I thought, I fixed it before sending.
>>>>>
>>>>>>
>>>>>> Also, I didn't notice that before, but I'm concerned that this can
>>>>>> break
>>>>>> the alignment of subsequent marshal_cmd structs. It's probably
>>>>>> best to
>>>>>> force the alignment in _mesa_glthread_allocate_command, so that this
>>>>>> problem cannot be missed later on in similar situations.
>>>>>
>>>>> You mean as a follow up patch right? We don't want to make a copy
>>>>> here.
>>>>
>>>> Well, better as a preparatory patch, so that the alignment problem is
>>>> never introduced in the first place, but yes.
>>>
>>> This code (and I assume other similar code) is already generated
>>> automatically by gl_marshal.py so there is no regression.
>>
>> I have a feeling we're talking past each other. At least I don't see how
>> what you write relates to what I wrote :/
>>
>> The issue that concerns me is when size is not a multiple of 8 bytes (or
>> perhaps sizeof(void *)? -- but there are commands with 8 byte-arguments
>> even on 32-bit platforms).
>>
>> In that case, cmd_size % 8 != 0, and _mesa_glthread_allocate_command
>> will happily advance glthread->batch->used to an amount that is not a
>> multiple of 8. As a result, subsequent command structs will no longer be
>> aligned.
>>
>> I suppose this isn't a noticeable issue on x86 (perhaps some memory
>> accesses will be slower when crossing a cache line), but (1) sanitizers
>> _will_ complain about this, and (2) it's likely to break other
>> architectures.
>>
>> My suggested solution would be to round size up in
>> _mesa_glthread_allocate_command, to catch this problem in a single
>> place. Alternatively (e.g. if that degrades the assembly generated by
>> the compiler), we could simply add an assert in
>> _mesa_glthread_allocate_command and do the alignment in the custom-coded
>> mesa_marshal_* functions that need it.
>>
>
> Right, I understand the problem, and your suggestion. :)
>
> I'm just saying that the marshal generation code in master generates
> code with this problem already, so there is no way to create "a
> preparatory patch, so that the alignment problem is never introduced in
> the first place".

Oh, it does? Hmm, because some of the structs only have a 4-byte 
alignment? I guess you're right then, the order of patches doesn't matter.

Nicolai

>
> This patch simply skips that code path for AMD_pinned_memory. It's
> possible we could avoid this manual code and even keep this path
> threaded by tracking the use of EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD but
> for now this stops the failures.

-- 
Lerne, wie die Welt wirklich ist,
Aber vergiss niemals, wie sie sein sollte.