[Mesa-dev] build failure: _mesa_BindAttribLocation vs _mesa_lookup_shader_program_err , GLuint vs GLhandleARB

Jeremy Huddleston Sequoia jeremyhu at apple.com
Sat May 24 21:22:45 PDT 2014


On May 24, 2014, at 19:55, Emil Velikov <emil.l.velikov at gmail.com> wrote:

> On 25/05/14 02:12, Jeremy Huddleston Sequoia wrote:
>> I'm getting this build failure:
>> 
>> main/shader_query.cpp:49:7: error: no matching function for call to '_mesa_lookup_shader_program_err'
>>      _mesa_lookup_shader_program_err(ctx, program, "glBindAttribLocation");
>>      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> ../../src/mesa/main/shaderobj.h:81:1: note: candidate function not viable: cannot convert argument of incomplete type 'GLhandleARB'
>>      (aka 'void *') to 'GLuint' (aka 'unsigned int')
>> _mesa_lookup_shader_program_err(struct gl_context *ctx, GLuint name,
>> ^
>> 
>> _mesa_lookup_shader_program_err is defined as:
>> 
>> extern struct gl_shader_program *
>> _mesa_lookup_shader_program_err(struct gl_context *ctx, GLuint name,
>>                                const char *caller);
>> 
>> 
>> but it is being passed a GLhandleARB rather than an GLuint by _mesa_BindAttribLocation:
>> 
>> void GLAPIENTRY
>> _mesa_BindAttribLocation(GLhandleARB program, GLhandleARB index,
>>                            const GLcharARB *name)
>> {
>>   GET_CURRENT_CONTEXT(ctx);
>> 
>>   struct gl_shader_program *const shProg =
>>      _mesa_lookup_shader_program_err(ctx, program, "glBindAttribLocation");
>> ...
>> 
>> This seems like an old bug that is just now coming to light.  The main reason nobody else seems to have hit this in a while is that on most platforms, GLhandleARB and GLuint are both 'unsigned integer', so the compiler happily goes back and forth between the two, but on darwin, GLhandleARB is a void * and thus cannot be implicitly cast to an unsigned int:
>> 
>> #ifdef __APPLE__
>> typedef void *GLhandleARB;
>> #else
>> typedef unsigned int GLhandleARB;
>> #endif
>> 
>> How should we address this?  I'd really really really prefer to not have to put a bunch of casts everywhere, and I'm ok breaking binary compatibility on darwin in fixing this.
>> 
> Hi Jeremy,
> 
> IIRC there was another location where the above typedef gave us the finger.
> Not entirety sure what the conclusion on the topic was and I believe that some
> of the patches did not get accepted as they would break our current libGL <>
> DRI ABI. The discussion (starting with a few patches) is available in the ML
> archives [1].
> 
> -Emil
> 
> [1] http://lists.freedesktop.org/archives/mesa-dev/2014-March/055617.html

Thanks for the pointer.  +Brian and Ian from the March thread.

As I understand it, the only platforms where fixing this could break the DRI ABI are the ones where GLhandleARB and GLuint do not have the same underlying type.  The only platform where that is the case is darwin, which doesn't use that code (hence why I mentioned above that I wasn't concerned about fixing this breaking binary compatibility on darwin).  Can someone explain how chaning some GLuint types to GLhandleARB (or visa versa) could break ABI on other systems?  I just don't see why that would be the case.

Ian said:
> The problem is that drivers are built expecting that glCompileShader and
> glCompileShaerARB are the same function.  As a result, the driver only
> asks libGL the offset of one of those functions in the dispatch table,
> and it only sets one pointer in the dispatch table.  Then an application
> tries to call the "other" function, gets a NULL dispatch pointer, and
> explodes.

That doesn't seem right to me.  Why would the driver only set one entry?  As it knows (or at least assumes) that both are the same, it seems understandable that it would just ask libGL for one of the functions, but it should set both entries in its dispatch table to that value.  Having a NULL entry for one of those functions seems like an obvious bug at the driver level.  Is the application layer really responsible for knowing about what aliasing is being done at the driver level?  That's a rather big violation of the abstraction that I'd expect to be present.

Also, in the earlier thread, Ian said, "I can't understand why we'd break our own ABI because of something silly that Apple did.  This feels like madness." ... if I recall, the issue wasn't that Apple did "something silly," the issue was that GLhandleARB was underspecified and different vendors implemented it differently.  Apple is no more "at fault" for making it sized to a pointer (which is actually much more "safe" given ambiguity) than Mesa is "at fault" for fixing it at a 32bit unsigned integer.  The real issue here is that mesa is mixing GLhandleARB and GLuint when it shouldn't be and has made other design decisions which make fixing bugs like this difficult.

--Jeremy



More information about the mesa-dev mailing list