<div dir="ltr"><div>Before we start, I have a clarifying question: Is the binary format intended to be the same for 32- and 64-bit builds, or different? At least one of my comments below will depend on the answer to this question.<br> </div><div><div><br>On 24 October 2013 01:28, Tapani Pälli <span dir="ltr"><<a href="mailto:tapani.palli@intel.com" target="_blank">tapani.palli@intel.com</a>></span> wrote:<br><div class="gmail_extra"><div class="gmail_quote"> <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">+/* cache specific debug output */<br> +#ifdef SHADER_CACHE_DEBUG<br> +#define CACHE_DEBUG(fmt, args...) printf(fmt, ## args)<br> +#else<br> +#define CACHE_DEBUG(fmt, args...) do {} while (0)<br> +#endif<br></blockquote><div><br></div><div>This is the gcc-specific variadic argument syntax. To be compatible with MSVC, we have to use the C99 syntax, which is:<br><br>#ifdef SHADER_CACHE_DEBUG<br>#define CACHE_DEBUG(fmt, ...) printf(fmt, ## __VA_ARGS__)<br> #else<br>#define CACHE_DEBUG(fmt, ...) do {} while (0)<br>#endif<br><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> +<br> +/* C API for the cache */<br> +#ifdef __cplusplus<br> +#define _EXTC extern "C"<br> +#else<br> +#define _EXTC<br> +#endif<br> +<br> +_EXTC char *_mesa_shader_serialize(struct gl_shader *shader,<br> + struct _mesa_glsl_parse_state *state,<br> + const char *mesa_sha, size_t *size);<br> +<br> +_EXTC struct gl_shader *_mesa_shader_unserialize(void *mem_ctx,<br> + void *blob, const char *mesa_sha, size_t size);<br> +<br> +_EXTC char *_mesa_program_serialize(struct gl_shader_program *prog,<br> + size_t *size, const char *mesa_sha);<br> +<br> +_EXTC int _mesa_program_unserialize(struct gl_shader_program *prog,<br> + const GLvoid *blob, size_t size, const char *mesa_sha);<br></blockquote><div><br></div><div>This _EXTC stuff is nonstandard. What we typically do is to surround the block with<br><br>#ifdef __cplusplus<br>extern "C" {<br> #endif<br><br></div><div>and<br><br>#ifdef __cplusplus<br></div><div>} /* extern "C" */<br></div><div>#endif<br><br></div><div>See, for example, src/glsl/program.h.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> +<br> +#ifdef __cplusplus<br> +<br> +<br> +/* helper class for writing data to memory */<br></blockquote><div><br></div><div>Please use doxygen-formatted comments for documenting structs, classes, functions, and struct/class members, i.e.:<br><br>/**<br> * blah blah blah<br> */<br><br></div><div>Also, comments of the form "helper class for XXX" are generally hard to follow. We really need a comment explaining what the class does. For example, in this case, maybe it should be something like:<br> <br>/**<br> * This class maintains a dynamically-sized memory buffer and allows for data<br> * to be efficiently appended to it with automatic resizing.<br> */<br><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> +struct memory_writer<br></blockquote><div><br></div><div>Let's make this a class rather than a struct to clarify that it's C++ only.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> +{<br> +public:<br> + memory_writer() :<br> + memory(NULL),<br> + memory_p(NULL),<br> + curr_size(0),<br> + pos(0)<br> + { }<br> +<br> + /* NOTE - there is no dtor to free memory, user is responsible */<br> + void free_memory()<br> + {<br> + if (memory)<br> + free(memory);<br> + }<br></blockquote><div><br></div><div>I'm assuming the reason you did this rather than write a destructor was that you wanted the option of having the user of the class claim ownership of the memory by calling mem()?<br> <br>The safer way to do this is to go ahead and write the destructor, and change mem() to something like this:<br><br></div><div>inline char *release_memory(size_t *size)<br>{<br></div><div> char *result = memory;<br></div> <div> *size = pos;<br></div><div> memory = NULL;<br></div><div> memory_p = NULL;<br></div><div> curr_size = 0;<br></div><div> pos = 0;<br></div><div> return result;<br>}<br><br></div><div>That way there is no danger of someone forgetting to call free_memory() (or calling it when they shouldn't).<br> </div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> +<br> + /* realloc more memory */<br> + int grow(int32_t size)<br></blockquote><div><br></div><div>This function should be private.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> + {<br> + char *more_mem = (char *) realloc(memory, curr_size + size);<br></blockquote><div><br></div><div>Growing the allocation to just curr_size + size will give this class O(n^2) performance (since in general, every call to write() will require a call to grow(), which may need to memcpy all of the previously-written data). What you need to do is something like this (ignoring error handling):<br> <br></div><div>size_t new_size = 2 * (curr_size + size);<br></div><div>char *more_mem = (char *) realloc(memory, new_size);<br></div><div>memory = more_mem;<br></div><div>memory_p = memory + pos;<br></div><div>curr_size = new_size;<br> <br></div><div>By doubling the size on each call to grow() we guarantee that we won't have to call realloc() more than log(n) times, so the total runtime of all the realloc's will be bounded by O(n).<br><br></div> <div>Note: if we do this, then sometimes when we're done serializing we'll have allocated more memory than we need. To avoid wasting memory over the long term we should probably do a final realloc() down to the exact size needed. Assuming you take my advice about release_memory() above, an easy way to do that would be to do a final realloc() in release_memory(). Note that this final realloc() will be efficient because it's guaranteed not to need to memcpy the data.<br> </div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> + if (more_mem == NULL) {<br> + free(memory);<br> + memory = NULL;<br> + return -1;<br> + } else {<br> + memory = more_mem;<br> + memory_p = memory + pos;<br> + curr_size += size;<br> + return 0;<br> + }<br> + }<br></blockquote><div><br></div><div>Generally in Mesa we don't worry too much about memory allocation failures, since in an out of memory condition it's not really feasible to recover. I'd recommend eliminating the out-of-memory-handling code and changing the return type of grow() (and the write() functions) to void.<br> <br>If we do decide to keep the out-of-memory handling, then:<br>- return type of grow() and write() should be bool, and they should return true on success.<br></div><div>- we need to change all the code that calls write() to make sure that it properly checks the return value.<br> </div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> +<br> + /* write functions for different types */<br> +#define _WRITE_TYPE(type) inline int write(type *val) {\<br> + return write(val, sizeof(*val), 1, -1);\<br> +}<br> +<br> + _WRITE_TYPE(uint8_t)<br> + _WRITE_TYPE(int32_t)<br> + _WRITE_TYPE(uint32_t)<br> + _WRITE_TYPE(bool)<br> + _WRITE_TYPE(gl_texture_index)<br> + _WRITE_TYPE(ir_node_type)<br> + _WRITE_TYPE(ir_loop_jump::jump_mode)<br> + _WRITE_TYPE(ir_expression_operation)<br> + _WRITE_TYPE(GLbitfield64)<br></blockquote><div><br></div><div>Because of implicit type conversions, these definitions essentially allow any integer type to be passed to write(), and that increases the risk of mistakes. For example, I believe this would be allowed, in both 32 and 64 bit builds:<br> <br></div><div>size_t s;<br></div><div>write(s);<br><br></div><div>But in 32 bit builds it'll resolve to _WRITE_TYPE(uint32_t) and in 64 bit builds it'll resolve to _WRITE_TYPE(GLbitfield64).<br><br>I'd prefer if we just made the type explicit in each write function, i.e. write_uint8(), write_int32(), write_bool(), and so on. It's a little more typing at the call site, but it carries the advantage that we always know exactly what we're writing.<br> </div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> +<br> + int write(const void *data, int32_t size, int32_t nmemb,<br> + int32_t offset = -1)<br></blockquote><div><br></div><div>It looks like nearly all callers set either size or nmemb to 1. I think the code would be easier to follow if we just had a single size argument, and in the rare cases where we are serializing multiple objects whose size != 1, just do the multiply in the caller.<br> </div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> + {<br> + int32_t amount = size * nmemb;<br> + int32_t extra = 8192;<br> +<br> + /* reallocate more if does not fix to current memory */<br> + if (!memory || pos > (int32_t)(curr_size - amount))<br> + if (grow(amount + extra))<br> + return -1;<br></blockquote><div><br></div><div>For the case where offset != -1, this will do the wrong thing (it will try to reserve memory when it's not necessary to do so). It looks like in all cases where offset != -1, the necessary memory is guaranteed to be already reserved (because we are overwriting a value we wrote previously) so I'd recommend that in this case we simply do:<br> <br>assert(amount + extra <= pos);<br><br></div><div>Also, given how differently this function needs to behave when offset != -1, and given how infrequently it is called with an offset other than -1, I'd recommend breaking the offset != -1 case out to its own function (maybe call it overwrite()).<br> </div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> +<br> + /**<br> + * by default data is written to pos memory_p, however<br> + * if an offset value >= 0 is passed then that value is<br> + * used as offset from start of the memory blob<br> + */<br> + char *dst = memory_p;<br> +<br> + if (offset > -1)<br> + dst = memory + offset;<br> +<br> + memcpy(dst, data, amount);<br> +<br> + /* if no offset given, forward the pointer */<br> + if (offset == -1) {<br> + memory_p += amount;<br> + pos += amount;<br> + }<br> + return 0;<br> + }<br> +<br> + int write_string(const char *str)<br> + {<br> + if (!str)<br> + return -1;<br> + uint32_t len = strlen(str);<br> + write(&len);<br> + write(str, 1, len);<br> + return 0;<br> + }<br> +<br> + inline int32_t position() { return pos; }<br> + inline char *mem() { return memory; }<br> +<br> +private:<br> + char *memory;<br> + char *memory_p;<br> + int32_t curr_size;<br> + int32_t pos;<br></blockquote><div><br></div><div>Please add a short comment above each of these private data members explaining what it stores.<br><br></div><div>I'm not comfortable with storing both memory_p and pos in this class--they are redundant, so it means we have to always double check that we keep the two of them in sync. I'd suggest getting rid of memory_p, since that looks like it would require the smallest number of changes to the rest of the class.<br> </div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> +};<br> +<br> +<br> +/* helper for string serialization */<br></blockquote><div><br></div><div>Maybe this comment could be something like:<br><br>/**<br> * Wrapper for a C string that manages allocation and deallocation of the<br> * underlying memory.<br> */<br><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> +struct string_data<br></blockquote><div><br></div><div>This should also be a class.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> +{<br> +public:<br> + string_data() : len(0), data(NULL) {}<br> + string_data(const char *cstr) :<br> + data(NULL)<br> + {<br> + if (cstr) {<br></blockquote><div><br></div><div>It looks like we almost never expect to pass NULL to the string_data constructor (other than by mistake). I'd prefer to change things so that we never pass NULL to the string_data constructor, and get rid of this test, so that if we ever do pass NULL to string_data by mistake, we'll notice the bug as quickly as possible.<br> </div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> + len = strlen(cstr);<br> + data = _mesa_strdup(cstr);<br> + }<br> + }<br> +<br> + ~string_data() {<br> + if (data) {<br> + free(data);<br> + data = NULL;<br></blockquote><div><br></div><div>It's not necessary to set data to NULL here, since after the destructor finishes the class won't be accessible anymore.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> + }<br> + }<br> +<br> + int serialize(memory_writer &blob)<br></blockquote><div><br></div><div>As with the grow() and write() functions above, I think this should return void. However, if we do decide to give it a return value, it should be a bool, with true indicating success.<br> </div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> + {<br> + if (!data)<br> + return -1;<br></blockquote><div><br></div><div>We never expect to call serialize() on string_data whose data is NULL, right? If so, this should be an assertion:<br><br>assert(data != NULL);<br></div><div> </div> <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> +<br> + blob.write(&len);<br> + blob.write(data, 1, len);<br> + return 0;<br> + }<br> +<br> + void set(const char *cstr)<br> + {<br> + if (data)<br> + free(data);<br></blockquote><div><br></div><div>With one exception (which I have a suggestion about changing, see ir_cache_variable_data below), it looks like we only ever call set() in circumstances where data is NULL. I'd suggest changing this to:<br> <br></div><div>assert(data == NULL);<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> + data = _mesa_strdup(cstr);<br> + len = strlen(cstr);<br> + }<br> +<br> + uint32_t len;<br> + char *data;<br></blockquote><div><br></div><div>These two data members should be private. Currently their only users outside of the class are glsl_type_data::serialize() and the ir_cache_variable_data constructor, both of which can be easily changed not to access them. <br> </div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> +};<br> +<br> +<br> +/* data required to serialize glsl_type */<br> +struct glsl_type_data<br></blockquote><div><br></div><div>Can you help me understand why it's necessary to have a separate class for this, rather than simply give glsl_type a serialize() method? At the moment this seems like needless copying.<br> </div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> +{<br> +public:<br> + glsl_type_data() :<br> + name(NULL),<br> + element_type(NULL),<br> + field_names(NULL),<br> + field_types(NULL),<br> + field_major(NULL) {}<br> +<br> + glsl_type_data(const glsl_type *t) :<br> + base_type(t->base_type),<br> + length(t->length),<br> + vector_elms(t->vector_elements),<br> + matrix_cols(t->matrix_columns),<br> + sampler_dimensionality(t->sampler_dimensionality),<br> + sampler_shadow(t->sampler_shadow),<br> + sampler_array(t->sampler_array),<br> + sampler_type(t->sampler_type),<br> + interface_packing(t->interface_packing),<br> + element_type(NULL),<br> + field_names(NULL),<br> + field_types(NULL),<br> + field_major(NULL)<br></blockquote><div><br></div><div>These inline functions are starting to get large for my taste. Starting with this class, let's move them into the .cpp file. If we later find that it's necessary to inline some of them for performance reasons, we can always change them back to inline. But I doubt this will ever be a hot enough code path for that to be necessary.<br> </div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> + {<br> + name = new string_data(t->name);<br> +<br> + /* for array, save element type information */<br> + if (t->base_type == GLSL_TYPE_ARRAY)<br> + element_type =<br> + new glsl_type_data(t->element_type());<br> +<br> + /* with structs, copy each struct field name + type */<br> + else if (t->base_type == GLSL_TYPE_STRUCT) {<br></blockquote><div><br></div><div>Don't we need to handle interface block types as well? They have additional information in glsl_struct_field that's important (location, interpolation, and centroid).<br> </div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> +<br> + field_names = new string_data[t->length];<br> + field_types = new glsl_type_data*[t->length];<br> + field_major = new uint32_t[t->length];<br> +<br> + glsl_struct_field *field = t->fields.structure;<br> + glsl_type_data **field_t = field_types;<br> + for (unsigned k = 0; k < t->length; k++, field++, field_t++) {<br> + field_names[k].set(field->name);<br> + *field_t = new glsl_type_data(field->type);<br> + field_major[k] = field->row_major;<br> + }<br> + }<br> + }<br> +<br> + ~glsl_type_data() {<br> + delete name;<br> + delete element_type;<br> + delete [] field_names;<br> + delete [] field_major;<br> + if (field_types) {<br> + struct glsl_type_data **data = field_types;<br> + for (int k = 0; k < length; k++, data++)<br> + delete *data;<br> + delete [] field_types;<br> + }<br> + }<br> +<br> + int serialize(memory_writer &blob)<br> + {<br> + uint32_t ir_len = 666;<br> + blob.write(&name->len);<br> + blob.write(name->data, 1, name->len);<br></blockquote><div><br></div><div>These two lines can be replaced with:<br><br>name->serialize(blob);<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> +<br> + int32_t start_pos = blob.position();<br> + blob.write(&ir_len);<br></blockquote><div><br></div><div>It seeems like this pattern crops up a lot: write out a bogus length value, keep track of its position, and then later overwrite that with the true length once the rest of the data has been written. How about if we make a class that does all this work? I'm thinking something like:<br> <br></div><div>struct memory_writer<br>{<br> ...<br> class scoped_length<br> {<br> public:<br> explicit scoped_length(memory_writer *writer)<br> : writer(writer)<br> {<br> uint32_t dummy;<br> writer->write(&dummy);<br> start_pos = writer->position();<br> }<br><br> ~scoped_length()<br> {<br> uint32_t len = writer->position() - start_pos;<br> writer->write(&len, sizeof(len), 1, start_pos);<br> }<br><br> private:<br> memory_writer *writer;<br> uint32_t start_pos;<br> };<br> ...<br>};<br><br></div><div>And then any time you need to serialize the length of some data that follows you can just do:<br> <br>{<br></div><div> memory_writer::scoped_length len(&blob); /* reserves space for the length */<br></div><div> ... /* write out variable length data */<br></div><div>} /* when len goes out of scope, the proper length is automatically written */<br> </div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> +<br> + blob.write(this, sizeof(*this), 1);<br> +<br> + if (base_type == GLSL_TYPE_ARRAY)<br> + element_type->serialize(blob);<br> + else if (base_type == GLSL_TYPE_STRUCT) {<br></blockquote><div><br></div><div>As above, I'm worried that we may have to handle interface block types as well.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> + struct string_data *data = field_names;<br> + glsl_type_data **field_t = field_types;<br> + for (int k = 0; k < length; k++, data++, field_t++) {<br> + data->serialize(blob);<br> + (*field_t)->serialize(blob);<br> + blob.write(&field_major[k]);<br> + }<br> + }<br> +<br> + ir_len = blob.position() - start_pos - sizeof(ir_len);<br> + blob.write(&ir_len, sizeof(ir_len), 1, start_pos);<br> + return 0;<br> + }<br> +<br> + int32_t base_type;<br> + int32_t length;<br> + int32_t vector_elms;<br> + int32_t matrix_cols;<br> +<br> + uint32_t sampler_dimensionality;<br> + uint32_t sampler_shadow;<br> + uint32_t sampler_array;<br> + uint32_t sampler_type;<br> +<br> + uint32_t interface_packing;<br> +<br> + struct string_data *name;<br> +<br> + /* array element type */<br> + struct glsl_type_data *element_type;<br> +<br> + /* structure fields */<br> + struct string_data *field_names;<br> + struct glsl_type_data **field_types;<br> + uint32_t *field_major;<br> +};<br> +<br> +<br> +/* helper to create a unique id from a ir_variable address */<br> +static uint32_t _unique_id(ir_variable *var)<br> +{<br> + char buffer[256];<br> + _mesa_snprintf(buffer, 256, "%s_%p", var->name, var);<br> + return _mesa_str_checksum(buffer);<br></blockquote><div><br></div><div>Two problems:<br><br></div><div>1. This is a lot of work to obtain a unique identifier. The pointer is already unique; we should just use that.<br> </div><div><br>2. There's no guarantee that the checksum of the string will be unique. Because of the birthday paradox, checksum collisions are probably going to happen all the time.<br><br></div><div>Instead of going to a lot of work to generate a unique id for each ir_variable, I think we should just serialize the ir_variable pointer address. If we are trying to use the same binary format for 32-bit and 64-bit builds (see my clarifying question at the top of this email), we'll need to extend the pointer to 64 bits before serializing it.<br> <br></div><div>Alternatively, if you want to keep the unique ID's small, we'll have to use some hashtable mechanism to map each variable to its own ID.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> +}<br> +<br> +<br> +/* data required to serialize ir_variable */<br> +struct ir_cache_variable_data<br></blockquote><div><br></div><div>As with glsl_type_data, I don't understand why we need to have this class rather than just adding a serialize() method to the ir_variable class.<br></div> <div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> +{<br> +public:<br> + ir_cache_variable_data() :<br> + type(NULL),<br> + name(NULL),<br> + unique_name(NULL),<br> + state_slots(NULL) {}<br> +<br> + ir_cache_variable_data(ir_variable *ir) :<br> + unique_id(_unique_id(ir)),<br> + max_array_access(ir->max_array_access),<br> + ir_type(ir->ir_type),<br> + mode(ir->mode),<br> + location(ir->location),<br> + read_only(ir->read_only),<br> + centroid(ir->centroid),<br> + invariant(ir->invariant),<br> + interpolation(ir->interpolation),<br> + origin_upper_left(ir->origin_upper_left),<br> + pixel_center_integer(ir->pixel_center_integer),<br> + explicit_location(ir->explicit_location),<br> + explicit_index(ir->explicit_index),<br> + explicit_binding(ir->explicit_binding),<br> + has_initializer(ir->has_initializer),<br> + depth_layout(ir->depth_layout),<br> + location_frac(ir->location_frac),<br> + num_state_slots(ir->num_state_slots),<br> + has_constant_value(ir->constant_value ? 1 : 0),<br> + has_constant_initializer(ir->constant_initializer ? 1 : 0)<br> + {<br> + name = new string_data(ir->name);<br> +<br> + /* name can be NULL, see ir_print_visitor for explanation */<br> + if (!ir->name)<br> + name->set("parameter");<br></blockquote><div><br></div><div>To avoid having to call set() on a non-NULL string_data, I think we should change this to:<br><br></div><div>const char *non_null_name = ir->name ? ir->name : "parameter";<br> </div><div>name = new string_data(non_null_name);<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> +<br> + char uniq[256];<br> + _mesa_snprintf(uniq, 256, "%s_%d", name->data, unique_id);<br></blockquote><div><br></div>To avoid having to access an element of a string_data that ought to be private, I think we should change this to:<br> <div><br></div><div>_mesa_snprintf(uniq, 256, "%s_%d", non_null_name, unique_id);<br><br></div><div>(Note, however, that if we decide to use a pointer as the unique ID, this will have to change to ensure that on 64-bit builds, the entire pointer is included in the printf)<br> <br></div><div>Two additional problems:<br></div><div><br>1. If the name is too long, uniq won't get null terminated, and the call to string_data(uniq) below will go outside its bounds. Since the name comes from the client-supplied shader, this could conceivably lead to a security exploit.<br> <br></div><div>2. If the name is too long, unique_name won't be guaranteed to be unique.<br><br></div><div>I'd recommend just using ralloc_asprintf(), which allocates a string of the appropriate length.<br></div> <div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> + unique_name = new string_data(uniq);<br> + type = new glsl_type_data(ir->type);<br> +<br> + state_slots = (ir_state_slot *) malloc<br> + (ir->num_state_slots * sizeof(ir->state_slots[0]));<br> + memcpy(state_slots, ir->state_slots,<br> + ir->num_state_slots * sizeof(ir->state_slots[0]));<br> + }<br> +<br> + ~ir_cache_variable_data()<br> + {<br> + delete name;<br> + delete unique_name;<br> + delete type;<br> +<br> + if (state_slots)<br> + free(state_slots);<br> + }<br> +<br> + int serialize(memory_writer &blob)<br> + {<br> + type->serialize(blob);<br> + name->serialize(blob);<br> + unique_name->serialize(blob);<br> + blob.write(this, sizeof(*this), 1);<br> +<br> + for (unsigned i = 0; i < num_state_slots; i++) {<br> + blob.write(&state_slots[i].swizzle);<br> + for (unsigned j = 0; j < 5; j++) {<br> + blob.write(&state_slots[i].tokens[j]);<br> + }<br> + }<br> + return 0;<br> + }<br> +<br> + struct glsl_type_data *type;<br> + struct string_data *name;<br> + struct string_data *unique_name;<br> +<br> + uint32_t unique_id;<br> +<br> + uint32_t max_array_access;<br> + int32_t ir_type;<br> + int32_t mode;<br> + int32_t location;<br> + uint32_t read_only;<br> + uint32_t centroid;<br> + uint32_t invariant;<br> + uint32_t interpolation;<br> + uint32_t origin_upper_left;<br> + uint32_t pixel_center_integer;<br> + uint32_t explicit_location;<br> + uint32_t explicit_index;<br> + uint32_t explicit_binding;<br> + uint8_t has_initializer;<br> + int32_t depth_layout;<br> + uint32_t location_frac;<br> +<br> + uint32_t num_state_slots;<br> + struct ir_state_slot *state_slots;<br> +<br> + uint8_t has_constant_value;<br> + uint8_t has_constant_initializer;<br> +};<br> +<br> +<br> +/* helper class to read instructions */<br></blockquote><div><br></div><div>This comment should be expanded to explain that this class wraps around the posix functions for mapping a file into the process's address space.<br> </div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> +struct memory_map<br> +{<br> +public:<br> + memory_map() :<br> + fd(0),<br> + cache_size(0),<br> + cache_mmap(NULL),<br> + cache_mmap_p(NULL) { }<br> +<br> + /* read from disk */<br> + int map(const char *path)<br></blockquote><div><br></div><div>Return value should be a boolean, with true indicating success.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> + {<br> + struct stat stat_info;<br> + if (stat(path, &stat_info) != 0)<br> + return -1;<br> +<br> + cache_size = stat_info.st_size;<br> +<br> + fd = open(path, O_RDONLY);<br> + if (fd) {<br> + cache_mmap_p = cache_mmap = (char *)<br> + mmap(NULL, cache_size, PROT_READ, MAP_PRIVATE, fd, 0);<br></blockquote><div><br></div><div>Will this work on Windows?<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> + return (cache_mmap == MAP_FAILED) ? -1 : 0;<br> + }<br> + return -1;<br> + }<br> +<br> + /* read from memory */<br> + int map(const void *memory, size_t size)<br> + {<br> + cache_mmap_p = cache_mmap = (char *) memory;<br> + cache_size = size;<br> + return 0;<br> + }<br> +<br> + ~memory_map() {<br> + if (cache_mmap) {<br> + munmap(cache_mmap, cache_size);<br> + close(fd);<br> + }<br> + }<br> +<br> + /* move read pointer forward */<br> + inline void ffwd(int len)<br> + {<br> + cache_mmap_p += len;<br> + }<br> +<br> + /* position of read pointer */<br> + inline uint32_t position()<br> + {<br> + return cache_mmap_p - cache_mmap;<br> + }<br> +<br> + inline void read_string(char *str)<br> + {<br> + uint32_t len;<br> + read(&len);<br> + memcpy(str, cache_mmap_p, len);<br> + str[len] = '\0';<br> + ffwd(len);<br></blockquote><div><br></div><div>If the string is larger than the caller expects (or the contents of the shader binary are corrupted), this function may overwrite arbitrarily large areas of memory. Since the shader binary may be supplied by client code (in the case of OES_get_program_binary), this is a security risk.<br> <br></div><div>I can think of two possible ways of fixing this:<br><br></div><div>1. Change this function so that it copies the string to a newly allocated chunk of memory.<br><br></div><div>2. Change the format of the shader binary so that strings are stored as null-terminated (rather than length followed by string data); that way this function can simply return cache_mmap_p and fast forward to the next byte after the end of the string.<br> </div><div><br>I'd lean in favor of #2. <br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> + }<br> +<br> + /* read functions for different types */<br> +#define _READ_TYPE(type) inline void read(type *val) {\<br> + *val = *(type *) cache_mmap_p;\<br> + ffwd(sizeof(type));\<br> +}<br> + _READ_TYPE(int32_t)<br> + _READ_TYPE(uint32_t)<br> + _READ_TYPE(bool)<br> + _READ_TYPE(GLboolean)<br> + _READ_TYPE(gl_texture_index)<br> + _READ_TYPE(ir_expression_operation)<br> +<br> + inline void read(void *dst, size_t size)<br> + {<br> + memcpy(dst, cache_mmap_p, size);<br> + ffwd(size);<br> + }<br> +<br> + /* read and serialize a gl_shader */<br> + inline struct gl_shader *read_shader(void *mem_ctx,<br> + const char *mesa_sha, size_t size)<br> + {<br> + struct gl_shader *sha = _mesa_shader_unserialize(mem_ctx,<br> + cache_mmap_p, mesa_sha, size);<br> + ffwd(size);<br> + return sha;<br> + }<br></blockquote><div><br></div><div>It surprises me to see us fast-forward by a caller-supplied size here. Shouldn't we encode the size in the first 4 bytes of the shader binary?<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> +<br> +private:<br> +<br> + int32_t fd;<br> + int32_t cache_size;<br> + char *cache_mmap;<br> + char *cache_mmap_p;<br> +};<br> +<br> +<br> +/* class to read and write gl_shader */<br></blockquote><div><br></div><div>This is one of the cases where I need more information in the comment above the class definition. Is this a class that should be created once per context, and represents a proxy to an on-disk cache of shaders? Or is it a class that should be created once per load/save of a single shader, to aid in saving/loading the shader? I'm guessing it's the latter, because prototypes_only seems like the sort of thing that will change from one serialization/deserialization to the next.<br> <br>Perhaps a clearer name for this class would be something like "ir_serializer"?<br><br></div><div>Also, it looks like there's very little crossover between the code needed to serialize a shader and the code needed to deserialize a shader. Perhaps this should be split into two classes, ir_serializer and ir_deserializer.<br> </div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> +struct ir_cache<br> +{<br> +public:<br> + ir_cache(bool prototypes = false) :<br> + prototypes_only(prototypes)<br> + {<br> + var_ht = hash_table_ctor(0, hash_table_string_hash,<br> + hash_table_string_compare);<br> + }<br> +<br> + ~ir_cache()<br> + {<br> + hash_table_call_foreach(this->var_ht, delete_key, NULL);<br> + hash_table_dtor(this->var_ht);<br> + }<br> +<br> + /* serialize gl_shader to memory */<br> + char *serialize(struct gl_shader *shader,<br> + struct _mesa_glsl_parse_state *state,<br> + const char *mesa_sha, size_t *size);<br></blockquote><div><br></div><div>Style nit-pick: the arguments should be lined up in the column after the opening paren, like this:<br><br> char *serialize(struct gl_shader *shader,<br> struct _mesa_glsl_parse_state *state,<br> const char *mesa_sha, size_t *size);<br><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> +<br> + /* unserialize gl_shader from mapped memory */<br> + struct gl_shader *unserialize(void *mem_ctx, memory_map &map,<br> + uint32_t shader_size,<br> + struct _mesa_glsl_parse_state *state,<br> + const char *mesa_sha,<br> + int *error_code);<br> +<br> + enum cache_error {<br> + GENERAL_READ_ERROR = -1,<br> + DIFFERENT_MESA_SHA = -2,<br> + DIFFERENT_LANG_VER = -3,<br> + };<br> +<br> + /**<br> + * this method is public so that gl_shader_program<br> + * unserialization can use it when reading the<br> + * uniform storage<br> + */<br> + const glsl_type *read_glsl_type(memory_map &map,<br> + struct _mesa_glsl_parse_state *state);<br> +<br> +private:<br> +<br> + /* variables and methods required for serialization */<br> +<br> + memory_writer blob;<br> +<br> + bool prototypes_only;<br> +<br> + /**<br> + * writes ir_type and instruction dump size as a 'header'<br> + * for each instruction before calling save_ir<br> + */<br> + int save(ir_instruction *ir);<br> +<br> + int save_ir(ir_variable *ir);<br> + int save_ir(ir_assignment *ir);<br> + int save_ir(ir_call *ir);<br> + int save_ir(ir_constant *ir);<br> + int save_ir(ir_dereference_array *ir);<br> + int save_ir(ir_dereference_record *ir);<br> + int save_ir(ir_dereference_variable *ir);<br> + int save_ir(ir_discard *ir);<br> + int save_ir(ir_expression *ir);<br> + int save_ir(ir_function *ir);<br> + int save_ir(ir_function_signature *ir);<br> + int save_ir(ir_if *ir);<br> + int save_ir(ir_loop *ir);<br> + int save_ir(ir_loop_jump *ir);<br> + int save_ir(ir_return *ir);<br> + int save_ir(ir_swizzle *ir);<br> + int save_ir(ir_texture *ir);<br> + int save_ir(ir_emit_vertex *ir);<br> + int save_ir(ir_end_primitive *ir);<br> +<br> +<br> + /* variables and methods required for unserialization */<br> +<br> + struct _mesa_glsl_parse_state *state;<br> + void *mem_ctx;<br> +<br> + struct exec_list *top_level;<br> + struct exec_list *prototypes;<br> + struct exec_list *current_function;<br></blockquote><div><br></div><div>These private data members need to have documentation comments. In particular, since an exec_list can in principle store anything that derives from exec_node, the comments should explain what kind of data is stored in each list.<br> </div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> +<br> + int read_header(struct gl_shader *shader, memory_map &map,<br> + const char *mesa_sha);<br> + int read_prototypes(memory_map &map);<br> +<br> + int read_instruction(struct exec_list *list, memory_map &map,<br> + bool ignore = false);<br> +<br> + int read_ir_variable(struct exec_list *list, memory_map &map);<br> + int read_ir_assignment(struct exec_list *list, memory_map &map);<br> + int read_ir_function(struct exec_list *list, memory_map &map);<br> + int read_ir_if(struct exec_list *list, memory_map &map);<br> + int read_ir_return(struct exec_list *list, memory_map &map);<br> + int read_ir_call(struct exec_list *list, memory_map &map);<br> + int read_ir_discard(struct exec_list *list, memory_map &map);<br> + int read_ir_loop(struct exec_list *list, memory_map &map);<br> + int read_ir_loop_jump(struct exec_list *list, memory_map &map);<br> + int read_emit_vertex(struct exec_list *list, memory_map &map);<br> + int read_end_primitive(struct exec_list *list, memory_map &map);<br> +<br> + /* rvalue readers */<br> + ir_rvalue *read_ir_rvalue(memory_map &map);<br> + ir_constant *read_ir_constant(memory_map &map,<br> + struct exec_list *list = NULL);<br> + ir_swizzle *read_ir_swizzle(memory_map &map);<br> + ir_texture *read_ir_texture(memory_map &map);<br> + ir_expression *read_ir_expression(memory_map &map);<br> + ir_dereference_array *read_ir_dereference_array(memory_map &map);<br> + ir_dereference_record *read_ir_dereference_record(memory_map &map);<br> + ir_dereference_variable *read_ir_dereference_variable(memory_map &map);<br> +<br> + /**<br> + * var_ht is used to store created ir_variables with a unique_key for<br> + * each so that ir_dereference_variable creation can find the variable<br> + */<br> + struct hash_table *var_ht;<br> +<br> + /**<br> + * these 2 functions are ~copy-pasta from string_to_uint_map,<br></blockquote><div><br></div><div>I think you mean "copy-pasted" :)<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> + * we need a hash here with a string key and pointer value<br> + */<br> + void hash_store(void * value, const char *key)<br> + {<br> + char *dup_key = _mesa_strdup(key);<br> + bool result = hash_table_replace(this->var_ht, value, dup_key);<br> + if (result)<br> + free(dup_key);<br> + }<br> +<br> + static void delete_key(const void *key, void *data, void *closure)<br> + {<br> + (void) data;<br> + (void) closure;<br> + free((char *)key);<br> + }<br> +<br> +};<br> +#endif /* ifdef __cplusplus */<br> +<br> +#endif /* IR_CACHE_H */<br></blockquote><div><br></div><div>Unfortunately, I'm out of time to review this patch series today. I'm not sure when I'll be able to get back to it.<br></div></div></div></div></div> </div>