<html>
<head>
<meta http-equiv="Content-Type" content="text/html;
charset=ISO-8859-1">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 11/05/2013 10:16 PM, Paul Berry
wrote:<br>
</div>
<blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
type="cite">
<div dir="ltr">On 1 November 2013 02:16, Tapani Pälli <span
dir="ltr"><<a moz-do-not-send="true"
href="mailto:tapani.palli@intel.com" target="_blank">tapani.palli@intel.com</a>></span>
wrote:<br>
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">ir_deserializer can
create a gl_shader structure out of binary<br>
data from ir_serializer, will be used by
OES_get_program_binary<br>
implementation.<br>
<br>
Signed-off-by: Tapani Pälli <<a moz-do-not-send="true"
href="mailto:tapani.palli@intel.com" target="_blank">tapani.palli@intel.com</a>><br>
---<br>
src/glsl/Makefile.sources | 1 +<br>
src/glsl/ir_cache_deserializer.cpp | 1341
++++++++++++++++++++++++++++++++++++<br>
src/glsl/ir_cache_deserializer.h | 301 ++++++++<br>
3 files changed, 1643 insertions(+)<br>
create mode 100644 src/glsl/ir_cache_deserializer.cpp<br>
create mode 100644 src/glsl/ir_cache_deserializer.h<br>
</blockquote>
<div><br>
</div>
<div>General comment: my attitude about error handling is
different when reviewing this patch than it was when
reviewing the last patch, because in the case of
deserialization, we're dealing with data passed in by the
client, so I think it's far more important to be robust in
the case of malformed input data. We probably don't need
to worry about malicious clients, since the client is in
process so if they wanted to be malicious they could just
walk all over memory. But it will make life a lot easier
for library clients (and for us when we're debugging) if
we can avoid having Mesa die in a fire when it receives a
corrupted shader binary.<br>
<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
Yes, I agree, it makes sense to have some more error checking in
deserialization, it has to be basic and quick though.<br>
<br>
<blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>Having said that, I still don't think integer return
values are the way to go, for the following reasons:<br>
</div>
<div>- it's confusing for 0 to mean success.<br>
</div>
<div>- we're going to forget to check return values.<br>
</div>
<div>- it complicates the signatures of a lot of functions
that should be able to just return the thing that they
read.<br>
<br>
</div>
<div>What I'd recommend doing instead is adding a "failed"
boolean to memory_map. That way rather than having to
check every single return value for failure, we just have
to check the "failed" boolean at appropriate critical
times (such as inside loops and at the end of functions
like ir_read_variable()), and return early in the event of
failure. An additional advantage of this is that it makes
it easy to add bounds checking to memory_map--the read()
and ffwd() functions can simply check if the position goes
out of bounds, and if so, set "failed" flag to 0 and read
zeros.<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
ok, I'll take a look how this would work out. The current checks
have actually helped as they catch not just invalid/corrupt data
from client but possible errors in serialization too, even though
they are just simple value checks but there's quite a lot of them
and they help in stopping the parsing immediately. If allowed to go
further after error, it'll be harder to track when did it start to
fail so I think there will be still a lot of checks.<br>
<br>
<blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
+ir_deserializer::read_header(struct gl_shader *shader,
memory_map &map,<br>
+ const char *mesa_sha)<br>
</blockquote>
<div><br>
</div>
<div>Rather than pass the memory_map to all of the
ir_deserializer functions, I'd prefer to see it just be a
member of the ir_deserializer class. It's a closer
parallel to what you do with memory_writer in
ir_serializer, and it makes the code easier to read by
making all of the function signatures and invocations
smaller.<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
Yep, I think it can be changed, the actual memory_map will come from
caller of this class. I think read_glsl_type() is the only method
which will require support to pass a memory_map, this is used when
reading uniform_storage when deserializing a whole pgoram. <br>
<br>
<blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>As with the memory_writer::write*() functions, I think
the memory_map::read() functions should include the type
in their name, and should use return by value rather than
a pointer argument. E.g.:<br>
<br>
</div>
<div>int32_t read_int32();<br>
</div>
<div>bool read_bool();<br>
<br>
</div>
<div>and so on.<br>
<br>
</div>
<div>(Note: I think it's still ok to have a read() function
that operates on a pointer for reading larger structs).<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
Yes, will change these.<br>
<br>
<blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div><br>
</div>
<div>Although technically the C standard allows locals whose
name begin with an underscore (provided that what follows
is a lower case letter), it seems strange--names beginning
with underscores typically refer to globals that belong to
libraries.<br>
<br>
Also, it looks like all the callers of read_glsl_type pass
this->state for this parameter, so I don't understand
why the parameter is needed at all.<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
Yes, it is possible that in some of calls it might not be explicitly
needed parameter, read_glsl_type must support also situation where
we don't have state, this is used when reading uniform_storage.<br>
<br>
<blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex"> +{<br>
+ uint32_t type_size;<br>
+<br>
+ char *name = map.read_string();<br>
+ map.read(&type_size);<br>
+<br>
+ const glsl_type *type_exists = NULL;<br>
</blockquote>
<div><br>
</div>
<div>"type_exists" sounds like the name of a boolean. Can
we rename this to "existing_type"?<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
ok<br>
<br>
<blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex"> +<br>
+ if (_state && _state->symbols)<br>
+ type_exists =
_state->symbols->get_type(name);<br>
</blockquote>
<div><br>
</div>
<div>I don't see any circumstances where _state or
_state->symbols would be NULL. Am I missing something?<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
state can be null, state->symbols is a paranoid check and can be
removed<br>
<br>
<blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex"> +<br>
+ /* if type exists, move read pointer forward and
return type */<br>
+ if (type_exists) {<br>
+ map.ffwd(type_size);<br>
+ return type_exists;<br>
+ }<br>
</blockquote>
<div><br>
</div>
<div>A single shader is allowed to have multiple distinct
interface types with the same name (e.g. an input and an
output with the block name, but different interface
members), so this won't work. It will cause the distinct
interface types to alias to each other.<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
This will need some more thought then, I'll check if there's some
tests that would utilize this.<br>
<br>
<blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div><br>
</div>
<div>I would have thought that the logic above for looking
up existing types would take care of samplers, since they
are built-in types (and hence populated in the symbol
table before deserialization). Am I missing something
here?<br>
<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
Yes, samplers could be removed. I wanted this function to be able to
construct whatever type but you are right that samplers are all
built-ins.<br>
<br>
<blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>If we do need to keep this switch statement, I
recommend adding some error handling in the default case
to make sure the "failed" flag is set and then return
glsl_type::error_type. That will reduce the risk of Mesa
crashing in the event of a corrupted shader binary.<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
I will remove samplers from here<br>
<br>
<blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex"> +
glsl_struct_field *fields = ralloc_array(mem_ctx,<br>
+ glsl_struct_field, length);<br>
</blockquote>
<div><br>
</div>
<div>We should check for a memory allocation failure here
and do appropriate error handling; that way a corrupted
binary that has an outrageously large length stored in it
won't lead to a crash.<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
ok, will fix<br>
<br>
<blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
+ir_deserializer::read_ir_variable(struct exec_list *list,
memory_map &map)<br>
</blockquote>
<div><br>
</div>
<div>General comment on code organization: I'd prefer to
see deserialization functions like these written as
constructors (i.e. this function would be an ir_variable
constructor, which takes an ir_deserializer as a single
parameter). The advantages of this approach are (a) the
deserialization code is closer to the other
ir_variable-related code, so there's less danger of us
forgetting to update it when we change the structure of
ir_variable, and (b) if we change ir_variable and forget
to update the deserialization code, static analysis tools
like coverity will alert us to the bug. <br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
Yep, I'll refactor and go this way now.<br>
<br>
<blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>uint8_t read_only;<br>
</div>
<div>map.read(&read_only);<br>
</div>
<div>var->read_only = read_only;<br>
<br>
</div>
<div>we simply would do:<br>
<br>
</div>
<div>var->read_only = map.read_uint8();<br>
<br>
</div>
<div>Of course, my idea from the last patch of serializing a
"bits" data structure would make this example moot.<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
I will do the separate 'bits' structure, it'll make code much more
easier to maintain.<br>
<br>
<blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">+<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex"> +
list->push_tail(var);<br>
+<br>
+ return 0;<br>
+}<br>
</blockquote>
<div><br>
</div>
<div>Rather than having this function read the variable and
add it to a list, I'd prefer for it to read the variable
and return a pointer to it. Then the caller can decide
whether to add it to a list or do something else to it.
The key advantage IMHO is more flexibility for the
future. For example, we have talked about the idea of
changing the storage of function parameter ir_variables so
that they use an array of ir_variable pointers rather than
an exec_list. As the deserialization code is written
right now we have extra obstacles to making that future
change.<br>
</div>
<br>
</div>
<div class="gmail_quote">A similar comment applies to most of
the other read...() functions below.<br>
</div>
<div class="gmail_quote"> </div>
</div>
</div>
</blockquote>
<br>
Sure, can be done. Currently it's been all the time about building
the lists so I wanted to have it in functions instead of caller.<br>
<br>
<blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex"> + uint32_t
sig_amount;<br>
</blockquote>
<div><br>
</div>
<div>As in the previous patch, I'd recommend renaming
"sig_amount" to "num_signatures" and "par_count" to
"param_count".<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
ok<br>
<br>
<br>
<blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex"> + if (ir_type !=
ir_type_function_signature) {<br>
+ CACHE_DEBUG("cache format error with function
%s\n", name);<br>
+ return -1;<br>
+ }<br>
</blockquote>
<div><br>
</div>
<div>It seems silly to store ir_type in the instruction
stream and then generate an error if it doesn't match. If
we want to be robust in the face of a corrupted shader
binary, we should be checking for things that don't make
sense (like is_builtin being a number other than 0 or 1)
rather than storing redundant information like ir_type in
the instruction stream and then checking that it equals
the only value that it could possibly be.<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
agreed, this check can be removed<br>
<br>
<blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">+<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex"> + /* insert
function parameter variables to prototypes list ... */<br>
+ foreach_list_const(node, &sig->parameters) {<br>
+ ir_variable *var = ((ir_instruction *)
node)->as_variable();<br>
+ if (var)<br>
+
prototypes->push_tail(var->clone(mem_ctx, NULL));<br>
+ }<br>
</blockquote>
<div><br>
</div>
<div>Can you explain what's going on with the variables in
the prototypes list? I don't understand what purpose this
serves.<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
There can be further references to these variables so this list will
be then used to search for them.<br>
<br>
<blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex"> +read_errors:<br>
+ CACHE_DEBUG("%s: read errors with [%s]\n", __func__,
name);<br>
+ if (sig)<br>
+ ralloc_free(sig);<br>
</blockquote>
<div><br>
</div>
<div>Part of the point of ralloc is that we shouldn't have
to do this kind of clean up. Instead, in the event of
failure, we should just free the toplevel memory context,
and all the memory we allocated during the failed
deserialization attempt will be automatically reclaimed.<br>
<br>
</div>
<div>If we let ralloc take care of reclaiming the memory at
the end of deserialization, then we can replace a lot of
these goto labels for error conditions with early returns,
which I think will make the code easier to read.<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
true, I will go through these cases<br>
<br>
<blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex"> + return NULL;<br>
</blockquote>
<div><br>
</div>
<div>If we adopt the idea of the "failed" boolean that I
talked about above, then that gives us the opportunity to
return a dummy ir_dereference_array rather than NULL. The
advantage of doing that is that there's less risk of the
caller segfaulting before it figures out that we are in an
error condition.<br>
<br>
</div>
<div>Of course, if we adopt my suggestion of turning these
reader functions into IR constructors, that will happen
automatically.<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
Yes, I'm going for the constructor approach.<br>
<br>
<blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex"> +<br>
+read_return:<br>
+ if (list)<br>
+ list->push_tail(con);<br>
</blockquote>
<div><br>
</div>
<div>This is a great example of why it should be the
caller's responsibility to add the deserialized IR to the
appropriate list. That way callers that need to put the
IR in a list can do so, and callers that don't need to put
it in a list won't.<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
will be changed<br>
<br>
<br>
<blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex"> +<br>
+static uint8_t<br>
+read_bool(memory_map &map)<br>
+{<br>
+ uint8_t value = 0;<br>
+ map.read(&value);<br>
+ return value;<br>
+}<br>
</blockquote>
<div><br>
</div>
<div>As in the previous patch, this should just be a
function in memory_map.<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
will fix<br>
<br>
<br>
Thanks for the review;<br>
<br>
// Tapani<br>
<br>
</body>
</html>