<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html;
      charset=ISO-8859-1">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix">On 11/05/2013 10:16 PM, Paul Berry
      wrote:<br>
    </div>
    <blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
      type="cite">
      <div dir="ltr">On 1 November 2013 02:16, Tapani Pälli <span
          dir="ltr"><<a moz-do-not-send="true"
            href="mailto:tapani.palli@intel.com" target="_blank">tapani.palli@intel.com</a>></span>
        wrote:<br>
        <div class="gmail_extra">
          <div class="gmail_quote">
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex">ir_deserializer can
              create a gl_shader structure out of binary<br>
              data from ir_serializer, will be used by
              OES_get_program_binary<br>
              implementation.<br>
              <br>
              Signed-off-by: Tapani Pälli <<a moz-do-not-send="true"
                href="mailto:tapani.palli@intel.com" target="_blank">tapani.palli@intel.com</a>><br>
              ---<br>
               src/glsl/Makefile.sources          |    1 +<br>
               src/glsl/ir_cache_deserializer.cpp | 1341
              ++++++++++++++++++++++++++++++++++++<br>
               src/glsl/ir_cache_deserializer.h   |  301 ++++++++<br>
               3 files changed, 1643 insertions(+)<br>
               create mode 100644 src/glsl/ir_cache_deserializer.cpp<br>
               create mode 100644 src/glsl/ir_cache_deserializer.h<br>
            </blockquote>
            <div><br>
            </div>
            <div>General comment: my attitude about error handling is
              different when reviewing this patch than it was when
              reviewing the last patch, because in the case of
              deserialization, we're dealing with data passed in by the
              client, so I think it's far more important to be robust in
              the case of malformed input data.  We probably don't need
              to worry about malicious clients, since the client is in
              process so if they wanted to be malicious they could just
              walk all over memory.  But it will make life a lot easier
              for library clients (and for us when we're debugging) if
              we can avoid having Mesa die in a fire when it receives a
              corrupted shader binary.<br>
              <br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    Yes, I agree, it makes sense to have some more error checking in
    deserialization, it has to be basic and quick though.<br>
    <br>
    <blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div>Having said that, I still don't think integer return
              values are the way to go, for the following reasons:<br>
            </div>
            <div>- it's confusing for 0 to mean success.<br>
            </div>
            <div>- we're going to forget to check return values.<br>
            </div>
            <div>- it complicates the signatures of a lot of functions
              that should be able to just return the thing that they
              read.<br>
              <br>
            </div>
            <div>What I'd recommend doing instead is adding a "failed"
              boolean to memory_map.  That way rather than having to
              check every single return value for failure, we just have
              to check the "failed" boolean at appropriate critical
              times (such as inside loops and at the end of functions
              like ir_read_variable()), and return early in the event of
              failure.  An additional advantage of this is that it makes
              it easy to add bounds checking to memory_map--the read()
              and ffwd() functions can simply check if the position goes
              out of bounds, and if so, set "failed" flag to 0 and read
              zeros.<br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    ok, I'll take a look how this would work out. The current checks
    have actually helped as they catch not just invalid/corrupt data
    from client but possible errors in serialization too, even though
    they are just simple value checks but there's quite a lot of them
    and they help in stopping the parsing immediately. If allowed to go
    further after error, it'll be harder to track when did it start to
    fail so I think there will be still a lot of checks.<br>
    <br>
    <blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex">
              +ir_deserializer::read_header(struct gl_shader *shader,
              memory_map &map,<br>
              +   const char *mesa_sha)<br>
            </blockquote>
            <div><br>
            </div>
            <div>Rather than pass the memory_map to all of the
              ir_deserializer functions, I'd prefer to see it just be a
              member of the ir_deserializer class.  It's a closer
              parallel to what you do with memory_writer in
              ir_serializer, and it makes the code easier to read by
              making all of the function signatures and invocations
              smaller.<br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    Yep, I think it can be changed, the actual memory_map will come from
    caller of this class. I think read_glsl_type() is the only method
    which will require support to pass a memory_map, this is used when
    reading uniform_storage when deserializing a whole pgoram. <br>
    <br>
    <blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div>As with the memory_writer::write*() functions, I think
              the memory_map::read() functions should include the type
              in their name, and should use return by value rather than
              a pointer argument.  E.g.:<br>
              <br>
            </div>
            <div>int32_t read_int32();<br>
            </div>
            <div>bool read_bool();<br>
              <br>
            </div>
            <div>and so on.<br>
              <br>
            </div>
            <div>(Note: I think it's still ok to have a read() function
              that operates on a pointer for reading larger structs).<br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    Yes, will change these.<br>
    <br>
    <blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div><br>
            </div>
            <div>Although technically the C standard allows locals whose
              name begin with an underscore (provided that what follows
              is a lower case letter), it seems strange--names beginning
              with underscores typically refer to globals that belong to
              libraries.<br>
              <br>
              Also, it looks like all the callers of read_glsl_type pass
              this->state for this parameter, so I don't understand
              why the parameter is needed at all.<br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    Yes, it is possible that in some of calls it might not be explicitly
    needed parameter, read_glsl_type must support also situation where
    we don't have state, this is used when reading uniform_storage.<br>
    <br>
    <blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div> </div>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex"> +{<br>
              +   uint32_t type_size;<br>
              +<br>
              +   char *name = map.read_string();<br>
              +   map.read(&type_size);<br>
              +<br>
              +   const glsl_type *type_exists = NULL;<br>
            </blockquote>
            <div><br>
            </div>
            <div>"type_exists" sounds like the name of a boolean.  Can
              we rename this to "existing_type"?<br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    ok<br>
    <br>
    <blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div> </div>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex"> +<br>
              +   if (_state && _state->symbols)<br>
              +      type_exists =
              _state->symbols->get_type(name);<br>
            </blockquote>
            <div><br>
            </div>
            <div>I don't see any circumstances where _state or
              _state->symbols would be NULL.  Am I missing something?<br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    state can be null, state->symbols is a paranoid check and can be
    removed<br>
    <br>
    <blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div> </div>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex"> +<br>
              +   /* if type exists, move read pointer forward and
              return type */<br>
              +   if (type_exists) {<br>
              +      map.ffwd(type_size);<br>
              +      return type_exists;<br>
              +   }<br>
            </blockquote>
            <div><br>
            </div>
            <div>A single shader is allowed to have multiple distinct
              interface types with the same name (e.g. an input and an
              output with the block name, but different interface
              members), so this won't work.  It will cause the distinct
              interface types to alias to each other.<br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    This will need some more thought then, I'll check if there's some
    tests that would utilize this.<br>
    <br>
    <blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div><br>
            </div>
            <div>I would have thought that the logic above for looking
              up existing types would take care of samplers, since they
              are built-in types (and hence populated in the symbol
              table before deserialization).  Am I missing something
              here?<br>
              <br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    Yes, samplers could be removed. I wanted this function to be able to
    construct whatever type but you are right that samplers are all
    built-ins.<br>
    <br>
    <blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div>If we do need to keep this switch statement, I
              recommend adding some error handling in the default case
              to make sure the "failed" flag is set and then return
              glsl_type::error_type.  That will reduce the risk of Mesa
              crashing in the event of a corrupted shader binary.<br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    I will remove samplers from here<br>
    <br>
    <blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex"> +    
               glsl_struct_field *fields = ralloc_array(mem_ctx,<br>
              +         glsl_struct_field, length);<br>
            </blockquote>
            <div><br>
            </div>
            <div>We should check for a memory allocation failure here
              and do appropriate error handling; that way a corrupted
              binary that has an outrageously large length stored in it
              won't lead to a crash.<br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    ok, will fix<br>
    <br>
    <blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex">
              +ir_deserializer::read_ir_variable(struct exec_list *list,
              memory_map &map)<br>
            </blockquote>
            <div><br>
            </div>
            <div>General comment on code organization:  I'd prefer to
              see deserialization functions like these written as
              constructors (i.e. this function would be an ir_variable
              constructor, which takes an ir_deserializer as a single
              parameter).  The advantages of this approach are (a) the
              deserialization code is closer to the other
              ir_variable-related code, so there's less danger of us
              forgetting to update it when we change the structure of
              ir_variable, and (b) if we change ir_variable and forget
              to update the deserialization code, static analysis tools
              like coverity will alert us to the bug. <br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    Yep, I'll refactor and go this way now.<br>
    <br>
    <blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div>uint8_t read_only;<br>
            </div>
            <div>map.read(&read_only);<br>
            </div>
            <div>var->read_only = read_only;<br>
              <br>
            </div>
            <div>we simply would do:<br>
              <br>
            </div>
            <div>var->read_only = map.read_uint8();<br>
              <br>
            </div>
            <div>Of course, my idea from the last patch of serializing a
              "bits" data structure would make this example moot.<br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    I will do the separate 'bits' structure, it'll make code much more
    easier to maintain.<br>
    <br>
    <blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">+<br>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex"> +  
              list->push_tail(var);<br>
              +<br>
              +   return 0;<br>
              +}<br>
            </blockquote>
            <div><br>
            </div>
            <div>Rather than having this function read the variable and
              add it to a list, I'd prefer for it to read the variable
              and return a pointer to it.  Then the caller can decide
              whether to add it to a list or do something else to it. 
              The key advantage IMHO is more flexibility for the
              future.  For example, we have talked about the idea of
              changing the storage of function parameter ir_variables so
              that they use an array of ir_variable pointers rather than
              an exec_list.  As the deserialization code is written
              right now we have extra obstacles to making that future
              change.<br>
            </div>
            <br>
          </div>
          <div class="gmail_quote">A similar comment applies to most of
            the other read...() functions below.<br>
          </div>
          <div class="gmail_quote"> </div>
        </div>
      </div>
    </blockquote>
    <br>
    Sure, can be done. Currently it's been all the time about building
    the lists so I wanted to have it in functions instead of caller.<br>
    <br>
    <blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex"> +   uint32_t
              sig_amount;<br>
            </blockquote>
            <div><br>
            </div>
            <div>As in the previous patch, I'd recommend renaming
              "sig_amount" to "num_signatures" and "par_count" to
              "param_count".<br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    ok<br>
    <br>
    <br>
    <blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex"> +      if (ir_type !=
              ir_type_function_signature) {<br>
              +         CACHE_DEBUG("cache format error with function
              %s\n", name);<br>
              +         return -1;<br>
              +      }<br>
            </blockquote>
            <div><br>
            </div>
            <div>It seems silly to store ir_type in the instruction
              stream and then generate an error if it doesn't match.  If
              we want to be robust in the face of a corrupted shader
              binary, we should be checking for things that don't make
              sense (like is_builtin being a number other than 0 or 1)
              rather than storing redundant information like ir_type in
              the instruction stream and then checking that it equals
              the only value that it could possibly be.<br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    agreed, this check can be removed<br>
    <br>
    <blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">+<br>
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex"> +      /* insert
              function parameter variables to prototypes list ... */<br>
              +      foreach_list_const(node, &sig->parameters) {<br>
              +         ir_variable *var = ((ir_instruction *)
              node)->as_variable();<br>
              +         if (var)<br>
              +          
               prototypes->push_tail(var->clone(mem_ctx, NULL));<br>
              +      }<br>
            </blockquote>
            <div><br>
            </div>
            <div>Can you explain what's going on with the variables in
              the prototypes list?  I don't understand what purpose this
              serves.<br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    There can be further references to these variables so this list will
    be then used to search for them.<br>
    <br>
    <blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex"> +read_errors:<br>
              +   CACHE_DEBUG("%s: read errors with [%s]\n", __func__,
              name);<br>
              +   if (sig)<br>
              +      ralloc_free(sig);<br>
            </blockquote>
            <div><br>
            </div>
            <div>Part of the point of ralloc is that we shouldn't have
              to do this kind of clean up.  Instead, in the event of
              failure, we should just free the toplevel memory context,
              and all the memory we allocated during the failed
              deserialization attempt will be automatically reclaimed.<br>
              <br>
            </div>
            <div>If we let ralloc take care of reclaiming the memory at
              the end of deserialization, then we can replace a lot of
              these goto labels for error conditions with early returns,
              which I think will make the code easier to read.<br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    true, I will go through these cases<br>
    <br>
    <blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex"> +   return NULL;<br>
            </blockquote>
            <div><br>
            </div>
            <div>If we adopt the idea of the "failed" boolean that I
              talked about above, then that gives us the opportunity to
              return a dummy ir_dereference_array rather than NULL.  The
              advantage of doing that is that there's less risk of the
              caller segfaulting before it figures out that we are in an
              error condition.<br>
              <br>
            </div>
            <div>Of course, if we adopt my suggestion of turning these
              reader functions into IR constructors, that will happen
              automatically.<br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    Yes, I'm going for the constructor approach.<br>
    <br>
    <blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex"> +<br>
              +read_return:<br>
              +   if (list)<br>
              +      list->push_tail(con);<br>
            </blockquote>
            <div><br>
            </div>
            <div>This is a great example of why it should be the
              caller's responsibility to add the deserialized IR to the
              appropriate list.  That way callers that need to put the
              IR in a list can do so, and callers that don't need to put
              it in a list won't.<br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    will be changed<br>
    <br>
    <br>
    <blockquote
cite="mid:CA+yLL66U6FmPti3PTrXvHK3Zyx-J=S8DATEfYyRx3m-ANzQ-9g@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <blockquote class="gmail_quote" style="margin:0px 0px 0px
              0.8ex;border-left:1px solid
              rgb(204,204,204);padding-left:1ex"> +<br>
              +static uint8_t<br>
              +read_bool(memory_map &map)<br>
              +{<br>
              +   uint8_t value = 0;<br>
              +   map.read(&value);<br>
              +   return value;<br>
              +}<br>
            </blockquote>
            <div><br>
            </div>
            <div>As in the previous patch, this should just be a
              function in memory_map.<br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
    will fix<br>
    <br>
    <br>
    Thanks for the review;<br>
    <br>
    // Tapani<br>
    <br>
  </body>
</html>