[Spice-devel] [PATCH 07/17] Get rid of C-style memset initializations, use C++ style aggregates

Thu Feb 22 18:03:18 UTC 2018

> 
> [Indeed, I had not sent this reply, took time to check standard and
> compilers]
> 
> > On 20 Feb 2018, at 12:31, Frediano Ziglio <fziglio at redhat.com> wrote:
> > 
> >>> On 20 Feb 2018, at 10:39, Lukáš Hrázký <lhrazky at redhat.com> wrote:
> >>> 
> >>> On Tue, 2018-02-20 at 10:18 +0100, Christophe de Dinechin wrote:
> >>>>> On Feb 19, 2018, at 7:19 PM, Lukáš Hrázký <lhrazky at redhat.com> wrote:
> >>>>> 
> >>>>> On Fri, 2018-02-16 at 17:15 +0100, Christophe de Dinechin wrote:
> >>>>>> From: Christophe de Dinechin <dinechin at redhat.com>
> >>>>>> 
> >>>>>> Signed-off-by: Christophe de Dinechin <dinechin at redhat.com>
> >>>>>> ---
> >>>>>> src/spice-streaming-agent.cpp | 47
> >>>>>> ++++++++++++++++++++++++++-----------------
> >>>>>> 1 file changed, 28 insertions(+), 19 deletions(-)
> >>>>>> 
> >>>>>> diff --git a/src/spice-streaming-agent.cpp
> >>>>>> b/src/spice-streaming-agent.cpp
> >>>>>> index 69c27a3..1e19e43 100644
> >>>>>> --- a/src/spice-streaming-agent.cpp
> >>>>>> +++ b/src/spice-streaming-agent.cpp
> >>>>>> @@ -181,19 +181,24 @@ write_all(int fd, const void *buf, const size_t
> >>>>>> len)
> >>>>>>   return written;
> >>>>>> }
> >>>>>> 
> >>>>>> -static int spice_stream_send_format(int streamfd, unsigned w,
> >>>>>> unsigned
> >>>>>> h, unsigned c)
> >>>>>> +static int spice_stream_send_format(int streamfd, unsigned w,
> >>>>>> unsigned
> >>>>>> h, uint8_t c)
> >>>>>> {
> >>>>>> -
> >>>>>> -    SpiceStreamFormatMessage msg;
> >>>>>> -    const size_t msgsize = sizeof(msg);
> >>>>>> -    const size_t hdrsize  = sizeof(msg.hdr);
> >>>>>> -    memset(&msg, 0, msgsize);
> >>>>>> -    msg.hdr.protocol_version = STREAM_DEVICE_PROTOCOL;
> >>>>>> -    msg.hdr.type = STREAM_TYPE_FORMAT;
> >>>>>> -    msg.hdr.size = msgsize - hdrsize; /* includes only the body? */
> >>>>>> -    msg.msg.width = w;
> >>>>>> -    msg.msg.height = h;
> >>>>>> -    msg.msg.codec = c;
> >>>>>> +    const size_t msgsize = sizeof(SpiceStreamFormatMessage);
> >>>>>> +    const size_t hdrsize  = sizeof(StreamDevHeader);
> >>>>>> +    SpiceStreamFormatMessage msg = {
> >>>>>> +        .hdr = {
> >>>>>> +            .protocol_version = STREAM_DEVICE_PROTOCOL,
> >>>>>> +            .padding = 0,       // Workaround GCC "not implemented"
> >>>>>> bug
> >>>>>> +            .type = STREAM_TYPE_FORMAT,
> >>>>>> +            .size = msgsize - hdrsize
> >>>>>> +        },
> >>>>>> +        .msg = {
> >>>>>> +            .width = w,
> >>>>>> +            .height = h,
> >>>>>> +            .codec = c,
> >>>>>> +            .padding1 = { }
> >>>>>> +        }
> >>>>>> +    };
> >>>>>>   syslog(LOG_DEBUG, "writing format\n");
> >>>>>>   std::lock_guard<std::mutex> stream_guard(stream_mtx);
> >>>>>>   if (write_all(streamfd, &msg, msgsize) != msgsize) {
> >>>>>> @@ -204,14 +209,18 @@ static int spice_stream_send_format(int
> >>>>>> streamfd,
> >>>>>> unsigned w, unsigned h, unsign
> >>>>>> 
> >>>>>> static int spice_stream_send_frame(int streamfd, const void *buf,
> >>>>>> const
> >>>>>> unsigned size)
> >>>>>> {
> >>>>>> -    SpiceStreamDataMessage msg;
> >>>>>> -    const size_t msgsize = sizeof(msg);
> >>>>>>   ssize_t n;
> >>>>>> +    const size_t msgsize = sizeof(SpiceStreamFormatMessage);
> >>>>>> +    SpiceStreamDataMessage msg = {
> >>>>>> +        .hdr = {
> >>>>>> +            .protocol_version = STREAM_DEVICE_PROTOCOL,
> >>>>>> +            .padding = 0,       // Workaround GCC "not implemented"
> >>>>>> bug
> >>>>>> +            .type = STREAM_TYPE_DATA,
> >>>>>> +            .size = size  /* includes only the body? */
> >>>>>> +        },
> >>>>>> +        .msg = {}
> >>>>>> +    };
> >>>>> 
> >>>>> So, someone should find out if we can use the designated initializers,
> >>>>> I suppose it depends on the compilers on all platforms we care about
> >>>>> supporting them?
> >>>>> 
> >>>>> I wasn't able to find much useful information so far. Anyone knows in
> >>>>> which version of gcc it was introduced?
> >>>> 
> >>>> My experience is that clang has no issue with it in any version.
> >>>> 
> >>>> GCC supports it in C++ since at least 4.8.5, as tested on RHEL 7.4 with
> >>>> the following short example:
> >>>> 
> >>>> struct Thing { int x; float y; };
> >>>> 
> >>>> int foo()
> >>>> {
> >>>> Thing x = { x: 10, y:20 };
> >>>> Thing y = { .x = 10, .y = 20 };
> >>>> Thing z { 10, 20 };
> >>>> Thing t { .x = 10, .y = 20 };
> >>>> }
> >>>> 
> >>>> It has, however, trouble with out-of-order initializations, with the
> >>>> same
> >>>> message in 4.8.5 as on Fedora 25 (6.4.1):
> >>>> 
> >>>> glop.cpp:9:30: sorry, unimplemented: non-trivial designated initializers
> >>>> not supported
> >>>>  Thing a { .y = 10, .x = 20 };
> >>>> 
> >>>> The “not implemented” message is a bit scary, but the fact that the code
> >>>> as written has been supported as far back as 4.8.5 makes me confident
> >>>> that we are not in some experimental section of the compiler.
> >>> 
> >>> I've found this on the matter:
> >>> 
> >>> https://gcc.gnu.org/ml/gcc/2014-09/msg00207.html
> >>> 
> >>> That doesn't give much confidence... Is this documented anywhere? I
> >>> mean I've only used Google and haven't found anything…
> >> 
> >> I think it’s a low priority thing because the “fix” in source code is so
> >> easy
> >> (reorder the fields), and the fix complicated. According to the message
> >> you
> >> referenced, presently, GCC just checks that the annotations match the
> >> names,
> >> but otherwise ignores them and proceeds like for a POD init. If it were to
> >> be implemented, it would have to support a lot of corner cases mentioned
> >> in
> >> the message, so this makes the fix non-trivial.
> >> 
> > 
> > This is not a bug but a feature. In C++20 fields have to be in the right
> > order so when clang will start implementing correctly the standard will
> > have to fix it. In C++ fields are initialized in order.
> > About the title aggregates are also a C style actually as we are supposed
> > to be C++11 is more a C style than a C++ style.
> > But is not this that worry me. memset(0) and aggregate initializers are
> > quite different, is not only a question of style. Consider this:
> > 
> >   struct xxx {
> >      float f;
> >      int i;
> >   };
> >   xxx v { };
> > 
> > this initialize f and i to 0.0 and 0 respectively. On some machines
> > the memset does not set f to 0.0 (depends on floating point
> > representation).
> 
> > But not a big deal, we don't use floating point.
> > But padding worry me a bit more, consider this:
> > 
> >  struct xxx {
> >     uint8_t c;
> >     uint16_t n;
> >  };
> >  xxx v { 1, 2 }; // or v { .c = 1, .n = 2 };
> 
> All padding is initialized to zero in the case of zero-initialization, see
> http://en.cppreference.com/w/cpp/language/zero_initialization. So if you are
> really concerned, you can write:
>
> xxx v = { };
> x.c = 1; x.n = 2;
>

Source:

void aaa()
{
    struct {
        char c;
        int i;
    } x = { 1, 2 };
    write(0, &x, sizeof(x));
}

Code:

0000000000000610 <_Z3aaav>:
     610:       48 83 ec 18             sub    $0x18,%rsp
     614:       31 ff                   xor    %edi,%edi
     616:       ba 08 00 00 00          mov    $0x8,%edx
     61b:       48 89 e6                mov    %rsp,%rsi
     61e:       c6 04 24 01             movb   $0x1,(%rsp)
     622:       c7 44 24 04 02 00 00    movl   $0x2,0x4(%rsp)
     629:       00
     62a:       64 48 8b 04 25 28 00    mov    %fs:0x28,%rax
     631:       00 00
     633:       48 89 44 24 08          mov    %rax,0x8(%rsp)
     638:       31 c0                   xor    %eax,%eax
     63a:       e8 00 00 00 00          callq  63f <_Z3aaav+0x2f>
     63f:       48 8b 44 24 08          mov    0x8(%rsp),%rax
     644:       64 48 33 04 25 28 00    xor    %fs:0x28,%rax
     64b:       00 00
     64d:       75 05                   jne    654 <_Z3aaav+0x44>
     64f:       48 83 c4 18             add    $0x18,%rsp
     653:       c3                      retq
     654:       e8 00 00 00 00          callq  659 <_Z3aaav+0x49>
     659:       0f 1f 80 00 00 00 00    nopl   0x0(%rax)

so I assume I have a compiler bug here or that the code you wrote
is potentially not zero filling the paddings.

> Now, while your concerns are correct by the word of the standard, they apply
> equally well to default copy and assignment. So unless we disable default
> copies on the C structs, we have the same risk there, including when we
> return values…
> 
> It’s not a real risk, though, for multiple different reasons, the primary one
> being that we should *never ever* depend on the value padding!!! If we do,
> we have a more serious issue elsewhere (see below), and whoever did that
> deserves to spend time debugging his own mess ;-)
> > 
> > now, we know there's a byte between c and n. What's the value of this byte?
> > In the case of memset is 0, in the case of aggregate initializers... we
> > don't
> > know! That is only the bytes occupied by the fields are initialized, not
> > the
> > padding. Currently is not a big issue, there are no implicit holes and all
> > bytes are initialized but is this became false we would potentially have a
> > security issue as we are sending the full raw structure to an external
> > entity.
> 
> While I understand what you are saying, the problem would only arise if the
> following chain of events happened:
> 
> 1) We insert a new field in an (presently nonexistent) implicit padding
> 2) We rely on that insertion not relocating the fields behind, i.e. we do it
> “on purpose” because we are so “smart” (asking for trouble)
> 3) We do so without also adding capabilities or version check and assume the
> field exists on the other end (really asking for trouble)
> 4) We access the field in new software that talks to old software which does
> not have the field (REALLY asking for trouble)
> 5) We rely on that new field being initialized at 0 by the “legacy” other
> end, when we know it’s not true (REALLY REALLY asking for trouble)
> 6) We depend on the struct never being copied on either side, which would
> erase that zero guarantee (Uh oh!)
> 7) We never thought of using the “packed” attribute for that struct… (at that
> point, why bother mentioning this?)
> 8) We compiled with gcc at -O0 to ensure it would not zero-init the padding
> 9) There is a security issue due to the 8 steps above, but we claim the
> problem is to use C++-style init
> 

Only 7 apply here for good reasons. Try to see why ip protocol take into
account alignment.

> That seems far-fetched enough that I’d argue we have more pressing concerns.
> 
> > At the moment the entity should be considered more secure but for
> > instance moving the same code on the server size would cause a security
> > issue. To avoid that any future structure should have no implicit padding
> > but this imposes an implementation detail of the protocol definition in a
> > project written in C (spice-protocol) just to satisfy some much higher
> > level style detail. It does seem to me quite easy to break this is the
> > future. In this respect also there are some condition which are even harder
> > to avoid. Consider something like
> > 
> >  struct xxx {
> >     uint8_t type_of_union;
> >     uint8_t padding;
> >     union {
> >        uint8_t type1;
> >        uint16_t type2;
> >     }
> >  };
> >  xxx v { };
> > 
> > now all the bytes are covered by fields however the last byte is not
> > potentially initialized as C++ mandate that only first no static field
> > of an union is initialized.
> > 
> > Yes, aggregate initialization is a bit easier to read but seems that
> > the cost is potentially high.
> 
> None of the cases you evoke happens in our current codebase. We have explicit
> padding, no unions. So not a problem today and in foreseeable future.
> 

I said that. Current code is not affected but what I'm complaining is
that is not robust and requires attentions.

> > 
> >>> 
> >>> OTOH, if the extension is robust against random mistakes and typos, and
> >>> GCC and clang are the only compilers we care about, I think it should
> >>> be fine?
> >> 
> >> If we agree that a message that contains “not supported” if we mess up is
> >> OK,
> >> I think that’s fine.
> >> 
> >> However, I believe I need to beef up the comment and explain what the
> >> message
> >> is and what the fix is.
> >> 
> >>> 
> >>> Lukas
> >>> 
> >>>> The alternatives are:
> >>>> - Not tagging fields at all
> >>>> - Tagging them “the old way”, i.e. field:value, but that has been
> >>>> deprecated since 2.5 and actually has the same bug as .field = value, so
> >>>> no benefit.
> >>>> 
> >>>> 
> >>>>> 
> >>>>> Lukas
> >>>>> 
> >>>>>> -    memset(&msg, 0, msgsize);
> >>>>>> -    msg.hdr.protocol_version = STREAM_DEVICE_PROTOCOL;
> >>>>>> -    msg.hdr.type = STREAM_TYPE_DATA;
> >>>>>> -    msg.hdr.size = size; /* includes only the body? */
> >>>>>>   std::lock_guard<std::mutex> stream_guard(stream_mtx);
> >>>>>>   n = write_all(streamfd, &msg, msgsize);
> >>>>>>   syslog(LOG_DEBUG,
> >>>>>> @@ -379,7 +388,7 @@ do_capture(int streamfd, const char *streamport,
> >>>>>> FILE *f_log)
> >>>>>> 
> >>>>>>           if (frame.stream_start) {
> >>>>>>               unsigned width, height;
> >>>>>> -                unsigned char codec;
> >>>>>> +                uint8_t codec;
> >>>>>> 
> >>>>>>               width = frame.size.width;
> >>>>>>               height = frame.size.height;
> > 
> > Maybe even better if the type is SpiceVideoCodecType here or use auto
> > to avoid any possible truncation.
> 
> Makes sense to change for clarity, but StreamMsgFormat codec is uint8_t.
> 

I know, C has no specification for enumeration size so we can't use enums
in a binary structure.

Frediano