[Spice-devel] [PATCH spice-gtk v3 1/5] Avoid clang warnings on casts with stricter alignment requirements
Christophe de Dinechin
christophe at dinechin.org
Wed May 17 14:17:29 UTC 2017
> On 17 May 2017, at 12:08, Frediano Ziglio <fziglio at redhat.com> wrote:
>
>>
>>
>>> On 17 May 2017, at 09:44, Frediano Ziglio <fziglio at redhat.com> wrote:
>>>
>>>>
>>>> From: Christophe de Dinechin <dinechin at redhat.com>
>>>>
>>>> For example, something like this:
>>>> uint8_t *p8;
>>>> uint32_t *p32 = (uint32_t *) p8;
>>>>
>>>> generates a warning like this:
>>>> spice-channel.c:1350:10: error: cast from 'uint8_t *' (aka 'unsigned char
>>>> *') to
>>>> 'uint32_t *' (aka 'unsigned int *') increases required alignment from
>>>> 1
>>>> to
>>>> 4 [-Werror,-Wcast-align]
>>>>
>>>> The warning indicates that we end up with a pointer to data that
>>>> should be 4-byte aligned, but its value may be misaligned. On x86,
>>>> this does not make much of a difference, except a relatively minor
>>>> performance penalty. However, on platforms such as ARM, misaligned
>>>> accesses are emulated by the kernel, and support for them is optional.
>>>> So we may end up with a fault.
>>>>
>>>> The intent of the fix here is to make it easy to identify and rework
>>>> places where actual mis-alignment occurs. Wherever casts raise the
>>>> warning,
>>>> they are replaced with a macro:
>>>>
>>>> - SPICE_ALIGNED_CAST(type, value) casts value to type, and indicates that
>>>> we believe the resulting pointer is aligned. If it is not, a warning will
>>>> be issued.
>>>>
>>>> - SPICE_UNALIGNED_CAST(type, value) casts value to type, and indicates
>>>> that
>>>> we believe the resulting pointer is not always aligned
>>>>
>>>> Any code using SPICE_UNALIGNED_CAST may need to be revisited in order
>>>> to improve performance, e.g. by using memcpy.
>>>>
>>>> There are normally no warnings for SPICE_UNALIGNED_CAST, but it is
>>>> possible
>>>> to emit debug messages for mis-alignment in SPICE_UNALIGNED_CAST.
>>>> Set environment variable SPICE_DEBUG_ALIGNMENT to activate this check.
>>>>
>>>> Signed-off-by: Christophe de Dinechin <dinechin at redhat.com>
>>>> ---
>>>> spice-common | 2 +-
>>>> src/channel-cursor.c | 6 +++---
>>>> src/channel-display-mjpeg.c | 2 +-
>>>> src/decode-glz-tmpl.c | 2 +-
>>>> src/spice-channel.c | 14 +++++++++-----
>>>> 5 files changed, 15 insertions(+), 11 deletions(-)
>>>>
>>>> diff --git a/spice-common b/spice-common
>>>> index af682b1..1239c82 160000
>>>> --- a/spice-common
>>>> +++ b/spice-common
>>>> @@ -1 +1 @@
>>>> -Subproject commit af682b1b06dea55007d9aa7c37cd443e4349e43f
>>>> +Subproject commit 1239c82c54dee0244cca262cca0aa21071b23e24
>>>> diff --git a/src/channel-cursor.c b/src/channel-cursor.c
>>>> index 609243b..50de5ce 100644
>>>> --- a/src/channel-cursor.c
>>>> +++ b/src/channel-cursor.c
>>>> @@ -340,7 +340,7 @@ static display_cursor *set_cursor(SpiceChannel
>>>> *channel,
>>>> SpiceCursor *scursor)
>>>> memcpy(cursor->data, data, size);
>>>> for (i = 0; i < hdr->width * hdr->height; i++) {
>>>> pix_mask = get_pix_mask(data, size, i);
>>>> - if (pix_mask && *((guint32*)data + i) == 0xffffff) {
>>>> + if (pix_mask && *(SPICE_ALIGNED_CAST(guint32*, data) + i) ==
>>>> 0xffffff) {
>>>> cursor->data[i] = get_pix_hack(i, hdr->width);
>>>> } else {
>>>> cursor->data[i] |= (pix_mask ? 0 : 0xff000000);
>>>> @@ -350,7 +350,7 @@ static display_cursor *set_cursor(SpiceChannel
>>>> *channel,
>>>> SpiceCursor *scursor)
>>>> case SPICE_CURSOR_TYPE_COLOR16:
>>>> for (i = 0; i < hdr->width * hdr->height; i++) {
>>>> pix_mask = get_pix_mask(data, size, i);
>>>> - pix = *((guint16*)data + i);
>>>> + pix = *(SPICE_ALIGNED_CAST(guint16*,data) + i);
>>>> if (pix_mask && pix == 0x7fff) {
>>>> cursor->data[i] = get_pix_hack(i, hdr->width);
>>>> } else {
>>>
>>> If these 2 hits you are going to have a message for every single pixel
>>> in the image, potentially millions.
>>
>> No, you will get one message only when you enter the image.
>>
>> static const char *last_loc = NULL;
>> if (loc != last_loc) {
>> last_loc = loc;
>> spice_log(SPICE_LOG_DOMAIN, SPICE_LOG_LEVEL_WARNING, loc,
>> __FUNCTION__,
>> "Misaligned access at %p, alignment %u", p, sz);
>> }
>>
>> If it’s the same source code location, you get it once. But if you have two
>> such warnings, then yes, it may become verbose.
>>
>> But if you get millions of calls in case of mis-alignment, it means that on
>> ARM, you get millions of faults into the kernel to correct the alignment,
>> which is surely a lot worse performance-wise than a call.
>>
>> I remember someone (Pavel?) recently saying that Spice performance on ARM was
>> terrible. Something like this could easily explain it.
>>
>>
>>> I'm still convinced, like in this case, that if we are sure that data
>>> should be aligned no check and warning should be done.
>>
>> Would you be OK with a check and warning under #ifndef NDEBUG (like an
>> assert).
>>
>>> I'm really sure no architecture on hearth can do check and print in
>>> a single machine instruction.
>>
>> This is not what I said. I said that the cost in the likely case was one
>> extra instruction.
>>
>>> Actually I don't know any architecture
>>> where a single check like this can be done in a single instruction.
>>
>> Reduced source code: https://pastebin.com/ibZEVThA.
>>
>> Machine code for x86 with clang: https://pastebin.com/Vmtwpkhf
>>
>> The test follows the call to malloc. It’s a single “testb” instruction
>> followed by a never-taken conditional jump to cold-code. I am not positive,
>> but I’m pretty sure all recent x86 architectures execute this combination in
>> a single cycle under nominal conditions. So two additional instructions in
>> the hot code path, one of them (the jump) normally never executed.
>>
>
> Yes, 1+1 == 2 so that's 2 instructions not counting the print.
> The jump is always executed, just not taken.
> Not many processors implements speculative execution so there's
> a penalty on the jump preventing following instructions to be
> executed.
There, I disagree. All recent CPUs have branch prediction, and in that case, we have a “statically not taken” branch if there is no history, and a strongly not taken if there is history.
> Beside these quite paranoid consideration my point is simple
> that in different cases we know this is SURELY to never happen
> and I don't see the point of doing the check at all.
> Our current platforms don't have much issues with unaligned
> memory
I actually checked that. It’s true. x86 seems to have zero penalty. It looks like recent ARM chips also support misaligned. I have old chips at home ;-)
> but the compiler is currently helping porting spice-gtk
> to platforms with more strongly alignment issues so I don't
> think is wasted time to fix properly these problems to avoid
> alignment issues in the possible future.
>
>> movl $27, %edi
>> callq _malloc
>>
>> ## A single instruction to test
>> testb $7, %al
>>
>
> did you notice the bug in the macro?
> You are aligning on the pointer, not to the int type.
No, I had not noticed. Very good catch. So no-ack ;-)
Christophe
>
>> ## Fall through jump not taken, cost presumably 0 cycles
>> jne LBB2_1
>> LBB2_2:
>>
>>
>> Christophe
>>
>>>
>>>> @@ -364,7 +364,7 @@ static display_cursor *set_cursor(SpiceChannel
>>>> *channel,
>>>> SpiceCursor *scursor)
>>>> for (i = 0; i < hdr->width * hdr->height; i++) {
>>>> pix_mask = get_pix_mask(data, size + (sizeof(uint32_t) << 4),
>>>> i);
>>>> int idx = (i & 1) ? (data[i >> 1] & 0x0f) : ((data[i >> 1] &
>>>> 0xf0) >> 4);
>>>> - pix = *((uint32_t*)(data + size) + idx);
>>>> + pix = *(SPICE_UNALIGNED_CAST(uint32_t*,(data + size)) + idx);
>>>> if (pix_mask && pix == 0xffffff) {
>>>> cursor->data[i] = get_pix_hack(i, hdr->width);
>>>> } else {
>>>> diff --git a/src/channel-display-mjpeg.c b/src/channel-display-mjpeg.c
>>>> index 3ae9d21..17c0f4f 100644
>>>> --- a/src/channel-display-mjpeg.c
>>>> +++ b/src/channel-display-mjpeg.c
>>>> @@ -151,7 +151,7 @@ static gboolean mjpeg_decoder_decode_frame(gpointer
>>>> video_decoder)
>>>> #ifndef JCS_EXTENSIONS
>>>> {
>>>> uint8_t *s = lines[0];
>>>> - uint32_t *d = (uint32_t *)s;
>>>> + uint32_t *d = SPICE_ALIGNED_CAST(uint32_t *,s);
>>>>
>>>> if (back_compat) {
>>>> for (unsigned int j = lines_read * width; j > 0; ) {
>>>> diff --git a/src/decode-glz-tmpl.c b/src/decode-glz-tmpl.c
>>>> index b337a8b..7695a28 100644
>>>> --- a/src/decode-glz-tmpl.c
>>>> +++ b/src/decode-glz-tmpl.c
>>>> @@ -178,7 +178,7 @@ static size_t FNAME(decode)(SpiceGlzDecoderWindow
>>>> *window,
>>>> uint64_t image_id, SpicePalette *plt)
>>>> {
>>>> uint8_t *ip = in_buf;
>>>> - OUT_PIXEL *out_pix_buf = (OUT_PIXEL *)out_buf;
>>>> + OUT_PIXEL *out_pix_buf = SPICE_ALIGNED_CAST(OUT_PIXEL *,out_buf);
>>>> OUT_PIXEL *op = out_pix_buf;
>>>> OUT_PIXEL *op_limit = out_pix_buf + size;
>>>>
>>>
>>> This is safe too. 16 byte and 32 byte images must be aligned and 24 and 8
>>> bit
>>> are surely aligned (to byte).
>>>
>>>> diff --git a/src/spice-channel.c b/src/spice-channel.c
>>>> index 3b6231e..88d0567 100644
>>>> --- a/src/spice-channel.c
>>>> +++ b/src/spice-channel.c
>>>> @@ -1312,6 +1312,7 @@ static void spice_channel_send_link(SpiceChannel
>>>> *channel)
>>>> {
>>>> SpiceChannelPrivate *c = channel->priv;
>>>> uint8_t *buffer, *p;
>>>> + uint32_t *p32;
>>>> int protocol, i;
>>>>
>>>> c->link_hdr.magic = SPICE_MAGIC;
>>>> @@ -1356,14 +1357,15 @@ static void spice_channel_send_link(SpiceChannel
>>>> *channel)
>>>> memcpy(p, &c->link_hdr, sizeof(c->link_hdr)); p +=
>>>> sizeof(c->link_hdr);
>>>> memcpy(p, &c->link_msg, sizeof(c->link_msg)); p +=
>>>> sizeof(c->link_msg);
>>>>
>>>> + // Need this to avoid error with -Wcast-align
>>>> + p32 = SPICE_UNALIGNED_CAST(uint32_t *,p);
>>>> for (i = 0; i < c->common_caps->len; i++) {
>>>> - *(uint32_t *)p = GUINT32_TO_LE(g_array_index(c->common_caps,
>>>> uint32_t, i));
>>>> - p += sizeof(uint32_t);
>>>> + *p32++ = GUINT32_TO_LE(g_array_index(c->common_caps, uint32_t,
>>>> i));
>>>> }
>>>> for (i = 0; i < c->caps->len; i++) {
>>>> - *(uint32_t *)p = GUINT32_TO_LE(g_array_index(c->caps, uint32_t,
>>>> i));
>>>> - p += sizeof(uint32_t);
>>>> + *p32++ = GUINT32_TO_LE(g_array_index(c->caps, uint32_t, i));
>>>> }
>>>> + p = (uint8_t *) p32;
>>>> CHANNEL_DEBUG(channel, "channel type %d id %d num common caps %u num
>>>> caps %u",
>>>> c->channel_type,
>>>> c->channel_id,
>>>> @@ -1887,6 +1889,7 @@ static gboolean
>>>> spice_channel_recv_link_msg(SpiceChannel *channel)
>>>> SpiceChannelPrivate *c;
>>>> int rc, num_caps, i;
>>>> uint32_t *caps, num_channel_caps, num_common_caps;
>>>> + uint8_t *caps_src;
>>>> SpiceChannelEvent event = SPICE_CHANNEL_ERROR_LINK;
>>>>
>>>> g_return_val_if_fail(channel != NULL, FALSE);
>>>> @@ -1926,7 +1929,8 @@ static gboolean
>>>> spice_channel_recv_link_msg(SpiceChannel *channel)
>>>> /* see original spice/client code: */
>>>> /* g_return_if_fail(c->peer_msg + c->peer_msg->caps_offset *
>>>> sizeof(uint32_t) > c->peer_msg + c->peer_hdr.size); */
>>>>
>>>> - caps = (uint32_t *)((uint8_t *)c->peer_msg +
>>>> GUINT32_FROM_LE(c->peer_msg->caps_offset));
>>>> + caps_src = (uint8_t *)c->peer_msg +
>>>> GUINT32_FROM_LE(c->peer_msg->caps_offset);
>>>> + caps = SPICE_UNALIGNED_CAST(uint32_t *, caps_src);
>>>>
>>>> g_array_set_size(c->remote_common_caps, num_common_caps);
>>>> for (i = 0; i < num_common_caps; i++, caps++) {
>>>
>>> Also you should merge style changes from 5/5.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/spice-devel/attachments/20170517/0080b8d0/attachment-0001.html>
More information about the Spice-devel
mailing list