[Intel-gfx] [PATCH] lib: Always NUL terminate ucs2_as_utf8
Dave Gordon
david.s.gordon at intel.com
Fri Apr 22 11:27:29 UTC 2016
On 21/04/16 16:13, Peter Jones wrote:
> On Thu, Apr 21, 2016 at 01:18:27PM +0100, Matt Fleming wrote:
>> ( Good Lord, I hate doing string manipulation in C )
>
> (yep)
>
>>
>> On Wed, 20 Apr, at 03:25:32PM, Laszlo Ersek wrote:
>>>
>>> So, "len" does not include the room for the terminating NUL-byte here.
>>> When "len" is passed to ucs2_as_utf8(), with the proposed patch applied,
>>> a NUL byte will be produced in "name", but it will be at the price of a
>>> genuine character from the input variable name.
>>
>> Right, and this is a problem because we're trying to keep the names
>> consistent between efivarfs and the EFI variable data. Force
>> NUL-terminating the string is wrong, because if you have no room for
>> the NUL the caller should check for that. Sadly none do.
>>
>> On the flip-side, passing around non-NUL terminated strings is just
>> begging for these kinds of issues to come up.
>>
>> The fact is that the callers of ucs2_as_utf8() are passing it the
>> wrong 'len' argument. We want a NUL-terminated utf8 string and we're
>> passing a NUL-terminated ucs2 string. We should tell ucs2_as_utf8() it
>> has enough room to copy the NUL.
>>
>> Wouldn't this work (minus the return value checking)?
>
> I agree with your analysis, and your patch looks plausible.
This isn't really any different from the plain old ASCII case, where a
function might call strlen(), allocate a buffer, and strncpy() the
non-NUL bytes into it.
It would of course then be wrong to pass that to any of the standard
string functions that operate on NUL-terminated strings.
OTOH if the code knows that the next thing will be an append @buf[len],
then the temporary lack of a NUL isn't an issue.
Personally I'd be inclined to keep my strings NUL-terminated at all
times, just as a guard against forgetting that I should only use
functions that take an explicit length.
The pattern:
l1 = strlen(s1);
l2 = strlen(s2);
buf = malloc(l1+l2+1);
strcpy(buf, s1);
strcpy(buf+l1, s2);
is quite common, quite safe, and not likely to confuse anyone. The extra
overwrites of the temporary NULs are a small price for those benefits.
.Dave.
More information about the Intel-gfx
mailing list