<html>
<head>
<base href="https://bugs.documentfoundation.org/">
</head>
<body>
<p>
<div>
<b><a class="bz_bug_link
bz_status_NEW "
title="NEW - FIREBIRD: Couldn't create a varchar-field with less than 20 characters in GUI"
href="https://bugs.documentfoundation.org/show_bug.cgi?id=105711#c10">Comment # 10</a>
on <a class="bz_bug_link
bz_status_NEW "
title="NEW - FIREBIRD: Couldn't create a varchar-field with less than 20 characters in GUI"
href="https://bugs.documentfoundation.org/show_bug.cgi?id=105711">bug 105711</a>
from <span class="vcard"><a class="email" href="mailto:btomi96@gmail.com" title="Tamas Bunth <btomi96@gmail.com>"> <span class="fn">Tamas Bunth</span></a>
</span></b>
<pre>(In reply to Lionel Elie Mamane from <a href="show_bug.cgi?id=105711#c9">comment #9</a>)
<span class="quote">> (In reply to Tamas Bunth from <a href="show_bug.cgi?id=105711#c7">comment #7</a>)
> > The driver uses UTF-8 encoding for text-based columns (char, varchar, clob).
>
> > A character now occupies up to 4 bytes, so when the 'length' option in the
> > GUI is set to 5 it will allocate 20 bytes.
>
> > You can put more than 5 characters, because most of the characters don't
> > need all the 4 bytes.
>
> > SO what is the correct behavior of the 'length' GUI option set to 5?
>
> The expected behaviour is 5 characters ("codepoint" in Unicode terminology),
> independently of how many bytes that is.
>
> I understand Firebird deviates from the SQL standard in that respect, and
> takes a length in bytes. We could try to implement the "characters" thing
> within the Firebird SDBC driver (as your patch on gerrit tries to do, but I
> wouldn't bother. If we try to do that, there will always be cases that we
> won't catch. We will correctly implement that when the field is updated
> through the SDBC API, but then the user will do:
>
> SELECT LENGTH(field) as len FROM table;
> or
> UPDATE TABLE SET field='+' || field || '+';
>
> and get unexpected results (results inconsistent with the GUI).</span >
Hmm.. Right.
<span class="quote">> I'd just let it be, and let the Firebird "misfeature" bubble up. A length of
> 5 in the GUI should be a VARCHAR(5) in Firebird. If firebird understands
> that as 5 bytes, then so be it.</span >
A length of 5 in the GUI is a VARCHAR(5) in Firebird, which allocates 20 bytes
(as long as the character set is UTF-8).
The problem is rather the following:
Firebird truncates the input string at 20 bytes no matter how many characters
it contains.
I think it's a bug in Firebird (but I couldn't find it in the bug tracker).
<span class="quote">> Maybe a middle ground would be to use UTF16 instead of UTF8 and let the
> length be UTF16 code units (that is length 5 is 10 bytes), if that's
> possible with Firebird. This is still a bug, but confusing character and
> UTF16 code units is so common that it usually not perceived as a bug. In my
> opinion, if this gives as a result that LENGTH(field) is in bytes and not in
> UTF16 code units, it is inconsistent and not desirable.</span >
There are two kind of length functions (actually there are three of them):
char_length and octet_length[1]
and they are working as expected.
[1] <a href="https://firebirdsql.org/refdocs/langrefupd20-char-length.html">https://firebirdsql.org/refdocs/langrefupd20-char-length.html</a></pre>
</div>
</p>
<hr>
<span>You are receiving this mail because:</span>
<ul>
<li>You are the assignee for the bug.</li>
</ul>
</body>
</html>