[Fribidi-discuss] (Justified?) compilation warning (fwd)

Wed Oct 23 01:31:02 EST 2002

This email is a bit long, sorry.

Behdad Esfahbod wrote:

>I don't like the idea too much.  IMHO the FriBidiChar type should 
>remain 4 bytes.  We should have different encoding types.
>  
>
I don't understand the "Different encoding types" part.

As for FriBidiChar - as far as the application is concerned, it is 4
bytes, unless the application explicitly asks for a different size (by
setting a preprocessor macro). This is the way Windows implements its
different encoding functions. I will explain the Windows way because I
have some experience with it from both ends: as a user of the API
writing programs, and as a developer (Wine) who has to implement that
same infrastructure. I truly believe this is the way to go.

In Windows there are three character types and one meta-type:
char - 1 byte
WCHAR - 2 bytes (utf-16)
mbcs - 1 or 2 bytes (unsigned char - meant to be used for CJK Multi Byte 
Character Strings).
TCHAR - the current character base type.

If a macro called "UNICODE" is defined, TCHAR is WCHAR. If "MBCS" is
defined, it is mbcs. Otherwise it is char.

When calling a function that depends on the character width (almost all
of them do), two variants are defined in the DLLs: one with an A suffix
(for ANSI) and another with a W suffix (for Wide). A set of #ifdefs
and #defines in the Windows headers maps the basic function name (for
example, CreateWindow) to either CreateWindowA or CreateWindowW.
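
To make the mapping concrete, here is a minimal sketch of the same
idea in portable C. Every name in it (demo_char_a, demo_char_w,
demo_tchar, demo_strlen, DEMO_UNICODE) is made up for illustration and
is not taken from the Windows headers or from fribidi.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the narrow and wide character types. */
typedef char           demo_char_a;  /* 1-byte "ANSI" unit */
typedef unsigned short demo_char_w;  /* "wide" unit, 2 bytes on common ABIs */

/* Both variants always exist, each with its own suffix. */
size_t demo_strlen_a(const demo_char_a *s)
{
    size_t n = 0;
    assert(s != NULL);
    while (s[n] != 0)
        ++n;
    return n;
}

size_t demo_strlen_w(const demo_char_w *s)
{
    size_t n = 0;
    assert(s != NULL);
    while (s[n] != 0)
        ++n;
    return n;
}

/* The unsuffixed name and the meta-type follow a preprocessor macro. */
#ifdef DEMO_UNICODE
typedef demo_char_w demo_tchar;
#define demo_strlen demo_strlen_w
#else
typedef demo_char_a demo_tchar;
#define demo_strlen demo_strlen_a
#endif
```

Code written against demo_tchar and demo_strlen changes flavour when
rebuilt with -DDEMO_UNICODE, while the suffixed names stay available
for the odd case that needs one specific variant.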

At some point I wanted to write a program that could be compiled in
both an ANSI and a Unicode variant. This was done simply by building
two flavours, one with the equivalent of "-DUNICODE" in "CFLAGS" and
one without. The program itself was written almost exclusively with
unsuffixed function names such as "CreateWindow". The only exception
was one place where a filename arrived as a char * string. Instead of
converting it, I just called "CreateFileA" instead of "CreateFile",
thus calling the ANSI variant even when the program was built as the
"Unicode" flavour.

Then again, at some point I asked for a feature (Font dialog - the
encoding selection) that made no sense in the Unicode variation. At that
point I had to place an #ifdef UNICODE, as the ALGORITHM was
different between the flavours.

When working on the other side of the API, as a Wine developer, both A
and W variants need to be implemented. In Windows as in Wine, the way to
do that is usually by converting the ANSI strings to Unicode and calling
the W variant. This, I believe, would be the wrong way to go for
Fribidi, hence my admittedly somewhat unusual suggestion.

>>Makefile - don't you know what that is?
>>There are two special tricks performed in this makefile.
>>The first is for compiling fribidi_utf.c three times, selecting a 
>>different default each time. The three .o files are then merged into a 
>>single library.
>>    
>>
>Nice trick, it's a bit unreadible at first, but I like it.  But I 
>really don't know how to merge it with fribidi's current 
>codebase.
>  
>
The only problem I see is with the various auto* tools. I have to admit
this is the first time I have seen anything beyond autoconf used, even
counting projects far larger than fribidi, and I'm beginning to
understand why. The other tools seem to impose limitations on directory
structure at the very least. I have not given up on implementing this
strange trick yet, but if that fails, how crucial is it to use them all?

Meanwhile, in a different email, Behdad also wrote:

>In the patch you sent, the answers were (as far as my understanding goes):
>> A. Automatically, according to the first argument's type (assuming it's 
>> the right type)
>  
>
>
>I like the idea of selection according to the first argument's 
>type, but seems that I should forget it.
>  
>
I don't. This spells compilation problems left and right. If this
were a C++ interface, and we could do that without losing static type
checking, I'd perhaps reconsider.

>>> B. Either no reference was made, or suggested reimplementation for each 
>>> size (different sources).
>>    
>>
>
>Same source, In what I'm thinking of, the main functions are the 
>same, and the wrappers do the casts.  Some small functions may 
>need three implementations though.
>  
>
And you are suffering performance penalties, because you are moving
four-byte units using one-byte functions. This means that you are
limited a priori, by the design, to an inefficient engine. The
alternative is to reimplement some functions, which will suffer from
duplicate code.

Had we been writing in C++, the implementation I would suggest would
involve templating the entire engine. This achieves your goal of
autoselecting based on variable type, and solves the duplicate code
problem. While this would have been a reasonable solution HAD C++ been
used, for C I think I am suggesting a reasonable alternative.

No duplicate code - instead of duplicating the code, we are simply
recompiling the same code as many times as necessary. We actually have
better control over variations in algorithm, as you cannot make an
#ifdef depend on a template parameter (you could, theoretically, use
"if", but that's a different discussion).
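
As an illustration of what "recompiling the same code" means, here is
a minimal sketch of a source file whose character width and function
suffix both come from one macro. It is not the demo posted to the
list and not fribidi code. DEMO_CHAR_BITS, demo_char, DEMO_NAME and
demo_reverse are hypothetical names, and the in-place reversal is
only a stand-in for a real engine function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Width selected on the compiler command line; 32 is the default. */
#ifndef DEMO_CHAR_BITS
#define DEMO_CHAR_BITS 32
#endif

#if DEMO_CHAR_BITS == 8
typedef uint8_t demo_char;
#elif DEMO_CHAR_BITS == 16
typedef uint16_t demo_char;
#elif DEMO_CHAR_BITS == 32
typedef uint32_t demo_char;
#else
#error "DEMO_CHAR_BITS must be 8, 16 or 32"
#endif

/* Two-level paste so DEMO_CHAR_BITS is expanded before ## is applied. */
#define DEMO_PASTE2(a, b) a##b
#define DEMO_PASTE(a, b)  DEMO_PASTE2(a, b)
#define DEMO_NAME(base)   DEMO_PASTE(base, DEMO_CHAR_BITS)

/* Becomes demo_reverse8, demo_reverse16 or demo_reverse32. */
void DEMO_NAME(demo_reverse)(demo_char *s, size_t len)
{
    size_t i = 0, j = len;

    assert(s != NULL || len == 0);
    while (i + 1 < j) {
        demo_char tmp;
        --j;
        tmp = s[i];   /* whole units are moved, no byte-wise copying */
        s[i] = s[j];
        s[j] = tmp;
        ++i;
    }
}
```

Assuming the file is called demo.c, it would be built once per width,
for example with "cc -DDEMO_CHAR_BITS=8 -c demo.c -o demo8.o" and
likewise for 16 and 32, and the three objects then go into one
library. Each exported function takes a pointer to its own unit type,
so callers need no casts and the compiler still checks the types.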

No auto selecting the function based on type, true, but that's actually 
a C limitation rather than a design limitation. Using C++ you could just 
as easily define three overloaded functions and call log2vis8, log2vis16 
and log2vis32 from each function. Get this, however. NO CAST!!! You 
don't have to cast anything to anything, and you get full static typing.

If you want, you can give me CVS access, and I'll create a branch and
try to implement what I'm talking about there. I'm a bit afraid of doing
too much development on my personal machine only, as I just suffered
a HD crash that wiped my entire installation, and the only reason I have
a backup of the demo I showed on the list is because I have the list
archives.

As usual, opinions etc. are welcome.

                Shachar

P.S.
How DO you pronounce "Behdad"?