[HarfBuzz] Don't render control characters?

Behdad Esfahbod behdad at behdad.org
Thu Sep 25 09:30:43 PDT 2014


Thanks James and Jonathan for taking care of this on the CSS side.
The working group resolved to change this so that Cc characters (other than
HT, LF, CR) are displayed:

  http://log.csswg.org/irc.w3.org/css/2014-09-08/#e469835
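
For illustration only, here is a rough Python sketch of the rule as I read the
resolution: a GC=Cc character is displayed unless it is HT, LF, or CR. The
function name and the constant are hypothetical; this is not HarfBuzz or CSS
spec code, just the classification spelled out.

```python
import unicodedata

# HT, LF, CR: the controls that white-space processing handles itself.
# (Set name and function name are made up for this sketch.)
WHITE_SPACE_CONTROLS = {0x09, 0x0A, 0x0D}


def is_displayed_control(ch: str) -> bool:
    """Return True if ch is a GC=Cc character that should be rendered visibly."""
    is_control = unicodedata.category(ch) == "Cc"
    return is_control and ord(ch) not in WHITE_SPACE_CONTROLS
```

Under this sketch VT (0xB), FF (0xC) and NEL (0x85) count as displayed
controls, while HT, LF and CR do not.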

On 14-03-20 03:19 AM, James Clark wrote:
> On Thu, Mar 20, 2014 at 6:04 AM, Behdad Esfahbod <behdad at behdad.org> wrote:
> 
> 
>     Also, Unicode says GC=Cc should just render as boxed if not supported.
> 
> 
> However, it also says that characters with the White_Space property set to
> true should be rendered as space.  In addition to 0x9, 0xA and 0xD (which both
> CSS and HTML treat as white space), these are 0xB (VT), 0xC (FF), and 0x85
> (NEL).
> 
>     The
>     reason we want them removed here is really an artifact of the HTML spec.
> 
> 
> The requirement of ignoring all GC=Cc characters seems to be an artifact of
> the CSS3 Text WD (http://www.w3.org/TR/css-text-3/#white-space-processing),
> which is not yet set in stone.  Note that it's different from CSS2.1
> (http://www.w3.org/TR/CSS2/text.html#ctrlchars) which says that they render as
> usual.
> 
> The CSS3 text behaviour seems like a bad idea to me, because
> 
> a) it conflicts with Unicode, and
> b) legacy Windows encodings use C1 code points (in the range 0x80 - 0x9F) for
> real characters; if a page using e.g. Windows-1252 encoding is mislabelled as
> ISO-8859-1 (which can definitely happen) then all the code points in this
> range would be silently ignored rather than showing up as boxes.
> 
>     WDYT?
> 
> 
> I think the default should be to do what Unicode says.  Also ask the CSS3 text
> folks why they are proposing this handling of Cc.
> 
> James
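
To make James's point (b) concrete, here is a small Python sketch, assuming a
page whose bytes are really Windows-1252 but which is decoded as ISO-8859-1.
The sample string and variable names are made up for illustration.

```python
import unicodedata

# Curly quotes are real characters in Windows-1252 (bytes 0x93 and 0x94).
raw = "\u201cquoted\u201d".encode("cp1252")

# Decoding the same bytes as ISO-8859-1 maps them onto C1 controls instead.
mislabelled = raw.decode("iso-8859-1")

# Collect the code points that now have General_Category Cc.
c1_controls = [
    f"U+{ord(ch):04X}" for ch in mislabelled if unicodedata.category(ch) == "Cc"
]
```

Here the two quote marks come out as U+0093 and U+0094, both GC=Cc, so a rule
that drops all Cc characters would make them disappear without a trace.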

-- 
behdad
http://behdad.org/

