[LibreOffice][svg import filter] handling text
Marco Cecchetti
mrcekets at gmail.com
Thu May 3 13:17:44 PDT 2012
Hi everyone.
Some thoughts on the current svg text import implementation.
(1)
The ShapeWritingVisitor does not handle the <tspan> element.
In fact only <text> elements are mapped to the XML_TEXT id;
the id for <tspan> elements is XML_TSPAN, and no such case
is present in ShapeWritingVisitor::operator().
(2)
Only 'x' and 'y' attributes whose value is a single coordinate
are handled, whilst the value of such attributes can be a list
of coordinates where the n-th coordinate pair represents the
position at which the n-th character included in the given <text>
or <tspan> element has to be placed.
Reference: http://www.w3.org/TR/SVG11/text.html#TextElementXAttribute
A draft of a *possible* solution:
(1)
Implement an ad-hoc visitor to be applied to the svg DOM tree
before any other visitor in order to "normalize" text elements.
After normalization a <text> or <tspan> element that owns
a TEXT_NODE (that is an inter-tag character sequence) does not
own any ELEMENT_NODE.
So for example:
<text>svg<tspan>import</tspan>filter</text>
should be transformed into:
<text>
  <tspan>svg</tspan>
  <tspan>import</tspan>
  <tspan>filter</tspan>
</text>
and:
<text x="10, 20, 30" y="5, 15">HELLO</text>
should be transformed into:
<text>
  <tspan x="10" y="5">H</tspan>
  <tspan x="20" y="15">E</tspan>
  <tspan x="30">LLO</tspan>
</text>
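
To make the two transformations above concrete, here is a minimal sketch
in Python (chosen only for brevity; the real filter is C++). It uses the
standard xml.etree.ElementTree module; the function names wrap_text_nodes
and split_by_position are hypothetical, tags are assumed to carry no
namespace, and whitespace-only character runs are simply dropped, which
is a simplification of the SVG white-space rules.

```python
import re
import xml.etree.ElementTree as ET


def wrap_text_nodes(elem):
    """Wrap each bare character run of a mixed-content element in a <tspan>."""
    children = list(elem)
    if not children:
        return  # the element owns only a text node: nothing to do
    new_children = []
    if elem.text and elem.text.strip():
        span = ET.Element("tspan")
        span.text = elem.text
        new_children.append(span)
    elem.text = None
    for child in children:
        wrap_text_nodes(child)  # nested <tspan> elements are normalized too
        tail, child.tail = child.tail, None
        new_children.append(child)
        if tail and tail.strip():
            span = ET.Element("tspan")
            span.text = tail
            new_children.append(span)
    elem[:] = new_children


def parse_list(value):
    """Split a comma- and/or whitespace-separated coordinate list."""
    return [v for v in re.split(r"[,\s]+", value.strip()) if v]


def split_by_position(elem):
    """Split a leaf element with 'x'/'y' lists into one <tspan> per position."""
    xs = parse_list(elem.get("x", ""))
    ys = parse_list(elem.get("y", ""))
    chars = elem.text or ""
    runs = min(max(len(xs), len(ys)), len(chars))
    if runs < 2:
        return  # a single position (or none): nothing to split
    elem.attrib.pop("x", None)
    elem.attrib.pop("y", None)
    elem.text = None
    for i in range(runs):
        span = ET.SubElement(elem, "tspan")
        if i < len(xs):
            span.set("x", xs[i])
        if i < len(ys):
            span.set("y", ys[i])
        # the last run keeps all the remaining characters
        span.text = chars[i] if i < runs - 1 else chars[i:]
```

A real normalizing visitor would also have to decide how nested lists
interact (a <tspan> with its own list inside a <text> with a list); the
sketch leaves that open.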
(2)
Add two new members to the AnnotateVisitor:
mnTextCurrentXPos, mnTextCurrentYPos.
After setting up all style properties the AnnotateVisitor
should perform something like the following pseudo-code:
if( Element is <text> or <tspan> )
{
    if( Element has 'x' attribute )
        mnTextCurrentXPos = value of 'x';
    if( Element has TEXT_NODE )
    {
        // text elements that do not have a TEXT_NODE are just
        // containers providing style; we do not need to handle
        // them further
        // the 'x' attribute will be added if not present
        set the value of the 'x' attribute to mnTextCurrentXPos;
        aText = extract text from Element;
        // compute the text width using
        // the current text style
        width = computeTextWidth( aText, aCurrentState );
        mnTextCurrentXPos += width;
    }
    // do the same for the y attribute
}
Moreover, each time a root text element starts,
the values of mnTextCurrentXPos and mnTextCurrentYPos
should be reset to zero.
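
As a rough illustration of the running-position idea, here is a small
Python sketch operating on an already normalized tree. Everything in it
is an assumption made for the example: estimate_text_width is a
hypothetical stand-in for computeTextWidth that pretends every glyph
advances half an em, 'x' values are taken to be single unit-less
numbers, and tags carry no namespace. Only 'x' is handled.

```python
import xml.etree.ElementTree as ET


def estimate_text_width(text, font_size):
    """Hypothetical width model: every glyph advances 0.5 em."""
    return 0.5 * font_size * len(text)


def annotate_positions(root_text, font_size=10.0):
    """Write an explicit 'x' on every text-owning element of a <text> tree.

    Returns the text position reached after the last character run.
    """
    current_x = 0.0  # reset each time a root <text> element starts
    for elem in root_text.iter():  # document order, root element first
        if elem.tag not in ("text", "tspan"):
            continue
        if "x" in elem.attrib:
            current_x = float(elem.get("x"))
        if elem.text and elem.text.strip():
            # the 'x' attribute is added if not present
            elem.set("x", format(current_x, "g"))
            current_x += estimate_text_width(elem.text, font_size)
    return current_x
```

With <text x="10"> containing the runs "svg", "import" and a third
<tspan x="100">, the first two runs start at 10 and 25 under this width
model, and the third restarts at 100.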
The above implementation follows what all browsers currently
do when rendering svg text: if a <tspan> element does not
specify an 'x' attribute, the current text position is used,
that is, the one derived from the last seen 'x' attribute,
not the parent's one.
However, the standard says something different:
<< If the attribute is not specified: (a) if an ancestor ‘text’
or ‘tspan’ element specifies an absolute X coordinate for
a given character via an ‘x’ attribute, then that absolute X
coordinate is used (nearest ancestor has precedence),
else (b) the starting X coordinate for rendering the glyphs
corresponding to a given character is the X coordinate of
the resulting current text position from the most recently
rendered glyph for the current ‘text’ element.>>
Computation of text width and height should take into account
the values of the text style attributes.
The real problem is: how should such computations be performed?
Note that, to avoid making things even more complex, I have
ignored the dx, dy and rotate attributes, and transformations too.
(3)
XML_TEXT and XML_TSPAN should be handled by the same case:
if the element owns a TEXT_NODE (and so no ELEMENT_NODE after
normalization) the text is extracted and an ODF text element
is created;
if the element has only ELEMENT_NODEs (that is, one or more
<tspan> elements) it should be handled like a <g> element.
The visitElements routine will be responsible for iterating
over the children (<tspan> elements).
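
The shared case could look roughly like the Python sketch below. It is
not the ShapeWritingVisitor code: collect_text_runs is a hypothetical
name, the (x, y, string) tuples merely stand in for the ODF text
elements that would be created, and the tree is assumed to be
normalized and free of namespaces.

```python
import xml.etree.ElementTree as ET


def collect_text_runs(elem, runs=None):
    """Handle <text> and <tspan> alike on a normalized tree."""
    if runs is None:
        runs = []
    if elem.tag in ("text", "tspan"):
        if len(elem) == 0:
            # leaf: owns a TEXT_NODE only, so one text element is emitted
            if elem.text:
                runs.append((elem.get("x"), elem.get("y"), elem.text))
        else:
            # container: owns ELEMENT_NODEs only, treated like a <g>
            for child in elem:
                collect_text_runs(child, runs)
    return runs
```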
Well, this certainly lacks a lot of details and I have not taken
several issues into account; anyway, I think it can be regarded
as a starting point.
Cheers,
-- Marco