Using a real parser generator to parse numbers (and dates)

Lionel Elie Mamane lionel at mamane.lu
Thu Mar 1 09:25:48 PST 2012


On Thu, Mar 01, 2012 at 12:14:28PM +0100, Stephan Bergmann wrote:
> On 02/29/2012 06:41 PM, Lionel Elie Mamane wrote:
>> On Wed, Feb 29, 2012 at 02:57:09PM +0100, Stephan Bergmann wrote:

>>> Note that the stable sal interface historically stays clear of
>>> boost, because of differences in the various boost versions
>>> available in the various environments.


>> 1) Use of boost in rtl/ustring.hxx

>>    So here's the patch for a boost-free OUString, with manually
>>    implemented const_iterator.

> Which is a bit of a pain, for sure.  Would it be an option to
> extract const_iterator and begin/end from OUString to outside the
> URE interface (see below) and use some sort of adapter around
> OUString in those places that expect it to support begin/end?

On Thu, Mar 01, 2012 at 12:40:44PM +0100, Michael Stahl wrote:

> see also stlunosequence.hxx:

> http://opengrok.libreoffice.org/xref/core/comphelper/inc/comphelper/stlunosequence.hxx

We can't follow exactly the same technique as in
comphelper/inc/comphelper/stlunosequence.hxx, because spirit calls
s.begin() and s.end() when s is a string, and uses the type
T::const_iterator where T is the type of s.

The file lo_traits.hxx in my patch teaches spirit that rtl::OUString
is a string type:

    template <>
    struct is_string<rtl::OUString> : mpl::true_ {};

And that the corresponding character type is sal_uInt32 (so that
spirit sees Unicode code points, not UTF-16 code units):

    template <>
    struct char_type_of<rtl::OUString> : mpl::identity<sal_uInt32> {};
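
As a rough illustration of the trait mechanism only (this is not the
actual lo_traits.hxx or spirit code), here is a self-contained sketch.
FakeOUString and treatedAsString are hypothetical names, and the
standard library's std::true_type stands in for mpl::true_:

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>

namespace traits_demo
{
    // Hypothetical stand-in for rtl::OUString: just holds UTF-16 code units.
    struct FakeOUString { std::u16string units; };

    // Primary templates: by default a type is not a string and has no
    // associated character type.
    template <typename T> struct is_string : std::false_type {};
    template <typename T> struct char_type_of {};

    // Specialisations that "teach" generic code about FakeOUString.
    template <> struct is_string<FakeOUString> : std::true_type {};
    template <> struct char_type_of<FakeOUString> { typedef std::uint32_t type; };

    // Generic code consults the trait instead of knowing the concrete type.
    template <typename T>
    bool treatedAsString(const T&) { return is_string<T>::value; }
}
```

Generic code that only looks at is_string<T> and char_type_of<T> then
accepts the new type without being modified itself.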


Maybe something like:

namespace comphelper
{
    class OUString : public rtl::OUString
    {
    public:
        typedef boost::u16_to_u32_iterator<const sal_Unicode*> const_iterator;
        // Automatic conversion from rtl::OUString to
        // comphelper::OUString for ease of coding
        OUString(const rtl::OUString& s) : rtl::OUString(s) {}
        const_iterator begin() const { return const_iterator(getStr()); }
        const_iterator end()   const { return const_iterator(getStr() + getLength()); }
    };
}

And then use comphelper::OUString to pass strings to spirit. Sounds
like a reasonable idea, and maybe the best so far. (Maybe also rely
less on namespaces and call it OUStringSTL or some such.)
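
To make the adapter idea concrete without depending on boost or sal,
here is a minimal sketch under stated assumptions: std::u16string
stands in for rtl::OUString, CodePointIterator is a hand-written
stand-in for boost::u16_to_u32_iterator, and StringSTL and
collectCodePoints are hypothetical names. It needs C++11 for char16_t.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Minimal iterator (only what the loop in collectCodePoints needs):
// walks UTF-16 code units and yields one code point per step.
class CodePointIterator
{
public:
    CodePointIterator(const char16_t* pos, const char16_t* end)
        : pos_(pos), end_(end) {}

    std::uint32_t operator*() const
    {
        if (width() == 2)  // combine a surrogate pair into one code point
            return 0x10000u + ((std::uint32_t(pos_[0]) - 0xD800u) << 10)
                            + (std::uint32_t(pos_[1]) - 0xDC00u);
        return pos_[0];    // BMP character (or unpaired surrogate, passed through)
    }
    CodePointIterator& operator++() { pos_ += width(); return *this; }
    bool operator==(const CodePointIterator& o) const { return pos_ == o.pos_; }
    bool operator!=(const CodePointIterator& o) const { return pos_ != o.pos_; }

private:
    // 2 if pos_ starts a well-formed surrogate pair, otherwise 1.
    std::ptrdiff_t width() const
    {
        const bool pair = end_ - pos_ >= 2
            && pos_[0] >= 0xD800 && pos_[0] <= 0xDBFF
            && pos_[1] >= 0xDC00 && pos_[1] <= 0xDFFF;
        return pair ? 2 : 1;
    }
    const char16_t* pos_;
    const char16_t* end_;
};

// Adapter exposing const_iterator, begin() and end() over code points.
class StringSTL
{
public:
    typedef CodePointIterator const_iterator;
    StringSTL(const std::u16string& s) : str_(s) {}  // implicit on purpose
    const_iterator begin() const { return const_iterator(str_.data(), limit()); }
    const_iterator end()   const { return const_iterator(limit(), limit()); }
private:
    const char16_t* limit() const { return str_.data() + str_.size(); }
    std::u16string str_;
};

// Generic consumer that, like the spirit code described earlier, relies
// on T::const_iterator, s.begin() and s.end().
template <typename T>
std::vector<std::uint32_t> collectCodePoints(const T& s)
{
    std::vector<std::uint32_t> out;
    for (typename T::const_iterator it = s.begin(); it != s.end(); ++it)
        out.push_back(*it);
    return out;
}
```

The consumer never sees the surrogate pair: a string of three
characters, one of them outside the BMP, yields three code points
although it holds four UTF-16 code units.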

Theoretically, another possibility is to change spirit so that it does
not hardcode s.begin(), but uses
 boost::spirit::traits::get_begin<char_type>(s)
everywhere. This is a template for which we can then provide an
overload for OUString in lo_traits.hxx:

    template <typename T>
    inline const boost::u16_to_u32_iterator<const sal_Unicode*> get_begin(rtl::OUString const& str)
    { return boost::u16_to_u32_iterator<const sal_Unicode*>(str.getStr()); }

boost::spirit::traits::get_begin is already used in some places in
spirit, but not everywhere; I'm not sure why that is. At first
contact, the spirit developers seem quite open to the idea of doing
that, but it means that we would need a patched boost until boost 1.50
comes out, and then we would require the bleeding-edge boost
1.50. Which would also be "a bit of a pain", albeit a temporary pain
(until boost 1.50 is "old").
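
The general pattern (generic code calling a free get_begin/get_end
instead of the member functions) can be sketched as follows. This is
not spirit's actual implementation: the namespace traits_sketch, the
class PlainString and the function countUnits are hypothetical, and
for brevity the sketch iterates over UTF-16 code units rather than
code points.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace traits_sketch
{
    // Generic fallback: works for any type that has const_iterator,
    // begin() and end(). It drops out of overload resolution (SFINAE)
    // for types without a nested const_iterator.
    template <typename Char, typename Str>
    typename Str::const_iterator get_begin(const Str& s) { return s.begin(); }
    template <typename Char, typename Str>
    typename Str::const_iterator get_end(const Str& s) { return s.end(); }

    // Hypothetical string class without begin()/end(), standing in for
    // rtl::OUString.
    class PlainString
    {
    public:
        PlainString(const char16_t* p, std::size_t n) : p_(p), n_(n) {}
        const char16_t* getStr() const { return p_; }
        std::size_t getLength() const { return n_; }
    private:
        const char16_t* p_;
        std::size_t n_;
    };

    // Overloads for PlainString; the string class itself stays untouched.
    template <typename Char>
    const char16_t* get_begin(const PlainString& s) { return s.getStr(); }
    template <typename Char>
    const char16_t* get_end(const PlainString& s) { return s.getStr() + s.getLength(); }

    // Generic code that never calls s.begin() directly.
    template <typename Char, typename Str>
    std::size_t countUnits(const Str& s)
    {
        std::size_t n = 0;
        for (auto it = get_begin<Char>(s); it != get_end<Char>(s); ++it)
            ++n;
        return n;
    }
}
```

With this shape the string class needs no begin()/end() members at
all, which is what would keep the URE interface free of iterator
code.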

-- 
Lionel

