[PUSHED] Add Switches to find-german-comments to aid in weeding false positives
vmiklos at suse.cz
Wed Mar 14 03:06:45 PDT 2012
On Wed, Mar 14, 2012 at 09:30:21AM +0000, Michael Meeks <michael.meeks at suse.com> wrote:
> It's a tad unclear to me why hexadecimal is flagged as German :-) I'd
> prefer to grok & fix that really:
> $ bin/find-german-comments cppuhelper
> cppuhelper/source/unourl.cxx:264: c != 0x2F && c != 0x3A && c != 0x3D
> Still - perhaps it's some nightclub or something
> "Können Sie mir bitte sagen wo der 0x2F ist ?"
> /me 's German is of course fuzzed by a 20 year break but ...
> etc. ;-)
The language guesser is simply based on statistics, so it has no idea
what a hex number is. Of course we could simply filter out hex numbers,
but as the above example says, that would hide real German comments as
well. I like Tom's approach - this way one can simply take the files with
many German comments first, then review the rest later.
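
To illustrate the "files with many German comments first" workflow, here is a minimal Python sketch. It is not the actual code of the script: the function name files_over_threshold, the flagged_lines mapping and the minimum parameter are all hypothetical, and it assumes the guesser has already produced a list of flagged line numbers per file.

```python
def files_over_threshold(flagged_lines, minimum):
    """Return the paths with at least `minimum` flagged lines, most-flagged first.

    `flagged_lines` is a hypothetical mapping of file path -> list of line
    numbers that the language guesser reported as German.
    """
    counts = {path: len(lines) for path, lines in flagged_lines.items()}
    selected = [path for path, n in counts.items() if n >= minimum]
    # Sort by descending count, then by path so the order is stable.
    return sorted(selected, key=lambda path: (-counts[path], path))


if __name__ == "__main__":
    # Made-up input: one file with a single (likely false-positive) hit,
    # two files with several hits.
    example = {"a.cxx": [1, 2, 3], "b.cxx": [264], "c.cxx": [5, 9]}
    for path in files_over_threshold(example, 2):
        print(path)
```

With a threshold of 2, a file with a single hit (such as the hex line above) is set aside for the later review pass rather than dropped for good.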
Once we have a module that is surely German-free, I think we can
blacklist its name in the script.
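
A rough sketch of what such a blacklist could look like, again in Python. The names GERMAN_FREE_MODULES and is_blacklisted are hypothetical, the module names are placeholders, and it assumes paths are given relative to the source root so that the first path component is the module name.

```python
import os

# Placeholder names only; a real list would hold modules that have
# actually been reviewed and found free of German comments.
GERMAN_FREE_MODULES = {"module_a", "module_b"}


def is_blacklisted(path, blacklist=GERMAN_FREE_MODULES):
    """Return True if the top-level directory of `path` is a blacklisted module."""
    # normpath strips a leading "./" and collapses duplicate separators.
    normalized = os.path.normpath(path)
    top_level = normalized.split(os.sep)[0]
    return top_level in blacklist
```

The scan loop would then skip any file for which is_blacklisted() returns True.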