thesaurus.dic Workday 1, background

Martin Srebotnjak miles at filmsi.net
Thu Jun 29 07:18:33 UTC 2023


Hi,

please do not remove the Slovenian thesaurus just for the sake of
cleaning up folders/files; please act according to László's instructions.

Best regards, m.


On Thu, 29 Jun 2023 at 07:28, Alex <taosubmarines at mail.ru> wrote:

> Hi
>
> I have no affiliation with GSoC or any other code program.
>
> I am looking at the thesaurus files (as «I am not a coder»), with a view
> to providing an updated technical.dic thesaurus with many new terms and «a
> clear upgrade/merge/integrate path from an external data source»
> (Wikipedia).
> It has 603 lines of references as opposed to the current 378 lines, and
> each term is (possibly) searchable directly on Wikipedia. Then maybe I will
> look at a concept/design for integrating (hypothetical) web search into the
> XML (help, thesaurus) interface: right-click on an item in help or in
> Bayram Cicek’s search interface and you get an option to search the
> internet or a Wikipedia article… (as a concept).
>
> Anyway, I deleted the two foreign-language thesaurus files hu_AkH11.dic
> and sl.dic after looking at the bug reports mentioned below, but I quickly
> realized the .dic files were needed when compiling. Blank placeholders
> (empty text files) work too, though.
>
> I have edited code references to remove hu_AkH11.dic, and it compiles OK
> without even a placeholder (empty) file.
> Aside from Linguistic.backup..xcs, line 217, where else is hu_AkH11.dic
> referenced? I looked in the references below and I believe that it is
> bloat. I will try compiling HU language support if people think it is
> useful.
> I would expect then that even with HU language and dictionary support
> installed, the (original) hu_AkH11.dic thesaurus will not exist or be
> called for. I don’t speak Hungarian though.
>
> sl.dic is integrated into the unit test, I can see. Not touching it today.
> I will put the text back into the placeholder file for my next build.
>
> «en_US or other language builds get these files unnecessarily, the only
> task is fixing our packaging.»   OK, how can I help with packaging?
>
> Laszlo, do you have a local repo for your lo code, the en_US spelling
> dictionary? Your language code is different to this specific technical.dic
> thesaurus, yes?
>
> Thanks
>
> Alex Tao
> Tao Submarines and Systems
> Chios, Aegean Sea
>
>
> Thursday, June 29, 2023 1:44 AM +03:00 from Németh László
> <nemeth at numbertext.org>:
>
> Hi,
>
> Andras Timar <timar74 at gmail.com> wrote (on Wed, 28 Jun 2023, 17:55):
>
> Hi Alex,
>
> On Wed, Jun 28, 2023 at 5:15 PM Alex <taosubmarines at mail.ru> wrote:
>
> Hi everyone
>
> Today I am trying to determine how to remove two unwanted wordbook files
> from libreoffice/extras/source/wordbook:
> hu_AkH11.dic and sl.dic.
> These foreign-language (incomplete) dics should be removed, unless they
> are used in some unit test.
> See bugs 139961, 68576, etc.
>
> Can they be removed? OK?
>
>
>
> I'm not sure if it's OK. We added these dictionaries for a reason. It's
> better to ask the maintainers first (I CC'd them).
> From a technical point of view, if you remove the files from the source,
> and all references to them, the build should pass. You may need a clean
> build from scratch. Use the "git grep sl.dic" and "git grep hu_AkH11.dic"
> commands; they are more reliable than OpenGrok.
>
>
> You can remove hu_AkH11.dic with the following git command:
>
> $ git revert 6247c966942a0e43320a234302a67c1f92c2eea7
>
> Because this was added with that commit:
>
> $ git log libreoffice/extras/source/wordbook/hu_AkH11.dic
> commit 6247c966942a0e43320a234302a67c1f92c2eea7
>
> But these are not unwanted dictionaries, as András wrote.
>
> In theory, they are packaged only with their language builds, sl-SI and
> hu-HU. If not, i.e. if en_US or other language builds get these files
> unnecessarily, the only task is fixing our packaging. If the packaging
> problem is specific to some Linux distributions, I believe our task is only
> to report that in their bug trackers.
>
> Is this a GSoC project? I haven't found information about the planned
> improvement of the (en_US?) thesaurus or the thesaurus code base.
> (By the way, I had an interesting improvement here: English stemming and
> affixation during thesaurus usage by adding extra language data to the
> en_US spelling dictionary. Unfortunately, by accident this was removed by
> the recent maintainer.)
>
> Best regards,
> László
>
>
>
>
> Best regards,
> Andras
>
>
>
>
>