Support for the Apache Parquet file format
Kohei Yoshida
kohei at libreoffice.org
Fri Nov 22 12:42:35 UTC 2019
On 22.11.2019 02:37, Shadi Akiki wrote:
> I'm wondering why Parquet is not yet a supported format in LibreOffice
> Calc (and most desktop worksheet processing tools for that matter).
Well, one reason may be that nobody had asked for it yet! On that note,
asking about it and raising awareness (which you did) is a necessary
first step.
Also, it would be nice to know what benefits this format brings that the
existing formats do not. I use pandas occasionally, and I work with
people who use it on a regular basis, but I had not heard this file
format mentioned in our conversations until now.
Is this page
https://github.com/apache/parquet-format
the best place to learn about the specifics of this file format, or is
there any other page that provides more details?
One way we can add support for a new file format such as this one to
Calc is to add it to the orcus library [1], which Calc uses internally
to handle a subset of file formats. That may be a much easier route
than adding it to the LibreOffice code base directly.
Full disclosure: I do maintain this library.
Kohei
[1] https://gitlab.com/orcus/orcus
--
Kohei Yoshida, LibreOffice Calc volunteer hacker