Support for the Apache Parquet file format

Kohei Yoshida kohei at libreoffice.org
Fri Nov 22 12:42:35 UTC 2019


On 22.11.2019 02:37, Shadi Akiki wrote:

> I'm wondering why Parquet is not yet a supported format in LibreOffice
> Calc (and most desktop worksheet processing tools for that matter).

Well, one reason may be that nobody has asked for it yet!  On that note,
asking about it and raising awareness (which you did) is a necessary
first step.

Also, it would be nice to know what benefits this format brings that
the existing formats do not.  I use pandas occasionally
and I do work with people who use it on a regular basis, but I have not
heard this file format mentioned in our conversations to this day.

Is this page

https://github.com/apache/parquet-format

the best place to learn about the specifics of this file format, or is
there any other page that provides more details?

One way we can add support for a new file format such as this one to
Calc is to add it to the orcus library [1], which Calc uses internally
to handle a subset of file formats.  That may well be a much
easier route than adding it to the LibreOffice code base directly.
Full disclosure: I maintain this library.

Kohei

[1] https://gitlab.com/orcus/orcus

-- 
Kohei Yoshida, LibreOffice Calc volunteer hacker

