[Libreoffice-ux-advise] Pivot Table data provider extension framework (removal possibility)

Thu Mar 14 06:26:55 PDT 2013

Hi Eike,

Thanks for your reply.

On Wed, Mar 13, 2013 at 4:48 PM, Eike Rathke <erack at redhat.com> wrote:
> Hi Kohei,
>
> On Tuesday, 2013-03-12 11:41:32 -0400, Kohei Yoshida wrote:
>
>> I'd like to ask whether someone actually uses this Pivot Table data
>> provider extension framework, because I'd like to remove this if
>> nobody is using it, or only few people are using it.
>
> From what I remember that can be used to populate pivot tables with data
> obtained from external resources like databases. Unfortunately you'll
> hardly find such extensions in the wild but more within enterprises and
> corporate users, so determining whether it's actually used or not is
> nearly impossible unless someone knows who those customers are.

Understood. I imagined it would be used only in such enterprise
settings, by someone with enough resources to develop the major part
of the pivot engine as an extension.

>
>> I believe the same functionality can be achieved via database
>> connectivity, by having such external data provider register as a
>> database, and use it to act as a data provider for pivot tables.
>> So, I don't see a reason why we need to keep this as a separate data
>> source category.
>
> IMHO the advantage of the data provider is that the actual data does not
> have to reside in the spreadsheet, allowing for massive amounts of data
> records but providing only the information necessary for the pivot
> table. This maybe could be accomplished as well using a registered data
> source, but currently we have no means to pull the data without actually
> storing it in the spreadsheet for further processing. Or isn't that the
> case?

Well, that would depend on what you actually mean by "storing (the
data) in the spreadsheet". When pulling data via database
connectivity, we don't actually copy the data into the spreadsheet
document, but generate the pivot table output directly from it. But we
*do* first populate the pivot cache from the database internally, so a
copy of the data will sit in memory while the document is open.  I
consider the pivot cache part an implementation detail, so I'm not
sure if that's what you meant by "storing it in the spreadsheet"...
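
As a rough illustration of that distinction (a sketch only, not Calc's
actual code), the snippet below pulls rows from a database into an
in-memory cache and builds the pivot output from that cache, without
ever writing the source rows into sheet cells. The table, the column
names, and the helpers build_pivot_cache() and pivot_sum() are all
hypothetical, and sqlite3 merely stands in for a registered data
source.

```python
import sqlite3
from collections import defaultdict


def build_pivot_cache(conn, query):
    """Fetch every source row into memory; this list plays the role of the cache."""
    cur = conn.execute(query)
    fields = [col[0] for col in cur.description]
    rows = cur.fetchall()
    return fields, rows


def pivot_sum(fields, rows, row_field, data_field):
    """Aggregate the cached rows: sum data_field for each value of row_field."""
    row_idx = fields.index(row_field)
    data_idx = fields.index(data_field)
    totals = defaultdict(float)
    for row in rows:
        totals[row[row_idx]] += row[data_idx]
    return dict(totals)


# Hypothetical external data source (an in-memory SQLite database).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 10.0), ("West", 5.0), ("East", 2.5)],
)

fields, cache = build_pivot_cache(conn, "SELECT region, amount FROM sales")
result = pivot_sum(fields, cache, "region", "amount")
```

Only the aggregated result would end up in the sheet; the cache holds
a full copy of the source rows in memory for as long as it is alive.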

>> The way it is currently implemented also makes it *extremely*
>> difficult for us to optimize the pivot table engine, because all its
>> functionality has to go through the UNO API which forces us to do
>> data conversion *twice* for every single transaction.  That's very
>> very expensive especially as the data size grows (and it always
>> does).
>
> Seconded.

And to me this is a considerable obstacle to further speeding up the
engine and reducing its memory usage.
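
To make the cost concrete, here is a minimal sketch of what converting
twice per value looks like. It assumes the two conversions are
"internal value to a generic API value" and "generic API value back to
an internal value"; that reading, the Boxed class, and the helper
names are hypothetical and do not mirror the real UNO types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Boxed:
    """Hypothetical stand-in for a generic value passed across an API boundary."""
    type_name: str
    payload: str


def to_boxed(value: float) -> Boxed:
    # First conversion: internal representation -> generic API value.
    return Boxed("double", repr(value))


def from_boxed(boxed: Boxed) -> float:
    # Second conversion: generic API value -> internal representation.
    return float(boxed.payload)


def sum_direct(values):
    """Work on the internal representation directly: no conversions."""
    return sum(values)


def sum_via_boundary(values):
    """Route every value through the boundary and count the conversions."""
    conversions = 0
    total = 0.0
    for value in values:
        boxed = to_boxed(value)
        conversions += 1
        total += from_boxed(boxed)
        conversions += 1
    return total, conversions


values = [1.0, 2.0, 3.5]
direct_total = sum_direct(values)
boundary_total, conversion_count = sum_via_boundary(values)
```

Both paths produce the same total, but the boundary path performs two
conversions per value, so the overhead grows linearly with the size of
the data.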

>> So, I'd *love* to get rid of this sooner rather than later, and I'd
>> like to know whether there are people who would absolutely need this
>> functionality, and if so why.  As I said above, I believe the same
>> functionality could be achieved via the database connectivity
>> backend even if we remove the extension backend.
>
> I think there work needs to be done to pull the data and provide it in
> a form that pivot tables can actually process. It may be viable, but I'm
> really not familiar with pivot table topics.

Yes. So, anyone who currently uses this data provider extension
backend would have to change the way the data is connected to Calc's
pivot table, which will require *some* work. But, considering that
developing such an extension requires non-trivial resources (it's
almost half of the whole pivot table engine), I would imagine they
could spare a bit of their resources to set up database connectivity
to achieve what they need...  At least that's what I'm hoping.

Kohei