<div dir="ltr">Hello Manuj,<br><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Mar 7, 2018 at 6:48 PM, Manuj Vashist <span dir="ltr"><<a href="mailto:manujvashist@gmail.com" target="_blank">manujvashist@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div><div><div>Hello everyone,<br><br></div>I am a sophomore student persuing B.E. In Birla Institute of Technology & Science Pilani.<br><br></div>I am exploring the LO code base since January and have merged a couple of easyHacks too :)<br><br></div>I would like to work on the project idea " <a href="https://wiki.documentfoundation.org/Development/GSoC/Ideas#Implement_interface_for_external_data_source_import_into_Calc" target="_blank">Implement interface for external data source import into Calc</a> " in this summers as GSoC student.<br></div><div><div><br></div><div>As the currently available dialog imports data from other csv files and html web <a href="http://pages.as" target="_blank">pages.as</a> the project is about extending the existing data providers and data transformation.<br></div><div>I can think of data providers like a sql table that can be included to it,please give some more information on what kind of data transformation is referred here.<br><br></div><div>Also there are two dialogs doing the same thing here link to external data dialog and data provider dialog, what are the use case of having two diff dialogs? can't both be merged together?<br></div><div>A bit more info on project will be helpful.<br></div><div><br></div></div></div><br></blockquote></div></div><div class="gmail_extra"><br></div><div class="gmail_extra"><div class="gmail_extra">The idea is similar to PowerQuery for Excel but
with a more limited focus. As a simple example, take last week's stock
data that is published on some website and that we would like to
integrate into our spreadsheet. Currently this happens by downloading
the CSV file (hopefully the data is in CSV or another spreadsheet
format) and either copying the data or using the link to external data
feature. Neither approach handles updating or transforming the data
very well.</div><div class="gmail_extra"><br></div><div class="gmail_extra">The
idea now is to take all the different ways that we have to import
external data (link to external data, XML source, data streaming) and
combine them in one common feature. To make working with the external
data easier, we also want to be able to apply simple transformations to
the data before importing it (like deleting a column, applying a
filter, sanitizing data, ...). The concept that I have already started
on is to have a second, hidden document with a sheet into which we
import the data. We apply the transformations there and finally copy
the data from the hidden document to the final document. Currently the
data is always imported into a database range (Data->Define) that stores
the range of the imported data.<br></div><div class="gmail_extra"><br></div><div class="gmail_extra">You
can already test some parts of the feature in current master by
enabling experimental features and then going to Data->Data Provider.
The implementation right now does most of the work in its own thread, so
that slow data fetching and transformations do not block the UI.
Most of the current code can be found in
sc/source/ui/dataprovider/*. The exception is the UI, which is currently
ugly and needs to be made more user-friendly; it lives in
sc/source/ui/miscdlgs/<wbr>dataproviderdlg.cxx.</div><div class="gmail_extra">Additionally, there are some initial tests in sc/qa/unit/dataproviders_test.<wbr>cxx and sc/qa/unit/datatransformation_<wbr>test.cxx that I used to prototype some of the code.</div><div class="gmail_extra"><br></div><div class="gmail_extra">Work
during GSoC may include an improved UI for the feature, new
data providers and data transformations, storing the information about
data providers and transformations in files (ODS and possibly XLSX),
a UNO interface that allows extension authors to add their own data
providers and data transformations, and other features that may be
interesting.</div><div class="gmail_extra"><br></div><div class="gmail_extra">Maybe
start by having a look at the already implemented feature and then at
the code. I hope that most of the data providers and data
transformations are actually quite simple. Once you are through that, if
you have more questions about some of the other ideas that I mentioned
above, feel free to ask for more information.</div><div class="gmail_extra"><br></div><div class="gmail_extra">Regards,</div><div class="gmail_extra">Markus</div></div></div>