[core] xmloff: Histogram Chart ODF import and export support
Devansh Varshney
varshney.devansh614 at gmail.com
Fri Jan 31 19:47:47 UTC 2025
Hi community,
To follow up on the previous mail, I'd like to give you a more detailed
update on how we're addressing the histogram data handling issue in the
*chart2* module, which Tomaž and Regina pointed out.
- The core problem lies in how *histograms* handle *X-axis* values. The
existing *chart2* architecture *assumes a direct mapping between source
data and chart X-axis values*.
- However, histograms calculate X-axis values (bin-width) based on the
input data, leading to a situation where these calculated values overwrite
the original X-values and create issues in many places in the code.
- We can get the render of the chart while creating it but when we tried
to save(for example) the original values were gone by this time hence the
problem.
Tomaž also said that he would like to introduce a variable
*XValuesCalculated* which will resolve most of the problem.
Further Analysis -
- *Original X-Values:* The original X-values are correctly stored in
XLabeledDataSequence.
- *Histogram Calculations:* The histogram calculations occur within
the *HistogramDataInterpreter* and then the calculated values overwrite
the XLabeledDataSequence x values.
- *Need for Separation:* We need to store and display the
histogram-calculated X-values while preserving the original data as many
parts of the code depend on original values.
Analysis of *HistogramChartType.cxx*
*Purpose:* This file defines the HistogramChartType class, that is
responsible for setting up a histogram chart type. It manages properties
specific to histograms,
like bin width and range. It also implements the logic to create the
data-series and do the calculations.
*createCalculatedDataSeries()* Method:
- This method retrieves the raw data through the *XLabeledDataSequence*
from *m_aDataSeries*. This is where the logic modifies the source data
(the sequence returned from getValues()).
- It then uses *HistogramCalculator* to perform calculations and
generate the data.
- Finally, it creates a new *HistogramDataSequence* to store calculated
values and then adds that to m_aDataSeries.
*Interaction with **XLabeledDataSequence*
*:*
- It gets the original data using the *getDataSequences2()*
method of *DataSeries*.
- It gets the original x values by using *getValues*() method of
the *XLabeledDataSequence* which returns an XDataSequence and then uses
the *getData() *method of the XDataSequence to retrieve the x values in
a sequence of uno::Any.
Analysis of *HistogramCalculator.cxx*
*Purpose*: This class performs the core histogram calculations using the
raw data that is provided to *computeBinFrequencyHistogram()* method.
*Bin Calculation Logic:*
- The class calculates the number of bins, bin width, and bin
ranges based on the input data.
- It computes bin ranges and their frequencies.
- The calculated values are stored internally in *maBinRanges*
and *maBinFrequencies*.
Understanding* HistogramDataSequence.cxx*
*Data Storage:*
- The primary data storage is the *mxValues* member, which is a
*uno::Sequence<double>*. This sequence stores the calculated bin values
for the histogram.
- There is also *mxLabels* which stores labels but we are not
concerned with it.
*Interface Implementations:*
- *HistogramDataSequence* implements several UNO interfaces
including *XNumericalDataSequence, XTextualDataSequence, XDataSequence*.
- *getNumericalData()*: Returns the *mxValues* as a
*uno::Sequence<double>*.
- *getTextualData()*: Returns an empty *uno::Sequence<OUString>*.
- * getData()*: Returns the *mxValues* as a
*uno::Sequence<uno::Any>*.
Proposed Solution:
To address these issues, this is how I am thinking of making the changes:
* *Add **XValuesCalculated Property(As said by Tomaž)* :
- Add an *[optional, attribute] sequence<double> *
*XValuesCalculated**;* to the *DataSeries* service definition (
*offapi/com/sun/star/chart2/DataSeries.idl*).
- Implement the corresponding storage, getter, and setter
methods in *DataSeries.hxx* and *DataSeries.cxx*. (Add storage for
*m_aXValuesCalculated*)
- Add the corresponding property support in *OPropertySet.hxx*
and *OPropertySet.cxx*?
*Why add it to DataSeries? *-> If my understanding is correct,
*XValuesCalculated* is a property associated with a DataSeries and is not a
new data structure by itself, so we do not need to create a new IDL file
for it.
* No Boolean Flag Needed:* -> we do not need an explicit boolean flag.
I am using *.hasElements() *and *w**hy Length Check is Sufficient*:
- *Presence = Calculated Values*: The presence of elements within
the *XValuesCalculated* sequence is sufficient to determine if
calculated X-values exist.
- *Empty Sequence = No Calculated Values*: An empty sequence, which
tells us that no values have been calculated. This is very convenient
because UNO Sequences can be empty.
- *No Redundancy:* Using a separate boolean flag would add
redundancy.
*
* Modify Histogram Logic:*
- Update HistogramDataInterpreter.cxx to perform calculations
and store calculated bin values in the *XValuesCalculated* property of
the corresponding DataSeries object. This will not touch the original
values in the *XLabeledDataSequence*.
- And the Histogram View utilises the BarChart view for the
final rendering.
*
* Modify Rendering Logic:*
- Modify chart renderers (such as VSeriesPlotter.cxx ?) to
prioritize the *XValuesCalculated* property when available for plotting.
When the *XValuesCalculated* is not available then it will use the
original values of *XLabeledDataSequence*.
*
*Adjust Axis Scaling:*
- Modify the code to use the *XValuesCalculated* property, when
available, for axis scaling range calculation.
* *Preserve Original Data in UI:*
- Ensure that UI components always display original data fetched
from *XLabeledDataSequence* with the role of *"values-x".*
- Since users expect the UI tables to reflect the underlying
data they have provided. If we show calculated bin values in tables/cells,
it will be confusing and wrong.
*
*Comprehensive Testing:*
- Create tests to ensure that the changes work and do not cause
a regression to ensure proper functionality and maintain backward
compatibility.
- We will also make sure the undo and redo are working properly.
*
* Documentation:*
- Document all the changes to reflect the new property and its
usage.
Key Benefits:
1. * Correct Histogram Display:* Histograms will correctly use
calculated bin values for the X-axis.
2. *Preserved Original Data:* Original source data will be
maintained for other charts and user interactions.
3. *Minimal Impact:* This change does not affect other chart types,
it ensures backward compatibility, and it only impacts histogram and
renderer-specific logic.
4. *Clear Separation:* Separates the concern of raw data and
calculated data.
5. * Improved Architecture: *Addresses the underlying issue of
assumptions embedded in the chart(*chart2*) module.
*Is the below file related?*
chart2/source/controller/dialogs/DataBrowser.cxx
<https://github.com/LibreOffice/core/blob/b0a4afc58e1c9434e56ddb96c41f4ebe5985ed0a/chart2/source/controller/dialogs/DataBrowser.cxx#L4>
Do we also have to modify *getValues()* in *XLabeledDataSequence* to return
calculated values or original values based on the presence of
*XValuesCalculated* property and chart type?
If other places need changes or I if have still missed files (please let me
know)
and the changes in the *offapi/com/sun/star/chart2/DataSeries.idl* are
correct?
On Thu, 23 Jan 2025 at 18:59, Regina Henschel (via Code Review) <
gerrit at gerrit.libreoffice.org> wrote:
> Attention is currently required from: Devansh Varshney, Tomaž Vajngerl.
>
> Regina Henschel has posted comments on this change by Devansh Varshney. (
> https://gerrit.libreoffice.org/c/core/+/177364?usp=email )
>
> Change subject: xmloff: Histogram Chart ODF import and export support
> ......................................................................
>
>
> Patch Set 26:
>
> (2 comments)
>
> Patchset:
>
> PS26:
> There is a deeper problem. You have set the calculated binFrequencies and
> binRanges in place of the original data. That has give you a convenient way
> for getting a rendering as bar chart. Not only does this cause bugs
> tdf#164109 and tdf#162240, but now you can't export the <chart:series>
> element the same way as it is done for other chart types. I'm referring to
> the way that can be seen after comment '// export dataseries for current
> chart-type' in SchXMLExport.cxx.
>
> Currently in your patch, the <chart:series> element is not exported at
> all. When it will be exported, the new <loext:histogram-configuration>
> element has to become a child element after all the existing child
> elements. If the export looks similar to the existing chart types, it would
> be just before the comment '// close series'.
>
> I suggest to discuss the underlying problem with Tomaž Vajngerl. I don't
> see an easy solution and would have to try a lot. Until this is resolved,
> it doesn't make sense to make progress on the export.
>
>
> File schema/libreoffice/OpenDocument-v1.4+libreoffice-schema.rng:
>
>
> https://gerrit.libreoffice.org/c/core/+/177364/comment/2e8b89d8_b939b3f4?usp=email
> :
> PS26, Line 2149: <rng:optional>
> : <rng:ref name="loext-histogram-configuration"/>
> : </rng:optional>
> > I mean the `loext:histogram-configuration` is placed under
> `chart:series` is this the correct way? […]
> The position is correct.
>
>
>
> --
> To view, visit https://gerrit.libreoffice.org/c/core/+/177364?usp=email
> To unsubscribe, or for help writing mail filters, visit
> https://gerrit.libreoffice.org/settings?usp=email
>
> Gerrit-MessageType: comment
> Gerrit-Project: core
> Gerrit-Branch: master
> Gerrit-Change-Id: Ibda198c1a3e4a7b53e3cf590c33cc9adb558be74
> Gerrit-Change-Number: 177364
> Gerrit-PatchSet: 26
> Gerrit-Owner: Devansh Varshney <varshney.devansh614 at gmail.com>
> Gerrit-Reviewer: Jenkins
> Gerrit-Reviewer: Tomaž Vajngerl <quikee at gmail.com>
> Gerrit-CC: Regina Henschel <rb.henschel at t-online.de>
> Gerrit-Attention: Devansh Varshney <varshney.devansh614 at gmail.com>
> Gerrit-Attention: Tomaž Vajngerl <quikee at gmail.com>
> Gerrit-Comment-Date: Thu, 23 Jan 2025 13:29:36 +0000
> Gerrit-HasComments: Yes
> Gerrit-Has-Labels: No
> Comment-In-Reply-To: Devansh Varshney <varshney.devansh614 at gmail.com>
>
--
*Regards,*
*Devansh*
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/libreoffice/attachments/20250201/82ba7f4d/attachment.htm>
More information about the LibreOffice
mailing list