real-time financial data (howto)

Tue Nov 26 13:54:27 PST 2013

On 23/11/13 20:54, Michael Meeks wrote:
> 
> On Sat, 2013-11-23 at 15:15 +0100, Daniel Pocock wrote:
>>> Would you be interested in working on integrating such a thing?
>>
>> Yes, that is why I'm looking at it
> 
> 	Cool - so there is a lot we want to do here in calc. The initial work
> on that is in master:
> 
> 	Data->Streams - I attach a screenshot.

Great - that looks like it may be the right place for this work

I drew up a diagram and published it on my blog, making sure LibreOffice
is mentioned too:

http://danielpocock.com/real-time-streaming-market-data-with-free-open-source-software

> 
> 	Matus has a chunk of work to better manage data streams, and how
> they're processed / visualised. I'd personally like to see a clear
> separation between data source:
> 
> src:	http[s] | shell-script | spreadsheet-cells
> format: csv, A1<value>

Here we would need to add an extra choice: sourcing the data from OpenMAMA.

Sourcing data from RRDtool may be cool as well, and if the HTTP method
could fetch Ganglia XML feeds from gmetad (port 8652), that could also be
quite useful.
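
A rough sketch of how that source/format separation might look with
OpenMAMA added as a fourth source. Every type and function name below is
hypothetical; this is not the data streams code in master and not the
OpenMAMA API, and the CSV splitter is deliberately naive (no quoting):

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical names throughout; none of this is LibreOffice or OpenMAMA API.

// Where the raw records come from.
enum class SourceKind { Http, ShellScript, SpreadsheetCells, OpenMama };

// How one raw record is laid out.
enum class RecordFormat { Csv };

// A stream description keeps the two choices independent of each other.
struct StreamSpec {
    SourceKind source;
    RecordFormat format;
    std::string location;  // URL, command line, cell range, or subscription symbol
};

// Split one CSV record into fields. Naive on purpose: no quoting or escaping.
std::vector<std::string> splitCsv(const std::string& line) {
    std::vector<std::string> fields;
    std::string field;
    std::istringstream in(line);
    while (std::getline(in, field, ',')) {
        fields.push_back(field);
    }
    // getline drops a trailing empty field, so restore it.
    if (!line.empty() && line.back() == ',') {
        fields.emplace_back();
    }
    return fields;
}

// Parsing depends only on the format, never on the source.
std::vector<std::string> parseRecord(const StreamSpec& spec, const std::string& raw) {
    switch (spec.format) {
        case RecordFormat::Csv:
            return splitCsv(raw);
    }
    assert(false && "unhandled RecordFormat");
    return {};
}
```

The point of the split is that a new source such as OpenMAMA only adds an
enumerator and a reader; the parsing side stays untouched.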

> 	And finally how those cells move over time as new data comes to get the
> time series in.

This would probably need to work in a few different ways.

Some sources (e.g. OpenMAMA or RRD files) are constantly updating.
OpenMAMA ticks arrive at a non-deterministic interval, while RRD updates
are at a constant interval.

Sometimes the user may just want to poll manually (on spreadsheet
refresh) or use some defined interval (e.g. every 5 minutes) to avoid
excessive recalculation.
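
The three update modes above (every tick, a fixed interval, manual only)
could be captured in a small policy object that decides whether an
incoming update should trigger a recalculation. This is only a sketch;
`RefreshPolicy` and `RefreshMode` are made-up names, not anything that
exists in Calc:

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper; not part of LibreOffice or OpenMAMA.
enum class RefreshMode { OnEveryUpdate, FixedInterval, ManualOnly };

class RefreshPolicy {
public:
    using Clock = std::chrono::steady_clock;

    RefreshPolicy(RefreshMode mode, std::chrono::milliseconds interval)
        : mode_(mode), interval_(interval) {
        assert(interval.count() >= 0);
    }

    // Called when new data arrives (e.g. a tick) at time `now`.
    // Returns true if the sheet should be recalculated right away.
    bool onUpdate(Clock::time_point now) {
        switch (mode_) {
            case RefreshMode::OnEveryUpdate:
                return markRefreshed(now);
            case RefreshMode::FixedInterval:
                if (!hasRefreshed_ || now - last_ >= interval_) {
                    return markRefreshed(now);
                }
                return false;  // too soon: keep the data, skip the recalculation
            case RefreshMode::ManualOnly:
                return false;
        }
        return false;
    }

    // Called when the user asks for a refresh; always recalculates.
    bool onManualRefresh(Clock::time_point now) { return markRefreshed(now); }

private:
    bool markRefreshed(Clock::time_point now) {
        last_ = now;
        hasRefreshed_ = true;
        return true;
    }

    RefreshMode mode_;
    std::chrono::milliseconds interval_;
    Clock::time_point last_{};
    bool hasRefreshed_ = false;
};
```

With `FixedInterval` and a 5 minute interval, ticks arriving in between
are still received but do not force a recalculation until the interval
has elapsed.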

>> Would it be problematic for core to have an OpenMAMA dependency
>> though?  OpenMAMA is not available on all platforms.
> 
> 	In general with these dependencies we try to de-couple stuff.
> 
>> It would be possible in Java - after all, OpenMAMA provides a Java API
>> and it is packaged too:
> 
> 	And we try to avoid new Java dependencies in the core - as an extension
> (fine) - but there are plenty of limits to extensions - and particularly
> around performance you want to be C++ for bulk data transfers (cf. the
> issues in base).
> 
> 	In general we try to look for and dlopen system libraries that may not
> be there. As an alternative we provide a clean internal abstraction and
> distribute a dll that hard links to this system OpenMAMA thing - but (of
> course) if it is not there then dlopening our own library will fail.
> 
> 	Then again - how complex is this API going to be ? :-)
> 
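
For reference, the dlopen approach described above might look roughly
like this on a POSIX system. It is a minimal sketch: `OptionalLibrary` is
a hypothetical wrapper, and the library and symbol names are left to the
caller rather than assumed here:

```cpp
#include <cassert>
#include <dlfcn.h>
#include <string>

// Hypothetical RAII wrapper around an optional shared library (POSIX only).
class OptionalLibrary {
public:
    // Try to load the library; a missing library is not an error here.
    explicit OptionalLibrary(const std::string& soname)
        : handle_(dlopen(soname.c_str(), RTLD_LAZY | RTLD_LOCAL)) {
        assert(!soname.empty());
    }

    ~OptionalLibrary() {
        if (handle_ != nullptr) {
            dlclose(handle_);
        }
    }

    OptionalLibrary(const OptionalLibrary&) = delete;
    OptionalLibrary& operator=(const OptionalLibrary&) = delete;

    // True if the library was found and loaded.
    bool loaded() const { return handle_ != nullptr; }

    // Look up a symbol; returns nullptr if the library or symbol is missing.
    void* symbol(const std::string& name) const {
        return handle_ != nullptr ? dlsym(handle_, name.c_str()) : nullptr;
    }

private:
    void* handle_;
};
```

A caller would check `loaded()` once at startup and simply hide the
feature when the library is not installed. On older glibc versions this
needs linking with `-ldl`.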

See the bottom of my blog (link above) for sample code in both C++ and Java.

> 	Either way - you'd want to hack on the data streams code; Matus is
> about to merge some biggish improvements there next week.
> 
> 	But - from my perspective - I'd -love- to have OpenMAMA support - it's
> great to hear about you guys, and I'd love to bundle it in the default
> install if possible: big data is something Calc is about to get rather
> good at in 4.2  =)

I'm just on the fringes of the OpenMAMA effort, packaging it for Debian.
Big thanks go to the developers at Wombat and NYSE, who obviously put
many years' work into that code and decided to release it as an open
source project through the Linux Foundation.