calc parallel review call minutes

Michael Meeks michael.meeks at collabora.com
Wed Nov 15 16:10:17 UTC 2017


ODT with pictures, and plain text with cryptic notes as normal =)

Calc Parallel Review call minutes 2017-11-15

Present
	Dennis, Eike, Michael, Tor, Mike Kaganski
Commits
	NumberFormatter (Eike)
		expensive to create one of these per thread
		constructing a default formatter doesn’t copy
			the used-formats from the document’s formatter
		better to use the flag that we’re doing threaded calc
			use a mutex in the GetFormatTable
			otherwise don’t use a mutex.
			Return the normal document formatter number pointer.
				Should be faster.
		Interpreter – rarely used …
			final result is obtained – set number format if it was general.
		Also not working – can’t get the cell formats through the newly created formatter.
			Would fail for ‘IsNumber’ etc.
	Concerned wrt. over-use of cell formats (Michael)
		nervous about its use.
		if the format changes when calculating, it is pushed through
		Excel doesn’t do this.
		This is now expected by users (Eike)
			yes formula construction.
	sc/source/core/data/formulacell.cxx
		SvNumberFormatter – is the class that needs locking.
			=> better to make this actually thread-safe …
		started to look at the places where it is used (Dennis)
			not sure why the crash happens when it is shared.
AI:		add mutexes to SvNumberFormatter (Dennis)
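
A rough sketch of the "lock only when doing threaded calc" idea from the formatter discussion above. SharedFormatter, getOrAddFormat, formatCount and the 'threaded' flag are hypothetical names for illustration; this is not the SvNumberFormatter interface, and it only assumes the calculation code knows whether it is running threaded.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>

// Hypothetical stand-in for the document's shared formatter; this is NOT
// the SvNumberFormatter interface, only an illustration of the locking idea.
class SharedFormatter {
public:
    // Returns the key for a format code, registering it on first use.
    // 'threaded' plays the role of the "we are doing threaded calc" flag.
    unsigned getOrAddFormat(const std::string& code, bool threaded) {
        std::unique_lock<std::mutex> guard = lockIf(threaded);
        auto it = keys_.find(code);
        if (it != keys_.end())
            return it->second;
        const unsigned key = nextKey_++;
        keys_.emplace(code, key);
        return key;
    }

    std::size_t formatCount(bool threaded) const {
        std::unique_lock<std::mutex> guard = lockIf(threaded);
        return keys_.size();
    }

private:
    // Takes the mutex only in threaded mode; otherwise returns an empty
    // lock object that owns no mutex and does nothing on destruction.
    std::unique_lock<std::mutex> lockIf(bool threaded) const {
        if (threaded)
            return std::unique_lock<std::mutex>(mutex_);
        return std::unique_lock<std::mutex>();
    }

    mutable std::mutex mutex_;
    std::map<std::string, unsigned> keys_;
    unsigned nextKey_ = 0;
};
```

All threads share the one formatter instance, so nothing has to be copied per thread; the non-threaded path skips the lock entirely.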

	Avoid SvTokenArray thrash
		re-using some token pointers – for string/double.
			Does not work; can’t assign content to a token of a different type.
			Should put asserts in the virtual base methods.
		Huge cost of old mhu allocator, and the token re-use work.
AI:			check whether we free MemoryPools en-masse (Michael)
		in concept ok – but need to only re-use tokens of the same type (Eike)

		Problematic pieces here – where type is different.			
AI:		if ( pTargetTok && pTargetTok->type != string)  (Dennis)
			then replace 
		else
			new token time …
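
A minimal sketch of token re-use restricted to the same type, with asserts in the virtual base methods as suggested above. The Token hierarchy and the assignDouble / assignString helpers are hypothetical and far simpler than the real formula tokens; the condition here is "same type", which is a slightly stricter reading than the pseudocode in the action item.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

enum class TokenType { Double, String };

// Hypothetical token hierarchy, much simpler than the real formula tokens.
struct Token {
    virtual ~Token() = default;
    virtual TokenType type() const = 0;
    // Base setters assert: reaching them means content of one type was
    // assigned to a token of a different type.
    virtual void setDouble(double) { assert(false && "not a double token"); }
    virtual void setString(const std::string&) { assert(false && "not a string token"); }
};

struct DoubleToken : Token {
    double value;
    explicit DoubleToken(double v) : value(v) {}
    TokenType type() const override { return TokenType::Double; }
    void setDouble(double v) override { value = v; }
};

struct StringToken : Token {
    std::string value;
    explicit StringToken(std::string v) : value(std::move(v)) {}
    TokenType type() const override { return TokenType::String; }
    void setString(const std::string& v) override { value = v; }
};

// Re-use the target token only if it already has the right type;
// otherwise allocate a fresh token of the right type.
void assignDouble(std::unique_ptr<Token>& target, double v) {
    if (target && target->type() == TokenType::Double)
        target->setDouble(v);
    else
        target = std::make_unique<DoubleToken>(v);
}

void assignString(std::unique_ptr<Token>& target, const std::string& v) {
    if (target && target->type() == TokenType::String)
        target->setString(v);
    else
        target = std::make_unique<StringToken>(v);
}
```

Repeated results of the same type then cost no allocation, and a type change falls back to a new token instead of overwriting one of the wrong type.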

	GetFormatTable assert …
		hit under some circumstances ? … stopped in debugger & continued.
		Concerning → … chase that.

	Propose – on by default through betas  (Michael)
		and switch to experimental for RC’s (Eike)

	InterpretTail – returning from deep recursion (Eike)
		happens after a few hundred cells.
		Recursion is stack bound; at some point we return & iterate over the stacked cells
		and try the next bunch.
		Does it still work with threaded calculation ?
			Do we blow the stack here ?
		Uses the RecursionHelper to see if we should do this again.
AI:		need to propagate the recursion ‘ERSTART_SYS’ type thing back (Dennis)
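
A toy model of the "return from deep recursion, then iterate over the stacked cells" scheme described above, to show why the "limit hit" result has to be propagated back through every caller. Cell, Sheet, tryInterpret and maxDepth are hypothetical; the real InterpretTail / RecursionHelper logic is more involved, and this sketch assumes an acyclic dependency chain.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical cell: its value is 1 + the value of the cell it depends on.
struct Cell {
    int dep = -1;      // index of the precedent cell, or -1 for none
    long value = 0;
    bool done = false;
};

struct Sheet {
    std::vector<Cell> cells;
    int maxDepth = 100;   // assumed recursion limit
    int deepestSeen = 0;  // recorded for inspection only

    // Returns false if the recursion limit was hit. The false result is
    // passed back up through every caller, so no caller uses a value
    // that was never computed.
    bool tryInterpret(int i, int depth, std::vector<int>& pending) {
        deepestSeen = std::max(deepestSeen, depth);
        Cell& c = cells[i];
        if (c.done)
            return true;
        if (c.dep >= 0 && !cells[c.dep].done) {
            if (depth >= maxDepth) {
                pending.push_back(c.dep);  // resume from this cell later
                return false;
            }
            if (!tryInterpret(c.dep, depth + 1, pending))
                return false;
        }
        c.value = (c.dep >= 0 ? cells[c.dep].value : 0) + 1;
        c.done = true;
        return true;
    }

    // Iterates over the stacked cells until the start cell is computed.
    // Assumes the dependency graph has no cycles.
    void interpret(int start) {
        std::vector<int> pending{start};
        while (!pending.empty()) {
            const int i = pending.back();
            if (tryInterpret(i, 0, pending))
                pending.pop_back();
        }
    }
};
```

For a chain of 1000 cells and a limit of 50, the stack never goes deeper than the limit; the work is done in bunches from the deepest stacked cell upwards.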

	ScInterpreterContext (Michael)
		plan is to move stuff out of thread variables, e.g. the vlookup cache, and tie it in from here.

	Patch for avoiding excessive allocations
		another patch to come doing this on the ScInterpreterContext
		how many ? (Eike)
			around eight.
			Might work.
			Should be on the ScInterpreterContext indeed.
	Slightly sad S/W group interpreter still wins (Michael)
		“Thread the S/W interpreter” patch.

	SUMPRODUCT – building matrices (Michael)
		Allocates a new ScMatrix each calculation
			keeps track of dimensions internally (Eike)
				different memory chunks if use only the upper-left.
			Different semantics here; forces array-mode to all its params and sub-params
				by definition.
				Other functions that do this too.
				sc/…/tools/ function classification – some force matrices
		Could re-use S/W interpreter bits perhaps ?

TODO
	Vlookup cache needs reconciling with the main version ideally
		whenever a change in the data – all dependent vlookups must be re-calculated
		do we do smarts with dependencies ? (Michael)
			listen to the range (Eike)
			when re-calculating, is the initial value we looked up the same ?
				In this case return the cached value.
		This is why we have ‘MergeBackIntoNonThreadedData()’ fn (Tor)
	Error state in TokenArray
		Should only be used for formula parsing & compiler state…
		If the compiler flagged the error – don’t interpret it
			currently abused for other cases.
		Fix as we find them.
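
A hedged sketch of the per-thread lookup cache being merged back into the main data after a threaded run, as in the vlookup item above. InterpreterContext, DocumentData, cachedLookup and mergeBack are hypothetical names and do not reflect the real ScInterpreterContext or MergeBackIntoNonThreadedData(); invalidation when the looked-up data changes is deliberately not covered.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

using LookupCache = std::unordered_map<std::string, double>;

// Hypothetical per-thread context; the real ScInterpreterContext differs.
struct InterpreterContext {
    LookupCache lookupCache;
};

struct DocumentData {
    LookupCache lookupCache;  // treated as read-only while workers run

    // Call from the main thread only, after the workers have finished.
    void mergeBack(InterpreterContext& ctx) {
        for (const auto& entry : ctx.lookupCache)
            lookupCache.insert(entry);  // entries already present are kept
        ctx.lookupCache.clear();
    }
};

// Looks in the shared cache first, then in the thread's own cache, and
// only computes (and stores locally) when both miss.
template <typename Compute>
double cachedLookup(const DocumentData& doc, InterpreterContext& ctx,
                    const std::string& key, Compute compute) {
    auto shared = doc.lookupCache.find(key);
    if (shared != doc.lookupCache.end())
        return shared->second;
    auto local = ctx.lookupCache.find(key);
    if (local != ctx.lookupCache.end())
        return local->second;
    const double result = compute(key);
    ctx.lookupCache.emplace(key, result);
    return result;
}
```

Worker threads never write to the shared cache, so no lock is needed while they run; the merge happens once, single-threaded, at the end.
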
Next:
	First fix the number formatter issues
	Then start to merge these; generally happy
	Saw all the cores busy (Eike)

Noel’s column limit patch
	Could have a look (Eike)
		didn’t think it was finished – deps on MAXCOL / MAXROW
		actual access depends on size of these
		other places – compare reference values vs. MAXCOL etc.
		would be good to get some more details.
		SlotMachine needs looking at – wrt. distributing.

Appendix 1 – some numbers

Benchmark Compute Sheets used
Repo : git://gerrit.libreoffice.org/benchmark
1. BuildingDesign.xls
2. GrossProfit-Supermarkets.xls
3. Stock_history.xls

-- 
michael.meeks at collabora.com <><, Pseudo Engineer, itinerant idiot
-------------- next part --------------
A non-text attachment was scrubbed...
Name: calc-parallel.odt
Type: application/vnd.oasis.opendocument.text
Size: 24442 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/libreoffice/attachments/20171115/ad019d50/attachment.odt>

