[a11y] LibreOffice Calc exposes 2^31 children, freezes on `GetChildren`

Thu Jun 13 12:49:37 UTC 2024

Hi Michael,

On 12/06/2024 15:55, Michael Weghorn wrote:
> No need to apologize - thanks a lot for your valuable input! :-)

	Ah - if you encourage me you get more ;-)

> There, the Table interface also only exposes the same amount of cells as 
> are exposed via the a11y tree.

	Fair enough;

>>      Right; so - I mentioned "near to the screen" - by near; I mean we 
>> will probably want a number of things that are navigationally close: 
>> eg. "next heading" or somesuch - to lurk around as real & tracked 
>> peers. The content of the Navigator headings should prolly always be 
>> present in a writer document's object hierarchy IMHO. That should let 
>> ATs very quickly enumerate headings, jump focus to them with a simple 
>> API etc.
> 
> That sounds interesting, but in a way also like a rather strange tree to 
> me if it contains elements of some type for the whole doc, but other 
> parts of the document in between are missing.

	Indeed; and yet from a caching and performance perspective - it's gold 
to give ATs exactly what they want pre-fetched and cached in-process, 
and nothing more I guess; but of course fetching headings via a 
different mechanism is probably sensible.

> AT-SPI's flows-from and flows-to relations (and ARIA's aria-flowto) seem 
> somewhat similar to the UIA Navigation API you mention.

	=) Ultimately, I imagine we currently create peers dynamically as 
these methods are called.

> If they allow consistent access to off-screen content (related: 
> tdf#96492), they could potentially be used to retrieve the previous/next 
> heading,...

	Sure; I guess the MS APIs have the problem that the interface 
implemented tends also to be the protocol for remote COM querying of 
peers, whereas on Linux we can de-couple the two and can do better, at 
least in theory.

>>      Although - I'd really suggest that a11y doesn't work against the 
>> application, and if navigating - it should allow the AT to scroll the 
>> actual visible/view-port to match what is being interrogated.
> 
> Interesting thought, and maybe that could be part of the solution, if it 
> becomes clearer what that can look like in practice.

	Sure; so all/most applications with large scrolled panes have a mess of 
logic to try to detect when a change moves focus, and when it moves the 
scroll-area. How you manage both of those is fraught with fun and 
unfortunate 'view jumping' ;-) in the collaborative case - consider your 
cursor moves - so you want to move the view-port to show the cursor, but 
in fact it moved because someone re-sized a spreadsheet row above and 
... ;-) anyhow; deep joy.

> E.g. it would seem odd to me if an AT starts scrolling through the 
> document if a "go to next heading/list item" navigation command is 
> triggered, and then e.g. goes back if it doesn't find anything, because 
> it can't otherwise access the previously off-screen content to search 
> for the item.

	I guess; but I really expect that there are keybindings and/or well 
known actions for expert users that are used left and right, and that in 
reality tracking the focused peer and interrogating it is some 
overwhelming majority of the use cases to the point that first making 
that piece really, really good, fast & complete is far more important 
than anything else; but perhaps I'm mistaken.

>>      I really think that's a mistake that will ultimately hurt ATs 
>> performance and that we should focus on the end-user use-cases we want 
>> to succeed with - rather than having an abstract absolutist 
>> pre-conception that we can expose everything in an efficient way =)
> 
> Sure - if there's a better way to properly make the AT use cases a 
> reality, then let's go that route instead. :-)

	From a prioritization perspective; I'd really suggest working on the 
majority platform for the impaired, Windows/NVDA, and the vast-majority 
use-case of getting really good & complete API and feature coverage on 
the focused widget, before moving off into the more tricky stuff :-)

>>      But now I shut up ;-) we're working on the web side of this; 
>> caching bits in the browser and adding another protocol latency there 
>> - and I'm sure we want to be handling a reasonably bounded set of data 
>> there =)
> 
> Is there an easy way to test COOL a11y web and impacts of potential 
> changes?

	Ah - so; we tend to focus on the focused widget and things 'near' it - 
adjacent table cells etc. when populating our shadow DOM. But at some 
level the use-case we have for the a11y APIs is not really different 
from what an AT would have, I think.
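
The "focused widget and things near it" approach can be sketched roughly
as below. This is an illustration only, not the COOL or LibreOffice code:
the function name, the radius parameter and the sheet dimensions are all
assumptions made up for the example. The point is that the set of cells
handed to the AT stays bounded no matter how large the sheet is.

```python
from typing import Iterator, Tuple


def cells_near_focus(focus_row: int, focus_col: int,
                     n_rows: int, n_cols: int,
                     radius: int = 2) -> Iterator[Tuple[int, int]]:
    """Yield (row, col) for cells within `radius` of the focused cell.

    Hypothetical helper: the window is clipped to the sheet bounds, so at
    most (2 * radius + 1) ** 2 cells are ever produced.
    """
    row_lo = max(0, focus_row - radius)
    row_hi = min(n_rows - 1, focus_row + radius)
    col_lo = max(0, focus_col - radius)
    col_hi = min(n_cols - 1, focus_col + radius)
    for row in range(row_lo, row_hi + 1):
        for col in range(col_lo, col_hi + 1):
            yield (row, col)


# Illustrative sheet size only; focus sits in the top-left corner.
corner = list(cells_near_focus(0, 0, n_rows=1_000_000, n_cols=1_000))
# Focus somewhere in the middle of the sheet.
middle = list(cells_near_focus(500, 500, n_rows=1_000_000, n_cols=1_000))
```

Here `corner` holds 9 cells (the window is clipped at the sheet edge) and
`middle` holds 25, regardless of the million-row sheet behind them.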

> (I just opened a sample Writer doc on nextcloud.documentfoundation.org 
> and couldn't find the doc content via Accerciser in a quick test, but am 
> also not very familiar with web content/browser a11y.)

	You will want:

         <enable type="bool" desc="Controls whether accessibility support should be enabled or not." default="false">false</enable>

	Set to true in coolwsd.xml - and then turn on screen-reading support.

	=)

> As an additional note, one more potential source to get some interesting 
> insights could be to check how NVDA's browse mode is currently 
> implemented for MS Word, for example.

	Indeed.

On 13/06/2024 13:27, Michael Weghorn wrote:
 > I'm wondering whether one potential approach could e.g. be to provide
 > different "modes" on how much Writer exposes in the a11y tree, and
 > a way to switch between those....

	Lots of things are possible of course.

 > From looking a bit further into the NVDA and Orca docs and some
 > experimenting, it seems to me that access to the whole document
 > is needed in particular in (1) structural navigation/browse mode...

	Again; I'd respectfully suggest that creating APIs that make it 
possible to easily do things that then scale badly ultimately does a 
disservice to the impaired; people quickly use them and write poorly 
performing ATs.

	A nice API for navigation and/or pre-fetching to enable linear reading 
through documents, and/or reading of headings etc. seems to me far more 
useful (and likely to perform well) - than an API that allows pre-fetch 
of potentially hundreds of thousands of peers - even if we don't think 
they will change in readonly mode =)
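
To make the contrast concrete, here is a minimal sketch of a navigation
call that creates peers only as it walks, instead of pre-fetching them
all. Everything in it is hypothetical: `LazyDocument`, `Paragraph` and
`next_heading` are stand-ins invented for the example, not an existing
LibreOffice, AT-SPI or UIA API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Paragraph:
    index: int
    is_heading: bool


class LazyDocument:
    """Hypothetical document that creates a peer only when asked for one."""

    def __init__(self, n_paragraphs: int, heading_every: int) -> None:
        self.n_paragraphs = n_paragraphs
        self.heading_every = heading_every
        self.peers_created = 0  # counts how many peers were ever built

    def paragraph(self, index: int) -> Paragraph:
        self.peers_created += 1
        # Toy rule: every `heading_every`-th paragraph is a heading.
        return Paragraph(index, index % self.heading_every == 0)


def next_heading(doc: LazyDocument, start: int) -> Optional[Paragraph]:
    """Return the first heading after `start`, or None if there is none."""
    for index in range(start + 1, doc.n_paragraphs):
        para = doc.paragraph(index)
        if para.is_heading:
            return para
    return None


doc = LazyDocument(n_paragraphs=200_000, heading_every=50)
heading = next_heading(doc, start=3)
# heading.index is 50, and only 47 peers were created on the way there.
```

A real implementation would presumably consult the document model's
heading list directly rather than touching each paragraph, but even this
naive walk builds 47 peers rather than 200,000.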

	I think AT authors will always want all of the state of the app 
exposed, and indeed they will want it all cached locally if they can get 
it: but ... probably what is really useful is providing a way to write 
simple, reliable, performant, context-aware, maintainable ATs easily - 
and IMHO the "suck all state down as a first step" thing is not that :-)

	Anyhow - glad you're wrestling it not me!

	Regards,

		Michael.

-- 
michael.meeks at collabora.com <><, CEO Collabora Productivity
(M) +44 7795 666 147 - timezone usually UK / Europe