3.6.0 regression: non-deterministic filter selection ...

Kohei Yoshida kohei.yoshida at gmail.com
Wed Aug 15 08:05:55 PDT 2012

On Wed, Aug 15, 2012 at 12:40 AM, Kohei Yoshida <kyoshida at novell.com> wrote:
> On Tue, Aug 14, 2012 at 3:15 PM, Michael Meeks <michael.meeks at suse.com> wrote:
>>         Thoughts ?
> It appears that the type detection asks SwFilterDetect::detect() to
> detect its type, and it returns empty handed, which eventually leads
> to it being "detected" as ascii text.
> I've tried a simple Word document I created from Word XP, and the same
> code detects the file type just fine.  I'm right now looking in to see
> why SwFilterDetect::detect() fails to detect your particular document.

I'm starting to suspect that maybe it's the filter itself failing to
parse this document, rather than the type detection system failing to
detect it.

Here is my reasoning:

1) Type detection correctly prioritizes the writer_MS_Word_97 file type
at the top of the list, so it gets tested first.  With a normal .doc
file, this test succeeds, and type detection ends with the correct
filter type, 'MS Word 97'.

2) But with Large-Word.doc, this test fails, and the type detection
continues down the list of all types to test.  Every one of them fails
until the last one on the list, ASCII text, which always succeeds.

3) Now, if you launch the file open dialog, manually set the file type
to 'Microsoft Word 97/2000/XP/2003', select Large-Word.doc and click
Open, it internally bypasses the type detection and proceeds to open
the file using the pre-selected filter 'MS Word 97'.  Even in this
scenario, you'll get 'Read-Error. Error reading file'.

The last point indicates that the correct filter for this file type
fails to load this file for whatever reason.
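
To make the reasoning above concrete, here is a rough toy model of the
chain as I understand it so far.  It is not LibreOffice code: every
name in it (FileType, can_parse_word97, detect_type, load) is made up
for illustration, and the assumption that the detection test and the
import filter reject the same documents is exactly the theory I still
need to confirm.

```python
# Toy model only; none of these names exist in LibreOffice.
from dataclasses import dataclass
from typing import Callable, List, Optional


def can_parse_word97(data: bytes) -> bool:
    # Stand-in for the real parsing logic.  A document containing
    # b"BROKEN" plays the role of Large-Word.doc.
    return data.startswith(b"DOC") and b"BROKEN" not in data


@dataclass
class FileType:
    name: str
    matches: Callable[[bytes], bool]  # per-type detection test


# Ordered candidate list: Word 97 is prioritized, plain text is last.
CANDIDATES: List[FileType] = [
    FileType("MS Word 97", can_parse_word97),
    FileType("Rich Text", lambda d: d.startswith(b"{\\rtf")),
    FileType("Text", lambda d: True),  # catch-all, always succeeds
]


def detect_type(data: bytes, candidates: List[FileType]) -> str:
    """Return the name of the first candidate whose test accepts data."""
    for file_type in candidates:
        if file_type.matches(data):
            return file_type.name
    raise ValueError("no type matched")  # not reached with a catch-all


def load(data: bytes, preselected: Optional[str] = None) -> str:
    """Load data; a pre-selected filter bypasses type detection."""
    filter_name = preselected or detect_type(data, CANDIDATES)
    if filter_name == "MS Word 97" and not can_parse_word97(data):
        # Same failure either way: the filter itself cannot read the file.
        raise OSError("Read-Error. Error reading file")
    return filter_name
```

In this model a normal document loads as 'MS Word 97', the problem
document silently falls through to 'Text' when detection runs, and
forcing the Word filter on it raises the read error, which matches
points 1) to 3).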

Does this make sense?  At this point, I'm inclined to ask the Writer
folks to see whether any change in Writer's Word import filter has
ended up not loading this file correctly...  I'm CC'ing Cedric who is
our Writer guru, and Caolan who is generally very knowledgeable about
this sort of thing. :-)

I'll go a little deeper into the type detection chain (I'm at
SwIoSystem::IsFileFilter at the moment) to confirm this theory.

