Python extension issue with LO 4 on Windows

Stephan Bergmann sbergman at redhat.com
Tue Jan 22 06:22:01 PST 2013


On 01/22/2013 02:05 PM, Olivier R. wrote:
> On Windows, LO 4 is unable to install this extension (written in Python):
> http://extensions.libreoffice.org/extension-center/dictionnaires-francais/releases/4.9/lo-oo-ressources-linguistiques-fr-v4.9.oxt
>
> Here is the error I get:
>
> (com.sun.star.uno.RuntimeException) { { Message = "<class
> 'UnicodeDecodeError'>: 'charmap' codec can't decode byte 0x9d in position
> 3782: character maps to <undefined>, traceback follows\X000a  C:\\Program
> Files (x86)\\LOdev
> 4.0\\program\\python-core-3.3.0\\lib\\encodings\\cp1252.py:23 in function
> decode() [return
> codecs.charmap_decode(input,self.errors,decoding_table)[0]]\X000a
> C:\\Program Files (x86)\\LOdev 4.0\\program\\pythonloader.py:94 in function
> getModuleFromUrl() [src = fileHandle.read().replace(\"\\r\",\"\")]\X000a
> C:\\Program Files (x86)\\LOdev 4.0\\program\\pythonloader.py:146 in function
> writeRegistryInfo() [mod = self.getModuleFromUrl( locationUrl
> )]\X000a\X000a", Context = (com.sun.star.uno.XInterface) @0 } }
>
> But it works properly on Linux.
>
> The code is useful to switch between the 4 different French dictionaries.
>
> Any idea?
> Should I create a bug report?

The file DictionarySwitcher.py included in 
<http://extensions.libreoffice.org/extension-center/dictionnaires-francais/releases/4.9/lo-oo-ressources-linguistiques-fr-v4.9.oxt> 
is apparently UTF-8 encoded and includes code unit sequences like E2 80 
9D (representing U+201D RIGHT DOUBLE QUOTATION MARK) at offset 3780.

The code at LO's pythonloader.py:94

   src = fileHandle.read().replace("\r","")

is evidently environment-sensitive: in your Windows environment it tries 
to treat the file as CP-1252 encoded (which fails: E2 and 80 happen to 
represent valid CP-1252 characters, but 9D does not), while in your 
Linux environment it likely treats it as UTF-8 (which succeeds).  This 
environment-sensitive behavior of Python presumably changed with 
Python 3, so the issue only starts to show with LO 4.
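
To illustrate the decoding difference, here is a minimal Python 3 sketch 
(not taken from LO's code; the variable names are made up for this 
example) that decodes the three bytes E2 80 9D once as UTF-8 and once as 
CP-1252:

```python
# The three bytes found in DictionarySwitcher.py: UTF-8 for U+201D.
data = b"\xe2\x80\x9d"

# Decoding as UTF-8 yields a single character.
as_utf8 = data.decode("utf-8")

# The first two bytes alone are valid CP-1252 (two unrelated characters).
first_two_cp1252 = data[:2].decode("cp1252")

# Decoding all three bytes as CP-1252 fails on the last one.
try:
    data.decode("cp1252")
    cp1252_failed = False
    failing_byte = None
except UnicodeDecodeError as exc:
    cp1252_failed = True
    failing_byte = data[exc.start]  # the byte the codec could not map

print(repr(as_utf8), repr(first_two_cp1252), cp1252_failed, failing_byte)
```

The failing byte reported here is 0x9D, matching the byte named in the 
error message quoted above.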

So, for one, someone better versed in Python than I am could fix that 
code in pythonloader.py so that it is not environment-sensitive.  (And 
you can file an LO bug about that.)
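
One possible shape for such a fix, as a sketch only: read the file as 
bytes and decode it with a fixed encoding instead of relying on the 
locale default.  The helper name read_source and the choice of UTF-8 are 
assumptions made for this example, not the actual change to 
pythonloader.py.

```python
import os
import tempfile


def read_source(path, encoding="utf-8"):
    """Hypothetical helper: decode a source file with a fixed encoding.

    Reading bytes and decoding explicitly avoids any dependence on the
    locale's preferred encoding (CP-1252 on many Windows setups).
    """
    with open(path, "rb") as file_handle:
        raw = file_handle.read()
    # Strip carriage returns, as the original line 94 does.
    return raw.decode(encoding).replace("\r", "")


# Demo with a throwaway file containing the bytes E2 80 9D.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo_module.py")
    with open(demo_path, "wb") as out:
        out.write(b'label = "\xe2\x80\x9d"\r\n')
    src = read_source(demo_path)

print(repr(src))
```

Passing encoding="utf-8" to open() in text mode would be another way to 
get the same environment independence in Python 3.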

And, for another, for maximum portability, it might be possible to 
change the content of that DictionarySwitcher.py to stick to plain ASCII 
source.  (And you can contact the authors of that extension about that.)
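
For checking whether a source file sticks to plain ASCII, a small sketch 
along these lines may help.  The function find_non_ascii and the sample 
bytes are hypothetical, made up for this example.

```python
def find_non_ascii(data):
    """Return (offset, byte) pairs for every byte outside the ASCII range."""
    return [(offset, byte) for offset, byte in enumerate(data) if byte > 0x7F]


# A line with a literal U+201D, stored as UTF-8 (bytes E2 80 9D).
sample = b'quote = "\xe2\x80\x9d"\n'
hits = find_non_ascii(sample)

# The same line written with an escape sequence is pure ASCII.
ascii_sample = 'quote = "\\u201d"\n'.encode("ascii")
ascii_hits = find_non_ascii(ascii_sample)

for offset, byte in hits:
    print("offset %d: 0x%02X" % (offset, byte))
```

Replacing such literals with escapes like "\u201d" keeps the string 
value the same while the file itself stays ASCII-only.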

Stephan

