[Libreoffice-bugs] [Bug 109241] New: Problem with urllib on https URLs

bugzilla-daemon at bugs.documentfoundation.org
Thu Jul 20 20:12:17 UTC 2017


https://bugs.documentfoundation.org/show_bug.cgi?id=109241

            Bug ID: 109241
           Summary: Problem with urllib on https URLs
           Product: LibreOffice
           Version: 5.3.4.2 release
          Hardware: x86 (IA32)
                OS: Windows (All)
            Status: UNCONFIRMED
          Severity: normal
          Priority: medium
         Component: LibreOffice
          Assignee: libreoffice-bugs at lists.freedesktop.org
          Reporter: kiloran.public+bugzilla at gmail.com

I'm having problems scraping HTTPS sites with the Python bundled with
LibreOffice. I have LibreOffice 5.3.4.2 (x86) on Windows 7 and can demonstrate
the problem with this simple script:

import urllib.request
myUrl = 'https://ask.libreoffice.org/en/questions/'
hdr = {'User-Agent': 'Mozilla/5.0'}
req = urllib.request.Request(url=myUrl, headers=hdr)
response = urllib.request.urlopen(req)

This fails immediately with "urlopen error unknown url type: https". It works
fine with an http url, but fails with any https url.
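
One way to narrow this down is to check whether the interpreter can load the
ssl module at all. In CPython, http.client only defines HTTPSConnection when
"import ssl" succeeds, and urllib.request only defines HTTPSHandler when
HTTPSConnection exists, so a failed ssl import typically surfaces as "unknown
url type: https". The sketch below is a diagnostic aid, not a confirmed
diagnosis of this bug; the function name https_support_report is made up for
illustration.

```python
import http.client
import urllib.request


def https_support_report():
    """Collect facts about whether this interpreter can handle https URLs.

    Hypothetical helper for illustration; not part of LibreOffice or Python.
    """
    report = {}
    try:
        # On Windows, a missing or incompatible _ssl.pyd (or a DLL it depends
        # on) makes this import raise ImportError.
        import ssl
        report["ssl_import_error"] = None
        report["openssl_version"] = ssl.OPENSSL_VERSION
    except ImportError as exc:
        report["ssl_import_error"] = str(exc)
        report["openssl_version"] = None
    # Both attributes are defined only when the ssl import worked at the time
    # http.client and urllib.request were first loaded.
    report["has_https_connection"] = hasattr(http.client, "HTTPSConnection")
    report["has_https_handler"] = hasattr(urllib.request, "HTTPSHandler")
    return report


if __name__ == "__main__":
    for key, value in sorted(https_support_report().items()):
        print(key, "=", value)
```

Running the same file with LibreOffice's python.exe and with a standalone
Python, and comparing the two outputs, should show whether the ssl import is
the part that differs.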

I tried the above as an embedded script in a LibreOffice Calc document and it
failed. It also failed when I ran it in a terminal window using
C:\Program Files (x86)\LibreOffice 5\program\python-core-3.3.0\bin\python.exe

The script works fine with my standalone Python 3.3.2 running from a terminal
window.

I've also tried various LibreOffice Portable installations I have:

4.0.2.2: Works OK
5.3.1.2: Fails
5.3.2.2: Fails

I've uninstalled and reinstalled 5.3.4.2 more times than I can count and cannot
get it to work. Yet when I install it on Windows 10 in a virtual machine on the
same PC, it works fine.

I tried Safe Mode in LibreOffice 5 and the script works fine. I went back to
normal mode and it failed again. I uninstalled LibreOffice 5.3.4.2 and then
deleted everything I could find relating to LibreOffice. I reinstalled 5.3.4.2
x86 and the behaviour is unchanged... works OK in Safe Mode and fails in normal
mode.

I did find a fix/workaround:

I renamed _ssl.pyd to _ssl.pyd(old) in
C:\Program Files (x86)\LibreOffice 5\program\python-core-3.3.0\lib\

I then copied _ssl.pyd from my standalone Python installation at
C:\Program Files (x86)\Python\DLLs\
and pasted it into the above folder.

LibreOffice now works OK, even though the two files are clearly very
different: the original _ssl.pyd was just 48kB and the replacement is 1162kB.
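
The rename-and-copy steps above could be scripted roughly as follows. This is
only a sketch of the workaround described, under the assumption that the donor
Python matches the bundled one in version and bitness (a .pyd built for a
different Python version or architecture will not load). The function name
swap_ssl_module and its parameters are hypothetical, and writing under
Program Files normally needs an elevated prompt.

```python
import os
import shutil


def swap_ssl_module(lo_lib_dir, donor_dll_dir, backup_suffix="(old)"):
    """Back up LibreOffice's _ssl.pyd and copy in the donor's copy.

    Hypothetical helper. Returns (backup_path, original_size, new_size),
    with sizes in bytes.
    """
    target = os.path.join(lo_lib_dir, "_ssl.pyd")
    donor = os.path.join(donor_dll_dir, "_ssl.pyd")
    backup = target + backup_suffix
    if not os.path.isfile(donor):
        raise FileNotFoundError(donor)
    if os.path.exists(backup):
        # Refuse to overwrite an earlier backup of the original file.
        raise FileExistsError(backup)
    original_size = os.path.getsize(target)
    os.rename(target, backup)      # keep the original as _ssl.pyd(old)
    shutil.copy2(donor, target)    # drop in the replacement
    return backup, original_size, os.path.getsize(target)


# Example call with the paths from this report (run on the affected machine):
# swap_ssl_module(
#     r"C:\Program Files (x86)\LibreOffice 5\program\python-core-3.3.0\lib",
#     r"C:\Program Files (x86)\Python\DLLs",
# )
```

Keeping the original under a different name makes the change easy to undo:
delete the copied file and rename the backup to _ssl.pyd again.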

Any idea why I am getting this problem on Windows 7?
