[Mesa-dev] [PATCH 02/16] docs: Add python script that converts html to rst.

Eric Engestrom eric.engestrom at intel.com
Fri May 25 11:26:05 UTC 2018


On Thursday, 2018-05-24 17:27:05 -0700, Laura Ekstrand wrote:
> Use Beautiful Soup to fix bad html, then use pandoc for converting to
> rst.
> ---
>  docs/rstConverter.py | 23 +++++++++++++++++++++++
>  1 file changed, 23 insertions(+)
>  create mode 100755 docs/rstConverter.py
> 
> diff --git a/docs/rstConverter.py b/docs/rstConverter.py
> new file mode 100755
> index 0000000000..5321fdde8b
> --- /dev/null
> +++ b/docs/rstConverter.py
> @@ -0,0 +1,23 @@
> +#!/usr/bin/python3
> +import glob
> +import subprocess
> +from bs4 import BeautifulSoup
> +
> +pages = glob.glob("*.html")
> +pages += glob.glob("relnotes/*.html")
> +for filename in pages:
> +    # Fix some annoyingly bad html.
> +    with open(filename) as f:
> +        soup = BeautifulSoup(f, 'html5lib')
> +    soup.find("div", "header").extract() # Get rid of old header
> +    soup.iframe.extract() # Get rid of old contents bar.
> +    soup.find("div", "content").unwrap() # Strip the content div.

Good call on using Beautiful Soup to clean up the html before converting it!

> +
> +    # Write out the better html.
> +    with open(filename, 'wt') as f:
> +        f.write(str(soup))
> +
> +    # Convert to rst with pandoc.
> +    name = filename.split(".html")[0]
> +    bashCmd = "pandoc " + filename + " -o " + name + ".rst"
> +    subprocess.run(bashCmd.split())

Idea: delete the old html in the same commit that introduces the rst,
so that git picks it up as a rename with changes. Hopefully that would
make any given conversion easier to check 1:1?

(In case this is as unclear as I think it is: I'm thinking about how we
can review individual page conversions, say index.html -> index.rst, to
see that no release has been dropped in the process. If git shows this
as a rename with changes, I expect it will be easier to check than if
one commit creates all the rst files and another deletes all the html.)
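
To make the idea above concrete, here is a rough sketch of how the
script could stage the new rst and drop the old html in one go, so both
land in the same commit. This is only an illustration, not part of the
patch: the helper names (rst_path_for, conversion_commands,
convert_and_stage) are made up here, and it assumes pandoc and git are
on PATH and that the script runs inside the git checkout.

```python
import os
import subprocess


def rst_path_for(html_path):
    """Return the .rst path that sits next to the given .html file."""
    root, _ext = os.path.splitext(html_path)
    return root + ".rst"


def conversion_commands(html_path):
    """Build the argv lists for one page (hypothetical helper).

    Passing lists to subprocess.run avoids splitting a command string
    on whitespace, so paths containing spaces keep working.
    """
    rst_path = rst_path_for(html_path)
    return [
        # Convert the (already cleaned) html to rst.
        ["pandoc", html_path, "-o", rst_path],
        # Stage the new rst file.
        ["git", "add", rst_path],
        # Remove the old html from the index and the working tree.
        # -f is needed because the script rewrote the file in place,
        # so it no longer matches what git has recorded.
        ["git", "rm", "-f", "--quiet", html_path],
    ]


def convert_and_stage(html_path):
    """Run the commands for one page, stopping at the first failure."""
    for argv in conversion_commands(html_path):
        subprocess.run(argv, check=True)
```

Calling convert_and_stage(filename) at the end of the loop body and
then committing once would put each html removal and rst addition in
the same commit. Note that git's rename detection is similarity-based,
so a page that changes a lot in conversion may still show up as a
delete plus an add; lowering the threshold with something like
"git log -M30% --stat" when reviewing might help.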

