[Mesa-dev] [PATCH v3] python: Rework bytes/unicode string handling

Mathieu Bridon bochecha at daitauha.fr
Fri Aug 17 13:03:09 UTC 2018


On Fri, 2018-08-17 at 13:29 +0100, Jose Fonseca wrote:
> This change caused one of our MSVC build machines to fail with
> 
> scons: Building targets ...
>    Generating build\windows-x86-debug\util\xmlpool\options.h ...
> Traceback (most recent call last):
>    File "src\util\xmlpool\gen_xmlpool.py", line 221, in <module>
>      print(line, end='')
> UnicodeEncodeError: 'ascii' codec can't encode character u'\u201c' in position 68: ordinal not in range(128)
> scons: *** [build\windows-x86-debug\util\xmlpool\options.h] Error 1

Argh!

I think that's because we're now printing a unicode string. On Python 3
that's the right thing to do: if we printed the encoded byte string,
we'd get the "b'…'" representation, which we certainly don't want:

>>> print(u'août')
août
>>> print(u'août'.encode('utf-8'))
b'ao\xc3\xbbt'


On Python 2, though, we should really print the byte string. Python 2
helpfully tries to encode the unicode string automatically and, when
stdout has no encoding of its own (e.g. when it is redirected to a
file), it falls back to its default encoding: ASCII.

That obviously fails when the string contains non-ASCII characters.
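
To illustrate, here is a small sketch that does explicitly what Python 2
ends up doing implicitly. The sample string is made up for the example;
only the U+201C character is taken from the traceback:

```python
# Encode a unicode string containing U+201C with the ASCII codec, which is
# what Python 2's print ends up doing when stdout has no encoding.
text = u'\u201cquoted\u201d'  # made-up sample; U+201C is from the traceback

try:
    text.encode('ascii')
    error = None
except UnicodeEncodeError as exc:
    error = exc

# The reason matches the message in the traceback.
print(error.reason)
```

This runs the same way on Python 2 and Python 3, since the encoding step
is spelled out instead of being left to print.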

I'll send a patch ASAP.
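
Roughly, I'd expect the fix to go in this direction. This is only a
sketch, not the actual patch: encode_for_output is a hypothetical helper
name that does not exist in gen_xmlpool.py, and hard-coding UTF-8 is an
assumption on my part:

```python
from __future__ import print_function, unicode_literals

import sys


def encode_for_output(text, encoding='utf-8'):
    """Return the object that should be handed to print() for `text`.

    Hypothetical helper for illustration only; the UTF-8 default is an
    assumption, not something the script is known to require.
    """
    if sys.version_info[0] < 3:
        # Python 2: encode explicitly so print never falls back to ASCII.
        return text.encode(encoding)
    # Python 3: print takes text directly; bytes would show up as b'...'.
    return text
```

The print call at line 221 would then become something like
print(encode_for_output(line), end='').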

> Setting PYTHONIOENCODING=utf-8 helps, but then bad things still
> happen when the output is loaded src/gallium/auxiliary/pipe-loader/
> 
> 
> But the fact is that everything was working before.
> 
> 
> Perhaps a solution is to just start using Python 3 for the
> generation scripts, as it might yield more consistent results.

That's a possibility, but then it means you need both Python 2 (for
SCons) and Python 3 (for the scripts). Requiring two Python stacks to
build a C codebase is pretty terrible. :-/


-- 
Mathieu


