<html>
<head>
<base href="https://bugs.freedesktop.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - Support unicode args and console output on windows"
href="https://bugs.freedesktop.org/show_bug.cgi?id=103693">103693</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Support unicode args and console output on windows
</td>
</tr>
<tr>
<th>Product</th>
<td>poppler
</td>
</tr>
<tr>
<th>Version</th>
<td>unspecified
</td>
</tr>
<tr>
<th>Hardware</th>
<td>Other
</td>
</tr>
<tr>
<th>OS</th>
<td>Windows (All)
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>medium
</td>
</tr>
<tr>
<th>Component</th>
<td>utils
</td>
</tr>
<tr>
<th>Assignee</th>
<td>poppler-bugs@lists.freedesktop.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>ajohnson@redneon.com
</td>
</tr></table>
<p>
<div>
<pre>Currently, using the utils in a Windows console does not work with Unicode,
either for the command line arguments or for the output.
I looked into the possible solutions for this. The command line arguments are
easy to fix: GetCommandLineW() returns the Unicode command line.
Getting Unicode output on the console is a bit more work. There are three ways:
- Set the code page to UTF-8, after which printf with UTF-8 should just work,
except for when it doesn't. UTF-8 support is a second-class citizen in Windows
and very buggy. Just redirecting the output to "more" crashed it.
- Use the wide IO functions, e.g. wprintf(). This only works if the IO mode is
set to 16-bit, i.e. _setmode(_fileno(stdout), _O_U16TEXT). The only problem is
that once the mode is changed to 16-bit, the 8-bit IO functions no longer work:
printf() crashes.
- Use the WriteConsoleW() function. This seems to be the best solution as it
can be mixed with printf. I did some testing, and Windows flushes stdio after
each end of line, so printf and WriteConsoleW() can be interleaved provided
they are not used for the same line.
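
The WriteConsoleW() option might look roughly like the sketch below. It is
only an illustration: writeUtf8Line is a hypothetical helper name, and the
choice to detect a console with GetConsoleMode() and to fall back to plain
stdio for pipes and files is an assumption, not something taken from a patch.

```cpp
#include <cstdio>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: writes one UTF-8 encoded line to stdout.
// Returns true if the whole string was written.
bool writeUtf8Line(const std::string &utf8)
{
    if (utf8.empty())
        return true;
#ifdef _WIN32
    HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
    DWORD mode = 0;
    // GetConsoleMode() succeeds only for a real console, not a pipe or file.
    if (out != INVALID_HANDLE_VALUE && GetConsoleMode(out, &mode)) {
        // First call measures the UTF-16 length, second call converts.
        int wlen = MultiByteToWideChar(CP_UTF8, 0, utf8.data(),
                                       (int)utf8.size(), nullptr, 0);
        if (wlen <= 0)
            return false;
        std::wstring wide(wlen, L'\0');
        MultiByteToWideChar(CP_UTF8, 0, utf8.data(), (int)utf8.size(),
                            &wide[0], wlen);
        fflush(stdout); // keep ordering with any earlier printf output
        DWORD written = 0;
        return WriteConsoleW(out, wide.data(), (DWORD)wlen, &written, nullptr)
            && written == (DWORD)wlen;
    }
#endif
    // Not a console (or not Windows): pass the UTF-8 bytes through stdio.
    return fwrite(utf8.data(), 1, utf8.size(), stdout) == utf8.size();
}
```

Calling writeUtf8Line("caf\xC3\xA9\n") would send the text through
WriteConsoleW() on a Windows console and through fwrite() everywhere else.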
The next problem is finding the easiest way to support this in poppler. For the
command line arguments I created a Win32Console class that converts the args to
UTF-8, e.g.
int main(int argc, char **argv)
{
    Win32Console win32Console(&argc, &argv);
    ...
}
On Windows it gets the Unicode args, converts them to UTF-8 and stores them in
argc/argv. On other platforms it is a no-op.
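
For illustration, a class with this behaviour could be sketched as follows.
This is not the actual patch: the member names (storage, pointers) are
hypothetical, and splitting the command line with CommandLineToArgvW() (which
needs shell32 at link time) is an assumption.

```cpp
#include <string>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#include <shellapi.h> // CommandLineToArgvW
#endif

// Sketch of a Win32Console-like helper. The object must outlive every use of
// argv, because on Windows argv points into its members.
class Win32Console
{
public:
    Win32Console(int *argc, char ***argv)
    {
#ifdef _WIN32
        int wargc = 0;
        LPWSTR *wargv = CommandLineToArgvW(GetCommandLineW(), &wargc);
        if (!wargv)
            return; // leave argc/argv untouched on failure
        for (int i = 0; i < wargc; i++) {
            // With a length of -1 the returned size includes the final NUL.
            int len = WideCharToMultiByte(CP_UTF8, 0, wargv[i], -1,
                                          nullptr, 0, nullptr, nullptr);
            std::string arg(len > 0 ? len : 1, '\0');
            if (len > 0)
                WideCharToMultiByte(CP_UTF8, 0, wargv[i], -1,
                                    &arg[0], len, nullptr, nullptr);
            storage.push_back(arg);
        }
        LocalFree(wargv);
        // Take pointers only after storage has stopped growing.
        for (std::string &s : storage)
            pointers.push_back(&s[0]);
        pointers.push_back(nullptr); // argv[argc] must be a null pointer
        *argc = wargc;
        *argv = pointers.data();
#else
        (void)argc; // no-op on other platforms
        (void)argv;
#endif
    }

    Win32Console(const Win32Console &) = delete;
    Win32Console &operator=(const Win32Console &) = delete;

private:
#ifdef _WIN32
    std::vector<std::string> storage; // UTF-8 copies of the arguments
    std::vector<char *> pointers;     // replacement argv array
#endif
};
```

Keeping the converted strings inside the object is one way to make the
replacement argv stay valid for the lifetime of main().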
For the output there are three ways it could be done:
- Create a DLL that overrides printf and related functions. I couldn't get
this to work. It seems to depend on the order of linking, and MSVC links the
system libraries first.
- Write our own utf8_printf that calls WriteConsoleW on Windows or printf on
other platforms. I would prefer not to introduce a non-standard function that
would need to be used everywhere in the code.
- #define printf to a replacement function on Windows.
I ended up going with the #define solution. The Win32Console header redefines
printf/fprintf/fputc on Windows to use WriteConsoleW when outputting to the
console.</pre>
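<pre>A rough sketch of the #define approach is shown below. It is an illustration
under stated assumptions, not the code from the patch: win32_fprintf is a
hypothetical name, only printf and fprintf are redirected (fputc is omitted),
and the console check via GetConsoleMode() is an assumption.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#include <io.h> // _get_osfhandle
#endif

// Hypothetical replacement for fprintf. Formats into a buffer first, then
// sends the UTF-8 result to WriteConsoleW() or to stdio.
int win32_fprintf(FILE *stream, const char *format, ...)
{
    va_list args, copy;
    va_start(args, format);
    va_copy(copy, args);
    int n = vsnprintf(nullptr, 0, format, copy); // measure the output
    va_end(copy);
    if (n < 0) {
        va_end(args);
        return n;
    }
    std::vector<char> buf(n + 1);
    vsnprintf(buf.data(), buf.size(), format, args);
    va_end(args);
    if (n == 0)
        return 0;
#ifdef _WIN32
    HANDLE h = reinterpret_cast<HANDLE>(_get_osfhandle(_fileno(stream)));
    DWORD mode = 0;
    if (h != INVALID_HANDLE_VALUE && GetConsoleMode(h, &mode)) {
        int wlen = MultiByteToWideChar(CP_UTF8, 0, buf.data(), n, nullptr, 0);
        if (wlen <= 0)
            return -1;
        std::wstring wide(wlen, L'\0');
        MultiByteToWideChar(CP_UTF8, 0, buf.data(), n, &wide[0], wlen);
        fflush(stream); // keep ordering with buffered stdio output
        DWORD written = 0;
        if (!WriteConsoleW(h, wide.data(), (DWORD)wlen, &written, nullptr))
            return -1;
        return n;
    }
#endif
    // Redirected output, or not Windows: write the bytes unchanged.
    return fwrite(buf.data(), 1, n, stream) == (size_t)n ? n : -1;
}

#ifdef _WIN32
// Existing call sites keep using the standard names.
#define printf(...) win32_fprintf(stdout, __VA_ARGS__)
#define fprintf(...) win32_fprintf(__VA_ARGS__)
#endif
```

With such a header included, an unchanged call like printf("%s\n", title)
would go through the replacement on Windows and through the C library
elsewhere. Calls written as std::printf would not work with a function-like
macro of this kind.</pre>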
</div>
</p>
</body>
</html>