wprintf() is not portable.

The wprintf() function seems like a very useful function for modern application software. It speaks wide characters, allowing one (assuming that one's implementation means something like UTF-16 or UTF-32 by "wide character") to Unicodify one's application yet further, in a way that is easily portable from C/C++ compiler to C/C++ compiler; and it is standardized. (See the page for fwprintf() in the Single Unix Specification version 6 for one of the two standards that define it.)

That's the theory from the standardization perspective, at least. Unfortunately, real implementations of the C and C++ languages diverge disastrously from the standards, which makes the function non-portable across implementations. Moreover, these incompatibilities lie in some fundamental and often-used parts of the function: the output of characters and strings.

As a result, it turns out to be impossible to be both standards-conformant and portable when calling wprintf() to print SBCS/MBCS characters and strings, and only %lc and %ls are safe for wide characters and strings.
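
One way to sidestep the narrow-string problem is to convert narrow strings to wide ones before formatting, so that only %ls ever appears in the format string. This is not something the article itself proposes; it is a minimal sketch under stated assumptions. The helper name format_greeting and the 64-element buffer are hypothetical, swprintf() is assumed to have its ISO C99 signature (with a size parameter), and mbstowcs() interprets the input according to the current locale's multibyte encoding.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

#define WIDE_NAME_MAX 64  /* hypothetical limit chosen for this sketch */

/* Hypothetical helper: formats a message that contains a narrow string,
   using only the %ls specifier in the wide format string. */
int format_greeting(wchar_t *out, size_t out_len, const char *narrow_name)
{
    wchar_t wide_name[WIDE_NAME_MAX];
    size_t n;

    assert(out != NULL && narrow_name != NULL);

    /* Convert at most WIDE_NAME_MAX - 1 wide characters, leaving room
       for the terminator that mbstowcs() omits when it fills the buffer. */
    n = mbstowcs(wide_name, narrow_name, WIDE_NAME_MAX - 1);
    if (n == (size_t)-1)
        return -1;              /* invalid multibyte sequence */
    wide_name[n] = L'\0';

    /* Only %ls is used, so no narrow-string specifier is needed. */
    return swprintf(out, out_len, L"Hello, %ls!", wide_name);
}
```

A caller would pass a wide buffer and its length in elements, for example format_greeting(buf, 64, "world"), and then print buf with a %ls specifier. Names longer than the fixed buffer are silently truncated in this sketch.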

Implementation behaviours

These are the main variant behaviours across implementations of the C and C++ languages:

These format specifiers imply these arguments:

| C/C++ implementation(s) | %hs [1] | %s | %ls | %hS [2] | %S | %lS [2] | %hc [1] | %c | %lc | %hC [2] | %C | %lC [2] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Microsoft Visual C/C++ 7.1 (see the type and the size/distance documentation) | const char * | const wchar_t * | const wchar_t * | const char * | const char * | const wchar_t * | int [4] | wint_t [3] | wint_t [3] | int [4] | int [4] | wint_t [3] |
| OpenWatcom C/C++ | const char * | const wchar_t * | const wchar_t * | const wchar_t * [5] | const wchar_t * | const wchar_t * [5] | int [4] | wint_t [3] | wint_t [3] | wint_t [3] [5] | wint_t [3] | wint_t [3] [5] |
| GNU libc, and thus C/C++ compilers that use it (see the type and the conversion documentation) | undocumented | const char * | const wchar_t * | undocumented | const wchar_t * | undocumented | undocumented | int [4] | wint_t [3] | undocumented | wint_t [3] | undocumented |
| OpenVMS C library, and thus C/C++ compilers that use it (see the output conversion documentation) | undocumented | const char * | const wchar_t * | undocumented | const wchar_t * | undocumented | undocumented | int [4] | wint_t [3] | undocumented | wint_t [3] | undocumented |

  1. See footnote #1 to the next table. These, where they are documented, are all implementation extensions.

  2. See footnote #2 to the next table.

  3. Formally, because it is a variable arguments function, any wchar_t argument is promoted to wint_t to be passed to the function.

  4. Formally, because it is a variable arguments function, any char argument is promoted to int to be passed to the function.

  5. These format specifiers are not documented in the OpenWatcom C library reference (in contrast to Microsoft's documentation, which does document them), but the OpenWatcom C library supports them in practice. The behaviour of %hS in OpenWatcom C/C++ is quirky, and in practice varies according to what actual string data are passed to it to be printed. OpenWatcom is clearly aping the non-standards-conformant behaviour of the Microsoft compiler, but it does so quite badly.
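
As an illustration of footnote 3, the sketch below passes a wide character to %lc, the one wide-character specifier that every implementation in the table documents as taking a wint_t. The helper name format_wide_char is hypothetical, and swprintf() is assumed to have its ISO C99 signature. The cast is not strictly required; it only spells out the type that the variable argument list ends up carrying.

```c
#include <assert.h>
#include <wchar.h>

/* Hypothetical helper: writes a single wide character between brackets. */
int format_wide_char(wchar_t *out, size_t out_len, wchar_t wc)
{
    assert(out != NULL);

    /* %lc expects a wint_t argument; the cast makes that explicit
       rather than relying on the default argument promotions. */
    return swprintf(out, out_len, L"[%lc]", (wint_t)wc);
}
```

Calling format_wide_char(buf, 8, L'A') fills buf with the wide string "[A]" and returns the number of wide characters written.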

Or, to put it the other way around:

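| To print | the standards say to use | but in these C/C++ implementations | you actually have to use these format specifiers | so code that is both standards-conformant and portable |
|---|---|---|---|---|
| an SBCS/MBCS character | %c [1] | Microsoft Visual C/C++ 7.1 | %C, %hc, or %hC [2] | cannot exist |
| an SBCS/MBCS character | %c [1] | OpenWatcom C/C++ | %hc | cannot exist |
| an SBCS/MBCS character | %c [1] | GNU libc, and thus C/C++ compilers that use it | %c | cannot exist |
| an SBCS/MBCS character | %c [1] | OpenVMS C library, and thus C/C++ compilers that use it | %c | cannot exist |
| an SBCS/MBCS character string | %s [1] | Microsoft Visual C/C++ 7.1 | %S, %hs, or %hS [2] | cannot exist |
| an SBCS/MBCS character string | %s [1] | OpenWatcom C/C++ | %hs | cannot exist |
| an SBCS/MBCS character string | %s [1] | GNU libc, and thus C/C++ compilers that use it | %s | cannot exist |
| an SBCS/MBCS character string | %s [1] | OpenVMS C library, and thus C/C++ compilers that use it | %s | cannot exist |
| a wide character | %C [3] or %lc | Microsoft Visual C/C++ 7.1 | %c, %lc, or %lC [2] | must use %lc |
| a wide character | %C [3] or %lc | OpenWatcom C/C++ | %c, %lc, %C, %hC [2], or %lC [2] | must use %lc |
| a wide character | %C [3] or %lc | GNU libc, and thus C/C++ compilers that use it | %C or %lc | must use %lc |
| a wide character | %C [3] or %lc | OpenVMS C library, and thus C/C++ compilers that use it | %C or %lc | must use %lc |
| a wide character string | %S [3] or %ls | Microsoft Visual C/C++ 7.1 | %s, %ls, or %lS [2] | must use %ls |
| a wide character string | %S [3] or %ls | OpenWatcom C/C++ | %s, %ls, %S, %hS [2], or %lS [2] | must use %ls |
| a wide character string | %S [3] or %ls | GNU libc, and thus C/C++ compilers that use it | %S or %ls | must use %ls |
| a wide character string | %S [3] or %ls | OpenVMS C library, and thus C/C++ compilers that use it | %S or %ls | must use %ls |
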
  1. Strictly, and certainly if portability is one's aim, one cannot apply the h size modifier to the c or s format specifiers, as the results of doing so are not defined by either standard.

  2. Strictly, and certainly if portability is one's aim, one cannot apply the h and l size modifiers to the C or S format specifiers, as the results of doing so are not defined by either standard. These are all implementation extensions.

  3. The C language standard, ISO/IEC 9899:1999, does not define either %C or %S. The Linux C library documentation rather overcautiously says "Do not use." for these specifiers. In fact you can use them quite happily if POSIX systems are the only ones that you need to be portable to, because they are defined by the Single Unix Specification, and so, as the SUS itself notes, will be supported by any POSIX-conformant system.
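
If one accepts giving up strict standards conformance, the table can be turned into a compile-time choice of the narrow-string specifier. The following is a rough sketch, not a definitive solution. The macro name NARROW_STR_FMT and the helper format_narrow are hypothetical; _MSC_VER and __WATCOMC__ are the macros that those two compilers predefine. Other toolchains that happen to use Microsoft's runtime library are not handled, and swprintf() is assumed to have its ISO C99 signature, which old Microsoft libraries did not provide.

```c
#include <assert.h>
#include <wchar.h>

/* Per the table: Microsoft and OpenWatcom need the non-standard %hs for
   a narrow string in a wide format; GNU libc and OpenVMS use %s. */
#if defined(_MSC_VER) || defined(__WATCOMC__)
#  define NARROW_STR_FMT L"%hs"
#else
#  define NARROW_STR_FMT L"%s"
#endif

/* Hypothetical helper: formats a narrow string into a wide buffer. */
int format_narrow(wchar_t *out, size_t out_len, const char *narrow)
{
    assert(out != NULL && narrow != NULL);

    /* Adjacent wide string literals are concatenated by the compiler. */
    return swprintf(out, out_len, L"name=" NARROW_STR_FMT, narrow);
}
```

The same pattern would apply to a narrow character, choosing between %hc and %c. The result is portable across the four implementations listed, but by footnote 1 it is not standards-conformant on the branches that use the h modifier.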


© Copyright 2011 Jonathan de Boyne Pollard. "Moral" rights asserted.
Permission is hereby granted to copy and to distribute this web page in its original, unmodified form as long as its last modification datestamp is preserved.