
Difference between revisions of "cpp/locale/codecvt"

From cppreference.com
(fmt)
(fix link)
Line 44:
  {{dcl list mem ctor | cpp/locale/codecvt/codecvt | constructs a new codecvt facet }}
  {{dcl list prot mem dtor | cpp/locale/codecvt/~codecvt | destructs a codecvt facet }}
- {{dcl list mem fun | cpp/locale/codecvt/do_out | title=out | invokes {{tt|do_out}} }}
+ {{dcl list mem fun | cpp/locale/codecvt/out | invokes {{tt|do_out}} }}
- {{dcl list mem fun | cpp/locale/codecvt/do_in | title=in | invokes {{tt|do_in}} }}
+ {{dcl list mem fun | cpp/locale/codecvt/in | invokes {{tt|do_in}} }}
- {{dcl list mem fun | cpp/locale/codecvt/do_unshift | title=unshift | invokes {{tt|do_unshift}} }}
+ {{dcl list mem fun | cpp/locale/codecvt/unshift | invokes {{tt|do_unshift}} }}
- {{dcl list mem fun | cpp/locale/codecvt/do_encoding | title=encoding | invokes {{tt|do_encoding}} }}
+ {{dcl list mem fun | cpp/locale/codecvt/encoding | invokes {{tt|do_encoding}} }}
- {{dcl list mem fun | cpp/locale/codecvt/do_always_noconv | title=always_noconv | invokes {{tt|do_always_noconv}} }}
+ {{dcl list mem fun | cpp/locale/codecvt/always_noconv | invokes {{tt|do_always_noconv}} }}
- {{dcl list mem fun | cpp/locale/codecvt/do_length | title=length | invokes {{tt|do_length}} }}
+ {{dcl list mem fun | cpp/locale/codecvt/length | invokes {{tt|do_length}} }}
- {{dcl list mem fun | cpp/locale/codecvt/do_max_length | title=max_length | invokes {{tt|do_max_length}} }}
+ {{dcl list mem fun | cpp/locale/codecvt/max_length | invokes {{tt|do_max_length}} }}
  {{dcl list end}}
  

Revision as of 10:43, 12 October 2012

 
 
 
 

Defined in header <locale>

template<
    class InternT,
    class ExternT,
    class State
> class codecvt;

Class std::codecvt encapsulates conversion of character strings, including wide and multibyte, from one encoding to another. All file I/O operations performed through std::basic_fstream<CharT> use the std::codecvt<CharT, char, std::mbstate_t> facet of the locale imbued in the stream.
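As a rough illustration of the conversion described above, the sketch below obtains the std::codecvt<wchar_t, char, std::mbstate_t> facet from a locale and calls its in() member to turn a narrow string into a wide one. The helper name widen_with is hypothetical, and the buffer sizing assumes that every wide character consumes at least one narrow character, which holds for the usual multibyte encodings but is an assumption here.

```cpp
#include <cassert>
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of the standard library): converts a narrow
// string to a wide string with the codecvt facet of the given locale.
std::wstring widen_with(const std::locale& loc, const std::string& narrow)
{
    typedef std::codecvt<wchar_t, char, std::mbstate_t> facet_type;
    const facet_type& cvt = std::use_facet<facet_type>(loc);

    // Assumption: one wide character per narrow character is enough room.
    std::wstring wide(narrow.size(), L'\0');
    if (narrow.empty())
        return wide;

    std::mbstate_t state = std::mbstate_t(); // initial conversion state
    const char* from_next = nullptr;
    wchar_t* to_next = nullptr;
    std::codecvt_base::result r = cvt.in(
        state,
        narrow.data(), narrow.data() + narrow.size(), from_next,
        &wide[0], &wide[0] + wide.size(), to_next);

    if (r != std::codecvt_base::ok)
        throw std::runtime_error("narrow-to-wide conversion did not complete");

    assert(to_next <= &wide[0] + wide.size());
    wide.resize(static_cast<std::size_t>(to_next - &wide[0]));
    return wide;
}
```

For example, widen_with(std::locale::classic(), "abc") should produce L"abc"; with a UTF-8 locale the same call would decode multibyte sequences, provided such a locale is installed on the system.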

Inheritance diagram (std-codecvt-inheritance.svg): std::codecvt derives from std::codecvt_base and std::locale::facet.

Four specializations are provided by the standard library and are implemented by all locale objects created in a C++ program:

Defined in header <locale>
Specialization                                    Description
std::codecvt<char, char, std::mbstate_t>          identity conversion
std::codecvt<char16_t, char, std::mbstate_t>      conversion between UTF-16 and UTF-8 (since C++11)
std::codecvt<char32_t, char, std::mbstate_t>      conversion between UTF-32 and UTF-8 (since C++11)
std::codecvt<wchar_t, char, std::mbstate_t>       locale-specific conversion between wide string and narrow, possibly multibyte, string
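A quick way to see that these specializations are present in a locale is to query it with std::has_facet. The sketch below is only illustrative: both helper names are hypothetical, and the char16_t and char32_t specializations are deprecated as of C++20, so a newer compiler may emit deprecation warnings for them.

```cpp
#include <cassert>
#include <cwchar>
#include <locale>

// Hypothetical helper: true if the locale holds all four standard
// codecvt specializations listed above.
bool has_standard_codecvt_facets(const std::locale& loc)
{
    return std::has_facet<std::codecvt<char, char, std::mbstate_t>>(loc)
        && std::has_facet<std::codecvt<char16_t, char, std::mbstate_t>>(loc)
        && std::has_facet<std::codecvt<char32_t, char, std::mbstate_t>>(loc)
        && std::has_facet<std::codecvt<wchar_t, char, std::mbstate_t>>(loc);
}

// Hypothetical helper: true if the char-to-char facet reports that it never
// needs to convert anything (the identity conversion).
bool narrow_facet_is_identity(const std::locale& loc)
{
    typedef std::codecvt<char, char, std::mbstate_t> identity_facet;
    assert(std::has_facet<identity_facet>(loc));
    return std::use_facet<identity_facet>(loc).always_noconv();
}
```

Both functions are expected to return true for std::locale::classic().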


Member types

Member type     Definition
intern_type     InternT
extern_type     ExternT
state_type      State

Member objects

Member name     Type
id (static)     std::locale::id

Member functions

(constructor)     constructs a new codecvt facet (public member function)
(destructor)      destructs a codecvt facet (protected member function)
out               invokes do_out (public member function)
in                invokes do_in (public member function)
unshift           invokes do_unshift (public member function)
encoding          invokes do_encoding (public member function)
always_noconv     invokes do_always_noconv (public member function)
length            invokes do_length (public member function)
max_length        invokes do_max_length (public member function)
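The query functions listed above can be called directly on a facet obtained with std::use_facet. The following is a minimal sketch, assuming the wchar_t specialization; the names wide_codecvt, codecvt_summary, summarize, and narrow_prefix_length are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <string>

typedef std::codecvt<wchar_t, char, std::mbstate_t> wide_codecvt;

// Hypothetical aggregate collecting the answers of the query functions.
struct codecvt_summary {
    int encoding;       // -1: state-dependent, 0: variable width,
                        // otherwise a fixed number of narrow chars per wide char
    int max_length;     // upper bound on narrow chars consumed per wide char
    bool always_noconv; // true if the facet never converts anything
};

codecvt_summary summarize(const std::locale& loc)
{
    assert(std::has_facet<wide_codecvt>(loc));
    const wide_codecvt& cvt = std::use_facet<wide_codecvt>(loc);
    codecvt_summary s;
    s.encoding = cvt.encoding();
    s.max_length = cvt.max_length();
    s.always_noconv = cvt.always_noconv();
    return s;
}

// Number of narrow characters that would be consumed to produce at most
// max_wide wide characters from text.
int narrow_prefix_length(const std::locale& loc, const std::string& text,
                         std::size_t max_wide)
{
    const wide_codecvt& cvt = std::use_facet<wide_codecvt>(loc);
    std::mbstate_t state = std::mbstate_t();
    return cvt.length(state, text.data(), text.data() + text.size(), max_wide);
}
```

The exact values of encoding() and max_length() depend on the locale and the implementation, so they are best treated as hints for sizing buffers rather than as fixed constants.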

Protected member functions

do_out               (virtual protected member function)
do_in                (virtual protected member function)
do_unshift           (virtual protected member function)
do_encoding          (virtual protected member function)
do_always_noconv     (virtual protected member function)
do_length            (virtual protected member function)
do_max_length        (virtual protected member function)
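Because the public members simply forward to these protected virtual functions, a user-defined facet customizes conversion by overriding them. The sketch below is a toy example under that assumption: upper_out_codecvt and encode_upper are hypothetical names, and the facet only upper-cases characters on output while leaving do_in at the base-class identity behavior.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical facet for illustration: upper-cases every character written
// through out(); input is left to the base class (identity conversion).
class upper_out_codecvt : public std::codecvt<char, char, std::mbstate_t>
{
public:
    explicit upper_out_codecvt(std::size_t refs = 0)
        : std::codecvt<char, char, std::mbstate_t>(refs) {}

protected:
    result do_out(state_type&,
                  const intern_type* from, const intern_type* from_end,
                  const intern_type*& from_next,
                  extern_type* to, extern_type* to_end,
                  extern_type*& to_next) const override
    {
        from_next = from;
        to_next = to;
        while (from_next != from_end && to_next != to_end)
            *to_next++ = static_cast<char>(
                std::toupper(static_cast<unsigned char>(*from_next++)));
        // Report partial if the output buffer filled up before the input ended.
        return from_next == from_end ? ok : partial;
    }

    // Tell callers that conversion is actually required.
    bool do_always_noconv() const noexcept override { return false; }
};

// Hypothetical helper: runs text through the public out() member, which
// forwards to the overridden do_out().
std::string encode_upper(const std::string& text)
{
    upper_out_codecvt cvt;
    std::string encoded(text.size(), '\0');
    if (text.empty())
        return encoded;

    std::mbstate_t state = std::mbstate_t();
    const char* from_next = nullptr;
    char* to_next = nullptr;
    std::codecvt_base::result r = cvt.out(
        state,
        text.data(), text.data() + text.size(), from_next,
        &encoded[0], &encoded[0] + encoded.size(), to_next);
    if (r != std::codecvt_base::ok)
        throw std::runtime_error("conversion did not complete");
    assert(from_next == text.data() + text.size());
    return encoded;
}
```

Such a facet could also be installed in a locale, for instance with std::locale(std::locale::classic(), new upper_out_codecvt), and imbued into a file stream, in which case the stream's buffer would be expected to route output through do_out.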

Inherited from std::codecvt_base

Member type                                      Definition
enum result { ok, partial, error, noconv };      Unscoped enumeration type

Enumeration constant     Definition
ok                       conversion was completed with no error
partial                  not all source characters were converted
error                    encountered an invalid character
noconv                   no conversion required, input and output types are the same
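The result codes above can be observed with a small experiment. In this sketch, which uses the facets of the classic "C" locale, a deliberately short output buffer is expected to yield partial, and the char-to-char facet is expected to yield noconv. The helper names convert_into and identity_in_result are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <string>
#include <vector>

// Hypothetical helper: converts narrow to wide using an output buffer that
// holds at most capacity wide characters, stores whatever was converted in
// converted, and returns the facet's result code.
std::codecvt_base::result convert_into(const std::string& narrow,
                                       std::size_t capacity,
                                       std::wstring& converted)
{
    typedef std::codecvt<wchar_t, char, std::mbstate_t> facet_type;
    const facet_type& cvt = std::use_facet<facet_type>(std::locale::classic());

    std::vector<wchar_t> buffer(capacity + 1); // +1 keeps the buffer non-empty
    std::mbstate_t state = std::mbstate_t();
    const char* from_next = narrow.data();
    wchar_t* to_next = buffer.data();
    std::codecvt_base::result r = cvt.in(
        state,
        narrow.data(), narrow.data() + narrow.size(), from_next,
        buffer.data(), buffer.data() + capacity, to_next);

    assert(to_next <= buffer.data() + capacity);
    converted.assign(buffer.data(), to_next);
    return r;
}

// Hypothetical helper: result code reported by the identity (char to char)
// facet, which performs no conversion at all.
std::codecvt_base::result identity_in_result(const std::string& text)
{
    typedef std::codecvt<char, char, std::mbstate_t> facet_type;
    const facet_type& cvt = std::use_facet<facet_type>(std::locale::classic());

    std::vector<char> buffer(text.size() + 1);
    std::mbstate_t state = std::mbstate_t();
    const char* from_next = nullptr;
    char* to_next = nullptr;
    return cvt.in(state,
                  text.data(), text.data() + text.size(), from_next,
                  buffer.data(), buffer.data() + text.size(), to_next);
}
```

A caller that receives partial would typically drain the output buffer and call in() again starting at from_next; on noconv the input can be used as is.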

Example

The following example reads a UTF-8 file using a locale that implements UTF-8 conversion in codecvt<wchar_t, char, mbstate_t>.

#include <iostream>
#include <fstream>
#include <string>
#include <locale>
#include <iomanip>
int main()
{
    // UTF-8 narrow multibyte encoding of "zß水𝄋" (U+007A U+00DF U+6C34 U+1D10B);
    // before C++20 this can also be written u8"z\u00df\u6c34\U0001d10b"
    std::ofstream("text.txt") << "\x7a\xc3\x9f\xe6\xb0\xb4\xf0\x9d\x84\x8b";
    std::wifstream fin("text.txt");
    // throws std::runtime_error if this locale is not installed
    fin.imbue(std::locale("en_US.UTF-8")); // this locale's codecvt<wchar_t, char, mbstate_t>
                                           // converts UTF-8 to UCS4 (where wchar_t is 32 bits wide)
    std::cout << "The UTF-8 file contains the following wide characters: \n";
    for(wchar_t c; fin >> c; )
    {
        // cast to an integer type: narrow streams do not format wchar_t directly
        std::cout << "U+" << std::hex << std::setw(4) << std::setfill('0')
                  << static_cast<unsigned long>(c) << '\n';
    }
}

Output:

The UTF-8 file contains the following wide characters:
U+007a
U+00df
U+6c34
U+1d10b

See also

Character conversions

UTF-16
    locale-defined multibyte (UTF-8, GB18030): mbrtoc16 / c16rtomb (with C11's DR488)
    UTF-8: codecvt<char16_t,char,mbstate_t>, codecvt_utf8_utf16<char16_t>, codecvt_utf8_utf16<char32_t>, codecvt_utf8_utf16<wchar_t>
    UTF-16: N/A
UCS-2
    locale-defined multibyte (UTF-8, GB18030): c16rtomb (without C11's DR488)
    UTF-8: codecvt_utf8<char16_t>
    UTF-16: codecvt_utf16<char16_t>
UTF-32
    locale-defined multibyte (UTF-8, GB18030): mbrtoc32 / c32rtomb
    UTF-8: codecvt<char32_t,char,mbstate_t>, codecvt_utf8<char32_t>
    UTF-16: codecvt_utf16<char32_t>
system wchar_t: UTF-32 (non-Windows), UCS-2 (Windows)
    locale-defined multibyte (UTF-8, GB18030): mbsrtowcs / wcsrtombs, use_facet<codecvt<wchar_t,char,mbstate_t>>(locale)
    UTF-8: codecvt_utf8<wchar_t>
    UTF-16: codecvt_utf16<wchar_t>

codecvt_utf8           (class template)
codecvt_utf16          (class template)
codecvt_utf8_utf16     (class template)
codecvt_base           defines character conversion errors (class)
codecvt_byname         creates a codecvt facet for the named locale (class template)