Difference between revisions of "cpp/string/char traits"

From cppreference.com
(Rearranged the whole page, removed inconsistencies and clarified that the member listed here are only guaranteed for standard specializations.)
(Undo revision 158851 by Xmcgcg (talk))
Line 1: Line 1:
 
{{cpp/title|char_traits}}
 
{{cpp/title|char_traits}}
 
{{cpp/string/char_traits/navbar}}
 
{{cpp/string/char_traits/navbar}}
{{ddcl|header=string|1=
+
{{ddcl | header=string | 1=
 
template<
 
template<
 
     class CharT  
 
     class CharT  
Line 9: Line 9:
 
The {{tt|char_traits}} class is a traits class template that abstracts basic character and string operations for a given character type. The defined operation set is such that generic algorithms almost always can be implemented in terms of it. It is thus possible to use such algorithms with almost any possible character or string type, just by supplying a customized {{tt|char_traits}} class.
 
The {{tt|char_traits}} class is a traits class template that abstracts basic character and string operations for a given character type. The defined operation set is such that generic algorithms almost always can be implemented in terms of it. It is thus possible to use such algorithms with almost any possible character or string type, just by supplying a customized {{tt|char_traits}} class.
  
The {{tt|char_traits}} class template serves as a basis for explicit instantiations. The user can [[cpp/language/extending std|provide a specialization]] for any custom character types. Several explicit specializations are provided for the standard character types (see below), other specializations are not required to satisfy the requirements of {{named req|CharTraits}}.
+
The {{tt|char_traits}} class template serves as a basis for explicit instantiations. The user can [[cpp/language/extending std|provide a specialization]] for any custom character types. Several specializations are defined for the standard character types.
  
===Specializations===
+
If an operation on traits emits an exception, the behavior is undefined.
The standard library provides the following standard specializations:
+
{{dsc begin}}
+
{{dsc header|string}}
+
{{dsc|{{c/core|std::char_traits<char>}}|the standard character traits of {{c/core|char}}}}
+
{{dsc|{{c/core|std::char_traits<wchar_t>}}|the standard character traits of {{c/core|wchar_t}}}}
+
{{dsc|{{c/core|std::char_traits<char8_t>}} {{mark c++20}}|the standard character traits of {{c/core|char8_t}}}}
+
{{dsc|{{c/core|std::char_traits<char16_t>}} {{mark c++11}}|the standard character traits of {{c/core|char16_t}}}}
+
{{dsc|{{c/core|std::char_traits<char32_t>}} {{mark c++11}}|the standard character traits of {{c/core|char32_t}}}}
+
{{dsc end}}
+
  
All these specializations satisfy the requirements of {{named req|CharTraits}}.
+
===Standard specializations===
 +
Member typedefs of standard specializations are as follows:
 +
{| class="t-dsc-begin"
 +
|- class="t-dsc-hitem" style="text-align:center"
 +
! Specialization
 +
! {{tt|char_type}}
 +
! {{tt|int_type}}
 +
! {{tt|pos_type}}
 +
|- class="t-dsc"
 +
| {{c/core|std::char_traits<char>}}
 +
| {{c/core|char}}
 +
| {{c/core|int}}
 +
| {{lc|std::streampos}}
 +
|- class="t-dsc"
 +
| {{c/core|std::char_traits<wchar_t>}}
 +
| {{c/core|wchar_t}}
 +
| {{rlpt|wide#Types|std::wint_t}}
 +
| {{lc|std::wstreampos}}
 +
|- class="t-dsc"
 +
| {{c/core|std::char_traits<char16_t>}} {{mark c++11}}{{nbsp|4}}
 +
| {{c/core|char16_t}}{{nbsp|4}}
 +
| {{lc|std::uint_least16_t}}{{nbsp|4}}
 +
| {{lc|std::u16streampos}}{{nbsp|4}}
 +
|- class="t-dsc"
 +
| {{c/core|std::char_traits<char32_t>}} {{mark c++11}}
 +
| {{c/core|char32_t}}
 +
| {{lc|std::uint_least32_t}}
 +
| {{lc|std::u32streampos}}
 +
|- class="t-dsc"
 +
| {{c/core|std::char_traits<char8_t>}} {{mark c++20}}
 +
| {{c/core|char8_t}}
 +
| {{c/core|unsigned int}}
 +
| {{lc|std::u8streampos}}
 +
|}
  
====Member types====
+
{| class="t-dsc-begin"
The standard specializations define the following member types required by {{named req|CharTraits}}:
+
|- class="t-dsc-hitem" style="text-align:center"
{|class=wikitable style="text-align: center;"
+
! Member type
!rowspan=2|{{tt|CharT}}
+
! Definition (same among all standard specializations)
!colspan=5|Member type
+
|- class="t-dsc"
|-
+
| {{tt|off_type}}
!{{nbsp}}{{tt|char_type}}{{nbsp}}
+
| {{lc|std::streamoff}}
!{{tt|int_type}}
+
|- class="t-dsc"
!{{tt|off_type}}
+
| {{tt|state_type}}
!{{tt|pos_type}}
+
| {{lc|std::mbstate_t}}
!{{tt|state_type}}
+
|- class="t-dsc"
|-
+
| {{tt|comparison_category}} {{mark c++20}}
|{{c/core|char}}
+
| {{lc|std::strong_ordering}}
|{{c/core|char}}
+
|{{c/core|int}}
+
|rowspan=5|{{nbsp}}{{lc|std::streamoff}}{{nbsp}}
+
|{{lc|std::streampos}}
+
|rowspan=5|{{nbsp}}{{lc|std::mbstate_t}}{{nbsp}}
+
|-
+
|{{c/core|wchar_t}}
+
|{{c/core|wchar_t}}
+
|{{ltt|cpp/string/wide#Types|std::wint_t}}
+
|{{lc|std::wstreampos}}
+
|-
+
|{{c/core|char8_t}}
+
|{{c/core|char8_t}}
+
|{{c/core|unsigned int}}
+
|{{lc|std::u8streampos}}
+
|-
+
|{{nbsp}}{{c/core|char16_t}}{{nbsp}}
+
|{{c/core|char16_t}}
+
|{{nbsp}}{{lc|std::uint_least16_t}}{{nbsp}}
+
|{{nbsp}}{{lc|std::u16streampos}}{{nbsp}}
+
|-
+
|{{c/core|char32_t}}
+
|{{c/core|char32_t}}
+
|{{lc|std::uint_least32_t}}
+
|{{lc|std::u32streampos}}
+
 
|}
 
|}
  
{{rrev|since=c++20|
+
The semantics of the member functions of standard specializations are defined are as follows:
On top of that, the standard specializations also define the member type {{tt|comparison_category}} as {{ltt std|cpp/utility/compare/strong_ordering}}.
+
{| class="t-dsc-begin"
}}
+
|- class="t-dsc-hitem" style="text-align:center"
 +
! Specialization
 +
! {{tt|assign}}
 +
! {{tt|eq}}
 +
! {{tt|lt}}
 +
! {{tt|eof}}
 +
|- class="t-dsc"
 +
| {{c/core|std::char_traits<char>}}
 +
| {{c|1==}}
 +
| {{c|1===}} for<br>{{c/core|unsigned char}}{{nbsp|4}}
 +
| {{c|<}} for<br>{{c/core|unsigned char}}{{nbsp|4}}
 +
| {{lc|EOF}}
 +
|- class="t-dsc"
 +
| {{c/core|std::char_traits<wchar_t>}}
 +
| {{c|1==}}
 +
| {{c|1===}}
 +
| {{c|<}}
 +
| {{rlpt|wide#Macros|WEOF}}
 +
|- class="t-dsc"
 +
| {{c/core|std::char_traits<char16_t>}} {{mark c++11}}{{nbsp|4}}
 +
| {{c|1==}}
 +
| {{c|1===}}
 +
| {{c|<}}
 +
| invalid UTF-16 code unit
 +
|- class="t-dsc"
 +
| {{c/core|std::char_traits<char32_t>}} {{mark c++11}}
 +
| {{c|1==}}
 +
| {{c|1===}}
 +
| {{c|<}}
 +
| invalid UTF-32 code unit
 +
|- class="t-dsc"
 +
| {{c/core|std::char_traits<char8_t>}} {{mark c++20}}
 +
| {{c|1==}}
 +
| {{c|1===}}
 +
| {{c|<}}
 +
| invalid UTF-8 code unit
 +
|}
 +
 
 +
Standard specializations of {{tt|char_traits}} class template satisfy the requirements of {{named req|CharTraits}}.
  
====Member functions====
+
===Member types===
The standard specializations define the following static member functions required by {{named req|CharTraits}}:
+
 
{{dsc begin}}
 
{{dsc begin}}
{{dsc inc|cpp/string/char_traits/dsc assign}}
+
{{dsc hitem | Type | Definition}}
{{dsc inc|cpp/string/char_traits/dsc cmp}}
+
{{dsc | {{tt|char_type}} | {{tt|CharT}}}}
{{dsc inc|cpp/string/char_traits/dsc move}}
+
{{dsc | {{tt|int_type}} | an integer type that can hold all values of {{tt|char_type}} plus {{lc|EOF}}}}
{{dsc inc|cpp/string/char_traits/dsc copy}}
+
{{dsc | {{tt|off_type}} | ''implementation-defined''}}
{{dsc inc|cpp/string/char_traits/dsc compare}}
+
{{dsc | {{tt|pos_type}} | ''implementation-defined''}}
{{dsc inc|cpp/string/char_traits/dsc length}}
+
{{dsc | {{tt|state_type}} | ''implementation-defined''}}
{{dsc inc|cpp/string/char_traits/dsc find}}
+
{{dsc inc|cpp/string/char_traits/dsc to_char_type}}
+
{{dsc inc|cpp/string/char_traits/dsc to_int_type}}
+
{{dsc inc|cpp/string/char_traits/dsc eq_int_type}}
+
{{dsc inc|cpp/string/char_traits/dsc eof}}
+
{{dsc inc|cpp/string/char_traits/dsc not_eof}}
+
 
{{dsc end}}
 
{{dsc end}}
  
===Notes===
+
===Member functions===
{{named req|CharTraits}} does not require defining the types and functions listed above as direct members, it only requires types like {{tt|X::type}} and expressions like {{c|X::func(args)}} are valid and have the required semantics. Users-defined character traits can be derived from other character traits classes and only override some of their members, see the example below.
+
{{dsc begin}}
 +
{{dsc inc | cpp/string/char_traits/dsc assign}}
 +
{{dsc inc | cpp/string/char_traits/dsc cmp}}
 +
{{dsc inc | cpp/string/char_traits/dsc move}}
 +
{{dsc inc | cpp/string/char_traits/dsc copy}}
 +
{{dsc inc | cpp/string/char_traits/dsc compare}}
 +
{{dsc inc | cpp/string/char_traits/dsc length}}
 +
{{dsc inc | cpp/string/char_traits/dsc find}}
 +
{{dsc inc | cpp/string/char_traits/dsc to_char_type}}
 +
{{dsc inc | cpp/string/char_traits/dsc to_int_type}}
 +
{{dsc inc | cpp/string/char_traits/dsc eq_int_type}}
 +
{{dsc inc | cpp/string/char_traits/dsc eof}}
 +
{{dsc inc | cpp/string/char_traits/dsc not_eof}}
 +
{{dsc end}}
  
 
===Example===
 
===Example===
Line 92: Line 135:
 
|User-defined character traits may be used to provide [http://www.gotw.ca/gotw/029.htm case-insensitive comparison]:
 
|User-defined character traits may be used to provide [http://www.gotw.ca/gotw/029.htm case-insensitive comparison]:
 
|code=
 
|code=
#include <cctype>
 
#include <iostream>
 
 
#include <string>
 
#include <string>
 
#include <string_view>
 
#include <string_view>
 +
#include <iostream>
 +
#include <cctype>
  
 
struct ci_char_traits : public std::char_traits<char>
 
struct ci_char_traits : public std::char_traits<char>
Line 130: Line 173:
 
     static const char* find(const char* s, std::size_t n, char a)
 
     static const char* find(const char* s, std::size_t n, char a)
 
     {
 
     {
         const auto ua{to_upper(a)};
+
         auto const ua (to_upper(a));
 
         while (n-- != 0)  
 
         while (n-- != 0)  
 
         {
 
         {
Line 161: Line 204:
 
Hello and heLLo are equal
 
Hello and heLLo are equal
 
}}
 
}}
 +
 +
===Defect reports===
 +
{{dr list begin}}
 +
{{dr list item|wg=lwg|dr=467|std=C++98|before=for {{c/core|std::char_traits<char>}}, the semantics of {{tt|eq()}} and {{tt|lt()}}<br>are the same as the built-in {{c|1===}} and {{c|<}} on {{c/core|char}} respectively<ref>Most implementations call {{lc|std::memcmp()}} for efficiency, which interprets the data as arrays of {{c/core|unsigned char}}. If {{c/core|char}} [[cpp/language/types#Character types|is signed]] on such implementations, {{c/core|std::char_traits<char>}} fails to satisfy the requirements of {{named req|CharTraits}}.</ref>|after=changed to built-in {{c|1===}} and<br>{{c|<}} on {{c/core|unsigned char}}}}
 +
{{dr list end}}
 +
<references/>
  
 
===See also===
 
===See also===
 
{{dsc begin}}
 
{{dsc begin}}
 
{{dsc inc|cpp/string/dsc basic_string}}
 
{{dsc inc|cpp/string/dsc basic_string}}
{{dsc inc|cpp/string/dsc basic_string_view}}
 
{{dsc inc|cpp/io/dsc basic_istream}}
 
{{dsc inc|cpp/io/dsc basic_ostream}}
 
{{dsc inc|cpp/io/dsc basic_streambuf}}
 
 
{{dsc end}}
 
{{dsc end}}
  
 
{{langlinks|de|es|fr|it|ja|pt|ru|zh}}
 
{{langlinks|de|es|fr|it|ja|pt|ru|zh}}

Revision as of 14:46, 12 September 2023

Defined in header <string>
template<
    class CharT
> class char_traits;

The char_traits class is a traits class template that abstracts basic character and string operations for a given character type. The defined operation set is such that generic algorithms can almost always be implemented in terms of it. It is thus possible to use such algorithms with almost any character or string type just by supplying a customized char_traits class.

The char_traits class template serves as a basis for explicit instantiations. The user can provide a specialization for any custom character type. Several specializations are defined for the standard character types.

If an operation on traits emits an exception, the behavior is undefined.
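
As a minimal sketch of a generic algorithm written purely in terms of the traits interface, the helper below counts occurrences of a character using only Traits::length and Traits::eq. The name count_char is hypothetical and not part of the standard library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a standard facility): counts how often c occurs
// in the null-terminated sequence s, using only operations from Traits.
template<class CharT, class Traits = std::char_traits<CharT>>
std::size_t count_char(const CharT* s, CharT c)
{
    std::size_t count = 0;
    const std::size_t n = Traits::length(s); // length without the terminator
    for (std::size_t i = 0; i != n; ++i)
        if (Traits::eq(s[i], c))             // equality as defined by the traits
            ++count;
    return count;
}
```

Because the character type and the traits are template parameters, the same function works for char and wchar_t sequences, and a custom traits class can change what "equal" means without touching the algorithm.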

Standard specializations

Member typedefs of standard specializations are as follows:

Specialization                       char_type   int_type              pos_type
std::char_traits<char>               char        int                   std::streampos
std::char_traits<wchar_t>            wchar_t     std::wint_t           std::wstreampos
std::char_traits<char16_t> (C++11)   char16_t    std::uint_least16_t   std::u16streampos
std::char_traits<char32_t> (C++11)   char32_t    std::uint_least32_t   std::u32streampos
std::char_traits<char8_t> (C++20)    char8_t     unsigned int          std::u8streampos

Member type                   Definition (same among all standard specializations)
off_type                      std::streamoff
state_type                    std::mbstate_t
comparison_category (C++20)   std::strong_ordering
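
The member types listed above can be checked at compile time. The following is a small sketch assuming a C++17 compiler (for std::is_same_v); the helper name shares_common_types is hypothetical.

```cpp
#include <cstdint>      // std::uint_least16_t, std::uint_least32_t
#include <cwchar>       // std::wint_t, std::mbstate_t
#include <ios>          // std::streamoff, std::streampos
#include <string>       // std::char_traits
#include <type_traits>  // std::is_same_v

// Per-specialization member types.
static_assert(std::is_same_v<std::char_traits<char>::char_type, char>);
static_assert(std::is_same_v<std::char_traits<char>::int_type, int>);
static_assert(std::is_same_v<std::char_traits<char>::pos_type, std::streampos>);
static_assert(std::is_same_v<std::char_traits<wchar_t>::int_type, std::wint_t>);
static_assert(std::is_same_v<std::char_traits<char16_t>::int_type, std::uint_least16_t>);
static_assert(std::is_same_v<std::char_traits<char32_t>::int_type, std::uint_least32_t>);

// Hypothetical helper: true if a standard specialization uses the
// off_type and state_type that are common to all of them.
template<class CharT>
constexpr bool shares_common_types()
{
    using traits = std::char_traits<CharT>;
    return std::is_same_v<typename traits::off_type, std::streamoff>
        && std::is_same_v<typename traits::state_type, std::mbstate_t>;
}

static_assert(shares_common_types<char>());
static_assert(shares_common_types<wchar_t>());
```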

The semantics of the member functions of standard specializations are defined as follows:

Specialization                       assign   eq                     lt                    eof
std::char_traits<char>               =        == for unsigned char   < for unsigned char   EOF
std::char_traits<wchar_t>            =        ==                     <                     WEOF
std::char_traits<char16_t> (C++11)   =        ==                     <                     invalid UTF-16 code unit
std::char_traits<char32_t> (C++11)   =        ==                     <                     invalid UTF-32 code unit
std::char_traits<char8_t> (C++20)    =        ==                     <                     invalid UTF-8 code unit

The standard specializations of the char_traits class template satisfy the requirements of CharTraits.
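
The semantics in the table can be observed directly. The sketch below assumes an ASCII-compatible execution character set and 8-bit char; the function names are hypothetical and exist only for illustration.

```cpp
#include <cstdio>   // EOF
#include <cwchar>   // WEOF
#include <string>   // std::char_traits

// lt() for char compares as unsigned char, so the byte 0x80 (128) orders
// after 'a' (97 in ASCII) even on platforms where plain char is signed.
inline bool high_byte_orders_after_ascii()
{
    return std::char_traits<char>::lt('a', '\x80');
}

// assign(r, c) has the semantics of r = c.
inline char assign_copies_value()
{
    char r = 'x';
    std::char_traits<char>::assign(r, 'y');
    return r;
}

// eof() yields EOF for char and WEOF for wchar_t.
inline bool eof_values_match_macros()
{
    return std::char_traits<char>::eof() == EOF
        && std::char_traits<wchar_t>::eof() == WEOF;
}
```

With a plain built-in comparison on a signed char, '\x80' would be negative and order before 'a'; the traits functions avoid that by comparing the unsigned char values.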

Member types

Type         Definition
char_type    CharT
int_type     an integer type that can hold all values of char_type plus EOF
off_type     implementation-defined
pos_type     implementation-defined
state_type   implementation-defined

Member functions

assign [static]         assigns a character (public static member function)
eq, lt [static]         compares two characters (public static member function)
move [static]           moves one character sequence onto another (public static member function)
copy [static]           copies a character sequence (public static member function)
compare [static]        lexicographically compares two character sequences (public static member function)
length [static]         returns the length of a character sequence (public static member function)
find [static]           finds a character in a character sequence (public static member function)
to_char_type [static]   converts int_type to equivalent char_type (public static member function)
to_int_type [static]    converts char_type to equivalent int_type (public static member function)
eq_int_type [static]    compares two int_type values (public static member function)
eof [static]            returns an eof value (public static member function)
not_eof [static]        checks whether a character is eof value (public static member function)
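
A brief sketch of how several of these static member functions behave for std::char_traits<char>. The demo_ function names are hypothetical, and the last function assumes 8-bit char.

```cpp
#include <cstddef>
#include <string>

using traits = std::char_traits<char>;

// length: number of characters before the null terminator.
inline std::size_t demo_length() { return traits::length("hello"); }

// compare: negative, zero, or positive for the first n characters.
inline int demo_compare() { return traits::compare("abc", "abd", 3); }

// find: pointer to the first match within the first n characters, or null.
// Returns the match offset here, or -1 if there is no match.
inline std::ptrdiff_t demo_find()
{
    const char* s = "hello";
    const char* p = traits::find(s, 5, 'l');
    return p ? p - s : -1;
}

// move: like copy, but the source and destination ranges may overlap.
inline std::string demo_move()
{
    char buf[] = "abcdef";
    traits::move(buf + 2, buf, 4); // shifts "abcd" two places to the right
    return std::string(buf, 6);
}

// to_int_type / to_char_type round-trip, and the eof helpers.
inline bool demo_eof_roundtrip()
{
    const traits::int_type i = traits::to_int_type('\xff');
    return !traits::eq_int_type(i, traits::eof())       // 0xff is not eof
        && traits::to_char_type(i) == '\xff'            // round-trips
        && traits::not_eof(traits::eof()) != traits::eof();
}
```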

Example

User-defined character traits may be used to provide case-insensitive comparison:

#include <string>
#include <string_view>
#include <iostream>
#include <cctype>
 
struct ci_char_traits : public std::char_traits<char>
{
    static char to_upper(char ch)
    {
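        // Convert through unsigned char: passing a negative value to std::toupper is undefined
        return static_cast<char>(std::toupper(static_cast<unsigned char>(ch)));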
    }
 
    static bool eq(char c1, char c2)
    {
        return to_upper(c1) == to_upper(c2);
    }
 
    static bool lt(char c1, char c2)
    {
        return to_upper(c1) < to_upper(c2);
    }
 
    static int compare(const char* s1, const char* s2, std::size_t n)
    {
        while (n-- != 0)
        {
            if (to_upper(*s1) < to_upper(*s2))
                return -1;
            if (to_upper(*s1) > to_upper(*s2))
                return 1;
            ++s1;
            ++s2;
        }
        return 0;
    }
 
    static const char* find(const char* s, std::size_t n, char a)
    {
        auto const ua (to_upper(a));
        while (n-- != 0) 
        {
            if (to_upper(*s) == ua)
                return s;
            s++;
        }
        return nullptr;
    }
};
 
template<class DstTraits, class CharT, class SrcTraits>
constexpr std::basic_string_view<CharT, DstTraits>
    traits_cast(const std::basic_string_view<CharT, SrcTraits> src) noexcept
{
    return {src.data(), src.size()};
}
 
int main()
{
    using namespace std::literals;
 
    constexpr auto s1 = "Hello"sv;
    constexpr auto s2 = "heLLo"sv;
 
    if (traits_cast<ci_char_traits>(s1) == traits_cast<ci_char_traits>(s2))
        std::cout << s1 << " and " << s2 << " are equal\n";
}

Output:

Hello and heLLo are equal
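
Custom traits can also be plugged into std::basic_string directly, which is the approach taken in the linked case-insensitive comparison article. The sketch below is an illustration under assumptions: ci_traits is a condensed, hypothetical variant of the ci_char_traits class above, and ci_string, ci_equal and ci_find are names introduced here.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical condensed case-insensitive traits; everything not
// overridden here is inherited from std::char_traits<char>.
struct ci_traits : std::char_traits<char>
{
    static char up(char c)
    {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    static bool eq(char a, char b) { return up(a) == up(b); }
    static bool lt(char a, char b) { return up(a) < up(b); }

    static int compare(const char* a, const char* b, std::size_t n)
    {
        for (; n != 0; --n, ++a, ++b)
        {
            if (lt(*a, *b)) return -1;
            if (lt(*b, *a)) return 1;
        }
        return 0;
    }

    static const char* find(const char* s, std::size_t n, char c)
    {
        for (; n != 0; --n, ++s)
            if (eq(*s, c))
                return s;
        return nullptr;
    }
};

// A string type whose comparisons and searches ignore case.
using ci_string = std::basic_string<char, ci_traits>;

inline bool ci_equal(const ci_string& a, const ci_string& b) { return a == b; }

inline std::size_t ci_find(const ci_string& hay, const ci_string& needle)
{
    return hay.find(needle);
}
```

One trade-off of this design is that ci_string is a distinct type from std::string, so it does not work with std::cout or convert to std::string without extra code; the traits_cast approach in the example above avoids that by re-viewing existing data.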

Defect reports

The following behavior-changing defect reports were applied retroactively to previously published C++ standards.

DR: LWG 467
Applied to: C++98
Behavior as published: for std::char_traits<char>, the semantics of eq() and lt() are the same as the built-in == and < on char respectively [1]
Correct behavior: changed to built-in == and < on unsigned char

[1] Most implementations call std::memcmp() for efficiency, which interprets the data as arrays of unsigned char. If char is signed on such implementations, std::char_traits<char> fails to satisfy the requirements of CharTraits.

See also

basic_string   stores and manipulates sequences of characters (class template)