Difference between revisions of "cpp/string/char traits"
(Rearranged the whole page, removed inconsistencies and clarified that the member listed here are only guaranteed for standard specializations.) |
|||
Line 1: | Line 1: | ||
{{cpp/title|char_traits}} | {{cpp/title|char_traits}} | ||
{{cpp/string/char_traits/navbar}} | {{cpp/string/char_traits/navbar}} | ||
− | {{ddcl|header=string|1= | + | {{ddcl | header=string | 1= |
template< | template< | ||
class CharT | class CharT | ||
Line 9: | Line 9: | ||
The {{tt|char_traits}} class is a traits class template that abstracts basic character and string operations for a given character type. The defined operation set is such that generic algorithms almost always can be implemented in terms of it. It is thus possible to use such algorithms with almost any possible character or string type, just by supplying a customized {{tt|char_traits}} class. | The {{tt|char_traits}} class is a traits class template that abstracts basic character and string operations for a given character type. The defined operation set is such that generic algorithms almost always can be implemented in terms of it. It is thus possible to use such algorithms with almost any possible character or string type, just by supplying a customized {{tt|char_traits}} class. | ||
− | The {{tt|char_traits}} class template serves as a basis for explicit instantiations. The user can [[cpp/language/extending std|provide a specialization]] for any custom character types. Several | + | The {{tt|char_traits}} class template serves as a basis for explicit instantiations. The user can [[cpp/language/extending std|provide a specialization]] for any custom character types. Several specializations are defined for the standard character types. |
− | + | If an operation on traits emits an exception, the behavior is undefined. | |
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ===Standard specializations=== | |
+ | Member typedefs of standard specializations are as follows: | ||
+ | {| class="t-dsc-begin" | ||
+ | |- class="t-dsc-hitem" style="text-align:center" | ||
+ | ! Specialization | ||
+ | ! {{tt|char_type}} | ||
+ | ! {{tt|int_type}} | ||
+ | ! {{tt|pos_type}} | ||
+ | |- class="t-dsc" | ||
+ | | {{c/core|std::char_traits<char>}} | ||
+ | | {{c/core|char}} | ||
+ | | {{c/core|int}} | ||
+ | | {{lc|std::streampos}} | ||
+ | |- class="t-dsc" | ||
+ | | {{c/core|std::char_traits<wchar_t>}} | ||
+ | | {{c/core|wchar_t}} | ||
+ | | {{rlpt|wide#Types|std::wint_t}} | ||
+ | | {{lc|std::wstreampos}} | ||
+ | |- class="t-dsc" | ||
+ | | {{c/core|std::char_traits<char16_t>}} {{mark c++11}}{{nbsp|4}} | ||
+ | | {{c/core|char16_t}}{{nbsp|4}} | ||
+ | | {{lc|std::uint_least16_t}}{{nbsp|4}} | ||
+ | | {{lc|std::u16streampos}}{{nbsp|4}} | ||
+ | |- class="t-dsc" | ||
+ | | {{c/core|std::char_traits<char32_t>}} {{mark c++11}} | ||
+ | | {{c/core|char32_t}} | ||
+ | | {{lc|std::uint_least32_t}} | ||
+ | | {{lc|std::u32streampos}} | ||
+ | |- class="t-dsc" | ||
+ | | {{c/core|std::char_traits<char8_t>}} {{mark c++20}} | ||
+ | | {{c/core|char8_t}} | ||
+ | | {{c/core|unsigned int}} | ||
+ | | {{lc|std::u8streampos}} | ||
+ | |} | ||
− | + | {| class="t-dsc-begin" | |
− | + | |- class="t-dsc-hitem" style="text-align:center" | |
− | + | ! Member type | |
− | ! | + | ! Definition (same among all standard specializations) |
− | + | |- class="t-dsc" | |
− | |- | + | | {{tt|off_type}} |
− | + | | {{lc|std::streamoff}} | |
− | + | |- class="t-dsc" | |
− | + | | {{tt|state_type}} | |
− | + | | {{lc|std::mbstate_t}} | |
− | + | |- class="t-dsc" | |
− | + | | {{tt|comparison_category}} {{mark c++20}} | |
− | + | | {{lc|std::strong_ordering}} | |
− | + | ||
− | + | ||
− | + | ||
− | | | + | |
− | + | ||
− | + | ||
− | |{{ | + | |
− | + | ||
− | + | ||
− | |{{lc|std:: | + | |
− | |- | + | |
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | |{{ | + | |
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | + | ||
− | |{{lc|std:: | + | |
|} | |} | ||
− | {{ | + | The semantics of the member functions of standard specializations are defined as follows: |
− | + | {| class="t-dsc-begin" | |
− | }} | + | |- class="t-dsc-hitem" style="text-align:center" |
+ | ! Specialization | ||
+ | ! {{tt|assign}} | ||
+ | ! {{tt|eq}} | ||
+ | ! {{tt|lt}} | ||
+ | ! {{tt|eof}} | ||
+ | |- class="t-dsc" | ||
+ | | {{c/core|std::char_traits<char>}} | ||
+ | | {{c|1==}} | ||
+ | | {{c|1===}} for<br>{{c/core|unsigned char}}{{nbsp|4}} | ||
+ | | {{c|<}} for<br>{{c/core|unsigned char}}{{nbsp|4}} | ||
+ | | {{lc|EOF}} | ||
+ | |- class="t-dsc" | ||
+ | | {{c/core|std::char_traits<wchar_t>}} | ||
+ | | {{c|1==}} | ||
+ | | {{c|1===}} | ||
+ | | {{c|<}} | ||
+ | | {{rlpt|wide#Macros|WEOF}} | ||
+ | |- class="t-dsc" | ||
+ | | {{c/core|std::char_traits<char16_t>}} {{mark c++11}}{{nbsp|4}} | ||
+ | | {{c|1==}} | ||
+ | | {{c|1===}} | ||
+ | | {{c|<}} | ||
+ | | invalid UTF-16 code unit | ||
+ | |- class="t-dsc" | ||
+ | | {{c/core|std::char_traits<char32_t>}} {{mark c++11}} | ||
+ | | {{c|1==}} | ||
+ | | {{c|1===}} | ||
+ | | {{c|<}} | ||
+ | | invalid UTF-32 code unit | ||
+ | |- class="t-dsc" | ||
+ | | {{c/core|std::char_traits<char8_t>}} {{mark c++20}} | ||
+ | | {{c|1==}} | ||
+ | | {{c|1===}} | ||
+ | | {{c|<}} | ||
+ | | invalid UTF-8 code unit | ||
+ | |} | ||
+ | |||
+ | Standard specializations of {{tt|char_traits}} class template satisfy the requirements of {{named req|CharTraits}}. | ||
− | + | ===Member types=== | |
− | + | ||
{{dsc begin}} | {{dsc begin}} | ||
− | {{dsc | + | {{dsc hitem | Type | Definition}} |
− | {{dsc | + | {{dsc | {{tt|char_type}} | {{tt|CharT}}}} |
− | {{ | + | {{dsc | {{tt|int_type}} | an integer type that can hold all values of {{tt|char_type}} plus {{lc|EOF}}}} |
− | {{ | + | {{dsc | {{tt|off_type}} | ''implementation-defined''}} |
− | {{dsc | + | {{dsc | {{tt|pos_type}} | ''implementation-defined''}} |
− | {{ | + | {{dsc | {{tt|state_type}} | ''implementation-defined''}} |
− | {{ | + | |
− | {{ | + | |
− | {{dsc | + | |
− | {{ | + | |
− | {{dsc | + | |
− | {{dsc | + | |
{{dsc end}} | {{dsc end}} | ||
− | === | + | ===Member functions=== |
− | {{ | + | {{dsc begin}} |
+ | {{dsc inc | cpp/string/char_traits/dsc assign}} | ||
+ | {{dsc inc | cpp/string/char_traits/dsc cmp}} | ||
+ | {{dsc inc | cpp/string/char_traits/dsc move}} | ||
+ | {{dsc inc | cpp/string/char_traits/dsc copy}} | ||
+ | {{dsc inc | cpp/string/char_traits/dsc compare}} | ||
+ | {{dsc inc | cpp/string/char_traits/dsc length}} | ||
+ | {{dsc inc | cpp/string/char_traits/dsc find}} | ||
+ | {{dsc inc | cpp/string/char_traits/dsc to_char_type}} | ||
+ | {{dsc inc | cpp/string/char_traits/dsc to_int_type}} | ||
+ | {{dsc inc | cpp/string/char_traits/dsc eq_int_type}} | ||
+ | {{dsc inc | cpp/string/char_traits/dsc eof}} | ||
+ | {{dsc inc | cpp/string/char_traits/dsc not_eof}} | ||
+ | {{dsc end}} | ||
===Example=== | ===Example=== | ||
Line 92: | Line 135: | ||
|User-defined character traits may be used to provide [http://www.gotw.ca/gotw/029.htm case-insensitive comparison]: | |User-defined character traits may be used to provide [http://www.gotw.ca/gotw/029.htm case-insensitive comparison]: | ||
|code= | |code= | ||
− | |||
− | |||
#include <string> | #include <string> | ||
#include <string_view> | #include <string_view> | ||
+ | #include <iostream> | ||
+ | #include <cctype> | ||
struct ci_char_traits : public std::char_traits<char> | struct ci_char_traits : public std::char_traits<char> | ||
Line 130: | Line 173: | ||
static const char* find(const char* s, std::size_t n, char a) | static const char* find(const char* s, std::size_t n, char a) | ||
{ | { | ||
− | + | auto const ua (to_upper(a)); | |
while (n-- != 0) | while (n-- != 0) | ||
{ | { | ||
Line 161: | Line 204: | ||
Hello and heLLo are equal | Hello and heLLo are equal | ||
}} | }} | ||
+ | |||
+ | ===Defect reports=== | ||
+ | {{dr list begin}} | ||
+ | {{dr list item|wg=lwg|dr=467|std=C++98|before=for {{c/core|std::char_traits<char>}}, the semantics of {{tt|eq()}} and {{tt|lt()}}<br>are the same as the built-in {{c|1===}} and {{c|<}} on {{c/core|char}} respectively<ref>Most implementations call {{lc|std::memcmp()}} for efficiency, which interprets the data as arrays of {{c/core|unsigned char}}. If {{c/core|char}} [[cpp/language/types#Character types|is signed]] on such implementations, {{c/core|std::char_traits<char>}} fails to satisfy the requirements of {{named req|CharTraits}}.</ref>|after=changed to built-in {{c|1===}} and<br>{{c|<}} on {{c/core|unsigned char}}}} | ||
+ | {{dr list end}} | ||
+ | <references/> | ||
===See also=== | ===See also=== | ||
{{dsc begin}} | {{dsc begin}} | ||
{{dsc inc|cpp/string/dsc basic_string}} | {{dsc inc|cpp/string/dsc basic_string}} | ||
− | |||
− | |||
− | |||
− | |||
{{dsc end}} | {{dsc end}} | ||
{{langlinks|de|es|fr|it|ja|pt|ru|zh}} | {{langlinks|de|es|fr|it|ja|pt|ru|zh}} |
Revision as of 14:46, 12 September 2023
Defined in header <string>
template< class CharT > class char_traits;
The char_traits class is a traits class template that abstracts basic character and string operations for a given character type. The defined operation set is such that generic algorithms almost always can be implemented in terms of it. It is thus possible to use such algorithms with almost any possible character or string type, just by supplying a customized char_traits class.
The char_traits class template serves as a basis for explicit instantiations. The user can provide a specialization for any custom character types. Several specializations are defined for the standard character types.
If an operation on traits emits an exception, the behavior is undefined.
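As a rough illustration of a generic algorithm written purely in terms of a traits class, the following sketch counts matching characters in a null-terminated sequence. The function name count_matches is hypothetical and not part of the standard library; only Traits::length and Traits::find are standard operations.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a standard function): counts the characters in the
// null-terminated sequence s that match c, using only operations of Traits.
template<class CharT, class Traits = std::char_traits<CharT>>
std::size_t count_matches(const CharT* s, CharT c)
{
    std::size_t remaining = Traits::length(s); // characters before the terminator
    std::size_t count = 0;

    // Traits::find returns a pointer to the first match in [s, s + remaining),
    // or a null pointer if there is none.
    while (const CharT* hit = Traits::find(s, remaining, c))
    {
        ++count;
        remaining -= static_cast<std::size_t>(hit - s) + 1;
        s = hit + 1; // continue searching just past the match
    }
    return count;
}
```

Because the algorithm never compares characters directly, supplying a different Traits argument (for example, a case-insensitive one) changes what counts as a match without touching the function body.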
Standard specializations
Member typedefs of standard specializations are as follows:
Specialization | char_type | int_type | pos_type |
---|---|---|---|
std::char_traits<char> | char | int | std::streampos |
std::char_traits<wchar_t> | wchar_t | std::wint_t | std::wstreampos |
std::char_traits<char16_t> (C++11) | char16_t | std::uint_least16_t | std::u16streampos |
std::char_traits<char32_t> (C++11) | char32_t | std::uint_least32_t | std::u32streampos |
std::char_traits<char8_t> (C++20) | char8_t | unsigned int | std::u8streampos |
Member type | Definition (same among all standard specializations) |
---|---|
off_type | std::streamoff |
state_type | std::mbstate_t |
comparison_category (C++20) | std::strong_ordering |
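The typedefs listed in the tables above can be spot-checked at compile time. The sketch below covers only the pre-C++20 specializations so that it also builds in C++11 mode; the function name typedefs_match_tables is hypothetical.

```cpp
#include <cstdint>     // std::uint_least16_t, std::uint_least32_t
#include <cwchar>      // std::wint_t, std::mbstate_t
#include <ios>         // std::streamoff, std::streampos and related types
#include <string>
#include <type_traits>

// Hypothetical helper: true if the member typedefs agree with the tables above.
constexpr bool typedefs_match_tables()
{
    return std::is_same<std::char_traits<char>::char_type, char>::value
        && std::is_same<std::char_traits<char>::int_type, int>::value
        && std::is_same<std::char_traits<char>::pos_type, std::streampos>::value
        && std::is_same<std::char_traits<char>::off_type, std::streamoff>::value
        && std::is_same<std::char_traits<char>::state_type, std::mbstate_t>::value
        && std::is_same<std::char_traits<wchar_t>::int_type, std::wint_t>::value
        && std::is_same<std::char_traits<wchar_t>::pos_type, std::wstreampos>::value
        && std::is_same<std::char_traits<char16_t>::int_type, std::uint_least16_t>::value
        && std::is_same<std::char_traits<char16_t>::pos_type, std::u16streampos>::value
        && std::is_same<std::char_traits<char32_t>::int_type, std::uint_least32_t>::value
        && std::is_same<std::char_traits<char32_t>::pos_type, std::u32streampos>::value;
}

// Fails to compile if an implementation deviates from the tables.
static_assert(typedefs_match_tables(), "member typedefs differ from the tables");
```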
The semantics of the member functions of standard specializations are defined as follows:
Specialization | assign | eq | lt | eof |
---|---|---|---|---|
std::char_traits<char> | = | == for unsigned char | < for unsigned char | EOF |
std::char_traits<wchar_t> | = | == | < | WEOF |
std::char_traits<char16_t> (C++11) | = | == | < | invalid UTF-16 code unit |
std::char_traits<char32_t> (C++11) | = | == | < | invalid UTF-32 code unit |
std::char_traits<char8_t> (C++20) | = | == | < | invalid UTF-8 code unit |
Standard specializations of char_traits class template satisfy the requirements of CharTraits.
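A small sketch of how the eof column relates to int_type for std::char_traits<char>: every char value, once converted with to_int_type, stays distinguishable from eof(), and converts back unchanged. The helper names below are hypothetical and exist only for illustration.

```cpp
#include <cstdio>   // EOF
#include <string>

// Hypothetical helper: true if c, converted to int_type, differs from eof().
inline bool distinct_from_eof(char c)
{
    using T = std::char_traits<char>;
    return !T::eq_int_type(T::to_int_type(c), T::eof());
}

// Hypothetical helper: true if c survives a round trip through int_type.
inline bool round_trips(char c)
{
    using T = std::char_traits<char>;
    return T::eq(T::to_char_type(T::to_int_type(c)), c);
}

// Hypothetical helper: true if eof() has the value of the EOF macro.
inline bool eof_is_EOF()
{
    return std::char_traits<char>::eof() == EOF;
}
```

The byte 0xFF is the interesting case: where char is signed it has the value -1, which typically coincides with EOF, yet to_int_type maps it through unsigned char so the two remain distinct.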
Member types
Type | Definition |
---|---|
char_type | CharT |
int_type | an integer type that can hold all values of char_type plus EOF |
off_type | implementation-defined |
pos_type | implementation-defined |
state_type | implementation-defined |
Member functions
assign [static] | assigns a character (public static member function) |
eq, lt [static] | compares two characters (public static member function) |
move [static] | moves one character sequence onto another (public static member function) |
copy [static] | copies a character sequence (public static member function) |
compare [static] | lexicographically compares two character sequences (public static member function) |
length [static] | returns the length of a character sequence (public static member function) |
find [static] | finds a character in a character sequence (public static member function) |
to_char_type [static] | converts int_type to equivalent char_type (public static member function) |
to_int_type [static] | converts char_type to equivalent int_type (public static member function) |
eq_int_type [static] | compares two int_type values (public static member function) |
eof [static] | returns an eof value (public static member function) |
not_eof [static] | checks whether a character is eof value (public static member function) |
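To give a feel for the sequence operations, here is a minimal sketch that rotates a character buffer left by one position using only assign and move. The function rotate_left is a hypothetical helper, not a standard facility.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: rotates the n-character buffer left by one position,
// using only operations of Traits.
template<class CharT, class Traits = std::char_traits<CharT>>
void rotate_left(CharT* buf, std::size_t n)
{
    if (n < 2)
        return;

    CharT first;
    Traits::assign(first, buf[0]);       // first = buf[0]
    Traits::move(buf, buf + 1, n - 1);   // ranges overlap, so move rather than copy
    Traits::assign(buf[n - 1], first);   // buf[n - 1] = first
}
```

Traits::move is used here because the source and destination ranges overlap; Traits::copy requires that they do not.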
Example
User-defined character traits may be used to provide case-insensitive comparison:
#include <string>
#include <string_view>
#include <iostream>
#include <cctype>

struct ci_char_traits : public std::char_traits<char>
{
    static char to_upper(char ch)
    {
        return std::toupper((unsigned char) ch);
    }

    static bool eq(char c1, char c2)
    {
        return to_upper(c1) == to_upper(c2);
    }

    static bool lt(char c1, char c2)
    {
        return to_upper(c1) < to_upper(c2);
    }

    static int compare(const char* s1, const char* s2, std::size_t n)
    {
        while (n-- != 0)
        {
            if (to_upper(*s1) < to_upper(*s2))
                return -1;
            if (to_upper(*s1) > to_upper(*s2))
                return 1;
            ++s1;
            ++s2;
        }
        return 0;
    }

    static const char* find(const char* s, std::size_t n, char a)
    {
        auto const ua (to_upper(a));
        while (n-- != 0)
        {
            if (to_upper(*s) == ua)
                return s;
            s++;
        }
        return nullptr;
    }
};

template<class DstTraits, class CharT, class SrcTraits>
constexpr std::basic_string_view<CharT, DstTraits>
    traits_cast(const std::basic_string_view<CharT, SrcTraits> src) noexcept
{
    return {src.data(), src.size()};
}

int main()
{
    using namespace std::literals;

    constexpr auto s1 = "Hello"sv;
    constexpr auto s2 = "heLLo"sv;

    if (traits_cast<ci_char_traits>(s1) == traits_cast<ci_char_traits>(s2))
        std::cout << s1 << " and " << s2 << " are equal\n";
}
Output:
Hello and heLLo are equal
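The same idea can also parameterize std::basic_string directly. The sketch below restates a compact version of the traits so that it compiles on its own; the names ci_traits and ci_string are hypothetical and chosen only for this illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Compact restatement of case-insensitive traits (hypothetical name).
struct ci_traits : std::char_traits<char>
{
    static char up(char c)
    {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }

    static bool eq(char a, char b) { return up(a) == up(b); }
    static bool lt(char a, char b) { return up(a) < up(b); }

    static int compare(const char* a, const char* b, std::size_t n)
    {
        for (; n != 0; --n, ++a, ++b)
        {
            if (lt(*a, *b)) return -1;
            if (lt(*b, *a)) return 1;
        }
        return 0;
    }

    static const char* find(const char* s, std::size_t n, char c)
    {
        for (; n != 0; --n, ++s)
            if (eq(*s, c)) return s;
        return nullptr;
    }
};

// Hypothetical alias: a string type whose comparisons and single-character
// searches ignore case.
using ci_string = std::basic_string<char, ci_traits>;
```

Note that ci_string is a distinct type from std::string, so the two do not convert to each other implicitly and ci_string has no stream insertion operator of its own.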
Defect reports
The following behavior-changing defect reports were applied retroactively to previously published C++ standards.
DR | Applied to | Behavior as published | Correct behavior |
---|---|---|---|
LWG 467 | C++98 | for std::char_traits<char>, the semantics of eq() and lt() are the same as the built-in == and < on char respectively[1] | changed to built-in == and < on unsigned char |
[1] Most implementations call std::memcmp() for efficiency, which interprets the data as arrays of unsigned char. If char is signed on such implementations, std::char_traits<char> fails to satisfy the requirements of CharTraits.
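The effect of LWG 467 can be sketched by checking that the order given by lt() agrees with std::memcmp on the same bytes, including for values above 0x7F where signed and unsigned interpretations differ. The helper names are hypothetical.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: sign (-1, 0 or 1) of an int.
inline int sign(int v) { return (v > 0) - (v < 0); }

// Hypothetical helper: true if the order of a and b according to
// std::char_traits<char>::lt agrees with std::memcmp on the same bytes.
inline bool agrees_with_memcmp(char a, char b)
{
    using T = std::char_traits<char>;
    const int by_traits = T::lt(a, b) ? -1 : (T::lt(b, a) ? 1 : 0);
    return by_traits == sign(std::memcmp(&a, &b, 1));
}
```

With the resolution applied, std::char_traits<char>::lt('\x7f', '\x80') holds regardless of whether char is signed, whereas the built-in < on a signed char would order the two values the other way round.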
See also
basic_string | stores and manipulates sequences of characters (class template) |