
Difference between revisions of "cpp/string"

From cppreference.com
(46 intermediate revisions by 17 users not shown)

Latest revision as of 09:45, 6 November 2024

Characters

In the C++ standard library, a character is an object which, when treated sequentially, can represent text.

The term means not only objects of character types, but also any value that can be represented by a type that provides the definitions specified in the strings library and the following libraries:

* localization library
* input/output library
* regular expressions library (since C++11)

In the strings library and, since C++11, the regular expressions library, a character can only be of a char-like type, i.e. a non-array type that satisfies PODType (until C++20) or both TrivialType and StandardLayoutType (since C++20). Characters are therefore also referred to as char-like objects in these libraries.
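As a rough illustration of the char-like requirement, the C++20 wording can be approximated with standard type traits. The helper `is_char_like_v` and the type `Glyph` are hypothetical names introduced only for this sketch; they are not part of the standard library.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical helper (not a standard trait): approximates the C++20 wording
// "non-array type that is both trivial and standard-layout".
template<class T>
constexpr bool is_char_like_v =
    !std::is_array_v<T> && std::is_trivial_v<T> && std::is_standard_layout_v<T>;

// Hypothetical user-defined character type.
struct Glyph { unsigned short code; };

static_assert(is_char_like_v<char>);
static_assert(is_char_like_v<char32_t>);
static_assert(is_char_like_v<Glyph>);          // a simple struct qualifies
static_assert(!is_char_like_v<char[4]>);       // array types are excluded
static_assert(!is_char_like_v<std::string>);   // not a trivial type
```

The checks run at compile time, so an unsuitable type is rejected before any string is instantiated with it.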

Some standard library components accept character container types. They, too, are types used to represent individual characters. Such types are used for one of the template arguments of std::char_traits and the class templates which use std::char_traits.

Library components

The C++ strings library includes the following components:

Character traits

Many character-related class templates (such as std::basic_string) need a set of related types and functions to complete the definition of their semantics. These types and functions are provided as member typedef names and functions of the template parameter Traits used by each such template. A class that can complete those semantics must satisfy the CharTraits requirements.

The strings library provides the class template std::char_traits, which defines types and functions for std::basic_string and, since C++17, std::basic_string_view.
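One common way to see the Traits parameter at work is a case-insensitive string. The sketch below assumes single-byte characters in the default "C" locale; `ci_char_traits`, `ci_string` and `same_ignoring_case` are hypothetical names, and inheriting from std::char_traits<char> is just a convenient way to reuse the members that do not need to change.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical traits class: inherits everything from std::char_traits<char>
// and replaces only the comparison-related members.
struct ci_char_traits : std::char_traits<char> {
    static char fold(char c) {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    static bool eq(char a, char b) { return fold(a) == fold(b); }
    static bool lt(char a, char b) { return fold(a) < fold(b); }
    static int compare(const char* a, const char* b, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            if (lt(a[i], b[i])) return -1;
            if (lt(b[i], a[i])) return 1;
        }
        return 0;
    }
    static const char* find(const char* s, std::size_t n, char c) {
        for (std::size_t i = 0; i < n; ++i)
            if (eq(s[i], c)) return s + i;
        return nullptr;
    }
};

// A string type whose comparisons ignore case.
using ci_string = std::basic_string<char, ci_char_traits>;

bool same_ignoring_case(const char* a, const char* b) {
    return ci_string(a) == ci_string(b);
}
```

Because the traits class is part of the type, `ci_string` and `std::string` are distinct types and do not convert to each other implicitly.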

The following specializations are defined, all of which satisfy the CharTraits requirements:

Defined in header <string>
template<> class char_traits<char>;
template<> class char_traits<wchar_t>;
template<> class char_traits<char8_t>;     (since C++20)
template<> class char_traits<char16_t>;    (since C++11)
template<> class char_traits<char32_t>;    (since C++11)
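The static member functions of these specializations can also be called directly. The wrappers below (`traits_length`, `traits_compare`, `wide_length`) are hypothetical names used only to show a few of the calls.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Length of a null-terminated sequence, computed through the traits class.
std::size_t traits_length(const char* s) {
    return std::char_traits<char>::length(s);
}

// Lexicographic comparison of the first n characters:
// negative, zero, or positive.
int traits_compare(const char* a, const char* b, std::size_t n) {
    return std::char_traits<char>::compare(a, b, n);
}

// The same interface is available for the other specializations.
std::size_t wide_length(const wchar_t* s) {
    return std::char_traits<wchar_t>::length(s);
}
```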

When a user-defined character container type is used with std::basic_string or, since C++17, std::basic_string_view, a corresponding character traits class must also be provided (it can be a specialization of std::char_traits).

String classes (std::string etc.)

The class template std::basic_string generalizes how sequences of characters are manipulated and stored. String creation, manipulation, and destruction are all handled by a set of member functions and related non-member functions.

Several specializations of std::basic_string are provided for commonly-used types:

Defined in header <string>
Type                            Definition
std::string                     std::basic_string<char>
std::wstring                    std::basic_string<wchar_t>
std::u8string (since C++20)     std::basic_string<char8_t>
std::u16string (since C++11)    std::basic_string<char16_t>
std::u32string (since C++11)    std::basic_string<char32_t>
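The aliases in the table can be checked at compile time, and a short function shows a few of the member functions in use. `greet` is a hypothetical name chosen for this sketch.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Each alias names the corresponding std::basic_string specialization.
static_assert(std::is_same_v<std::string, std::basic_string<char>>);
static_assert(std::is_same_v<std::wstring, std::basic_string<wchar_t>>);
static_assert(std::is_same_v<std::u16string, std::basic_string<char16_t>>);
static_assert(std::is_same_v<std::u32string, std::basic_string<char32_t>>);

// Creation and manipulation through member functions and operators;
// the string releases its storage automatically when it goes out of scope.
std::string greet(const std::string& name) {
    std::string s = "Hello";
    s += ", ";
    s.append(name);
    s.push_back('!');
    return s;
}
```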

String view classes (std::string_view etc.)

The class template std::basic_string_view provides a lightweight object that offers read-only access to a string or a part of a string using an interface similar to the interface of std::basic_string.

Several specializations of std::basic_string_view are provided for commonly-used types:

Defined in header <string_view>
Type                                 Definition
std::string_view (since C++17)       std::basic_string_view<char>
std::wstring_view (since C++17)      std::basic_string_view<wchar_t>
std::u8string_view (since C++20)     std::basic_string_view<char8_t>
std::u16string_view (since C++17)    std::basic_string_view<char16_t>
std::u32string_view (since C++17)    std::basic_string_view<char32_t>
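A small sketch of the read-only, non-owning nature of a string view: the function below returns a view into the caller's characters without copying them. `file_name` is a hypothetical helper, and the '/' separator is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Returns the part of `path` after the last '/', or all of `path` if it
// contains no '/'. The result refers to the caller's characters; nothing
// is copied, so the underlying string must outlive the returned view.
std::string_view file_name(std::string_view path) {
    const std::size_t slash = path.rfind('/');
    return slash == std::string_view::npos ? path : path.substr(slash + 1);
}
```

Both string literals and std::string objects convert to std::string_view implicitly, so the same function serves either kind of argument.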

Null-terminated sequence utilities

Null-terminated character sequences (NTCTS) are sequences of characters that are terminated by a null character (the value after value-initialization).

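The strings library provides functions to create, inspect, and modify such sequences:

* null-terminated byte strings (NTBS) helper functions (including support of wide character types),
* null-terminated multibyte strings (NTMBS) helper functions.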
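A minimal sketch of the byte-string and wide-string helpers, using functions from <cstring> and <cwchar>. The wrapper names `demo_ntbs` and `wide_len` are hypothetical, and the buffer size is chosen so that the copied and appended text, including the terminating null character, is known to fit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <cwchar>

// Create, modify, and inspect a null-terminated byte string in a local buffer.
bool demo_ntbs() {
    char buf[16];                 // room for "abcdef" plus the terminator
    std::strcpy(buf, "abc");      // create: copy including the null character
    std::strcat(buf, "def");      // modify: append to the existing sequence
    return std::strlen(buf) == 6  // inspect: length excludes the terminator
        && std::strcmp(buf, "abcdef") == 0;
}

// The wide-character counterpart of std::strlen.
std::size_t wide_len(const wchar_t* s) {
    return std::wcslen(s);
}
```

These functions do not check buffer sizes, so the caller is responsible for ensuring that the destination is large enough.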

Relevant libraries

The localization library provides support for string conversions (e.g. std::toupper), character classification functions (e.g. std::isspace), and text encoding recognition (std::text_encoding).
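As a sketch of the locale-aware overloads declared in <locale>, the helpers below convert and classify characters one at a time. `to_upper` and `count_spaces` are hypothetical names, and the classic "C" locale is used as the default so that the behavior does not depend on the environment.

```cpp
#include <cassert>
#include <cstddef>
#include <locale>
#include <string>

// Upper-cases each character using the given locale.
std::string to_upper(std::string s,
                     const std::locale& loc = std::locale::classic()) {
    for (char& c : s)
        c = std::toupper(c, loc);
    return s;
}

// Counts the characters that the given locale classifies as whitespace.
std::size_t count_spaces(const std::string& s,
                         const std::locale& loc = std::locale::classic()) {
    std::size_t n = 0;
    for (char c : s)
        if (std::isspace(c, loc))
            ++n;
    return n;
}
```

Per-character conversion of this kind works for single-byte encodings only; it is not a general solution for multibyte text.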

Defect reports

The following behavior-changing defect reports were applied retroactively to previously published C++ standards.

DR          Applied to    Behavior as published                   Correct behavior
LWG 1170    C++98         char-like types could be array types    prohibited

See also

C documentation for Strings library