Difference between revisions of "cpp/string"

From cppreference.com
m (+link to section below.)
(Reorganized the library structure, also added a few definitions about characters and LWG issue #1170 DR.)
Line 2: Line 2:
 
{{cpp/string/navbar}}
 
{{cpp/string/navbar}}
  
The C++ strings library includes support for three general types of strings:
+
===Characters===
 +
In the C++ standard library, a ''character'' is an object which, when treated sequentially, can represent text.
  
* {{lc|std::basic_string}} - a templated class designed to manipulate strings of any character type.
+
The term does not mean only objects of [[cpp/language/type|character types]], but
* {{lc|std::basic_string_view}} {{mark since c++17}} - a lightweight non-owning read-only view into a subsequence of a string.
+
any value that can be represented by a type that provides the definitions specified in strings library and following libraries:
* {{ls|#Null-terminated strings}} - arrays of characters terminated by a special ''null'' character.
+
* [[cpp/locale|localization library]]
 +
* [[cpp/io|input/output library]]
 +
{{rrev|since=c++11|
 +
* [[cpp/regex|regular expressions library]]
 +
}}
 +
 
 +
In the strings library{{rev inl|since=c++11| and regular expressions library}}, characters can only be of ''char-like types'', which can be any non-array {{rev inl|until=c++20|[[cpp/named req/PODType|POD type]]}}{{rev inl|since=c++20|[[cpp/named req/TrivialType|trivial]] [[cpp/named req/StandardLayoutType|standard-layout]] type}}. Therefore, characters are also referred as ''char-like objects'' in the strings library{{rev inl|since=c++11| and regular expressions library}}.
 +
 
 +
Some standard library components accept ''character container types'', they are also types used to represent individual characters. Such types are used for one of the template parameters of {{lc|std::char_traits}} and the class templates which use {{lc|std::char_traits}}.
 +
 
 +
===Library components===
 +
The C++ strings library includes support for the following components:
 +
 
 +
====Character traits====
 +
Many character-related class templates (such as {{lc|std::basic_string}}) need a set of related types and functions to complete the definition of their semantics. These types and functions are provided as a set of member {{c/core|typedef}} names and functions in the template parameter {{tt|Traits}} used by each such template. The classes which are able to complete those semantics are ''character traits'', and they need to satisfy the {{named req|CharTraits}} requirements.
 +
 
 +
The string library provides the class template {{lc|std::char_traits}} that defines types and functions for {{lc|std::basic_string}}{{rev inl|since=c++17| and {{lc|std::basic_string_view}}}}.
 +
 
 +
The following specializations are defined, all of them satisfy the {{named req|CharTraits}} requirements:
 +
{{dcl begin}}
 +
{{dcl header|string}}
 +
{{dcl|template<> class char_traits<char>;}}
 +
{{dcl|template<> class char_traits<wchar_t>;}}
 +
{{dcl|template<> class char_traits<char8_t>;|since=c++20}}
 +
{{dcl|template<> class char_traits<char16_t>;|since=c++11}}
 +
{{dcl|template<> class char_traits<char32_t>;|since=c++11}}
 +
{{dcl end}}
 +
 
 +
When a user wants to use a user-defined character container type for {{lc|std::basic_string}}{{rev inl|since=c++17| and {{lc|std::basic_string_view}}}}, he also needs to provide a corresponding character trait class (which can be a specialization of {{lc|std::char_traits}}).
  
==={{lc|std::basic_string}}===
+
====String classes====
 
The templated class {{lc|std::basic_string}} generalizes how sequences of characters are manipulated and stored.  String creation, manipulation, and destruction are all handled by a convenient set of class methods and related functions.
 
The templated class {{lc|std::basic_string}} generalizes how sequences of characters are manipulated and stored.  String creation, manipulation, and destruction are all handled by a convenient set of class methods and related functions.
  
Line 21: Line 50:
 
{{dsc|{{ttb|std::u32string}} {{mark since c++11}}|{{c/core|std::basic_string<char32_t>}}}}
 
{{dsc|{{ttb|std::u32string}} {{mark since c++11}}|{{c/core|std::basic_string<char32_t>}}}}
 
{{dsc end}}
 
{{dsc end}}
 
  
 
{{rrev|since=c++17|
 
{{rrev|since=c++17|
==={{lc|std::basic_string_view}}===
+
====String view classes====
 
+
 
The templated class {{lc|std::basic_string_view}} provides a lightweight object that offers read-only access to a string or a part of a string using an interface similar to the interface of {{lc|std::basic_string}}.
 
The templated class {{lc|std::basic_string_view}} provides a lightweight object that offers read-only access to a string or a part of a string using an interface similar to the interface of {{lc|std::basic_string}}.
  
Line 40: Line 67:
 
}}
 
}}
  
===Null-terminated strings===
+
====Null-terminated sequence utilities====
Null-terminated strings are arrays of characters that are terminated by a special ''null'' character. C++ provides functions to create, inspect, and modify null-terminated strings.
+
''Null-terminated character sequences'' (NTCTS) are sequences of characters that are terminated by a null character (the value after value-initialization).
  
There are three types of null-terminated strings:
+
The strings library provides functions to create, inspect, and modify such sequences:
* {{rl|byte|null-terminated byte strings}}
+
* {{rl|byte|null-terminated byte strings}} (NTBS) helper functions (including support of {{rl|wide|wide character types}})
* {{rl|multibyte|null-terminated multibyte strings}}
+
* {{rl|multibyte|null-terminated multibyte strings}} (NTMBS) helper functions
* {{rl|wide|null-terminated wide strings}}
+
  
===Additional support===
+
===Relevant libraries===
===={{lc|std::char_traits}}====
+
The [[cpp/locale|localizations library]] provides support for string conversions (e.g. {{lc|std::wstring_convert}} or {{ltt std|cpp/locale/toupper}}) as well as functions that classify characters (e.g. {{ltt std|cpp/locale/isspace}} or {{ltt std|cpp/locale/isdigit}}).
The string library also provides class template {{lc|std::char_traits}} that defines types and functions for {{lc|std::basic_string}}{{rev inl|since=c++17| and {{lc|std::basic_string_view}}}}.
+
 
+
The following specializations are defined:
+
{{dcl begin}}
+
{{dcl header|string}}
+
{{dcl|template<> class char_traits<char>;}}
+
{{dcl|template<> class char_traits<wchar_t>;}}
+
{{dcl|template<> class char_traits<char8_t>;|since=c++20}}
+
{{dcl|template<> class char_traits<char16_t>;|since=c++11}}
+
{{dcl|template<> class char_traits<char32_t>;|since=c++11}}
+
{{dcl end}}
+
  
====Conversions and classification====
+
===Defect reports===
The [[cpp/locale|localizations library]] provides support for string conversions (e.g. {{lc|std::wstring_convert}} or {{ltt std|cpp/locale/toupper}}) as well as functions that classify characters (e.g. {{ltt std|cpp/locale/isspace}} or {{ltt std|cpp/locale/isdigit}}).
+
{{dr list begin}}
 +
{{dr list item|wg=lwg|dr=1170|std=C++98|before=char-like types could be array types|after=prohibited}}
 +
{{dr list end}}
  
 
===See also===
 
===See also===

Revision as of 01:49, 1 February 2024

Characters

In the C++ standard library, a character is an object which, when treated sequentially, can represent text.

The term does not mean only objects of character types, but any value that can be represented by a type that provides the definitions specified in the strings library and the following libraries:

* localization library
* input/output library
* regular expressions library (since C++11)

In the strings library and the regular expressions library (since C++11), characters can only be of char-like types, which can be any non-array POD type (until C++20) or any non-array trivial standard-layout type (since C++20). Therefore, characters are also referred to as char-like objects in the strings library and the regular expressions library (since C++11).
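The char-like requirement can be approximated with standard type traits. The following is a minimal sketch, assuming C++17 or later; `MyChar` and `is_char_like_v` are hypothetical names introduced for illustration and are not part of the standard library.

```cpp
#include <string>
#include <type_traits>

// Hypothetical user-defined character type (illustration only).
struct MyChar {
    unsigned short code;
};

// Approximates the since-C++20 wording: a char-like type is a
// non-array trivial standard-layout type.
template <class T>
constexpr bool is_char_like_v =
    !std::is_array_v<T> &&
    std::is_trivial_v<T> &&
    std::is_standard_layout_v<T>;

static_assert(is_char_like_v<char>);
static_assert(is_char_like_v<MyChar>);
static_assert(!is_char_like_v<char[4]>);     // array types are excluded
static_assert(!is_char_like_v<std::string>); // not a trivial type
```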

Some standard library components accept character container types, which are also types used to represent individual characters. Such types are used for one of the template parameters of std::char_traits and the class templates which use std::char_traits.

Library components

The C++ strings library includes support for the following components:

Character traits

Many character-related class templates (such as std::basic_string) need a set of related types and functions to complete the definition of their semantics. These types and functions are provided as a set of member typedef names and functions in the template parameter Traits used by each such template. The classes which are able to complete those semantics are character traits, and they need to satisfy the CharTraits requirements.

The strings library provides the class template std::char_traits that defines types and functions for std::basic_string and std::basic_string_view (since C++17).

The following specializations are defined; all of them satisfy the CharTraits requirements:

Defined in header <string>
template<> class char_traits<char>;
template<> class char_traits<wchar_t>;
template<> class char_traits<char8_t>;     (since C++20)
template<> class char_traits<char16_t>;    (since C++11)
template<> class char_traits<char32_t>;    (since C++11)

When users want to use a user-defined character container type for std::basic_string and std::basic_string_view (since C++17), they also need to provide a corresponding character traits class (which can be a specialization of std::char_traits).
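A traits class does not have to be written from scratch. The sketch below illustrates how the Traits parameter changes string semantics: it keeps the built-in `char` type but derives from `std::char_traits<char>` and replaces only the comparison functions, yielding a case-insensitive string. The names `ci_char_traits`, `ci_string`, and `ci_demo` are hypothetical, and a traits class for a genuinely user-defined character type would have to supply the remaining CharTraits members as well.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical traits class: compares characters case-insensitively.
struct ci_char_traits : std::char_traits<char> {
    static char to_upper(char c) {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    static bool eq(char a, char b) { return to_upper(a) == to_upper(b); }
    static bool lt(char a, char b) { return to_upper(a) < to_upper(b); }
    static int compare(const char* s1, const char* s2, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            if (lt(s1[i], s2[i])) return -1;
            if (lt(s2[i], s1[i])) return 1;
        }
        return 0;
    }
    static const char* find(const char* s, std::size_t n, char a) {
        for (std::size_t i = 0; i < n; ++i)
            if (eq(s[i], a)) return s + i;
        return nullptr;
    }
};

// Same character type as std::string, different comparison semantics.
using ci_string = std::basic_string<char, ci_char_traits>;

inline bool ci_demo() {
    ci_string a = "Hello";
    ci_string b = "hELLO";
    assert(a.size() == b.size());
    return a == b; // equality is decided by ci_char_traits::compare
}
```

Because the traits type is part of the string type, `ci_string` and `std::string` are distinct, unrelated types even though both store `char`.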

String classes

The templated class std::basic_string generalizes how sequences of characters are manipulated and stored. String creation, manipulation, and destruction are all handled by a convenient set of class methods and related functions.

Several specializations of std::basic_string are provided for commonly used types:

Defined in header <string>
Type Definition
std::string std::basic_string<char>
std::wstring std::basic_string<wchar_t>
std::u8string (since C++20) std::basic_string<char8_t>
std::u16string (since C++11) std::basic_string<char16_t>
std::u32string (since C++11) std::basic_string<char32_t>
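As a quick check of the table above, the following sketch (assuming C++17 for `std::is_same_v`) verifies that the type names are plain aliases of `std::basic_string` specializations. The helper `make_greeting` is a hypothetical name used only for illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Each name in the table is an alias, not a distinct class.
static_assert(std::is_same_v<std::string,    std::basic_string<char>>);
static_assert(std::is_same_v<std::wstring,   std::basic_string<wchar_t>>);
static_assert(std::is_same_v<std::u16string, std::basic_string<char16_t>>);
static_assert(std::is_same_v<std::u32string, std::basic_string<char32_t>>);

// Hypothetical helper: the usual member functions work for any character type.
inline std::u32string make_greeting() {
    std::u32string s = U"caf";
    s += U'\u00E9';          // append a single char32_t code unit
    assert(s.size() == 4);   // size counts code units of type char32_t
    return s;
}
```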

String view classes

The templated class std::basic_string_view provides a lightweight object that offers read-only access to a string or a part of a string using an interface similar to the interface of std::basic_string.

Several specializations of std::basic_string_view are provided for commonly used types:

Defined in header <string_view>
Type Definition
std::string_view std::basic_string_view<char>
std::wstring_view std::basic_string_view<wchar_t>
std::u8string_view (since C++20) std::basic_string_view<char8_t>
std::u16string_view std::basic_string_view<char16_t>
std::u32string_view std::basic_string_view<char32_t>
(since C++17)
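The non-owning, read-only nature of a string view can be sketched as follows, assuming C++17. The function names `before` and `view_demo` are hypothetical; the point is that `substr` on a view returns another view into the same storage rather than a copy.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Returns the part of `text` before the first `sep`, or all of `text`
// if `sep` does not occur. No characters are copied.
inline std::string_view before(std::string_view text, char sep) {
    // find() returns npos when sep is absent; substr(0, npos) keeps everything.
    return text.substr(0, text.find(sep));
}

inline void view_demo() {
    std::string owner = "key=value";
    std::string_view key = before(owner, '=');
    assert(key == "key");
    assert(key.data() == owner.data()); // the view points into owner's storage
}
```

Since a view does not own its characters, it must not outlive the string it refers to.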

Null-terminated sequence utilities

Null-terminated character type sequences (NTCTS) are sequences of characters that are terminated by a null character (the value after value-initialization).
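The role of the null character can be illustrated with a short sketch. It uses only `std::strlen` and `std::wcslen` from the C library headers; `ntbs_demo` is a hypothetical function name.

```cpp
#include <cassert>
#include <cstring>
#include <cwchar>

inline void ntbs_demo() {
    char null_char{};                 // value-initialized: the null character
    assert(null_char == '\0');

    const char bytes[] = "abc";       // array of 4 chars: 'a', 'b', 'c', '\0'
    assert(sizeof bytes == 4);
    assert(std::strlen(bytes) == 3);  // the length excludes the terminator

    const wchar_t wide[] = L"abc";    // wide counterpart, terminated by L'\0'
    assert(std::wcslen(wide) == 3);
}
```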

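The strings library provides functions to create, inspect, and modify such sequences:

* null-terminated byte strings (NTBS) helper functions (including support for wide character types)
* null-terminated multibyte strings (NTMBS) helper functions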

Relevant libraries

The localization library provides support for string conversions (e.g. std::wstring_convert or std::toupper) as well as functions that classify characters (e.g. std::isspace or std::isdigit).
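A minimal sketch of the locale-aware overloads from `<locale>` follows. The function name `upper_no_spaces` is hypothetical, and the default of `std::locale::classic()` (the "C" locale) is an assumption made to keep the behavior predictable.

```cpp
#include <cassert>
#include <locale>
#include <string>

// Drops whitespace and upper-cases the rest, using the two-argument
// std::isspace and std::toupper overloads declared in <locale>.
inline std::string upper_no_spaces(const std::string& in,
                                   const std::locale& loc = std::locale::classic()) {
    std::string out;
    for (char c : in) {
        if (std::isspace(c, loc)) continue; // classification
        out += std::toupper(c, loc);        // conversion
    }
    return out;
}

inline void locale_demo() {
    assert(std::isdigit('7', std::locale::classic()));
    assert(upper_no_spaces("a b") == "AB");
}
```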

Defect reports

The following behavior-changing defect reports were applied retroactively to previously published C++ standards.

DR Applied to Behavior as published Correct behavior
LWG 1170 C++98 char-like types could be array types prohibited

See also

C documentation for Strings library