Difference between revisions of "cpp/string/byte/strtok"

Revision as of 08:53, 6 June 2023

Defined in header `<cstring>`
char* strtok( char* str, const char* delim );

Finds the next token in the null-terminated byte string pointed to by str. The separator characters are identified by the null-terminated byte string pointed to by delim.

This function is designed to be called multiple times to obtain successive tokens from the same string.

If str is not a null pointer, the call is treated as the first call to strtok for this particular string. The function searches for the first character which is not contained in delim.

If no such character was found, there are no tokens in str at all, and the function returns a null pointer.
If such a character was found, it is the beginning of the token. The function then searches from that point on for the first character that is contained in delim.

If no such character was found, str has only one token, and future calls to strtok will return a null pointer.
If such a character was found, it is replaced by the null character '\0' and the pointer to the following character is stored in a static location for subsequent invocations.

The function then returns the pointer to the beginning of the token.

If str is a null pointer, the call is treated as a subsequent call to strtok: the function continues from where it left off in the previous invocation. The behavior is the same as if the previously stored pointer were passed as str.
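
The call sequence described above (buffer on the first call, null pointer on every later call) can be sketched as a small helper. The name collect_tokens is hypothetical and not part of the standard library. It takes the text by value because std::strtok writes into the buffer it is given.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper, not a standard library function.
// `text` is taken by value: std::strtok overwrites delimiters with '\0',
// so the caller's original string is left untouched.
std::vector<std::string> collect_tokens(std::string text, const char* delim)
{
    assert(delim != nullptr);
    std::vector<std::string> tokens;

    // First call passes the buffer; each later call passes a null pointer
    // so that strtok continues from its internally stored position.
    for (char* tok = std::strtok(&text[0], delim);
         tok != nullptr;
         tok = std::strtok(nullptr, delim))
    {
        tokens.emplace_back(tok);
    }
    return tokens;
}
```

Under these assumptions, collect_tokens(",,a,,b,", ",") yields the two tokens "a" and "b": leading delimiters are skipped and runs of delimiters never produce empty tokens. A string made only of delimiters yields no tokens at all.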

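Possible implementation

char* strtok(char* str, const char* delim)
{
    // Position to resume from; shared by all callers (hence not thread safe).
    static char* buffer;

    // A non-null str starts tokenizing a new string.
    if (str != nullptr)
        buffer = str;

    // Skip leading delimiters.
    buffer += std::strspn(buffer, delim);

    // Only delimiters (or nothing) left: no more tokens.
    if (*buffer == '\0')
        return nullptr;

    char* const tokenBegin = buffer;

    // Advance to the end of the token.
    buffer += std::strcspn(buffer, delim);

    // Terminate the token in place and step past the delimiter.
    if (*buffer != '\0')
        *buffer++ = '\0';

    return tokenBegin;
}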

Actual C++ library implementations of this function delegate to the C library, where it may be implemented directly (as in MUSL libc), or in terms of its reentrant version (as in GNU libc).
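
To illustrate what a reentrant version looks like, here is a minimal sketch that follows the same algorithm as the implementation shown above but keeps the resume position in caller-owned storage instead of a static variable. The name tok_r and its signature are hypothetical; this is not the code of any particular C library.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical reentrant variant, for illustration only.
// The resume position lives in *state, which the caller owns, so several
// tokenizations can be in progress at the same time.
char* tok_r(char* str, const char* delim, char** state)
{
    assert(delim != nullptr && state != nullptr);

    char* cur = (str != nullptr) ? str : *state;
    if (cur == nullptr)
        return nullptr; // nothing to continue from

    // Skip leading delimiters.
    cur += std::strspn(cur, delim);
    if (*cur == '\0')
    {
        *state = cur;
        return nullptr; // no more tokens
    }

    char* const token_begin = cur;

    // Find the end of the token and terminate it in place.
    cur += std::strcspn(cur, delim);
    if (*cur != '\0')
        *cur++ = '\0';

    *state = cur;
    return token_begin;
}
```

With separate state pointers, an outer loop can split "a=1;b=2" on ';' while an inner loop splits each piece on '='. The same nesting with std::strtok would overwrite the single stored position.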

Example


#include <cstring>
#include <iomanip>
#include <iostream>
 
int main() 
{
    char input[] = "one + two * (three - four)!";
    const char* delimiters = "! +- (*)";
    char *token = std::strtok(input, delimiters);
    while (token)
    {
        std::cout << std::quoted(token) << ' ';
        token = std::strtok(nullptr, delimiters);
    }
 
    std::cout << "\nContents of the input string now:\n\"";
    for (std::size_t n = 0; n < sizeof input; ++n)
    {
        if (const char c = input[n]; c != '\0')
            std::cout << c;
        else
            std::cout << "\\0";
    }
    std::cout << "\"\n";
}

Output:

"one" "two" "three" "four" 
Contents of the input string now:
"one\0+ two\0* (three\0- four\0!\0"

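As the output above shows, std::strtok leaves null characters in the input. When the input must stay intact, one alternative is to return views into it instead. The following is a sketch, assuming C++17 and a hypothetical helper named split_tokens; it uses the same "skip delimiters, then scan to the next delimiter" logic but never writes to the string.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper, not a standard library function.
// Returns views into `text`; `text` itself is never modified.
std::vector<std::string_view> split_tokens(std::string_view text,
                                           std::string_view delims)
{
    std::vector<std::string_view> tokens;

    // Start of the first token: first character that is not a delimiter.
    std::size_t pos = text.find_first_not_of(delims);
    while (pos != std::string_view::npos)
    {
        // End of the token: next delimiter, or npos for the last token.
        const std::size_t end = text.find_first_of(delims, pos);
        assert(end > pos); // text[pos] is not a delimiter

        // substr clamps the count to the remaining length.
        tokens.push_back(text.substr(pos, end - pos));

        // Skip the delimiters that follow this token.
        pos = text.find_first_not_of(delims, end);
    }
    return tokens;
}
```

The returned views refer to the caller's character data, so they are only valid while that data is alive and unchanged.
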
@@ Line 1: / Line 1: @@
-{{cpp/title| strtok}}
+{{cpp/title|strtok}}
 {{cpp/string/byte/navbar}}
 {{dcl begin}}
-{{dcl header | cstring}}
+{{dcl header|cstring}}
-{{dcl | 1=
+{{dcl|1=
 char* strtok( char* str, const char* delim );
 }}
 {{dcl end}}
-Finds the next token in a null-terminated byte string pointed to by {{tt|str}}. The separator characters are identified by null-terminated byte string pointed to by {{tt|delim}}.
+Finds the next token in a null-terminated byte string pointed to by {{c|str}}. The separator characters are identified by null-terminated byte string pointed to by {{c|delim}}.
 This function is designed to be called multiple times to obtain successive tokens from the same string.
-* If {{c|str}} is not a null pointer, the call is treated as the first call to {{tt|strtok}} for this particular string. The function searches for the first character which is ''not'' contained in {{tt|delim}}.
+* If {{c|str}} is not a null pointer, the call is treated as the first call to {{tt|strtok}} for this particular string. The function searches for the first character which is ''not'' contained in {{c|delim}}.
-:* If no such character was found, there are no tokens in {{tt|str}} at all, and the function returns a null pointer.
+:* If no such character was found, there are no tokens in {{c|str}} at all, and the function returns a null pointer.
-:* If such character was found, it is the ''beginning of the token''. The function then searches from that point on for the first character that ''is'' contained in {{tt|delim}}.
+:* If such character was found, it is the ''beginning of the token''. The function then searches from that point on for the first character that ''is'' contained in {{c|delim}}.
-::* If no such character was found, {{tt|str}} has only one token, and the future calls to {{tt|strtok}} will return a null pointer
+::* If no such character was found, {{c|str}} has only one token, and the future calls to {{tt|strtok}} will return a null pointer.
 ::* If such character was found, it is ''replaced'' by the null character {{c|'\0'}} and the pointer to the following character is stored in a static location for subsequent invocations.
-:* The function then returns the pointer to the beginning of the token
+:* The function then returns the pointer to the beginning of the token.
 * If {{c|str}} is a null pointer, the call is treated as a subsequent call to {{tt|strtok}}: the function continues from where it left in previous invocation. The behavior is the same as if the previously stored pointer is passed as {{c|str}}.
 ===Parameters===
 {{par begin}}
-{{par | str | pointer to the null-terminated byte string to tokenize}}
+{{par|str|pointer to the null-terminated byte string to tokenize}}
-{{par | delim | pointer to the null-terminated byte string identifying delimiters}}
+{{par|delim|pointer to the null-terminated byte string identifying delimiters}}
 {{par end}}
@@ Line 30: / Line 30: @@
 ===Notes===
-This function is destructive: it writes the {{c|'\0'}} characters in the elements of the string {{tt|str}}. In particular, a [[cpp/language/string_literal|string literal]] cannot be used as the first argument of {{tt|std::strtok}}.
+This function is destructive: it writes the {{c|'\0'}} characters in the elements of the string {{c|str}}. In particular, a [[cpp/language/string_literal|string literal]] cannot be used as the first argument of {{tt|std::strtok}}.
 Each call to this function modifies a static variable: is not thread safe.
@@ Line 43: / Line 43: @@
      if (str != nullptr)
-    {
          buffer = str;
-    }
      buffer += std::strspn(buffer, delim);
      if (*buffer == '\0')
-    {
          return nullptr;
-    }
      char* const tokenBegin = buffer;
@@ Line 59: / Line 55: @@
      if (*buffer != '\0')
-    {
          *buffer++ = '\0';
-    }
      return tokenBegin;
@@ Line 67: / Line 61: @@
 }}
-Actual C++ library implementations of this function delegate to the C library, where it may be implemented directly (as in [https://github.com/bminor/musl/blob/master/src/string/strtok.c MUSL libc]), or in terms of its reentrant version (as in [https://github.com/bminor/glibc/blob/master/string/strtok.c GNU libc])
+Actual C++ library implementations of this function delegate to the C library, where it may be implemented directly (as in [https://github.com/bminor/musl/blob/master/src/string/strtok.c MUSL libc]), or in terms of its reentrant version (as in [https://github.com/bminor/glibc/blob/master/string/strtok.c GNU libc]).
 ===Example===
 {{example
- |
+|code=
- | code=
 #include <cstring>
+#include <iomanip>
 #include <iostream>
-#include <iomanip>
 int main()
@@ Line 82: / Line 75: @@
      const char* delimiters = "! +- (*)";
      char *token = std::strtok(input, delimiters);
-     while (token) {
+     while (token)
+    {
          std::cout << std::quoted(token) << ' ';
          token = std::strtok(nullptr, delimiters);
@@ Line 88: / Line 82: @@
      std::cout << "\nContents of the input string now:\n\"";
-     for (std::size_t n = 0; n < sizeof input; ++n) {
+     for (std::size_t n = 0; n < sizeof input; ++n)
+    {
          if (const char c = input[n]; c != '\0')
              std::cout << c;
@@ Line 96: / Line 91: @@
      std::cout << "\"\n";
 }
- | output=
+|output=
 "one" "two" "three" "four"
 Contents of the input string now:
@@ Line 104: / Line 99: @@
 ===See also===
 {{dsc begin}}
-{{dsc inc | cpp/string/byte/dsc strpbrk}}
+{{dsc inc|cpp/string/byte/dsc strpbrk}}
-{{dsc inc | cpp/string/byte/dsc strcspn}}
+{{dsc inc|cpp/string/byte/dsc strcspn}}
-{{dsc inc | cpp/string/byte/dsc strspn}}
+{{dsc inc|cpp/string/byte/dsc strspn}}
-{{dsc inc | cpp/ranges/dsc split_view}}
+{{dsc inc|cpp/ranges/dsc split_view}}
-{{dsc see c | c/string/byte/strtok}}
+{{dsc see c|c/string/byte/strtok}}
 {{dsc end}}
 {{langlinks|de|es|fr|it|ja|pt|ru|zh}}


See also

strpbrk	finds the first location of any character from a set of separators (function)
strcspn	returns the length of the maximum initial segment that consists of only the characters not found in another byte string (function)
strspn	returns the length of the maximum initial segment that consists of only the characters found in another byte string (function)
ranges::split_view, views::split (C++20)	a `view` over the subranges obtained from splitting another `view` using a delimiter (class template) (range adaptor object)
C documentation for strtok


Parameters

str	-	pointer to the null-terminated byte string to tokenize
delim	-	pointer to the null-terminated byte string identifying delimiters