Difference between revisions of "cpp/regex"

From cppreference.com

Revision as of 05:34, 25 December 2023

The regular expressions library provides a class that represents regular expressions, a kind of mini-language used to perform pattern matching within strings. Almost all regex operations involve several of the following objects:

  • Target sequence. The character sequence that is searched for a pattern. This may be a range specified by two iterators, a null-terminated character string or a std::string.
  • Pattern. This is the regular expression itself. It determines what constitutes a match. It is an object of type std::basic_regex, constructed from a string with special syntax. See regex_constants::syntax_option_type for the description of supported syntax variations.
  • Matched array. The information about matches may be retrieved as an object of type std::match_results.
  • Replacement string. This is a string that determines how to replace the matches. See regex_constants::match_flag_type for the description of supported syntax variations.

Main classes

These classes encapsulate a regular expression and the results of matching a regular expression within a target sequence of characters.

  • basic_regex (C++11): regular expression object (class template)
  • sub_match (C++11): identifies the sequence of characters matched by a sub-expression (class template)
  • match_results (C++11): identifies one regular expression match, including all sub-expression matches (class template)

Algorithms

These functions are used to apply the regular expression encapsulated in a regex to a target sequence of characters.

  • regex_match: attempts to match a regular expression to an entire character sequence (function template)
  • regex_search: attempts to match a regular expression to any part of a character sequence (function template)
  • regex_replace: replaces occurrences of a regular expression with formatted replacement text (function template)

Iterators

The regex iterators are used to traverse the entire set of regular expression matches found within a sequence.

  • regex_iterator: iterates through all regex matches within a character sequence (class template)
  • regex_token_iterator: iterates through the specified sub-expressions within all regex matches in a given string or through unmatched substrings (class template)

Exceptions

This class defines the type of objects thrown as exceptions to report errors from the regular expressions library.

  • regex_error: reports errors generated by the regular expressions library (class)

Traits

The regex traits class is used to encapsulate the localizable aspects of a regex.

  • regex_traits: provides metainformation about a character type, required by the regex library (class template)

Constants

Defined in namespace std::regex_constants
  • syntax_option_type: general options controlling regex behavior (typedef)
  • match_flag_type: options specific to matching (typedef)
  • error_type: describes different types of matching errors (typedef)

Example

#include <iostream>
#include <iterator>
#include <regex>
#include <string>
 
int main()
{
    std::string s = "Some people, when confronted with a problem, think "
        "\"I know, I'll use regular expressions.\" "
        "Now they have two problems.";
 
    std::regex self_regex("REGULAR EXPRESSIONS",
        std::regex_constants::ECMAScript | std::regex_constants::icase);
    if (std::regex_search(s, self_regex))
        std::cout << "Text contains the phrase 'regular expressions'\n";
 
    std::regex word_regex("(\\w+)");
    auto words_begin = 
        std::sregex_iterator(s.begin(), s.end(), word_regex);
    auto words_end = std::sregex_iterator();
 
    std::cout << "Found "
              << std::distance(words_begin, words_end)
              << " words\n";
 
    const int N = 6;
    std::cout << "Words longer than " << N << " characters:\n";
    for (std::sregex_iterator i = words_begin; i != words_end; ++i)
    {
        std::smatch match = *i;
        std::string match_str = match.str();
        if (match_str.size() > N)
            std::cout << "  " << match_str << '\n';
    }
 
    std::regex long_word_regex("(\\w{7,})");
    std::string new_s = std::regex_replace(s, long_word_regex, "[$&]");
    std::cout << new_s << '\n';
}

Output:

Text contains the phrase 'regular expressions'
Found 20 words
Words longer than 6 characters:
  confronted
  problem
  regular
  expressions
  problems
Some people, when [confronted] with a [problem], think 
"I know, I'll use [regular] [expressions]." Now they have two [problems].