std::regex_constants::syntax_option_type

From cppreference.com
Revision as of 11:16, 2 November 2012 by P12bot (Talk | contribs)

Defined in header <regex>

typedef /*unspecified*/ syntax_option_type;

static constexpr syntax_option_type icase = /*unspecified*/;
static constexpr syntax_option_type nosubs = /*unspecified*/;
static constexpr syntax_option_type optimize = /*unspecified*/;
static constexpr syntax_option_type collate = /*unspecified*/;
static constexpr syntax_option_type ECMAScript = /*unspecified*/;
static constexpr syntax_option_type basic = /*unspecified*/;
static constexpr syntax_option_type extended = /*unspecified*/;
static constexpr syntax_option_type awk = /*unspecified*/;
static constexpr syntax_option_type grep = /*unspecified*/;
static constexpr syntax_option_type egrep = /*unspecified*/;

The syntax_option_type is a BitmaskType that contains options that govern how regular expressions behave.

The possible values for this type (icase, optimize, etc.) are duplicated inside std::basic_regex.

Constants

Grammar option      Effect(s)
ECMAScript          Use the Modified ECMAScript regular expression grammar.
basic               Use the basic POSIX regular expression grammar.
extended            Use the extended POSIX regular expression grammar.
awk                 Use the regular expression grammar used by the awk utility in POSIX.
grep                Use the regular expression grammar used by the grep utility in POSIX. This is effectively the same as the basic option with the addition of newline '\n' as an alternation separator.
egrep               Use the regular expression grammar used by the grep utility, with the -E option, in POSIX. This is effectively the same as the extended option with newline '\n' accepted as an alternation separator in addition to '|'.

Grammar variation   Effect(s)
icase               Character matching is performed without regard to case.
nosubs              When performing matches, all marked sub-expressions (expr) are treated as non-marking sub-expressions (?:expr). No matches are stored in the supplied std::match_results structure and mark_count() is zero.
optimize            Instructs the regular expression engine to make matching faster, with the potential cost of making construction slower. For example, this might mean converting a non-deterministic FSA to a deterministic FSA.
collate             Character ranges of the form "[a-b]" are locale-sensitive.
multiline (C++17)   Specifies that ^ shall match the beginning of a line and $ shall match the end of a line, if the ECMAScript engine is selected.

At most one grammar option can be chosen out of ECMAScript, basic, extended, awk, grep, egrep. If no grammar is chosen, ECMAScript is assumed to be selected. The other options serve as variations, such that std::regex("meow", std::regex::icase) is equivalent to std::regex("meow", std::regex::ECMAScript|std::regex::icase).

Notes

Because POSIX uses the "leftmost longest" matching rule (the longest matching subsequence is matched, and if there are several such subsequences, the first one is matched), it is not suitable, for example, for parsing markup languages: a POSIX regex such as "<tag[^>]*>.*</tag>" would match everything from the first "<tag" to the last "</tag>", including every "</tag>" and "<tag>" in between. ECMAScript, on the other hand, supports non-greedy matches, and the ECMAScript regex "<tag[^>]*>.*?</tag>" would match only up to the first closing tag.

Example

Illustrates the difference in the matching algorithm between ECMAScript and POSIX regular expressions:

#include <iostream>
#include <string>
#include <regex>
 
int main()
{
    std::string str = "zzxayyzz";
    std::regex re1(".*(a|xayy)"); // ECMA
    std::regex re2(".*(a|xayy)", std::regex::extended); // POSIX
 
    std::cout << "Searching for .*(a|xayy) in zzxayyzz:\n";
    std::smatch m;
    std::regex_search(str, m, re1);
    std::cout << " ECMA (depth first search) match: " << m[0] << '\n';
    std::regex_search(str, m, re2);
    std::cout << " POSIX (leftmost longest)  match: " << m[0] << '\n';
}

Output:

Searching for .*(a|xayy) in zzxayyzz:
 ECMA (depth first search) match: zzxa
 POSIX (leftmost longest)  match: zzxayy

See also

std::basic_regex (C++11)   regular expression object (class template)