
Difference between revisions of "cpp/language/operator alternative"

From cppreference.com
(oh yeah, IBM failed to defend trigraphs. Good riddance.)
(+References)
 
(45 intermediate revisions by 23 users not shown)
Line 2: Line 2:
 
{{cpp/language/expressions/navbar}}
 
{{cpp/language/expressions/navbar}}
  
C++ (and C) source code may be written in any non-ASCII 7-bit character set that includes the [[enwiki:ISO 646|ISO 646:1983]] invariant character set. However, several C++ operators and punctuators require characters that are outside of the ISO 646 codeset: {{tt|{, }, [, ], #, \, ^, {{!}}, ~}}. To be able to use character encodings where some or all of these symbols do not exist (such as the German [http://de.wikipedia.org/wiki/DIN_66003 DIN 66003]), C++ defines two kinds of alternatives: additional keywords that correspond to the operators that use these characters and special combinations of two or three ISO 646 compatible characters that are interpreted as if they were a single non-ISO 646 character.
+
C++ (and C) source code may be written in any non-ASCII 7-bit character set that includes the {{enwiki|ISO 646|ISO 646:1983}} invariant character set. However, several C++ operators and punctuators require characters that are outside of the ISO 646 codeset: {{tt|{, }, [, ], #, \, ^, {{!}}, ~}}. To be able to use character encodings where some or all of these symbols do not exist (such as the German {{enwiki|DIN 66003}}), C++ defines the following alternatives composed of ISO 646 compatible characters.
  
==Alternative keywords==
+
===Alternative tokens===
 +
There are alternative spellings for several operators and other tokens that use non-ISO 646 characters. In all aspects of the language, each alternative token behaves exactly the same as its primary token, except for its spelling (the [[cpp/preprocessor/replace|stringification operator]] can make the spelling visible). The two-character alternative tokens are sometimes called "digraphs". Despite being four characters long, {{c|%:%:}} is also considered a digraph.
  
There are alternative spellings for several operators defined as keywords in the C++ standard.
+
{|class="wikitable"
 
+
{| class="wikitable"
+
 
|-
 
|-
! Primary
+
!Primary
! Alternative
+
!Alternative
 
|-
 
|-
| {{tt|&&}}
+
|{{tt|&&}}
| {{tt|and}}
+
|{{ltt|cpp/keyword/and}}
 
|-
 
|-
| {{tt|&{{=}}}}
+
|{{tt|1=&=}}
| {{tt|and_eq}}
+
|{{ltt|cpp/keyword/and_eq}}
 
|-
 
|-
| {{tt|&}}
+
|{{tt|&}}
| {{tt|bitand}}
+
|{{ltt|cpp/keyword/bitand}}
 
|-
 
|-
| {{tt|&#124;}}
+
|{{tt|&#124;}}
| {{tt|bitor}}
+
|{{ltt|cpp/keyword/bitor}}
 
|-
 
|-
| {{tt|~}}
+
|{{tt|~}}
| {{tt|compl}}
+
|{{ltt|cpp/keyword/compl}}
 
|-
 
|-
| {{tt|!}}
+
|{{tt|!}}
| {{tt|not}}
+
|{{ltt|cpp/keyword/not}}
 
|-
 
|-
| {{tt|!{{=}}}}
+
|{{tt|1=!=}}
| {{tt|not_eq}}
+
|{{ltt|cpp/keyword/not_eq}}
 
|-
 
|-
| {{tt|&#124;&#124;}}
+
|{{tt|&#124;&#124;}}
| {{tt|or}}
+
|{{ltt|cpp/keyword/or}}
 
|-
 
|-
| {{tt|&#124;{{=}}}}
+
|{{tt|1=&#124;=}}
| {{tt|or_eq}}
+
|{{ltt|cpp/keyword/or_eq}}
 
|-
 
|-
| {{tt|^}}
+
|{{tt|^}}
| {{tt|xor}}
+
|{{ltt|cpp/keyword/xor}}
 
|-
 
|-
| {{tt|^{{=}}}}
+
|{{tt|1=^=}}
| {{tt|xor_eq}}
+
|{{ltt|cpp/keyword/xor_eq}}
 +
|-
 +
|{{tt|{}}||{{tt|<%}}
 +
|-
 +
|{{tt|}<!---->}}||{{tt|%>}}
 +
|-
 +
|{{tt|[}}||{{tt|<:}}
 +
|-
 +
|{{tt|]}}||{{tt|:>}}
 +
|-
 +
|{{tt|#}}||{{tt|%:}}
 +
|-
 +
|{{tt|##}}||{{tt|%:%:}}
 
|}
 
|}
  
===Compatibility with C===
+
===Trigraphs {{mark until c++17|removed=yes}}===
The same words are defined in the C programming language in the include file {{c|<iso646.h>}} as macros. Because in C++ these are language keywords, the C++ version of {{c|<iso646.h>}}, as well as {{c|<ciso646>}}, does not define anything.
+
The following three-character groups (trigraphs) are {{rlp|translation phases|parsed before comments and string literals are recognized}}, and each appearance of a trigraph is replaced by the corresponding primary character:
 
+
==Digraphs and trigraphs==
+
 
+
The following combinations of two and three characters (digraphs and trigraphs) are valid substitutions for their respective primary characters:
+
  
{| class="wikitable"
+
{|class="wikitable"
|- style="text-align: left;"
+
|-style="text-align: left;"
! Primary
+
!Primary
! Digraph
+
!Trigraph
! Trigraph
+
 
|-
 
|-
| {{tt|{}} || {{tt|<%}} || {{tt|??<}}
+
|{{tt|{}}||{{tt|??<}}
 
|-
 
|-
| {{tt|}}} || {{tt|%>}} || {{tt|??>}}
+
|{{tt|}}}||{{tt|??>}}
 
|-
 
|-
| {{tt|[}} || {{tt|<:}} || {{tt|??(}}
+
|{{tt|[}}||{{tt|??(}}
 
|-
 
|-
| {{tt|]}} || {{tt|:>}} || {{tt|??)}}
+
|{{tt|]}}||{{tt|??)}}
 
|-
 
|-
| {{tt|#}} || {{tt|%:}} || {{tt|??{{=}}}}
+
|{{tt|#}}||{{tt|1=??=}}
 
|-
 
|-
| {{tt|\}} || || {{tt|??/}}
+
|{{tt|\}}||{{tt|??/}}
 
|-
 
|-
| {{tt|^}} || || {{tt|??'}}
+
|{{tt|^}}||{{tt|??'}}
 
|-
 
|-
| {{tt|<nowiki>|</nowiki>}} || || {{tt|??!}}
+
|{{tt|<nowiki>|</nowiki>}}||{{tt|??!}}
 
|-
 
|-
| {{tt|~}} || || {{tt|??-}}
+
|{{tt|~}}||{{tt|??-}}
 
|}
 
|}
  
Note that trigraphs (but not digraphs) are {{rlp|translation_phases|parsed before comments and string literals are recognized}}, so a comment such as {{c|// Will the next line be executed?????/}} will effectively comment out the following line, and a string literal such as {{c|"Enter date ??/??/??"}} is parsed as {{c|"Enter date \\??"}}.
+
Because trigraphs are processed early, a comment such as {{c|// Will the next line be executed?????/}} will effectively comment out the following line, and a string literal such as {{c|"Enter date ??/??/??"}} is parsed as {{c|"Enter date \\??"}}.
  
{{rev begin}}
+
===Notes===
{{rev|since=c++11|
+
The characters {{c|&}} and {{c|!}} are invariant under ISO 646, but alternatives are provided for the tokens that use these characters anyway to accommodate even more restrictive historical charsets<!-- best reference found so far "The Danish delegation did not, in fact, have a way of representing those characters on their terminals" from https://groups.google.com/d/msg/comp.std.c/eYbj0lCIvn4/89oK8U6JpqEJ , but it doesn't specifically call out & and ! -->.
When the parser meets the character sequence {{c|<::}} and the subsequent character is neither {{c|:}} nor {{c|>}}, the {{c|<}} is treated as a preprocessor token by itself and not as the first character of the alternative token {{c|<:}}. Thus {{c|std::vector<::std::string>}} won't be wrongly treated as {{c|std::vector[:std::string>}}.
+
}}
+
{{rev end}}
+
  
{{rev begin}}
+
There is no alternative spelling (such as {{c|eq}}) for the equality operator {{c|1===}} because the character {{c|1==}} was present in all supported charsets.
{{rev|since=c++17|
+
Trigraphs (but not digraphs) are no longer part of C++.
+
}}
+
{{rev end}}
+
  
===Keywords===
+
===Compatibility with C===
 +
The same words are defined in the C programming language in the include file {{header|iso646.h|lang=c}} as macros. Because in C++ these are built into the language, the C++ version of {{ltt|cpp/header/ciso646|<iso646.h>}}, as well as {{header|ciso646}}, does not define anything. The non-word digraphs (e.g. {{c|<%}}), however, are part of the core language and can be used without including any header (otherwise, they would be unusable on any charset that lacks {{c|#}}).
  
 +
===Keywords===
 
{{ltt|cpp/keyword/and}},
 
{{ltt|cpp/keyword/and}},
 
{{ltt|cpp/keyword/and_eq}},
 
{{ltt|cpp/keyword/and_eq}},
Line 108: Line 109:
  
 
===Example===
 
===Example===
 
 
{{example
 
{{example
| The following example demonstrates the use of several alternative keywords.
+
|The following example demonstrates the use of several alternative tokens.
| code=
+
|code=
 
%:include <iostream>
 
%:include <iostream>
  
int main(int argc, char *argv<::>)
+
struct X
 
<%
 
<%
     if (argc > 1 and argv<:1:> not_eq NULL) <%
+
     compl X() <%%> // destructor
        std::cout << "Hello, " << argv<:1:> << '\n';
+
    X() <%%>
 +
    X(const X bitand) = delete; // copy constructor
 +
    // X(X and) = delete; // move constructor
 +
   
 +
    bool operator not_eq(const X bitand other)
 +
    <%
 +
      return this not_eq bitand other;
 
     %>
 
     %>
 +
%>;
 +
 +
int main(int argc, char* argv<::>)
 +
<%
 +
    // lambda with reference-capture:
 +
    auto greet = <:bitand:>(const char* name)
 +
    <%
 +
        std::cout << "Hello " << name
 +
                  << " from " << argv<:0:> << '\n';
 +
    %>;
 +
   
 +
    if (argc > 1 and argv<:1:> not_eq nullptr)
 +
        greet(argv<:1:>);
 +
    else
 +
        greet("Anon");
 
%>
 
%>
 +
|p=true
 +
|output=Hello Anon from ./a.out
 
}}
 
}}
 +
 +
===References===
 +
{{ref std c++23}}
 +
{{ref std|section=5.5|title=Alternative tokens|id=lex.digraph}}
 +
{{ref std end}}
 +
{{ref std c++20}}
 +
{{ref std|section=5.5|title=Alternative tokens|id=lex.digraph}}
 +
{{ref std end}}
 +
{{ref std c++17}}
 +
{{ref std|section=5.5|title=Alternative tokens|id=lex.digraph}}
 +
{{ref std end}}
 +
{{ref std c++14}}
 +
{{ref std|section=2.4|title=Trigraph sequences|id=lex.trigraph}}
 +
{{ref std|section=2.6|title=Alternative tokens|id=lex.digraph}}
 +
{{ref std end}}
 +
{{ref std c++11}}
 +
{{ref std|section=2.4|title=Trigraph sequences|id=lex.trigraph}}
 +
{{ref std|section=2.6|title=Alternative tokens|id=lex.digraph}}
 +
{{ref std end}}
 +
{{ref std c++03}}
 +
{{ref std|section=2.3|title=Trigraph sequences|id=lex.trigraph}}
 +
{{ref std|section=2.5|title=Alternative tokens|id=lex.digraph}}
 +
{{ref std end}}
 +
{{ref std c++98}}
 +
{{ref std|section=2.3|title=Trigraph sequences|id=lex.trigraph}}
 +
{{ref std|section=2.5|title=Alternative tokens|id=lex.digraph}}
 +
{{ref std end}}
  
 
===See also===
 
===See also===
 
{{dsc begin}}
 
{{dsc begin}}
{{dsc see c | c/language/operator_alternative | Alternative operator representations}}
+
{{dsc see c|c/language/operator alternative|Alternative operators and tokens|nomono=true}}
 
{{dsc end}}
 
{{dsc end}}
  
[[de:cpp/language/operator alternative]]
+
{{langlinks|de|es|fr|it|ja|pt|ru|zh}}
[[es:cpp/language/operator alternative]]
+
[[fr:cpp/language/operator alternative]]
+
[[it:cpp/language/operator alternative]]
+
[[ja:cpp/language/operator alternative]]
+
[[pt:cpp/language/operator alternative]]
+
[[ru:cpp/language/operator alternative]]
+
[[zh:cpp/language/operator alternative]]
+

Latest revision as of 22:25, 13 August 2024

 
 
 
 

C++ (and C) source code may be written in any non-ASCII 7-bit character set that includes the ISO 646:1983 invariant character set. However, several C++ operators and punctuators require characters that are outside of the ISO 646 codeset: {, }, [, ], #, \, ^, |, ~. To be able to use character encodings where some or all of these symbols do not exist (such as the German DIN 66003), C++ defines the following alternatives composed of ISO 646 compatible characters.


Alternative tokens

There are alternative spellings for several operators and other tokens that use non-ISO 646 characters. In all aspects of the language, each alternative token behaves exactly the same as its primary token, except for its spelling (the stringification operator can make the spelling visible). The two-character alternative tokens are sometimes called "digraphs". Despite being four characters long, %:%: is also considered a digraph.

Primary   Alternative
&&        and
&=        and_eq
&         bitand
|         bitor
~         compl
!         not
!=        not_eq
||        or
|=        or_eq
^         xor
^=        xor_eq
{         <%
}         %>
[         <:
]         :>
#         %:
##        %:%:

Trigraphs (removed in C++17)

The following three-character groups (trigraphs) are parsed before comments and string literals are recognized, and each appearance of a trigraph is replaced by the corresponding primary character:

Primary   Trigraph
{         ??<
}         ??>
[         ??(
]         ??)
#         ??=
\         ??/
^         ??'
|         ??!
~         ??-

Because trigraphs are processed early, a comment such as // Will the next line be executed?????/ will effectively comment out the following line, and a string literal such as "Enter date ??/??/??" is parsed as "Enter date \\??".

Notes

The characters & and ! are invariant under ISO 646, but alternatives are provided for the tokens that use these characters anyway to accommodate even more restrictive historical charsets.

There is no alternative spelling (such as eq) for the equality operator == because the character = was present in all supported charsets.

Compatibility with C

The same words are defined in the C programming language in the include file <iso646.h> as macros. Because in C++ these are built into the language, the C++ version of <iso646.h>, as well as <ciso646>, does not define anything. The non-word digraphs (e.g. <%), however, are part of the core language and can be used without including any header (otherwise, they would be unusable on any charset that lacks #).

Keywords

and, and_eq, bitand, bitor, compl, not, not_eq, or, or_eq, xor, xor_eq

Example

The following example demonstrates the use of several alternative tokens.

%:include <iostream>
 
struct X
<%
    compl X() <%%> // destructor
    X() <%%>
    X(const X bitand) = delete; // copy constructor
    // X(X and) = delete; // move constructor
 
    bool operator not_eq(const X bitand other)
    <%
        return this not_eq bitand other;
    %>
%>;
 
int main(int argc, char* argv<::>)
<%
    // lambda with reference-capture:
    auto greet = <:bitand:>(const char* name)
    <%
        std::cout << "Hello " << name
                  << " from " << argv<:0:> << '\n';
    %>;
 
    if (argc > 1 and argv<:1:> not_eq nullptr)
        greet(argv<:1:>);
    else
        greet("Anon");
%>

Possible output:

Hello Anon from ./a.out

References

  • C++23 standard (ISO/IEC 14882:2024):
      • 5.5 Alternative tokens [lex.digraph]
  • C++20 standard (ISO/IEC 14882:2020):
      • 5.5 Alternative tokens [lex.digraph]
  • C++17 standard (ISO/IEC 14882:2017):
      • 5.5 Alternative tokens [lex.digraph]
  • C++14 standard (ISO/IEC 14882:2014):
      • 2.4 Trigraph sequences [lex.trigraph]
      • 2.6 Alternative tokens [lex.digraph]
  • C++11 standard (ISO/IEC 14882:2011):
      • 2.4 Trigraph sequences [lex.trigraph]
      • 2.6 Alternative tokens [lex.digraph]
  • C++03 standard (ISO/IEC 14882:2003):
      • 2.3 Trigraph sequences [lex.trigraph]
      • 2.5 Alternative tokens [lex.digraph]
  • C++98 standard (ISO/IEC 14882:1998):
      • 2.3 Trigraph sequences [lex.trigraph]
      • 2.5 Alternative tokens [lex.digraph]

See also

C documentation for Alternative operators and tokens