Difference between revisions of "c/language/operator alternative"

From cppreference.com

Latest revision as of 15:38, 10 June 2023

C source code may be written in any 8-bit character set, including non-ASCII ones, that contains the ISO 646:1983 invariant character set. However, several C operators and punctuators require characters that are outside the ISO 646 invariant set: {, }, [, ], #, \, ^, |, ~. To support character encodings in which some or all of these symbols do not exist (such as the German DIN 66003), there are two possibilities: alternative spellings of the operators that use these characters, or special combinations of two or three ISO 646 compatible characters that are interpreted as if they were a single non-ISO 646 character.

Operator macros (C95)

There are alternative spellings for the operators that use non-ISO 646 characters, defined in <iso646.h> as macros:

Defined in header <iso646.h>
Primary Alternative (operator macro)
&& and
&= and_eq
& bitand
| bitor
~ compl
! not
!= not_eq
|| or
|= or_eq
^ xor
^= xor_eq

The characters & and ! are invariant under ISO 646, but alternatives are nevertheless provided for the operators that use these characters, to accommodate even more restrictive historical character sets.

There is no alternative spelling (such as eq) for the equality operator == because the character = was present in all supported character sets.

Alternative tokens (C95)

The following alternative tokens are part of the core language. In all respects, each alternative token behaves exactly the same as its primary token, except for its spelling (the stringification operator can make the spelling visible). The two-character alternative tokens are sometimes called "digraphs" (%:%: is also considered a digraph, even though it is four characters long).


Primary Alternative
{ <%
} %>
[ <:
] :>
# %:
## %:%:

Trigraphs (removed in C23)

The following three-character groups (trigraphs) are parsed before comments and string literals are recognized, and each appearance of a trigraph is replaced by the corresponding primary character:

Primary Trigraph
{ ??<
} ??>
[ ??(
] ??)
# ??=
\ ??/
^ ??'
| ??!
~ ??-

Because trigraphs are processed early, a comment such as // Will the next line be executed?????/ effectively comments out the following line: the trailing ??/ becomes a backslash, which splices the next line onto the comment. Likewise, a string literal such as "What's going on??!" is parsed as "What's going on|".

Example

Demonstrates the alternative operator spellings from <iso646.h> as well as the use of digraphs and trigraphs. Command-line arguments that contain spaces should be wrapped in quotation marks, e.g., "Third World!".

%:include <stdio.h>
%:include <stdlib.h>
??=include <iso646.h>
 
int main(int argc, char** argv)
??<
    if (argc > 1 and argv<:1:> not_eq NULL)
    <%
       printf("Hello %s??/n", argv<:1:>);
    %>
    else
    <%
       printf("Hello %s??/n", argc? argv??(42??'42??) : __FILE__);
    %>
 
    return EXIT_SUCCESS;
??>

Possible output:

Hello ./a.out

See also

C++ documentation for Alternative operator representations