Talk:c/string/multibyte/wcrtomb

From cppreference.com
Revision as of 13:10, 29 August 2017 by Newatthis (Talk | contribs)


restartable

Is there any value to be gained by mentioning that the "r" in the name "wcrtomb" means that the function is restartable and by explaining the case(s) where restarting wcrtomb would be useful? I am uncertain about what those cases are and about why this function is restartable since the wide character is supposed to be a single unit of type wchar_t. Any thoughts? Newatthis (talk) 04:38, 27 August 2017 (PDT)

This page already explains it where it says "taking into account the current multibyte conversion state *ps". Wide character is not the only input to this function. --Cubbi (talk) 07:05, 27 August 2017 (PDT)
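
To make the role of *ps concrete, here is a minimal sketch of a conversion loop that threads one mbstate_t object through successive wcrtomb calls. The helper name wide_to_multibyte and its cap parameter are hypothetical, chosen only for illustration; the state object starts zero-initialized, which denotes the initial conversion state.

```c
#include <assert.h>
#include <limits.h>   /* MB_LEN_MAX */
#include <string.h>   /* memcpy, memset */
#include <wchar.h>    /* wcrtomb, mbstate_t */

/* Hypothetical helper (not part of any library): converts the
   null-terminated wide string ws into out, including the terminating
   null byte. Returns the number of bytes written, or (size_t)-1 on an
   encoding error or if out (capacity cap) is too small. */
size_t wide_to_multibyte(const wchar_t *ws, char *out, size_t cap)
{
    assert(ws != NULL && out != NULL);

    mbstate_t ps;
    memset(&ps, 0, sizeof ps);   /* zero-initialized: initial conversion state */

    size_t used = 0;
    for (;;) {
        char tmp[MB_LEN_MAX];    /* wcrtomb writes at most MB_CUR_MAX bytes */
        size_t n = wcrtomb(tmp, *ws, &ps);   /* reads and updates ps */
        if (n == (size_t)-1 || n > cap - used)
            return (size_t)-1;
        memcpy(out + used, tmp, n);
        used += n;
        if (*ws == L'\0')        /* terminator has been written: done */
            return used;
        ++ws;
    }
}
```

The same ps is passed to every call, so in a state-dependent encoding each call can pick up where the previous one left off. In the default "C" locale with ASCII input, every call simply produces one byte and the state never leaves its initial value.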

wc is L'\0'

Isn't the conversion state parameter *ps updated to represent the initial "conversion" state rather than the initial "shift" state? The standard states, "the resulting state described is the initial conversion state." Newatthis (talk) 05:16, 29 August 2017 (PDT)

Adding even more ambiguous terminology does not help this page; it should be made simpler to be useful. Per 5.2.1.2, "each sequence of multibyte characters begins in an initial shift state" - that's what the "initial conversion state" describes. (If anything, I'd drop every mention of shift states as long outdated and just use "conversion state", but that would need at least an active defect report.) --Cubbi (talk) 06:25, 29 August 2017 (PDT)
I have been wondering why I was struggling to prepare my question: I have been struggling with ambiguous terminology. Still, the phrase you quote comes from the third bullet of 5.2.1.2, and that bullet qualifies the phrase with, "A multibyte character set MAY have a state-dependent encoding, wherein ..." Emphasis mine. For character sets with state-independent encodings, like UTF-8, the third bullet seems not to apply. In other words, how can an initial conversion state describe an initial shift state when the chosen multibyte character set is not a state-dependent encoding? Hence my suggestion to use the phrase that comes straight from 7.29.6.3.3/3. No defect report required. Newatthis (talk) 07:11, 29 August 2017 (PDT)
For UTF-8 or GB18030 or another stateless encoding, none of this is relevant because *ps does not change on a call to wcrtomb. The point of the specification is that writing L'\0' unshifts (as it must, to produce an NTMBS). This is already said, in many more words than necessary. Adding even more words adds no value. --Cubbi (talk) 08:39, 29 August 2017 (PDT)
My original suggestion involved replacing "shift" with "conversion", not adding words. I'll not pursue it. Newatthis (talk) 14:10, 29 August 2017 (PDT)
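
For readers following this thread, the behavior under discussion can be checked with a small sketch. The helper name terminate_and_check is hypothetical. It passes L'\0' to wcrtomb and then asks mbsinit whether *ps describes the initial conversion state. With a stateless encoding the state never leaves its initial value, so any unshift sequence would only be observable under a state-dependent encoding, which this sketch does not set up.

```c
#include <assert.h>
#include <limits.h>   /* MB_LEN_MAX, for sizing the caller's buffer */
#include <string.h>   /* memset, for initializing the caller's state */
#include <wchar.h>    /* wcrtomb, mbsinit, mbstate_t */

/* Hypothetical helper (not part of any library): writes the terminator
   through wcrtomb and reports whether the last byte stored is the null
   byte and *ps is back in the initial conversion state.
   out must have room for at least MB_LEN_MAX bytes. */
int terminate_and_check(char *out, mbstate_t *ps, size_t *written)
{
    assert(out != NULL && ps != NULL && written != NULL);

    /* For L'\0', wcrtomb stores any needed shift sequence followed by
       the null byte, and returns the total number of bytes stored. */
    *written = wcrtomb(out, L'\0', ps);
    if (*written == (size_t)-1)
        return 0;

    return out[*written - 1] == '\0' && mbsinit(ps) != 0;
}
```

A caller would convert its ordinary wide characters with the same mbstate_t first and then call this helper last; a return value of 1 means the output ends in a null byte and the state is initial again.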