Talk:cpp/atomic/memory order

1 See also
2 std::memory_order_release and memory_order_acquire example
3 memory_order_acq_rel and non-RMW load/stores
4 Are memory_order_acquire and memory_order_release correct?
5 Acquire operation and fences
6 memory_order_acquire
7 happens-before required to be acyclic
8 Does the example in "Sequentially-consistent ordering" section really require seq_cst?
9 Example with atomic string* is not valid because it is not trivially copyable
10 Is release-consume ordering correct?
11 Wording in acq_rel description
12 Release-acquire ordering: must load the value that was stored
13 writes visible

[edit] See also

Should we add books on particular topics as external links/see also? "C++ Concurrency in Action" explains this part well. --Cubbi 19:17, 21 September 2011 (PDT)

Sure, let's have an 'External links' section at the end of the page in question. P12 20:29, 21 September 2011 (PDT)

[edit] std::memory_order_release and memory_order_acquire example

In this example, both writes have release order and both reads have acquire order. Wouldn't it be sufficient that the second write has release order and the second read has acquire order? 188.97.0.41 04:50, 21 November 2011 (PST)

No: release on the store to y but not on the store to x would only guarantee that the store to x happens-before the store to y from the point of view of thread 1 and any thread that loads from y with at least memory_order_acquire (not consume or relaxed). You're suggesting that the load from y is relaxed; that means no synchronization whatsoever, so the assert may fail. --Cubbi 07:14, 21 November 2011 (PST)

I think this example is doing more synchronization than is required. It would work equally well if the x.store in thread1 used std::memory_order_relaxed and the x.load in thread2 used std::memory_order_relaxed. Also it would work equally well if x was just a regular int, not std::atomic<int>. It might be more instructive/realistic to make x a regular int in this example to show the power of an acquire/release pair.

I agree, it's important to show the ordering of non-atomics. Will update (or you do, if you have a good example) --Cubbi 15:51, 1 January 2012 (PST)

..updated. --Cubbi 13:26, 2 January 2012 (PST)
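For readers following this thread, here is a minimal sketch of the point made above: one release/acquire pair on a flag is enough to order a plain, non-atomic write. The names `payload`, `ready`, `writer`, `reader` and `run_once` are made up for illustration and are not taken from the article's example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names, for illustration only.
int payload = 0;                  // plain, non-atomic data
std::atomic<bool> ready{false};   // the only atomic object involved

void writer() {
    payload = 42;                                  // ordinary write
    ready.store(true, std::memory_order_release);  // publishes the write above
}

int reader() {
    while (!ready.load(std::memory_order_acquire)) // spin until the store is observed
        ;
    return payload;  // the write to payload happens-before this read
}

int run_once() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;
    std::thread t1(writer);
    std::thread t2([&] { seen = reader(); });
    t1.join();
    t2.join();
    assert(seen == 42);  // never fires
    return seen;
}
```

Calling `run_once()` from `main` runs both threads to completion. The read of `payload` is race-free only because the acquire load returned the value written by the release store.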

[edit] memory_order_acq_rel and non-RMW load/stores

It seems that memory_order_acq_rel is only valid on read-modify-write operations and not on loads or stores. This is documented in § 29.6.5.9 and § 29.6.5.13 of drafts N3242 or N3291. However, memory_order_seq_cst is valid on these operations. The documentation here at cppreference.com implies that memory_order_acq_rel should work on loads/stores, and further defines memory_order_seq_cst as effectively being a stronger form of memory_order_acq_rel. This should be updated to explain that memory_order_acq_rel is not valid on load()/store() but that memory_order_seq_cst is. --Kballard (talk) 23:34, 8 April 2014 (PDT)
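A small sketch of the distinction raised in this comment, assuming nothing beyond the standard `std::atomic<int>` interface: `memory_order_acq_rel` is passed to a read-modify-write operation, while the plain store and load use `memory_order_seq_cst`. The function name `acq_rel_demo` is hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper, for illustration only.
int acq_rel_demo() {
    std::atomic<int> counter{0};

    // Read-modify-write: acq_rel is permitted here.
    int previous = counter.fetch_add(1, std::memory_order_acq_rel);
    assert(previous == 0);

    // Plain store: relaxed, release and seq_cst are the permitted orders.
    counter.store(10, std::memory_order_seq_cst);

    // Plain load: relaxed, consume, acquire and seq_cst are the permitted orders.
    int current = counter.load(std::memory_order_seq_cst);

    // The following calls would violate the preconditions on load()/store():
    // counter.load(std::memory_order_acq_rel);
    // counter.store(1, std::memory_order_acq_rel);

    return previous + current;
}
```

The commented-out calls typically still compile, which may be why the restriction is easy to miss.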

[edit] Are memory_order_acquire and memory_order_release correct?

AFAIK an acquire happens before all succeeding loads and stores, and a release happens before all preceding loads and stores, while this page only talks about stores and ignores loads.

With the definitions in this page, it is not possible to implement a spinlock with acquire/release atomic operations.

84.94.198.183 05:29, 18 May 2014 (PDT) (Avi Kivity)
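For reference, a minimal sketch of the kind of spinlock this comment has in mind, assuming the usual reading in which acquire orders later reads and writes after the lock operation and release orders earlier reads and writes before the unlock. The class name `SpinLock` and the function `count_with_spinlock` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical spinlock, for illustration only.
class SpinLock {
    std::atomic<bool> locked{false};
public:
    void lock() {
        // Acquire: accesses in the critical section cannot move above this RMW.
        while (locked.exchange(true, std::memory_order_acquire))
            ;
    }
    void unlock() {
        // Release: accesses in the critical section cannot move below this store.
        locked.store(false, std::memory_order_release);
    }
};

int count_with_spinlock() {
    SpinLock lock;
    int counter = 0;  // non-atomic, protected by the lock
    auto work = [&] {
        for (int i = 0; i < 10000; ++i) {
            lock.lock();
            ++counter;  // both a read and a write of counter
            lock.unlock();
        }
    };
    std::thread a(work);
    std::thread b(work);
    a.join();
    b.join();
    assert(counter == 20000);
    return counter;
}
```

The increment inside the critical section is both a load and a store, so the lock is correct only if acquire and release constrain loads as well as stores, which is the concern raised above.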

[edit] Acquire operation and fences

(same issue with the "Release Operation" section).

The comment "Note that std::atomic_thread_fence is not an acquire operation." is misleading, if not totally wrong. e.g. load operations cannot be reordered after std::atomic_thread_fence(std::memory_order_acquire). --Itaj (talk) 18:38, 4 August 2016 (PDT)

this part of the page tries to be formal. See also In C++11, a Release Fence Is Not Considered a “Release Operation” at preshing.com --Cubbi (talk) 19:22, 4 August 2016 (PDT)

yeah, that's what I meant. I guess I'd like it to say "acquire fence is stronger than an acquire operation (see fences)" rather than just say they're different. --Itaj (talk) 21:25, 4 August 2016 (PDT)

okay, it is a good thing to say it's a stronger requirement. --Cubbi (talk) 03:33, 5 August 2016 (PDT)
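To make the fence discussion concrete, here is a hedged sketch in which the atomic operations themselves are relaxed and the ordering comes only from `std::atomic_thread_fence`. The names `fence_data`, `fence_flag`, `fence_writer`, `fence_reader` and `run_fence_demo` are assumptions made for this illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names, for illustration only.
int fence_data = 0;
std::atomic<bool> fence_flag{false};

void fence_writer() {
    fence_data = 7;
    std::atomic_thread_fence(std::memory_order_release);  // release fence
    fence_flag.store(true, std::memory_order_relaxed);    // relaxed store after the fence
}

int fence_reader() {
    while (!fence_flag.load(std::memory_order_relaxed))   // relaxed load
        ;
    std::atomic_thread_fence(std::memory_order_acquire);  // acquire fence
    return fence_data;  // ordered after the write through the two fences
}

int run_fence_demo() {
    fence_data = 0;
    fence_flag.store(false, std::memory_order_relaxed);
    int seen = -1;
    std::thread t1(fence_writer);
    std::thread t2([&] { seen = fence_reader(); });
    t1.join();
    t2.join();
    assert(seen == 7);  // never fires
    return seen;
}
```

Neither the load nor the store here is an acquire or release operation in the formal sense, yet the fences still establish the synchronization, which matches the "stronger than an acquire operation" phrasing suggested above.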

[edit] memory_order_acquire

The description says it prohibits reordering memory "accesses" before the operation. Shouldn't it say only that it "prohibits reordering of memory reads" before the operation? Same issue with memory_order_release and acq_rel. --Itaj (talk) 18:46, 4 August 2016 (PDT)

this part of this page tries to be informal and you seem to be right, it used 'access' meaning 'read'. --Cubbi (talk) 19:22, 4 August 2016 (PDT)

[edit] happens-before required to be acyclic

I did think it was important to explain that it means that the implementation has to take care of the atomic operations execution in order to achieve that. Otherwise, to me, it seems quite inexplicable in what way it is "required". Change it however you see fit. --Itaj (talk) 21:33, 4 August 2016 (PDT)

the standard notes that this requirement is unnecessary unless memory_order_consume is involved, and memory_order_consume is unimplemented and temporarily deprecated, so it feels like the note shouldn't have a lot of real estate. Do you have an example in mind where the order of atomic operations is modified by this rule? --Cubbi (talk) 03:33, 5 August 2016 (PDT)

found the cycle analysis in http://www.cl.cam.ac.uk/~pes20/cpp/popl085ap-sewell.pdf (which led to that sentence being added to the standard in http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3125.html, which has a more explicit example of the cycle) --Cubbi (talk) 06:03, 5 August 2016 (PDT)

Oh, it wasn’t at the top of my mind that memory_order_consume might be deprecated; I think I might have read something about it somewhere. What exactly is the status?

n4606 (the C++17 committee draft which is now gathering NB comments) says "Implementations have found it infeasible to provide performance better than that of memory_order_acquire. Specification revisions are under consideration" in 29.3[atomics.order]p1.3, brought in via P0371r1. The new work in progress revised spec is P0190r2 --Cubbi (talk) 10:12, 8 August 2016 (PDT)

So, it is true that for acquire/release the acyclic requirement is redundant. But it is redundant only in the sense that it is logically provable from the other requirements, specifically the coherence requirements. However, for pedagogical purposes, I think it is very much worth mentioning, specifically saying that it is provable for the acquire/release case. I think it is a very important property of the happens-before relation; without it, the topic is hard to really comprehend. I wouldn’t expect everyone to figure it out immediately by themselves, certainly not a first-time learner; I certainly wouldn’t have. Even the standard mentions this explicitly. --Itaj (talk) 09:39, 8 August 2016 (PDT)

oh, I see you had already added that. seems great, thanks for the attention. --Itaj (talk) 09:47, 8 August 2016 (PDT)

[edit] Does the example in "Sequentially-consistent ordering" section really require seq_cst?

I cannot understand why this will not work with atomic store-release and load-acquire operations:

#include <atomic>
#include <cassert>
#include <thread>

std::atomic<bool> x = {false};
std::atomic<bool> y = {false};
std::atomic<int> z = {0};

void write_x() {
    x.store(true, std::memory_order_release);
}

void write_y() {
    y.store(true, std::memory_order_release);
}

void read_x_then_y() {
    while (!x.load(std::memory_order_acquire))
        ;
    if (y.load(std::memory_order_acquire)) {
        ++z;
    }
}

void read_y_then_x() {
    while (!y.load(std::memory_order_acquire))
        ;
    if (x.load(std::memory_order_acquire)) {
        ++z;
    }
}

int main() {
    std::thread a(write_x);
    std::thread b(write_y);
    std::thread c(read_x_then_y);
    std::thread d(read_y_then_x);
    a.join(); b.join(); c.join(); d.join();
    assert(z.load() != 0);  // may happen??
}

In read_x_then_y, y.load(std::memory_order_acquire) cannot be reordered before x.load(std::memory_order_acquire) according to the definition:

no reads or writes in the current thread can be reordered before this load

The same goes for read_y_then_x. Could someone explain this to me? --Ki.stfu (talk) 05:49, 20 July 2017 (PDT)

Thread C reads 1 from x, then reads 0 from y. Thread D reads 1 from y, then reads 0 from x. Those are different threads, so there is no reason they would have a sequentially-consistent view of memory, unless you actually use sequentially-consistent memory order. --Cubbi (talk) 06:16, 20 July 2017 (PDT)
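For contrast with the release/acquire variant posted above, here is a sketch of the same four-thread test with every operation tagged `memory_order_seq_cst`. With a single total order over all these operations, at least one of the two reader threads must observe both flags set. The function name `run_seq_cst_demo` is hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical wrapper around the four-thread test, for illustration only.
int run_seq_cst_demo() {
    std::atomic<bool> x{false};
    std::atomic<bool> y{false};
    std::atomic<int> z{0};

    std::thread a([&] { x.store(true, std::memory_order_seq_cst); });
    std::thread b([&] { y.store(true, std::memory_order_seq_cst); });
    std::thread c([&] {
        while (!x.load(std::memory_order_seq_cst))
            ;
        if (y.load(std::memory_order_seq_cst)) ++z;
    });
    std::thread d([&] {
        while (!y.load(std::memory_order_seq_cst))
            ;
        if (x.load(std::memory_order_seq_cst)) ++z;
    });
    a.join(); b.join(); c.join(); d.join();

    assert(z.load() != 0);  // never fires with seq_cst
    return z.load();
}
```

Whichever of the two stores comes first in the total order, the reader that waits on the other store is guaranteed to see it, so `z` ends up as 1 or 2. With release/acquire there is no such shared order, and both readers may see the other flag as false.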

[edit] Example with atomic string* is not valid because it is not trivially copyable

The code example for "Release-Acquire ordering" is useful for demonstrating release-acquire. It also runs successfully on some platforms. However, using this example as a guide, I encountered memory corruption of atomic string content.

My research into this issue identified that 'string' is not trivially copyable (e.g. std::is_trivially_copyable<string>::value equals 0), much to my surprise. On MacOS with Apple LLVM g++ version 10.0.0, I was able to recreate memory corruption with single threaded load and store of atomic<string>, but not other trivially copyable atomics. Since trivially copyable is a requirement for atomic I concluded this was the cause of the corruption.

Can the code example be updated to use another class or a simple primitive type? It would be more correct and help those who assume all aspects of the examples are valid. Here is a potential update to the example. Possibly it is overly complicated, but also illustrates trivially copyable pointers.

#include <thread>
#include <atomic>
#include <cassert>
#include <string>
#include <iostream>
#include <type_traits> // std::is_trivially_copyable
 
 
class A {
 
    public:
        A(int x, int y) : x(x), y(y) {
        }
        int x;
        int y;
};
 
std::atomic<A*> ptr;
int data;
 
 
void producer()
{
    A* p  = new A(1, 2);
    data = 42;
    ptr.store(p, std::memory_order_release);
}
 
void consumer()
{
    assert(std::is_trivially_copyable<A>::value == 1); // A is trivially copyable, though std::atomic<A*> only requires that of the pointer type A*
 
    A* p2;
    while (!(p2 = ptr.load(std::memory_order_acquire)))
        ;
    assert(p2->x == 1); // never fires
    assert(p2->y == 2); // never fires
 
    std::cout << "Hello. x + y = " << p2->x + p2->y << std::endl;
 
    assert(data == 42); // never fires
}
 
int main()
{
    std::thread t1(producer);
    std::thread t2(consumer);
    t1.join(); t2.join();
}

the example on the page does not use atomic<string> (which would indeed be a problem): it uses atomic<string*>. --Cubbi (talk) 06:32, 25 January 2019 (PST)

I was also using atomic<string*> but made a typo in this post. Your quick response did clarify my understanding: string is not trivially copyable, but string* is. Using atomic<string*> only makes the pointer atomic. I believe your original example is correct because the atomic string pointer refers to a globally defined literal. My issue with corruption was due to management of the string content referenced by the pointer. Thank you for the follow-up.
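The distinction settled in this thread can be checked directly. This is a minimal sketch with hypothetical function names (`string_is_trivially_copyable`, `publish_and_read`). It relies on pointers always being trivially copyable, and it reports rather than assumes the result for `std::string`.

```cpp
#include <atomic>
#include <cassert>
#include <string>
#include <type_traits>

// Pointers are scalar types, so std::atomic<std::string*> is always well-formed.
static_assert(std::is_trivially_copyable<std::string*>::value,
              "pointer types are trivially copyable");

// Reports what the implementation says about std::string itself.
bool string_is_trivially_copyable() {
    return std::is_trivially_copyable<std::string>::value;
}

// Only the pointer is atomic; the string it points to must be managed separately.
std::string publish_and_read() {
    static std::string text = "Hello";  // outlives every use of the pointer
    std::atomic<std::string*> ptr{nullptr};
    ptr.store(&text, std::memory_order_release);
    std::string* p = ptr.load(std::memory_order_acquire);
    assert(p != nullptr);
    return *p;
}
```

On common standard library implementations `string_is_trivially_copyable()` returns false, which is consistent with the observation reported earlier in this thread.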

[edit] Is release-consume ordering correct?

The statement

If an atomic store in thread A is tagged memory_order_release and an atomic load in thread B from the same variable is tagged memory_order_consume, all memory writes (non-atomic and relaxed atomic) that are dependency-ordered-before the atomic store from the point of view of thread A, become visible side-effects within those operations in thread B into which the load operation carries dependency...

seems weird (the problematic part is "dependency-ordered-before"). First of all, dependency-ordered-before is strictly inter-thread. But thread A can also produce some effects that should be visible to thread B. I believe dependency-ordered-before should be replaced with something like happens-before, as illustrated in the following diagram:

thread A                                        thread B
X* x = new X; // A
data = 42; // B
x->v = 24; // C
p.store(x, std::memory_order_release); // D
                                                x = p.load(std::memory_order_consume); // E
                                                f(..., x, ...); // F
                                                g(..., x->v, ...); // G

D is d.o.b. E (dependency-ordered-before), E c.d. (carries dependency) into F and G. G must see write C which is s.b. (sequenced before) D. So we have C s.b. D d.o.b. E c.d. G (and effectively C i.h.b. or just h.b. G, where i.h.b. stands for inter-thread happens-before). However, C is not d.o.b. D because both C and D are in the same thread.

But raw happens-before also seems too weak to substitute for d.o.b. in the quoted statement, because data cannot be safely used as e.g. an argument of g in G (can it?).

--Kalaider (talk) 06:45, 15 March 2020 (PDT)

"happens before" is correct; that means that the writes either are sequenced before or inter-thread happen before the release-consume, and both are valid concatenations that give you "inter-thread happens before". The load from p doesn't carry a dependency to a use of data in your example, so you can't use data. T. Canens (talk) 07:58, 15 March 2020 (PDT)

[edit] Wording in acq_rel description

The description of memory_order_acq_rel says "No memory reads or writes in the current thread can be reordered before or after this store." First of all, "this store" seems wrong because acq_rel ordering applies by definition to RMW operations which involve both a load and a store. Second, although the Standard is not entirely clear about this, the interpretation seems to be that an acq_rel RMW consists of an acquire load and a release store. They are indivisible only in the sense that another write to the same object cannot happen in between, but reads and writes to other objects can. So a read or write of another object that comes later in program order is allowed to be reordered before the store, but not before the load. Likewise, an earlier read or write can be reordered after the load, but not after the store. Existing implementations do this, e.g. LL/SC architectures where the load and store of a RMW pair are actually separate instructions that may occur some distance apart.

See https://stackoverflow.com/questions/65568185/for-purposes-of-ordering-is-atomic-read-modify-write-one-operation-or-two for more discussion and a test case.

I would suggest the wording "No memory reads or writes in the current thread can be reordered before the load, nor after the store".

--Nate Eldredge (talk) 14:42, 19 March 2022 (PDT)

[edit] Release-acquire ordering: must load the value that was stored

In the discussion of release-acquire ordering, we have: "If an atomic store in thread A is tagged memory_order_release and an atomic load in thread B from the same variable is tagged memory_order_acquire, all memory writes (non-atomic and relaxed atomic) that happened-before the atomic store from the point of view of thread A, become visible side-effects in thread B." I would clarify that this promise holds only if the load in B actually returns the value that A stored, or a value from later in the release sequence. The Standard has the phrasing "a load that takes its value from the store" or "from the release sequence headed by the store".

This is probably assumed to be implicitly obvious, but I've encountered several beginners who have read this passage and gotten confused. They get the idea that this synchronization occurs no matter what, as if the load would magically wait for the store to complete before proceeding, like a kind of std::barrier. This of course breaks down if you think about it more carefully, e.g. if there are several stores throughout the program, how would the load know which one to wait for? But it is still confusing until they get it cleared up.

--Nate Eldredge (talk) 14:55, 19 March 2022 (PDT)

seems to make sense; applied both --Cubbi (talk) 07:36, 21 March 2022 (PDT)
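A short sketch of the clarification discussed in this section: the visibility guarantee applies only when the acquire load actually returns the value written by the release store. The names `shared_payload`, `published`, `publish` and `try_read` are hypothetical, and the `-1` sentinel assumes payload values are non-negative.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical names, for illustration only.
int shared_payload = 0;
std::atomic<bool> published{false};

void publish(int value) {
    assert(value >= 0);  // -1 is reserved as the "not yet published" sentinel
    shared_payload = value;
    published.store(true, std::memory_order_release);
}

// Reads the payload only if the acquire load took its value from the release store.
int try_read() {
    if (published.load(std::memory_order_acquire)) {
        return shared_payload;  // synchronized with publish()
    }
    // The load did not observe the store: nothing was synchronized, and the
    // load does not wait. Reading shared_payload here could be a data race.
    return -1;
}
```

A caller that gets `-1` back has to retry or do something else. The acquire load never blocks until a store arrives, which is the misunderstanding described above.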

[edit] writes visible

Does anyone know where in the standard it talks about things like "All writes in other threads that release the same atomic variable are visible in the current thread"? I think all I've found so far is the informal statement in 6.9.2.2/Note2. Thanks in advance. Uncreate (talk) 12:27, 3 October 2023 (PDT)
