Is powerpc64 atomic_load_acq_##TYPE omitting isync believed correct?
Mark Millard via freebsd-hackers
2021-05-30 06:04:39 UTC
In the code from /usr/include/machine/atomic.h for powerpc64
and powerpc there is:

#define ATOMIC_STORE_LOAD(TYPE) \
static __inline u_##TYPE \
atomic_load_acq_##TYPE(volatile u_##TYPE *p) \
{ \
	u_##TYPE v; \
 \
	v = *p; \
	powerpc_lwsync(); \
	return (v); \
} \
 \
static __inline void \
atomic_store_rel_##TYPE(volatile u_##TYPE *p, u_##TYPE v) \
{ \
 \
	powerpc_lwsync(); \
	*p = v; \
}

This code sequence does not involve isync, even though the header
also has:

#define __ATOMIC_ACQ() __asm __volatile("isync" : : : "memory")

What justifies this? All the reference material I've
found for C++/C11 semantics agrees with:

https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html

that shows (organized here to compare Relaxed vs.
Acquire and Release):

powerpc Load Relaxed vs. Acquire: ld vs. ld;cmp;bc;isync
powerpc Fence: Acquire: lwsync
powerpc Store Relaxed vs. Release: st vs. "Fence: Release";st
powerpc Fence: Release: lwsync

lwsync does not order prior stores vs. later loads, isync does
(and more in some respects). That likely (partially) explains
why load-acquire does not use just an acquire-fence in such
materials.
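
For comparison, here is a minimal C11 sketch of the two load shapes
being contrasted. The helper names are hypothetical and are not part
of FreeBSD's atomic(9) interface. Going by the mapping table cited
above, a compiler following it would emit ld;cmp;bc;isync for the
first shape and ld;lwsync for the second, which is the shape that
atomic_load_acq_##TYPE uses.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical helpers for illustration; not FreeBSD's atomic(9) API. */

/* Shape 1: the load itself carries acquire semantics. */
static unsigned long
load_acquire(_Atomic unsigned long *p)
{
	return (atomic_load_explicit(p, memory_order_acquire));
}

/* Shape 2: a relaxed load followed by a separate acquire fence. */
static unsigned long
load_then_acquire_fence(_Atomic unsigned long *p)
{
	unsigned long v;

	v = atomic_load_explicit(p, memory_order_relaxed);
	atomic_thread_fence(memory_order_acquire);
	return (v);
}

/* Single-threaded usage check: both shapes return the stored value. */
static void
demo_acquire_shapes(void)
{
	_Atomic unsigned long x = 7;

	assert(load_acquire(&x) == 7);
	assert(load_then_acquire_fence(&x) == 7);
}
```

In C11 terms both shapes give the load acquire ordering with respect
to later accesses in the same thread; they differ only in which
barrier instruction ends up in the generated code.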

Is this a problem for the correctness of "synchronizes with" as
described in "man atomic"? For an acquire operation that reads the
value written by a release operation, the man page says:

QUOTE
. . . the effects of all
prior stores by the releasing thread must become visible to subsequent
loads by the acquiring thread
END QUOTE

It seems that some later loads could be moved by the hardware
to be too early relative to various such prior stores (as seen
in the load-acquire thread): no constraint is placed for such
relationships by the atomic_load_acq_##TYPE as far as I can see.
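
The quoted requirement is the usual message-passing pattern. A
minimal sketch in portable C11 with POSIX threads follows, as an
illustration only: the names payload, ready, producer, consumer and
run_message_passing are made up here, and the C11 calls stand in for
atomic_store_rel_##TYPE and atomic_load_acq_##TYPE. Note what acquire
asks for in this model: accesses after the acquire load may not be
moved before that load. It does not ask for the acquiring thread's
own earlier stores to be ordered against its later loads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* All names below are hypothetical, chosen for this sketch. */
static int payload;		/* plain data written before the release */
static atomic_int ready;	/* flag; static storage starts at 0 */

static void *
producer(void *arg)
{
	(void)arg;
	payload = 42;		/* the "prior store" */
	/* Release: the payload store may not be moved after this store. */
	atomic_store_explicit(&ready, 1, memory_order_release);
	return (NULL);
}

static void *
consumer(void *arg)
{
	int *seen = arg;

	/* Acquire: the payload load may not be moved before this load. */
	while (atomic_load_explicit(&ready, memory_order_acquire) == 0)
		;
	*seen = payload;	/* the "subsequent load" */
	return (NULL);
}

/* Runs one producer/consumer pair; returns what the consumer read. */
static int
run_message_passing(void)
{
	pthread_t prod, cons;
	int rc, seen = 0;

	payload = 0;
	atomic_store(&ready, 0);
	rc = pthread_create(&cons, NULL, consumer, &seen);
	assert(rc == 0);
	rc = pthread_create(&prod, NULL, producer, NULL);
	assert(rc == 0);
	pthread_join(prod, NULL);
	pthread_join(cons, NULL);
	return (seen);
}
```

Once the consumer has observed ready == 1, the release/acquire pair
guarantees it reads 42 from payload. A single run passing on real
hardware shows nothing about the barriers, though; the guarantee
comes from the memory model, not from testing.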


(I got into this by finding some code that uses an
atomic_store_rel_##TYPE without any matching use of
atomic_load_acq_##TYPE, atomic_thread_fence_acq, or anything
similar, so far as I found. But looking around for a justification
for such code generated more questions, such as the one in this
note.)

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)
Mark Millard via freebsd-hackers
2021-05-31 11:23:04 UTC
Never mind. I figured out my significant confusion in
interpretation. (Net result: lwsync is more than
sufficient.)


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)