Fix some race conditions in xca6ll/hot/amcssth. #83
gareth-rees wants to merge 2 commits into master
4fd7155 to
b71e18c
f66d805 to
2ab4580
* Use memory-model-aware builtins in GCC and Clang when a memory location may be written by one thread and read by another, avoiding race conditions due to out-of-order updates on ARM.
* Call dylan_make_wrappers while the test is still single-threaded, preventing multiple threads from racing to call it.
* Prevent dylan_init from creating a padding object, as we must not have an exact root pointing at a padding object.
b71e18c to
e089937
Do you think it's better practice (in GitHub) to only link completely fixed issues using the "Development" field, and link partially fixed issues from comments? It would make sense if GitHub closes those issues automatically. (This could affect #97.)
GitHub automatically closes the issues in the "Development" field when the pull request is merged, so if you don't want the issue to be closed automatically then you shouldn't put it there.
…repare for review
Executing proc.review.entry
Entry passed. Entry took 15 mins.
Executing proc.review.plan.
Planning took 15 mins.
Executing proc.review.kickoff
Kickoff took 15 minutes.
I concentrated on understanding consistency of the atomic operations with the source material.
Macros defined in testlib.h appear consistent.
I was troubled by the comment about MSVC being Intel only.
I don't know if the MPS in general is asserted to conform to the C++11 memory model.
It took me 30 minutes to be reasonably sure I understood what the source material was saying.
I didn't understand the purpose of the changes in fmtdytst.c.
Possible minor defect in amcss.c: multiple declarations on one line? https://github.com/Ravenbrook/mps/pull/83/files#diff-2fe9be832421a20cd9426718eb3a93a0c04a930fa37eed15207bdeace863dfefR111
rptb1 left a comment
Executing proc.review.check
- Read #59
- Partially read https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html#g_t_005f_005fatomic-Builtins
- Examining use of __atomics.
- M: I think amcssth may not be consistent with its purpose, which is to stress the AMC pool. Instead, it seems to be busy stressing itself, and not acting like a common mutator program by using locks. This is major because we might spend a lot of time fixing amcssth. #59 (comment) says "Still occurs if we turn off garbage collection by parking the area." Why should we be debugging such a case? Also, rule.code.simple, rule.code.justified, rule.code.independent.
- M: The fact that the above arises suggests a problem with rule.generic.purpose.
- Transformations appear to preserve correctness.
- mi: branch/2021-01-25/amcssth-races seems to be an undocumented source?
- proc.review.check.metrics: 2 Major, 3 minor, 35 mins checking, read entire doc, but focussed on diffs really. Problem: poor concentration due to ME/CFS.
/* See <https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html>
 * and <https://clang.llvm.org/docs/LanguageExtensions.html> */
#define atomic_load(SRC, DEST) __atomic_load(SRC, DEST, __ATOMIC_ACQUIRE)
#define atomic_store(DEST, SRC) __atomic_store(DEST, SRC, __ATOMIC_RELEASE)
m: Unclear why __ATOMIC_ACQUIRE and __ATOMIC_RELEASE as opposed to other possibilities. rule.generic.clear
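To illustrate what the acquire/release pairing in these macros provides, here is a minimal sketch of the message-passing pattern. The two macro definitions are copied from the diff; the `payload` and `ready` variables and the `publish` and `try_consume` functions are hypothetical and are not taken from amcssth. The release store to `ready` means the earlier write to `payload` is visible to any thread whose acquire load observes `ready == 1`. With `__ATOMIC_RELAXED` that ordering would not be guaranteed on a weakly ordered CPU such as ARM.

```c
#include <assert.h>

/* Same definitions as in the diff: the generic GCC/Clang builtins take a
 * pointer to the shared location and a pointer to a local temporary. */
#define atomic_load(SRC, DEST) __atomic_load(SRC, DEST, __ATOMIC_ACQUIRE)
#define atomic_store(DEST, SRC) __atomic_store(DEST, SRC, __ATOMIC_RELEASE)

static int payload; /* hypothetical: ordinary data written before publication */
static int ready;   /* hypothetical: flag read and written by different threads */

/* Writer side: fill in the payload, then publish it with a release store. */
static void publish(int value)
{
  int one = 1;
  payload = value;            /* plain write */
  atomic_store(&ready, &one); /* release: the payload write cannot move after this */
}

/* Reader side: returns 1 and fills *out if the payload has been published,
 * otherwise returns 0 and leaves *out untouched. */
static int try_consume(int *out)
{
  int seen;
  assert(out != 0);
  atomic_load(&ready, &seen); /* acquire: later reads cannot move before this */
  if (!seen)
    return 0;
  *out = payload;             /* safe: ordered after the acquire load */
  return 1;
}
```

In a real test `publish` and `try_consume` would run on different threads; called in sequence on one thread they simply behave like ordinary reads and writes.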
#elif defined(MPS_BUILD_MV)
/* Microsoft Visual C/C++ does not need memory-model-aware load and store as
 * loads and stores of register-sized values are atomic on Intel. */
m: Is the atomicity important, or is it memory ordering? Should at least say which and why. Possible reasoning error also. rule.generic.self
thejayps left a comment
Found 2 minor (2m)
40 mins checking
Checked source code diffs, PR comments and partially consulted 3 interface documents for atomic operations.
@@ -292,6 +292,25 @@ extern void randomize(int argc, char *argv[]);
extern void testlib_init(int argc, char *argv[]);
m: Two separate documents describing the interface to llvm atomic operations exist and describe different interfaces.
https://clang.llvm.org/docs/LanguageExtensions.html#c11-atomic-operations
https://www.llvm.org/docs/Atomics.html
The former directs us to use a part of the interface to check that atomic operations exist. It seems that the code below is using the interface described in the second document. If my understanding of this is correct, then it isn't clear to me why one is used and not the other. My knowledge of this area isn't well developed so the clarity issue may be in the documentation of the interfaces rather than in the documentation of this pull request. However, some explanation by way of comment on this pull request of how these interfaces were identified and chosen could be useful for adding extra clarity.
@rptb1 summary: It's not clear to @thejayps why A not B. rule.generic.clarity
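As a possible aid to that explanation, here is a minimal sketch of how the two C-level interfaces documented on the Clang language extensions page can be told apart at compile time. It assumes only what that page documents: `__has_feature(c_atomic)` reports support for `_Atomic` types, which the `__c11_atomic_*` builtins operate on, while the GCC-compatible `__atomic_*` builtins work on ordinary objects. The macro names `HAVE_C11_ATOMIC_TYPES` and `HAVE_GNU_ATOMIC_BUILTINS`, the function `load_shared`, and the fallback branch are hypothetical and do not come from the pull request.

```c
#include <assert.h>

/* Compilers that lack __has_feature get a stub that always answers "no". */
#ifndef __has_feature
#define __has_feature(x) 0
#endif

/* Hypothetical flag: are _Atomic types (and so __c11_atomic_*) available? */
#if __has_feature(c_atomic)
#define HAVE_C11_ATOMIC_TYPES 1
#else
#define HAVE_C11_ATOMIC_TYPES 0
#endif

/* Hypothetical flag: GCC and Clang both provide the __atomic_* builtins. */
#if defined(__GNUC__) || defined(__clang__)
#define HAVE_GNU_ATOMIC_BUILTINS 1
#else
#define HAVE_GNU_ATOMIC_BUILTINS 0
#endif

/* Load an ordinary int with acquire ordering where the builtins exist. */
static int load_shared(int *location)
{
  assert(location != 0);
#if HAVE_GNU_ATOMIC_BUILTINS
  {
    int value;
    __atomic_load(location, &value, __ATOMIC_ACQUIRE);
    return value;
  }
#else
  /* Hypothetical fallback: a volatile read, with no ordering guarantee. */
  return *(volatile int *)location;
#endif
}
```

The `__atomic_*` form can be applied to existing variables without changing their declared types, which may be one reason to prefer it in test code; that is a guess at the rationale, not something stated in the pull request.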
Executing proc.review.log
Logging took 42 mins.
Executing proc.review.brainstorm
Possible new issues:
End at 17:25. 30 mins bang on!
We haven't assigned proc.review.role.editor or proc.review.role.improver and @thejayps is away next week. We will perhaps do some of these as a group for training purposes and pair programming.
…s. Clarifying reminders based on review testing <#83 (comment)>.
Comment written by @rptb1 on @thejayps session: So this is a bit meta-circular: @thejayps asked me about improving proc.review to mention capture of brainstorming thoughts that come up later. (And his asking me is an example of that!) I'm writing this to demonstrate what I would do about that:
* Call dylan_make_wrappers while the test is still single-threaded, preventing multiple threads from racing to call it.
* Prevent dylan_init from creating a padding object, as we must not have an exact root pointing at a padding object.
Note that this does not fix all the race conditions (there is at least one more that we have not figured out), but it reduces the frequency of failures in amcssth to less than one in a hundred on my 8-core Apple M2.
Work towards #59 (Rare failure in amcssth in hot variety on Linux)