Intel Compiler Bug by anandrdbz · Pull Request #161 · MFlowCode/MFC

anandrdbz · 2023-05-25T11:39:12Z

The issue with the Intel compilers seems to stem from MPI_SENDRECV not completing correctly in 2D/3D.

I added a NaN check on the receive buffer to narrow down where this comes from; however, adding this check seems to remove the bug entirely. All test cases pass with the Intel compiler.

The source of the bug seems rather bizarre, since adding a NaN check should not fundamentally change anything in the code. If I had to guess, MPI_SENDRECV with Intel MPI does not seem to be fully blocking as it should be, so the receive buffer is not fully populated when the call returns. The extra time required to perform the check perhaps allows the transfer to complete.

Either way, the bug seems to be compiler-related and not introduced in the code.

@sbryngelson, modular fp seems to fail in another case file with single precision (with gcc), so I'm not sure if this is the cause of the bug there, but we can always check.
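For readers following along, the kind of receive-buffer NaN check described above could look roughly like the sketch below. It is written in C purely for illustration and is not the code from this PR: the function name `first_nan_index` and its signature are assumptions, and the actual change lives in the Fortran source `m_mpi_proxy.fpp`.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical helper (not MFC's code): scan a receive buffer after the
 * exchange has returned and report the index of the first NaN found.
 * Returns -1 when every entry is a valid number. */
static long first_nan_index(const double *buf, size_t n)
{
    /* A NULL buffer is only acceptable when there is nothing to scan. */
    assert(buf != NULL || n == 0);

    for (size_t i = 0; i < n; ++i) {
        if (isnan(buf[i])) {
            return (long)i;
        }
    }
    return -1;
}
```

A scan like this reads every element of the buffer once, so besides reporting NaNs it also adds a small delay and a full pass over the received data. That is consistent with the timing guess above, though it does not confirm it.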

henryleberre · 2023-05-25T15:14:06Z

@anandrdbz It is a very strange issue. Is there a reason for including this check on CPUs regardless of the compiler but not on GPUs, apart from the GPU kernel launch that the NaN check would require when using CUDA-aware MPI? If we only need it for Intel compilers, we could guard it using the __INTEL_COMPILER preprocessor definition.
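A minimal sketch of the guard suggested above, again in C for illustration only. `__INTEL_COMPILER` is predefined by the classic Intel compilers; `USE_RECV_NAN_CHECK` and `checked_nan_count` are hypothetical names introduced here, and the newer LLVM-based Intel compilers may need a different macro (for example `__INTEL_LLVM_COMPILER`), which is worth verifying against the compiler documentation.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Enable the extra check only when building with an Intel compiler.
 * USE_RECV_NAN_CHECK is a hypothetical name used for this sketch. */
#if defined(__INTEL_COMPILER)
#define USE_RECV_NAN_CHECK 1
#else
#define USE_RECV_NAN_CHECK 0
#endif

/* Count NaNs in buf when the guard is enabled; otherwise do no work
 * and return 0, so other compilers pay no cost for the check. */
static size_t checked_nan_count(const double *buf, size_t n)
{
    assert(buf != NULL || n == 0);
#if USE_RECV_NAN_CHECK
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        if (isnan(buf[i])) {
            ++count;
        }
    }
    return count;
#else
    (void)buf;
    (void)n;
    return 0;
#endif
}
```

With this shape, the call site stays the same for every compiler and only the Intel build performs the scan.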

anandrdbz · 2023-05-25T15:19:58Z

@henryleberre it's only required with the Intel compilers (CPU), so yes, we can change it from !ACC to just INTEL. I'll change it in the commit.

Also, @sbryngelson, the PR says it failed on GPUs, but I verified that it works on Phoenix myself, so most likely it's just a random CI issue. You can confirm that it runs with Intel compilers as well.

sbryngelson · 2023-05-25T16:06:41Z

> Also, @sbryngelson, the PR says it failed on GPUs, but I verified that it works on Phoenix myself, so most likely it's just a random CI issue. You can confirm that it runs with Intel compilers as well

Perhaps, but if that's the case then we need to fix it.

sbryngelson · 2023-05-26T01:33:40Z

@anandrdbz I added Intel compilers back into the CI for this and you can see that they are failing.

anandrdbz · 2023-05-26T01:37:27Z

@sbryngelson it seems like it's failing a lot further into the test cases (3D viscous + bubbles), so I'll try to see what's causing it

sbryngelson · 2023-05-29T17:31:24Z

Some ideas (from @henryleberre): What happens if you disable MPI via --no-mpi? What if you enable debug and disable MPI with --no-mpi --debug? This should narrow down where the problems could occur.

Anand Radhakrishnan and others added 4 commits May 25, 2023 07:26

Intel Bug

ce1f832

Merge branch 'MFlowCode:master' into patch-1

517c229

Intel Bug

6c96b61

Merge branch 'patch-1' of https://github.com/anandrdbz/MFC into patch-1

d4f3e21

anandrdbz requested a review from sbryngelson as a code owner May 25, 2023 11:39

Anand Radhakrishnan and others added 3 commits May 25, 2023 07:48

gpu bug fixed

1f3e652

Typo

16dfcc1

Update m_mpi_proxy.fpp

8ea1bb6

anandrdbz added 2 commits May 25, 2023 11:27

Update m_mpi_proxy.fpp

d0ab972

Update m_mpi_proxy.fpp

f5885cb

Anand and others added 2 commits May 25, 2023 13:07

Typo

140f29d

Update ci.yml

cd4b5d1

sbryngelson requested a review from henryleberre as a code owner May 25, 2023 19:43

sbryngelson removed the request for review from henryleberre May 25, 2023 19:52

anandrdbz and others added 10 commits May 31, 2023 13:17

Update ci.yml

5b66ca7

Update ci.yml

485f7fe

TESTING INTEL COMPILER

31b5c4d

bug

55793cb

bug

88013bd

bug

55f24d0

bug

872c037

bug

a5c1bb0

Update ci.yml

9beaf42

UPDATE

9ab7639

Anand and others added 27 commits July 28, 2023 04:51

test

6644404

test

70fc39b

test

63ed1ae

test

5a1c71b

test

39828f0

test

b2c57ff

test

0d60622

test

7634285

Omega wrt

ce6cf67

test

94e82fb

test

035d11e

fix

a61a7f8

ci

f4f35cd

ci

3c844f5

ci

0ce3503

ci

7bc4d1d

ci

a62ec12

ci

7550791

ci

059c8d1

ci

5871c9b

ci

87b3497

ci

913a3dc

ci

66a5b31

ci

2c39f85

ci

f2ef824

ci

f36e437

CI

be8dc4a

sbryngelson approved these changes Aug 1, 2023

sbryngelson linked an issue Aug 1, 2023 that may be closed by this pull request

Intel compilers require some help #156

Closed

sbryngelson merged commit 363d584 into MFlowCode:master Aug 1, 2023

sbryngelson merged 82 commits into MFlowCode:master from anandrdbz:patch-1

3 participants
