Skip to content

Support building with GASNet-EX and MPI backends#322

Closed
manopapad wants to merge 4 commits intonv-legate:branch-22.10from
manopapad:gasnetex
Closed

Support building with GASNet-EX and MPI backends#322
manopapad wants to merge 4 commits intonv-legate:branch-22.10from
manopapad:gasnetex

Conversation

@manopapad
Copy link
Contributor

No description provided.

dest="gasnet",
action="store_true",
"--network",
dest="networks",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can feed multiple networks to Realm, but I am not advertising this currently, since this feature is not fully working on the Realm side.

@manopapad
Copy link
Contributor Author

One thing to note here is that previously we were assuming that "using networking" implied "using gasnet", and because gasnet requires MPI at build time, and initializes it at runtime, we could assume that "using networking" automatically meant "using MPI". This will not necessarily be the case for all backends, so we could presumably have two separate settings, USE_NETWORK and USE_MPI.

I chose to continue having a single USE_NETWORK setting; picking any networking backend, whether it requires MPI itself or not, also enables MPI support. Here is what USE_NETWORK=1 now implies regarding MPI:

  • We will build the legate sources with mpicc, regardless of what the Realm networking backed we are using was compiled with.
  • Any network-enabled build of legate will assume that an MPI execution environment has been set up (e.g. by mpirun or PMIx).
  • We will no longer be able to depend on some other component to MPI_Init for us. This is not currently true, since the MPI communicator expects the networking backend to have initialized MPI (see relevant inline comment on the code).

Comment on lines 248 to +250
log_coll.fatal(
"MPI has not been initialized, it should be initialized by "
"GASNet");
"the networking backend");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eddy16112 In the future we will want to also expect the case where the networking backend doesn't initialize MPI, and in that case do it ourselves.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have one variable GASNET_CONDUIT to tell which conduit is enabled for gasnet and another one LEGATE_NETWORK to tell which network backend is used? Then we can add a more comprehensive check here to check if we need to init MPI or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be simpler to just query MPI directly, using the MPI_Initialized call?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we do use MPI_Initialized. What I mean is we can have an extra level of verification. For example: if MPI_Initialized == true, then GASNET_CONDUIT should be mpi or ibv, otherwise, something is wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can check these macros, defined in realm_defines.h:

// one of these is defined, depending on the network used:
#define REALM_USE_GASNET1
/* #undef REALM_USE_GASNETEX */
/* #undef REALM_USE_MPI */

// if using gasnet, one of these will be defined:
#define GASNET_CONDUIT_MPI 1
/* #undef GASNET_CONDUIT_IBV */
/* #undef GASNET_CONDUIT_UDP */
/* #undef GASNET_CONDUIT_ARIES */
/* #undef GASNET_CONDUIT_GEMINI */
/* #undef GASNET_CONDUIT_PSM */
/* #undef GASNET_CONDUIT_UCX */
/* #undef GASNET_CONDUIT_OFI */

"--gasnet",
dest="gasnet",
action="store_true",
"--network",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CC @trxcllnt @jefflarkin: Note this modification to an install.py flag, that might conflict with your changes.

@manopapad manopapad requested a review from magnatelee August 6, 2022 02:22
@magnatelee
Copy link
Contributor

@manopapad looks good to me.

One question though: is there a particular reason that we allow multiple networks? When both gasnet1 and gasnet-ex are given, the latter is always chosen by the build. And the MPI backend is activated only when REALM_NETWORKS is exactly "mpi" and contains nothing else. Though I don't expect the developers to specify more than one network, I don't see much value in allowing multiple networks.

@manopapad
Copy link
Contributor Author

One question though: is there a particular reason that we allow multiple networks?

@streichler had suggested we are at least able to pass multiple networks to Realm, in preparation for when Realm can actually support multiple network modules under the same build.

@streichler
Copy link

Realm has partially-implemented support for being able to run with multiple networks active. Although there are no immediate plans to complete that implementation, there is a desire to avoid structural limitations in build flows and the like that assume at most one network module is permitted.

@magnatelee
Copy link
Contributor

@streichler had suggested we are at least able to pass multiple networks to Realm, in preparation for when Realm can actually support multiple network modules under the same build.

that's what I guessed. I suggest we at least raise a warning when someone tries to do it, as it doesn't have the intended effect at the moment.

@manopapad
Copy link
Contributor Author

I added the warning as requested, but I was asked to sit on this PR while the cmake changes are finalized, and will rebase over that when it lands.

@manopapad manopapad added the category:improvement PR introduces an improvement and will be classified as such in release notes label Sep 9, 2022
@manopapad
Copy link
Contributor Author

Deprecated by #384

@manopapad manopapad closed this Sep 23, 2022
@manopapad manopapad deleted the gasnetex branch January 27, 2023 20:22
manopapad pushed a commit that referenced this pull request Mar 5, 2025
* Better heuristic to locate the gtest binary

* Make the fail-to-find-build-dir a soft failure
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

category:improvement PR introduces an improvement and will be classified as such in release notes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants