Contributor

@guilara guilara commented Jul 30, 2024

Proposed changes

Add submit script information for Urania. Add a no-validate flag to the scheduler to prevent it from attempting to run MPI on the head node (#6183).
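As a rough illustration of what such a flag does, the shell sketch below skips a validation step when --no-validate is passed. The actual option belongs to the Python scheduler and its implementation is not shown in this conversation, so the function name schedule_job and the echoed steps are hypothetical placeholders.

```shell
# Hypothetical sketch: schedule_job stands in for the real Python scheduler.
schedule_job() {
    validate=true  # assumed default: validate before submitting
    for arg in "$@"; do
        case "$arg" in
            --validate) validate=true ;;
            --no-validate) validate=false ;;
        esac
    done
    if [ "$validate" = true ]; then
        # Placeholder for a validation run on the current (head) node
        echo "validating input file"
    else
        echo "skipping validation"
    fi
    # Placeholder for the actual job submission
    echo "submitting job"
}

schedule_job --no-validate
```

With --no-validate nothing is executed on the head node before the job is handed to the queue.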

Upgrade instructions

Code review checklist

  • The code is documented and the documentation renders correctly. Run
    make doc to generate the documentation locally into BUILD_DIR/docs/html.
    Then open index.html.
  • The code follows the stylistic and code quality guidelines listed in the
    code review guide.
  • The PR lists upgrade instructions and is labeled bugfix or
    new feature if appropriate.

Further comments

@guilara guilara force-pushed the PRs-Urania-SubmitScript branch from 2700b51 to 5a846f9 Compare July 30, 2024 09:02
@guilara guilara requested a review from nilsvu July 30, 2024 09:09
@guilara guilara marked this pull request as draft July 30, 2024 09:37
@guilara guilara force-pushed the PRs-Urania-SubmitScript branch from 5a846f9 to 9ab841b Compare July 30, 2024 09:44
@guilara guilara marked this pull request as ready for review July 30, 2024 09:45
@guilara guilara force-pushed the PRs-Urania-SubmitScript branch 3 times, most recently from 6f41978 to 455043b Compare July 30, 2024 12:13
{{ super() -}}
#SBATCH -D ./
#SBATCH --nodes {{ num_nodes | default(1) }}
#SBATCH --ntasks-per-node=1
Member

We have found that 2 tasks per node work well on other clusters of this size, maybe even 3. I see you currently use 1 task but 2 comm threads. Have you tried 2 tasks with 1 comm thread each? I'm not sure if there's a difference. @knelli2 @nilsdeppe, do you know?

Contributor

Typically, each task = one MPI rank = one comm core.

Contributor Author

I haven't really experimented with this, so I don't know.
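To make the trade-off in this thread concrete: if each task reserves one core for communication, as described above, the cores left for worker threads in each task follow from simple arithmetic. The sketch below assumes a hypothetical node with 72 cores (an illustrative number, not a statement about Urania's hardware), and workers_per_task is an illustrative helper, not part of the submit script.

```shell
# Cores left for worker threads in each task, assuming one comm core per task.
# Arguments: cores per node, tasks per node.
workers_per_task() {
    cores_per_node=$1
    tasks_per_node=$2
    echo $(( cores_per_node / tasks_per_node - 1 ))
}

# 72 cores per node is an assumed value for illustration only
workers_per_task 72 1   # prints 71
workers_per_task 72 2   # prints 35
workers_per_task 72 3   # prints 23
```

Under these assumptions, going from 1 to 2 tasks per node gives up one additional core to communication in exchange for a second comm thread.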

@guilara guilara force-pushed the PRs-Urania-SubmitScript branch 3 times, most recently from eddd75c to 3d4f862 Compare July 31, 2024 15:45
@guilara guilara requested a review from nilsvu July 31, 2024 15:48
@guilara guilara changed the title Add Urania machine and no-validate flag to scheduler Add Urania machine and validate flag to scheduler Jul 31, 2024
Member

@nilsvu nilsvu left a comment

You can squash

spectre_setup_charm_paths() {
# Define Charm paths
export CHARM_ROOT=/u/guilara/charm_impi_2/mpi-linux-x86_64-smp
export PATH=$PATH:/u/guilara/charm_impi_2/mpi-linux-x86_64-smp/bin
Member

I suggest you create a module file like /u/guilara/modules/charm/7.0.0-impi-2 that sets these paths. Then you can module load it below and don't have to call this extra spectre_setup_charm_paths function.

Contributor Author

I don't think I want to tackle this right now. Installing charm was a bit tricky.

Member

Then it would be better to set these paths in spectre_load_modules; otherwise you always have to call this extra function.
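A minimal sketch of that suggestion, assuming the environment file keeps a single spectre_load_modules function: the two exports from the snippet under review move into it, so no separate spectre_setup_charm_paths call is needed. The module loads of the real file are not shown in this conversation and are left as a placeholder comment.

```shell
spectre_load_modules() {
    # (module loads from the real environment file would go here)

    # Charm paths, taken from the snippet under review
    export CHARM_ROOT=/u/guilara/charm_impi_2/mpi-linux-x86_64-smp
    export PATH=$PATH:$CHARM_ROOT/bin
}
```

A submit script would then only need to source the environment file and call spectre_load_modules.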

-D Python_EXECUTABLE=${SPECTRE_HOME}/env/bin/python \
-D Catch2_DIR=/u/guilara/repos/Catch2/install_dir/lib64/cmake/Catch2 \
-D MPI_C_COMPILER=/mpcdf/soft/SLE_15/packages/skylake\
/impi/gcc_11-11.2.0/2021.7.1/bin/mpigcc \
Member

So cmake doesn't find these without help even though the impi/2021.7 module is loaded?

Contributor Author

I just tried removing them. I must have added them because cmake couldn't find MPI:

-- Could NOT find MPI_C (missing: MPI_C_LIB_NAMES MPI_C_HEADER_DIR MPI_C_WORKS) 
-- Could NOT find MPI (missing: MPI_C_FOUND C)

Member

Is MPI needed? If everything works without explicitly linking MPI then just remove these lines.

Contributor Author

I think we do need it, no?

Member

No, we don't link to it explicitly; only charm is built with it.

@guilara guilara force-pushed the PRs-Urania-SubmitScript branch 2 times, most recently from 2c0688e to 1438c07 Compare August 5, 2024 18:11
@guilara guilara requested a review from nilsvu August 5, 2024 18:14
Contributor

@knelli2 knelli2 left a comment

I think this looks good for now. If you want to change anything later, you can.

Please squash this all into two commits (after @nilsvu has looked at this again): one for the new validate option in Python, and one for adding the Urania env, machine, and submit files.

Comment on lines 28 to 30
source ${SPECTRE_HOME}/support/Environments/urania.sh
spectre_load_modules
spectre_setup_charm_paths
Contributor

[optional] You might also want to run module list after this so you can see what's loaded.
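One way this could look, as a sketch: a small guarded helper that prints the loaded modules only if a module command is available, so the script does not fail elsewhere. The name spectre_show_modules is hypothetical and not part of the environment file.

```shell
# Hypothetical helper: print loaded modules for the job log, if possible.
spectre_show_modules() {
    if command -v module >/dev/null 2>&1; then
        # Some module systems write the listing to stderr; redirect so it is logged
        module list 2>&1
    else
        echo "module command not available"
    fi
}

spectre_show_modules
```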

source /urania/u/guilara/repos/spack/var/spack/environments\
/env3_spectre_impi/loads
# Load python environment
source /u/guilara/envs/spectre_env
Contributor

Missing "bin/activate"?
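If spectre_env is a Python virtual environment directory (an assumption; the conversation does not confirm this), it is activated by sourcing the bin/activate script inside it rather than the directory itself. A guarded sketch, with spectre_activate_python_env as a hypothetical helper name:

```shell
# Hypothetical helper: activate a Python virtual environment if it exists.
# Argument: path to the environment directory.
spectre_activate_python_env() {
    env_dir=$1
    if [ -f "$env_dir/bin/activate" ]; then
        . "$env_dir/bin/activate"
    else
        echo "No virtual environment found at $env_dir" >&2
        return 1
    fi
}

# Usage with the path from the snippet under review:
#   spectre_activate_python_env /u/guilara/envs/spectre_env
```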

@knelli2 knelli2 added cli/pybindings Command line interface & Python bindings clusters Super computer support (env, submit, machine files) labels Aug 27, 2024
@guilara guilara force-pushed the PRs-Urania-SubmitScript branch 4 times, most recently from aa58e5e to 303d00e Compare May 15, 2025 13:51
@guilara guilara mentioned this pull request May 15, 2025
3 tasks
Contributor Author

guilara commented May 15, 2025

@nilsvu I know it's been a while. I've rebased this. Hopefully we can get this merged. Unfortunately, spack loads doesn't work for me on either Urania or Viper, so I've kept the "slow" way of activating the spack environment.

nilsvu previously approved these changes May 15, 2025
@nilsvu nilsvu enabled auto-merge May 15, 2025 15:16
auto-merge was automatically disabled May 15, 2025 15:50

Head branch was pushed to by a user without write access

@guilara guilara force-pushed the PRs-Urania-SubmitScript branch from dd3303e to bfc8b53 Compare May 15, 2025 15:50
@guilara guilara requested a review from nilsvu May 15, 2025 15:50
Contributor Author

guilara commented May 15, 2025

@nilsvu I've just removed the function that sets the charm paths.

Add no-validate flag to scheduler

Change name of option

Add environment file

Clean script

Fix validate flag

Fix
@guilara guilara force-pushed the PRs-Urania-SubmitScript branch from bfc8b53 to 032dec9 Compare May 15, 2025 17:59
@guilara guilara requested a review from nilsvu May 15, 2025 18:00
@nilsvu nilsvu merged commit 0f43b66 into sxs-collaboration:develop May 15, 2025
21 checks passed
Labels

cli/pybindings Command line interface & Python bindings clusters Super computer support (env, submit, machine files)

4 participants