Skip to content

Ensure threads are created only for I/O processes#697

Draft
jayeshkrishna wants to merge 11 commits into
masterfrom
jayeshkrishna/fix_num_threads
Draft

Ensure threads are created only for I/O processes#697
jayeshkrishna wants to merge 11 commits into
masterfrom
jayeshkrishna/fix_num_threads

Conversation

@jayeshkrishna
Copy link
Copy Markdown
Contributor

The number of threads were determined by first
getting the thread pool instance and querying it.
However getting the thread pool instance involves
creating the threads, so threads were created for
all processes.

We now use a static method for querying the
number of threads and the thread pool is
initialized only for I/O processes.

Also, removing waits on pending hdf5 operations
when closing/opening files

Add interface to the thread pool manager to get the number of
threads (configured for the thread pool)
The compute processes do not require any threads. However these
processes need to know the number of threads (configured in
the thread pool - in the I/O procs)

On I/O processes use the thread pool manager to init the thread
pool and get the thread ids. On compute procs use the thread
pool manager to just get the number of threads.
Disable waiting on pending HDF5 ops when closing non-HDF5 files
Disable waits for HDF5 ops when opening files
@jayeshkrishna jayeshkrishna requested a review from Copilot June 4, 2026 18:06
@jayeshkrishna jayeshkrishna self-assigned this Jun 4, 2026
@jayeshkrishna jayeshkrishna added Next Release Enhancements slated for the upcoming (next) release HDF5 Issues/PRs related to HDF5 Async Issues/PRs related to Asynchronous I/O labels Jun 4, 2026
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Comment on lines +15 to +21
int nthreads = 1;
#if PIO_USE_ASYNC_WR_THREAD
tids_ = PIO_Util::PIO_async_tpool_manager::get_tpool_instance()->get_thread_ids();
nthreads += PIO_Util::PIO_async_tpool_manager::get_num_threads();
if(io_comm != MPI_COMM_NULL){
tids_ = PIO_Util::PIO_async_tpool_manager::get_tpool_instance()->get_thread_ids();
assert(static_cast<int>(tids_.size()) == nthreads - 1);
}
Comment on lines +136 to +142
int nthreads = 1;
#if PIO_USE_ASYNC_WR_THREAD
/* Number of threads including the main thread */
std::size_t nthreads = tids_.size() + 1;
assert(union_comms_.size() == nthreads);
for(std::size_t tidx = 1; tidx < nthreads; tidx++){
nthreads += PIO_Util::PIO_async_tpool_manager::get_num_threads();
#endif
assert(static_cast<int>(union_comms_.size()) == nthreads);
for(int tidx = 1; tidx < nthreads; tidx++){
Avoid duping comms that the I/O threads don't use (or should be using)
Re-formatting code in pio_err function

This commit does not change the functionality (or add new code)
Since threads no longer have valid global comms, updating the
logic to print error by checking for validity of the comms before
using it
Threads should not be using global comms. Updating barrier in async
op to be on thread-specific I/O comm
When closing file ensure that we don't sync on the async I/O
threads
Since not all error handlers are multi-threaded use the return
error handler when closing files (in async+threaded case).
The new test case creates many files and writes to them using one
decomp per file. The test case has two phases for each I/O type,

1) Create decomposition, create file and write to file
2) Sync data (close+open for sync if needed), read and verify,
   close+delete files, free decomposition

Phase 1 is executed for all files before moving to phase 2
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Async Issues/PRs related to Asynchronous I/O HDF5 Issues/PRs related to HDF5 Next Release Enhancements slated for the upcoming (next) release

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants