Skip to content

Python: reacquire GIL for deallocation operations#1878

Merged
franzpoeschel merged 5 commits into
openPMD:devfrom
franzpoeschel:python-reaquire-gil-for-deallocation
Apr 30, 2026
Merged

Python: reacquire GIL for deallocation operations#1878
franzpoeschel merged 5 commits into
openPMD:devfrom
franzpoeschel:python-reaquire-gil-for-deallocation

Conversation

@franzpoeschel

@franzpoeschel franzpoeschel commented Apr 28, 2026

Copy link
Copy Markdown
Contributor

This allows releasing the GIL for flushing, making it possible to run a data loading job in another Python thread

TODO:

  • Do the same for all other flush operations (Iteration::close, Attributable::seriesFlush)
  • Release 0.17.1 yay or nay? yay is defensible :D

@franzpoeschel franzpoeschel requested a review from ax3l April 28, 2026 12:37
@franzpoeschel franzpoeschel added this to the 0.18.0 milestone Apr 28, 2026
Comment thread src/binding/python/Series.cpp
Offset const &offset,
Extent const &extent)
{
a.inc_ref();

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is that removed now? because .reset() blocks now until a is handled?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was a bug I noticed while working on this PR, I extracted this to #1879.
With #1775, I removed explicit ref-counting and replaced it by owning Pybind references. The unmatched inc_ref() is basically a memory leak.

Comment on lines +334 to 339
(T *)data,
[owning_handle =
std::make_optional(a.cast<py::object>())](T *) mutable {
py::gil_scoped_acquire need_the_gil_for_this;
owning_handle.reset();
});

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused what happens here. In dev it was a no-op, now it does aquire the gil and call reset. The do string is the same 😅 what's happening :D

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before this PR, owning_handle is auto-destroyed by RAII, that's why the lambda itself is a no-op. But since destroying a Python handle latches into the Python interpreter, we need to do it while holding the GIL, so I moved the destruction inside the function body.

@@ -283,22 +284,8 @@ not possible once it has been closed.
.def(
"__getitem__",
[](Snapshots &s, Series::IterationIndex_t key) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codex saw this:

  • src/binding/python/Series.cpp:286: the new unconditional py::gil_scoped_release also covers SnapshotWorkflow::RandomAccess. In
    that path s[key] reaches RandomAccessIteratorContainer::operator[], which can mutate the underlying container by inserting a
    missing iteration. That was previously serialized by the GIL and is not a flushing/blocking path. I’d keep the old switch and
    only release the GIL for the synchronous workflow.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Careful: s[key] in C++ inserts an element if missing on first access.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This bug existed before this PR, but we can use this PR to fix it

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: Not a bug, false positive. This binds Snapshots, an opaque C++ map type, not managed by Python. Python only accesses this through methods, so the interpreter is not bothered by container state manipulation.

@franzpoeschel franzpoeschel requested a review from ax3l April 29, 2026 15:02
@franzpoeschel franzpoeschel changed the title Python: reaquire GIL for deallocation operations Python: reacquire GIL for deallocation operations Apr 29, 2026
@franzpoeschel franzpoeschel merged commit 6445feb into openPMD:dev Apr 30, 2026
31 checks passed
franzpoeschel added a commit to franzpoeschel/openPMD-api that referenced this pull request Apr 30, 2026
* Delete stray inc_ref

* Reaquire the GIL for releasing references to Python buffers

* Use std::optional for simplified explicit destruction

* Make Iterator GIL free

* Add this also for other flushing methods
franzpoeschel pushed a commit to franzpoeschel/openPMD-api that referenced this pull request Apr 30, 2026
@franzpoeschel franzpoeschel modified the milestones: 0.18.0, 0.17.1 May 12, 2026
ax3l added a commit that referenced this pull request Jun 23, 2026
* Version bump pybind11 -> 3.0.2 (#1849)

* Bump Pybind11 to 3.0.2

* Use cpp_function for defining properties

* Revert pybind11 default version back to old

* Revert "Revert pybind11 default version back to old"

This reverts commit 3f3063b.

* Bump toml11 to 4.4.0, nlohmann_json to 3.12.0 (#1842)

* Bump versions for toml11 nlohmann_json pybind11

* Return py::object instead of std::variant to pybind11

* Python: Fix keep_alive specifications, add tests for keep_alive (#1851)

* Fix keepalives

* Add simple GC test

* Add keepalive for snapshots api

* Add more extensive keepalive test

* Slightly API-breaking.. need to del everything

* tmp, check sth

* tmp check ci

* Revert "tmp check ci"

This reverts commit 1cf6973.

* Revert "tmp, check sth"

This reverts commit 93ed467.

* Fix typing issues with load/store_chunk in Python

* Remove duplicate Python test (#1867)

These are prefixed by test*, so they already run separately.

* Fix AppVeyor 64bit build (#1832)

* CI: Fix AppVeyor x64 (64bit) build: Architecture not part of `-G` anymore, maybe `-A x64`
* Fix instantiations for AttributeWithShapeAndResource
* `PatchRecordComponent`: Match Identical Types
* Consistency: `dtype_to_numpy` and `dtype_from_numpy`
* Update: `APITest.py` `testDataset`

Co-authored-by: Franz Pöschel <franz.poeschel@gmail.com>

* Harmonize Datatype equality checks (#1854)

* Fix typing issues with load/store_chunk in Python

* Datatype helpers: non-template variants

* Unify Datatype equality semantics

* replace operator==(Datatype, Datatype) by isSame

* HDF5: Close operations also upon failure (#1866)

Replicate a Golang-inspired defer pattern in order to ensure resource cleanup also upon early return.

---------

Co-authored-by: AI Agent <ai@anthropic.com>

* HDF5: proper status check (#1870)

* Fix parallel deletion (#1858)

* create_directories: preserve sticky and setgid permissions (#1855)

* Fix sticky permissions

* Use the helpers only privately

* Cache Iteration indexes instead of computing them on the spot (#1860)

* attempt to fix issue #1859 by caching the index and avoid indexOf() call
for iterations

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Franz Pöschel <franz.poeschel@gmail.com>

* Do not flush if the backend does not support Span API (#1863)

* Do not flush if the backend does not support Span API

* Fix flushing logic

* Flush dirty ADIOS2 files in sorted order (#1868)

* Keep written flag consistent across MPI ranks for rankTable (#1869)

* WIP: First draft of CHANGELOG for 0.17.1

This is an initial draft based on commits since v0.17.0.
Categories and items may be adjusted before release.

* Do not initially emplace any value for the cached Iteration index (#1873)

this avoids subtle bugs
followup to #1860

* Update changelog

* Fix defer::to_opaque (#1872)

* Remove catch2/catch_tostring.hpp include (#1875)

test/Files_Core/read_nonexistent_attribute.cpp:23:10: fatal error: catch2/catch_to
string.hpp: No such file or directory
   23 | #include <catch2/catch_tostring.hpp>
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.

* ADIOS2: enable stats by default, starting ADIOS2 v2.12.1 (#1877)

* ADIOS2: enable stats by default, starting ADIOS2 v2.12.0

This PR ornladios/ADIOS2#4985 changes things a bit performance-wise.

* Use most recent main branch of toml11 (#1874)

* Update toml11 to current main tag

b32a2fff0d27e1f7522f26a125101500ddb47156

* Define TOML11_DISABLE_SOURCE_LOCATION (up for discussion)

* Python: reacquire GIL for deallocation operations (#1878)

* Delete stray inc_ref

* Reaquire the GIL for releasing references to Python buffers

* Use std::optional for simplified explicit destruction

* Make Iterator GIL free

* Add this also for other flushing methods

* Migrate C++ Tests to Catch2 v3 (#1823)

* Migrate to Catch2 v3


Co-authored-by: Franz Pöschel <franz.poeschel@gmail.com>

* Update CHANGELOG: add missing backported commits (#1877 #1878 #1823)

* Delete stray inc_ref (#1879)

* Update changelog

* version -> 0.17.1

retrigger ci

* Doc: CASUS "for" (#1883)

The official spelling (homepage, recent papers) seems to use
"Center for Advanced ..." spelling, not "of". Update accordingly :)

* Update changelog

* Forbid slashes in attribute and variable names (#1884)

* Update changelog

* Fix contiguous check: Accept only C-contiguous, not Fortran contiguous (#1886)

* Update changelog

* Bump actions/checkout from 6 to 7

Bumps [actions/checkout](https://github.com/actions/checkout) from 6 to 7.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](actions/checkout@v6...v7)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '7'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

* Fix deprecated use in InvalidatableFile.hpp

* CI: Build Windows wheels with CMake 4.x (VS 2026)

The "MSVC w/o MPI via pip" job failed on the windows-latest runner after
it upgraded to Visual Studio 2026 (VS 18):

    Generator NMake Makefiles does not support platform specification,
    but platform x64 was specified.
    CMAKE_C_COMPILER not set, after EnableLanguage

`pip wheel` builds in an isolated environment that installs CMake per
pyproject.toml's build requirement. The `<4.0.0` cap pinned that CMake to
3.x, which predates the "Visual Studio 18 2026" generator (added in CMake
4.2), so CMake fell back to "NMake Makefiles" and rejected setup.py's
`-A x64`.

Drop the CMake upper bound so the isolated build uses a current CMake that
auto-detects the VS 2026 generator, matching the runner's own CMake that
the non-pip MSVC job already builds with. Also bump the job to Python 3.11.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* `setup.py`: License SPDX

The license metadata was replaced with a standard SPDX identifier.

* Changelog: Finalize

* cleanup: test version

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Axel Huebl <axel.huebl@plasma.ninja>
Co-authored-by: AI Agent <ai@anthropic.com>
Co-authored-by: Junmin Gu <guj@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: AI Agent <ai-agent@localhost>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: David Grote <grote1@llnl.gov>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants