Change implementation of the `__init__() must be called when overriding __init__` safety feature to work for any metaclass. #30095
* Call from new `tp_init_intercepted()` (adopting mechanism first added in PyCLIF: google/clif@7cba87d).
* Remove `pybind11_meta_call()` (which was added with pybind/pybind11#2152).
7084009 to 7b66ebe
…nit != tp_init_intercepted`
…in).
```
==6380==WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x5611589c9a58 in Py_DECREF third_party/python_runtime/v3_11/Include/object.h:537:9
...
Uninitialized value was created by a heap deallocation
#0 0x5611552757b0 in free third_party/llvm/llvm-project/compiler-rt/lib/msan/msan_interceptors.cpp:218:3
#1 0x56115898e06b in _PyMem_RawFree third_party/python_runtime/v3_11/Objects/obmalloc.c:154:5
#2 0x56115898f6ad in PyObject_Free third_party/python_runtime/v3_11/Objects/obmalloc.c:769:5
#3 0x561158271bcc in PyObject_GC_Del third_party/python_runtime/v3_11/Modules/gcmodule.c:2407:5
#4 0x7f21224b070c in pybind11_object_dealloc third_party/pybind11/include/pybind11/detail/class.h:483:5
#5 0x5611589c2ed0 in subtype_dealloc third_party/python_runtime/v3_11/Objects/typeobject.c:1463:5
...
```
…ith PyPy and Python `type` as metaclass")
…n/sharding.cc This change is to unblock google/pybind11clif#30095. Leaving wrapped C++ types uninitialized creates a potential for triggering undefined behavior from Python. PiperOrigin-RevId: 602787434
…n/sharding.cc This change is to unblock google/pybind11clif#30095. Leaving wrapped C++ types uninitialized creates a potential for triggering undefined behavior from Python. PiperOrigin-RevId: 602884828
…pe' is not used` diagnostics (many CI jobs; seems to be a clang issue).
__init__() must be called when overriding __init__ safety feature to work for any metaclass.
type->tp_base = type_incref(&PyType_Type);
type->tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HEAPTYPE;

#if defined(PYPY_VERSION)
The new mechanism does not need to override tp_call?
I think what is missing in the context is that `pybind11_meta_call` only provides protection for base class init, and can thus be skipped on CPython.
Maybe defining a few secondary compilation conditions,
#if defined(PYPY_VERSION)
#define PYBIND11_ENABLE_TP_META_CALL_INIT_PROTECTION
#else
#define PYBIND11_ENABLE_SAFE_TP_INIT_INIT_PROTECTION
#endif
and wrap the two protection mechanisms in their corresponding conditions?
No: Basically, the starting point for the original change in PyCLIF was exactly to find a trick that doesn't involve manipulating the metaclass.
I also thought a little bit about "intercepting" PyType_Type.tp_call, but that would be very intrusive and have a runtime impact for pretty much all calls creating a Python object.
> Maybe defining a few secondary compilation conditions,

Done, thanks for the suggestion! I chose different names for the macros and made it so that the choice of implementation can be overridden externally.
}

using derived_tp_init_registry_type
    = std::unordered_map<PyTypeObject *, int (*)(PyObject *, PyObject *, PyObject *)>;
what about initproc as the value type?
Much nicer, thanks! (I totally didn't think of that)
from pybind11_tests import python_multiple_inheritance as m

#
# Using default py::metaclass():
# CppBase0 Uses default py::metaclass()

#
# Using py::metaclass((PyObject *) &PyType_Type):
// This mechanism was originally developed here:
// https://github.com/google/clif/commit/7cba87dd8385ab707c98e814ce742eeca877eb9e
extern "C" inline int tp_init_intercepted(PyObject *self, PyObject *args, PyObject *kw) {
    assert(PyType_Check(self) == 0);
How is this condition guaranteed? Looks like we don't ensure it when intercepting `tp_init`, based on my initial read of the `pybind11_object_new` function.
PS: if it was not for consistency with pyclif naming, I would recommend calling this `safe_tp_init`.
> How is this condition guaranteed?

It's not. I put the `assert()` here only to be sure my code isn't doing something unexpected. This PR is already TGP tested, which makes me believe it is very unlikely that the `assert()` will ever fire, but if it does, I'll have a concrete situation to fix (as opposed to anticipating/guessing).

> PS: if it was not for consistency with pyclif naming,

Name changed to `tp_init_with_safety_checks`.
(After this PR is merged I'll go back and change the PyCLIF code accordingly. I also want to backport the weakref-based cleanup.)
/// Instance creation function for all pybind11 types. It only allocates space for the
/// C++ object, but doesn't call the constructor -- an `__init__` function must do that.
extern "C" inline PyObject *pybind11_object_new(PyTypeObject *type, PyObject *, PyObject *) {
#if defined(PYBIND11_INIT_SAFETY_CHECKS_VIA_INTERCEPTING_TP_INIT)
maybe the ifdef for pybind11_meta_call should also be in the body like here?
Decided on chat: leaving as is.
…ges. In order of importance:
* Bug fix (moderately important): Missing `Py_DECREF(type_self)` in `tp_dealloc_impl`.
  * This bug was triggered with the upgrade to Python 3.9. It was discovered coincidentally when adding the new `testDerivedTpInitRegistryWeakrefBasedCleanup` (see below). When running the test in a `while True` loop, the Resident Memory Size (`RES` in `top`) exceeded 1 GB after just a few seconds.
  * Root cause: python/cpython#79991
  * Critical clue leading to the fix: https://github.com/pybind/pybind11/blob/768cebe17e65c2a0a64ed067510729efc3c7ff6c/include/pybind11/detail/class.h#L467-L469
* Bug fix (very minor): Incorrect `Py_DECREF(self)` in `tp_init_with_safety_checks`.
  * This bug was introduced with cl/559501787. The `Py_DECREF(self)` was accidentally adopted along with the corresponding pybind11 error message (see link in description of cl/559501787). It was discovered coincidentally by MSAN (heap use after free) while testing [google/pybind11clif#30095](google/pybind11clif#30095).
  * After inspecting https://github.com/python/cpython/blob/b833b00482234cb0f949e6c277ee1bce1a2cbb85/Objects/typeobject.c#L1103-L1106 it became clear that the `Py_DECREF(self)` is indeed incorrect in this situation (it is correct in the original pybind11 sources).
* The weakref-based cleanup added in [google/pybind11clif#30095](google/pybind11clif#30095) is ported back to PyCLIF. This was the original purpose of this CL.
* `tp_init_intercepted` is renamed to `tp_init_with_safety_checks` for compatibility with google/pybind11clif#30095.

GitHub testing: #90
PiperOrigin-RevId: 604337502
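The weakref-based cleanup mentioned in the commit message above can be illustrated with a small pure-Python sketch. This is an analogy only, not the C++ code from the PR: the registry layout, `register_original_init`, and `make_temporary_type` are hypothetical names chosen for the illustration. The idea is that the registry is keyed by the identity of the type (much as C++ code would key by the `PyTypeObject *` address), and a weak reference callback erases the entry once the type is garbage collected.

```python
import gc
import weakref

# Hypothetical registry: id(type) -> (original_init, weakref that owns the callback).
derived_init_registry = {}


def register_original_init(cls, original_init):
    """Remember original_init for cls without keeping cls alive."""
    key = id(cls)  # analogous to keying by the type object's address

    def _erase_entry(_dead_ref, key=key):
        # Runs when cls is garbage collected. Dropping the stale entry avoids
        # confusing a later type allocated at the same address with the old one.
        derived_init_registry.pop(key, None)

    # The weakref must stay alive for its callback to fire, so store it in the entry.
    derived_init_registry[key] = (original_init, weakref.ref(cls, _erase_entry))


def make_temporary_type():
    class Temporary:
        pass

    return Temporary


tmp = make_temporary_type()
register_original_init(tmp, object.__init__)
entries_while_alive = len(derived_init_registry)

del tmp
gc.collect()  # classes sit in reference cycles, so force a collection
entries_after_collection = len(derived_init_registry)
```

The stored value must not hold a strong reference back to the class (for example through a `super()` closure cell), otherwise the class is never collected and the callback never runs. A C function pointer, as stored by the registry in the PR, does not have that problem.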
Description

The safety feature introduced with pybind/pybind11#2152 (`__init__() must be called when overriding __init__`) only works for the default `py::metaclass()`. This PR changes the implementation of the safety feature so that it works for any metaclass (except when using PyPy; see below). (This has already uncovered missing `super.__init__()` calls in jax.)

The main driving force for this PR is that PyCLIF-pybind11 needs to use `py::metaclass((PyObject *) &PyType_Type)`. Without this PR, the equivalent safety feature in PyCLIF would be lost. Note that it was a significant effort to clean up the Google codebase before the PyCLIF safety feature could be enabled, with this PyCLIF commit; i.e. it would be a very concerning loss.

This PR is based on a mechanism originally introduced with that PyCLIF commit. However, the original PyCLIF mechanism lacks the weakref-based cleanup of the internal registry backing the mechanism (`derived_tp_init_registry`). While this has not been a problem in practice, it is easy enough to add the cleanup feature here. The corresponding code is very similar to existing code here, which is a critical part of the core pybind11 functionality; in other words, it is a heavily tested and time-tested approach.

Details, based on the original PyCLIF commit message linked above:

Situation:

`CppBase` is a `py::class_`-wrapped C++ object.

What happens when the Python interpreter processes the following code (usually at import time)?

When the native Python `PC` class is built:

`PC` `tp_new` is set to use `CppBase` `tp_new`, but `PC` `tp_init` does NOT in any way involve `CppBase` `tp_init`.

It is the responsibility of `PC.__init__` to call `CppBase.__init__`, but this is not checked.

The approach used in this PR is:

* `PC` `tp_init` is replaced with an "intercept" function.
* The intercept function calls the original `PC` `tp_init`.
* After that call finishes (and if it was successful), the intercept function checks if the `CppBase` wrapped C++ object was initialized.

Note that the `derived_tp_init_registry()->count(type) == 0` condition in `pybind11_object_new()` enables daisy-chaining similar intercept functions, e.g. in other pybind11 extensions built with hidden visibility, or potentially other binding systems.

The intercept mechanism turns out to not be compatible with PyPy, therefore under PyPy the safety feature cannot be enabled for metaclasses other than the default `py::metaclass()`.

Suggested changelog entry:
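To make the intercept approach from the description concrete, here is a pure-Python analogy. It is not the PR's implementation, which swaps the C-level `tp_init` slot from `pybind11_object_new()`. The names `derived_init_registry`, `_install_intercept`, `_cpp_initialized`, and the classes `Good` and `Bad` are hypothetical, and this `CppBase` is an ordinary Python class standing in for a wrapped C++ type.

```python
# Pure-Python analogy only; the PR does the equivalent in C++ on the tp_init slot.
derived_init_registry = {}  # hypothetical: derived class -> its original __init__


def _install_intercept(cls):
    """Replace cls.__init__ with a wrapper that verifies CppBase.__init__ ran."""
    original_init = cls.__init__
    derived_init_registry[cls] = original_init

    def init_with_safety_checks(self, *args, **kwargs):
        original_init(self, *args, **kwargs)
        # Reached only if the original __init__ did not raise.
        if not self._cpp_initialized:
            raise TypeError(
                "CppBase.__init__() must be called when overriding __init__"
            )

    init_with_safety_checks._is_intercept = True  # marker to avoid double wrapping
    cls.__init__ = init_with_safety_checks


class CppBase:
    """Stand-in for a wrapped C++ type (assumption: plain Python here)."""

    def __new__(cls, *args, **kwargs):
        self = super().__new__(cls)
        self._cpp_initialized = False  # allocation only, nothing constructed yet
        already_intercepted = getattr(cls.__init__, "_is_intercept", False)
        if (
            cls is not CppBase
            and cls not in derived_init_registry
            and not already_intercepted
        ):
            _install_intercept(cls)
        return self

    def __init__(self):
        self._cpp_initialized = True  # stands in for running the C++ constructor


class Good(CppBase):
    def __init__(self, value):
        super().__init__()
        self.value = value


class Bad(CppBase):
    def __init__(self, value):
        self.value = value  # never calls CppBase.__init__
```

With these definitions, `Good(3)` constructs normally, while `Bad(3)` raises `TypeError` from the wrapper after `Bad.__init__` returns. The sketch keeps one wrapper per class because patching `__init__` in Python also changes what `super().__init__()` finds, which is not the case when only the C-level slot is replaced.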