#49915
#49916
Related Code
RETURN_NOT_OK(TensorToNdarray(sparse_index.indptr()[i], base, &item));
if (PyList_SetItem(indptr.obj(), i, item) < 0) {
  Py_XDECREF(item);

RETURN_NOT_OK(TensorToNdarray(sparse_index.indices()[i], base, &item));
if (PyList_SetItem(indices.obj(), i, item) < 0) {
  Py_XDECREF(item);
Component(s)
Python
Reference Counting and Ownership in CPython Native API
Borrowed Reference
A borrowed reference is a reference to an object that you do not own. You must not decrement its reference count when you are done with it, but you must ensure the object stays alive while you are using it (for example, by calling Py_INCREF to create an owned reference if necessary). Functions such as PyList_GetItem() return borrowed references: the item is returned from the list without its reference count being incremented.
New Reference
A new reference (also called an "owned reference") is a reference that you have ownership of. When you receive a new reference from a function (such as PyObject_New() or Py_BuildValue()), you are responsible for calling Py_DECREF() on it when you no longer need it to properly decrement its reference count. Failure to do so causes memory leaks.
Stolen Reference (Stealing)
A reference is stolen when a function takes ownership of a reference you pass to it. After you pass an object reference to a function that "steals" it, you no longer own that reference and must not call Py_DECREF() on it afterward. The function assumes full responsibility for releasing that reference.
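The three kinds of reference can be traced with a toy model. This is only a sketch: ToyObj, ToyList and every toy_* function below are hypothetical stand-ins invented for illustration, not part of the CPython API, and the model tracks nothing but a reference count.

```c
#include <stddef.h>

/* Hypothetical stand-in for PyObject: only a reference count. */
typedef struct {
    int refcnt;
} ToyObj;

/* Hypothetical stand-in for a one-slot list. */
typedef struct {
    ToyObj *slot;
} ToyList;

/* Returns a NEW reference: the caller owns it and must release it. */
static ToyObj *toy_new(ToyObj *storage) {
    storage->refcnt = 1;
    return storage;
}

static void toy_incref(ToyObj *o) { o->refcnt++; }
static void toy_decref(ToyObj *o) { o->refcnt--; }

/* Returns a BORROWED reference: the count is not touched. */
static ToyObj *toy_list_get(ToyList *l) { return l->slot; }

/* STEALS the caller's reference to item: the list owns it afterward. */
static void toy_list_set(ToyList *l, ToyObj *item) {
    if (l->slot != NULL) {
        toy_decref(l->slot);  /* discard the reference previously held */
    }
    l->slot = item;           /* no incref: ownership moved from the caller */
}

/* Walks through new, stolen and borrowed references; returns the final count. */
int toy_demo(void) {
    ToyObj storage;
    ToyList list = { NULL };

    ToyObj *item = toy_new(&storage);        /* new reference, count is 1 */
    toy_list_set(&list, item);               /* stolen: count stays 1, list owns it */

    ToyObj *borrowed = toy_list_get(&list);  /* borrowed: count stays 1 */
    toy_incref(borrowed);                    /* promote to an owned reference: 2 */
    toy_decref(borrowed);                    /* release our owned reference: 1 */

    return list.slot->refcnt;
}
```

In this model the only reference left at the end belongs to the list, which is why the caller never releases item itself after handing it over.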
Example: PyList_SetItem
PyList_SetItem is a classic example of a reference-stealing function. According to the documentation:
"Set the item at index index in list to item. Return 0 on success. If index is out of bounds, return -1 and set an IndexError exception. Note: This function 'steals' a reference to item and discards a reference to an item already in the list at the affected position."
However, the documentation does not state whether the reference is also stolen when the function fails. This is an all-or-nothing question: when you pass an object to a stealing function, you must know whether ownership is transferred unconditionally or only on success.
Looking at the CPython source code clarifies this behavior:
int
PyList_SetItem(PyObject *op, Py_ssize_t i,
               PyObject *newitem)
{
    if (!PyList_Check(op)) {
        Py_XDECREF(newitem);
        PyErr_BadInternalCall();
        return -1;
    }
    // ...
}
As the excerpt shows, PyList_SetItem calls Py_XDECREF(newitem) when the type check fails, so the reference is consumed even on failure. The elided out-of-range path releases newitem in the same way before setting IndexError. In other words, the function takes ownership of newitem regardless of whether it successfully inserts the item into the list.
This behavior has serious implications for correct API usage. Consider the following incorrect code:
if (PyList_SetItem(a, b, something) < 0) {
    Py_DECREF(something);  // DANGER: double decrement, possible use-after-free!
}
This code is defective. PyList_SetItem has already consumed the reference (releasing it via Py_XDECREF on failure), so the additional Py_DECREF(something) in the error-handling block is a double decrement. If the first decrement already freed the object, the second one touches freed memory (a use-after-free); otherwise the count ends up one too low and the object can be freed while something else still uses it. The result may be an assertion failure in debug builds, or memory corruption and crashes in release builds.
The correct pattern is simply:
if (PyList_SetItem(a, b, something) < 0) {
    // Do NOT call Py_DECREF on 'something' - the reference was already stolen
    return NULL;  // or handle error appropriately
}
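The difference between the two callers can be made concrete with a small model. This is a sketch under stated assumptions: ToyObj, ToyList, toy_decref and toy_list_set are hypothetical names that only mimic the shape of the PyList_SetItem excerpt, and the objects live on the stack so that a double release shows up as a negative count rather than as real memory corruption.

```c
#include <stddef.h>

/* Hypothetical stand-ins; not part of the CPython API. */
typedef struct { int refcnt; } ToyObj;
typedef struct { ToyObj *slot; int is_list; } ToyList;

static void toy_decref(ToyObj *o) { o->refcnt--; }

/* Mimics the excerpt: the item's reference is released on the failure path too. */
static int toy_list_set(ToyList *l, ToyObj *item) {
    if (!l->is_list) {
        toy_decref(item);      /* stolen even though the call fails */
        return -1;
    }
    if (l->slot != NULL) {
        toy_decref(l->slot);   /* discard the reference previously held */
    }
    l->slot = item;
    return 0;
}

/* Buggy caller: releases the item again after a failed set. */
int refcnt_after_buggy_caller(void) {
    ToyObj item = { 1 };
    ToyList not_a_list = { NULL, 0 };
    if (toy_list_set(&not_a_list, &item) < 0) {
        toy_decref(&item);     /* second release of a reference we no longer own */
    }
    return item.refcnt;
}

/* Correct caller: treats the reference as gone, whatever the return value. */
int refcnt_after_correct_caller(void) {
    ToyObj item = { 1 };
    ToyList not_a_list = { NULL, 0 };
    if (toy_list_set(&not_a_list, &item) < 0) {
        /* nothing to release: the reference was already stolen */
    }
    return item.refcnt;
}
```

In the model the buggy caller drives the count to -1, while the correct caller leaves it at 0. In a real interpreter the object would be deallocated when the count reaches 0, so the extra decrement would operate on freed memory.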
In summary, PyList_SetItem consumes the passed reference whether it succeeds or fails, so you must relinquish all ownership responsibility for that reference and never decrement it after the call, regardless of the return value. Do not assume that every stealing function behaves this way: PyModule_AddObject, for example, is documented to steal the reference only on success. Always consult the documentation or the source code to confirm whether a function steals unconditionally.
Source locations of the related code: arrow/python/pyarrow/src/arrow/python/numpy_convert.cc, lines 396 to 398 and lines 404 to 406 in 0600621.