Eliminate overhead of fetching and testing NULL attributes in STORE_ATTR specializations for new objects #144141

@markshannon

Consider:

class C:
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c
C(1,2,3)

This produces a trace looking something like this:

...
_CHECK_AND_ALLOCATE_OBJECT
_CREATE_INIT_FRAME
_PUSH_FRAME
# Some guards
_LOAD_FAST_BORROW_1
_LOAD_FAST_BORROW_0
# Some more guards
_STORE_ATTR_INSTANCE_VALUE
# Some more guards
_LOAD_FAST_BORROW_2
_LOAD_FAST_BORROW_0
# Some more guards
_STORE_ATTR_INSTANCE_VALUE
# Some more guards
_LOAD_FAST_BORROW_3
_LOAD_FAST_BORROW_0
# Some more guards
_STORE_ATTR_INSTANCE_VALUE
...

Each of those _STORE_ATTR_INSTANCE_VALUE uops reads the old value out of memory and then conditionally decrefs it.
But in this case the object has just been allocated, so we know that the old value is NULL and we can just overwrite it.
So we can replace this:

        PyObject **value_ptr = (PyObject**)(((char *)owner_o) + offset);
        PyObject *old_value = *value_ptr;
        FT_ATOMIC_STORE_PTR_RELEASE(*value_ptr, PyStackRef_AsPyObjectSteal(value));
        if (old_value == NULL) {
            PyDictValues *values = _PyObject_InlineValues(owner_o);
            Py_ssize_t index = value_ptr - values->values;
            _PyDictValues_AddToInsertionOrder(values, index);
        }
        Py_XDECREF(old_value);

with this:

        PyObject **value_ptr = (PyObject**)(((char *)owner_o) + offset);
        FT_ATOMIC_STORE_PTR_RELEASE(*value_ptr, PyStackRef_AsPyObjectSteal(value));
        PyDictValues *values = _PyObject_InlineValues(owner_o);
        Py_ssize_t index = value_ptr - values->values;
        _PyDictValues_AddToInsertionOrder(values, index);

On AArch64, this reduces the number of machine instructions from 48 to 26.
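To make the equivalence concrete, here is a minimal standalone sketch in C. It does not use CPython's real types: `ToyObj`, `ToyValues`, `store_general`, `store_fresh` and `fresh_matches_general` are hypothetical stand-ins for a refcounted object, the inline values array, and the two versions of the store. It only illustrates that when the slot is known to be NULL, the version without the load and the branch leaves the same state behind.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only, NOT CPython's real structures. */
typedef struct { long refcnt; } ToyObj;

enum { CAPACITY = 4 };

typedef struct {
    ToyObj *values[CAPACITY];   /* attribute slots, NULL when unset */
    int order[CAPACITY];        /* slot indices in insertion order */
    int size;                   /* number of entries used in order[] */
} ToyValues;

static void toy_xdecref(ToyObj *o) {
    if (o != NULL) {
        o->refcnt--;
    }
}

static void add_to_insertion_order(ToyValues *v, int index) {
    v->order[v->size++] = index;
}

/* General store: the slot may already hold a value. */
static void store_general(ToyValues *v, int index, ToyObj *value) {
    ToyObj *old = v->values[index];
    v->values[index] = value;   /* takes over the caller's reference */
    if (old == NULL) {
        add_to_insertion_order(v, index);
    }
    toy_xdecref(old);
}

/* Store into a freshly allocated object: the slot is known to be NULL,
 * so there is no old value to load, test, or decref. */
static void store_fresh(ToyValues *v, int index, ToyObj *value) {
    assert(v->values[index] == NULL);   /* precondition, debug builds only */
    v->values[index] = value;
    add_to_insertion_order(v, index);
}

/* Returns 1 if both stores leave identical state for a new object. */
static int fresh_matches_general(void) {
    ToyObj a = {1}, b = {1};
    ToyValues g = {0}, f = {0};

    store_general(&g, 2, &a);
    store_general(&g, 0, &b);
    store_fresh(&f, 2, &a);
    store_fresh(&f, 0, &b);

    if (g.size != f.size) return 0;
    for (int i = 0; i < g.size; i++) {
        if (g.order[i] != f.order[i]) return 0;
    }
    for (int i = 0; i < CAPACITY; i++) {
        if (g.values[i] != f.values[i]) return 0;
    }
    /* Neither path decrefs anything when the slots start out NULL. */
    return a.refcnt == 1 && b.refcnt == 1;
}
```

Calling `fresh_matches_general()` from a small `main` should return 1. The sketch says nothing about instruction counts; those depend on the real uop bodies and the compiler.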

The same reasoning also applies to _STORE_ATTR_SLOT, where it reduces the number of machine instructions from 32 to 14.
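The slot case can be sketched the same way. The body of _STORE_ATTR_SLOT is not quoted in this issue, so the following assumes it loads the old value, stores the new one, and decrefs the old one. `SlotVal`, `SlotOwner`, `slot_store_general`, `slot_store_fresh` and `slot_fresh_matches_general` are hypothetical names for a toy model, not CPython code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy refcounted value, NOT a real PyObject. */
typedef struct { long refcnt; } SlotVal;

/* Toy object with one __slots__-style member at a fixed offset. */
typedef struct {
    long header;
    SlotVal *x;
} SlotOwner;

/* General store (assumed shape): load old value, store, decref old. */
static void slot_store_general(SlotOwner *owner, size_t offset, SlotVal *value) {
    SlotVal **addr = (SlotVal **)((char *)owner + offset);
    SlotVal *old = *addr;
    *addr = value;
    if (old != NULL) {
        old->refcnt--;
    }
}

/* Store into a freshly allocated object: the slot is known to be NULL,
 * so the load and the conditional decref can be dropped. */
static void slot_store_fresh(SlotOwner *owner, size_t offset, SlotVal *value) {
    SlotVal **addr = (SlotVal **)((char *)owner + offset);
    assert(*addr == NULL);   /* precondition, debug builds only */
    *addr = value;
}

/* Returns 1 if both stores leave identical state for a new object. */
static int slot_fresh_matches_general(void) {
    SlotVal v = {1};
    SlotOwner g = {0, NULL}, f = {0, NULL};

    slot_store_general(&g, offsetof(SlotOwner, x), &v);
    slot_store_fresh(&f, offsetof(SlotOwner, x), &v);

    return g.x == &v && f.x == &v && v.refcnt == 1;
}
```

Here the fresh version reduces to a single store, since there is no insertion order to maintain for slots in this toy model.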

See also #134584

We can probably remove some of those guards as well, but that's a separate issue.

Labels: interpreter-core (Objects, Python, Grammar, and Parser dirs), performance (Performance or resource usage)