Skip to content

Module-level mutable globals in CachedNodeNorm / CachedNameRes are not thread-safe #89

@gaurav

Description

@gaurav

Problem

services/nodenorm.py and services/nameres.py each hold a module-level dict for the from_url() factory:

cached_node_norms_by_url = {}   # nodenorm.py
cached_nameres_by_url = {}      # nameres.py

Under the current xdist model (per-process workers) this is safe. But if tests ever run in a threaded context, concurrent from_url() calls could race on these dicts. The right fix is to make the dict a class variable or use a lock.

Suggested Fix

class CachedNodeNorm:
    _instances: dict[str, "CachedNodeNorm"] = {}

    @classmethod
    def from_url(cls, nodenorm_url: str) -> "CachedNodeNorm":
        if nodenorm_url not in cls._instances:
            cls._instances[nodenorm_url] = cls(nodenorm_url)
        return cls._instances[nodenorm_url]

Why it was deferred

Identified during PR #67 review. No current threading concern; deferred to avoid scope creep.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions