Skip to content

fix(homeassistant): bootstrap state via REST instead of WS get_states#383

Merged
tomquist merged 2 commits into
developfrom
claude/fix-state-fetch-rest-no-changes
May 27, 2026
Merged

fix(homeassistant): bootstrap state via REST instead of WS get_states#383
tomquist merged 2 commits into
developfrom
claude/fix-state-fetch-rest-no-changes

Conversation

@tomquist
Copy link
Copy Markdown
Owner

@tomquist tomquist commented May 21, 2026

The previous get_states bootstrap used the WebSocket get_states command, which returns every entity in the Home Assistant state machine — wasteful for setups with many entities when we only care about one or a handful.

Swap to per-entity REST GET /api/states/<entity_id> requests, scheduled as a background task right after subscribe_entities. Cancel any in-flight bootstrap on disconnect or stop so a late response can't resurrect a stale value after _reset_for_reconnect clears the cache.

Summary by CodeRabbit

  • Bug Fixes
    • Resolved startup timeout when a configured Home Assistant sensor kept its initial value.
    • Initial sensor values are now reliably loaded at startup even if no state change occurs.
    • Background fetches are canceled on disconnect/reconnect to avoid applying stale values.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 21, 2026

Warning

Review limit reached

@tomquist, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 19 minutes and 11 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 328082fa-1a4e-46f8-96e7-a50acd1b215f

📥 Commits

Reviewing files that changed from the base of the PR and between 5bf5713 and ca08289.

📒 Files selected for processing (1)
  • src/astrameter/powermeter/homeassistant_test.py

Walkthrough

The pull request migrates Home Assistant powermeter initialization from a WebSocket get_states request to a per-entity REST bootstrap scheduled after auth_ok, with task cancellation on stop/reconnect and updated tests and changelog.

Changes

REST-based initial state bootstrap for Home Assistant powermeter

Layer / File(s) Summary
Task infrastructure and URL builder
src/astrameter/powermeter/homeassistant.py
Add _fetch_states_task field to track the REST bootstrap coroutine and introduce _build_state_url() to construct authenticated per-entity state endpoints with scheme/path-prefix support.
Initial state fetch and auth-ok integration
src/astrameter/powermeter/homeassistant.py, src/astrameter/powermeter/homeassistant_test.py
Implement _fetch_initial_states() coroutine to sequentially fetch and cache tracked entity states via REST, triggered by auth_ok; remove WebSocket get_states request/id handling; update tests to assert only subscribe_entities is sent on auth.
Task lifecycle management
src/astrameter/powermeter/homeassistant.py
Ensure _fetch_states_task is cancelled and awaited in stop() and _reset_for_reconnect() to prevent stale values from being applied after disconnect/reconnect.
REST bootstrap test suite
src/astrameter/powermeter/homeassistant_test.py
Replace prior WebSocket get_states tests with comprehensive REST bootstrap coverage using a mocked ClientSession.get: validate bearer token, URL construction (HTTPS/path_prefix), per-entity fetching, skipping already-populated entities, 404/error handling, task scheduling after auth_ok, and cancellation on reconnect.
Changelog documentation
CHANGELOG.md
Update the timeout fix entry to reflect the new per-entity REST bootstrap behavior and remove mention of WebSocket get_states seeding.

Sequence Diagram

sequenceDiagram
  participant WebSocket
  participant HomeAssistant
  participant REST
  participant Cache
  WebSocket->>HomeAssistant: auth_ok
  HomeAssistant->>HomeAssistant: cancel prior _fetch_states_task
  HomeAssistant->>HomeAssistant: schedule _fetch_initial_states task
  loop per tracked entity
    HomeAssistant->>REST: GET /api/states/{entity_id} (Bearer)
    REST-->>HomeAssistant: { state: "...", ... }
    HomeAssistant->>Cache: update cached entity state
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • tomquist/AstraMeter#382: Prior change that introduced WebSocket get_states-based bootstrap; this PR replaces that mechanism with per-entity REST bootstrapping.
  • tomquist/AstraMeter#364: Related work using REST GET /api/states/{entity_id} for populating Home Assistant entity state, overlapping in approach.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: replacing WebSocket get_states bootstrap with per-entity REST requests, which is the core technical modification across all three files.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch claude/fix-state-fetch-rest-no-changes

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/astrameter/powermeter/homeassistant_test.py (1)

581-599: ⚡ Quick win

Cover the post-await stale-write race in this test.

This only proves cancel() is called on a hanging task. It never drives the real _fetch_initial_states() path where cancellation can arrive after the last await and still let a stale REST payload write into _entity_values. Please add a regression that lets the GET complete, calls _reset_for_reconnect(), and asserts the old bootstrap cannot reseed the cache.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/astrameter/powermeter/homeassistant_test.py` around lines 581 - 599, The
test only verifies cancel() was invoked but doesn't cover the race where
`_fetch_initial_states()` completes its network GET after
`_reset_for_reconnect()` and then writes stale data into `_entity_values`; add a
new regression within `test_reconnect_cancels_in_flight_fetch` (or a companion
test) that launches the real `_fetch_initial_states()`-like coroutine as
`pm._fetch_states_task`, arrange it to perform its final write after a short
delay (e.g., await an Event or sleep) so you can call
`pm._reset_for_reconnect()` before that write completes, then let the hanging
GET finish and assert the old bootstrap did NOT reseed `pm._entity_values`
(verify values remain whatever the reset set or remain empty). Use the actual
method name `_fetch_initial_states` / task slot `_fetch_states_task` and check
`_entity_values` to prove the stale payload cannot overwrite the cache.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/astrameter/powermeter/homeassistant.py`:
- Around line 159-163: The REST bootstrap coroutine cancelled via
_fetch_states_task can still run to completion and call _update_entity_value, so
add a generation token (e.g., self._conn_gen or self._bootstrap_gen) that is
incremented in _reset_for_reconnect() and captured at the start of the bootstrap
coroutine; before applying each response update call (the place that currently
calls _update_entity_value and writes into _entity_values) re-check that the
captured generation still equals the current token and that the entity is still
unset/fresh (e.g., not already present in self._entity_values or newer than the
bootstrap) and abort the write if not; apply the same pattern for other
bootstrap REST sections that call _update_entity_value (the other places noted)
to avoid stale overwrites and races with subscribe_entities updates.

---

Nitpick comments:
In `@src/astrameter/powermeter/homeassistant_test.py`:
- Around line 581-599: The test only verifies cancel() was invoked but doesn't
cover the race where `_fetch_initial_states()` completes its network GET after
`_reset_for_reconnect()` and then writes stale data into `_entity_values`; add a
new regression within `test_reconnect_cancels_in_flight_fetch` (or a companion
test) that launches the real `_fetch_initial_states()`-like coroutine as
`pm._fetch_states_task`, arrange it to perform its final write after a short
delay (e.g., await an Event or sleep) so you can call
`pm._reset_for_reconnect()` before that write completes, then let the hanging
GET finish and assert the old bootstrap did NOT reseed `pm._entity_values`
(verify values remain whatever the reset set or remain empty). Use the actual
method name `_fetch_initial_states` / task slot `_fetch_states_task` and check
`_entity_values` to prove the stale payload cannot overwrite the cache.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 636270bd-a491-4d9c-8956-658898bf31ea

📥 Commits

Reviewing files that changed from the base of the PR and between a17e34b and 812156b.

📒 Files selected for processing (3)
  • CHANGELOG.md
  • src/astrameter/powermeter/homeassistant.py
  • src/astrameter/powermeter/homeassistant_test.py

Comment on lines +159 to +163
if self._fetch_states_task and not self._fetch_states_task.done():
# An in-flight REST bootstrap from the previous connection
# could otherwise race in after the reset and resurrect a
# stale value.
self._fetch_states_task.cancel()
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot May 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard REST bootstrap writes with a freshness check.

Cancelling the task is not enough here. If cancellation lands after await resp.json() returns, the coroutine will still execute _update_entity_value(...) once before hitting another await, so an old bootstrap can repopulate _entity_values after _reset_for_reconnect(). The same unconditional write can also overwrite a fresher subscribe_entities update that arrived while the GET was in flight.

Use a bootstrap/connection generation token and re-check that the entity is still unset before applying the REST response.

Proposed fix
 class HomeAssistant(Powermeter):
     def __init__(...):
         ...
         self._fetch_states_task: asyncio.Task[None] | None = None
+        self._bootstrap_generation = 0
         self._entities_ready = asyncio.Event()
         self._message_event = asyncio.Event()

     def _reset_for_reconnect(self) -> None:
         ...
+        self._bootstrap_generation += 1
         if self._fetch_states_task and not self._fetch_states_task.done():
             self._fetch_states_task.cancel()
         ...

     async def _handle_message(self, ws: aiohttp.ClientWebSocketResponse, raw: str) -> None:
         ...
         elif msg_type == "auth_ok":
             ...
+            generation = self._bootstrap_generation
             if self._fetch_states_task and not self._fetch_states_task.done():
                 self._fetch_states_task.cancel()
-            self._fetch_states_task = asyncio.create_task(self._fetch_initial_states())
+            self._fetch_states_task = asyncio.create_task(
+                self._fetch_initial_states(generation)
+            )

-    async def _fetch_initial_states(self) -> None:
+    async def _fetch_initial_states(self, generation: int) -> None:
         if not self._session:
             return
         headers = {"Authorization": f"Bearer {self.access_token}"}
         for eid in sorted(self._tracked_entities):
             if self._entity_values.get(eid) is not None:
                 continue
             ...
-            if isinstance(data, dict):
+            if (
+                generation == self._bootstrap_generation
+                and self._entity_values.get(eid) is None
+                and isinstance(data, dict)
+            ):
                 self._update_entity_value(eid, data.get("state"))

Also applies to: 237-239, 256-278

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/astrameter/powermeter/homeassistant.py` around lines 159 - 163, The REST
bootstrap coroutine cancelled via _fetch_states_task can still run to completion
and call _update_entity_value, so add a generation token (e.g., self._conn_gen
or self._bootstrap_gen) that is incremented in _reset_for_reconnect() and
captured at the start of the bootstrap coroutine; before applying each response
update call (the place that currently calls _update_entity_value and writes into
_entity_values) re-check that the captured generation still equals the current
token and that the entity is still unset/fresh (e.g., not already present in
self._entity_values or newer than the bootstrap) and abort the write if not;
apply the same pattern for other bootstrap REST sections that call
_update_entity_value (the other places noted) to avoid stale overwrites and
races with subscribe_entities updates.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The race can't happen under asyncio's single-threaded model. _reset_for_reconnect() runs only from _ws_loop (a separate task), so when it calls _fetch_states_task.cancel(), the bootstrap task is necessarily suspended at an await (session.get / resp.json / async with exit). The synchronous _update_entity_value(...) write sits after those awaits with no intervening await, and cancel() sets _must_cancel so the next task step raises CancelledError at the suspension point even if the awaited result is ready — the write is never reached. If the task instead finished before reset, the reset's clear loop nulls the values anyway. A token would be dead code.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomquist, you're right — I retract the concern. The asyncio single-threaded model does provide the guarantee here:

  1. Cooperative scheduling — when _reset_for_reconnect() is executing inside _ws_loop, _fetch_initial_states is necessarily suspended at one of its await points (session.get, resp.json(), or the async with __aexit__), so no "between awaits" execution is possible.
  2. _must_cancel semanticsTask.cancel() sets the flag, and the injected CancelledError fires at the very next resumption point (the async with exit at minimum), which is before the synchronous _update_entity_value write.
  3. except asyncio.CancelledError: raise — the cancellation propagates cleanly rather than being swallowed.
  4. Completed-task path — if the task already finished before the reset, the for eid … self._entity_values[eid] = None loop in _reset_for_reconnect covers that case.

A generation token would genuinely be dead code. The existing implementation is correct as-is.

tomquist pushed a commit that referenced this pull request May 21, 2026
…t of #383)

WebSocket `get_states` returns every entity in the Home Assistant
state machine — wasteful for setups with many entities when we only
need one or a handful. Per-entity REST GETs against
`/api/states/<entity_id>` are far cheaper.

Direct port of Python PR #383:

  * Drop the `State::get_states_id` field and the WS `get_states`
    send + result handling added in #382.
  * Add `http: Arc<dyn HttpClient>` and
    `bootstrap_task: Arc<Mutex<Option<JoinHandle<()>>>>` to
    `HomeAssistant`. The bootstrap task is what we abort on
    reconnect / stop so a late response can't resurrect a stale
    value after `_reset_for_reconnect` clears the cache.
  * On `auth_ok`, after sending `subscribe_entities`, spawn the
    bootstrap task: it iterates the tracked entity set, GETs
    `/api/states/<entity_id>` (with the same bearer token used by
    the WS handshake), and feeds the response `state` field through
    the existing `update_value` helper.
  * Per-entity errors (404, network failure, decode) are logged at
    debug and swallowed — a single missing entity must not break
    the whole bootstrap; the WS stream may still deliver a value
    later.
  * The bootstrap loop skips entities whose `entity_values` slot is
    already populated, so it doesn't waste requests (or clobber a
    fresher WS-snapshot value).

Aborts the bootstrap task on `run_loop` reconnect-reset and on
`stop()`. Matches Python's `_fetch_states_task` cancellation
semantics.
The previous get_states bootstrap used the WebSocket get_states command,
which returns every entity in the Home Assistant state machine — wasteful
for setups with many entities when we only care about one or a handful.

Swap to per-entity REST GET /api/states/<entity_id> requests, scheduled
as a background task right after subscribe_entities. Cancel any in-flight
bootstrap on disconnect or stop so a late response can't resurrect a
stale value after _reset_for_reconnect clears the cache.
@tomquist tomquist force-pushed the claude/fix-state-fetch-rest-no-changes branch from 812156b to 5bf5713 Compare May 26, 2026 22:25
Drive the real _fetch_initial_states with a response that resolves only
after _reset_for_reconnect cancels the task, proving the post-await write
never lands on the cleared cache.
@tomquist tomquist merged commit db4b6f7 into develop May 27, 2026
25 checks passed
@tomquist tomquist deleted the claude/fix-state-fetch-rest-no-changes branch May 27, 2026 07:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants