Skip to content

Comments

LoadML - change time decay to extend up to 30 days#3401

Merged
springfall2008 merged 10 commits intomainfrom
fixes22
Feb 20, 2026
Merged

LoadML - change time decay to extend up to 30 days#3401
springfall2008 merged 10 commits intomainfrom
fixes22

Conversation

@springfall2008
Copy link
Owner

HAHistory - Add locking around history modifications

HAHistory - Add locking around history modifications
Copilot AI review requested due to automatic review settings February 20, 2026 09:09
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR makes two related changes: (1) extends the ML load forecaster's time decay window from 7 to 30 days for better long-term pattern learning, and (2) adds thread-safety to the HAHistory component by introducing locking around shared state modifications.

Changes:

  • Increased ML time decay parameter from 7 to 30 days with dynamic adjustment based on available data age
  • Added threading.Lock to HAHistory to protect concurrent access to history_data and history_entities dictionaries
  • Added comprehensive test coverage for the scenario where history cache is expanded from 7 to 30 days

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
apps/predbat/ha.py Adds history_lock (threading.Lock) initialization and wraps history_data modifications in prune_history() and update_entity() with lock to prevent race conditions
apps/predbat/load_ml_component.py Changes ml_time_decay_days from 7 to 30 days and adds dynamic calculation using max(ml_time_decay_days, load_data_age_days) to ensure decay extends to cover available data
apps/predbat/tests/test_hahistory.py Adds test_hahistory_expand_from_7_to_30_days() to verify cache correctly expands when requesting more days than currently cached
Comments suppressed due to low confidence (1)

apps/predbat/ha.py:76

  • The add_entity method modifies self.history_entities without holding the history_lock, creating a race condition. This method is called from get_history (line 94) outside of the lock, and history_entities is also accessed from multiple other methods. The lock should protect all accesses to history_entities to prevent race conditions when multiple threads call get_history concurrently or when the run method accesses history_entities while get_history is updating it.
    def add_entity(self, entity_id, days):
        """
        Add an entity to the history list
        """
        if entity_id in self.history_entities:
            if days > self.history_entities[entity_id]:
                self.history_entities[entity_id] = days
        else:
            self.history_entities[entity_id] = days

Comment on lines +84 to +88
with self.history_lock:
if self.history_data.get(entity_id, None) and self.history_entities.get(entity_id, 0) >= days:
result = [self.history_data[entity_id]]

if result is None:
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lock is released before calling add_entity and accessing history_entities (lines 94-95 outside the diff). This creates a potential race condition where another thread could modify history_entities between the lock release and these accesses. Consider expanding the lock scope to include the entire cache-check-and-fetch logic, or add locking to the add_entity method to protect all accesses to history_entities.

Copilot uses AI. Check for mistakes.
is_initial=is_initial,
epochs=epochs,
time_decay_days=self.ml_time_decay_days,
time_decay_days=min(self.ml_time_decay_days, self.load_data_age_days),
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of max() here may not match the intended behavior. If load_data_age_days exceeds ml_time_decay_days (e.g., 60 days of data vs 30-day decay setting), this will use 60 as the decay timescale, giving the oldest samples (at day 60) a weight of exp(-60/60) ≈ 0.37. If the intention is to decay over 30 days regardless of data age, samples at day 60 should have weight exp(-60/30) ≈ 0.14, requiring time_decay_days to be capped at 30, not increased to 60. Consider whether min(ml_time_decay_days, load_data_age_days) would better match the intended exponential decay behavior, or clarify if the current max() logic is intentional to preserve weight for very old data.

Copilot uses AI. Check for mistakes.
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

apps/predbat/ha.py:76

  • The add_entity() method modifies self.history_entities without acquiring self.history_lock. This creates a race condition because add_entity() is called from get_history() at line 94 (outside the lock), while history_entities is also accessed inside the lock at line 85 and in prune_history() at line 111. This can lead to inconsistent state or data corruption in multi-threaded scenarios.

To fix this, add_entity() should also use the lock when modifying self.history_entities.

    def add_entity(self, entity_id, days):
        """
        Add an entity to the history list
        """
        if entity_id in self.history_entities:
            if days > self.history_entities[entity_id]:
                self.history_entities[entity_id] = days
        else:
            self.history_entities[entity_id] = days

@@ -111,16 +111,14 @@ To prevent drift in long-range predictions, the model blends autoregressive pred
- **Early Stopping**: Training halts if validation error doesn't improve for 5 consecutive epochs
- **Time-Weighted Samples**: Recent data has higher importance (7-day exponential decay constant)
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line still references "7-day exponential decay constant" but the code has been changed to use 30 days (ml_time_decay_days = 30 in load_ml_component.py line 75). The documentation should be updated to reflect the new time decay period or made more generic to match the changes made elsewhere in this file.

Copilot uses AI. Check for mistakes.
@@ -618,18 +620,3 @@ def _publish_entity(self):
},
app="load_ml",
)
Copy link

Copilot AI Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The removed is_alive() method had additional health checking logic that verified the component had been updated within the last 10 minutes. The base class implementation (ComponentBase.is_alive()) only checks if self.api_started is True, which is less strict and won't detect if the component becomes stuck or stops updating.

If the enhanced health checking was intentional and important for monitoring the LoadML component's health, this removal could reduce visibility into component health issues. Consider whether the base class implementation provides sufficient health monitoring or if the 10-minute timeout check should be retained.

Copilot uses AI. Check for mistakes.
@springfall2008 springfall2008 merged commit a4f755f into main Feb 20, 2026
1 check passed
@springfall2008 springfall2008 deleted the fixes22 branch February 20, 2026 20:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant