3 changes: 1 addition & 2 deletions config/batcontrol_config_dummy.yaml
@@ -104,9 +104,8 @@ inverter:
# Only affects mode 8; ignored in other modes.
# fronius_inverter_id: '1' # Optional: ID of the inverter in Fronius API (default: '1')
# fronius_controller_id: '0' # Optional: ID of the controller in Fronius API (default: '0')
enable_resilient_wrapper: false # Enable resilient wrapper for graceful outage handling (default: false)
enable_resilient_wrapper: false # Skip a control cycle on transient inverter outages instead of terminating (default: false)
outage_tolerance_minutes: 24 # Minutes to tolerate inverter outages before terminating (default: 24)
retry_backoff_seconds: 60 # Seconds to wait before retrying after a failure (default: 60)

#--------------------------
# Dynamic Tariff Provider
16 changes: 6 additions & 10 deletions docs/configuration/inverter-configuration.md
@@ -22,29 +22,25 @@ Default:
These options enable graceful handling of temporary inverter outages (e.g., during firmware upgrades or network interruptions).

### enable_resilient_wrapper
Enable or disable the resilient wrapper for graceful outage handling. When enabled, temporary inverter failures are handled gracefully by caching values and applying retry backoff. This helps batcontrol survive brief connection losses without terminating.
Enable or disable the resilient wrapper for graceful outage handling. When enabled, a temporary inverter failure makes batcontrol skip the current control cycle and retry on the next scheduled run, instead of terminating. No decisions are made on stale data - the inverter is simply read again next cycle. This helps batcontrol survive brief connection losses (e.g. firmware upgrades) without exiting.

Errors before the first successful control command still fail fast, so configuration mistakes are caught at startup.

Why not just let batcontrol crash and rely on the container restart policy? Without the wrapper, every inverter outage terminates the process, and `restart: unless-stopped` brings it straight back up. During a multi-minute outage this turns into a tight restart loop, and **each restart re-fetches the price and solar forecasts from their providers**. Repeated cold starts can therefore run into provider rate limits (e.g. Awattar/Tibber for prices, Forecast.Solar/SolarPrognose for solar), which can leave batcontrol without fresh data even after the inverter recovers. Keeping the process alive and skipping cycles avoids hammering both the inverter and the data providers.

Default:
```
enable_resilient_wrapper: false
```
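
A minimal sketch of the skip-the-cycle behaviour described above, assuming a heavily simplified stand-in for the control loop. `FlakyInverter`, `control_cycle`, and `run_cycles` are hypothetical names used only for illustration; `InverterCommunicationError` is redefined locally so the snippet is self-contained and is not imported from batcontrol.

```python
class InverterCommunicationError(Exception):
    """Local stand-in for batcontrol's exception of the same name."""


class FlakyInverter:
    """Hypothetical inverter that fails on the cycles listed in down_cycles."""

    def __init__(self, down_cycles):
        self.down_cycles = set(down_cycles)
        self.cycle = 0

    def read_soc(self):
        self.cycle += 1
        if self.cycle in self.down_cycles:
            raise InverterCommunicationError(f"no response in cycle {self.cycle}")
        return 55.0  # fixed state of charge in percent, purely illustrative


def control_cycle(inverter):
    """Run one cycle. Returns True if it completed, False if it was skipped."""
    try:
        soc = inverter.read_soc()
    except InverterCommunicationError:
        # Skip the cycle: no decision is made without a fresh reading.
        return False
    _ = soc  # a real cycle would compute and send a control command here
    return True


def run_cycles(inverter, count):
    """Run count cycles and report which ones completed."""
    return [control_cycle(inverter) for _ in range(count)]


# The inverter is unreachable during cycles 2 and 3, then recovers.
results = run_cycles(FlakyInverter(down_cycles=[2, 3]), 4)
print(results)
```

The process stays alive through the two failed cycles and resumes normal control on the fourth, without restarting and without re-fetching forecasts.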

### outage_tolerance_minutes
The maximum duration (in minutes) to tolerate inverter outages before terminating. This allows batcontrol to survive firmware upgrades or network issues up to the specified time window. After this timeout, batcontrol will give up and exit with an error.
The maximum duration (in minutes) to tolerate inverter outages before terminating. While the inverter is unreachable, each control cycle is skipped. If communication is not restored within this window, batcontrol gives up and exits with an error.

Default:
```
outage_tolerance_minutes: 24 # 24 minutes
```
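
One way to picture the tolerance window is as a clock that starts at the first failed call and resets on the next success. The sketch below illustrates that general technique under stated assumptions; it is not the actual `ResilientInverterWrapper` code. `OutageClock`, `classify`, and the strict `>` comparison at the boundary are all hypothetical, and timestamps are passed in explicitly so the example runs without waiting.

```python
class InverterCommunicationError(Exception):
    """Local stand-in: transient failure, skip this cycle."""


class InverterOutageError(Exception):
    """Local stand-in: outage exceeded the tolerance window, terminate."""


class OutageClock:
    """Hypothetical helper tracking how long the inverter has been unreachable."""

    def __init__(self, tolerance_minutes=24):
        self.tolerance_seconds = tolerance_minutes * 60
        self.first_failure = None  # timestamp of the first failure of this outage

    def record_success(self):
        # Any successful call ends the outage and resets the clock.
        self.first_failure = None

    def record_failure(self, now):
        if self.first_failure is None:
            self.first_failure = now
        elapsed = now - self.first_failure
        if elapsed > self.tolerance_seconds:  # boundary handling is an assumption
            raise InverterOutageError(f"unreachable for {elapsed:.0f}s")
        raise InverterCommunicationError(f"unreachable for {elapsed:.0f}s")


def classify(clock, now):
    """Map a failed call at time now (seconds) to the resulting action."""
    try:
        clock.record_failure(now)
    except InverterCommunicationError:
        return "skip"
    except InverterOutageError:
        return "terminate"


clock = OutageClock(tolerance_minutes=24)
# Failures at 0 s, 10 min, 24 min, and 25 min after the outage began.
outcomes = [classify(clock, t) for t in (0, 600, 1440, 1500)]
print(outcomes)
```

The sketch leaves out the fail-fast rule for errors before the first successful control command; it only models the window itself.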

### retry_backoff_seconds
The time to wait (in seconds) before retrying after an inverter failure. This prevents hammering an unavailable inverter during the outage period and allows time for recovery.

Default:
```
retry_backoff_seconds: 60 # 60 seconds
```

## mqtt
This enables the MQTT inverter driver, which allows integration with any battery/inverter system via MQTT topics. This is a generic bridge that works with any external system that can publish battery status and receive control commands over MQTT.

44 changes: 44 additions & 0 deletions src/batcontrol/core.py
@@ -15,6 +15,7 @@
import os
import logging
import platform
import functools

import dataclasses
from typing import Optional
@@ -33,6 +34,7 @@

from .dynamictariff import DynamicTariff as tariff_factory
from .inverter import Inverter as inverter_factory
from .inverter import InverterError, InverterCommunicationError
from .forecastsolar import ForecastSolar as solar_factory

from .forecastconsumption import Consumption as consumption_factory
@@ -58,6 +60,28 @@
logger = logging.getLogger(__name__)


def _tolerate_inverter_outage(func):
    """Swallow inverter outages for externally triggered (API/evcc) actions.

    These run on background threads in response to MQTT/evcc events. If the
    inverter is briefly unavailable the request is dropped and logged; the
    next scheduled run() reconciles the inverter state. Background calls
    advance the shared outage clock in the resilient wrapper; termination on
    a permanent outage is only triggered from the main run() loop.
    """
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        try:
            return func(self, *args, **kwargs)
        except InverterError as e:
            logger.warning(
                "Inverter unavailable during '%s', ignoring external request: %s",
                func.__name__, e,
            )
            return None
    return wrapper


def _parse_optional_ratio(value, config_key: str) -> Optional[float]:
"""Parse an optional 0..1 ratio config value."""
if value is None:
@@ -475,6 +499,22 @@ def handle_forecast_error(self):
self.allow_discharging()

    def run(self):
        """One control cycle. Aborts cleanly on a transient inverter outage.

        Communication failures are turned into InverterCommunicationError by
        the resilient wrapper. We skip the cycle and let the scheduler retry on
        the next run - no decision is made on stale data. A permanent outage
        surfaces as InverterOutageError, which propagates to terminate.
        """
        try:
            self._run_once()
        except InverterCommunicationError as e:
            logger.warning(
                "Inverter unreachable this cycle (%s). "
                "Skipping control cycle, will retry on next run.", e
            )

    def _run_once(self):
        """ Main calculation & control loop """
        logger.debug('Timeslots are in %d-minute intervals', self.time_resolution)

@@ -940,6 +980,7 @@ def get_max_charging_from_grid_limit(self) -> float:
""" Get the max charging from grid limit for battery control """
return self.max_charging_from_grid_limit

@_tolerate_inverter_outage
def set_discharge_blocked(self, discharge_blocked) -> None:
""" Avoid discharging if an external block is received,
but take care of the always_allow_discharge_limit.
@@ -1007,6 +1048,7 @@ def refresh_static_values(self) -> None:
# Trigger Inverter
self.inverter.refresh_api_values()

@_tolerate_inverter_outage
def api_set_mode(self, mode: int):
""" Log and change config run mode of inverter(s) from external call """
# Check if mode is valid
@@ -1043,6 +1085,7 @@ def api_set_mode(self, mode: int):
else:
self.__set_control_source(CONTROL_SOURCE_API)

@_tolerate_inverter_outage
def api_set_charge_rate(self, charge_rate: int):
""" Log and change config charge_rate and activate charging."""
if charge_rate < 0:
@@ -1060,6 +1103,7 @@ def api_set_charge_rate(self, charge_rate: int):
else:
self.__set_control_source(CONTROL_SOURCE_API)

@_tolerate_inverter_outage
def api_set_limit_battery_charge_rate(self, limit: int):
""" Set dynamic battery charge rate limit from external call

6 changes: 5 additions & 1 deletion src/batcontrol/inverter/__init__.py
@@ -1,3 +1,7 @@
from .inverter import Inverter
from .exceptions import InverterOutageError
from .exceptions import (
    InverterError,
    InverterCommunicationError,
    InverterOutageError,
)
from .resilient_wrapper import ResilientInverterWrapper
34 changes: 23 additions & 11 deletions src/batcontrol/inverter/exceptions.py
@@ -1,22 +1,34 @@
"""
Custom exceptions for inverter module.
Custom exceptions for the inverter module.

These exceptions provide specific error handling for inverter-related failures,
particularly for handling temporary outages gracefully.
These let batcontrol distinguish three situations:

1. Configuration errors -> fail immediately on the first run.
2. Transient communication loss -> skip the current control cycle and retry
on the next scheduled run (InverterCommunicationError).
3. Permanent outage -> terminate after the tolerance window
(InverterOutageError).
"""


class InverterOutageError(Exception):
class InverterError(Exception):
"""Base class for inverter communication problems."""


class InverterCommunicationError(InverterError):
"""
Exception raised when an inverter has been unreachable for too long.
Raised when an inverter call fails after initialization.

This exception is raised after the configured outage tolerance period
(default: 24 minutes) has elapsed without successful communication
with the inverter. This allows batcontrol to distinguish between:
Signals the caller to abort the current control cycle. batcontrol retries
on the next scheduled run, so no decision is ever made on stale data.
"""


class InverterOutageError(InverterError):
"""
Raised when the inverter stays unreachable beyond the tolerance window.

1. Configuration errors (fail immediately on first run)
2. Transient outages (tolerate for up to 24 minutes using cached values)
3. Permanent outages (terminate after 24 minutes)
Terminates batcontrol - the inverter is considered permanently down.

Attributes:
message: Explanation of the error
13 changes: 1 addition & 12 deletions src/batcontrol/inverter/inverter.py
@@ -10,7 +10,6 @@
from .resilient_wrapper import (
ResilientInverterWrapper,
DEFAULT_OUTAGE_TOLERANCE_SECONDS,
DEFAULT_RETRY_BACKOFF_SECONDS
)

logger = logging.getLogger(__name__)
@@ -121,24 +120,14 @@ def create_inverter(config: dict) -> InverterInterface:
DEFAULT_OUTAGE_TOLERANCE_SECONDS / 60
) * 60 # Convert to seconds

# Get retry backoff from config (default: 60 seconds)
retry_backoff = config.get(
'retry_backoff_seconds',
DEFAULT_RETRY_BACKOFF_SECONDS
)

logger.info(
'Wrapping inverter with resilient wrapper '
'(outage tolerance: %.1f min, retry backoff: %.0fs)',
'(outage tolerance: %.1f min)',
outage_tolerance / 60,
retry_backoff
)
resilient_inverter = ResilientInverterWrapper(
inverter,
outage_tolerance_seconds=outage_tolerance,
retry_backoff_seconds=retry_backoff
)
# Preserve inverter_num on wrapper
resilient_inverter.inverter_num = inverter.inverter_num

return resilient_inverter