Commit 97af11b

Migrate Reporter to Status Dashboard API V2 for Incident Creation (#26)
This PR migrates the reporter from Status Dashboard API V1 to V2 for sending incidents. The migration introduces component ID resolution via a cached lookup system and updates the incident data structure to match the V2 API contract.

Changes

Core Migration
- Replace /v1/component_status endpoint with /v2/incidents endpoint for incident creation
- Implement new V2 incident data structure with fields: title, description, impact, components, start_date, system, and type
- Use static title ("System incident from monitoring system") and description ("System-wide incident affecting one or multiple components. Created automatically.") to avoid exposing sensitive operational data on the public Status Dashboard

Component Cache System
- Add component cache that fetches from /v2/components at startup
- Map (component name, attributes) to component ID with subset attribute matching
- Auto-refresh cache when a component is not found
- Retry initial cache load up to 3 times with 60-second delays

Configuration Updates
- Increase HTTP timeout from 2s to 5s to accommodate V2 API response times
- No changes required to existing configuration file format or authorization mechanism (HMAC-signed JWT)

Logging Enhancements
- Log comprehensive diagnostic details locally: timestamp, service name, environment, component details, impact value, and triggered metrics with values
- Diagnostic details are intentionally excluded from API requests for security

Behavioral Notes
- Creates new incident request for every detection; relies on Status Dashboard's duplicate handling
- Continues monitoring other services if incident creation fails for one service
- Fails to start only if initial component cache load fails after all retries

Testing
- Verify incidents are created via V2 API with correct component IDs
- Verify component cache refreshes correctly when components are added
- Verify existing HMAC authorization works with V2 endpoints

Reviewed-by: Ilia Bakhterev
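
The component cache lookup described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the `Component` and `ComponentCache` types, the `resolve_or_refresh` method, and the `attrs` helper are hypothetical names, and the sketch assumes that "subset attribute matching" means the attributes requested by the reporter must be a subset of the attributes stored on the dashboard component. The HTTP call to `/v2/components` is replaced by a closure so the example stays self-contained.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified view of one entry returned by `/v2/components`.
#[derive(Debug, Clone)]
struct Component {
    id: u32,
    name: String,
    attributes: BTreeMap<String, String>,
}

/// In-memory cache of dashboard components (assumed shape).
struct ComponentCache {
    components: Vec<Component>,
}

impl ComponentCache {
    fn new(components: Vec<Component>) -> Self {
        Self { components }
    }

    /// Subset match: the name must be equal and every wanted attribute must be
    /// present on the cached component with the same value. Extra attributes
    /// on the cached component are ignored.
    fn resolve(&self, name: &str, wanted: &BTreeMap<String, String>) -> Option<u32> {
        self.components
            .iter()
            .find(|c| {
                c.name == name
                    && wanted.iter().all(|(k, v)| c.attributes.get(k) == Some(v))
            })
            .map(|c| c.id)
    }

    /// On a miss, reload the cache once through `fetch` and look again.
    fn resolve_or_refresh<F>(
        &mut self,
        name: &str,
        wanted: &BTreeMap<String, String>,
        fetch: F,
    ) -> Option<u32>
    where
        F: FnOnce() -> Vec<Component>,
    {
        if let Some(id) = self.resolve(name, wanted) {
            return Some(id);
        }
        self.components = fetch();
        self.resolve(name, wanted)
    }
}

/// Helper for building attribute maps from string pairs.
fn attrs(pairs: &[(&str, &str)]) -> BTreeMap<String, String> {
    pairs
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}
```

With a cache holding a `compute` component that carries `region` and `category` attributes, a lookup that only specifies `region` would still resolve under this assumption, while a lookup with a different `region` value would miss and trigger one refresh.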
1 parent 0ba2a92 commit 97af11b

49 files changed

Lines changed: 7444 additions & 925 deletions

.github/agents/copilot-instructions.md

Lines changed: 1 addition & 0 deletions
@@ -22,6 +22,7 @@ cargo test [ONLY COMMANDS FOR ACTIVE TECHNOLOGIES][ONLY COMMANDS FOR ACTIVE TECH
 Rust 1.75+ (edition 2021, per Cargo.toml): Follow standard conventions

 ## Recent Changes
+- 003-sd-api-v2-migration: Added [if applicable, e.g., PostgreSQL, CoreData, files or N/A]

 - 001-project-documentation: Added Rust 1.75+ (edition 2021, per Cargo.toml) + axum (0.6), tokio (1.28), serde (1.0), tracing (0.1), reqwest (0.11)

.github/workflows/ci.yml

Lines changed: 10 additions & 7 deletions
@@ -18,12 +18,16 @@ jobs:
     steps:
       - uses: actions/checkout@v4

-      # Temporarily disabled linting and formatting checks, to be re-enabled later.
-      # - name: Check formatting
-      # run: make fmt-check
-      #
-      # - name: Run linter
-      # run: make lint
+      - name: Install Rust
+        uses: dtolnay/rust-toolchain@stable
+        with:
+          components: rustfmt, clippy
+
+      - name: Check formatting
+        run: make fmt-check
+
+      - name: Run linter
+        run: make lint

       - name: Run tests
         run: make test
@@ -37,6 +41,5 @@ jobs:
     steps:
       - uses: actions/checkout@v4

-
       - name: Run tests with coverage
         run: make coverage-check

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -28,3 +28,6 @@ docs/
 coverage/
 tarpaulin-report.html
 cobertura.xml
+
+# ai assistant files
+skills

Cargo.toml

Lines changed: 2 additions & 4 deletions
@@ -18,6 +18,7 @@ path="src/bin/reporter.rs"
 # See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

 [dependencies]
+anyhow = "~1.0"
 axum = { version="~0.6" }
 axum-macros = { version="~0.3" }
 chrono = "~0.4"
@@ -45,15 +46,12 @@ uuid = { version = "~1.3", features = ["v4", "fast-rng"] }

 [dev-dependencies]
 mockito = "~1.0"
+serial_test = "3.3.1"
 tempfile = "~3.5"
 tokio-test = "*"
 tower = { version = "0.4", features = ["util"] }
 hyper = { version = "0.14", features = ["full"] }

-[build-dependencies]
-schemars = "~0.8"
-serde = { version = "~1.0", features = ["derive"] }
-serde_json = "~1.0"

 [target.'cfg(all(target_env = "musl", target_pointer_width = "64"))'.dependencies.jemallocator]
 version = "0.3"

Makefile

Lines changed: 6 additions & 1 deletion
@@ -110,7 +110,7 @@ lint-fix:
 # ============================================================================

 ## Build mdbook documentation
-doc:
+doc: doc-schema
 	mdbook build doc/

 ## Serve documentation locally with live reload
@@ -129,6 +129,10 @@ doc-api:
 doc-api-open:
 	cargo doc --no-deps --open

+## Generate JSON schema for configuration
+doc-schema:
+	cargo test --lib -- generate_config_schema --ignored --nocapture
+
 ## Clean generated documentation
 doc-clean:
 	rm -rf docs/*
@@ -205,6 +209,7 @@ help:
 	@echo " doc-open - Build and open documentation in browser"
 	@echo " doc-api - Generate Rust API documentation"
 	@echo " doc-api-open - Generate and open Rust API docs in browser"
+	@echo " doc-schema - Generate JSON schema for configuration"
 	@echo " doc-clean - Clean generated documentation"
 	@echo ""
 	@echo " Utilities:"

README.md

Lines changed: 26 additions & 3 deletions
@@ -73,7 +73,7 @@ mdbook serve doc/
 ### Running Tests

 ```bash
-# Run all tests
+# Run all unit tests
 cargo test

 # Run tests with output
@@ -86,6 +86,31 @@ cargo test common::tests
 cargo test -- --test-threads=4
 ```

+### E2E Integration Tests
+
+End-to-end tests validate the complete pipeline using real Docker containers (go-carbon + carbonapi).
+
+#### Prerequisites
+- Docker installed and running
+- Ports available: 2003, 8080, 3005, 9999
+
+#### Running E2E Tests
+
+```bash
+# Run E2E tests (Docker containers are managed automatically)
+cargo test --test integration_e2e_reporter -- --ignored --nocapture
+```
+
+The E2E test validates 4 scenarios:
+| Scenario | Expected Weight | Description |
+|----------|-----------------|-------------|
+| healthy | 0 | All metrics within thresholds |
+| degraded_slow | 1 | API response time > 1200ms |
+| degraded_errors | 1 | Success rate < 65% |
+| outage | 2 | 100% API failures |
+
+For details, see [Testing Guide](doc/testing.md).
+
 ### Test Coverage

 ```bash
@@ -99,8 +124,6 @@ cargo tarpaulin --out Html
 open tarpaulin-report.html
 ```

-For detailed testing documentation, see [Testing Guide](doc/testing.md).
-
 ### JSON Schema for Configuration

 A JSON Schema for configuration validation is auto-generated during build:
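
The scenario table added to the README in this hunk maps metric conditions to an expected weight. A rough sketch of that mapping is shown here for orientation. The `Sample` struct and `expected_weight` function are hypothetical and are not taken from the repository. The thresholds (1200 ms, 65%, 100% failures) come from the table, while the real reporter derives weights from its configured flag metrics rather than from hard-coded values.

```rust
/// Hypothetical summary of one service's metrics over the query window.
struct Sample {
    response_time_ms: f64,
    success_rate_pct: f64,
}

/// Map a sample to the weight the E2E scenarios expect:
/// 2 = outage, 1 = degraded, 0 = healthy.
fn expected_weight(s: &Sample) -> u8 {
    if s.success_rate_pct <= 0.0 {
        // outage: 100% API failures
        2
    } else if s.response_time_ms > 1200.0 || s.success_rate_pct < 65.0 {
        // degraded_slow or degraded_errors
        1
    } else {
        // healthy: all metrics within thresholds
        0
    }
}
```

The outage check comes first so that a total failure is not reported as merely degraded when its success rate also falls below 65%.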

build.rs

Lines changed: 0 additions & 163 deletions
This file was deleted.

doc/config.md

Lines changed: 27 additions & 1 deletion
@@ -89,7 +89,33 @@ This section is providing capability to describe query templates to be later ref

 ## status_dashboard

-Configures URL and jwt secret for communication with the status dashboard
+Configures URL and JWT secret for communication with the status dashboard.
+
+```yaml
+status_dashboard:
+  url: "https://status-dashboard.example.com"
+  secret: "your-jwt-secret"
+```
+
+| Property | Type | Required | Default | Description |
+|----------|--------|----------|---------|---------------------------------------|
+| `url` | string | Yes | - | Status Dashboard API URL |
+| `secret` | string | No | - | JWT signing secret for authentication |
+
+## health_query
+
+Configures the time window for health metric queries.
+
+```yaml
+health_query:
+  query_from: "-5min"
+  query_to: "-2min"
+```
+
+| Property | Type | Required | Default | Description |
+|--------------|--------|----------|---------|---------------------------------------------------------------------|
+| `query_from` | string | No | `-5min` | Start time offset for health metric queries (e.g., "-10min", "-1h") |
+| `query_to` | string | No | `-2min` | End time offset for health metric queries (e.g., "-1min", "-30s") |

 ## flag_metrics

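
The `health_query` offsets documented in this hunk are relative time strings. As an illustration of how such a window could be sanity-checked, the sketch below parses the offset forms shown in the table. The functions `parse_offset` and `window_is_valid` are hypothetical, and the supported unit set (`s`, `min`, `h`, `d`) is an assumption. The reporter may simply pass these strings through to the TSDB, which accepts a wider range of formats.

```rust
use std::time::Duration;

/// Parse a relative offset such as "-5min", "-1h" or "-30s" into a Duration
/// measured back from "now". Only the units s, min, h and d are handled here.
fn parse_offset(s: &str) -> Option<Duration> {
    let rest = s.strip_prefix('-')?;
    // Position of the first non-digit character, i.e. where the unit starts.
    let split = rest.find(|c: char| !c.is_ascii_digit())?;
    let (digits, unit) = rest.split_at(split);
    let n: u64 = digits.parse().ok()?;
    let secs_per_unit = match unit {
        "s" => 1,
        "min" => 60,
        "h" => 3600,
        "d" => 86_400,
        _ => return None,
    };
    Some(Duration::from_secs(n.checked_mul(secs_per_unit)?))
}

/// A window is usable only if `query_from` lies further in the past than `query_to`.
fn window_is_valid(query_from: &str, query_to: &str) -> bool {
    match (parse_offset(query_from), parse_offset(query_to)) {
        (Some(from), Some(to)) => from > to,
        _ => false,
    }
}
```

Under these assumptions the documented defaults (`-5min` to `-2min`) describe a three-minute window ending two minutes before the current time, and swapping the two values would be rejected.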

doc/configuration/schema.md

Lines changed: 2 additions & 2 deletions
@@ -52,8 +52,8 @@ TSDB connection configuration.

 | Property | Type | Required | Default | Description |
 |----------|------|----------|---------|-------------|
-| `url` | string | Yes | - | TSDB URL (e.g., `http://graphite:8080`) |
-| `timeout` | integer | No | `10` | Query timeout in seconds |
+| `url` | string | Yes | - | TSDB URL (e.g., `http://graphite:8080`) |
+| `timeout` | integer | No | `2` | Query timeout in seconds |

 ```yaml
 datasource:
