Commit 0e20681: Merge branch 'feat/sentry-integration'
2 parents 453ca09 + f28c452
8 files changed: 250 additions & 59 deletions

README.md

Lines changed: 128 additions & 0 deletions
@@ -223,6 +223,134 @@ Following implementation requires MongoDB v4.2 or higher.
 ProxyPassReverse /fdsnws/availability/1 <HOST>:9001 timeout=600
 ```
 
+## Performance Tuning
+
+### Gunicorn Workers Configuration
+
+The number of Gunicorn workers directly affects how many concurrent requests your service can handle. The default configuration uses **1 worker** for maximum stability on resource-constrained servers.
+
+#### Current Configuration (docker-compose.yml)
+
+```yaml
+command: gunicorn --bind 0.0.0.0:9001 --workers 1 start:app
+```
+
+#### Adjusting Worker Count
+
+**For servers with limited resources or thread creation issues:**
+
+```yaml
+# Minimum configuration (most stable)
+command: gunicorn --bind 0.0.0.0:9001 --workers 1 --timeout 600 start:app
+```
+
+**For servers with moderate resources:**
+
+```yaml
+# 2-3 workers (recommended for most deployments)
+command: gunicorn --bind 0.0.0.0:9001 --workers 2 --timeout 600 start:app
+```
+
+**For high-performance servers:**
+
+```yaml
+# Formula: (2 × CPU cores) + 1
+# Example for a 4-core server: --workers 9
+command: gunicorn --bind 0.0.0.0:9001 --workers 9 --timeout 600 start:app
+```
+
+#### Important Notes
+
+1. **Each worker is a separate process** with its own memory footprint.
+2. **More workers ≠ always better**: too many workers can exhaust system resources.
+3. **Monitor for errors** after increasing workers:
+   ```bash
+   docker logs -f fdsnws-availability-api
+   # Watch for "pthread_create failed" or similar errors
+   ```
+4. **Resource usage check:**
+   ```bash
+   docker stats fdsnws-availability-api
+   # If CPU < 80% and memory is available, you can add more workers
+   ```
+
+### MongoDB Connection Pool
+
+The MongoDB connection pool is configured in `apps/wfcatalog_client.py`:
+
+```python
+maxPoolSize=1  # Connections per worker
+```
+
+#### How It Works
+
+- **Each Gunicorn worker** has its own MongoDB client
+- **Total connections** = `workers × maxPoolSize`
+- **Example:** 2 workers × 1 pool = 2 total MongoDB connections
+
+#### When to Adjust
+
+**Keep `maxPoolSize=1` if:**
+
+- ✅ Using sync workers (default Gunicorn configuration)
+- ✅ Each worker handles one request at a time
+- ✅ Server has resource constraints
+
+**Increase `maxPoolSize` only if:**
+
+- Using async workers (gevent/eventlet)
+- Using threading within workers
+- MongoDB is a bottleneck (check with profiling)
+
+#### Example Configurations
+
+| Workers | maxPoolSize | Total Connections | Use Case          |
+|---------|-------------|-------------------|-------------------|
+| 1       | 1           | 1                 | Minimal (default) |
+| 2       | 1           | 2                 | Recommended       |
+| 4       | 1           | 4                 | High performance  |
+| 2       | 5           | 10                | Async workers     |
+
+### Thread Limiting (Important!)
+
+The configuration includes thread limits to prevent `pthread_create failed` errors on restricted servers:
+
+```yaml
+environment:
+  OPENBLAS_NUM_THREADS: 1
+  MKL_NUM_THREADS: 1
+  NUMEXPR_NUM_THREADS: 1
+  OMP_NUM_THREADS: 1
+```
+
+**Do not remove these** unless you are certain your server can handle multiple threads per process. They prevent NumPy/ObsPy from spawning excessive threads.
+
+### Troubleshooting
+
+**Problem:** Service crashes with "pthread_create failed"
+- **Solution:** Reduce workers to 1 and keep the thread limits in place
+
+**Problem:** Slow response times under load
+- **Solution:** Increase workers (if resources allow) and monitor with `docker stats`
+
+**Problem:** High memory usage
+- **Solution:** Reduce workers and check for memory leaks with profiling
+
+**Problem:** MongoDB connection errors
+- **Solution:** Check total connections (workers × maxPoolSize) against MongoDB limits
+
+### Performance Monitoring
+
+See `tests/performance/` for profiling and benchmarking tools:
+
+```bash
+# Quick performance test
+bash tests/performance/quick_test.sh
+
+# Detailed profiling
+python tests/performance/profiler.py
+
+# Load testing
+locust -f tests/performance/locustfile.py --host=http://localhost:9001
+```
+
+For more details, see [Performance Analysis Plan](tests/performance/README.md).
+
 ## Running in development environment
 
 1. Go to the root directory.
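The `(2 × CPU cores) + 1` worker formula from the Performance Tuning section above can be checked with a short standalone sketch (the `recommended_workers` helper name is ours, not part of the repository):

```python
import os

def recommended_workers() -> int:
    """Gunicorn worker count per the (2 x CPU cores) + 1 rule of thumb."""
    # os.cpu_count() may return None in restricted environments;
    # fall back to 1 core, which yields a conservative 3 workers.
    cores = os.cpu_count() or 1
    return 2 * cores + 1
```

On a 4-core server this returns 9, matching the high-performance example above.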

apps/data_access_layer.py

Lines changed: 21 additions & 37 deletions
@@ -196,43 +196,27 @@ def sort_records(params: dict, data: list[list[Any]]) -> None:
     elif params["orderby"] == "latestupdate_desc":
         data.sort(key=lambda x: x[UPDATED], reverse=True)
     else:
-        # Default sorting: NSLC, Time, Quality, SampleRate
-        # We sort by multiple keys in reverse priority (Python sort is stable)
-
-        # 1. Sort by Quality and SampleRate
-        data.sort(key=lambda x: (x[QUALITY], x[SAMPLERATE]))
-        # 2. Sort by Time (Start, End) - descending? Wait, original code had reverse=True for time?
-        # Original: data.sort(key=lambda x: (x[START], x[END]), reverse=True)
-        # But usually we want ascending time?
-        # Let's check original logic carefully.
-
-        # Original Lines:
-        # 200: data.sort(key=lambda x: (x[QUALITY], x[SAMPLERATE]))
-        # 201: data.sort(key=lambda x: (x[START], x[END]), reverse=True)
-        # 202: data.sort(key=lambda x: x[:QUALITY])
-
-        # Line 202 sorts by first 4 columns (Net, Sta, Loc, Cha).
-        # Since Python sort is stable, previous sorts within those groups are preserved.
-
-        # Line 201 sorted by Time DESCENDING? That seems odd for a time series.
-        # But if we want Earliest to Latest, it should be Ascending.
-        # Maybe reverse=True was a bug or specific requirement?
-        # The user's query example showed 2023-11-23 then 2023-11-22, which is DESCENDING.
-        # If the user wants standard time order, it should be Ascending.
-
-        # Let's KEEP original logic for now, but ensure it runs.
-        # WAIT, if 201 is reverse=True, then data is sorted Time DESCENDING?
-        # Let's verify what `nslc_time_quality_samplerate` implies. "ordered by ... time ..." usually means Ascending.
-
-        # If I change reverse=True to False, I might break expected behavior if descending was intended.
-        # But "Earliest" column usually suggests ascending.
-
-        # Let's stick to the minimal fix: remove the surrounding IF, keep the logic the same.
-
-        data.sort(key=lambda x: (x[QUALITY], x[SAMPLERATE]))
-        # 2. Sort by Time (Start, End) - Ascending
-        data.sort(key=lambda x: (x[START], x[END]), reverse=False)
-        data.sort(key=lambda x: x[:QUALITY])
+        # Default sorting: NSLC (Network, Station, Location, Channel),
+        # then Time (Start, End), then Quality, then SampleRate
+        #
+        # OPTIMIZATION: Use a single sort with a compound key instead of
+        # 3 separate sorts. This is more efficient and clearer than relying
+        # on stable sort behavior.
+        #
+        # Sort order:
+        #   1. Network, Station, Location, Channel (x[0], x[1], x[2], x[3])
+        #   2. Start time, End time (x[START], x[END])
+        #   3. Quality (x[QUALITY])
+        #   4. Sample rate (x[SAMPLERATE])
+        data.sort(key=lambda x: (
+            x[0],           # Network
+            x[1],           # Station
+            x[2],           # Location
+            x[3],           # Channel
+            x[START],       # Start time (ascending - earliest first)
+            x[END],         # End time
+            x[QUALITY],     # Quality
+            x[SAMPLERATE],  # Sample rate
+        ))
 
 
 # else:
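The compound-key sort introduced in this diff can be exercised on sample rows. The column indices below are assumptions for illustration; in the repository, `START`, `END`, `QUALITY`, and `SAMPLERATE` come from `apps/globals.py`:

```python
# Hypothetical column layout: Net, Sta, Loc, Cha, Start, End, Quality, SampleRate
START, END, QUALITY, SAMPLERATE = 4, 5, 6, 7

rows = [
    ["NL", "HGN", "00", "BHZ", "2023-11-23", "2023-11-24", "D", 40.0],
    ["NL", "HGN", "00", "BHZ", "2023-11-22", "2023-11-23", "D", 40.0],
    ["CH", "DAVOX", "", "HHZ", "2023-11-22", "2023-11-23", "M", 120.0],
]

# Single sort with a compound key: NSLC first, then ascending time,
# then quality and sample rate - equivalent to chained stable sorts.
rows.sort(key=lambda x: (
    x[0], x[1], x[2], x[3],   # Network, Station, Location, Channel
    x[START], x[END],         # earliest first
    x[QUALITY], x[SAMPLERATE],
))
# The CH row sorts first; the two NL rows end up in ascending time order.
```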

apps/globals.py

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@
 # error message constants
 DOCUMENTATION_URI = "http://www.fdsn.org/webservices/fdsnws-availability-1.0.pdf"
 SERVICE = "fdsnws-availability"
-VERSION = "1.0.3"
+VERSION = "1.0.4"
 
 
 class Error:

config.py.sample

Lines changed: 5 additions & 0 deletions
@@ -60,6 +60,11 @@ class Config:
         CACHE_RESP_PERIOD = (
             os.environ.get("CACHE_SHORT_INV_PERIOD") or CACHE_RESP_PERIOD
         )
+        # Sentry configuration (optional)
+        SENTRY_DSN = os.environ.get("SENTRY_DSN") or ""
+        SENTRY_TRACES_SAMPLE_RATE = float(
+            os.environ.get("SENTRY_TRACES_SAMPLE_RATE") or "1.0"
+        )
     except NameError:
         print("Missing environment variables.")
         raise
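The `or`-based fallbacks in the config fragment above treat both a missing variable and an empty string as "unset". A quick standalone check (variable names match the sample; clearing them here only simulates a bare shell):

```python
import os

# Simulate an environment with neither Sentry variable set.
os.environ.pop("SENTRY_DSN", None)
os.environ.pop("SENTRY_TRACES_SAMPLE_RATE", None)

SENTRY_DSN = os.environ.get("SENTRY_DSN") or ""
SENTRY_TRACES_SAMPLE_RATE = float(
    os.environ.get("SENTRY_TRACES_SAMPLE_RATE") or "1.0"
)
# With nothing set, the DSN is "" (Sentry stays disabled) and the
# traces sample rate falls back to 1.0.
```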

docker-compose.yml

Lines changed: 6 additions & 2 deletions
@@ -31,8 +31,12 @@ services:
       context: ./
       dockerfile: Dockerfile.api
     restart: always
-    # Run with 1 sync worker (absolute minimum memory/thread footprint)
-    command: gunicorn --bind 0.0.0.0:9001 --workers 1 start:app
+    # Worker Configuration:
+    #   - Default: 1 worker (most stable for resource-constrained servers)
+    #   - Moderate: 2-3 workers (recommended if no thread creation issues)
+    #   - High-performance: (2 × CPU cores) + 1 workers
+    # See README.md "Performance Tuning" section for details
+    command: gunicorn --bind 0.0.0.0:9001 --workers 1 --timeout 600 start:app
     container_name: fdsnws-availability-api
     network_mode: "host"
     environment:

pyproject.toml

Lines changed: 2 additions & 1 deletion
@@ -1,6 +1,6 @@
 [project]
 name = "ws-availability"
-version = "0.1.0"
+version = "1.0.4"
 description = "Add your description here"
 readme = "README.md"
 requires-python = ">=3.10"
@@ -13,6 +13,7 @@ dependencies = [
     "requests==2.31.0",
     "pydantic>=2.0.0",
     "pydantic-settings>=2.12.0",
+    "sentry-sdk[flask]>=2.0.0",
 ]
 
 [dependency-groups]

start.py

Lines changed: 69 additions & 0 deletions
@@ -1,13 +1,82 @@
 import logging
 import os
 
+import sentry_sdk
 from flask import Flask, make_response, render_template
 
 from apps.globals import VERSION
 from apps.root import output
 from config import Config
 
 
+def before_send(event, hint):
+    """
+    Scrub sensitive data from Sentry events before sending.
+    This prevents passwords, API keys, and other secrets from being exposed.
+    """
+    # List of sensitive field names to scrub (case-insensitive)
+    sensitive_keys = {
+        "password", "passwd", "pwd", "secret", "api_key", "apikey",
+        "token", "auth", "authorization", "credentials", "private_key",
+        "access_token", "refresh_token", "session", "cookie",
+    }
+
+    def scrub_dict(data):
+        """Recursively scrub sensitive data from dictionaries."""
+        if not isinstance(data, dict):
+            return
+
+        for key in list(data.keys()):
+            key_lower = str(key).lower()
+            # Check if key contains any sensitive keyword
+            if any(sensitive in key_lower for sensitive in sensitive_keys):
+                data[key] = "[Filtered]"
+            elif isinstance(data[key], dict):
+                scrub_dict(data[key])
+            elif isinstance(data[key], list):
+                for item in data[key]:
+                    if isinstance(item, dict):
+                        scrub_dict(item)
+
+    # Scrub request data
+    if "request" in event:
+        scrub_dict(event["request"])
+
+    # Scrub extra context
+    if "extra" in event:
+        scrub_dict(event["extra"])
+
+    # Scrub user context
+    if "user" in event:
+        scrub_dict(event["user"])
+
+    # Scrub breadcrumbs
+    if "breadcrumbs" in event:
+        for breadcrumb in event["breadcrumbs"].get("values", []):
+            scrub_dict(breadcrumb)
+
+    # Scrub local variables from stack traces
+    if "exception" in event:
+        for exception in event["exception"].get("values", []):
+            if "stacktrace" in exception:
+                for frame in exception["stacktrace"].get("frames", []):
+                    if "vars" in frame:
+                        scrub_dict(frame["vars"])
+
+    return event
+
+
+# Initialize Sentry before creating the Flask app
+if Config.SENTRY_DSN:
+    sentry_sdk.init(
+        dsn=Config.SENTRY_DSN,
+        traces_sample_rate=Config.SENTRY_TRACES_SAMPLE_RATE,
+        # Add data like request headers and IP for users
+        send_default_pii=True,
+        # Scrub sensitive data before sending
+        before_send=before_send,
+    )
+
 app = Flask(__name__)
 
 FMT = "[%(asctime)s] %(levelname)s [%(filename)s:%(lineno)d] [%(funcName)s] %(message)s"
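The scrubbing behaviour added in this diff can be exercised without a Sentry account. The snippet below restates the core recursion in a standalone form (trimmed keyword set, names ours) and shows a substring match on a key filtering a header value:

```python
# Minimal re-statement of the before_send scrubbing logic: any dict key
# containing a sensitive keyword is replaced with "[Filtered]", recursively.
SENSITIVE = {"password", "token", "secret", "api_key"}

def scrub(data):
    if not isinstance(data, dict):
        return
    for key in list(data):
        if any(s in str(key).lower() for s in SENSITIVE):
            data[key] = "[Filtered]"
        elif isinstance(data[key], dict):
            scrub(data[key])

event = {
    "request": {
        "headers": {"Authorization-Token": "abc123"},
        "url": "/fdsnws/availability/1/query",
    }
}
scrub(event)
# "Authorization-Token" matches the "token" keyword and is replaced;
# the URL is left untouched.
```

Because the match is a substring test on the lowercased key, variants like `X-Api-Key` or `session_cookie` are caught as well.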
