This directory contains Ruuter configuration files for internal API endpoints that handle backend operations for the Common Knowledge Base. These endpoints are used by microservices for database operations and pipeline coordination.
Ruuter Internal provides backend API endpoints that are called by other services within the CKB ecosystem. Unlike the external API, these endpoints do not require user authentication and are designed for service-to-service communication.
DSL/Ruuter.internal/ckb/
├── GET/ # Read operations
│ ├── agency/ # Agency data retrieval
│ ├── client/ # Client data operations
│ ├── pipeline/ # Pipeline status and triggers
│ ├── source-file/ # File processing queries
│ └── source/ # Source management
├── POST/ # Write operations
│ ├── agency/ # Agency updates
│ ├── pipeline/ # Pipeline operations
│ ├── reports/ # Report management
│ ├── source-file/ # File processing
│ └── source/ # Source updates
└── TEMPLATES/ # Reusable workflow templates
├── agency/ # Agency-specific workflows
├── pipeline/ # Data processing workflows
└── source_file/ # File processing workflows
Pipeline:
- GET /pipeline/scheduler-check-for-unscheduled-records - Find sources needing scheduling
- GET /pipeline/trigger-pipeline-for-sceduled-sources - Execute scheduled tasks
- GET /pipeline/zip - Archive processed data
- POST /pipeline/clean-scraped-file - Trigger file cleaning
- POST /pipeline/upload-file-sync - Synchronous file upload
- POST /pipeline/delete-file-sync - Synchronous file deletion
Source files:
- GET /source-file/get-one-source-file-to-scrape - Get next file for processing
- POST /source-file/add-scrapped-file - Add scraped file to database
- POST /source-file/update-cleaned-file - Update file after cleaning
- POST /source-file/update-scrapped-file - Update scraping status
- POST /source-file/upload-metadata - Upload file metadata
Agencies:
- GET /agency/get - Retrieve agency information
- POST /agency/update-data-hash - Update agency data hash
- POST /agency/update-zip-dirty - Mark agency data for re-zipping
Sources:
- GET /source/get - Retrieve source configuration
- POST /source/update-status - Update source processing status
Client data:
- GET /client/data/exist - Check if client data exists
- GET /client/data/import - Import client data
- GET /client/data/new - Get new client data
Reports:
- POST /reports/add - Create processing report
- POST /reports/update - Update report status
- POST /reports/logs/add - Add log entries
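Since these endpoints are plain HTTP, a dependent service only needs to assemble a JSON request against the in-network Ruuter address. The sketch below builds (without sending) a `POST /reports/add` request; the host name, port, and payload fields are assumptions for illustration, not the documented contract.

```python
# Minimal sketch of a service-to-service call to an internal endpoint.
# The base URL and payload fields are hypothetical; check the actual
# Ruuter deployment and Resql query for the real contract.
import json
import urllib.request

RUUTER_INTERNAL = "http://ruuter-internal:8080"  # assumed in-network address

def build_report_request(source_id: str, status: str) -> urllib.request.Request:
    """Build (but do not send) a POST /reports/add request."""
    body = json.dumps({"sourceId": source_id, "status": status}).encode("utf-8")
    return urllib.request.Request(
        url=f"{RUUTER_INTERNAL}/reports/add",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_report_request("source-42", "PROCESSING")
print(req.full_url)      # http://ruuter-internal:8080/reports/add
print(req.get_method())  # POST
```

A real caller would pass the request to `urllib.request.urlopen` (or use an HTTP client library); building the request separately keeps the payload easy to log and test.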
Reusable workflow templates for common operations:
- clean-file.yml - File cleaning workflow
- trigger-scrapper-*.yml - Various scraper trigger workflows
- update-next-rendered-timestamp.yml - Scheduling updates
- zip.yml - Agency data archival workflow
- upload-metadata.yml - File metadata upload workflow
- tara.yml - TARA authentication integration
Internal endpoints directly execute SQL queries via Resql integration:
- Read Operations: Execute SELECT queries from DSL/Resql/ckb/GET/
- Write Operations: Execute INSERT/UPDATE/DELETE queries from DSL/Resql/ckb/POST/
- Transaction Management: Proper database transaction handling
- Connection Pooling: Efficient database connection management
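The transaction guarantee above means a write that fails partway through must leave the database unchanged. The sketch below illustrates that behaviour with the standard-library `sqlite3` module standing in for the real Resql-managed database; it is a model of the expected semantics, not the actual Resql code.

```python
# Transaction rollback semantics, illustrated with sqlite3 as a stand-in
# for the Resql-managed database.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO source (id, status) VALUES (1, 'NEW')")
conn.commit()

try:
    with conn:  # opens a transaction; rolls back automatically on exception
        conn.execute("UPDATE source SET status = 'PROCESSING' WHERE id = 1")
        raise RuntimeError("simulated downstream failure")
except RuntimeError:
    pass

# The update was rolled back, so the row still reads 'NEW'.
status = conn.execute("SELECT status FROM source WHERE id = 1").fetchone()[0]
print(status)  # NEW
```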
Internal API endpoints are called by:
- Scrapper Service: File processing and metadata updates
- Cleaning Service: Cleaned file uploads and status updates
- Data Export: Database queries and data extraction
- Scheduler: Pipeline trigger coordination
- File Processing: File upload/download operations
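The Scrapper Service's interaction with these endpoints is essentially a polling loop: fetch the next queued file, process it, report the result. A hedged sketch of that control flow is below; the endpoint paths come from this document, but the transport is stubbed with callables so the loop can run without a live Ruuter instance.

```python
# One iteration of a hypothetical scraper worker loop against the
# internal API. `get`/`post` stand in for real HTTP calls.
from typing import Callable, Optional

def scrape_once(get: Callable[[str], Optional[dict]],
                post: Callable[[str, dict], None],
                process: Callable[[dict], dict]) -> bool:
    """Handle one queued source file; return True if one was handled."""
    source_file = get("/source-file/get-one-source-file-to-scrape")
    if source_file is None:
        return False  # nothing queued; caller sleeps and retries
    result = process(source_file)
    post("/source-file/update-scrapped-file", result)
    return True

# Stubbed usage: one file in the queue, processing marks it SCRAPED.
queue = [{"id": 7, "url": "https://example.org/doc.html"}]
posted = []
handled = scrape_once(
    get=lambda path: queue.pop() if queue else None,
    post=lambda path, body: posted.append((path, body)),
    process=lambda f: {"id": f["id"], "status": "SCRAPED"},
)
print(handled, posted)
```

Injecting the transport as callables keeps the worker logic testable in isolation, which matters for services that only talk to internal endpoints.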
Each YAML file typically contains:
```yaml
DSL:
  - name: operation_name
    http:
      method: GET/POST
      url: /internal/endpoint
    steps:
      - name: step_name
        resql:
          query: query_name
        assign: variable_name
```

Error handling in these endpoints covers:
- Database Errors: Proper SQL error handling and rollback
- Service Failures: Graceful degradation when dependent services are unavailable
- Validation Errors: Input validation with appropriate error responses
- Logging: Comprehensive logging for debugging and monitoring
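"Graceful degradation" above means a call to an unavailable dependent service should be retried a bounded number of times and then handled with a fallback rather than crashing the pipeline. A minimal sketch, with illustrative names and a stubbed flaky service:

```python
# Retry-then-fallback sketch for calls to dependent services.
# Retry counts, backoff, and the fallback action are illustrative.
import time

def call_with_fallback(call, fallback, retries=3, delay=0.0):
    for attempt in range(retries):
        try:
            return call()
        except ConnectionError:
            if delay:
                time.sleep(delay * (2 ** attempt))  # exponential backoff
    return fallback()

attempts = {"n": 0}
def flaky_cleaning_service():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("cleaning service unavailable")
    return "cleaned"

result = call_with_fallback(flaky_cleaning_service, lambda: "queued-for-later")
print(result)  # cleaned (succeeded on the third attempt)
```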
While internal endpoints don't require user authentication:
- Network Isolation: Should only be accessible within the service network
- Input Validation: All inputs validated before database operations
- SQL Injection Prevention: Parameterized queries via Resql
- Resource Limits: Rate limiting and resource constraints
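The SQL-injection point deserves a concrete demonstration: a parameterized query binds user input as a literal value, so injected SQL fragments never reach the parser. The sketch uses `sqlite3` in place of the real database; the principle is the same one Resql's named parameters rely on.

```python
# Why parameterized queries neutralize injection: the malicious string
# is bound as data, not interpreted as SQL. sqlite3 stands in for the
# real Resql-managed database.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE agency (name TEXT)")
conn.execute("INSERT INTO agency (name) VALUES ('tax-board')")

malicious = "x' OR '1'='1"

# The ? placeholder binds the whole string as one literal value.
rows = conn.execute(
    "SELECT name FROM agency WHERE name = ?", (malicious,)
).fetchall()
print(rows)  # [] -- no match; the OR clause was never parsed as SQL
```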
When adding new internal endpoints:
- Create YAML configuration in appropriate GET/POST directory
- Define corresponding SQL queries in Resql directory
- Add error handling and validation
- Test service integration
- Update documentation
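Before deploying, a new endpoint configuration can be run through a quick structural check. The validator below is a sketch inferred from the YAML structure shown earlier in this document, not an official Ruuter schema; the required keys are assumptions.

```python
# Hypothetical sanity check for a parsed endpoint config dict,
# based on the structure this document describes.
def validate_endpoint(cfg: dict) -> list:
    """Return a list of human-readable problems (empty if none found)."""
    errors = []
    for op in cfg.get("DSL", []):
        name = op.get("name", "?")
        if "name" not in op:
            errors.append("operation missing 'name'")
        http = op.get("http", {})
        if http.get("method") not in ("GET", "POST"):
            errors.append(f"{name}: method must be GET or POST")
        if not http.get("url", "").startswith("/"):
            errors.append(f"{name}: url must start with '/'")
    return errors

cfg = {"DSL": [{"name": "update-status",
                "http": {"method": "POST", "url": "/source/update-status"}}]}
print(validate_endpoint(cfg))  # []
```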
Operational monitoring should track:
- Health Checks: Endpoint availability monitoring
- Performance Metrics: Query execution time tracking
- Error Rates: Failed operation monitoring
- Resource Usage: Database connection and memory monitoring
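Query-execution-time tracking can be as simple as wrapping each operation so its duration lands in a per-operation series that a metrics backend scrapes. The decorator below is purely illustrative; the metric names and storage are assumptions.

```python
# Sketch of per-operation duration tracking for monitoring.
import time
from collections import defaultdict

metrics = defaultdict(list)  # operation name -> list of durations (seconds)

def timed(name):
    def wrap(fn):
        def inner(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                metrics[name].append(time.perf_counter() - start)
        return inner
    return wrap

@timed("source.update-status")
def update_status():
    time.sleep(0.01)  # stand-in for a Resql query

update_status()
print(len(metrics["source.update-status"]))  # 1
```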