feat: add middleware management API specs #196

Open
keshxvdayal wants to merge 6 commits into dev from
feature/middleware-api-spec

Conversation

@keshxvdayal keshxvdayal commented Jan 24, 2026

Detailed Implementation Plan

  1. Create OpenAPI Specification Document

    • Create data/specs/middleware_management.yaml
    • Define base paths and versioning (/ga4gh/tes/v1/middlewares)
    • Document all endpoints:
      • GET /middlewares - List all configured middlewares
      • POST /middlewares - Add new middleware (local or from GitHub)
      • GET /middlewares/{middleware_id} - Get middleware details
      • PUT /middlewares/{middleware_id} - Update middleware configuration
      • DELETE /middlewares/{middleware_id} - Remove middleware
      • PUT /middlewares/reorder - Reorder middleware stack
      • POST /middlewares/validate - Validate middleware code without adding
    • Define comprehensive schemas:
      • MiddlewareConfig - Full middleware configuration object
      • MiddlewareCreate - Request body for creating middleware
      • MiddlewareUpdate - Request body for updating middleware
      • MiddlewareList - Array response with pagination support
      • MiddlewareOrder - Order configuration for reordering
      • ValidationRequest - Code validation request
      • ErrorResponse - Standard error format
    • Document query parameters (pagination, filtering, sorting)
    • Define all possible HTTP status codes and error responses
  2. Integrate with FOCA Configuration

    • Update pro_tes/config.yaml to reference the new spec
    • Add middleware API paths to FOCA routing configuration
    • Configure CORS and authentication settings (if applicable)
    • Set up request/response validation middleware via FOCA
  3. Write API Documentation

    • Create docs/api/middleware_management.md
    • Provide usage examples with curl commands
    • Document authentication requirements
    • Explain order/priority system for middleware execution
    • Document fallback group behavior
    • Add sequence diagrams for complex operations (reordering, fallback)
  4. Validation and Review

    • Validate OpenAPI spec with openapi-generator validate
    • Generate API documentation preview with Redoc/Swagger UI
    • Create examples for each endpoint operation
    • Peer review with team focusing on:
      • API design best practices
      • Consistency with existing proTES APIs
      • GA4GH TES specification alignment

Summary by Sourcery

Add an OpenAPI specification and configuration wiring for a new runtime middleware management API in proTES.

New Features:

  • Introduce a Middleware Management REST API specification with endpoints for listing, creating, updating, deleting, reordering, and validating middlewares.

Enhancements:

  • Extend FOCA configuration to load the new middleware management OpenAPI spec with strict request/response validation and routing to middleware controllers.
  • Add a MongoDB index on middleware name to support efficient lookup and management operations.

Documentation:

  • Add middleware management architecture and API design documentation describing endpoints, data models, design decisions, and future work.

Copilot AI review requested due to automatic review settings January 24, 2026 17:38

sourcery-ai bot commented Jan 24, 2026

Reviewer's Guide

Adds a new OpenAPI 3.0 specification for runtime middleware management and wires it into FOCA/Connexion configuration and project docs, without yet implementing controller logic.

Sequence diagram for adding a new middleware via the OpenAPI spec

sequenceDiagram
  actor Admin
  participant Client as API_client
  participant Connexion as FOCA_Connexion
  participant Controller as MiddlewaresController
  participant DB as MongoDB_middlewares

  Admin->>Client: Prepare MiddlewareCreate_payload
  Client->>Connexion: POST /ga4gh/tes/v1/middlewares
  Connexion->>Connexion: Validate_request_against_OpenAPI
  Connexion->>Controller: addMiddleware(MiddlewareCreate)

  Controller->>DB: Query existing_middlewares_for_order
  DB-->>Controller: Existing_middlewares_with_orders
  Controller->>Controller: Compute_final_order_and_shift_stack
  Controller->>DB: Insert new_middleware_document
  DB-->>Controller: Insert_result_with_id

  Controller-->>Connexion: 201 MiddlewareCreateResponse
  Connexion-->>Client: HTTP_201_with_JSON_body
  Client-->>Admin: Show_created_middleware_id_and_order

Class diagram for middleware management OpenAPI schemas

classDiagram
  class MiddlewareConfig {
    string _id
    string name
    string class_path_string
    string[] class_path_group
    int order
    string source
    string github_url
    object config
    boolean enabled
    string created_at
    string updated_at
  }

  class MiddlewareCreate {
    string name
    string class_path_string
    string[] class_path_group
    int order
    string github_url
    object config
    boolean enabled
  }

  class MiddlewareUpdate {
    string name
    int order
    object config
    boolean enabled
  }

  class MiddlewareList {
    MiddlewareConfig[] middlewares
    int total
    int limit
    int offset
  }

  class MiddlewareCreateResponse {
    string _id
    int order
    string message
  }

  class MiddlewareOrder {
    string[] ordered_ids
  }

  class ValidationRequest {
    string class_path
    string github_url
    string code
  }

  class ValidationErrorItem {
    int line
    int column
    string message
    string severity
  }

  class ValidationWarningItem {
    int line
    string message
    string severity
  }

  class ValidationResponse {
    boolean valid
    string message
    ValidationErrorItem[] errors
    ValidationWarningItem[] warnings
  }

  class ErrorResponse {
    string error
    string message
    object details
    string timestamp
  }

  MiddlewareList "1" --> "*" MiddlewareConfig : contains
  MiddlewareCreateResponse --> MiddlewareConfig : _id_and_order_align_with
  MiddlewareOrder "1" --> "*" MiddlewareConfig : orders
  ValidationResponse "1" --> "*" ValidationErrorItem : has
  ValidationResponse "1" --> "*" ValidationWarningItem : has
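
The MiddlewareOrder schema in the diagram above carries only `ordered_ids`. A minimal Python sketch of how a reorder request could be applied follows. The function name `reorder_stack`, the zero-based orders, and the rule that the list must name every existing middleware exactly once are assumptions for illustration, not behavior confirmed by the spec.

```python
from typing import Dict, List


def reorder_stack(current_ids: List[str], ordered_ids: List[str]) -> Dict[str, int]:
    """Return a new id -> order mapping for a full reordering of the stack.

    Assumption: ordered_ids must list every existing middleware exactly once.
    """
    if len(set(ordered_ids)) != len(ordered_ids):
        raise ValueError("ordered_ids contains duplicates")
    if set(ordered_ids) != set(current_ids):
        raise ValueError("ordered_ids must contain exactly the existing middleware IDs")
    # The position in the list becomes the execution order (0 runs first).
    return {mid: position for position, mid in enumerate(ordered_ids)}
```

Rejecting partial lists keeps the resulting order unambiguous; a real controller might instead choose to append unlisted middlewares at the end.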

File-Level Changes

Change: Integrate middleware management OpenAPI spec into FOCA API configuration and database indexes.
Details:
  • Add MongoDB index configuration for middlewares collection keyed by name.
  • Register new middleware_management OpenAPI spec in FOCA specs list with strict request/response validation enabled.
  • Configure routing to pro_tes.api.middlewares.controllers and temporarily disable authentication for these endpoints.
  • Tidy custom middleware list config (no behavioral change).
Files: pro_tes/config.yaml

Change: Introduce Middleware Management OpenAPI 3.0 specification describing middleware lifecycle operations and data models.
Details:
  • Define seven middleware management endpoints for listing, creating, retrieving, updating, deleting, reordering, and validating middlewares under /ga4gh/tes/v1/middlewares.
  • Add query parameters for pagination, sorting, and filtering on list endpoint and force flag on delete.
  • Specify request/response schemas for middleware CRUD, ordering, validation, and error handling, including MongoDB ObjectId patterns and GitHub URL constraints.
  • Model middleware configuration with support for single class paths and fallback groups, tracking source (local/github), timestamps, and soft-delete semantics.
Files: pro_tes/api/middleware_management.yaml

Change: Add high-level documentation for the Middleware Management API design and its role in the project.
Details:
  • Document the purpose and background of the middleware management feature and its design-first approach.
  • Summarize the defined endpoints, schemas, and key design decisions (ordering, soft delete, immutable class paths, GitHub integration).
  • Describe integration with FOCA/Connexion, file layout, validation strategy, and planned future subtasks for implementation and security.
  • Clarify that this PR is non-breaking and currently limited to specification and documentation work.
Files: docs/middleware.md

Possibly linked issues

  • #N/A: PR creates the full middleware_management OpenAPI spec, schemas, endpoints, and docs exactly as the issue requests.

@sourcery-ai sourcery-ai bot left a comment

Hey - I've found 2 issues, and left some high level feedback:

  • The MiddlewareCreateResponse schema only exposes _id, order, and message, but the docs in docs/middleware.md state that it returns the full middleware object; consider either expanding the response schema (e.g., embed MiddlewareConfig) or updating the docs to reflect the actual payload to avoid confusion for API consumers.
  • In ValidationRequest, the class_path field is described as "Class path or code to validate" while a separate code field is also defined; it would be clearer to tighten the descriptions and possibly add explicit rules (e.g., mutual exclusivity or precedence) so clients know exactly how these fields should be used together.
Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `pro_tes/config.yaml:82` </location>
<code_context>
+      - api/middleware_management.yaml
+    add_operation_fields:
+      x-openapi-router-controller: pro_tes.api.middlewares.controllers
+    disable_auth: True
+    connexion:
+      strict_validation: True
</code_context>

<issue_to_address>
**🚨 issue (security):** Exposing middleware management without auth looks risky for production use.

Since this endpoint can change the middleware stack at runtime, keeping `disable_auth: True` lets anyone with network access reorder/enable/disable/add middlewares. Unless this is strictly limited to a trusted/debug environment, this should require authz (or be configurable per deployment) to avoid a straightforward privilege escalation path.
</issue_to_address>

### Comment 2
<location> `pro_tes/api/middleware_management.yaml:572-358` </location>
<code_context>
+            - "507f1f77bcf86cd799439012"
+            - "507f1f77bcf86cd799439013"
+
+    ValidationRequest:
+      type: object
+      description: Request body for validating middleware code
+      required:
+        - class_path
+      properties:
</code_context>

<issue_to_address>
**issue (bug_risk):** ValidationRequest requires class_path even when validating raw code, which conflicts with the description.

The schema contradicts the docs: `code` is documented as an alternative to `class_path`, but `class_path` is required. This blocks use cases where clients submit only raw code or only a `github_url`. Please adjust the schema (e.g., with `oneOf`/`anyOf` to require at least one of `class_path`, `github_url`, or `code`, or by making `class_path` optional and clarifying the validation rules) so the spec matches the intended behavior.
</issue_to_address>

Copilot AI left a comment

Pull request overview

This PR adds a comprehensive OpenAPI 3.0 specification for middleware management functionality in proTES, enabling dynamic management of middleware components at runtime. This is the first subtask in a multi-part implementation following a design-first approach, focusing solely on the API specification and documentation without implementing the actual controller logic.

Changes:

  • Added complete OpenAPI 3.0 specification defining 7 REST endpoints for middleware CRUD operations, reordering, and validation
  • Configured FOCA integration with new middleware management API routes and MongoDB collection
  • Added comprehensive documentation describing the API design, endpoints, data models, and implementation plan

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 17 comments.

| File | Description |
| --- | --- |
| pro_tes/api/middleware_management.yaml | New OpenAPI 3.0 specification defining middleware management API with 7 endpoints and 9 schemas |
| pro_tes/config.yaml | Added FOCA integration for middleware API, database collection configuration, and routing setup |
| docs/middleware.md | Comprehensive documentation covering API design, implementation details, security considerations, and future work |
Comments suppressed due to low confidence (1)

docs/middleware.md:210

  • The changelog date "2026-01-24" is in the future. This should be corrected to use the actual date when this work was completed.

description: |
  Retrieve all configured middlewares with their order, metadata, and status.
  Results are sorted by execution order (ascending) by default.
operationId: listMiddlewares
Copilot AI Jan 24, 2026

The operationId naming convention is inconsistent with the existing TES API. The TES API uses PascalCase (GetServiceInfo, CreateTask, GetTask, CancelTask, ListTasks) while the middleware API uses camelCase (listMiddlewares, addMiddleware, getMiddleware, updateMiddleware, deleteMiddleware, reorderMiddlewares, validateMiddleware). For consistency with the existing GA4GH TES API in this codebase, consider using PascalCase for all operationIds in the middleware API.
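
A small Python sketch of how the naming convention could be checked mechanically. The helper names `to_pascal_case` and `non_pascal_operation_ids` are hypothetical, and the regular expression is just one reasonable reading of "PascalCase" (leading uppercase letter, alphanumerics only).

```python
import re
from typing import Iterable, List

# Assumed definition of PascalCase: starts with an uppercase letter,
# followed only by letters and digits.
PASCAL_CASE = re.compile(r"^[A-Z][A-Za-z0-9]*$")


def to_pascal_case(operation_id: str) -> str:
    """Upper-case the first character of a camelCase operationId."""
    return operation_id[:1].upper() + operation_id[1:]


def non_pascal_operation_ids(operation_ids: Iterable[str]) -> List[str]:
    """Return the operationIds that do not match the PascalCase pattern."""
    return [op for op in operation_ids if not PASCAL_CASE.match(op)]
```

Running `non_pascal_operation_ids` over the seven middleware operationIds listed in the comment would flag all of them, and `to_pascal_case` gives the suggested replacement for each.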

  local class paths or fetched from GitHub repositories. If order is not specified,
  the middleware is appended to the end of the stack. If order is specified,
  existing middlewares at that position or higher are shifted up by one.
operationId: addMiddleware
Copilot AI Jan 24, 2026

The operationId naming convention is inconsistent with the existing TES API. Consider using PascalCase (AddMiddleware instead of addMiddleware) to match the pattern used in the existing API (e.g., CreateTask, GetTask, etc. in pro_tes/api/9e9c5aa.task_execution_service.openapi.yaml).
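
The ordering rule in the addMiddleware description quoted above (append when no order is given, otherwise shift entries at that position or higher up by one) can be sketched in Python as follows. The function name `add_with_order`, the dict-based stack, and zero-based order values are assumptions for illustration; the controller logic is not part of this PR.

```python
from typing import Dict, List, Optional


def add_with_order(stack: List[Dict], new: Dict, order: Optional[int] = None) -> List[Dict]:
    """Return a new stack with `new` inserted according to the described rule.

    Each stack entry is a dict with at least an integer "order" key.
    """
    updated = [dict(m) for m in stack]  # copy so the caller's stack is untouched
    if order is None:
        # No order given: append after the current highest order.
        order = max((m["order"] for m in updated), default=-1) + 1
    else:
        # Order given: shift everything at that position or higher up by one.
        for m in updated:
            if m["order"] >= order:
                m["order"] += 1
    updated.append({**new, "order": order})
    return sorted(updated, key=lambda m: m["order"])
```

In a MongoDB-backed controller the shift would likely need to be applied atomically with the insert, which this in-memory sketch does not address.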

get:
  summary: Get middleware details
  description: Retrieve detailed information about a specific middleware by ID
  operationId: getMiddleware
Copilot AI Jan 24, 2026

The operationId naming convention is inconsistent with the existing TES API. Consider using PascalCase (GetMiddleware instead of getMiddleware) to match the pattern used in the existing API.

description: |
  Remove a middleware from the execution stack. By default performs soft delete
  (sets enabled=false). Use force=true query parameter for hard deletion.
operationId: deleteMiddleware
Copilot AI Jan 24, 2026

The operationId naming convention is inconsistent with the existing TES API. Consider using PascalCase (DeleteMiddleware instead of deleteMiddleware) to match the pattern used in the existing API.
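
The soft-delete behavior in the deleteMiddleware description quoted above can be sketched in Python like this. The in-memory `store` dict and the function name `delete_middleware` are stand-ins chosen for illustration; the real implementation would operate on the MongoDB collection.

```python
from typing import Dict


def delete_middleware(store: Dict[str, Dict], middleware_id: str, force: bool = False) -> str:
    """Soft-delete by default (enabled=False); remove the entry when force=True."""
    if middleware_id not in store:
        raise KeyError(f"middleware {middleware_id!r} not found")
    if force:
        # Hard delete: the document is removed entirely.
        del store[middleware_id]
        return "deleted"
    # Soft delete: keep the document, but take it out of the active stack.
    store[middleware_id]["enabled"] = False
    return "disabled"
```

Because a soft delete only flips `enabled`, the entry can be restored later by an update that sets `enabled` back to true.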

Comment on lines 572 to 591
ValidationRequest:
  type: object
  description: Request body for validating middleware code
  required:
    - class_path
  properties:
    class_path:
      type: string
      description: Class path or code to validate
      example: "pro_tes.plugins.middlewares.task_distribution.distance.TaskDistributionDistance"
    github_url:
      type: string
      description: GitHub URL for fetching middleware code
      nullable: true
      pattern: '^https://raw\.githubusercontent\.com/.+\.py$'
      example: "https://raw.githubusercontent.com/user/repo/main/middleware.py"
    code:
      type: string
      description: Raw Python code to validate (alternative to class_path)
      nullable: true
Copilot AI Jan 24, 2026

The ValidationRequest schema marks 'class_path' as required but also provides 'code' as an "alternative to class_path" (line 590). This is contradictory - if 'code' is truly an alternative, then 'class_path' should not be required. Consider using oneOf to express that either 'class_path' or 'code' is required, but not both, or clarify the validation logic in the description.

Member

Please address.
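
One way the relaxed rule could behave at runtime, sketched in Python. It assumes the "at least one of class_path, github_url, or code" reading suggested in the review (the "exactly one" reading is equally possible), and it only checks syntax of raw code with the standard library `ast` module. The function name `validate_request` is hypothetical; the response shape follows the ValidationResponse and ValidationErrorItem schemas.

```python
import ast
from typing import Any, Dict, List, Optional

SOURCE_FIELDS = ("class_path", "github_url", "code")


def validate_request(body: Dict[str, Optional[str]]) -> Dict[str, Any]:
    """Build a ValidationResponse-shaped dict for a ValidationRequest body."""
    errors: List[Dict[str, Any]] = []

    # Assumed rule: at least one source field must be provided.
    if not any(body.get(field) for field in SOURCE_FIELDS):
        errors.append({
            "line": 0,
            "column": 0,
            "message": "one of class_path, github_url or code is required",
            "severity": "error",
        })

    # If raw code is supplied, check that it parses as Python.
    code = body.get("code")
    if code:
        try:
            ast.parse(code)
        except SyntaxError as exc:
            errors.append({
                "line": exc.lineno or 0,
                "column": exc.offset or 0,
                "message": exc.msg,
                "severity": "error",
            })

    return {
        "valid": not errors,
        "message": "validation failed" if errors else "valid",
        "errors": errors,
        "warnings": [],
    }
```

Interface checks (for example, that the class implements the expected middleware methods) and fetching from `github_url` are deliberately left out of this sketch.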

middlewares:
  indexes:
    - keys:
        name: 1
Copilot AI Jan 24, 2026

The database index is created on the 'name' field, but the OpenAPI spec indicates that both 'name' and 'class_path' should be unique (based on the 400 error description mentioning 'duplicate name/class_path'). Consider adding indexes on both 'name' and 'class_path' fields with unique constraints to enforce this at the database level, or clarify whether only 'name' needs to be unique.

Suggested change
-        name: 1
+        name: 1
+      options:
+        "unique": True
+    - keys:
+        class_path: 1
+      options:
+        "unique": True

Member

Please address.

Author

@uniqueg I have already implemented and pushed this suggestion here; please check it. Thanks a lot!

Comment on lines 104 to 135
## File Structure

```
pro_tes/
├── api/
│   └── middleware_management.yaml  (OpenAPI specification)
└── config.yaml                     (FOCA integration)

docs/
└── middleware.md                   (This file)
```

## Documentation Deliverables

**API Documentation**: Comprehensive guide with request/response examples for each endpoint. Includes curl commands, common use cases, and troubleshooting tips.

**Architecture Decision Record**: Documents twelve major design decisions with rationale, alternatives considered, and consequences. Serves as reference for future development.

**Postman Collection**: Ready-to-use collection with fourteen pre-configured requests. Includes environment variables, test scripts, and example data for all scenarios.

**Quick Reference**: Single-page reference with essential endpoints, parameters, and response codes. Designed for daily development use.

**Validation Script**: Bash script that validates OpenAPI syntax using multiple tools. Checks for common errors like undefined schema references and invalid endpoint definitions.

## Testing Approach

This subtask focuses on specification validation rather than runtime testing since no executable code is implemented yet. Validation performed:

**YAML Syntax**: Verified file parses correctly as valid YAML without syntax errors.

**OpenAPI Compliance**: Confirmed specification follows OpenAPI 3.0 standards including required fields, valid schema definitions, and proper reference resolution.

Copilot AI Jan 24, 2026

The documentation references several files that are not included in this PR: 'docs/api/middleware_management.md', 'middleware_management.postman_collection.json', 'QUICK_REFERENCE.md', 'docs/architecture/middleware_api_design.md', and 'scripts/validate_openapi.sh'. Either these files should be included in this PR, or the documentation should clarify that these are planned deliverables for future PRs. The "Documentation Deliverables" section also describes these files in detail, which creates confusion about what is actually delivered in this subtask.

Comment on lines 647 to 671
ErrorResponse:
  type: object
  description: Standard error response
  required:
    - error
    - message
  properties:
    error:
      type: string
      description: Error type/code
      example: "MiddlewareNotFound"
    message:
      type: string
      description: Human-readable error message
      example: "Middleware with ID '507f1f77bcf86cd799439011' not found"
    details:
      type: object
      description: Additional error details
      nullable: true
      additionalProperties: true
    timestamp:
      type: string
      format: date-time
      description: Error timestamp
      example: "2026-01-24T10:30:00Z"
Copilot AI Jan 24, 2026

The ErrorResponse schema is inconsistent with the existing error handling convention. The FOCA exception configuration (pro_tes/config.yaml:120) requires error responses to have 'message' and 'code' fields, but the ErrorResponse schema defines 'error', 'message', 'details', and 'timestamp' fields without a 'code' field. For consistency with the existing error handling (see pro_tes/exceptions.py:52-121), the ErrorResponse schema should include a 'code' field containing the HTTP status code, and consider using 'code' instead of or in addition to 'error' for the error type.
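
A minimal Python sketch of an error body that carries the `message` and `code` members the comment refers to, while keeping `details` optional. The helper name `build_error_response` is hypothetical, and whether `error` and `timestamp` should remain in the schema is left open here.

```python
from typing import Any, Dict, Optional


def build_error_response(
    code: int, message: str, details: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Return an error body with the 'code' and 'message' members always present."""
    body: Dict[str, Any] = {"code": code, "message": message}
    if details is not None:
        # Optional extra context, mirroring the nullable 'details' property.
        body["details"] = details
    return body
```

For example, `build_error_response(404, "Middleware not found")` yields a body with exactly the two required members.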

Comment on lines 156 to 172
- "pro_tes.plugins.middlewares.task_distribution.random.TaskDistributionRandom"

No newline at end of file
Copilot AI Jan 24, 2026

Trailing whitespace on empty line. Remove the trailing spaces for consistency with code style.

Suggested change
- "pro_tes.plugins.middlewares.task_distribution.random.TaskDistributionRandom"
- "pro_tes.plugins.middlewares.task_distribution.random.TaskDistributionRandom"

Comment on lines 84 to 90
```yaml
specs:
  - path:
      - api/middleware_management.yaml
    add_operation_fields:
      x-openapi-router-controller: pro_tes.api.middlewares.controllers
    disable_auth: True
```
Copilot AI Jan 24, 2026

The FOCA configuration for the middleware management API is setting disable_auth: True, which exposes all middleware management endpoints without any authentication. In any non-isolated environment this allows an unauthenticated attacker to list, create, modify, or delete middlewares, potentially leading to arbitrary code execution or unauthorized changes to workflow behavior. Require authentication for these endpoints (e.g., by wiring them into your existing security scheme instead of using disable_auth) and, if needed, restrict unauthenticated access to development-only configs that are not deployed to production.

@keshxvdayal keshxvdayal force-pushed the feature/middleware-api-spec branch from b3118e8 to 5025c12 Compare January 24, 2026 18:02
Member

@uniqueg uniqueg left a comment

I didn't have a chance to review everything yet, but please go ahead and address the existing comments.

    url: https://www.apache.org/licenses/LICENSE-2.0.html

servers:
  - url: /ga4gh/tes/v1
Member

Consider hosting these elsewhere, given that these are not GA4GH (TES) endpoints. Just /protes/v1 would be fine, I guess.

Comment on lines 34 to 50
- name: limit
  in: query
  description: Maximum number of results to return
  required: false
  schema:
    type: integer
    minimum: 1
    maximum: 100
    default: 50
- name: offset
  in: query
  description: Number of results to skip (for pagination)
  required: false
  schema:
    type: integer
    minimum: 0
    default: 0
Member

Please double check to ensure that pagination is in line with the GA4GH pagination guide that was recently published: https://github.com/ga4gh/TASC/blob/main/recommendations/API%20pagination%20guide.md
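
For reference, the limit/offset parameters as quoted above (limit 1 to 100 with default 50, offset at least 0 with default 0) behave as in the following Python sketch, which returns a dict shaped like the MiddlewareList schema. The function name `paginate` is hypothetical. This sketch does not implement the GA4GH pagination guide; it only illustrates the scheme under review so it can be compared against that guide.

```python
from typing import Any, Dict, List

DEFAULT_LIMIT = 50
MAX_LIMIT = 100


def paginate(items: List[Any], limit: int = DEFAULT_LIMIT, offset: int = 0) -> Dict[str, Any]:
    """Slice `items` according to the quoted limit/offset constraints."""
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    if offset < 0:
        raise ValueError("offset must be >= 0")
    return {
        "middlewares": items[offset:offset + limit],
        "total": len(items),
        "limit": limit,
        "offset": offset,
    }
```

An offset past the end of the collection simply yields an empty `middlewares` list while `total` still reports the full count.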

summary: Add local middleware
value:
  name: "Distance-based Router"
  class_path: "pro_tes.plugins.middlewares.task_distribution.distance.TaskDistributionDistance"
Member

Note that this only works if the package containing the class is installed. If middlewares are shipped within proTES (which is a bit of an antipattern and will probably not be supported in the future), this will be fine - but if not, I think we need to come up with something better.

Possibly we need more than one parameter: (1) a package root path (containing a setup.py or pyproject.toml file that allows installation), OR a package name (to be installed from PyPI or another registry), OR a Git repo; AND (2) a class path as an entry point, so that we know what code to execute in the package once it has been installed. This also makes the use of different sources (local, Git, package registry) consistent.

Comment on lines 122 to 123
- "pro_tes.plugins.middlewares.task_distribution.distance.TaskDistributionDistance"
- "pro_tes.plugins.middlewares.task_distribution.random.TaskDistributionRandom"
Member

What about combinations of local, GitHub, etc. middlewares? Not all middlewares need to come from the same source, so how is the mapping done? I think the model needs to be structured differently: each middleware will need a URI (local path, GitHub URL, package registry), an entry point (class path) and possibly a type (local, Git, package registry). See the comment above for more details.
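
A rough Python sketch of the per-middleware source model described in this comment: a type, a URI, and an entry point for each source, so that a fallback group can mix sources freely. The names `SourceType`, `MiddlewareSource`, and `load_entry_point` are hypothetical, and installing the package from its URI is out of scope here; the loader assumes the package is already importable.

```python
import importlib
from dataclasses import dataclass
from enum import Enum
from typing import Any


class SourceType(str, Enum):
    LOCAL = "local"
    GIT = "git"
    REGISTRY = "registry"


@dataclass(frozen=True)
class MiddlewareSource:
    type: SourceType   # where the package comes from
    uri: str           # local path, Git URL, or package name in a registry
    entry_point: str   # dotted class path inside the installed package


def load_entry_point(entry_point: str) -> Any:
    """Import and return the object named by a dotted path such as 'pkg.mod.Class'."""
    module_path, _, attr_name = entry_point.rpartition(".")
    if not module_path:
        raise ValueError("entry_point must be a dotted path like 'pkg.module.Class'")
    module = importlib.import_module(module_path)
    return getattr(module, attr_name)
```

With this shape, a fallback group is simply a list of `MiddlewareSource` objects, each resolved independently, regardless of where its package comes from.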

Member

@uniqueg uniqueg left a comment

Still some ways to go, but we're getting there! :)

Comment on lines 13 to 58
The Middleware Management API is defined using OpenAPI 3.0 specification and provides seven REST endpoints for complete middleware lifecycle management.

### API Endpoints

**List Middlewares** - GET /protes/v1/middlewares
Returns all configured middlewares with pagination and filtering support. Results are sorted by execution order by default. Supports filtering by enabled status and source type.

**Add Middleware** - POST /protes/v1/middlewares
Creates a new middleware in the execution stack. Supports loading from local packages, GitHub repositories, or PyPI packages. Automatically handles order assignment and stack shifting.

**Get Middleware Details** - GET /protes/v1/middlewares/{middleware_id}
Retrieves detailed information about a specific middleware including configuration, metadata, and execution statistics.

**Update Middleware** - PUT /protes/v1/middlewares/{middleware_id}
Updates middleware configuration. Only allows modification of name, order, config parameters, and enabled status. Package path and entry point cannot be changed for security reasons.

**Delete Middleware** - DELETE /protes/v1/middlewares/{middleware_id}
Removes a middleware from the stack. Supports soft delete (disable) by default and hard delete with force parameter.

**Reorder Stack** - PUT /protes/v1/middlewares/reorder
Reorders the entire middleware execution stack by accepting an ordered array of middleware IDs.

**Validate Code** - POST /protes/v1/middlewares/validate
Validates middleware code before creation. Checks Python syntax, required interface implementation, and security constraints.

### Data Model

The API uses comprehensive schema definitions to structure request and response data:

**MiddlewareConfig**: Complete middleware representation including ID, name, package information (source type, package path, entry point), execution order, enabled status, configuration parameters, and timestamps.

**MiddlewareCreate**: Request body for creating new middleware. Includes name, package source configuration (local path, GitHub URL, or PyPI package), entry point (class path), optional order, enabled flag, and configuration dict.

**MiddlewareUpdate**: Request body for updates. Limited to name, order, config, and enabled fields to prevent unauthorized code changes.

**MiddlewareList**: Paginated list response containing middleware array, total count, page information, and navigation tokens following GA4GH pagination guidelines.

**MiddlewareCreateResponse**: Response after successful creation including the middleware ID, assigned order, and success message.

**MiddlewareOrder**: Request body for reordering containing an array of middleware IDs in desired execution order.

**ValidationRequest**: Code validation request containing package source information and entry point to validate.

**ValidationResponse**: Validation result including validity boolean, validation messages, error details with line numbers, and warnings.

**ErrorResponse**: Standard error response with HTTP status code, error message, and optional details.
Member

You can replace all of this with just a reference to the Swagger Editor, with something like this: https://editor.swagger.io/?url=https://raw.githubusercontent.com/elixir-cloud-aai/proTES/refs/heads/feature/middleware-api-spec/pro_tes/api/middleware_management.yaml

The documentation is much more comprehensive, easier to parse and play with, and always up to date. Obviously you need to adapt the link to point to the path it will have once the PR is merged, not the one from this branch.


**Order-Based Execution**: Middlewares execute in ascending order. Lower order values run first. This provides clear, predictable execution flow that's easy to understand and debug.
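
To illustrate the ordering rule, here is a minimal sketch that runs enabled middlewares in ascending `order`. The `handler` key and the `run_in_order` function are hypothetical stand-ins for whatever the controllers end up using:

```python
def run_in_order(middlewares, request):
    """Apply each enabled middleware to the request, lowest 'order' first."""
    active = [mw for mw in middlewares if mw.get("enabled", True)]
    for mw in sorted(active, key=lambda item: item["order"]):
        # 'handler' is an assumed callable that takes and returns a request.
        request = mw["handler"](request)
    return request
```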

**Fallback Group Support**: Allows grouping multiple middleware sources in a single middleware entry. If the first middleware fails, the system automatically tries the next one in the list. Each middleware in a fallback group specifies its own source, package path, and entry point.
Member

Clarify: Not the next one in the list, but the next one in the fallback stack/group.
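
The distinction can be sketched as follows, assuming a fallback group is an ordered list of callables (a simplification; `run_fallback_group` is hypothetical). On failure, only the next member of the same group is tried, not the next entry of the overall stack:

```python
def run_fallback_group(group, request):
    """Try each member of one fallback group until one succeeds."""
    last_error = None
    for candidate in group:
        try:
            return candidate(request)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            last_error = exc
    raise RuntimeError("every middleware in the fallback group failed") from last_error
```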


**Soft Delete Default**: DELETE operations disable rather than remove middlewares by default. This preserves execution history and allows easy rollback. Hard delete requires explicit force parameter.
Member

This is an interesting idea, but I'm not sure about this implementation, as it raises some questions.

First of all, it breaks with REST semantics. DELETE operations delete a resource. If you want to modify a resource, you should use PATCH (or possibly PUT), e.g., on an enabled parameter in the data model. This would also make it easy to reactivate it (no new endpoint needed).

Also, can this be applied to an entire fallback group at once?

And how do you actually roll back the entire state - if reproducibility is your problem here?

In any case, how do you ensure reproducibility in another proTES instance?

To be honest, there are so many questions around this that I think this should be disabled for now. Rather, you should think about how you can capture enough information about the middleware stack (put a whole copy in it, including versions) in the task logs that it can be recreated in another instance (or the same one in the future).

Speaking of versions: How can different versions of middlewares be handled? Say I want to update a repo from Git to a new commit/tag? Or a new PyPI version? And how are multiple versions handled? Do you plan to sandbox/isolate the installation artifacts or dependencies for each middleware? (You absolutely should.)

This brings me to another question: How does DELETE work? Does it just remove the middleware from the stack, or does it also undo any installations? Because over time, conflicts may accumulate if isolation is not in place. And even if it is, this grows over time...

Of course, most of these points are discussion points for the next PR. For the API model in this PR, I would just remove any provisioning for soft deletion.


### Key Features
Member

None of these are currently available, so this should only make it into the docs once these features are actually implemented. You really need to understand and internalize the continuous documentation philosophy. It doesn't matter if you think it's there next week, tomorrow or even in 2 hours. Each PR should update the documentation to represent the state of that code change - you never know what will happen.

Member

@uniqueg, Feb 8, 2026

Given that this PR does not have any controllers (which is okay, it shouldn't have), none of these features are currently supported and should, in principle, not be in the docs until the next PR. Same goes for anything else that depends on the controllers being implemented and functional.

**Immutable Package Configuration**: Once created, a middleware's package source and entry point cannot be changed. This prevents security risks from code substitution attacks. To change implementation, users must delete and recreate.

**Multiple Package Sources**: Supports loading middleware from:
- **Local packages**: Installed Python packages with a class path entry point
Member

This is actually discouraged and should be deprecated, to be removed in the future (once the middlewares inside proTES are migrated to a new repo). The reason being that it's difficult/impossible to reproduce. Plus, in most cases, users won't have access to the running instance's file system, so it's really just useful for developers and local deployments.

For now, please note that it's for development purposes and is deprecated, and list it as the third, not the first option.

description: Complete middleware configuration object
required:
- _id
- name
Member

Maybe shouldn't be required? You could make it nullable and use the package or repo name etc. as default if the user doesn't supply one.
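
One way such a default could be derived is sketched below. The `type`, `url`, `package`, and `entry_point` keys and the `github`/`pypi` values are assumptions about the source object made for illustration, and both helper names are hypothetical:

```python
from __future__ import annotations

from urllib.parse import urlparse


def default_name(source: dict) -> str:
    """Derive a display name from the middleware source (assumed field names)."""
    if source["type"] == "github":
        repo = urlparse(source["url"]).path.rstrip("/").rsplit("/", 1)[-1]
        return repo[:-4] if repo.endswith(".git") else repo
    if source["type"] == "pypi":
        return source["package"]
    # Fall back to the class name at the end of the entry point path.
    return source["entry_point"].rsplit(".", 1)[-1]


def resolve_name(name: str | None, source: dict) -> str:
    """Use the user-supplied name if present, else the derived default."""
    return name if name else default_name(source)
```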

- source
- order
Member

This should also be nullable. Make it easy for people to use the API.

Comment on lines +494 to +495
- created_at
- updated_at
Member

These should be set by the system, so should be nullable as well - a user shouldn't supply these.

But now I'm realizing why you require all of these - you are using the model in the request AND response?

Check out the readOnly and writeOnly keywords in OpenAPI. With these, you can give more fine-grained control over what you expect from the user in the request and what you need in the response. In any case, this may not completely solve the problem. If you want to always respond with a param, just implement it like that. But don't require it from the user if it's not absolutely necessary.
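
To make the effect concrete: in OpenAPI 3.0, a property that is both `readOnly: true` and listed under `required` is required in responses only, and the reverse holds for `writeOnly`. The sketch below mirrors that rule on a schema written as a Python dict. The schema is an illustrative reduction of MiddlewareConfig, not the one in this PR, and `required_for` is a hypothetical helper:

```python
from __future__ import annotations

# Illustrative schema only; property set and flags are assumptions.
MIDDLEWARE_CONFIG = {
    "type": "object",
    "required": ["_id", "source", "created_at", "updated_at"],
    "properties": {
        "_id": {"type": "string", "readOnly": True},
        "name": {"type": "string", "nullable": True},
        "source": {"type": "object"},
        "order": {"type": "integer", "minimum": 0},
        "created_at": {"type": "string", "format": "date-time", "readOnly": True},
        "updated_at": {"type": "string", "format": "date-time", "readOnly": True},
    },
}


def required_for(schema: dict, direction: str) -> list[str]:
    """List the required properties for a 'request' or a 'response'."""
    skip = "readOnly" if direction == "request" else "writeOnly"
    props = schema.get("properties", {})
    return [
        name
        for name in schema.get("required", [])
        if not props.get(name, {}).get(skip, False)
    ]
```

With this schema, a request only has to carry `source`, while a response still has to include the system-set fields. How strictly a given validator enforces the rule is worth checking before relying on it.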

order:
type: integer
description: Execution order (0 = first)
minimum: 0
Member

Use a default!

source:
$ref: '#/components/schemas/MiddlewareSource'

ValidationResponse:
Member

I cannot reason about this without seeing the implementation (or implementation plan). Please remove it from this PR and re-add it when you implement it. Same with ValidationRequest and the validate endpoint. In fact, leave this out of the next PR, too. It's a bonus feature and does not have high priority at this point.

@uniqueg changed the title from "feat: add middleware management OpenAPI Spec" to "feat: add middleware management API specs" on Feb 8, 2026

The API includes comprehensive security controls:

**Authentication Required**: All middleware management endpoints require authentication through the existing proTES security scheme.
Member

Not addressed?

Comment on lines +193 to +204
## Dependencies

**External**:
- OpenAPI 3.0 specification format
- FOCA framework (Flask-based configuration)
- Connexion (OpenAPI request routing)
- MongoDB (persistence layer)

**Internal**:
- Existing proTES API structure
- Current middleware plugin architecture
- MongoDB database configuration
Member

Not addressed?

Comment on lines +206 to +208
## API Specification Location

The complete OpenAPI 3.0 specification is available at: `pro_tes/api/middleware_management.yaml`
Member

This means: Remove it here. Not addressed.

Comment on lines +59 to +64
- name: enabled
in: query
description: Filter by enabled status
required: false
schema:
type: boolean
Member

Not addressed.
