Commit 0066e5c

datamate documents
🎉 add full featured docs
2 parents 0b3452e + 8348d45 commit 0066e5c

51 files changed

Lines changed: 13159 additions & 311 deletions

Lines changed: 205 additions & 0 deletions
@@ -0,0 +1,205 @@
---
title: API Reference
description: DataMate API documentation
weight: 4
---

{{% pageinfo %}}
DataMate provides a complete set of REST APIs for programmatic access to all core features.
{{% /pageinfo %}}

## API Overview

The DataMate API follows a REST architecture and exposes the following services:

- **Data Management API**: Dataset and file management
- **Data Cleaning API**: Data cleaning task management
- **Data Collection API**: Data collection task management
- **Data Annotation API**: Data annotation task management
- **Data Synthesis API**: Data synthesis task management
- **Data Evaluation API**: Data evaluation task management
- **Operator Market API**: Operator management
- **RAG Indexer API**: Knowledge base and vector retrieval
- **Pipeline Orchestration API**: Pipeline orchestration management

## Authentication

DataMate supports two authentication methods:

### JWT Authentication (Recommended)

```http
GET /api/v1/data-management/datasets
Authorization: Bearer <your-jwt-token>
```

To obtain a JWT token:

```http
POST /api/v1/auth/login
Content-Type: application/json

{
  "username": "admin",
  "password": "password"
}
```

Response:

```json
{
  "token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
  "expiresIn": 86400
}
```

### API Key Authentication

```http
GET /api/v1/data-management/datasets
X-API-Key: <your-api-key>
```

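The two authentication methods above can be sketched in Python with only the standard library. This is a minimal illustration, not an official client: `BASE_URL` and the helper names (`build_login_request`, `auth_headers`, `extract_token`) are assumptions introduced for the example, and the requests are built but not sent.

```python
import json
import urllib.request
from typing import Optional

# Assumed host and port for illustration; adjust for your deployment.
BASE_URL = "http://localhost:8092"


def build_login_request(username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) the POST /api/v1/auth/login request."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/auth/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_token(response_body: bytes) -> str:
    """Read the `token` field from a login response body."""
    return json.loads(response_body)["token"]


def auth_headers(token: Optional[str] = None, api_key: Optional[str] = None) -> dict:
    """Return request headers for JWT (preferred) or API key authentication."""
    if token:
        return {"Authorization": f"Bearer {token}"}
    if api_key:
        return {"X-API-Key": api_key}
    raise ValueError("either token or api_key is required")
```

To send the login request, pass it to `urllib.request.urlopen` and hand the response body to `extract_token`; the resulting headers can then be attached to any later request.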
## Common Response Format

### Success Response

```json
{
  "code": 200,
  "message": "success",
  "data": {
    // Response data
  }
}
```

### Error Response

```json
{
  "code": 400,
  "message": "Bad Request",
  "error": "Invalid parameter: datasetId",
  "timestamp": "2024-01-15T10:30:00Z",
  "path": "/api/v1/data-management/datasets"
}
```

### Paged Response

```json
{
  "content": [],
  "page": 0,
  "size": 20,
  "totalElements": 100,
  "totalPages": 5,
  "first": true,
  "last": false
}
```

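A client typically needs to unwrap the success envelope, surface error responses, and walk paged results. The sketch below shows one way to do that against the three shapes above. `ApiError`, `unwrap`, and `iter_pages` are hypothetical names, and treating any `code` of 400 or higher as an error is an assumption, not documented behavior.

```python
from typing import Any, Callable, Iterator, Optional


class ApiError(Exception):
    """Raised when a response envelope carries an error code."""

    def __init__(self, code: int, message: str, detail: Optional[str] = None):
        super().__init__(f"{code} {message}: {detail}")
        self.code = code
        self.message = message
        self.detail = detail


def unwrap(payload: dict) -> Any:
    """Return `data` from a success envelope, or raise ApiError."""
    code = payload.get("code")
    # Assumption: codes of 400 and above indicate an error response.
    if code is not None and code >= 400:
        raise ApiError(code, payload.get("message", ""), payload.get("error"))
    return payload.get("data")


def iter_pages(fetch_page: Callable[[int], dict]) -> Iterator[Any]:
    """Yield every item across pages; fetch_page(n) returns one paged response."""
    page = 0
    while True:
        body = fetch_page(page)
        yield from body["content"]
        # Stop when `last` is true; also stop if the field is missing,
        # so a response without it cannot cause an endless loop.
        if body.get("last", True):
            break
        page += 1
```

`fetch_page` is left to the caller so the same loop works with any HTTP library; it only has to return the decoded JSON body for the requested page number.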
## API Endpoints

All paths below are relative to the versioned prefix `/api/v1`.

### Data Management

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/data-management/datasets` | GET | Get dataset list |
| `/data-management/datasets` | POST | Create dataset |
| `/data-management/datasets/{id}` | GET | Get dataset details |
| `/data-management/datasets/{id}` | PUT | Update dataset |
| `/data-management/datasets/{id}` | DELETE | Delete dataset |
| `/data-management/datasets/{id}/files` | GET | Get file list |
| `/data-management/datasets/{id}/files/upload` | POST | Upload files |

### Data Cleaning

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/data-cleaning/tasks` | GET | Get cleaning task list |
| `/data-cleaning/tasks` | POST | Create cleaning task |
| `/data-cleaning/tasks/{id}` | GET | Get task details |
| `/data-cleaning/tasks/{id}` | PUT | Update task |
| `/data-cleaning/tasks/{id}` | DELETE | Delete task |
| `/data-cleaning/tasks/{id}/execute` | POST | Execute task |

### Data Collection

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/data-collection/tasks` | GET | Get collection task list |
| `/data-collection/tasks` | POST | Create collection task |
| `/data-collection/tasks/{id}` | GET | Get task details |
| `/data-collection/tasks/{id}/execute` | POST | Execute collection task |

### Data Synthesis

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/data-synthesis/tasks` | GET | Get synthesis task list |
| `/data-synthesis/tasks` | POST | Create synthesis task |
| `/data-synthesis/templates` | GET | Get instruction template list |
| `/data-synthesis/templates` | POST | Create instruction template |

### Operator Market

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/operator-market/operators` | GET | Get operator list |
| `/operator-market/operators` | POST | Publish operator |
| `/operator-market/operators/{id}` | GET | Get operator details |
| `/operator-market/operators/{id}/install` | POST | Install operator |

### RAG Indexer

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/rag/knowledge-bases` | GET | Get knowledge base list |
| `/rag/knowledge-bases` | POST | Create knowledge base |
| `/rag/knowledge-bases/{id}/documents` | POST | Upload documents |
| `/rag/knowledge-bases/{id}/search` | POST | Vector search |

## Error Codes

| Code | Description |
|------|-------------|
| 200 | Success |
| 201 | Created |
| 400 | Bad Request |
| 401 | Unauthorized |
| 403 | Forbidden |
| 404 | Not Found |
| 409 | Conflict |
| 429 | Too Many Requests |
| 500 | Internal Server Error |

## Rate Limiting

API calls are rate limited:

- **Default limit**: 1000 requests/hour
- **Burst limit**: 100 requests/minute

Requests that exceed a limit receive `429 Too Many Requests`.

Each response carries the current rate limit state in its headers:

```http
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 999
X-RateLimit-Reset: 1642252800
```

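A client can use these headers to decide how long to pause before retrying. The sketch below assumes that `X-RateLimit-Reset` is a Unix timestamp in seconds, which the sample value suggests but the text does not state; `seconds_until_reset` is a hypothetical helper name.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Return how long to wait before the next request.

    Returns 0.0 while requests remain in the current window.
    Assumes X-RateLimit-Reset holds Unix epoch seconds.
    """
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    reset_at = float(headers.get("X-RateLimit-Reset", now))
    # Never return a negative wait if the reset time has already passed.
    return max(0.0, reset_at - now)
```

The lookup is case-sensitive when `headers` is a plain `dict`; most HTTP libraries return a case-insensitive mapping, which works unchanged here.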
## Version Management

API versions are specified through URL paths:

- Current version: `/api/v1/`
- Future versions: `/api/v2/`

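Because the version lives in the URL path, keeping it in one place makes a later switch a single-argument change. A minimal sketch, where `api_url` is a hypothetical helper and the host is a placeholder:

```python
def api_url(base: str, path: str, version: str = "v1") -> str:
    """Join a host, the /api/<version>/ prefix, and an endpoint path."""
    return f"{base.rstrip('/')}/api/{version}/{path.lstrip('/')}"


# Example: the dataset list endpoint under the current version.
datasets_url = api_url("http://localhost:8092", "/data-management/datasets")
```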
## Related Documentation

- [Developer Guide](/docs/developer-guide/) - Architecture and development guide
- [OpenAPI Specifications](https://github.com/ModelEngine-Group/DataMate/tree/main/backend/openapi/specs) - Complete OpenAPI specs
Lines changed: 179 additions & 0 deletions
@@ -0,0 +1,179 @@
---
title: Data Management API
description: Dataset and file management API
weight: 1
---

{{% pageinfo %}}
The Data Management API supports creating, querying, updating, and deleting datasets and files.
{{% /pageinfo %}}

## Basic Information

- **Base URL**: `http://localhost:8092/api/v1/data-management`
- **Authentication**: JWT / API Key
- **Content-Type**: `application/json`

The request paths in the examples below are relative to `http://localhost:8092/api/v1`.

## Dataset Management

### Get Dataset List

```http
GET /data-management/datasets?page=0&size=20&type=text
```

**Query Parameters**:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| page | integer | No | Page number, starts from 0 |
| size | integer | No | Page size, default 20 |
| type | string | No | Dataset type filter |
| tags | string | No | Tag filter, comma-separated |
| keyword | string | No | Keyword search |
| status | string | No | Status filter |

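The query parameters above can be assembled with the standard library so that optional filters are simply omitted when unset. This is an illustrative sketch: `dataset_list_query` is a hypothetical helper, and its defaults mirror the table (page 0, size 20).

```python
from typing import Iterable, Optional
from urllib.parse import urlencode


def dataset_list_query(
    page: int = 0,
    size: int = 20,
    dataset_type: Optional[str] = None,
    tags: Optional[Iterable[str]] = None,
    keyword: Optional[str] = None,
    status: Optional[str] = None,
) -> str:
    """Build the query string for GET /data-management/datasets."""
    params = {
        "page": page,
        "size": size,
        "type": dataset_type,
        # The tags filter is a single comma-separated string.
        "tags": ",".join(tags) if tags else None,
        "keyword": keyword,
        "status": status,
    }
    # Drop unset filters, then percent-encode the rest.
    return urlencode({k: v for k, v in params.items() if v is not None})
```

Note that `urlencode` percent-encodes the comma in `tags` as `%2C`, which is the standard encoding for a literal comma in a query value.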
**Response Example**:

```json
{
  "content": [
    {
      "id": "dataset-001",
      "name": "text_dataset",
      "description": "Text dataset",
      "type": {
        "code": "TEXT",
        "name": "Text"
      },
      "status": "ACTIVE",
      "fileCount": 1000,
      "totalSize": 1073741824,
      "createdAt": "2024-01-15T10:00:00Z"
    }
  ],
  "page": 0,
  "size": 20,
  "totalElements": 1
}
```

### Create Dataset

```http
POST /data-management/datasets
Content-Type: application/json

{
  "name": "my_dataset",
  "description": "My dataset",
  "type": "TEXT",
  "tags": ["training", "nlp"]
}
```

### Get Dataset Details

```http
GET /data-management/datasets/{datasetId}
```

### Update Dataset

```http
PUT /data-management/datasets/{datasetId}
Content-Type: application/json

{
  "name": "updated_dataset",
  "description": "Updated description"
}
```

### Delete Dataset

```http
DELETE /data-management/datasets/{datasetId}
```

## File Management

### Get File List

```http
GET /data-management/datasets/{datasetId}/files?page=0&size=20
```

### Upload File

```http
POST /data-management/datasets/{datasetId}/files/upload/chunk
Content-Type: multipart/form-data
```

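The upload endpoint expects a `multipart/form-data` body. The sketch below builds such a body by hand with the standard library, to show what goes on the wire. The form field name (`"file"` in the usage note) is an assumption, as is the absence of extra chunk metadata fields; consult the OpenAPI spec for the actual part names. `encode_multipart` is a hypothetical helper.

```python
import uuid
from typing import Tuple


def encode_multipart(
    field_name: str,
    filename: str,
    content: bytes,
    content_type: str = "application/octet-stream",
) -> Tuple[bytes, str]:
    """Encode one file as a multipart/form-data body.

    Returns (body, content_type_header). The header value carries the
    boundary and must be sent as the request's Content-Type.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + content + tail, f"multipart/form-data; boundary={boundary}"
```

For example, `encode_multipart("file", "a.txt", b"hello")` returns a body and header value that can be passed as `data` and `Content-Type` to an HTTP client, with the field name `"file"` being a placeholder.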
### Download File

```http
GET /data-management/datasets/{datasetId}/files/{fileId}/download
```

### Delete File

```http
DELETE /data-management/datasets/{datasetId}/files/{fileId}
```

## Error Response

```json
{
  "code": 400,
  "message": "Bad Request",
  "error": "Invalid parameter: datasetId",
  "timestamp": "2024-01-15T10:30:00Z",
  "path": "/api/v1/data-management/datasets"
}
```

## SDK Usage

### Python

```python
from datamate import DataMateClient

client = DataMateClient(
    base_url="http://localhost:8080",
    api_key="your-api-key"
)

# Get datasets
datasets = client.data_management.get_datasets()

# Create dataset
dataset = client.data_management.create_dataset(
    name="my_dataset",
    type="TEXT"
)
```

### cURL

```bash
# Get datasets
curl -X GET "http://localhost:8092/api/v1/data-management/datasets" \
  -H "Authorization: Bearer your-jwt-token"

# Create dataset
curl -X POST "http://localhost:8092/api/v1/data-management/datasets" \
  -H "Authorization: Bearer your-jwt-token" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "my_dataset",
    "type": "TEXT"
  }'
```

## Related Documentation

- [Data Management](/docs/user-guide/data-management/) - User guide
- [OpenAPI Specs](https://github.com/ModelEngine-Group/DataMate/blob/main/backend/openapi/specs/data-management.yaml) - Complete specs
