Commit 6ab93d6

Merge pull request #167 from RECETOX/copilot/fix-3750d4c6-183c-42cf-9227-986c01132cfb
Add comprehensive documentation for adding new services to MSMetaEnhancer
2 parents 8fedcdf + 5c7815c commit 6ab93d6

1 file changed

Lines changed: 256 additions & 0 deletions

CONTRIBUTING.md

@@ -35,3 +35,259 @@ In case you feel like you've made a valuable contribution, but you don't know ho
1. update the [CHANGELOG](CHANGELOG.md) file with your change;
1. [push](http://rogerdudler.github.io/git-guide/) your feature branch to (your fork of) this repository on GitHub;
1. create the pull request, e.g. following the instructions [here](https://help.github.com/articles/creating-a-pull-request/).

## Adding a new service to MSMetaEnhancer

MSMetaEnhancer has a modular architecture that makes it easy to add new conversion services. There are two main types of converters:

- **Web Converters**: Services that make HTTP requests to external APIs
- **Compute Converters**: Services that perform local computations

### Architecture Overview

The MSMetaEnhancer system consists of several key components:

1. **Base Converter Classes**:
   - `Converter`: Abstract base class for all converters
   - `WebConverter`: Base class for web-based API services
   - `ComputeConverter`: Base class for local computation services

2. **Job System**:
   - `Job`: Represents a conversion task (source → target using a specific converter)
   - Jobs are defined as tuples: `(source_attribute, target_attribute, converter_name)`

3. **Converter Builder**:
   - Automatically discovers and instantiates available converters
   - Manages both web and compute converters

4. **Dynamic Method Creation**:
   - Converters automatically generate methods like `compound_name_to_inchi()`
   - Method names are based on the conversions list defined in each converter
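
The relationship between job tuples and generated method names can be illustrated with a small stand-alone sketch. It does not use MSMetaEnhancer's real base classes: `SketchConverter`, `create_methods`, and the lookup table are hypothetical names chosen for illustration, and the actual `create_top_level_conversion_methods` may work differently.

```python
class SketchConverter:
    """Toy stand-in for a converter; not MSMetaEnhancer's actual base class."""

    def __init__(self):
        # Same tuple shape as in the guide: (source, target, implementing method)
        conversions = [("compound_name", "inchi", "name_to_inchi")]
        self.create_methods(conversions)

    def create_methods(self, conversions):
        # Expose each conversion under the name '<source>_to_<target>'
        # by pointing it at the method that implements it.
        for source, target, method_name in conversions:
            setattr(self, f"{source}_to_{target}", getattr(self, method_name))

    def name_to_inchi(self, name):
        # Hypothetical lookup table standing in for a real service call.
        table = {"water": "InChI=1S/H2O/h1H2"}
        return {"inchi": table[name]} if name in table else {}


converter = SketchConverter()

# A job names the source attribute, the target attribute, and the converter.
job = ("compound_name", "inchi", "SketchConverter")
method = getattr(converter, f"{job[0]}_to_{job[1]}")
result = method("water")
print(result)
```

Under this assumed naming scheme, a job can be resolved to a callable purely from its first two fields, which is why the attribute names in the `conversions` list need to match the names used in jobs.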

### Adding a new Web Converter

To add a new web-based service, follow these steps:

#### 1. Create the converter file

Create a new Python file in `MSMetaEnhancer/libs/converters/web/` named after your service (e.g., `MyService.py`).

#### 2. Implement the converter class

```python
from MSMetaEnhancer.libs.converters.web.WebConverter import WebConverter


class MyService(WebConverter):
    """
    Brief description of what your service does.

    Service URL: https://example.com/api
    """

    def __init__(self, session):
        super().__init__(session)

        # Define the service endpoints
        self.endpoints = {
            'MyService': 'https://api.example.com/v1/'
        }

        # Define the conversions this service supports
        conversions = [
            ('source_attr', 'target_attr', 'conversion_method'),
            # Add more conversions as needed
        ]
        self.create_top_level_conversion_methods(conversions)

        # Add rate limiting if needed (optional; requires importing Throttler)
        # self.throttler = Throttler(rate_limit=5)  # 5 requests per second

    async def conversion_method(self, input_data):
        """
        Implement the actual conversion logic.

        :param input_data: The input data to convert
        :return: Dictionary with converted data
        """
        # Build the API request
        args = f'endpoint/{input_data}'

        # Make the request (with throttling if configured)
        response = await self.query_the_service('MyService', args)

        # Parse and return the result
        if response:
            return self.parse_response(response)
        return {}

    def parse_response(self, response):
        """
        Parse the API response and extract relevant data.

        :param response: Raw API response
        :return: Dictionary with parsed data
        """
        # Implement response parsing logic here.
        # Placeholder: replace with extraction specific to your service's format.
        parsed_value = response
        # Return a dictionary with attribute names as keys
        return {'target_attr': parsed_value}
```
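
The `parse_response` step above is left as a placeholder. As one possible shape for it, here is a minimal stand-alone sketch that assumes the service returns a JSON body containing either an object or a list of objects. The JSON format, the `key` and `target` parameters, and the example payloads are all assumptions; the type actually returned by `query_the_service` is not specified in this guide.

```python
import json


def parse_response(response, key="inchikey", target="inchikey"):
    """Extract one attribute from a JSON response body; return {} on any problem."""
    try:
        payload = json.loads(response)
    except (TypeError, ValueError):
        # Not a string, or not valid JSON (JSONDecodeError subclasses ValueError).
        return {}

    # Some APIs wrap results in a list; take the first hit if there is one.
    if isinstance(payload, list):
        payload = payload[0] if payload else {}
    if not isinstance(payload, dict):
        return {}

    value = payload.get(key)
    return {target: value} if value else {}


ok = parse_response('[{"inchikey": "XLYOFNOQVPJJNP-UHFFFAOYSA-N"}]')
empty = parse_response("[]")
broken = parse_response("<html>503 Service Unavailable</html>")
print(ok, empty, broken)
```

Returning `{}` for every failure mode keeps the caller's logic simple: it only has to check whether the target attribute is present.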

#### 3. Register the converter

Add your new converter to `MSMetaEnhancer/libs/converters/web/__init__.py`:

```python
from MSMetaEnhancer.libs.converters.web.MyService import MyService

__all__ = ['IDSM', 'CTS', 'CIR', 'PubChem', 'BridgeDb', 'MyService']
```

#### 4. Add tests

Create a test file `tests/test_MyService.py`:

```python
import pytest
from MSMetaEnhancer.libs.converters.web.MyService import MyService


# Note: async test functions only run under an async pytest plugin
# (for example pytest-asyncio); check how the existing tests are set up.
@pytest.mark.dependency()
async def test_service_available():
    """Test if the service is available."""
    # Implementation depends on your service
    pass


@pytest.mark.dependency(depends=["test_service_available"])
async def test_conversion():
    """Test the conversion functionality."""
    # Mock the service and test your conversion methods
    pass
```
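
To show what "mock the service" could look like in practice, the following sketch tests a conversion without any network access, using only the standard library. `FakeService` is a hypothetical stand-in that mirrors the shape of the `MyService` example; it is not the real class, and the real `query_the_service` may take additional arguments.

```python
import asyncio
from unittest.mock import AsyncMock


class FakeService:
    """Hypothetical stand-in with the same shape as the MyService example."""

    async def query_the_service(self, service, args):
        raise RuntimeError("network access is not allowed in unit tests")

    async def conversion_method(self, input_data):
        response = await self.query_the_service("MyService", f"endpoint/{input_data}")
        return {"target_attr": response} if response else {}


def test_conversion_with_mocked_service():
    service = FakeService()
    # Replace the network call with a mock that returns a canned response.
    service.query_the_service = AsyncMock(return_value="converted")

    result = asyncio.run(service.conversion_method("input"))

    assert result == {"target_attr": "converted"}
    service.query_the_service.assert_awaited_once_with("MyService", "endpoint/input")


def test_empty_response_gives_empty_dict():
    service = FakeService()
    service.query_the_service = AsyncMock(return_value=None)

    assert asyncio.run(service.conversion_method("input")) == {}
```

Driving the coroutine with `asyncio.run` keeps these tests as plain synchronous functions, so they do not depend on an async pytest plugin being installed.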

### Adding a new Compute Converter

For local computation services (like RDKit), follow these steps:

#### 1. Create the converter file

Create a new Python file in `MSMetaEnhancer/libs/converters/compute/` named after your service.

#### 2. Implement the converter class

```python
from MSMetaEnhancer.libs.converters.compute.ComputeConverter import ComputeConverter


class MyComputeService(ComputeConverter):
    """
    Description of your compute service.
    """

    def __init__(self):
        super().__init__()

        # Define the conversions this service supports
        conversions = [
            ('source_attr', 'target_attr', 'conversion_method'),
            # Add more conversions as needed
        ]
        # Compute converters run synchronously, hence asynch=False
        self.create_top_level_conversion_methods(conversions, asynch=False)

    def conversion_method(self, input_data):
        """
        Implement the computation logic.

        :param input_data: The input data to process
        :return: Dictionary with computed data
        """
        # Perform local computation
        # (`some_computation` is a placeholder for your own function)
        result = some_computation(input_data)
        return {'target_attr': result}
```
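
As a concrete stand-in for `some_computation`, here is a small, purely local conversion written as a plain function. It extracts the first (connectivity) block of an InChIKey, relying on the standard layout of three hyphen-separated blocks with a 14-character first block. The function name and the `inchikey_block1` attribute name are hypothetical and chosen only for this sketch.

```python
def inchikey_to_connectivity_block(inchikey):
    """Return the first block of an InChIKey, or {} if the input is malformed."""
    parts = inchikey.split("-") if isinstance(inchikey, str) else []
    if len(parts) != 3 or len(parts[0]) != 14:
        return {}
    # Same return convention as the converter methods: attribute name -> value.
    return {"inchikey_block1": parts[0]}


water = inchikey_to_connectivity_block("XLYOFNOQVPJJNP-UHFFFAOYSA-N")
malformed = inchikey_to_connectivity_block("not-an-inchikey")
print(water, malformed)
```

Because the computation is synchronous and has no side effects, it fits the `asynch=False` registration used for compute converters.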

#### 3. Register the converter

Add your new converter to `MSMetaEnhancer/libs/converters/compute/__init__.py`:

```python
from MSMetaEnhancer.libs.converters.compute.MyComputeService import MyComputeService

__all__ = ['RDKit', 'MyComputeService']
```

### Adding conversion functions to existing services

To add new conversion functions to existing converters:

#### 1. Add the conversion method

Add a new method to the existing converter class:

```python
async def new_conversion_method(self, input_data):
    """
    Description of what this conversion does.

    :param input_data: Input data
    :return: Converted data
    """
    # Implementation here
    pass
```

#### 2. Register the conversion

Add the new conversion to the `conversions` list in the `__init__` method:

```python
conversions = [
    # existing conversions...
    ('new_source_attr', 'new_target_attr', 'new_conversion_method'),
]
```

### Key principles for converter development

1. **Error Handling**: Always handle API errors gracefully and return an empty dictionary when data is not available.
2. **Rate Limiting**: Respect API rate limits by using throttling mechanisms.
3. **Data Validation**: Validate input data before making API calls.
4. **Response Parsing**: Implement robust response parsing that handles the different response formats a service can return.
5. **Documentation**: Include docstrings for all methods, explaining parameters and return values.
6. **Testing**: Write tests that cover both service availability and conversion functionality.
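
The first and third principles can be combined in a small wrapper, sketched below under stated assumptions. `safe_convert`, the `fetch` callable, and the fake services are hypothetical. The regular expression is a commonly used check for the standard InChIKey layout. How MSMetaEnhancer's base classes surface errors internally is not covered in this guide.

```python
import asyncio
import re

# Standard InChIKey layout: 14 letters, 10 letters, 1 letter, hyphen-separated.
INCHIKEY_PATTERN = re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$")


async def safe_convert(fetch, inchikey):
    """Validate the input, call `fetch`, and map failures to an empty dict."""
    if not isinstance(inchikey, str) or not INCHIKEY_PATTERN.match(inchikey):
        return {}  # malformed input: skip the request entirely
    try:
        response = await fetch(inchikey)
    except (asyncio.TimeoutError, ConnectionError):
        return {}  # service unreachable or too slow
    return {"smiles": response} if response else {}


async def fake_service(inchikey):
    return "O"  # pretend the service resolved the key to water's SMILES


async def failing_service(inchikey):
    raise ConnectionError("service is down")


good = asyncio.run(safe_convert(fake_service, "XLYOFNOQVPJJNP-UHFFFAOYSA-N"))
invalid = asyncio.run(safe_convert(fake_service, "water"))
down = asyncio.run(safe_convert(failing_service, "XLYOFNOQVPJJNP-UHFFFAOYSA-N"))
print(good, invalid, down)
```

Validating before the request saves a round trip for input that cannot succeed, which also helps stay within the service's rate limits.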

### Testing your changes

After implementing your converter:

1. Run the existing tests to ensure you haven't broken anything:
   ```bash
   pytest tests/
   ```

2. Run your specific tests:
   ```bash
   pytest tests/test_MyService.py -v
   ```

3. Test the integration by running a job that uses your converter on real data.

### Common patterns and utilities

- **Throttling**: Use the `Throttler` class for rate limiting
- **Caching**: Use the `@lru_cache` decorator for caching responses
- **Error handling**: Inherit from the base converter classes for consistent error handling
- **Data escaping**: Use decorators like `@escape_single_quotes` for input sanitization
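
To make the throttling idea concrete, here is a minimal stand-alone rate limiter built on `asyncio`. It is a sketch of the general technique only: `SimpleThrottler` is a hypothetical class, and the project's own `Throttler` may have a different interface and implementation.

```python
import asyncio
import time


class SimpleThrottler:
    """Hypothetical limiter allowing at most `rate_limit` entries per second."""

    def __init__(self, rate_limit):
        self.interval = 1.0 / rate_limit
        self._lock = asyncio.Lock()
        self._next_slot = 0.0  # earliest monotonic time the next caller may start

    async def __aenter__(self):
        # The lock serialises callers so each one gets its own time slot.
        async with self._lock:
            now = time.monotonic()
            wait = self._next_slot - now
            if wait > 0:
                await asyncio.sleep(wait)
            self._next_slot = max(now, self._next_slot) + self.interval
        return self

    async def __aexit__(self, exc_type, exc, tb):
        return False  # never suppress exceptions from the guarded block


async def demo():
    throttler = SimpleThrottler(rate_limit=50)  # one request every 20 ms
    started = []

    async def request(i):
        async with throttler:
            started.append((i, time.monotonic()))

    await asyncio.gather(*(request(i) for i in range(3)))
    return started


started = asyncio.run(demo())
print([i for i, _ in started])
```

All three requests are launched at once, but the limiter spaces out the moments at which they are allowed to proceed, which is the behaviour an API with a request quota expects from its clients.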
278+
279+
### Template files
280+
281+
To help you get started quickly, you can use these template files as starting points:
282+
283+
- **Web Converter Template**: Use the CTS or PubChem converters as reference implementations
284+
- **Compute Converter Template**: Use the RDKit converter as a reference implementation
285+
- **Test Template**: Follow the existing test patterns in the `tests/` directory
286+
287+
These templates include:
288+
- Proper class structure and inheritance
289+
- Common import patterns
290+
- Standard method signatures
291+
- Error handling patterns
292+
- Documentation structure
293+
- Test structure and mocking examples
