Commit 93d4cf4

Merge branch 'main' into develop
# Conflicts:
#	src/biocatalyzer/bioreactor.py
2 parents 9296c41 + 1a5e8ce commit 93d4cf4

7 files changed

Lines changed: 59 additions & 33 deletions

.github/workflows/main.yml

Lines changed: 6 additions & 0 deletions
@@ -5,6 +5,10 @@ on:
     branches:
       - main
       - develop
+  push:
+    branches:
+      - main
+
 
 jobs:
   test:
@@ -14,6 +18,8 @@ jobs:
       matrix:
         os:
           - ubuntu-latest
+          - windows-latest
+          - macos-latest
         python-version:
           - 3.8

README.md

Lines changed: 14 additions & 14 deletions
@@ -25,20 +25,20 @@ Installing from GitHub:
 biocatalyzer_cli <PATH_TO_COMPOUNDS> <OUTPUT_DIRECTORY> [--neutralize=<BOOL>] [--reaction_rules=<FILE_PATH>] [--organisms=<FILE_PATH>] [--patterns_to_remove=<FILE_PATH>] [--molecules_to_remove=<FILE_PATH>] [--min_atom_count=<INT>] [--match_ms_data=<BOOL>] [--ms_data_path=<FILE_PATH>] [--tolerance=<FLOAT>] [--n_jobs=<INT>]
 ```
 
-| Argument | Example | Description | Default |
-|--------------------------------------|---------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| compounds <PATH_TO_COMPOUNDS> | `file.tsv` or `"smile1;smiles2;smile3;etc"` | The path to the file containing the compounds to use as reactants. Or ;-separated SMILES strings.<sup>1</sup> | |
-| output_directory <OUTPUT_DIRECTORY> | `output/directory/` | The path directory to save the results to. | |
-| neutralize | `True` or `False` | Whether to neutralize the compounds before predicting the products. In this case the new products will also be neutralized. | `False` |
-| reaction_rules | `file.tsv` or `None` | The path to the file containing the reaction rules to use.<sup>2</sup> | [all_reaction_rules_forward_no_smarts_duplicates_sample.tsv](src/biocatalyzer/data/reactionrules/all_reaction_rules_forward_no_smarts_duplicates_sample.tsv) |
-| organisms | `file.tsv` or `"org_id1;org_id2;org_id3;etc"` or `None` | The path to the file containing the organisms to use. Or ;-separated organisms identifiers. Reaction Rules will be selected accordingly (select only rules associated with enzymes encoded by genes from this organisms).<sup>3</sup> | All reaction rules are used. |
-| patterns_to_remove | `patterns.tsv` or `None` | The path to the file containing the patterns to remove from the products. <sup>4</sup> | [patterns.tsv](src/biocatalyzer/data/patterns_to_remove/patterns.tsv) |
-| molecules_to_remove | `molecules.tsv` or `None` | The path to the file containing the molecules to remove from the products. <sup>5</sup> | [byproducts.tsv](src/biocatalyzer/data/byproducts_to_remove/byproducts.tsv) |
-| min_atom_count | `4` | The minimum number of heavy atoms a product must have. | `5` |
-| match_ms_data | `True` or `False` | Whether to match the predicted products to the MS data. | `False` |
-| ms_data_path | `ms_data.tsv` | The path to the file containing the MS data. <sup>6</sup> | `None` |
-| tolerance | `0.02` | The mass tolerance to use when matching masses. | `0.02` |
-| n_jobs | `6` | The number of jobs to run in parallel (-1 uses all). | `1` |
+| Argument | Example | Description | Default |
+|--------------------------------------|---------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| compounds <PATH_TO_COMPOUNDS> | `file.tsv` or `"smile1;smiles2;smile3;etc"` | The path to the file containing the compounds to use as reactants. Or ;-separated SMILES strings.<sup>1</sup> | |
+| output_directory <OUTPUT_DIRECTORY> | `output/directory/` | The path directory to save the results to. | |
+| neutralize | `True` or `False` | Whether to neutralize the compounds before predicting the products. In this case the new products will also be neutralized. | `False` |
+| reaction_rules | `file.tsv` or `None` | The path to the file containing the reaction rules to use.<sup>2</sup> | [all_reaction_rules_forward_no_smarts_duplicates_sample.tsv](src/biocatalyzer/data/reactionrules/all_reaction_rules_forward_no_smarts_duplicates_sample.tsv) |
+| organisms | `file.tsv` or `"org_id1;org_id2;org_id3;etc"` or `None` | The path to the file containing the organisms to use. Or ;-separated organisms identifiers. Reaction Rules will be selected accordingly (select only rules associated with enzymes encoded by genes from these organisms).<sup>3</sup> | All reaction rules are used. |
+| patterns_to_remove | `patterns.tsv` or `None` | The path to the file containing the patterns to remove from the products. <sup>4</sup> | [patterns.tsv](src/biocatalyzer/data/patterns_to_remove/patterns.tsv) |
+| molecules_to_remove | `molecules.tsv` or `None` | The path to the file containing the molecules to remove from the products. <sup>5</sup> | [byproducts.tsv](src/biocatalyzer/data/byproducts_to_remove/byproducts.tsv) |
+| min_atom_count | `4` | The minimum number of heavy atoms a product must have. | `5` |
+| match_ms_data | `True` or `False` | Whether to match the predicted products to the MS data. | `False` |
+| ms_data_path | `ms_data.tsv` | The path to the file containing the MS data. <sup>6</sup> | `None` |
+| tolerance | `0.02` | The mass tolerance to use when matching masses. | `0.02` |
+| n_jobs | `6` | The number of jobs to run in parallel (-1 uses all). | `1` |
 
 ### Compounds

src/biocatalyzer/bioreactor.py

Lines changed: 1 addition & 1 deletion
@@ -546,7 +546,7 @@ def process_results(self, save: bool = True, overwrite: bool = True):
         Tuple[pd.DataFrame, str]
             The processed results and the path to the results file.
         """
-        results = pd.read_csv(self._new_compounds_path, sep='\t', header=0)
+        results = pd.read_csv(self._new_compounds_path, sep='\t', header=0, lineterminator='\n')
         results.EC_Numbers = results.EC_Numbers.fillna('')
         results = results.groupby(['OriginalCompoundID', 'NewCompoundSmiles']).agg({'OriginalCompoundSmiles': 'first',
                                                                                     'OriginalReactionRuleID': ';'.join,
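
The commit message does not say why `lineterminator='\n'` was added to the `pd.read_csv` call. One known effect of the parameter, sketched here on made-up data, is that pandas' C parser then treats only `\n` as a row boundary, so a stray carriage return inside a field no longer splits the row. The column names and values below are illustrative assumptions, not taken from the project's results file.

```python
import io

import pandas as pd

# Hypothetical two-row TSV; the first "smiles" field contains a stray '\r'.
raw = "id\tsmiles\n1\tCC\rO\n2\tCCO\n"

# Default: '\r', '\n' and '\r\n' all end a row, so the first record is split in two.
default = pd.read_csv(io.StringIO(raw), sep='\t', header=0)

# As in the diff: only '\n' ends a row, and the '\r' stays inside the field.
strict = pd.read_csv(io.StringIO(raw), sep='\t', header=0, lineterminator='\n')

print(len(default), len(strict))
```

A side effect to keep in mind: with this setting, a file written with `\r\n` line endings can end up with a trailing `\r` in the last column's values, since the carriage return is no longer consumed as part of the row terminator.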

src/biocatalyzer/clis/cli.py

Lines changed: 3 additions & 1 deletion
@@ -102,7 +102,9 @@ def biocatalyzer_cli(compounds,
 
     output_path: Path to the output directory.
     """
+    logging.basicConfig(filename=f'{output_path}logging.log', level=logging.DEBUG)
     if reaction_rules is None:
+        logging.info(f"Using default reaction rules file.")
         reaction_rules = os.path.join(
             DATA_FILES, '../data/reactionrules/reaction_rules_biocatalyzer.tsv.bz2')
     br = BioReactor(compounds_path=compounds,
@@ -114,7 +116,6 @@ def biocatalyzer_cli(compounds,
                     molecules_to_remove_path=molecules_to_remove,
                     min_atom_count=min_atom_count,
                     n_jobs=n_jobs)
-    logging.basicConfig(filename=f'{output_path}logging.log', level=logging.DEBUG)
     br.react()
     _, new_results_path = br.process_results()
 
@@ -123,6 +124,7 @@ def biocatalyzer_cli(compounds,
             raise ValueError("The path to the MS data file is required when matching MS data.")
 
         else:
+            logging.info(f"Matching MS data.")
             ms = MSDataMatcher(ms_data_path=ms_data_path,
                                compounds_to_match_path=new_results_path,
                                output_path=output_path,
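
Moving `logging.basicConfig` to the top of `biocatalyzer_cli` matters for the newly added `logging.info` calls. In the standard library, a module-level logging call made while the root logger has no handlers installs a default stderr handler, and a later `basicConfig()` then does nothing unless `force=True` is passed. The sketch below reproduces both orderings with only the standard library; the helper names `reset_root_logger` and `logged_text` are hypothetical and not part of biocatalyzer.

```python
import logging
import os
import tempfile


def reset_root_logger() -> None:
    """Detach and close every root handler so each scenario starts unconfigured."""
    root = logging.getLogger()
    for handler in list(root.handlers):
        root.removeHandler(handler)
        handler.close()
    root.setLevel(logging.WARNING)  # the root logger's default level


def logged_text(config_first: bool, log_path: str) -> str:
    """Emit one INFO record and return what reached the file at log_path."""
    reset_root_logger()
    if config_first:
        # Order after this commit: configure first, then log.
        logging.basicConfig(filename=log_path, level=logging.DEBUG)
        logging.info("Using default reaction rules file.")
    else:
        # Order that would result from adding the logging.info call without
        # moving basicConfig: the first call installs a default stderr handler,
        # so the later basicConfig() finds a handler present and is a no-op.
        logging.info("Using default reaction rules file.")
        logging.basicConfig(filename=log_path, level=logging.DEBUG)
    reset_root_logger()  # closes the file handler (if any), flushing it
    if not os.path.exists(log_path):
        return ""
    with open(log_path, encoding="utf-8") as fh:
        return fh.read()


with tempfile.TemporaryDirectory() as tmp:
    early = logged_text(True, os.path.join(tmp, "early.log"))
    late = logged_text(False, os.path.join(tmp, "late.log"))

print(repr(early))  # contains the INFO record
print(repr(late))   # empty: the log file was never created
```

Separately, the diff builds the log path as `f'{output_path}logging.log'`, which only lands inside the output directory when `output_path` ends with a path separator; `os.path.join(output_path, 'logging.log')` would not depend on that.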
