This document describes the main types and functions in `algorithms.py` and `main.py`, including the newly added ingredient loading interface.
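- `Ingredient` (dataclass)
  - Fields:
    - Identity and source: `name: str`, `smiles: str | None`
    - Safety: `hepatotoxicity_score: float | None`
    - ADME: `ob`, `dl`, `mw`, `alogp`, `h_don`, `h_acc`, `caco2`, `bbb`, `fasa`, `hl` (all `float | None`)
  - Behavior:
    - If `hepatotoxicity_score` is not provided and `smiles` is present, the score is estimated automatically in `__post_init__` (DILIPred is preferred; a heuristic is used when it is unavailable).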
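
The auto-estimation behavior described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `IngredientSketch` shows only a few fields, and `estimate_hepatotoxicity` is a hypothetical placeholder for the DILIPred-or-heuristic estimator, whose real logic is not documented here.

```python
from dataclasses import dataclass
from typing import Optional


def estimate_hepatotoxicity(smiles: str) -> float:
    """Hypothetical stand-in for the DILIPred-or-heuristic estimator."""
    # Placeholder rule (an assumption, not the project's heuristic):
    # longer SMILES strings map to higher scores, capped at 1.0.
    return min(len(smiles) / 100.0, 1.0)


@dataclass
class IngredientSketch:
    name: str
    smiles: Optional[str] = None
    hepatotoxicity_score: Optional[float] = None
    ob: Optional[float] = None  # one ADME field shown; the others follow the same pattern

    def __post_init__(self) -> None:
        # Estimate only when no score was supplied and a structure is available.
        if self.hepatotoxicity_score is None and self.smiles:
            self.hepatotoxicity_score = estimate_hepatotoxicity(self.smiles)
```

An explicitly supplied score is never overwritten, and an ingredient with neither a score nor a SMILES string keeps `hepatotoxicity_score` as `None`.
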
- `CandidateSolution`
  - Structure: `select_bits: bytearray` (bit-packed selection flags), `proportions: array('f')` (mixing proportions), `length: int`
  - Methods: `from_components(selects, proportions)`, `iter_selects()`, `iter_selected_indices()`, `normalized_proportions()`, `with_normalized()`
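
To illustrate the storage layout, here is a small sketch of a bit-packed selection mask paired with a float array. `PackedSelection` is a hypothetical name, and both the bit order and the normalization rule (selected proportions rescaled to sum to 1, unselected ones set to 0) are assumptions rather than the documented behavior of `CandidateSolution`.

```python
from array import array


class PackedSelection:
    """Hypothetical bit-packed selection mask with per-item proportions."""

    def __init__(self, selects, proportions):
        selects = list(selects)
        self.length = len(selects)
        # One bit per ingredient, eight ingredients per byte.
        self.select_bits = bytearray((self.length + 7) // 8)
        for i, flag in enumerate(selects):
            if flag:
                self.select_bits[i // 8] |= 1 << (i % 8)
        self.proportions = array("f", proportions)
        if len(self.proportions) != self.length:
            raise ValueError("selects and proportions must have the same length")

    def iter_selected_indices(self):
        """Yield the index of every ingredient whose bit is set."""
        for i in range(self.length):
            if self.select_bits[i // 8] & (1 << (i % 8)):
                yield i

    def normalized_proportions(self):
        """Rescale selected proportions to sum to 1; unselected items get 0."""
        chosen = set(self.iter_selected_indices())
        total = sum(self.proportions[i] for i in chosen)
        if total <= 0:
            return [0.0] * self.length
        return [
            self.proportions[i] / total if i in chosen else 0.0
            for i in range(self.length)
        ]


sol = PackedSelection([1, 0, 1], [2.0, 5.0, 6.0])
selected = list(sol.iter_selected_indices())
```

Packing flags into a `bytearray` keeps a candidate at roughly one bit per ingredient plus four bytes per proportion, which matters when a population holds many candidates.
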
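- `load_ingredients_csv(path: str | Path, encoding='utf-8') -> list[Ingredient]`
  - Accepts common column-name synonyms (case-insensitive). A `name` column is required, and at least one of `smiles` and `hepatotoxicity_score` must be present.
  - `bbb` may be given as 0/1, as true/false, or as a numeric logBB value.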
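
One way the mixed `bbb` encodings could be handled is sketched below. `parse_bbb` is a hypothetical helper, and mapping booleans to 1.0/0.0 is an assumption; the loader's actual conversion is not documented here.

```python
def parse_bbb(raw):
    """Hypothetical parser: booleans become 1.0/0.0, anything else is read as logBB."""
    if raw is None:
        return None
    text = str(raw).strip().lower()
    if text == "":
        return None  # treat an empty cell as missing
    if text in ("true", "1"):
        return 1.0
    if text in ("false", "0"):
        return 0.0
    return float(text)  # raises ValueError for non-numeric text
```

Under this scheme a logBB value of exactly 0 or 1 cannot be distinguished from a boolean flag, so a real loader would need its own convention for that case.
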
- `load_ingredients_json(path: str | Path, encoding='utf-8') -> list[Ingredient]`
  - Expects the JSON root to be an array; keys accept the same synonym mapping as the CSV columns.
- `load_ingredients(path: str | Path, fmt: str | None = None, encoding='utf-8') -> list[Ingredient]`
  - General entry point; `fmt` is `csv`, `json`, or `None` (auto). When `fmt=None`, the format is inferred from the file extension.
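
Extension-based inference might look like the sketch below. `infer_format` is a hypothetical helper; treating the string `"auto"` the same as `None` and raising `ValueError` for unknown extensions are both assumptions.

```python
from pathlib import Path


def infer_format(path, fmt=None):
    """Return 'csv' or 'json' from an explicit fmt or from the file extension."""
    if fmt is not None and fmt.lower() != "auto":
        fmt = fmt.lower()
    else:
        # No explicit format: fall back to the extension, e.g. ".CSV" -> "csv".
        fmt = Path(path).suffix.lower().lstrip(".")
    if fmt not in ("csv", "json"):
        raise ValueError(f"unsupported ingredient format: {fmt!r}")
    return fmt
```

An explicit `fmt` always wins over the extension, which lets a caller load a `.txt` file that actually contains JSON.
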
Example:

    from algorithms import load_ingredients
    ings = load_ingredients("resources/templates/ingredients.csv", fmt="csv")

- `compute_desirability_score(ingredient) -> float`
- `compute_complementarity_score(candidate, ingredients) -> float`
- `compute_aci_score(candidate, ingredients, lambda_weight=None) -> float`
- `compute_hepatotoxicity_score(candidate, ingredients) -> float`
- `evaluate_metrics(candidate, ingredients) -> (aci_with_penalty, toxicity, penalty, aci_raw)`
- `diversity_penalty(candidate, ingredients) -> float`
- `single_objective_score(candidate, ingredients, toxicity_weight=1.0) -> float`
- `multi_objective_score(candidate, ingredients) -> (aci, toxicity)`
- `CombinationProblem` (pymoo-compatible problem definition)
- `PymooSingleObjectiveGA(ingredients, toxicity_weight=None, generations=None, population_size=None, crossover_prob=None, mutation_prob=None)`
  - Single-objective optimization (maximizes `ACI - toxicity_weight * hepatotoxicity`); `.run() -> CandidateSolution`
  - Precedence for crossover/mutation probabilities: constructor arguments > `ga.crossover_prob` / `ga.mutation_prob` in `config/algorithms.json` > pymoo defaults.
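
The precedence rule can be sketched as a small lookup. `resolve_option` and `load_config` are hypothetical helpers written for this illustration; returning `None` stands for "leave the library default in place".

```python
import json
from pathlib import Path


def resolve_option(explicit, config, section, key, default=None):
    """Pick the explicit value, else config[section][key], else the default."""
    if explicit is not None:
        return explicit
    value = config.get(section, {}).get(key)
    return default if value is None else value


def load_config(path="config/algorithms.json"):
    """Read the JSON config, returning {} when the file does not exist."""
    p = Path(path)
    return json.loads(p.read_text(encoding="utf-8")) if p.exists() else {}


cfg = {"ga": {"crossover_prob": 0.8}}  # illustrative values only
crossover = resolve_option(None, cfg, "ga", "crossover_prob")  # from config
mutation = resolve_option(None, cfg, "ga", "mutation_prob")    # falls through
override = resolve_option(0.6, cfg, "ga", "crossover_prob")    # argument wins
```

Checking `is not None` rather than truthiness matters here, since a probability of `0.0` is a legitimate explicit setting.
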
- `PymooNSGAII(ingredients, generations=None, population_size=None)`
  - Multi-objective NSGA-II (minimizes `(-ACI, toxicity)`); `.run() -> list[CandidateSolution], np.ndarray`
  - Crossover/mutation probabilities can be adjusted through the `nsga2.crossover_prob` / `nsga2.mutation_prob` configuration keys.
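
Because both objectives are minimized, ACI enters with a negative sign. The sketch below is not NSGA-II itself; it only shows the Pareto-dominance relation that NSGA-II ranks solutions by, applied to made-up `(aci, toxicity)` pairs.

```python
def pareto_front(points):
    """Return the non-dominated points when every objective is minimized."""

    def dominates(a, b):
        # a dominates b if it is no worse everywhere and strictly better somewhere.
        return all(x <= y for x, y in zip(a, b)) and any(
            x < y for x, y in zip(a, b)
        )

    return [p for p in points if not any(dominates(q, p) for q in points)]


# Illustrative (aci, toxicity) pairs; the numbers are invented for the example.
scores = [(0.9, 0.6), (0.7, 0.2), (0.6, 0.5), (0.4, 0.1)]
objectives = [(-aci, tox) for aci, tox in scores]  # maximize ACI == minimize -ACI
front = pareto_front(objectives)
```

Here `(0.6, 0.5)` drops out because `(0.7, 0.2)` has both higher ACI and lower toxicity; the remaining three points are mutually non-dominated trade-offs.
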
- `PySwarmsPSO(ingredients, ...)` → `.run() -> CandidateSolution`
Note: numpy, pymoo, pyswarms, and dilipred are all imported lazily; each dependency needs to be installed only when the corresponding feature is actually used.
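
A common way to get this behavior is to defer the import to the call site. The helper below is a sketch under that assumption; `lazy_import` is a hypothetical name, and the project may instead simply place `import` statements inside the functions that need them.

```python
import importlib


def lazy_import(module_name, feature):
    """Import module_name on first use, with a clear message if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise RuntimeError(
            f"{feature} requires the optional dependency {module_name!r}"
        ) from exc


# A stdlib module is used here so the example runs anywhere.
json_mod = lazy_import("json", "JSON loading")
encoded = json_mod.dumps([1, 2])
```

With this pattern, importing the top-level module never fails on a machine without the optimizers installed; the error only surfaces when an optimizer is actually requested.
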
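- `--mode {hello,demo}`: defaults to `hello` (no dependencies required); `demo` requires the optimization and visualization dependencies to be installed
- `--ingredients PATH`: path to the ingredient file (CSV/JSON)
- `--format {auto,csv,json}`: file format; defaults to `auto`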
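
These options map naturally onto `argparse`. The parser below is a sketch that reproduces the documented flags, choices, and defaults; `build_parser` is a hypothetical name, and the default of `None` for `--ingredients` is an assumption.

```python
import argparse


def build_parser():
    """Build a parser with the three documented command-line options."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--mode", choices=["hello", "demo"], default="hello")
    parser.add_argument("--ingredients", metavar="PATH", default=None)
    parser.add_argument("--format", choices=["auto", "csv", "json"], default="auto")
    return parser


defaults = build_parser().parse_args([])
demo_args = build_parser().parse_args(
    ["--mode", "demo", "--ingredients", "ingredients.json", "--format", "json"]
)
```

Using `choices` lets `argparse` reject unsupported modes or formats before any optional dependency is imported.
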
- `run_single_objective_algorithms(ingredients) -> (list[CandidateSolution], float)`
- `run_nsga(ingredients, toxicity_weight, highlight_points=None) -> None`
- `run_demo(ingredients: list[Ingredient] | None = None) -> int`
- `main(argv: list[str] | None = None) -> int`
Example:

    python main.py --mode demo --ingredients resources/templates/ingredients.json --format json

- `name`: name / ingredient / compound / drug / 名称 / 成分
- `smiles`: smiles / smile
- `hepatotoxicity_score`: hepatotoxicity / hepatotoxicity_score / dili / toxicity / toxicity_score
- `bbb`: bbb / logbb / bloodbrainbarrier (accepts 0/1/true/false or a numeric value)
- Others: ob, dl, mw, alogp, h_don (hbd), h_acc (hba), caco2 (logpapp/papp), fasa, hl (half_life/t_half)
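
The synonym list above suggests an alias-to-field lookup applied to each row. The sketch below uses the listed synonyms; `SYNONYMS` and `canonicalize_row` are hypothetical names, and keeping the first matching column while dropping unknown columns are assumptions about how conflicts and extras are handled.

```python
# Canonical field -> accepted column names, taken from the list above.
SYNONYMS = {
    "name": ["name", "ingredient", "compound", "drug", "名称", "成分"],
    "smiles": ["smiles", "smile"],
    "hepatotoxicity_score": [
        "hepatotoxicity", "hepatotoxicity_score", "dili", "toxicity", "toxicity_score",
    ],
    "bbb": ["bbb", "logbb", "bloodbrainbarrier"],
    "ob": ["ob"], "dl": ["dl"], "mw": ["mw"], "alogp": ["alogp"],
    "h_don": ["h_don", "hbd"],
    "h_acc": ["h_acc", "hba"],
    "caco2": ["caco2", "logpapp", "papp"],
    "fasa": ["fasa"],
    "hl": ["hl", "half_life", "t_half"],
}

# Invert once so each lookup is a single dictionary access.
ALIAS_TO_FIELD = {
    alias: field for field, aliases in SYNONYMS.items() for alias in aliases
}


def canonicalize_row(row):
    """Map a raw row to canonical field names, ignoring case and unknown columns."""
    out = {}
    for column, value in row.items():
        field = ALIAS_TO_FIELD.get(column.strip().lower())
        if field is not None and field not in out:  # first matching column wins
            out[field] = value
    return out


row = canonicalize_row({"Compound": "x", "DILI": "0.3", "HBD": "2", "extra": "?"})
```

The same function works for JSON objects, which fits the statement that JSON keys accept the same synonym mapping as CSV columns.
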