Week 21, Project 1 · Biophysics Portfolio · CS Research Self-Study
A physics-based rigid-body molecular docking engine that predicts how small-molecule drug candidates bind to protein targets. Implements a five-term scoring function calibrated against AutoDock4, Monte Carlo simulated annealing with sigmoidal cooling and multi-restart sampling, soft-core van der Waals potentials, binding-pocket carving with complementary pocket-lining atoms, binding-site-only scoring, redocking validation across six benchmark systems, and interactive Streamlit visualization.
| Feature | Description |
|---|---|
| Scoring Function | AutoDock4-calibrated 5-term physics-based scorer |
| van der Waals | Soft-core Lennard-Jones with Bondi radii + AMBER ff99 ε |
| Electrostatics | Coulomb with distance-dependent dielectric (ε = 4r) |
| Desolvation | Eisenberg-McLachlan solvation model |
| Hydrogen Bonds | 10-12 potential for directional H-bonds |
| Torsional Penalty | Entropy cost per rotatable bond |
| Monte Carlo | Sigmoidal annealing with multi-restart Metropolis sampling |
| Grid Search | Systematic translational + rotational enumeration |
| Pocket Modeling | Binding-pocket carving with complementary lining atoms |
| Scoring Filter | Ligand-centric proximity filter eliminates distant noise |
| Redocking | RMSD < 2.0 Å validation against native poses |
| Preset Systems | 6 well-characterized protein-ligand complexes |
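To make the van der Waals, electrostatics, and scoring-filter rows concrete, here is a minimal stdlib-only sketch. It is not the code in `src/docking_engine.py`: the function names, the particular soft-core form (adding `alpha * r_min**6` to the denominator), the `r_floor` clamp, and the 8 Å cutoff are all assumptions chosen for illustration.

```python
import math

COULOMB_K = 332.0637  # kcal*A/(mol*e^2): converts e^2/A to kcal/mol


def soft_core_lj(r, r_min, eps, alpha=0.5):
    """Soft-core 12-6 Lennard-Jones energy in kcal/mol (assumed form).

    r_min is the separation at the energy minimum and eps the well depth.
    alpha = 0 recovers the ordinary hard-core potential (for r > 0);
    alpha > 0 keeps the energy finite as r approaches 0.
    """
    s = r_min**6 / (r**6 + alpha * r_min**6)
    return eps * (s * s - 2.0 * s)


def coulomb_distance_dielectric(q1, q2, r, r_floor=0.5):
    """Coulomb energy with dielectric eps(r) = 4r, i.e. k*q1*q2 / (4*r^2)."""
    r = max(r, r_floor)  # hypothetical clamp to avoid the r = 0 singularity
    return COULOMB_K * q1 * q2 / (4.0 * r * r)


def atoms_near_ligand(protein_xyz, ligand_xyz, cutoff=8.0):
    """Keep only protein atoms within `cutoff` angstroms of any ligand atom."""
    return [p for p in protein_xyz
            if any(math.dist(p, l) <= cutoff for l in ligand_xyz)]
```

With `alpha=0.0` the first function returns `-eps` at `r == r_min`, which is a quick sanity check that the soft-core variant reduces to the standard potential.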
| System | Protein | Ligand | Rotatable Bonds | Difficulty |
|---|---|---|---|---|
| Trypsin-Benzamidine | Serine protease | Small inhibitor | 1 | Easy |
| HIV Protease-Indinavir | Retroviral protease | Peptidomimetic | 7 | Medium |
| CDK2-Staurosporine | Kinase | Natural product | 2 | Easy |
| Thrombin-PPACK | Serine protease | Tripeptide | 5 | Medium |
| Lysozyme-NAG3 | Glycoside hydrolase | Trisaccharide | 4 | Medium |
| Carbonic Anhydrase-Dorzolamide | Metalloenzyme | Sulfonamide | 4 | Hard |
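Redocking on these systems is judged by RMSD against the native pose. The sketch below shows that check under two assumptions: atoms are matched by index, and no superposition is applied because both poses live in the receptor's coordinate frame. The names `rmsd` and `redock_success` are illustrative and may not match the project's API.

```python
import math


def rmsd(pose, reference):
    """Root-mean-square deviation (angstroms) between index-matched atoms."""
    if len(pose) != len(reference) or not pose:
        raise ValueError("poses must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(pose, reference))
    return math.sqrt(squared / len(pose))


def redock_success(pose, native, threshold=2.0):
    """True when the docked pose is within `threshold` angstroms of native."""
    return rmsd(pose, native) < threshold


native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
shifted = [(x + 1.0, y, z) for x, y, z in native]  # rigid 1 A shift along x
```

A rigid 1 Å translation of every atom gives an RMSD of 1 Å, so `redock_success(shifted, native)` passes the 2.0 Å criterion.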
- Physics-based scoring with five calibrated energy terms following AutoDock4 conventions
- Pocket carving with complementary lining atoms for realistic binding-site geometry
- Soft-core van der Waals potential eliminates singularities and improves sampling through tight pockets
- Sigmoidal cooling schedule maintains exploration temperature before rapid convergence
- Multi-restart Monte Carlo with 8 independent trajectories for diverse pose generation
- Ligand-centric scoring filter evaluates only protein atoms near the current ligand pose, reducing noise and computation
- Optimized grid search with pre-computed rotation matrices, donor-acceptor masks, and vectorized pose analysis
- Dual sampling strategies — Monte Carlo simulated annealing and grid-based search
- Redocking validation across six diverse protein-ligand benchmark systems
- Interactive exploration of score landscapes, energy decomposition, and binding site geometry
- Scoring function lab — adjust weights in real time, see how scoring affects pose ranking
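The sigmoidal cooling and multi-restart ideas listed above can be sketched on a toy one-dimensional energy. Everything here is an assumption made for illustration: the logistic schedule parameters (`steepness`, `midpoint`, `t_end`), the Gaussian move size, and the quadratic `toy_energy` stand in for the real pose moves and scoring function, which this README does not show.

```python
import math
import random

K_B = 0.0019872  # Boltzmann constant in kcal/(mol*K)


def sigmoid_temperature(step, n_steps, t_start=300.0, t_end=10.0,
                        steepness=10.0, midpoint=0.5):
    """Hold near t_start early, then drop quickly to t_end around midpoint."""
    x = step / max(n_steps - 1, 1)
    frac = 1.0 / (1.0 + math.exp(steepness * (x - midpoint)))
    return t_end + (t_start - t_end) * frac


def metropolis_accept(delta_e, temperature, rng):
    """Accept downhill moves always, uphill moves with Boltzmann probability."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / (K_B * temperature))


def anneal(energy, x0, n_steps, rng, step_size=0.5):
    """Run one trajectory and return the best (x, energy) visited."""
    x, e = x0, energy(x0)
    best = (x, e)
    for step in range(n_steps):
        t = sigmoid_temperature(step, n_steps)
        x_new = x + rng.gauss(0.0, step_size)
        e_new = energy(x_new)
        if metropolis_accept(e_new - e, t, rng):
            x, e = x_new, e_new
            if e < best[1]:
                best = (x, e)
    return best


def multi_restart(energy, n_restarts=8, n_steps=300, seed=42):
    """Run independent trajectories from random starts; keep the best one."""
    rng = random.Random(seed)
    runs = [anneal(energy, rng.uniform(-5.0, 5.0), n_steps, rng)
            for _ in range(n_restarts)]
    return min(runs, key=lambda run: run[1])


def toy_energy(x):
    """Stand-in energy with a single minimum at x = 1."""
    return (x - 1.0) ** 2
```

Calling `multi_restart(toy_energy)` returns the lowest-energy state seen across the eight trajectories. In a docking setting the scalar `x` would be replaced by a pose (translation plus rotation) and `toy_energy` by the pose score.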
week_21_project_1/
├── app.py # Streamlit dashboard (8 pages)
├── main.py # CLI entry point (4 modes)
├── requirements.txt
├── .gitignore
├── README.md
├── week_21_project_1_outline.md # Project specification
├── src/
│ ├── __init__.py # Package re-exports
│ ├── docking_engine.py # Scoring, sampling, presets (~2080 lines)
│ ├── analysis.py # Analysis pipelines (~830 lines)
│ └── visualization.py # Plotly + Matplotlib renderers (~1150 lines)
├── tests/
│ └── test_lock_and_key.py # 26 test classes, 118 tests (~1120 lines)
├── docs/
│ ├── scientific_report.md
│ ├── w21p1_lock_and_key_ieee.tex
│ └── w21p1_lock_and_key_ieee.pdf
└── figures/ # Generated, gitignored
pip install -r requirements.txt
python main.py # Default docking
python main.py --dock --system "CDK2-Staurosporine" --save
python main.py --redock --steps 500
python main.py --compare --save
python main.py --gallery --verbose
streamlit run app.py
pytest tests/ -v
The docking score approximates the binding free energy:
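$$\Delta G_{\text{bind}} \approx W_{\text{vdW}} E_{\text{vdW}} + W_{\text{elec}} E_{\text{elec}} + W_{\text{desolv}} E_{\text{desolv}} + W_{\text{hbond}} E_{\text{hbond}} + W_{\text{tors}} N_{\text{rot}}$$

where each $W$ is the weight given in the Component/Weight table, each $E$ is the corresponding unweighted energy term, and $N_{\text{rot}}$ is the number of rotatable bonds in the ligand.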
| Flag | Default | Description |
|---|---|---|
| `--dock` | ✓ | Run docking (default mode) |
| `--redock` | | Redocking challenge with RMSD validation |
| `--compare` | | Cross-system comparison |
| `--gallery` | | Show all preset systems |
| `--system` | Trypsin-Benzamidine | Preset system name |
| `--method` | monte_carlo | Sampling method (monte_carlo / grid) |
| `--steps` | 300 | Number of sampling steps |
| `--temperature` | 300.0 | MC temperature (K) |
| `--seed` | 42 | Random seed |
| `--save` | | Save figures to figures/ |
| `--verbose` | | Show additional output |
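For readers who want to see how flags like these are typically declared, the following is a hypothetical `argparse` reconstruction built only from the table. The actual `main.py` may organize its parser differently; `build_parser` and `selected_mode` are names invented for this sketch, and the rule that the first mode flag found wins is also an assumption.

```python
import argparse


def build_parser():
    """Declare the flags from the table with their documented defaults."""
    parser = argparse.ArgumentParser(description="Docking CLI (sketch)")
    modes = {
        "dock": "Run docking (default mode)",
        "redock": "Redocking challenge with RMSD validation",
        "compare": "Cross-system comparison",
        "gallery": "Show all preset systems",
    }
    for name, text in modes.items():
        parser.add_argument(f"--{name}", action="store_true", help=text)
    parser.add_argument("--system", default="Trypsin-Benzamidine",
                        help="Preset system name")
    parser.add_argument("--method", default="monte_carlo",
                        choices=["monte_carlo", "grid"],
                        help="Sampling method")
    parser.add_argument("--steps", type=int, default=300,
                        help="Number of sampling steps")
    parser.add_argument("--temperature", type=float, default=300.0,
                        help="MC temperature (K)")
    parser.add_argument("--seed", type=int, default=42, help="Random seed")
    parser.add_argument("--save", action="store_true",
                        help="Save figures to figures/")
    parser.add_argument("--verbose", action="store_true",
                        help="Show additional output")
    return parser


def selected_mode(args):
    """Fall back to docking when no other mode flag is given."""
    for mode in ("redock", "compare", "gallery"):
        if getattr(args, mode):
            return mode
    return "dock"
```

For example, `build_parser().parse_args(["--redock", "--steps", "500"])` yields `steps == 500` while leaving `seed` and `system` at their defaults.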
- 🏠 Home — Project overview, scoring function table, system gallery
- 🔑 Dock a Ligand — Run docking with configurable parameters, 3D pose overlay
- 🎯 Redocking Challenge — Validate native pose recovery (RMSD < 2.0 Å)
- 📊 Score Explorer — Score-vs-rank, convergence, contact analysis, heatmaps
- 🏗️ Binding Site Inspector — Pocket geometry, residue composition, 3D map
- ⚖️ Cross-System Comparison — Redocking performance across benchmarks
- 🧪 Scoring Function Lab — Energy decomposition, custom weight explorer
- 📚 Theory & Mathematics — Mathematical foundations with LaTeX equations
| Component | Weight | Physics |
|---|---|---|
| van der Waals | 0.1662 | Shape complementarity (Lennard-Jones 6-12) |
| Electrostatic | 0.1209 | Charge-charge interactions (Coulomb) |
| Desolvation | 0.1406 | Solvation penalty (Eisenberg-McLachlan) |
| Hydrogen Bond | 0.0585 | Directional H-bonds (10-12 potential) |
| Torsion | 0.2983 | Conformational entropy per rotatable bond |
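As a worked illustration of how these weights combine, the sketch below multiplies each raw energy term by its weight and adds the torsional penalty per rotatable bond. The weights come from the table; the dictionary keys, function names, and example energies are hypothetical and chosen only to show the arithmetic.

```python
WEIGHTS = {
    "vdw": 0.1662,
    "elec": 0.1209,
    "desolv": 0.1406,
    "hbond": 0.0585,
    "tors": 0.2983,
}


def score_breakdown(raw_terms, n_rotatable, weights=WEIGHTS):
    """Return the weighted contribution of each component (kcal/mol)."""
    parts = {name: weights[name] * raw_terms[name]
             for name in ("vdw", "elec", "desolv", "hbond")}
    # The torsional term scales with the number of rotatable bonds.
    parts["tors"] = weights["tors"] * n_rotatable
    return parts


def total_score(raw_terms, n_rotatable, weights=WEIGHTS):
    """Sum the weighted components into a single docking score."""
    return sum(score_breakdown(raw_terms, n_rotatable, weights).values())


# Made-up raw energies for a ligand with one rotatable bond.
example_terms = {"vdw": -10.0, "elec": -2.0, "desolv": 1.0, "hbond": -3.0}
example_score = total_score(example_terms, n_rotatable=1)  # about -1.6404
```

Passing a different `weights` dictionary reproduces the kind of experiment the Scoring Function Lab page describes: the raw terms stay fixed while the ranking of poses can change.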
Comprehensive test suite with 26 test classes and 118 test methods covering:
- Atom/Molecule dataclass construction and properties
- Scoring functions — vdW (hard-core + soft-core), electrostatic, desolvation, H-bond energy
- Pose scoring — total score and breakdown with binding-site filtering
- RMSD calculation — identity, known values, multi-atom
- Transforms — identity, translation, rotation distance preservation
- Sampling — Monte Carlo (sigmoidal + linear cooling, multi-restart) and grid search
- Pocket carving — clearance filtering, lining atom placement, atom count reduction
- Preset builders — all 6 systems build correctly with pocket carving
- Analysis pipelines — docking, redocking, scoring, comparison, binding site
- Plotly renderer — all 11 static methods return `go.Figure`
- Matplotlib renderer — all 8 static methods return `plt.Figure`
- CLI parsing — all modes and flags
- Integration — full build → dock → analyze → visualize pipeline
- Edge cases — empty inputs, unknown systems, boundary values, soft-core potentials
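To give a flavor of the transform checks in the list above, here is a small pytest-style sketch. It does not reproduce `tests/test_lock_and_key.py`: the helpers `rotate_z` and `translate` are hypothetical stand-ins for the project's own transform functions, written with the standard library only.

```python
import math


def rotate_z(point, angle):
    """Rotate a 3D point about the z-axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)


def translate(point, shift):
    """Shift a 3D point by the vector `shift`."""
    return tuple(p + d for p, d in zip(point, shift))


def test_identity_rotation_leaves_point_unchanged():
    assert rotate_z((1.0, 2.0, 3.0), 0.0) == (1.0, 2.0, 3.0)


def test_rotation_preserves_distances():
    a, b = (1.0, 0.0, 0.5), (-2.0, 3.0, 1.0)
    before = math.dist(a, b)
    after = math.dist(rotate_z(a, 1.1), rotate_z(b, 1.1))
    assert math.isclose(before, after, rel_tol=1e-9)


def test_translation_preserves_distances():
    a, b = (1.0, 0.0, 0.5), (-2.0, 3.0, 1.0)
    shift = (4.0, -1.0, 2.5)
    before = math.dist(a, b)
    after = math.dist(translate(a, shift), translate(b, shift))
    assert math.isclose(before, after, rel_tol=1e-9)
```

Rigid-body docking relies on exactly this invariant: rotating or translating the ligand must not change its internal geometry.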
numpy>=1.24
scipy>=1.10
matplotlib>=3.7
plotly>=5.14
streamlit>=1.28
pandas>=2.0
pytest>=7.3
Ryan Kamp Department of Computer Science, University of Cincinnati kamprj@mail.uc.edu | GitHub