ALF: Open-Source Active Learning Framework for Atomistic Modeling

📄Documentation

This code automates the construction of datasets for machine learned interatomic potentials (MLIPs) through active learning. By automating job execution utilizing the Parsl framework, the active learning process can run for many iterations without human intervention. ALF breaks the process down into 4 fundamental tasks:

Initial system construction (Bootstrapping)
ML interatomic potential training
ML configurational sampling
Electronic structure labeling

Overview of the ALF workflow.

ALF tracks model uncertainty through a general-purpose ensemble calculator. During each active learning iteration, a committee of independently trained MLIP models evaluates newly sampled configurations, and ALF uses the disagreement among the ensemble predictions to identify regions of the potential energy surface that are insufficiently represented in the current training dataset. These high-uncertainty configurations are then selected for ground-truth labeling and added to the training dataset.
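
The selection step described above can be sketched in a few lines. This is a minimal illustration, not ALF's ensemble calculator: the function names, the choice of the per-atom standard deviation of predicted energies as the disagreement metric, the threshold value, and the example numbers are all assumptions made for this example.

```python
from statistics import pstdev


def ensemble_disagreement(energies, n_atoms):
    """Population std. dev. of the committee's energy predictions, per atom."""
    return pstdev(energies) / n_atoms


def select_for_labeling(predictions, threshold):
    """Return indices of configurations whose disagreement exceeds threshold.

    predictions: list of (energies, n_atoms) pairs, one per configuration,
    where energies holds one predicted energy per ensemble member.
    """
    return [
        i
        for i, (energies, n_atoms) in enumerate(predictions)
        if ensemble_disagreement(energies, n_atoms) > threshold
    ]


# Hypothetical predictions (eV) from a 4-member ensemble for 3 configurations.
predictions = [
    ([-10.00, -10.01, -9.99, -10.00], 4),   # members agree
    ([-10.00, -9.20, -10.90, -10.40], 4),   # members disagree
    ([-20.00, -20.02, -19.98, -20.00], 8),  # members agree
]
selected = select_for_labeling(predictions, threshold=0.05)
print(selected)
```

Only the second configuration exceeds the threshold here, so it alone would be passed on for ground-truth labeling. In practice the metric and threshold would depend on the model and the system being studied.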

ALF uses Parsl to manage task execution across HPC clusters and job schedulers (e.g., SLURM). Because resource layouts differ between computing environments, users should modify the Parsl configuration files in alframework/parsl_resource_configs before running production workflows. These files control machine-specific settings such as scheduler type, queue or partition name, allocation account, walltime, node counts, worker counts, launch commands, and software environment setup.
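
As a rough orientation, a Parsl configuration for a SLURM cluster typically looks like the sketch below. It uses Parsl's public `Config`, `HighThroughputExecutor`, `SlurmProvider`, and `SrunLauncher` classes, but every value in it (executor label, partition, account, walltime, block counts, environment setup commands) is a placeholder. How ALF expects these objects to be named and exposed is not shown here, so treat the files in alframework/parsl_resource_configs as the authoritative templates.

```python
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.launchers import SrunLauncher
from parsl.providers import SlurmProvider

# All values below are placeholders for a hypothetical cluster.
config = Config(
    executors=[
        HighThroughputExecutor(
            label="alf_qm",  # hypothetical executor name
            provider=SlurmProvider(
                partition="standard",        # queue or partition name
                account="my_allocation",     # allocation account
                nodes_per_block=1,           # nodes requested per scheduler job
                init_blocks=0,               # jobs submitted at startup
                min_blocks=0,
                max_blocks=4,                # upper bound on concurrent jobs
                walltime="04:00:00",
                # Shell commands run on the compute node before workers start.
                worker_init="module load python; source activate alf",
                launcher=SrunLauncher(),     # launch workers with srun
            ),
        )
    ]
)
```

Building the `Config` object does not submit anything to the scheduler; jobs are requested only once the configuration is loaded and tasks are dispatched.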

📋 Requirements:

The requirements for this software are evolving, though generally, they will include the following:

Parsl
NumPy
ASE
A QM software package with an interface (usually through ASE)
An MLIP model - we provide an interface to the open-source and flexible HIPPYNN architecture

✅ Installation

Clone the repository and install ALF using pip:

git clone https://github.com/lanl/ALF.git
cd ALF
python -m pip install -e .

⚙️ Configuration:

Five JSON files control job flow in the active learning framework:

master_config.json
builder_config.json
ml_config.json
mlmd_config.json
qm_config.json

The master configuration file controls how all pieces of the framework are assembled and defines where to find the other 4 files. Each of the other 4 files passes inputs to one of the four subtasks enumerated in the first section. To see how these files relate to one another, please see the examples folder.
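
Before launching anything, it can help to confirm that all five files exist and parse as JSON. The helper below is a hypothetical convenience script, not part of ALF. It assumes the five files sit in one directory under the names listed above, which may not match your setup because the master configuration file decides where the other four are found. It does not inspect any keys.

```python
import json
from pathlib import Path

# File names as listed in this README; a single shared directory is assumed.
CONFIG_FILES = [
    "master_config.json",
    "builder_config.json",
    "ml_config.json",
    "mlmd_config.json",
    "qm_config.json",
]


def check_configs(config_dir):
    """Map each expected file name to 'ok', 'missing', or a JSON error message."""
    report = {}
    for name in CONFIG_FILES:
        path = Path(config_dir) / name
        if not path.is_file():
            report[name] = "missing"
            continue
        try:
            json.loads(path.read_text())
            report[name] = "ok"
        except json.JSONDecodeError as err:
            report[name] = f"invalid JSON: {err}"
    return report
```

Calling `check_configs("path/to/run_dir")` returns a dictionary that can be printed or inspected before starting the framework.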

With the JSON files completed, Python must be able to import alframework. If the package was not installed with pip as described in the Installation section, set the PYTHONPATH environment variable to the directory that contains alframework.
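
A quick way to confirm that Python can locate the package before starting a run is sketched below. It relies only on the standard library's `importlib.util.find_spec`; the helper name `is_importable` and the printed message are illustrative.

```python
import importlib.util


def is_importable(module_name):
    """Return True if Python can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


if not is_importable("alframework"):
    print(
        "alframework not found: install it with pip or add the directory "
        "that contains it to PYTHONPATH"
    )
```

The check locates the module without importing it, so it stays cheap even in environments where the package's heavier dependencies take time to load.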

🧪 Testing:

Once the environment is constructed with the required packages, it is important to test the individual operations performed by the active learning framework for errors. This ensures that all processes complete successfully when run during active learning. Each of the four subprocesses can be tested as follows:

python -m alframework master.json --test_builder #Test structure building
python -m alframework master.json --test_sampler #Test mlmd sampling
python -m alframework master.json --test_ml #Test ML training
python -m alframework master.json --test_qm #Test QM labeling

These test modes pass errors back to the front end, which makes them easier to debug than errors encountered during the active learning phase.
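
The four test invocations can also be chained from a small driver script. The sketch below is a hypothetical wrapper, not part of ALF: it builds the same commands shown above and stops at the first one that fails. It assumes that a failed test exits with a nonzero return code, which this README does not state.

```python
import subprocess
import sys

# Flags as listed in the Testing section.
TEST_FLAGS = ["--test_builder", "--test_sampler", "--test_ml", "--test_qm"]


def build_test_commands(master_config):
    """Return one argv list per test flag, using the current interpreter."""
    return [
        [sys.executable, "-m", "alframework", master_config, flag]
        for flag in TEST_FLAGS
    ]


def run_tests(master_config):
    """Run each test in order; return the first failing flag, or None."""
    for cmd in build_test_commands(master_config):
        result = subprocess.run(cmd)
        if result.returncode != 0:  # assumption: nonzero exit signals failure
            return cmd[-1]
    return None
```

For example, `run_tests("master.json")` returns `None` when all four tests pass under that assumption, or the flag of the first test that did not.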

▶️ Execution:

Once each task has been tested, active learning can be started with:

python -m alframework master.json

It is generally advised to run the master process on a head node inside a terminal multiplexer (screen, tmux, zellij) for session persistence. This will allow the ALF master process to continue to run over multiple days/weeks, even after you disconnect. It will automatically interface with the queueing system and run future jobs on compute nodes.

📃 Citations:

If you use ALF in your research, you must cite papers 1-3 and this repository. Please also consider citing the other examples below.

[1] Code release paper and molten salts case study
in preparation - link will appear here

[2] ALF-produced MLIP for bulk aluminum
Justin S. Smith, Benjamin Nebgen, Nithin Mathew, Jie Chen, Nicholas Lubbers, Leonid Burakovsky, Sergei Tretiak, Hai Ah Nam, Timothy Germann, Saryu Fensin, Kipton Barros. "Automated discovery of a robust interatomic potential for aluminum" Nat. Comm. 2021, 12, 1257. https://doi.org/10.1038/s41467-021-21376-0

[2] Uncertainty-driven dynamics for active learning - UDD sampler
Nicholas Lubbers, Ying Wai Li, Richard Messerly, Sergei Tretiak, Justin S. Smith, Benjamin Nebgen. "Uncertainty-driven dynamics for active learning of interatomic potentials" Nat. Comp. Sci. 2023, 1968. https://doi.org/10.1038/s43588-023-00406-5

[3] ALF-trained reactive potential for organics
Shuhao Zhang, Malgorzata Makos, Ryan Jadrich, Elfi Kraka, Kipton Barros, Benjamin Nebgen, Sergei Tretiak, Olexandr Isayev, Nicholas Lubbers, Richard Messerly, Justin Smith. "Exploring the frontiers of chemistry with a general reactive machine learning potential". https://doi.org/10.26434/chemrxiv-2022-15ct6

[4] Original implementation of ALF and proof of concept study
Justin S. Smith, Ben Nebgen, Nicholas Lubbers, Olexandr Isayev, Adrian E. Roitberg. "Less is more: Sampling chemical space with active learning". J. Chem. Phys. 2018, 148, 241733. https://doi.org/10.1063/1.5023802

