| title | Pre-Workshop Setup Guide: Computational Reproducibility in Machine Learning |
|---|---|
| author | Waheed U. Bajwa (Rutgers University--New Brunswick) |
| date | February 25, 2025 |
Welcome to the Computational Reproducibility in Machine Learning workshop! To ensure a smooth experience, please follow the setup instructions below before attending.
Our default production environment will be Windows 10/11 with Windows Subsystem for Linux (WSL2), along with VS Code. The same setup works on Linux natively. However, Mac users will need to use a Docker-enabled Linux virtual machine (VM) via Canonical’s Multipass in order to build a Docker image.
This guide provides setup instructions for both Windows and macOS.
If you do not have these accounts already, please create them:
- GitHub (for versioning): Sign up here
- DockerHub (for publishing Docker images; can be linked with GitHub): Sign up here
- Zenodo (for publishing research outputs; can be linked with GitHub): Sign up here
- Open PowerShell as Administrator and run:
wsl --install
This installs WSL2 along with a default Linux distribution.
- If WSL is already installed, ensure it is set to version 2:
wsl --set-default-version 2
- Install a Linux distribution (e.g., Ubuntu 24.04 LTS):
wsl --install -d Ubuntu
- Launch the Linux terminal from the Start Menu, and when prompted, create a username and password.
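To confirm from inside the Ubuntu terminal that the distribution is running under WSL2 rather than WSL1, one option is to look at the kernel release string. The sketch below assumes the usual naming, in which WSL2 kernels report a release containing `WSL2` and WSL1 kernels report one containing `Microsoft`; the helper name `wsl_kind` is made up for this example and is not part of any tool.

```shell
# Classify a kernel release string (as printed by `uname -r`).
# Assumption: WSL2 releases contain "WSL2"; WSL1 releases contain "Microsoft".
wsl_kind() {
    case "$1" in
        *WSL2*)         echo "WSL2" ;;
        *[Mm]icrosoft*) echo "WSL1" ;;
        *)              echo "not WSL" ;;
    esac
}

# Report on the current machine.
wsl_kind "$(uname -r)"
```

From PowerShell, `wsl -l -v` gives the same information by listing each installed distribution together with its WSL version.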
Inside the Ubuntu terminal, run:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
When prompted, answer "yes" to the question:
Do you wish to update your shell profile to automatically initialize conda? (yes/no)
To make sure Miniconda is active, reload your shell configuration:
source ~/.bashrc
(Optional) To disable auto-activation of Conda’s base environment, run:
conda config --set auto_activate_base false
source ~/.bashrc
Download and install VS Code: Download here
- For Windows, select the User Installer version.
Inside VS Code:
- Install the Remote Development extension pack from Microsoft to develop inside WSL.
- Install the following additional extensions:
- Docker from Microsoft
- GitHub Repositories from GitHub
- GitHub Pull Requests from GitHub
- Jupyter from Microsoft
- Python from Microsoft
- Remote Repositories from Microsoft
- Additional extensions for R, MATLAB, etc., if needed.
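The extensions listed above can also be installed from a terminal with the `code` command-line tool, which can be quicker than searching for each one. This is a sketch only: the extension IDs below are our best understanding of the Marketplace identifiers for the listed extensions and should be checked against the Extensions view, and the function name `install_extensions` is made up for this example.

```shell
# Marketplace IDs for the extensions listed above (assumed; verify before use).
EXTENSIONS="
ms-vscode-remote.vscode-remote-extensionpack
ms-azuretools.vscode-docker
GitHub.remotehub
GitHub.vscode-pull-request-github
ms-toolsai.jupyter
ms-python.python
ms-vscode.remote-repositories
"

install_extensions() {
    # The `code` launcher must be on PATH for this to work.
    if ! command -v code >/dev/null 2>&1; then
        echo "'code' not found on PATH; install the extensions from the Extensions view instead."
        return 1
    fi
    for ext in $EXTENSIONS; do
        code --install-extension "$ext"
    done
}

install_extensions || true
```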
Download and install Git for Windows: Download here
Verify installation:
git --version
- Download and install Docker Desktop: Download here
- During installation, select WSL2 as the backend.
- Enable the WSL integration in Docker settings.
- Verify installation:
docker --version
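`docker --version` only shows that the command-line client is installed. To also check that the Docker daemon is reachable and can run a container, a small smoke test along the following lines may help. It is a sketch, not an official procedure: the function name `docker_smoke_test` and its exit codes are choices made for this example, and `hello-world` is Docker's public test image, which is pulled from DockerHub on first use.

```shell
docker_smoke_test() {
    # Exit code 2: the docker client itself is missing.
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker: not installed or not on PATH"
        return 2
    fi
    # Exit code 1: the client exists but cannot talk to the daemon.
    if ! docker info >/dev/null 2>&1; then
        echo "docker: client found, but the daemon is not reachable (is Docker Desktop running?)"
        return 1
    fi
    # Run the test image; --rm removes the container when it exits.
    docker run --rm hello-world
}

docker_smoke_test || echo "Docker is not ready yet."
```

Run it from the Ubuntu (WSL) terminal; if the daemon is not reachable there, the WSL integration setting in Docker Desktop is the first thing to check.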
Install Homebrew (Mac package manager) by running:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
Verify installation:
brew --version
Download and install VS Code for macOS: Download here
Inside VS Code:
- Install the following extensions:
- Docker from Microsoft
- GitHub Repositories from GitHub
- GitHub Pull Requests from GitHub
- Jupyter from Microsoft
- Python from Microsoft
- Remote Development from Microsoft
- Remote Repositories from Microsoft
- Additional extensions for R, MATLAB, etc., if needed.
Install Git via Homebrew:
brew install git
Verify installation:
git --version
- Download Docker for macOS: Download here
- Ensure you select the correct version for your processor (Intel or Apple Silicon).
- Verify installation:
docker --version
Generate an SSH key:
ssh-keygen -t rsa -b 4096
Copy the public key from id_rsa.pub:
cat ~/.ssh/id_rsa.pub
Create a cloud-config.yaml file for SSH access to the Multipass Linux VM (save it in ~/.ssh, next to your public key):
#cloud-config
users:
  - name: ubuntu
    sudo: ['ALL=(ALL) NOPASSWD:ALL']
    ssh-authorized-keys:
      - ssh-rsa <PASTE YOUR PUBLIC KEY HERE>
The output of the cat command already begins with ssh-rsa, so make sure that prefix appears only once in the file.
Download and install Multipass for macOS: Download here
Verify installation:
multipass version
Launch a Docker-enabled Multipass instance with SSH support:
multipass launch docker --name vscode-docker --memory 8G --disk 80G --cpus 2 --cloud-init ~/.ssh/cloud-config.yaml
Check the instance’s IP address:
multipass list
In VS Code, go to Remote Development, open SSH settings, and add:
Host multipass-vscode-docker
    HostName <MULTIPASS_INSTANCE_IP_ADDRESS>
    User ubuntu
Click on the host name multipass-vscode-docker and connect.
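The two hand-edited files in this section, cloud-config.yaml and the SSH host entry, are easy to get wrong through indentation or copy-paste slips, so it can help to generate them. The following is a minimal sketch under the same assumptions as the steps above (user `ubuntu`, host alias `multipass-vscode-docker`, key in `~/.ssh/id_rsa.pub`). The function names `render_cloud_config` and `render_ssh_host` and the IP address in the usage comment are hypothetical.

```shell
# Print a cloud-config that authorizes the given public key file for user "ubuntu".
# The key file is inserted whole, so its "ssh-rsa" prefix appears exactly once.
render_cloud_config() {
    pubkey_file="$1"
    cat <<EOF
#cloud-config
users:
  - name: ubuntu
    sudo: ['ALL=(ALL) NOPASSWD:ALL']
    ssh-authorized-keys:
      - $(cat "$pubkey_file")
EOF
}

# Print an SSH config entry for the Multipass instance at the given IP address.
render_ssh_host() {
    cat <<EOF
Host multipass-vscode-docker
    HostName $1
    User ubuntu
EOF
}

# Example usage (uncomment and adjust; the IP address is a placeholder):
# render_cloud_config ~/.ssh/id_rsa.pub > ~/.ssh/cloud-config.yaml
# render_ssh_host 192.168.64.5 >> ~/.ssh/config
```

Both functions only print to standard output, so you can inspect the result before redirecting it into a file.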
Once inside the Multipass shell, install Miniconda:
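wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
On Apple Silicon Macs the Multipass VM is ARM64, so use Miniconda3-latest-Linux-aarch64.sh in both commands instead.
Follow the same Conda activation steps as in the Windows setup.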
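Once Conda is initialized (in WSL or in the Multipass VM), a quick way to check that it can actually resolve and install packages is to create a throwaway environment, run Python in it, and remove it again. This is a sketch: the environment name `workshop-test` and the function name `conda_smoke_test` are made up for this example, and creating the environment downloads packages, so it needs network access.

```shell
# Hypothetical name for a temporary environment.
ENV_NAME="workshop-test"

conda_smoke_test() {
    if ! command -v conda >/dev/null 2>&1; then
        echo "conda: not found; open a new shell or re-run 'source ~/.bashrc'"
        return 1
    fi
    conda --version
    # Create a small environment, run Python inside it, then clean up.
    conda create --yes --name "$ENV_NAME" python &&
    conda run --name "$ENV_NAME" python --version &&
    conda env remove --yes --name "$ENV_NAME"
}

conda_smoke_test || echo "Conda is not ready yet."
```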
(Optional Note): If you cannot connect to your Multipass Linux VM from VS Code, you can still do the Docker-based activities within the Linux VM, using a folder from your Mac mounted into the VM. The remaining activities can be done with Miniconda installed directly on macOS (installation guide) and VS Code running on the host.
To stop (shut down) the instance:
multipass stop vscode-docker
To delete the instance:
multipass delete --purge vscode-docker
- Ensure you can create files using VS Code inside your WSL (Windows) or Multipass Linux VM (Mac).
- Play around with Conda, Git, and Docker before the workshop, to the extent possible.
- Test that Jupyter Notebooks work inside your environment.
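As a last check before the workshop, it may be convenient to confirm in one go that the command-line tools from this guide are on your PATH. The helper below is a sketch: the name `check_tools` is made up, and the list of tools passed to it is our reading of the sections above rather than an official requirement list. Note that `code` is typically only available in a terminal opened from VS Code or on a machine where VS Code is installed.

```shell
# Report whether each named tool is on PATH.
# The return value is the number of missing tools (0 means all were found).
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok       $tool"
        else
            echo "MISSING  $tool"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

check_tools git conda docker code || echo "Some tools are missing; revisit the matching section above."
```

Run it inside the environment you will use at the workshop (the WSL Ubuntu terminal on Windows, or the Multipass VM on a Mac), since the host system and the Linux environment have separate PATHs.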
This guide ensures that your environment is properly set up before the workshop. Please complete all steps and reach out if you encounter issues. See you at the workshop!