Recreate workshop for CHTC (HTCondor) #76

Draft
qualiaMachine wants to merge 1 commit into main from claude/recreate-chtc-workshop-O8Xjm

@qualiaMachine
Owner

Summary

  • Adapts the entire GCP/Vertex AI workshop into a CHTC/HTCondor workshop for UW-Madison's Center for High Throughput Computing
  • Replaces all 9 episodes: GCP concepts → HTCondor submit files, GPU Lab, DAGMan workflows, CHTC filesystems
  • Adds new submit_files/ directory with HTCondor submit files, wrapper scripts, and parameter files
  • Adds new helper scripts: preprocess_data.py, evaluate_model.py, aggregate_results.py
  • Updates all learner materials (setup, reference, compute guide) and instructor notes for CHTC
  • Preserves the same pedagogical structure (Carpentries Sandpaper format) and Titanic dataset examples
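
As a rough illustration of how a submit file and a parameter file pair up, the sketch below builds both as plain text in Python. The file names (`run_train.sh`, `params.txt`), the resource requests, and the hyperparameter names (`lr`, `depth`) are assumptions chosen for illustration; they are not taken from the `submit_files/` directory in this PR.

```python
from itertools import product


def make_params(learning_rates, max_depths):
    """Return parameter-file text: one 'lr, depth' pair per line."""
    lines = [f"{lr}, {depth}" for lr, depth in product(learning_rates, max_depths)]
    return "\n".join(lines) + "\n"


def make_submit_file(executable="run_train.sh", params_file="params.txt"):
    """Return a minimal HTCondor submit description (all values hypothetical)."""
    return "\n".join([
        f"executable = {executable}",
        # Each line of the parameter file fills in $(lr) and $(depth) for one job.
        "arguments = $(lr) $(depth)",
        "log = train_$(Cluster).log",
        "output = train_$(Cluster)_$(Process).out",
        "error = train_$(Cluster)_$(Process).err",
        "request_cpus = 1",
        "request_memory = 4GB",
        "request_disk = 2GB",
        # Queue one job per line of the parameter file.
        f"queue lr, depth from {params_file}",
    ]) + "\n"


params_text = make_params([0.1, 0.3], [3, 5])
submit_text = make_submit_file()
print(params_text)
print(submit_text)
```

With these inputs the parameter text has four lines, so the `queue ... from` statement would create four jobs, one per hyperparameter combination.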

Episode mapping (GCP → CHTC)

| # | Was (GCP) | Now (CHTC) |
|----|-----------|------------|
| 01 | Overview of GCP for ML | Overview of CHTC for ML |
| 02 | Vertex AI Workbench notebooks | Connecting to CHTC (SSH, submit node) |
| 03 | GCS data storage | Data management (/home, /staging, SQUID) |
| 04 | Vertex AI custom jobs (XGBoost) | HTCondor training jobs (XGBoost) |
| 05 | Vertex AI GPU jobs (PyTorch) | CHTC GPU Lab jobs (PyTorch) |
| 06 | Vertex AI HP tuning | HTCondor queue + DAGMan HP tuning |
| 07 | RAG with Gemini on Vertex AI | RAG on CHTC (API-based + open-source options) |
| 08 | gcloud CLI workflows | Advanced HTCondor workflows (DAGMan) |
| 09 | GCP billing & cleanup | Resource management & best practices |

Test plan

  • Verify all 9 episodes render correctly in Sandpaper
  • Test HTCondor submit files on a CHTC submit node
  • Run training scripts (train_xgboost.py, train_nn.py) locally and via HTCondor
  • Test HP tuning pipeline with params.txt
  • Verify DAGMan workflow example
  • Test RAG pipeline with API key setup
  • Review all episodes for accuracy of CHTC-specific details (file paths, GPU Lab settings, runtime limits)
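
For the DAGMan item, a minimal sketch of what a linear workflow description could look like is given below, generated from Python. The node names and `.sub` file names are hypothetical and do not refer to files in this PR; only the `JOB` and `PARENT ... CHILD` keywords are standard DAGMan syntax.

```python
def make_dag(stages):
    """Build DAGMan input text for a linear chain of (node_name, submit_file) stages."""
    # One JOB line per node, naming the submit file that node runs.
    lines = [f"JOB {name} {submit_file}" for name, submit_file in stages]
    # Each stage becomes the parent of the stage that follows it.
    for (parent, _), (child, _) in zip(stages, stages[1:]):
        lines.append(f"PARENT {parent} CHILD {child}")
    return "\n".join(lines) + "\n"


# Hypothetical three-stage pipeline: preprocess, then train, then evaluate.
stages = [
    ("preprocess", "preprocess.sub"),
    ("train", "train.sub"),
    ("evaluate", "evaluate.sub"),
]
dag_text = make_dag(stages)
print(dag_text)
```

Saving this text to a file and passing it to `condor_submit_dag` on a submit node is the usual way to run such a chain, with each stage starting only after its parent completes successfully.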

https://claude.ai/code/session_013kTNZ8Y4fXeTMqgJSUhEnn

Replace all 9 GCP/Vertex AI episodes with CHTC/HTCondor equivalents:
- Ep01: Overview of CHTC for ML (was GCP overview)
- Ep02: Connecting to CHTC via SSH (was Vertex AI Workbench notebooks)
- Ep03: Data management on CHTC filesystems (was GCS buckets)
- Ep04: Training with HTCondor submit files (was Vertex AI custom jobs)
- Ep05: GPU training via CHTC GPU Lab (was Vertex AI GPU jobs)
- Ep06: HP tuning with HTCondor queue/DAGMan (was Vertex AI HP tuning)
- Ep07: RAG pipeline on CHTC (was Vertex AI + Gemini)
- Ep08: Advanced HTCondor workflows/DAGMan (was gcloud CLI)
- Ep09: Resource management best practices (was GCP billing/cleanup)

Also adds:
- HTCondor submit files and wrapper scripts (submit_files/)
- New helper scripts (preprocess, evaluate, aggregate results)
- Updated learner materials (setup, reference, compute guide)
- CHTC-specific instructor notes and learner profiles
- Updated config.yaml, index.md, README.md for CHTC branding

https://claude.ai/code/session_013kTNZ8Y4fXeTMqgJSUhEnn
@github-actions

🆗 Pre-flight checks passed 😃

This pull request has been checked and contains no modified workflow files or spoofing.

Results of any additional workflows will appear here when they are done.
