Version: 11.0
Contact: hsharma@anl.gov
This manual walks you through the workflow of taking radially integrated (caked) powder-diffraction data produced by MIDAS and importing it into GSAS-II. While GSAS-II is a full-featured package for advanced crystallographic analysis, this guide is an oversimplified introduction focused specifically on reading MIDAS integration results for profile fitting.
It assumes you already know your material (via a provided CIF file) and are primarily interested in extracting refined lattice parameters, macroscopic strains, and potentially texture analysis. Full Rietveld refinement (including solving or heavily refining atomic positions and thermal parameters) is a complex process and is better learned through dedicated GSAS-II tutorials.
| Component | Minimum Version | Installation |
|---|---|---|
| MIDAS | 9.0 | See README.md — typically installed to ~/opt/MIDAS/ but any path works. Scripts auto-detect their install location. |
| GSAS-II | Latest | conda install gsas2pkg -c briantoby or pip install GSASII — see §2.3 below. |
| Python | 3.9+ | Required by both packages |
| zarr | Latest | pip install zarr or conda install zarr (required for reading MIDAS .zarr.zip files) |
The MIDAS scripts need to import GSASIIscriptable at runtime. There are two setups:
-
Conda (recommended): Install the
gsas2pkgpackage into a conda environment:conda create -n GSASII briantoby::gsas2pkg -c conda-forge conda activate GSASII
Or install into an existing environment:
conda install gsas2pkg -c briantoby -c conda-forge
-
Pip: Install directly from the GSAS-II GitHub:
pip install GSASII
-
Manual / source install: If GSAS-II is installed in a non-standard location, point the scripts to it with an environment variable:
export GSASII_PATH=/path/to/GSAS-IIThis can point to either the GSAS-II root or the
GSASIIsubdirectory — the script tries both. Add this to your.bashrc/.zshrcfor persistence.
Before starting, you should have:
- A calibrated parameter file from
AutoCalibrateZarr.pyor manual calibration (see FF_Calibration.md). - A caked output file produced by
integrator.py(see FF_Radial_Integration.md).
The MIDAS integrator.py produces caked diffraction data in a .zarr.zip file. This file contains everything GSAS-II needs to import the data as a set of 1D powder patterns, and is read by the MIDAS zarr importer in GSAS-II.
The streaming GPU pipeline (integrator_batch_process.py → integrator_stream_process_h5.py) also generates a compatible .zarr.zip alongside the HDF5 output. Instrument parameters are read from the parameter file or use defaults matching the GSAS-II reader. Use --no-zarr to skip zarr creation, or --zarr-output <name> to set a custom filename.
<filename>.zarr.zip
├── InstrumentParameters/
│ ├── Distance # Sample-to-detector distance (µm)
│ ├── Lam # Wavelength (Å)
│ ├── Polariz # Polarization fraction
│ ├── SH_L # Low-angle asymmetry parameter
│ ├── U, V, W # Gaussian peak-profile parameters (Caglioti)
│ ├── X, Y # Lorentzian peak-profile parameters
│ └── Z # Additional profile parameter
├── OmegaSumFrame/
│ └── LastFrameNumber_<N> # Caked intensity data (R × η array)
│ └── .attrs # FirstOme, LastOme, Number Of Frames Summed
├── Omegas # Effective ω angles for each summed frame
└── REtaMap # (4 × R × η) array: 2θ, η, intensity/bin-area
These parameters are set in the MIDAS caking parameter file and directly control the structure of the output:
| Parameter | Description | Typical Value |
|---|---|---|
EtaBinSize |
Azimuthal bin width (°) | 5 |
EtaMin / EtaMax |
Azimuthal range (°) | −180 / 180 |
RBinSize |
Radial bin width (pixels) | 0.5 |
RMin / RMax |
Radial range (pixels) | 10 / 2880 |
OmegaSumFrames |
Number of ω frames summed per output pattern | 20 |
Important
The OmegaSumFrames parameter controls how many raw frames are averaged into a single caked pattern. Setting this to the total number of frames produces a single, high-statistics pattern — ideal for Rietveld refinement. Setting it to a smaller number produces multiple patterns at different ω positions, useful for texture analysis or checking for preferred orientation.
For full details, see FF_Radial_Integration.md. A typical invocation is:
python ~/opt/MIDAS/FF_HEDM/workflows/integrator.py \
-resultFolder ./output \
-paramFN ps_caking.txt \
-dataFN /path/to/data.vrx.h5 \
-dataLoc /exchange/data \
-darkFN /path/to/dark.vrx.h5 \
-darkLoc /exchange/data \
-startFileNr 48005 \
-endFileNr 48024 \
-convertFiles 1 \
-mapDetector 1 \
-nCPUs 10- Launch GSAS-II from your terminal or application launcher.
- Go to File → New Project and save the project with a descriptive name (e.g.,
CeO2_refinement.gpx).
- Go to Import → Powder Data → from MIDAS zarr file.
- Browse to your
.zarr.zipfile and select it. - GSAS-II will read all azimuthal lineouts from the zarr file. Each lineout with ≥ 20 unmasked points becomes a separate histogram.
- Instrument parameters (wavelength, profile terms) are automatically read from the
InstrumentParameters/group in the zarr file. You can override them with an external.instprmfile placed alongside the.zarr.zipwith the same base name.
Tip
If GSAS-II does not show the MIDAS import option, ensure that zarr is installed in the Python environment used to run GSAS-II. You can verify with python -c "import zarr; print(zarr.__version__)".
After importing, the powder pattern should appear in the plot window. Check:
- The x-axis shows 2θ (or Q, d-spacing — toggle via the plot menu).
- The peak positions look reasonable for your calibrant or sample.
- The intensity scale is not clipped or empty.
If the pattern looks inverted, shifted, or empty, the most common causes are:
- Wrong
ImTransOptduring MIDAS calibration. - Incorrect
RMin/RMaxrange (too narrow or in the wrong units). - The dark frame was not subtracted properly.
Before proceeding to refinement, verify that the MIDAS calibration is consistent with what GSAS-II expects.
If your caked data is from a known powder standard (CeO₂, LaB₆, Si):
- Select the imported powder histogram in the data tree.
- Go to Import → Phase → from CIF file and load the CIF file for your calibrant.
- Link the phase to the histogram when prompted.
- Click Calculate → Compute (without refining anything) to overlay the expected peak positions on the observed pattern.
- The tick marks should align with observed peak centers. If there is a systematic shift, revisit your MIDAS calibration.
- Select the Background entry under the histogram in the data tree.
- Choose a Chebyshev polynomial with 3–6 coefficients as a starting background model.
- Check the Refine box next to the background function.
For our purposes (extracting lattice parameters, measuring macroscopic strains, or observing texture), we perform a partial refinement. We assume the crystal structure provided in your CIF is correct, so we will not refine atomic positions or thermal parameters.
Refinement should proceed in stages, from the most robust parameters to the most sensitive. Below is a recommended sequence.
flowchart TD
A["1. Background<br/>(Chebyshev 3–6 terms)"] --> B["2. Scale Factor"]
B --> C["3. Unit Cell<br/>(lattice parameters/strain)"]
C --> D["4. Peak Profile<br/>(U, V, W, X, Y)"]
D --> E["5. Preferred Orientation<br/>(if needed for texture)"]
style A fill:#1a1a2e,stroke:#e94560,color:#fff
style B fill:#16213e,stroke:#0f3460,color:#fff
style C fill:#16213e,stroke:#0f3460,color:#fff
style D fill:#0f3460,stroke:#e94560,color:#fff
style E fill:#533483,stroke:#e94560,color:#fff
- Check the Refine box for the background.
- Under the phase's histogram data, check the Refine box for the Scale factor.
- Click Calculate → Refine. Monitor Rwp — it should drop significantly.
- Under Phases → General, check the Refine Unit Cell box.
- Refine again. The lattice parameters should converge to values close to the expected values for your material, giving you the macroscopic strain.
- Under Instrument Parameters, enable refinement of U, V, W (Gaussian broadening) one at a time.
- After stabilizing, enable X and Y (Lorentzian broadening).
- Refine after each addition. If the refinement diverges, remove the last added parameter and try with a different starting value.
Warning
Do not refine all profile parameters simultaneously on the first attempt. This can lead to strong correlations and divergence. Add parameters in groups (areas, then intensities, then widths) and verify that Rwp improves and the difference curve (observed − calculated) becomes flatter.
- Examine the difference curve at the bottom of the pattern plot. Systematic errors indicate model deficiencies (missing phase, strong texture, etc.).
- Inspect the refined values for physical reasonableness:
- Lattice parameters near expected values.
- Rwp typically < 15% for a good refinement of synchrotron data.
Go to File → Save Project to save all refinement results, parameters, and plots in the .gpx file.
- CIF: Go to Export → Phase → in CIF format to export the refined crystal structure.
- Plots: Right-click on the plot window and select Save Figure to export publication-quality figures.
- CSV tables: Some GSAS-II menus allow export of parameter tables in CSV format.
| Symptom | Likely Cause | Fix |
|---|---|---|
| No HDF5 import option in GSAS-II | h5py not installed |
Install with pip install h5py in the GSAS-II environment |
| Empty pattern after import | Wrong dataset selected in HDF5 tree | Re-import and select the correct dataset |
| Peaks at wrong 2θ positions | Incorrect wavelength | Verify Wavelength in MIDAS parameter file matches GSAS-II |
| Very broad peaks | Incorrect Lsd or px in calibration |
Recalibrate with AutoCalibrateZarr.py |
| Asymmetric peaks at low 2θ | Axial divergence | Enable the SH_L asymmetry parameter in GSAS-II |
| Goal | Recommended OmegaSumFrames |
|---|---|
| Maximum statistics (Rietveld) | Total number of frames in the scan |
| Texture analysis | 1 (individual frames) or small number |
| Quick check | 10–20 |
If your sample contains multiple phases:
- Import all phase CIF files one at a time via Import → Phase.
- Link each phase to the histogram.
- Refine scale factors for all phases simultaneously — the relative scale factors give the phase fractions.
The calibration parameters from MIDAS can serve as starting instrument-parameter guesses in GSAS-II. The key conversions are:
| MIDAS Parameter | GSAS-II Equivalent | Notes |
|---|---|---|
Lsd (µm) |
Distance (mm) | Divide by 1000 |
Wavelength (Å) |
Lam (Å) | Direct |
px (µm) |
— | Used internally by MIDAS; not needed by GSAS-II |
BC (pixels) |
— | Used internally by MIDAS; not needed by GSAS-II |
Note
Once the data is caked (converted to 1D I vs. 2θ), the detector geometry is already encoded in the 2θ values. GSAS-II does not need BC, px, or tilt parameters — it only needs the wavelength and profile parameters.
For batch processing or pipeline integration, MIDAS provides a script that automates the staged refinement workflow described above using the GSAS-II scripting API. Each histogram (lineout) is refined independently in its own .gpx project, and when --nCPUs > 1 these refinements run in parallel via multiprocessing.
Note
In the examples below, $MIDAS_INSTALL_DIR refers to wherever MIDAS is installed on your system (commonly ~/opt/MIDAS). The scripts auto-detect their install location, so you only need the correct absolute or relative path to the script.
Since our goal is primarily lattice and profile refinement rather than full Rietveld, it is strongly recommended to pass the --no-atoms flag to lock atomic positions and thermal parameters.
python $MIDAS_INSTALL_DIR/utils/gsas_ii_refine.py \
--data output/sample.zarr.zip \
--cif CeO2.cif \
--out refinement/ \
--bkg-terms 6 \
--no-atoms \
--nCPUs 8The script extracts lineouts directly from the zarr file (without relying on GSAS-II's MIDAS zarr reader) and passes each lineout to GSAS-II as a temporary .xye file. It performs the same staged refinement as the GUI workflow, per histogram, while ignoring atomic positions due to --no-atoms:
| Stage | Parameters Refined |
|---|---|
| 1 | Background (Chebyshev) + Scale |
| 2 | Unit cell (lattice parameters) |
| 3a | Gaussian profile: U, V, W |
| 3b | Lorentzian profile: X, Y + asymmetry SH/L |
Each histogram produces its own hist_NNNN.gpx project that can be opened in the GSAS-II GUI.
| Argument | Description | Default |
|---|---|---|
--data / -d |
MIDAS .zarr.zip file (required) |
— |
--cif / -c |
One or more CIF files (required) | — |
--out / -o |
Output directory for per-histogram .gpx files |
refinement/ |
--instprm / -i |
Optional .instprm override |
(from zarr) |
--bkg-terms |
Number of background coefficients | 6 |
--limits LOW HIGH |
2θ limits (degrees) | (full range) |
--nCPUs |
Number of parallel workers | 1 |
--no-atoms |
Skip atomic position refinement | — |
--no-export |
Skip CIF/CSV export | — |
-v |
Verbose (debug) logging | — |
hist_NNNN.gpx— GSAS-II project (openable in the GUI)hist_NNNN_<phase>.cif— Refined crystal structurehist_NNNN_data.csv— Pattern data (observed, calculated, difference)refinement_summary.json— Aggregated results: Rwp per histogram, mean Rwp, lattice parameters
# Refine all histograms using 16 parallel workers
python $MIDAS_INSTALL_DIR/utils/gsas_ii_refine.py \
--data output/sample.zarr.zip \
--cif Al2O3.cif Fe3O4.cif \
--out refinement/ \
--nCPUs 16 \
--limits 2.0 30.0The script can be imported directly for integration into larger pipelines:
from gsas_ii_refine import run_refinement
results = run_refinement(
data_file='output/sample.zarr.zip',
cif_files=['CeO2.cif'],
output_dir='refinement/',
bkg_terms=6,
n_cpus=8,
)
print(f"Mean Rwp = {results['mean_Rwp']:.3f}%")
print(f"Succeeded: {results['succeeded']} / {results['total_histograms']}")Tip
Start by inspecting your data in the GSAS-II GUI to verify the import and choose appropriate 2θ limits. Once satisfied, use the script to automate the refinement for batch processing.
The integrate_and_refine.py wrapper runs the full pipeline from raw data to refined crystal structures in a single command. It supports two integration backends:
| Backend | Engine | Data Source | Best For |
|---|---|---|---|
batch (default) |
integrator.py (CPU/OpenMP) |
Individual data files (-dataFN) |
Post-experiment analysis, systems without GPUs |
stream |
integrator_batch_process.py (GPU/CUDA) |
Folder of images or live PVA stream | Real-time experiments, high-throughput |
Both backends produce a .zarr.zip file that is passed to Stage 2 (Rietveld refinement).
Process individual data files using the OpenMP-based integrator:
python $MIDAS_INSTALL_DIR/FF_HEDM/workflows/integrate_and_refine.py \
--backend batch \
-paramFN ps.txt \
-dataFN data/sample_000001.h5 \
--cif CeO2.cif \
--out refinement/ \
-nCPUs 8Process a folder of images using the GPU streaming pipeline:
python $MIDAS_INSTALL_DIR/FF_HEDM/workflows/integrate_and_refine.py \
--backend stream \
--param-file ps.txt \
--folder /data/experiment/scan_01/ \
--cif CeO2.cif \
--out refinement/ \
-nCPUs 8Connect to a live EPICS PVAccess detector stream:
python $MIDAS_INSTALL_DIR/FF_HEDM/workflows/integrate_and_refine.py \
--backend stream \
--param-file ps.txt \
--pva --pva-ip 10.54.105.139 \
--cif CeO2.cif \
--out refinement/ \
-nCPUs 8| Flag | Effect |
|---|---|
--skip-integration --zarr-file FILE |
Skip Stage 1 and use an existing .zarr.zip |
--skip-refinement |
Run only Stage 1 (integration) |
--backend batch |
Use CPU batch integrator (default) |
--backend stream |
Use GPU streaming integrator |
Batch backend (--backend batch):
-paramFN, -dataFN, -resultFolder, -darkFN, -dataLoc, -darkLoc, -numFrameChunks, -preProcThresh, -startFileNr, -endFileNr, -nCPUsLocal
Stream backend (--backend stream):
--param-file, --folder, --pva, --pva-ip, --output-h5, --dark, --stream-data-loc, --compress, --zarr-output
Refinement (both backends):
--cif, --out, --instprm, --bkg-terms, --limits, --no-atoms, --no-export
Shared: -nCPUs (parallel integration and refinement), -v
- FF_Radial_Integration.md — MIDAS radial integration / caking workflow (both CPU and GPU pipelines)
- FF_Calibration.md — FF-HEDM geometry calibration (produces parameters used by integrator)
- README.md — High-level MIDAS overview and manual index
- GSAS-II Tutorials — Official GSAS-II tutorials and documentation
- GSASIIscriptable API — GSAS-II scripting reference
If you encounter any issues or have questions, please open an issue on this repository.