Parse UiPath XAML workflow files and extract complete metadata - arguments, variables, activities, expressions, and annotations.
A zero-dependency parser for UiPath XAML workflow files. Extract all metadata from automation projects:
- Workflow arguments (inputs/outputs)
- Variables and their scopes
- Activities and their configurations
- Business logic annotations
- VB.NET and C# expressions
Available in: Python (stable) | Go (planned)
pip install cpmf-uips-xaml

Or for development:
git clone https://github.com/rpapub/xaml-parser.git
cd xaml-parser/python
uv sync

from pathlib import Path
from cpmf_uips_xaml import XamlParser
parser = XamlParser()
result = parser.parse_file(Path("Main.xaml"))
if result.success:
    for arg in result.content.arguments:
        print(f"{arg.direction.upper()}: {arg.name} ({arg.type})")
        if arg.annotation:
            print(f"  → {arg.annotation}")

Output:
IN: Config (System.Collections.Generic.Dictionary<String, Object>)
→ Configuration dictionary from orchestrator
OUT: TransactionData (System.Data.DataRow)
→ Current transaction item
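
To show where this argument metadata comes from, here is a minimal standard-library sketch that reads `x:Property` declarations from a workflow file. It is not this parser's implementation: `read_arguments` and `SAMPLE_XAML` are hypothetical names, and the sketch assumes the usual Workflow Foundation layout in which arguments are declared as `x:Property` elements whose `Type` looks like `InArgument(...)`.

```python
import re
import xml.etree.ElementTree as ET

# Namespaces commonly found in Workflow Foundation XAML files.
X_NS = "http://schemas.microsoft.com/winfx/2006/xaml"
SAP2010_NS = "http://schemas.microsoft.com/netfx/2010/xaml/activities/presentation"

# Hypothetical, heavily trimmed workflow used only for this illustration.
SAMPLE_XAML = """\
<Activity x:Class="Main"
    xmlns="http://schemas.microsoft.com/netfx/2009/xaml/activities"
    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
    xmlns:sap2010="http://schemas.microsoft.com/netfx/2010/xaml/activities/presentation">
  <x:Members>
    <x:Property Name="in_Config" Type="InArgument(x:String)"
        sap2010:Annotation.AnnotationText="Configuration key" />
    <x:Property Name="out_Result" Type="OutArgument(x:Int32)" />
  </x:Members>
</Activity>
"""

# "InArgument(x:String)" -> direction "in", type "x:String"
_ARG_TYPE = re.compile(r"^(In|Out|InOut)Argument\((.*)\)$")


def read_arguments(xaml_text):
    """Return one dict per workflow argument declared in the XAML text."""
    root = ET.fromstring(xaml_text)
    arguments = []
    for prop in root.iter(f"{{{X_NS}}}Property"):
        match = _ARG_TYPE.match(prop.get("Type", ""))
        if match is None:
            continue  # a plain property, not a workflow argument
        arguments.append({
            "name": prop.get("Name"),
            "direction": match.group(1).lower(),
            "type": match.group(2),
            "annotation": prop.get(f"{{{SAP2010_NS}}}Annotation.AnnotationText"),
        })
    return arguments


for arg in read_arguments(SAMPLE_XAML):
    print(arg)
```

The library reports fully qualified .NET type names, as in the output above, while this sketch keeps the raw prefixed form (`x:String`) because it does not resolve namespace prefixes.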
result = parser.parse_file(Path("Process.xaml"))
for activity in result.content.activities:
    indent = " " * activity.depth_level
    print(f"{indent}{activity.tag}: {activity.display_name or '(unnamed)'}")

Output:
Sequence: Process Transaction
TryCatch: Try Process
Assign: Set Transaction Data
InvokeWorkflowFile: Update System
LogMessage: Transaction Complete
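
The depth level used for indentation can be understood as the nesting depth of each element in the XML tree. The following sketch, which is not the parser's own code, derives it with a recursive walk. The helper `walk` and the trimmed `SAMPLE` document are hypothetical. Real workflows also contain property elements such as `TryCatch.Try`, which a full parser would have to skip and which this sketch ignores.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified activity tree (no namespaces, no property elements).
SAMPLE = """\
<Sequence DisplayName="Process Transaction">
  <TryCatch DisplayName="Try Process">
    <Assign DisplayName="Set Transaction Data" />
  </TryCatch>
  <LogMessage DisplayName="Transaction Complete" />
</Sequence>
"""


def walk(element, depth=0):
    """Yield (depth, tag, display_name) for each element in document order."""
    tag = element.tag.rsplit("}", 1)[-1]  # drop a "{namespace}" prefix if present
    yield depth, tag, element.get("DisplayName")
    for child in element:
        yield from walk(child, depth + 1)


rows = list(walk(ET.fromstring(SAMPLE)))
for depth, tag, name in rows:
    print(f"{'  ' * depth}{tag}: {name or '(unnamed)'}")
```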
result = parser.parse_file(Path("workflow.xaml"))
# Root workflow annotation
if result.content.root_annotation:
    print(f"Workflow Purpose: {result.content.root_annotation}")
# Activity annotations
for activity in result.content.activities:
    if activity.annotation:
        print(f"\n{activity.display_name}:")
        print(f"  {activity.annotation}")

config = {'extract_expressions': True}
parser = XamlParser(config)
result = parser.parse_file(Path("workflow.xaml"))
for activity in result.content.activities:
    for expr in activity.expressions:
        print(f"{activity.display_name}: {expr.content}")
        print(f"  Language: {expr.language}")
        print(f"  Type: {expr.expression_type}")

import json
result = parser.parse_file(Path("Main.xaml"))
doc = {
    'workflow': result.content.display_name or 'Main',
    'description': result.content.root_annotation,
    'arguments': [
        {
            'name': arg.name,
            'type': arg.type,
            'direction': arg.direction,
            'description': arg.annotation
        }
        for arg in result.content.arguments
    ],
    'activity_count': len(result.content.activities),
    'variable_count': len(result.content.variables)
}
print(json.dumps(doc, indent=2))

import sys
result = parser.parse_file(Path("workflow.xaml"))
if not result.success:
    print(f"❌ Parsing failed: {', '.join(result.errors)}")
    sys.exit(1)
# Check for required arguments and report exactly which ones are absent
required = ['in_Config', 'out_Result']
actual = {arg.name for arg in result.content.arguments}
missing = [name for name in required if name not in actual]
if missing:
    print(f"❌ Missing required arguments: {', '.join(missing)}")
    sys.exit(1)
print(f"✅ Workflow valid: {len(result.content.activities)} activities")

invocations = []
for activity in result.content.activities:
    if activity.tag == 'InvokeWorkflowFile':
        workflow_path = activity.visible_attributes.get('WorkflowFileName', '')
        invocations.append(workflow_path)
print("Invoked workflows:")
for path in invocations:
    print(f"  - {path}")

New in v2.0: Transform parsed workflows into queryable graph structures with multiple output views.
Parse entire UiPath projects and analyze call graphs, control flow, and activity relationships:
from pathlib import Path
from cpmf_uips_xaml import ProjectParser, analyze_project
# Parse entire project
parser = ProjectParser()
project_result = parser.parse_project(Path("MyProject"), recursive=True)
# Build queryable graph structures
index = analyze_project(project_result)
# Query the project
print(f"Total workflows: {index.total_workflows}")
print(f"Total activities: {index.activities.node_count()}")
print(f"Entry points: {len(index.entry_points)}")
# Find circular dependencies
cycles = index.find_call_cycles()
if cycles:
    print(f"Warning: Found {len(cycles)} circular call chains")

Generate different representations of the same project:
from cpmf_uips_xaml.views import FlatView
view = FlatView()
output = view.render(index)
# Returns traditional flat list of workflows

Follow the execution path from an entry point, showing nested invocations:
from cpmf_uips_xaml.views import ExecutionView
# Start from entry point workflow
entry_workflow_id = index.entry_points[0]
view = ExecutionView(entry_point=entry_workflow_id, max_depth=10)
output = view.render(index)
# Output shows:
# - Call depth for each workflow
# - Nested activities (callee activities under InvokeWorkflowFile)
# - Execution order from entry to leaves

Use case: Understand what actually runs when you start from Main.xaml.
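
The idea behind an execution view can be sketched as a depth-limited, depth-first walk over the call graph. The code below is an illustration under assumptions, not ExecutionView's actual algorithm. The function `execution_order` and the `calls` mapping (caller to list of callees) are hypothetical, and recursive calls are simply skipped so the walk terminates.

```python
def execution_order(calls, entry, max_depth=10):
    """Return (depth, workflow) pairs in call order, starting at `entry`.

    `calls` maps each workflow to the workflows it invokes, in invocation order.
    """
    result = []

    def visit(workflow, depth, ancestors):
        result.append((depth, workflow))
        if depth >= max_depth:
            return  # do not expand invocations below the depth limit
        on_path = ancestors | {workflow}
        for callee in calls.get(workflow, []):
            if callee not in on_path:  # skip calls that would loop back
                visit(callee, depth + 1, on_path)

    visit(entry, 0, frozenset())
    return result


# Hypothetical project: Process.xaml calls back into Main.xaml.
calls = {
    "Main.xaml": ["Init.xaml", "Process.xaml"],
    "Process.xaml": ["Update.xaml", "Main.xaml"],
}

for depth, workflow in execution_order(calls, "Main.xaml"):
    print(f"{'  ' * depth}{workflow}")
```

A workflow invoked from two different callers appears twice in this walk, once under each caller, which matches the notion of following what actually runs rather than listing each file once.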
Extract focused context around a specific activity:
from cpmf_uips_xaml.views import SliceView
# Focus on a specific activity
focal_activity_id = "act:sha256:abc123def456"
view = SliceView(focus=focal_activity_id, radius=2)
output = view.render(index)
# Output includes:
# - The focal activity
# - Parent chain (root to focal)
# - Siblings (same parent)
# - Context activities within radius

Use case: Provide relevant context to LLMs without overwhelming token limits.
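
One plausible way to compute such a slice is to combine a walk up the parent chain with a breadth-first search limited to `radius` tree edges. This is a sketch under assumptions rather than SliceView's implementation. The function `slice_context` below and the `parents` mapping (activity ID to parent ID, with `None` for the root) are hypothetical.

```python
from collections import deque


def slice_context(parents, focus, radius=2):
    """Collect the parent chain, siblings, and nearby activities of `focus`."""
    # Invert the child -> parent map so children can be looked up quickly.
    children = {}
    for child, parent in parents.items():
        children.setdefault(parent, []).append(child)

    # Parent chain, ordered from the root down to the focal activity.
    chain = []
    node = focus
    while node is not None:
        chain.append(node)
        node = parents.get(node)
    chain.reverse()

    parent = parents.get(focus)
    siblings = [] if parent is None else [c for c in children[parent] if c != focus]

    # Breadth-first search over tree edges in both directions, up to `radius`.
    distance = {focus: 0}
    queue = deque([focus])
    while queue:
        current = queue.popleft()
        if distance[current] == radius:
            continue
        neighbours = list(children.get(current, []))
        if parents.get(current) is not None:
            neighbours.append(parents[current])
        for neighbour in neighbours:
            if neighbour not in distance:
                distance[neighbour] = distance[current] + 1
                queue.append(neighbour)

    return {
        "focus": focus,
        "parent_chain": chain,
        "siblings": siblings,
        "within_radius": sorted(n for n in distance if n != focus),
    }


# Hypothetical activity tree: seq -> (try -> (assign, invoke), log)
parents = {"seq": None, "try": "seq", "assign": "try", "invoke": "try", "log": "seq"}
print(slice_context(parents, "assign", radius=2))
```

A small radius keeps the slice compact, which is the point when the result is passed to a model with a limited context window.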
# Parse project with flat view (default)
cpmf-uips-xaml project.json --dto --json
# Execution view from entry point
cpmf-uips-xaml project.json --dto --json \
--view execution \
--entry "wf:sha256:abc123def456"
# Slice view around specific activity
cpmf-uips-xaml project.json --dto --json \
--view slice \
--focus "act:sha256:abc123def456" \
--radius 3
# With progress reporting (rich/tqdm/json/simple)
cpmf-uips-xaml project.json --progress rich --json

Output Modes:
- --json - Raw JSON output
- --dto - Normalized DTO with stable IDs and edges
- --arguments - Show only arguments
- --activities - Show only activities
- --tree - Show activity tree
- --summary - Show summary for multiple files
- --graph - Show workflow dependency graph (project mode)

View Transformations (with --dto):
- --view {nested,execution,slice} - View type (default: nested)
- --entry WORKFLOW_ID - Entry point for execution view
- --focus ACTIVITY_ID - Focal activity for slice view
- --radius N - Context radius for slice view

Output Options:
- --profile {full,minimal,mcp,datalake} - Output profile
- --combine - Combine all workflows into single output
- --sort - Sort output alphabetically
- -o, --output PATH - Output file/directory

Analysis:
- --metrics - Include workflow metrics
- --anti-patterns - Detect anti-patterns

Progress Reporting: (new in v0.3)
- --progress {rich,tqdm,json,simple} - Progress reporter type
  - rich - Animated progress bars (requires pip install rich)
  - tqdm - tqdm-style progress (requires pip install tqdm)
  - json - JSON-lines for machine parsing
  - simple - Plain text progress

Logging & Performance:
- -v, --verbose - Enable verbose diagnostic logging
- --log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL} - Set log level
- --log-dir DIR - Directory for log files
- --no-log-file - Disable log file output
- --performance - Enable detailed performance profiling

Project Parsing:
- --entry-points-only - Parse only entry points (no recursive discovery)
The ProjectIndex provides powerful query methods:
# Get workflow by ID or path
workflow = index.get_workflow("wf:sha256:abc123")
workflow = index.get_workflow_by_path("Workflows/Process.xaml")
# Get activity and its containing workflow
activity = index.get_activity("act:sha256:def456")
parent_workflow = index.get_workflow_for_activity("act:sha256:def456")
# Get all workflows reachable from entry point
reachable = index.workflows.reachable_from(entry_workflow_id)
# Topological sort of workflow call graph
execution_order = index.get_execution_order()
# Extract context around activity
context = index.slice_context("act:sha256:abc123", radius=2)

XAML Files → Parse → Normalize → Analyze → ProjectIndex (IR)
                                                ↓
                                  Views (Nested, Execution, Slice)
                                                ↓
                                  Emitters (JSON, Mermaid, Docs)
Stages:
- Parse - Extract raw data from XAML (XamlParser, ProjectParser)
- Normalize - Convert to stable DTOs with IDs and edges
- Analyze - Build queryable graph structures (ProjectIndex)
- View - Transform IR for specific use cases
- Emit - Output in various formats
The cpmf_uips_xaml.api module provides a clean facade organized into focused submodules:
api.parsing - Parse and normalize XAML
from pathlib import Path
from cpmf_uips_xaml.api import parse_file, parse_project, parse_file_to_dto, normalize_parse_results
# Parse single file
result = parse_file(Path("Main.xaml"))
# Parse entire project
project_result = parse_project(Path("MyProject"))
# Parse + normalize to DTO
workflow_dto = parse_file_to_dto(Path("Main.xaml"))

api.analysis - Build indices and analyze
from pathlib import Path
from cpmf_uips_xaml.api import build_index, analyze_project, parse_and_analyze_project
# Build index from workflows
index = build_index(workflows, project_dir=Path("."))
# Parse + analyze (complete pipeline)
project_result, analyzer, index = parse_and_analyze_project(Path("MyProject"))

api.views - Transform to different views
from cpmf_uips_xaml.api import render_project_view
# Render execution view
output = render_project_view(
    analyzer, index,
    view_type="execution",
    entry_point="wf:sha256:abc123"
)

api.emit - Output workflows
from cpmf_uips_xaml.api import emit_workflows
# Emit to JSON
emit_workflows(workflows, format="json", output_path=Path("output.json"))
# Emit to Mermaid diagram
emit_workflows(workflows, format="mermaid", output_path=Path("diagram.md"))

api.config - Configuration management
from cpmf_uips_xaml.api import load_default_config
config = load_default_config()

Use XamlParser directly when:
- Parsing a single file with minimal processing
- Need fine-grained control over parser config
- Working with raw ParseResult objects
Use API facade (api.*) when:
- Parsing projects (multiple files)
- Building indices and graphs
- Generating different views
- Orchestrating the full pipeline (parse → normalize → analyze → emit)
ProjectIndex is an Intermediate Representation (IR) with 4 graph layers:
- Workflows Graph: All workflows with metadata
- Activities Graph: All activities across all workflows
- Call Graph: Workflow invocation relationships
- Control Flow Graph: Activity execution edges
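
To make the call graph layer concrete, the sketch below models it as a plain dictionary and implements three queries in the spirit of `reachable_from`, `find_call_cycles` and `get_execution_order`. It uses only the standard library (`graphlib`, available from Python 3.9) and does not reflect how ProjectIndex is implemented. The function names and the `calls` mapping are assumptions made for illustration.

```python
from graphlib import CycleError, TopologicalSorter


def reachable_from(calls, start):
    """Return every workflow reachable from `start`, including `start` itself."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(calls.get(node, ()))
    return seen


def find_cycle(calls):
    """Return one call cycle as a list of workflows, or None if there is none."""
    try:
        TopologicalSorter(calls).prepare()
    except CycleError as exc:
        return exc.args[1]  # first and last entries are the same workflow
    return None


def callee_first_order(calls):
    """Order workflows so that every callee comes before its callers."""
    # TopologicalSorter treats the mapped values as predecessors, so passing
    # caller -> callees yields callees first.
    return list(TopologicalSorter(calls).static_order())


# Hypothetical call graph: caller -> workflows it invokes.
calls = {
    "Main.xaml": ["Process.xaml"],
    "Process.xaml": ["Update.xaml"],
    "Orphan.xaml": [],
}

print(reachable_from(calls, "Main.xaml"))
print(find_cycle(calls))
print(callee_first_order(calls))
```

Workflows that never show up in `reachable_from` for any entry point, such as `Orphan.xaml` here, are candidates for dead-code review.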
Benefits:
- Single parse, multiple output formats
- Queryable structure for analysis tools
- Optimized for LLM context extraction
- 100% backward compatible (NestedView produces same output as v1.x)
- Clean layer boundaries (CLI → API → Stages)
See docs/ADR-GRAPH-ARCHITECTURE.md for design decisions.
- Name, type, direction (in/out/inout)
- Default values
- Documentation annotations
- Name, type, scope
- Default values
- Scoped to workflow or activity
- Activity type (Sequence, Assign, If, etc.)
- Display name and annotations
- All properties (visible and ViewState)
- Nested configuration
- Parent-child relationships
- Depth level in tree
- VB.NET and C# expressions
- Expression type (assignment, condition, etc.)
- Variable and method references
- LINQ query detection
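
As a rough illustration of what expression extraction involves, the sketch below classifies a single attribute value. It assumes the common Workflow Foundation convention that an attribute wrapped in square brackets holds an expression, and it uses naive heuristics for identifier references and LINQ detection. The function `classify_attribute` and the hint list are hypothetical and much cruder than a real expression parser.

```python
import re

_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
_STRING_LITERAL = re.compile(r'"[^"]*"')
# Naive hints only; a real parser would inspect the expression's syntax tree.
_LINQ_HINTS = (".Where(", ".Select(", ".Any(", ".First(")


def classify_attribute(value):
    """Return expression details for "[...]" values, or None for literals."""
    text = value.strip()
    if not (text.startswith("[") and text.endswith("]")):
        return None  # a plain literal such as "Hello"
    body = text[1:-1]
    # Blank out string literals so their contents are not read as identifiers.
    without_strings = _STRING_LITERAL.sub('""', body)
    identifiers = list(dict.fromkeys(_IDENTIFIER.findall(without_strings)))
    return {
        "content": body,
        "identifiers": identifiers,
        "uses_linq": any(hint in body for hint in _LINQ_HINTS),
    }


print(classify_attribute('[in_Config("Key").ToString]'))
print(classify_attribute('[rows.Where(Function(r) r("Status").ToString = "Open").ToList()]'))
print(classify_attribute("Hello"))
```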
- XML namespaces
- Assembly references
- Expression language (VB/C#)
- Parse diagnostics and performance
The parser supports multiple output formats via emitters:
| Format | Extension | Description | Use Case |
|---|---|---|---|
| JSON | .json | Structured workflow data | API integration, data analysis |
| Mermaid | .md | Call graph diagrams | Documentation, visualization |
| Doc | .md | Human-readable docs | Team documentation |
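
For readers unfamiliar with the Mermaid row, a call graph diagram in that format is just text in Mermaid's flowchart syntax. The sketch below shows how such text could be produced from a caller-to-callees mapping. It is not the library's emitter, and `to_mermaid` and `calls` are hypothetical names.

```python
def to_mermaid(calls):
    """Render a caller -> callees mapping as a Mermaid flowchart definition."""
    ids = {}

    def node_id(name):
        # Mermaid node IDs should be simple tokens, so file names get aliases.
        if name not in ids:
            ids[name] = f"n{len(ids)}"
        return ids[name]

    lines = ["graph TD"]
    for caller, callees in calls.items():
        for callee in callees:
            lines.append(
                f'    {node_id(caller)}["{caller}"] --> {node_id(callee)}["{callee}"]'
            )
    return "\n".join(lines)


# Hypothetical call graph.
calls = {
    "Main.xaml": ["Process.xaml"],
    "Process.xaml": ["Update.xaml"],
}
print(to_mermaid(calls))
```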
Emitter Usage:
from cpmf_uips_xaml.api import emit_workflows
# JSON output
emit_workflows(workflows, format="json", output_path=Path("output.json"))
# Mermaid diagram
emit_workflows(workflows, format="mermaid", output_path=Path("diagram.md"))
# Documentation
emit_workflows(workflows, format="doc", output_path=Path("docs.md"))

CLI:
# Automatic format selection based on extension
cpmf-uips-xaml project.json -o output.json # JSON
cpmf-uips-xaml project.json --graph -o diagram.md # Mermaid

config = {
    'extract_arguments': True,    # Extract workflow arguments
    'extract_variables': True,    # Extract variables
    'extract_activities': True,   # Extract activities
    'extract_expressions': True,  # Parse expressions (slower)
    'extract_viewstate': False,   # Include ViewState data
    'strict_mode': False,         # Fail on any error
    'max_depth': 50,              # Max activity nesting depth
}
parser = XamlParser(config)

The parser handles errors gracefully:
result = parser.parse_file(Path("malformed.xaml"))
if not result.success:
    print("Errors:")
    for error in result.errors:
        print(f"  - {error}")
    print("\nWarnings:")
    for warning in result.warnings:
        print(f"  - {warning}")
    # Partial results may still be available
    if result.content:
        print(f"\nPartially parsed: {len(result.content.activities)} activities")

| Language | Status | Package |
|---|---|---|
| Python | ✅ Stable (3.9+) | xaml-parser |
| Go | 🚧 Planned | github.com/rpapub/xaml-parser/go |
- Python API Documentation - Detailed Python usage
- Contributing Guide - For developers
- Architecture - Design decisions
- Schemas - JSON output schemas
- Static Analysis - Extract metadata for code quality tools
- Documentation - Auto-generate workflow documentation
- Migration - Parse workflows for platform migration
- CI/CD Validation - Validate structure in pipelines
- Code Review - Extract business logic for review
- Dependency Analysis - Map workflow dependencies
CLI Breaking Change:
The --progress flag changed from a boolean to a choice of reporter types.
Before (v0.2.x):
cpmf-uips-xaml project.json --progress # Boolean flag

After (v0.3.x):
# Choose a specific reporter
cpmf-uips-xaml project.json --progress rich
cpmf-uips-xaml project.json --progress tqdm
cpmf-uips-xaml project.json --progress json
cpmf-uips-xaml project.json --progress simple
# Or omit for no progress (default)
cpmf-uips-xaml project.json

API Breaking Change:
The show_progress parameter was replaced with a reporter parameter.
Before (v0.2.x):
from cpmf_uips_xaml.api import parse_and_analyze_project
result, analyzer, index = parse_and_analyze_project(
    project_dir,
    show_progress=True # Boolean
)

After (v0.3.x):
from cpmf_uips_xaml.api import parse_and_analyze_project
from cpmf_uips_xaml.cli.reporters import RichReporter
# With progress
result, analyzer, index = parse_and_analyze_project(
    project_dir,
    reporter=RichReporter()
)
# No progress (default)
result, analyzer, index = parse_and_analyze_project(project_dir)

Benefits:
- Library is now UI-agnostic (no Rich dependency in core)
- Multiple reporter types (Rich, tqdm, JSON, Simple)
- Easy to add custom reporters (implement the ProgressReporter protocol)
- Zero overhead when disabled (default NULL_REPORTER)
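
The reporter design described above can be sketched with `typing.Protocol`. The method names below (`start`, `advance`, `finish`) and the classes `ReporterSketch`, `NullReporter` and `ListReporter` are assumptions made for illustration. The actual ProgressReporter protocol may define different methods, so check the library source before implementing a custom reporter.

```python
from typing import Protocol


class ReporterSketch(Protocol):
    """Hypothetical reporter interface; the real ProgressReporter may differ."""

    def start(self, total: int) -> None: ...
    def advance(self, label: str) -> None: ...
    def finish(self) -> None: ...


class NullReporter:
    """Does nothing, so callers never need an `if reporter:` check."""

    def start(self, total: int) -> None:
        pass

    def advance(self, label: str) -> None:
        pass

    def finish(self) -> None:
        pass


class ListReporter:
    """Records events in memory, for example for tests or a custom UI."""

    def __init__(self) -> None:
        self.events = []

    def start(self, total: int) -> None:
        self.events.append(("start", total))

    def advance(self, label: str) -> None:
        self.events.append(("advance", label))

    def finish(self) -> None:
        self.events.append(("finish",))


def process_files(paths, reporter: ReporterSketch = NullReporter()) -> int:
    """Stand-in for a parsing loop that reports progress once per file."""
    reporter.start(len(paths))
    for path in paths:
        reporter.advance(path)  # real code would parse `path` here
    reporter.finish()
    return len(paths)


reporter = ListReporter()
process_files(["Main.xaml", "Process.xaml"], reporter)
print(reporter.events)
```

Because the protocol is structural, a custom reporter needs no base class or import from the core library, which is what keeps the core free of UI dependencies.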
CC-BY 4.0 - Christian Prior-Mamulyan and contributors
Attribution:
XAML Parser by Christian Prior-Mamulyan, licensed under CC-BY 4.0
Source: https://github.com/rpapub/xaml-parser
- GitHub: https://github.com/rpapub/xaml-parser
- Issues: https://github.com/rpapub/xaml-parser/issues
- PyPI: https://pypi.org/project/xaml-parser/ (planned)
Originally developed as part of the rpax automation analysis project.