LiaPlayground/LiaScript_Community_Analysis

LiaScript Community Analysis

Research papers and analyses exploring the LiaScript open-source ecosystem, covering adoption patterns, collaboration networks, feature usage, and community structure.

Subprojects

| Subproject | Type | Focus |
| --- | --- | --- |
| DELFI2026_usage_patterns | Conference paper | Feature adoption analysis: which LiaScript features are actually used by course creators? |
| journal_collaboration | Journal paper | Collaboration and content reuse patterns in decentralized OER |
| conference_development | Conference paper | User segmentation and implications for platform development |
| contributor_graph | Network analysis | Contributor-repository network: are courses isolated or connected through "super users"? |
| liascript_course | Interactive course | Data-driven overview of the LiaScript ecosystem |
| author_map | Visualization | Geographic distribution of LiaScript committers (interactive map) |

Shared resources (bibliography, figures, config) are in papers/shared/.

Data

The dataset (~3.5 GB) is not included in this repository. It contains crawled LiaScript course data from GitHub (March 2026, 3,672 validated courses).

Download: https://ificloud.xsitepool.tu-freiberg.de/index.php/s/52zRDccb5AP6JFL

The data path is configured in papers/shared/config.yaml.

Quick Start

Prerequisites Check

# Verify installations
python3 --version  # Need 3.9+
pandoc --version
xelatex --version
node --version     # For Mermaid diagrams
mmdc --version     # For Mermaid diagrams
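
The same checks can be scripted. The following is a minimal sketch, not part of the repository: the helper names `find_missing_tools` and `python_is_recent_enough` are hypothetical, and the tool list simply mirrors the commands above.

```python
import shutil
import sys

# Command-line tools named in the prerequisites check above.
REQUIRED_TOOLS = ("pandoc", "xelatex", "node", "mmdc")


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def python_is_recent_enough(version_info=sys.version_info, minimum=(3, 9)):
    """Check the interpreter against the 3.9+ requirement stated above."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if not python_is_recent_enough():
        print("Python 3.9 or newer is required.")
    missing = find_missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

The script only confirms that each executable is discoverable; it does not check tool versions.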

Installation

1. System Dependencies

Ubuntu/Debian:

sudo apt install pandoc texlive-xetex texlive-latex-extra chromium-browser
npm install -g @mermaid-js/mermaid-cli

macOS:

brew install pandoc node
brew install --cask mactex chromium
npm install -g @mermaid-js/mermaid-cli

2. Python Environment

Option A: Using Pipenv (Recommended)

cd /path/to/LiaScript_Paper
pipenv install
pipenv shell

Option B: Virtual Environment

cd /path/to/LiaScript_Paper
python3 -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt

3. Mermaid Diagram Support

The project includes a Pandoc Lua filter for Mermaid diagrams (paper/filters/mermaid.lua).

Test Mermaid installation:

mmdc --version

# If "command not found":
npm install -g @mermaid-js/mermaid-cli

Troubleshooting Mermaid:

  • "mmdc not found": Add npm's global bin directory to PATH: export PATH="$PATH:$(npm prefix -g)/bin" (the older npm bin -g command was removed in npm 9)
  • "No usable sandbox": Run with --no-sandbox flag (handled automatically)
  • "Could not find Chrome": export PUPPETEER_EXECUTABLE_PATH=/usr/bin/chromium

Run the Pipeline

python run_pipeline.py

Output:

  • paper/build/paper.md - Markdown version
  • paper/build/paper.tex - LaTeX version
  • paper/build/paper.pdf - PDF version
  • mermaid-images/ - Generated diagram images

📊 Overview

This project provides a complete workflow for:

  1. Data Loading: Automatically loads and merges LiaScript datasets
  2. Analysis: Runs comprehensive analyses:
    • Descriptive statistics
    • Feature usage patterns
    • Collaboration networks
    • Temporal trends
    • Topic clustering
    • License compliance
  3. Paper Generation: Generates documents from Jinja2 templates
  4. Multi-Format Export: Outputs in Markdown, LaTeX, and PDF with embedded Mermaid diagrams

📁 Project Structure

LiaScript_Paper/
├── config/
│   └── paper_config.yaml          # Main configuration
│
├── data/
│   ├── raw/liascript_data/        # Symlink to LiaScript data
│   └── processed/                 # Analysis caches
│
├── analyses/                      # Analysis modules
│   ├── descriptive_stats.py
│   ├── feature_analysis.py
│   ├── collaboration_analysis.py
│   ├── temporal_analysis.py
│   ├── topic_clustering.py
│   ├── network_analysis.py
│   └── license_analysis.py
│
├── paper/
│   ├── filters/
│   │   └── mermaid.lua           # Pandoc Lua filter for diagrams
│   ├── templates/                # LaTeX templates
│   ├── sections/                 # Jinja2 section templates
│   ├── figures/                  # Generated figures
│   └── build/                    # Generated papers
│
├── pipeline/                     # Core pipeline modules
│   ├── data_loader.py
│   ├── analysis_runner.py
│   └── paper_builder.py
│
├── run_pipeline.py              # Main entry point
└── README.md

⚙️ Configuration

Edit config/paper_config.yaml to configure:

paper:
  title: "Your Paper Title"
  authors:
    - name: "Author Name"
      affiliation: "Institution"
      email: "email@example.com"

  data:
    base_path: "/path/to/LiaScript/data"

  analyses:
    enabled:
      - descriptive_stats
      - feature_analysis
      - collaboration_analysis
      - temporal_analysis
      - license_analysis

  output:
    format:
      - markdown
      - latex
      - pdf
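
Before a long run, it can help to sanity-check the configuration. The sketch below is an illustration only: `validate_config` is a hypothetical helper, the config is passed as an already-parsed dict, and the nesting under `paper` is assumed from the example above. The set of known analyses is taken from the module names in analyses/.

```python
# Analysis names taken from the module files listed under analyses/.
KNOWN_ANALYSES = {
    "descriptive_stats", "feature_analysis", "collaboration_analysis",
    "temporal_analysis", "topic_clustering", "network_analysis",
    "license_analysis",
}
SUPPORTED_FORMATS = {"markdown", "latex", "pdf"}


def validate_config(config, known_analyses=KNOWN_ANALYSES):
    """Return a list of problems found; an empty list means no issues."""
    problems = []
    # Assumption: all sections are nested under the top-level "paper" key.
    paper = config.get("paper") or {}
    if not paper.get("title"):
        problems.append("paper.title is missing")
    if not (paper.get("data") or {}).get("base_path"):
        problems.append("paper.data.base_path is missing")
    for name in (paper.get("analyses") or {}).get("enabled", []):
        if name not in known_analyses:
            problems.append(f"unknown analysis: {name}")
    for fmt in (paper.get("output") or {}).get("format", []):
        if fmt not in SUPPORTED_FORMATS:
            problems.append(f"unsupported output format: {fmt}")
    return problems
```

If custom analyses have been added, pass an extended set via `known_analyses` so they are not reported as unknown.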

🔧 Advanced Usage

Run with Options

# Use custom config
python run_pipeline.py --config my_config.yaml

# Skip analysis (use cached results)
python run_pipeline.py --skip-analysis

# Analysis only (no paper generation)
python run_pipeline.py --skip-paper

# Debug mode
python run_pipeline.py --log-level DEBUG

Adding New Analyses

  1. Create a file in analyses/:

     # analyses/my_analysis.py
     def run_analysis(df, config):
         results = {}
         # ... your analysis ...
         return results

  2. Enable it in config/paper_config.yaml:

     analyses:
       enabled:
         - my_analysis

  3. Create a section template in paper/sections/:

     <!-- 05_my_section.md.jinja -->
     ## My Analysis Results

     {{ results.my_analysis.key_finding }}
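
The steps above imply that each enabled name is resolved to a module in analyses/ and that its `run_analysis(df, config)` result is stored under that name, which is how `results.my_analysis.key_finding` becomes reachable in a template. A minimal sketch of such a dispatcher is shown below; `run_enabled_analyses` is a hypothetical name, and the real pipeline/analysis_runner.py may work differently.

```python
import importlib


def run_enabled_analyses(df, config, enabled, package="analyses"):
    """Import each enabled analysis module and collect its results by name."""
    results = {}
    for name in enabled:
        # e.g. "my_analysis" -> module "analyses.my_analysis"
        module = importlib.import_module(f"{package}.{name}")
        results[name] = module.run_analysis(df, config)
    return results
```

With this layout, adding an analysis needs no change to the runner itself: a new module that exposes `run_analysis` plus an entry in the enabled list is enough.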

Embedding Mermaid Diagrams

In any .md.jinja template, use standard Mermaid syntax:

```mermaid
flowchart TB
    A[Start] --> B[Process]
    B --> C[End]
```

The Lua filter automatically converts them to images during PDF generation.


📋 Data Requirements

Expected data files in data/raw/liascript_data/:

  • LiaScript_files.p - Main file dataset with license info
  • LiaScript_commits.p - Commit history
  • LiaScript_metadata.p - Extracted metadata
  • LiaScript_content.p - Full text content
  • LiaScript_ai_meta.p - AI-generated metadata
  • LiaScript_repositories.p - Repository info
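
A quick way to confirm that all six files are in place is sketched below. The helper name `missing_data_files` is hypothetical; the file names are copied from the list above, and only their presence is checked, not their contents.

```python
from pathlib import Path

# File names copied from the list above.
EXPECTED_FILES = (
    "LiaScript_files.p",
    "LiaScript_commits.p",
    "LiaScript_metadata.p",
    "LiaScript_content.p",
    "LiaScript_ai_meta.p",
    "LiaScript_repositories.p",
)


def missing_data_files(data_dir="data/raw/liascript_data", expected=EXPECTED_FILES):
    """Return the expected file names that are not present in data_dir."""
    data_dir = Path(data_dir)
    return [name for name in expected if not (data_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_data_files()
    print("Missing files:", missing if missing else "none")
```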

🔬 Research Questions

The pipeline addresses five main research questions:

RQ1: How widespread is LiaScript adoption in the international education community?

RQ2: What characterizes LiaScript content in terms of structure and features?

RQ3: What collaboration patterns exist in LiaScript course development?

RQ4: What lifecycle and sustainability patterns emerge?

RQ5: How open are LiaScript materials in terms of licensing?

See config/paper_config.yaml for detailed sub-questions.


🐛 Troubleshooting

Pandoc Errors

Pandoc not found:

# Ubuntu/Debian
sudo apt install pandoc

# macOS
brew install pandoc

XeLaTeX errors:

# Install full TeX distribution
sudo apt install texlive-full  # Ubuntu
brew install --cask mactex      # macOS

Mermaid Diagram Errors

Filter errors:

  • Ensure mmdc is installed: npm install -g @mermaid-js/mermaid-cli
  • Check Lua filter exists: ls paper/filters/mermaid.lua

Image generation fails:

  • Install Chromium: sudo apt-get install chromium-browser
  • Set browser path: export PUPPETEER_EXECUTABLE_PATH=/usr/bin/chromium

Missing Data Files

Verify symlink:

ls -la data/raw/liascript_data/
# Should point to: /media/sz/Data/Connected_Lecturers/LiaScript/raw

Fix broken symlink:

rm data/raw/liascript_data
ln -s /path/to/actual/data data/raw/liascript_data
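
The symlink state can also be checked from Python. This is a small sketch with a hypothetical helper name, `data_link_status`. It relies on `Path.is_symlink()` being true even for a dangling link, while `Path.exists()` follows the link and is false when the target is gone.

```python
from pathlib import Path


def data_link_status(link_path="data/raw/liascript_data"):
    """Classify the data path as 'ok', 'broken', 'not-a-symlink' or 'missing'."""
    link = Path(link_path)
    if link.is_symlink():
        # exists() follows the link, so False here means a dangling symlink.
        return "ok" if link.exists() else "broken"
    if link.exists():
        return "not-a-symlink"
    return "missing"


if __name__ == "__main__":
    print(data_link_status())
```

A result of "broken" corresponds to the case the two commands above repair.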

Module Import Errors

# Ensure correct directory and environment
cd LiaScript_Paper
source venv/bin/activate  # or: pipenv shell
python run_pipeline.py

🧪 Development

Exploratory Analysis

Use Jupyter notebooks:

jupyter notebook notebooks/

Then, in a notebook cell:

from pipeline.data_loader import LiaScriptDataLoader

loader = LiaScriptDataLoader("/path/to/data")
df = loader.load_all()
df.head()

Running Tests

pytest tests/

📄 Output Validation

After running the pipeline, verify:

# Check PDF was generated
ls -lh paper/build/paper.pdf

# Verify reasonable file size (1-2 MB, not 50KB!)
du -h paper/build/paper.pdf

# Check images were embedded
pdfimages -list paper/build/paper.pdf

# View the PDF
xdg-open paper/build/paper.pdf  # Linux
open paper/build/paper.pdf      # macOS
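
The existence and size checks can be automated along these lines. This is a sketch only: `check_pdf` is a hypothetical helper, and the 500 KB threshold is an assumed cut-off between the suspicious 50 KB case and the expected 1-2 MB mentioned above.

```python
from pathlib import Path


def check_pdf(path="paper/build/paper.pdf", min_bytes=500_000):
    """Return (ok, message) for basic sanity checks on the generated PDF."""
    pdf = Path(path)
    if not pdf.is_file():
        return False, "missing"
    # Every PDF file starts with the magic bytes "%PDF-".
    with pdf.open("rb") as handle:
        if handle.read(5) != b"%PDF-":
            return False, "not a PDF"
    size = pdf.stat().st_size
    if size < min_bytes:
        # A very small file suggests that images were not embedded.
        return False, f"too small ({size} bytes)"
    return True, f"ok ({size} bytes)"


if __name__ == "__main__":
    print(check_pdf())
```

This does not replace the `pdfimages -list` check, which remains the way to confirm that diagrams were actually embedded.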

📝 License

[Specify license]

👥 Contact

[Your contact information]

📖 Citation

If you use this pipeline in your research, please cite:

[Citation information to be added]

🙏 Acknowledgments

  • Built on data from the LiaScript Community Analysis project
  • Mermaid diagram support via Pandoc Lua filter
  • Pipeline framework inspired by reproducible research best practices
