5 changes: 2 additions & 3 deletions .pre-commit-config.yaml
@@ -59,12 +59,11 @@ repos:
hooks:
- id: pytest
name: pytest
entry: pytest -m "not integration_test"
language: python
entry: uv run pytest -m "not integration_test"
language: system
types: [python]
pass_filenames: false
always_run: true
additional_dependencies: [pytest]

ci:
autofix_commit_msg: |
6 changes: 5 additions & 1 deletion README.md
@@ -37,6 +37,7 @@ vector-aixpert/
├── src/aixpert/
│ ├── controlled_images/ # Baseline vs fairness-aware image generation
│ ├── deepfake_detection/ # Curated multimodal deepfake data preparation
│ ├── data_generation/
│ │ ├── synthetic_data_generation/
│ │ │ ├── images/ # Domain/risk-specific image + VQA generation
@@ -94,6 +95,9 @@ uv run mkdocs serve
- **Controlled Images** — Matched baseline vs fairness-aware images across professions.
➜ [`src/aixpert/controlled_images/README.md`](src/aixpert/controlled_images/README.md)

- **Deepfake Detection** — FACT-HO bundle preparation for LAV-DF, FakeAVCeleb, and VCapAV.
➜ [`src/aixpert/deepfake_detection/README.md`](src/aixpert/deepfake_detection/README.md)

- **Agent Pipeline (CrewAI)** — Single-agent orchestration for prompt/image/metadata generation.
➜ [`src/aixpert/data_generation/agent_pipeline/README.md`](src/aixpert/data_generation/agent_pipeline/README.md)

@@ -149,4 +153,4 @@ Resources used in preparing this research were provided, in part, by the Provinc

This work is part of the AIXpert project, funded by the **European Union's Horizon Europe Research and Innovation Programme** under Grant Agreement No. **101214389**, and the **Swiss State Secretariat for Education, Research and Innovation (SERI)**. Views expressed are those of the authors and do not necessarily reflect those of the European Union or funding authorities.

🌐 [Project Website](https://aixpert-project.eu/) · [LinkedIn](https://www.linkedin.com/company/aixpert-project/) · [X/Twitter](https://x.com/AIXPERT_project) · [YouTube](https://www.youtube.com/@AIXPERT_project)
🌐 [Project Website](https://aixpert-project.eu/) · [LinkedIn](https://www.linkedin.com/company/aixpert-project/) · [X/Twitter](https://x.com/AIXPERT_project) · [YouTube](https://www.youtube.com/@AIXPERT_project)
1 change: 1 addition & 0 deletions _typos.toml
@@ -11,3 +11,4 @@ LLM = "LLM" # Large Language Model
LLMs = "LLMs" # Large Language Models (plural)
VQA = "VQA" # Visual Question Answering
IG = "IG" # Integrated Gradients
HumAIne = "HumAIne" # EU project name
2 changes: 1 addition & 1 deletion docs/projects.md
@@ -78,4 +78,4 @@ Statistical metrics (e.g. Statistical Parity, Equal Opportunity), zero-shot expl

- See [CONTRIBUTING.md](https://github.com/VectorInstitute/vector-aixpert/blob/main/CONTRIBUTING.md) for coding standards (PEP8, Google docstrings), pre-commit hooks (`ruff`, `mypy`, `typos`, `nbQA`), branching, and tests.
- **Run docs locally:** `uv sync --no-group docs` then `mkdocs serve` → [http://127.0.0.1:8000](http://127.0.0.1:8000)
- **CI:** GitHub Actions (`code_checks.yml`, `unit_tests.yml`, `integration_tests.yml`)
- **CI:** GitHub Actions (`code_checks.yml`, `unit_tests.yml`, `integration_tests.yml`)
1 change: 1 addition & 0 deletions pyproject.toml
@@ -132,6 +132,7 @@ ignore = [
# Ignore import violations in all `__init__.py` files.
[tool.ruff.lint.per-file-ignores]
"__init__.py" = ["E402", "F401", "F403", "F811"]
"src/aixpert/deepfake_detection/builders.py" = ["PLR0912"]
# Ignoring undocumented public functions, public init, magic method, and magic numbers in tests folder
"tests/*" = ["D103", "D105", "D107"]

38 changes: 38 additions & 0 deletions src/aixpert/deepfake_detection/README.md
@@ -0,0 +1,38 @@
# Deepfake Detection

This module packages the most reviewable and reusable parts of the current
multimodal deepfake work into the `vector-aixpert` monorepo.

## Scope

The first version focuses on data preparation rather than full training:

- FACT-HO sample and bundle domain objects
- deterministic grouping and split helpers
- dataset builders for `LAV-DF`, `FakeAVCeleb`, and manifest-first `VCapAV`
- a small CLI for one-sample smoke summaries
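
The "deterministic grouping and split helpers" item can be illustrated with a minimal sketch. It assumes a hash-based assignment, which is one common way to get reproducible, group-aware splits; it is not necessarily how this module's `assign_group_indices` or `SplitConfig` work. The names `stable_bucket` and `assign_split`, the percentages, and the sample dictionaries are all hypothetical.

```python
import hashlib


def stable_bucket(group_id: str, num_buckets: int = 100) -> int:
    """Map a group id to a bucket in [0, num_buckets) that never changes between runs."""
    # hashlib is used instead of the built-in hash(), which is salted per process.
    digest = hashlib.sha256(group_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets


def assign_split(group_id: str, train_pct: int = 80, val_pct: int = 10) -> str:
    """Assign a whole group to one split (hypothetical helper, assumed percentages)."""
    bucket = stable_bucket(group_id)
    if bucket < train_pct:
        return "train"
    if bucket < train_pct + val_pct:
        return "val"
    return "test"


# Hypothetical samples: two clips share a group (e.g. the same speaker).
samples = [
    {"sample_id": "a_001", "group_id": "speaker_a"},
    {"sample_id": "a_002", "group_id": "speaker_a"},
    {"sample_id": "b_001", "group_id": "speaker_b"},
]

# Splitting by group id keeps every sample of a group in the same partition.
splits = {s["sample_id"]: assign_split(s["group_id"]) for s in samples}
```

Because the split depends only on the group id, rerunning the script or reordering the samples does not move a group between partitions, and related clips cannot leak across train and test.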

## Why this is curated

The original working directory contains many experiment scripts, environment
fixes, and cluster-specific launchers. For a first monorepo integration, this
module keeps only the parts that are easiest to review, test, and scale.

That means this initial contribution intentionally excludes:

- repeated training variants
- plotting and monitoring helpers
- local outputs and checkpoints
- user-specific absolute paths

## Example

From the repository root:

```bash
uv run python -m aixpert.deepfake_detection.cli summarize \
--dataset vcapav \
--data-root /path/to/data \
--metadata-path /path/to/vcapav_manifest.jsonl \
--vcapav-split-strategy metadata
```
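
For orientation, a command line with the flags shown above could be wired with `argparse` roughly as follows. This is a sketch under assumptions, not the module's actual `cli.py`: only the `summarize` subcommand and the four flag names come from the example; which flags are required, their defaults, and the `build_parser` helper are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts the flags from the example (hypothetical wiring)."""
    parser = argparse.ArgumentParser(prog="aixpert.deepfake_detection.cli")
    subparsers = parser.add_subparsers(dest="command", required=True)

    summarize = subparsers.add_parser("summarize")
    # Required/optional status and defaults below are assumptions.
    summarize.add_argument("--dataset", required=True)
    summarize.add_argument("--data-root", required=True)
    summarize.add_argument("--metadata-path", default=None)
    summarize.add_argument("--vcapav-split-strategy", default=None)
    return parser


# Parse the same arguments as the shell example, passed as a list.
args = build_parser().parse_args(
    [
        "summarize",
        "--dataset", "vcapav",
        "--data-root", "/path/to/data",
        "--metadata-path", "/path/to/vcapav_manifest.jsonl",
        "--vcapav-split-strategy", "metadata",
    ]
)
```

`argparse` converts dashes in flag names to underscores, so the values are available as `args.data_root`, `args.metadata_path`, and `args.vcapav_split_strategy`.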
36 changes: 36 additions & 0 deletions src/aixpert/deepfake_detection/__init__.py
@@ -0,0 +1,36 @@
"""Curated utilities for multimodal deepfake dataset preparation."""

from aixpert.deepfake_detection.builders import (
DatasetPartitions,
FakeAVCelebBuilder,
FakeAVCelebConfig,
LAVDFBuilder,
LAVDFConfig,
SelectionLimits,
SplitConfig,
VCapAVBuilder,
VCapAVConfig,
)
from aixpert.deepfake_detection.core import (
FactHOBundle,
FactHOSample,
assign_group_indices,
build_bundles_from_samples,
)


__all__ = [
"DatasetPartitions",
"FactHOBundle",
"FactHOSample",
"FakeAVCelebBuilder",
"FakeAVCelebConfig",
"LAVDFBuilder",
"LAVDFConfig",
"SelectionLimits",
"SplitConfig",
"VCapAVBuilder",
"VCapAVConfig",
"assign_group_indices",
"build_bundles_from_samples",
]