From 960d25f7f3632c3bdecf0565b30081d96fc7e00b Mon Sep 17 00:00:00 2001
From: Ahmed <76680009+AhmedRadwan02@users.noreply.github.com>
Date: Wed, 4 Feb 2026 13:30:00 -0500
Subject: [PATCH 1/2] Updating AIXpert website content (#54)
* Updating AIXpert website content
* Removing Git Icon top right and updating papers
---
docs/index.md | 83 ++++++----------
docs/papers.md | 63 +++++++++++++
docs/stylesheets/extra.css | 68 ++++++++++----
docs/team.md | 95 ++++++-------------
docs/updates.md | 21 +++++
docs/user_guide.md | 187 ++++++-------------------------------
mkdocs.yml | 9 +-
7 files changed, 224 insertions(+), 302 deletions(-)
create mode 100644 docs/papers.md
create mode 100644 docs/updates.md
diff --git a/docs/index.md b/docs/index.md
index 24a1460..2a825e7 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -1,21 +1,21 @@
-# AI Fairness Data Generation and Question Answering System
+# AIXpert at Vector Institute
-_Transparent tools and standardized benchmarks for **fair**, **explainable**, and **accountable** generative AI._
+_**[Vector Institute's](https://vectorinstitute.ai)** contribution to the [AIXpert Project](https://aixpert-project.eu/): tools, benchmarks, and research for **explainable**, **accountable**, and **fair** AI._
-> The rapid growth of generative AI brings powerful capabilities—but it also magnifies long-standing concerns around **bias, fairness, and representation**. Many models reproduce stereotypes embedded in training data, especially around demographic attributes (e.g., gender, ethnicity, age).
-> This project enables **systematic, controlled experimentation** so researchers and practitioners can pinpoint _when_ and _why_ bias occurs—and what actually mitigates it.
+> The AIXpert project aims to transform how AI is developed, deployed, and trusted by society. Vector’s work within AIXpert focuses on **responsible AI**: fairness-aware data generation and evaluation, multimodal benchmarks (audio-video, vision-language), factuality and transparency in agentic systems, and open tools for reproducible, governance-ready research.
+---
-## 🌍 What is the project about?
+## What we do
-The **AI Fairness Data Generation and Question Answering System** is part of **[Vector Institute's](https://vectorinstitute.ai)** contribution to the broader [AIXPERT Project](https://aixpert-project.eu/), a multi-institutional initiative, to develop tools and benchmarks for **fairness-aware data generation and evaluation** in generative AI.
+Vector’s contribution to AIXpert aligns with the project’s **vision and objectives**:
-It provides:
+- **Build an adaptable, explainable AI-agentic platform** — Develop interoperable tools and modules that connect explainability, accountability, and fairness.
+- **Define and assess AI trustworthiness** — Establish measurable criteria and indicators for evaluating the reliability and ethical alignment of AI systems.
+- **Advance explainable multimodal foundation models** — Drive research in interpretable vision–language–reasoning and multimodal understanding.
+- **Demonstrate real-world impact** — Validate the framework across sectors including healthcare, employment, and education.
-- **Controlled synthetic datasets** to isolate bias-inducing factors safely and reproducibly.
-- **Agentic automation** (CrewAI + custom LLM agents) for prompt generation, content creation, metadata, and QC.
-- **Fairness metrics & explainers** to visualize model behavior and surface disparities.
-- **Open, configurable pipelines** aligned with responsible AI practices and emerging governance needs.
+For the full **AIXpert vision, consortium, and funding**, see [About](about.md).
---
@@ -23,61 +23,38 @@ It provides:
-- **Develop a Controlled Data Pipeline**
- Create a reproducible, configurable pipeline for generating **text, image, and video** with precise control over **demographic** and **contextual** variables.
+- **Explainable, accountable AI**
+ Develop tools and benchmarks for **interpretability**, **fairness**, and **transparency** in generative and multimodal AI, aligned with AIXpert’s vision.
-- **Enable Fairness-Aware Benchmarking**
- Provide tools to build matched **baseline vs. fairness-aware** datasets for bias diagnosis and mitigation experiments.
+- **Trust, risk, and security in agentic AI**
+ Advance **TRiSM** (Trust, Risk, and Security Management) and transparency frameworks for safe, explainable agentic and multi-agent systems.
-- **Support Multi-Domain Risk Analysis**
- Generate multimodal data for **hiring, healthcare, legal, education**, and more, covering risks like **bias, toxicity, misinformation**.
+- **Multimodal and real-world evaluation**
+ Create benchmarks and datasets for **audio-video understanding**, **vision-language** assessment, and **fairness** across domains and demographics.
-- **Integrate Agentic AI for Automation**
- Orchestrate generation and QC with **CrewAI** and **custom LLM agents** (prompts, assets, annotations, validation).
+- **Define and assess AI trustworthiness**
+ Establish **measurable criteria** and evaluation suites for reliability, factuality, and ethical alignment of AI systems.
-- **Advance Interpretability & Explainability**
- Combine **zero-shot LLM explainers** and fairness metrics to produce **interpretable** assessments and visualizations.
+- **Real-world impact**
+ Validate approaches across **healthcare, employment, education**, and high-stakes domains through pilots, benchmarks, and open releases.
-- **Foster Open Research & Collaboration**
- Share configs, tools, and docs openly to enable **reproducible research** and **transparent governance**.
+- **Open, reproducible research**
+ Share **code**, **datasets**, and **documentation** openly to support reproducible research, benchmarking, and transparent governance.
---
-## Pipeline
-
-
-
----
-
## Recent updates
-- :material-rocket-launch: **Released data generation pipeline** (multimodal, configurable, agent-orchestrated).
-- :material-robot: **Single-agent pipeline** prototype for rapid dataset bootstrapping.
-- :material-file-document: NeurIPS 2025 LLM-eval Workshop paper: [_Bias in the Picture: Benchmarking VLMs with Social-Cue News Images and LLM-as-Judge Assessment_](https://arxiv.org/abs/2509.19659)
-- :material-file-document: Preprint: [_TRiSM for Agentic AI: A Review of Trust, Risk, and Security Management in LLM-based Agentic Multi-Agent Systems_](https://arxiv.org/abs/2506.04133)
-- :material-file-document-edit-outline: TechRxiv article: [_Responsible Agentic Reasoning and AI Agents—A Critical Survey_](https://www.techrxiv.org/users/574774/articles/1329333-responsible-agentic-reasoning-and-ai-agents-a-critical-survey?mode=edit)
-- :material-post-outline: Poster: **Single-Agent TRiSM** (NeurIPS LAW)
-
----
+- :material-newspaper: **AIXpert news** — Our work was highlighted on the [AIXpert project website](https://aixpert-project.eu/2026/01/28/advancing-trustworthy-explainable-and-responsible-ai-at-neurips-2025/): *Advancing Trustworthy, Explainable, and Responsible AI at NeurIPS 2025* (Bias in the Picture, HumaniBench, Carbon Literacy, and more).
+- :material-play-circle: **SONIC-O1** — Paper: [_A Real-World Benchmark for Evaluating MLLMs on Audio-Video Understanding_](https://arxiv.org/abs/2601.21666) (arXiv).
+- :material-database: **SONIC-O1** — Dataset on [Hugging Face](https://huggingface.co/datasets/vector-institute/sonic-o1) (231 videos, ~60h, 4,958 QAs, 13 domains, demographic metadata).
+- :material-github: **SONIC-O1** — [Code](https://github.com/VectorInstitute/sonic-o1) and evaluation pipeline (summarization, MCQ, temporal localization).
+- :material-medal: **SONIC-O1** — [Leaderboard](https://huggingface.co/spaces/vector-institute/sonic-o1-leaderboard) for model comparisons and fairness analysis.
-
-
-> Have feedback or want to contribute? See the [:material-account-group: Team](team.md) page and open an issue or pull request.
+[:material-arrow-right: **View full list**](updates.md){ .md-button .md-button--primary }
---
-## License
-
-This code in this repo is released under the **MIT License**.
+> Have feedback or want to contribute? See the [:material-account-group: Team](team.md) page and open an issue or pull request.
diff --git a/docs/papers.md b/docs/papers.md
new file mode 100644
index 0000000..3061943
--- /dev/null
+++ b/docs/papers.md
@@ -0,0 +1,63 @@
+# Papers
+
+Selected publications and preprints from the AIXpert project. Each entry links to arXiv (or equivalent) where available.
+
+---
+
+## AIXpert project papers
+
+### SONIC-O1: A Real-World Benchmark for Evaluating Multimodal Large Language Models on Audio-Video Understanding
+
+**Paper** · **Code** · **Dataset** · **Leaderboard**
+
+**Authors:** Ahmed Y. Radwan, Christos Emmanouilidis, Hina Tabassum, Deval Pandya, Shaina Raza.
+
+SONIC-O1 is a fully human-verified, real-world audio-video benchmark with 4,958 annotations across 13 conversational domains. We evaluate multimodal models on video summarization, evidence-grounded QA, and temporal event localization, and release an extensible evaluation suite to support reproducible benchmarking and robustness analysis.
+
+### Reducing Hallucinations in LLMs via Factuality-Aware Preference Learning
+
+**Paper** · **Code** · **Dataset**
+
+**Authors:** Sindhuja Chaduvula, Ahmed Y. Radwan, Azib Farooq, Yani Ioannou, Shaina Raza.
+
+A preference-learning method (F-DPO) that targets factuality directly, improving factuality scores while reducing hallucination rates across multiple open-weight LLMs.
+
+### Bias in the Picture: Benchmarking VLMs with Social-Cue News Images and LLM-as-Judge Assessment
+
+**Paper** (NeurIPS 2025 LLM-eval Workshop) · **Code**
+
+**Authors:** Aravind Narayanan, Vahid Reza Khazaie, Shaina Raza.
+
+Benchmarking vision-language models with social-cue news images and LLM-as-judge assessment.
+
+### TRiSM for Agentic AI: A Review of Trust, Risk, and Security Management in LLM-based Agentic Multi-Agent Systems
+
+**Paper**
+
+**Authors:** Shaina Raza, Ranjan Sapkota, Manoj Karkee, Christos Emmanouilidis.
+
+A review of trust, risk, and security management (TRiSM) in LLM-based agentic and multi-agent systems.
+
+### Responsible Agentic Reasoning and AI Agents—A Critical Survey
+
+**Paper** (TechRxiv)
+
+**Authors:** Shaina Raza (Vector Institute), Ranjan Sapkota, Manoj Karkee (Cornell University), Christos Emmanouilidis (University of Groningen).
+
+Critical survey of responsible agentic reasoning and AI agents.
+
+### Evaluating and Regulating Agentic AI: A Study of Benchmarks, Metrics and Regulation
+
+**Paper** (TechRxiv) · **Code**
+
+**Authors:** Azib Farooq, Shaina Raza, Nazmul Karim, Hasan Iqbal, Athanasios V. Vasilakos, Christos Emmanouilidis.
+
+Reviews recent progress in developing and assessing agentic AI along three dimensions: benchmarks, metrics, and governance. Analyzes how evaluation frameworks capture reasoning, planning, collaboration, and ethical alignment in single- and multi-agent systems. Aims to establish a unified foundation for trustworthy, auditable, and human-aligned AI agents.
+
+---
+
+
diff --git a/docs/stylesheets/extra.css b/docs/stylesheets/extra.css
index 5c79bfd..3a56adb 100644
--- a/docs/stylesheets/extra.css
+++ b/docs/stylesheets/extra.css
@@ -1,3 +1,26 @@
+/* Inline badge icons (shields.io) for arXiv, GitHub, Hugging Face */
+.md-typeset img[src*="img.shields.io"] {
+ height: 1.15em;
+ vertical-align: middle;
+ margin-right: 0.15em;
+}
+
+/* User guide: floating card for each project section */
+.md-content .md-typeset div.user-guide-card {
+ background-color: #f5f5f5 !important;
+ border-radius: 12px !important;
+ box-shadow: 0 2px 12px rgba(0, 0, 0, 0.08) !important;
+ padding: 1.25rem 1.5rem !important;
+ margin: 0.5rem 0 1rem 0 !important;
+ border: none !important;
+ display: block !important;
+}
+
+[data-md-color-scheme="slate"] .md-content .md-typeset div.user-guide-card {
+ background-color: rgba(255, 255, 255, 0.06) !important;
+ box-shadow: 0 2px 12px rgba(0, 0, 0, 0.3) !important;
+}
+
[data-md-color-primary="vector"] {
--md-primary-fg-color: #eb088a;
--md-primary-fg-color--light: #f252a5;
@@ -91,47 +114,56 @@
}
-/* Reduce space between team members */
+/* Team cards: compact, name + LinkedIn only, 3 per line */
.team-grid {
display: grid;
- grid-template-columns: repeat(auto-fill, minmax(280px, 1fr));
- gap: 1.5rem;
+ grid-template-columns: repeat(3, 1fr);
+ gap: 0.75rem;
margin-top: 1rem;
}
.team-card {
background-color: var(--md-surface);
- border-radius: 12px;
- box-shadow: 0 1px 4px rgba(0, 0, 0, 0.1);
- padding: 1rem 1.25rem;
+ border-radius: 8px;
+ box-shadow: 0 1px 3px rgba(0, 0, 0, 0.08);
+ padding: 0.5rem 0.75rem;
text-align: left;
transition: transform 0.15s ease;
+ min-width: 0;
}
.team-card:hover {
- transform: translateY(-3px);
+ transform: translateY(-2px);
}
.team-card h3 {
- margin-top: 0;
- margin-bottom: 0.25rem;
+ margin: 0 0 0.25rem 0;
+ font-size: 0.95em;
+ font-weight: 600;
+ white-space: nowrap;
+ overflow: hidden;
+ text-overflow: ellipsis;
}
.team-card p {
- margin: 0.25rem 0;
- line-height: 1.4;
-}
-
-.team-links .twemoji,
-.team-links svg {
- width: 18px;
- height: 18px;
- vertical-align: text-bottom;
+ margin: 0;
+ line-height: 1.3;
}
.team-links a {
text-decoration: none;
color: var(--md-primary-fg-color);
+ font-size: 0.85em;
+}
+
+.team-links .fab {
+ margin-right: 0.25em;
+}
+
+@media (max-width: 600px) {
+ .team-grid {
+ grid-template-columns: 1fr;
+ }
}
/* Center logo */
diff --git a/docs/team.md b/docs/team.md
index 870700b..ff582c5 100644
--- a/docs/team.md
+++ b/docs/team.md
@@ -2,100 +2,65 @@
The team at the Vector Institute behind the development of this project focuses on ethical AI practices, promoting fairness, accountability, and sustainability.
-For inquiries or support, contact:
-
---