VectorInstitute · aravind-3105 · Feb 4, 2026
diff --git a/docs/index.md b/docs/index.md
@@ -1,83 +1,60 @@
-# AI Fairness Data Generation and Question Answering System
+# AIXpert at Vector Institute
 
-_Transparent tools and standardized benchmarks for **fair**, **explainable**, and **accountable** generative AI._
+_**[Vector Institute's](https://vectorinstitute.ai)** contribution to the [AIXpert Project](https://aixpert-project.eu/): tools, benchmarks, and research for **explainable**, **accountable**, and **fair** AI._
 
-> The rapid growth of generative AI brings powerful capabilities—but it also magnifies long-standing concerns around **bias, fairness, and representation**. Many models reproduce stereotypes embedded in training data, especially around demographic attributes (e.g., gender, ethnicity, age).
-> This project enables **systematic, controlled experimentation** so researchers and practitioners can pinpoint _when_ and _why_ bias occurs—and what actually mitigates it.
+> The AIXpert project aims to transform how AI is developed, deployed, and trusted by society. Vector’s work within AIXpert focuses on **responsible AI**: fairness-aware data generation and evaluation, multimodal benchmarks (audio-video, vision-language), factuality and transparency in agentic systems, and open tools for reproducible, governance-ready research.
 
+---
 
-## 🌍 What is the project about?
+## What we do
 
-The **AI Fairness Data Generation and Question Answering System** is part of **[Vector Institute's](https://vectorinstitute.ai)** contribution to the broader [AIXPERT Project](https://aixpert-project.eu/), a multi-institutional initiative, to develop tools and benchmarks for **fairness-aware data generation and evaluation** in generative AI.
+Vector’s contribution to AIXpert aligns with the project’s **vision and objectives**:
 
-It provides:
+- **Build an adaptable, explainable AI-agentic platform** — Develop interoperable tools and modules that connect explainability, accountability, and fairness.
+- **Define and assess AI trustworthiness** — Establish measurable criteria and indicators for evaluating the reliability and ethical alignment of AI systems.
+- **Advance explainable multimodal foundation models** — Drive research in interpretable vision–language–reasoning and multimodal understanding.
+- **Demonstrate real-world impact** — Validate the framework across sectors including healthcare, employment, and education.
 
-- **Controlled synthetic datasets** to isolate bias-inducing factors safely and reproducibly.
-- **Agentic automation** (CrewAI + custom LLM agents) for prompt generation, content creation, metadata, and QC.
-- **Fairness metrics & explainers** to visualize model behavior and surface disparities.
-- **Open, configurable pipelines** aligned with responsible AI practices and emerging governance needs.
+For the full **AIXpert vision, consortium, and funding**, see [About](about.md).
 
 ---
 
 ## Objectives
 
 <div class="grid cards" markdown>
 
--   **Develop a Controlled Data Pipeline**
-    Create a reproducible, configurable pipeline for generating **text, image, and video** with precise control over **demographic** and **contextual** variables.
+-   **Explainable, accountable AI**
+    Develop tools and benchmarks for **interpretability**, **fairness**, and **transparency** in generative and multimodal AI, aligned with AIXpert’s vision.
 
--   **Enable Fairness-Aware Benchmarking**
-    Provide tools to build matched **baseline vs. fairness-aware** datasets for bias diagnosis and mitigation experiments.
+-   **Trust, risk, and security in agentic AI**
+    Advance **TRiSM** (Trust, Risk, and Security Management) and transparency frameworks for safe, explainable agentic and multi-agent systems.
 
--   **Support Multi-Domain Risk Analysis**
-    Generate multimodal data for **hiring, healthcare, legal, education**, and more, covering risks like **bias, toxicity, misinformation**.
+-   **Multimodal and real-world evaluation**
+    Create benchmarks and datasets for **audio-video understanding**, **vision-language** assessment, and **fairness** across domains and demographics.
 
--   **Integrate Agentic AI for Automation**
-    Orchestrate generation and QC with **CrewAI** and **custom LLM agents** (prompts, assets, annotations, validation).
+-   **Define and assess AI trustworthiness**
+    Establish **measurable criteria** and evaluation suites for reliability, factuality, and ethical alignment of AI systems.
 
--   **Advance Interpretability & Explainability**
-    Combine **zero-shot LLM explainers** and fairness metrics to produce **interpretable** assessments and visualizations.
+-   **Real-world impact**
+    Validate approaches across **healthcare, employment, education**, and high-stakes domains through pilots, benchmarks, and open releases.
 
--   **Foster Open Research & Collaboration**
-    Share configs, tools, and docs openly to enable **reproducible research** and **transparent governance**.
+-   **Open, reproducible research**
+    Share **code**, **datasets**, and **documentation** openly to support reproducible research, benchmarking, and transparent governance.
 
 </div>
 
 ---
 
-## Pipeline
-
-![Project Pipeline](assets/fairness_pipeline.jpg)
-
----
-
 ## Recent updates
 
-- :material-rocket-launch: **Released data generation pipeline** (multimodal, configurable, agent-orchestrated).
-- :material-robot: **Single-agent pipeline** prototype for rapid dataset bootstrapping.
-- :material-file-document: NeurIPS 2025 LLM-eval Workshop paper: [_Bias in the Picture: Benchmarking VLMs with Social-Cue News Images and LLM-as-Judge Assessment_](https://arxiv.org/abs/2509.19659)
-- :material-file-document: Preprint: [_TRiSM for Agentic AI: A Review of Trust, Risk, and Security Management in LLM-based Agentic Multi-Agent Systems_](https://arxiv.org/abs/2506.04133)
-- :material-file-document-edit-outline: TechRxiv article: [_Responsible Agentic Reasoning and AI Agents—A Critical Survey_](https://www.techrxiv.org/users/574774/articles/1329333-responsible-agentic-reasoning-and-ai-agents-a-critical-survey?mode=edit)
-- :material-post-outline: Poster: **Single-Agent TRiSM** (NeurIPS LAW)
-
----
+- :material-newspaper: **AIXpert news** — Our work was highlighted on the [AIXpert project website](https://aixpert-project.eu/2026/01/28/advancing-trustworthy-explainable-and-responsible-ai-at-neurips-2025/): *Advancing Trustworthy, Explainable, and Responsible AI at NeurIPS 2025* (Bias in the Picture, HumaniBench, Carbon Literacy, and more).
+- :material-play-circle: **SONIC-O1** — Paper: [_A Real-World Benchmark for Evaluating MLLMs on Audio-Video Understanding_](https://arxiv.org/abs/2601.21666) (arXiv).
+- :material-database: **SONIC-O1** — Dataset on [Hugging Face](https://huggingface.co/datasets/vector-institute/sonic-o1) (231 videos, ~60h, 4,958 QAs, 13 domains, demographic metadata).
+- :material-github: **SONIC-O1** — [Code](https://github.com/VectorInstitute/sonic-o1) and evaluation pipeline (summarization, MCQ, temporal localization).
+- :material-medal: **SONIC-O1** — [Leaderboard](https://huggingface.co/spaces/vector-institute/sonic-o1-leaderboard) for model comparisons and fairness analysis.
 
-<!-- ## Get started
-
-1. **Install project deps**
-   ```bash
-   uv sync
-   ```
-
-2. **Serve docs locally**
-   ```bash
-   uv run mkdocs serve
-   ``` -->
-
-> Have feedback or want to contribute? See the [:material-account-group: Team](team.md) page and open an issue or pull request.
+[:material-arrow-right: **View full list**](updates.md){ .md-button .md-button--primary }
 
 ---
 
-## License
-
-This code in this repo is released under the **MIT License**.
+> Have feedback or want to contribute? See the [:material-account-group: Team](team.md) page and open an issue or pull request.
diff --git a/docs/papers.md b/docs/papers.md
@@ -0,0 +1,63 @@
+# Papers
+
+Selected publications and preprints from the AIXpert project. Each entry links to arXiv (or equivalent) where available.
+
+---
+
+## AIXpert project papers
+
+### SONIC-O1: A Real-World Benchmark for Evaluating Multimodal Large Language Models on Audio-Video Understanding
+
+**Paper** · <a href="https://arxiv.org/abs/2601.21666" target="_blank" rel="noopener"><img src="https://img.shields.io/badge/arXiv-B31B1B?style=flat-square&amp;logo=arxiv&amp;logoColor=white" alt="arXiv"></a> **Code** · <a href="https://github.com/VectorInstitute/sonic-o1" target="_blank" rel="noopener"><img src="https://img.shields.io/badge/GitHub-181717?style=flat-square&amp;logo=github&amp;logoColor=white" alt="GitHub"></a> **Dataset** · <a href="https://huggingface.co/datasets/vector-institute/sonic-o1" target="_blank" rel="noopener"><img src="https://img.shields.io/badge/Hugging_Face-FFD21E?style=flat-square&amp;logo=huggingface&amp;logoColor=000" alt="Hugging Face"></a> **Leaderboard** · <a href="https://huggingface.co/spaces/vector-institute/sonic-o1-leaderboard" target="_blank" rel="noopener"><img src="https://img.shields.io/badge/Leaderboard-FFD21E?style=flat-square&amp;logo=huggingface&amp;logoColor=000" alt="Leaderboard"></a>
+
+**Authors:** Ahmed Y. Radwan, Christos Emmanouilidis, Hina Tabassum, Deval Pandya, Shaina Raza.
+
+SONIC-O1 is a fully human-verified, real-world audio-video benchmark with 4,958 annotations across 13 conversational domains. We evaluate multimodal models on video summarization, evidence-grounded QA, and temporal event localization, and release an extensible evaluation suite to support reproducible benchmarking and robustness analysis.
+
+### Reducing Hallucinations in LLMs via Factuality-Aware Preference Learning
+
+**Paper** · <a href="https://arxiv.org/abs/2601.03027" target="_blank" rel="noopener"><img src="https://img.shields.io/badge/arXiv-B31B1B?style=flat-square&amp;logo=arxiv&amp;logoColor=white" alt="arXiv"></a> **Code** · <a href="https://github.com/VectorInstitute/Factual-Preference-Alignment" target="_blank" rel="noopener"><img src="https://img.shields.io/badge/GitHub-181717?style=flat-square&amp;logo=github&amp;logoColor=white" alt="GitHub"></a> **Dataset** · <a href="https://huggingface.co/datasets/vector-institute/Factuality_Alignment" target="_blank" rel="noopener"><img src="https://img.shields.io/badge/Hugging_Face-FFD21E?style=flat-square&amp;logo=huggingface&amp;logoColor=000" alt="Hugging Face"></a>
+
+**Authors:** Sindhuja Chaduvula, Ahmed Y. Radwan, Azib Farooq, Yani Ioannou, Shaina Raza.
+
+A preference-learning method (F-DPO) that targets factuality directly, improving factuality scores while reducing hallucination rates across multiple open-weight LLMs.
+
+### Bias in the Picture: Benchmarking VLMs with Social-Cue News Images and LLM-as-Judge Assessment
+
+**Paper** (NeurIPS 2025 LLM-eval Workshop) · <a href="https://arxiv.org/abs/2509.19659" target="_blank" rel="noopener"><img src="https://img.shields.io/badge/arXiv-B31B1B?style=flat-square&amp;logo=arxiv&amp;logoColor=white" alt="arXiv"></a> **Code** · <a href="https://github.com/VectorInstitute/bias-in-the-picture-benchmark" target="_blank" rel="noopener"><img src="https://img.shields.io/badge/GitHub-181717?style=flat-square&amp;logo=github&amp;logoColor=white" alt="GitHub"></a>
+
+**Authors:** Aravind Narayanan, Vahid Reza Khazaie, Shaina Raza.
+
+Benchmarking vision-language models with social-cue news images and LLM-as-judge assessment.
+
+### TRiSM for Agentic AI: A Review of Trust, Risk, and Security Management in LLM-based Agentic Multi-Agent Systems
+
+**Paper** · <a href="https://arxiv.org/abs/2506.04133" target="_blank" rel="noopener"><img src="https://img.shields.io/badge/arXiv-B31B1B?style=flat-square&amp;logo=arxiv&amp;logoColor=white" alt="arXiv"></a>
+
+**Authors:** Shaina Raza, Ranjan Sapkota, Manoj Karkee, Christos Emmanouilidis.
+
+A review of trust, risk, and security management (TRiSM) in LLM-based agentic and multi-agent systems.
+
+### Responsible Agentic Reasoning and AI Agents—A Critical Survey
+
+**Paper** (TechRxiv) · <a href="https://www.techrxiv.org/users/574774/articles/1329333-responsible-agentic-reasoning-and-ai-agents-a-critical-survey" target="_blank" rel="noopener"><img src="https://img.shields.io/badge/Paper-0F7DC2?style=flat-square" alt="Paper"></a>
+
+**Authors:** Shaina Raza (Vector Institute), Ranjan Sapkota, Manoj Karkee (Cornell University), Christos Emmanouilidis (University of Groningen).
+
+Critical survey of responsible agentic reasoning and AI agents.
+
+### Evaluating and Regulating Agentic AI: A Study of Benchmarks, Metrics and Regulation
+
+**Paper** (TechRxiv) · <a href="https://www.techrxiv.org/users/985444/articles/1350845-evaluating-and-regulating-agentic-ai-a-study-of-benchmarks-metrics-and-regulation" target="_blank" rel="noopener"><img src="https://img.shields.io/badge/Paper-0F7DC2?style=flat-square" alt="Paper"></a> **Code** · <a href="https://github.com/itsazibfarooq/agenticEvaluation" target="_blank" rel="noopener"><img src="https://img.shields.io/badge/GitHub-181717?style=flat-square&amp;logo=github&amp;logoColor=white" alt="GitHub"></a>
+
+**Authors:** Azib Farooq, Shaina Raza, Nazmul Karim, Hasan Iqbal, Athanasios V. Vasilakos, Christos Emmanouilidis.
+
+Reviews recent progress in developing and assessing agentic AI along three dimensions: benchmarks, metrics, and governance. Analyzes how evaluation frameworks capture reasoning, planning, collaboration, and ethical alignment in single- and multi-agent systems. Aims to establish a unified foundation for trustworthy, auditable, and human-aligned AI agents.
+
+---
+
+<!-- Reusable badge icons (shields.io, Simple Icons). Use in HTML: <a href="URL" target="_blank" rel="noopener"><img src="BADGE_URL" alt="..."></a>
+  arXiv:  https://img.shields.io/badge/arXiv-B31B1B?style=flat-square&logo=arxiv&logoColor=white
+  GitHub: https://img.shields.io/badge/GitHub-181717?style=flat-square&logo=github&logoColor=white
+  Hugging Face: https://img.shields.io/badge/Hugging_Face-FFD21E?style=flat-square&logo=huggingface&logoColor=000
+-->
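
The badge anchors in `docs/papers.md` all follow the pattern described in the closing HTML comment, so they could be generated rather than pasted by hand. The sketch below is not part of this PR: `badge_url` and `badge_link` are hypothetical helper names, and the only external behaviour assumed is the shields.io static-badge convention that `_` renders as a space, `__` as an underscore, and `--` as a dash.

```python
from html import escape
from urllib.parse import quote, urlencode


def badge_url(label, color, logo=None, logo_color="white"):
    """Build a shields.io static badge URL in the flat-square style used above."""
    # Escape literal dashes/underscores first, then map spaces to underscores.
    text = label.replace("-", "--").replace("_", "__").replace(" ", "_")
    params = {"style": "flat-square"}
    if logo:
        params["logo"] = logo
        params["logoColor"] = logo_color
    return f"https://img.shields.io/badge/{quote(text, safe='')}-{color}?{urlencode(params)}"


def badge_link(href, label, color, logo=None, logo_color="white"):
    """Wrap the badge image in the anchor markup used in docs/papers.md."""
    # html.escape turns '&' into '&amp;', matching the attribute values in the page.
    src = escape(badge_url(label, color, logo, logo_color), quote=True)
    return (
        f'<a href="{escape(href, quote=True)}" target="_blank" rel="noopener">'
        f'<img src="{src}" alt="{escape(label, quote=True)}"></a>'
    )


if __name__ == "__main__":
    # Reproduces the arXiv badge for the SONIC-O1 entry.
    print(badge_link("https://arxiv.org/abs/2601.21666", "arXiv", "B31B1B", logo="arxiv"))
```

Generating the anchors this way would keep the `target`, `rel`, and `alt` attributes consistent across entries; whether that is worth a build step for a handful of papers is a judgment call.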
diff --git a/docs/stylesheets/extra.css b/docs/stylesheets/extra.css
@@ -1,3 +1,26 @@
+/* Inline badge icons (shields.io) for arXiv, GitHub, Hugging Face */
+.md-typeset img[src*="img.shields.io"] {
+  height: 1.15em;
+  vertical-align: middle;
+  margin-right: 0.15em;
+}
+
+/* User guide: floating card for each project section */
+.md-content .md-typeset div.user-guide-card {
+  background-color: #f5f5f5 !important;
+  border-radius: 12px !important;
+  box-shadow: 0 2px 12px rgba(0, 0, 0, 0.08) !important;
+  padding: 1.25rem 1.5rem !important;
+  margin: 0.5rem 0 1rem 0 !important;
+  border: none !important;
+  display: block !important;
+}
+
+[data-md-color-scheme="slate"] .md-content .md-typeset div.user-guide-card {
+  background-color: rgba(255, 255, 255, 0.06) !important;
+  box-shadow: 0 2px 12px rgba(0, 0, 0, 0.3) !important;
+}
+
 [data-md-color-primary="vector"] {
   --md-primary-fg-color: #eb088a;
   --md-primary-fg-color--light: #f252a5;
@@ -91,47 +114,56 @@
 }
 
 
-/* Reduce space between team members */
+/* Team cards: compact, name + LinkedIn only, 3 per line */
 .team-grid {
   display: grid;
-  grid-template-columns: repeat(auto-fill, minmax(280px, 1fr));
-  gap: 1.5rem;
+  grid-template-columns: repeat(3, 1fr);
+  gap: 0.75rem;
   margin-top: 1rem;
 }
 
 .team-card {
   background-color: var(--md-surface);
-  border-radius: 12px;
-  box-shadow: 0 1px 4px rgba(0, 0, 0, 0.1);
-  padding: 1rem 1.25rem;
+  border-radius: 8px;
+  box-shadow: 0 1px 3px rgba(0, 0, 0, 0.08);
+  padding: 0.5rem 0.75rem;
   text-align: left;
   transition: transform 0.15s ease;
+  min-width: 0;
 }
 
 .team-card:hover {
-  transform: translateY(-3px);
+  transform: translateY(-2px);
 }
 
 .team-card h3 {
-  margin-top: 0;
-  margin-bottom: 0.25rem;
+  margin: 0 0 0.25rem 0;
+  font-size: 0.95em;
+  font-weight: 600;
+  white-space: nowrap;
+  overflow: hidden;
+  text-overflow: ellipsis;
 }
 
 .team-card p {
-  margin: 0.25rem 0;
-  line-height: 1.4;
-}
-
-.team-links .twemoji,
-.team-links svg {
-  width: 18px;
-  height: 18px;
-  vertical-align: text-bottom;
+  margin: 0;
+  line-height: 1.3;
 }
 
 .team-links a {
   text-decoration: none;
   color: var(--md-primary-fg-color);
+  font-size: 0.85em;
+}
+
+.team-links .fab {
+  margin-right: 0.25em;
+}
+
+@media (max-width: 600px) {
+  .team-grid {
+    grid-template-columns: 1fr;
+  }
 }
 
 /* Center logo */