You are welcome to open a PR to add related papers!
- 1. Graph-Based Context Engineering
- 2. Graph Priors for LLM Architectures
- 3. Graph-Based Interpretability and Control
- 4. Graph-Centric Trust and Safety
- Citation
- A Survey of Context Engineering for Large Language Models - arXiv 2025
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks - NeurIPS 2020
- Retrieval Meets Long Context Large Language Models - ICLR 2024
- HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models - NeurIPS 2024
- From Local to Global: A Graph RAG Approach to Query-Focused Summarization - arXiv 2024
- STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases - NeurIPS 2024
- G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering - NeurIPS 2024
- GNN-RAG: Graph Neural Retrieval for Efficient Large Language Model Reasoning on Knowledge Graphs - Findings ACL 2025
- GFM-RAG: Graph Foundation Model for Retrieval Augmented Generation - NeurIPS 2025
- RAG vs. GraphRAG: A Systematic Evaluation and Key Insights - arXiv 2025
- Don't Forget to Connect! Improving RAG with Graph-Based Reranking - arXiv 2024
- When to Use Graphs in RAG: A Comprehensive Analysis for Graph Retrieval-Augmented Generation - arXiv 2025
- Haystack Engineering: Context Engineering for Heterogeneous and Agentic Long-Context Evaluation - arXiv 2025
- Simple Is Effective: The Roles of Graphs and Large Language Models in Knowledge-Graph-Based Retrieval-Augmented Generation - ICLR 2025
- Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory - arXiv 2025
- A-MEM: Agentic Memory for LLM Agents - arXiv 2025
- Zep: A Temporal Knowledge Graph Architecture for Agent Memory - arXiv 2025
- Scaling Long-Horizon LLM Agent via Context-Folding - arXiv 2025
- AgentFold: Long-Horizon Web Agents with Proactive Context Management - arXiv 2025
- Graph-R1: Towards Agentic GraphRAG Framework via End-to-End Reinforcement Learning - arXiv 2025
- MInference 1.0: Accelerating Pre-Filling for Long-Context LLMs via Dynamic Sparse Attention - NeurIPS 2024
- QUEST: Query-Aware Sparsity for Efficient Long-Context LLM Inference - ICML 2024
- H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models - NeurIPS 2023
- PyramidKV: Dynamic KV Cache Compression Based on Pyramidal Information Funneling - arXiv 2024
- SnapKV: LLM Knows What You Are Looking for Before Generation - NeurIPS 2024
- KVzip: Query-Agnostic KV Cache Compression with Context Reconstruction - arXiv 2025
- SALS: Sparse Attention in Latent Space for KV Cache Compression - arXiv 2025
- Reformer: The Efficient Transformer - ICLR 2020
- Parallel Context Windows for Large Language Models - ACL 2023
- APE: Faster and Longer Context-Augmented Generation via Adaptive Parallel Encoding - ICLR 2025
- Block-Attention for Efficient Prefilling - arXiv 2024
- Scalable In-Context Ranking with Generative Models - arXiv 2025
- Attention Is All You Need - NeurIPS 2017
- Graph Attention Networks - ICLR 2018
- Graph-KV: Breaking Sequence via Injecting Structural Biases into Large Language Models - arXiv 2025
- Struc-EMB: The Potential of Structure-Aware Encoding in Language Embeddings - arXiv 2025
- Lost in the Middle: How Language Models Use Long Contexts - TACL 2024
- On the Emergence of Position Bias in Transformers - ICML 2025
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models - NeurIPS 2022
- Measuring Faithfulness in Chain-of-Thought Reasoning - arXiv 2023
- Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting - NeurIPS 2023
- Towards Automated Circuit Discovery for Mechanistic Interpretability - NeurIPS 2023
- Attribution Patching Outperforms Automated Circuit Discovery - BlackboxNLP 2024
- On the Biology of a Large Language Model - Transformer Circuits Thread 2025
- Transcoders Find Interpretable LLM Feature Circuits - NeurIPS 2024
- Weight-Sparse Transformers Have Interpretable Circuits - arXiv 2025
- GraphGhost: Tracing Structures Behind Large Language Models - arXiv 2025
- Verifying Chain-of-Thought Reasoning via Its Computational Graph - arXiv 2025
- Topology of Reasoning: Understanding Large Reasoning Models Through Reasoning Graph Properties - arXiv 2025
- RL Squeezes, SFT Expands: A Comparative Study of Reasoning LLMs - arXiv 2025
- Understanding Reasoning Ability of Language Models from the Perspective of Reasoning Paths Aggregation - arXiv 2024
- Structured Reasoning for LLMs: A Unified Framework for Efficiency and Explainability - ICLR 2026
- Who's Harry Potter? Approximate Unlearning in LLMs - arXiv 2023
- Locating and Editing Factual Associations in GPT - NeurIPS 2022
- Mass-Editing Memory in a Transformer - arXiv 2022
- Knowledge Unlearning for Mitigating Privacy Risks in Language Models - ACL 2023
- Machine Unlearning of Pre-Trained Large Language Models - arXiv 2024
- Towards Unbounded Machine Unlearning - NeurIPS 2023
- Evaluating Deep Unlearning in Large Language Models - arXiv 2024
- Do LLMs Really Forget? Evaluating Unlearning with Knowledge Correlation and Confidence Awareness - arXiv 2025
- Evaluating the Ripple Effects of Knowledge Editing in Language Models - TACL 2024
- MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions - arXiv 2023
- Retrieval-Enhanced Knowledge Editing in Language Models for Multi-Hop Question Answering - CIKM 2024
- SafetyBench: Evaluating the Safety of Large Language Models - ACL 2024
- Universal and Transferable Adversarial Attacks on Aligned Language Models - arXiv 2023
- Jailbroken: How Does LLM Safety Training Fail? - NeurIPS 2023
- AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models - arXiv 2023
- Qwen3Guard Technical Report - arXiv 2025
- Prompt, Divide, and Conquer: Bypassing Large Language Model Safety Filters via Segmented and Distributed Prompt Processing - arXiv 2025
- Safe in Isolation, Dangerous Together: Agent-Driven Multi-Turn Decomposition Jailbreaks on LLMs - REALM 2025
- The Trojan Knowledge: Bypassing Commercial LLM Guardrails via Harmless Prompt Weaving and Adaptive Tree Search - arXiv 2025
- Chain-of-Thought Hijacking - arXiv 2025
- SentinelAgent: Graph-Based Anomaly Detection in Multi-Agent Systems - arXiv 2025
- GUARDIAN: Safeguarding LLM Multi-Agent Collaborations with Temporal Graph Modeling - arXiv 2025
- Deliberative Alignment: Reasoning Enables Safer Language Models - arXiv 2024
@misc{li2026beyond,
  title = {Beyond Sequences: How Graph Learning Can Advance Trustworthy Large Language Models},
  author = {Mufei Li and Shikun Liu and Xinnan Dai and Rongzhe Wei and Haoyu Wang and Xinjie Shen and Siqi Miao and Jiliang Tang and Pan Li},
  howpublished = {https://github.com/Graph-COM/Awesome-Graph4TruthLLM},
  year = {2026}
}


