diff --git a/CHANGELOG.md b/CHANGELOG.md
new file mode 100644
index 0000000..bc07a72
--- /dev/null
+++ b/CHANGELOG.md
@@ -0,0 +1,92 @@
+# Changelog
+
+## [1.3.0]
+
+### Added
+- **Admin configuration UI** with role-based access for DB, LLM, and GraphRAG settings
+  - Separate pages for DB config, LLM provider config, and GraphRAG config
+  - Graph admin role restriction via `ConfigScopeToggle`
+  - `apiToken` auth option added to GraphDB config with conditional UI
+- **Per-graph chatbot LLM override** (`chat_service` in `llm_config`) with inheritance from `completion_service`
+  - Missing keys fall back to `completion_service` automatically
+  - Graph admins can configure per graph via the UI
+- **Secret masking** in configuration API responses
+  - GET responses return masked values; the backend substitutes the stored secrets for masked values on save/test
+  - Credentials never reach the frontend
+- **Session idle timeout** (1 hour) that auto-clears the session on inactivity
+  - Session data moved from `localStorage` to `sessionStorage`; theme stays in `localStorage`
+  - Timer pauses during long-running operations (ingest, rebuild)
+- **Auth guard** on all UI routes
+  - `RequireAuth` wrapper redirects unauthenticated users to login
+  - SPA routing with `serve -s` and catch-all route
+- **GraphRAG config UI fields**
+  - Search parameters: `top_k`, `num_hops`, `num_seen_min`, `community_level`, `doc_only`
+  - Advanced ingestion settings: `load_batch_size`, `upsert_delay`, `default_concurrency`
+  - All chunker settings (`chunk_size`, `overlap_size`, `method`, `threshold`, `pattern`) shown and saved regardless of selected chunker
+- **Multimodal inherit checkbox** in LLM config UI
+  - "Use same model as completion service" option in both single and multi-provider modes
+  - Amber warning when inheriting: "Ensure your completion model supports vision input"
+- **`get_embedding_config()`** getter in `common/config.py` for parity with other service getters
+- **Greeting detection** in agent router
+  - Regex-based pattern matching for common greetings, farewells, and thanks
+  - Responds directly without invoking query generation or search
+- **Centralized LLM token usage tracking**
+  - All LLM call sites (15+) migrated to `invoke_with_parser` / `ainvoke_with_parser`
+  - Supports both structured (JSON) and plain text LLM responses
+- **JSON parsing fallback** for LLM responses
+  - Handles responses wrapped in preamble text or markdown code fences
+  - Entity extraction uses a 3-tier fallback: direct parse, code fence extraction, regex extraction
+- **Cypher/GSQL output validation** before query execution
+  - Checks for required query keywords before wrapping in `INTERPRET OPENCYPHER QUERY`
+  - Invalid output raises an error and retries instead of executing garbage queries
+- **Retriever scoring** for all retriever types when `combine=False`
+  - Scoring logic lifted from `CommunityRetriever` into `BaseRetriever`
+  - Similarity, Hybrid, and Sibling retrievers now score and rank context chunks
+- **User-customized prompts** persisted under `configs/` across container restarts
+- **Unit tests** for LLM invocation and JSON parsing (13 test cases)
+
+### Changed
+- **All config consumers use `get_xxx_config(graphname)` getters** instead of direct `llm_config` access
+  - `root.py`, `report-service/root.py`, `ecc/main.py`, `ui.py` migrated
+  - Test connection and save endpoints use `_build_test_config()` overlay pattern
+  - `_unmask_auth` resolves credentials via getters for correct per-graph resolution
+- **Multimodal service inherits completion model directly** when not explicitly configured
+  - Removed hardcoded `DEFAULT_MULTIMODAL_MODELS` that silently substituted different models
+- **LLM config UI improvements**
+  - Red asterisk markers on mandatory model name fields
+  - Shared `LLM_PROVIDERS` constant replaces duplicate provider lists
+  - State synced when toggling between single/multi-provider modes
+  - Reordered sections: Completion → Chatbot → Multimodal → Embedding
+- Config file writes are now atomic with file locking to prevent race conditions
+  - `_config_file_lock` prevents concurrent overwrites
+  - In-memory config updates use atomic dict replacement instead of clear-and-update
+- Chat history messages display instantly without typewriter animation
+  - History messages tagged with `response_type: "history"` to skip CSS animation
+- Chatbot model selection uses `chat_service` config with `completion_service` fallback
+  - Community summarization prompt loaded at call time instead of import time
+- README config documentation updated for clarity and consistency
+  - Parameter descriptions focus on purpose, not implementation details
+  - `token_limit`, `default_concurrency`, and other parameters reworded
+  - `multimodal_service` defaults corrected to show inheritance from `completion_service`
+- `default_concurrency` replaces `tg_concurrency` in `graphrag_config`
+  - Configurable per graph
+- Wired up `default_mem_threshold` and `default_thread_limit` in database connection proxy
+
+### Fixed
+- **Bedrock multimodal connection test** — 1x1 test PNG rejected by Bedrock image validation; replaced with 20x20 PNG
+- **Provider-aware image format** in multimodal test and `image_data_extractor`
+  - GenAI/VertexAI require `image_url` format; Bedrock/Anthropic use `type:"image"` with source block
+- **report-service/root.py** — `llm_config` used but never imported (NameError on health endpoint)
+- **Null service values** stripped before config reload (null = inherit, key should be absent)
+- Login page shows proper error messages based on HTTP status
+  - 401/403: "Invalid credentials"; other errors: "Server error (N)"; network failure: "Unable to connect"
+- SPA routing fixed with catch-all route to login page
+- Rebuild dialog button no longer flickers between status labels
+  - Polling stops once rebuild completes; final status message preserved
+- Idle timer pauses during long-running operations (ingest, rebuild)
+  - Uses pause/resume instead of repeated signal activity calls
+- Bedrock model names no longer trigger token calculator warnings
+  - Provider prefix and version suffix stripped before tiktoken lookup
+- Config reload no longer clears in-memory state during concurrent requests
+- Startup validation restored for `llm_service` and `llm_model`
+- `HTTPException` properly re-raised in config and DB test endpoints
diff --git a/README.md b/README.md
index a6261f5..e49317b 100644
--- a/README.md
+++ b/README.md
@@ -469,14 +469,19 @@ Copy the below code into `configs/server_config.json`. You shouldn’t need to c
 | `chat_history_api` | string | `"http://chat-history:8002"` | URL of the chat history service. No change needed when using the provided Docker Compose file. |
 | `chunker` | string | `"semantic"` | Default document chunker. Options: `semantic`, `character`, `regex`, `markdown`, `html`, `recursive`. |
 | `extractor` | string | `"llm"` | Entity extraction method. Options: `llm`, `graphrag`. |
-| `chunker_config` | object | `{}` | Chunker-specific settings. For `character`/`markdown`/`recursive`: `chunk_size`, `overlap_size`. For `semantic`: `method`, `threshold`. For `regex`: `pattern`. |
-| `top_k` | int | `5` | Number of top similar results to retrieve during search. |
-| `num_hops` | int | `2` | Number of graph hops to traverse when expanding retrieved results. |
-| `num_seen_min` | int | `2` | Minimum occurrence threshold for a node to be included in search results. |
-| `community_level` | int | `2` | Community hierarchy level used for community search. |
-| `chunk_only` | bool | `true` | If true, hybrid search only retrieves document chunks (not entities). |
-| `doc_only` | bool | `false` | If true, hybrid search retrieves whole documents instead of chunks. |
-| `with_chunk` | bool | `true` | If true, community search also includes document chunks in results. |
+| `chunker_config` | object | `{}` | Chunker-specific settings (see sub-parameters below). All settings are saved regardless of which chunker is selected as default. |
+| ↳ `chunk_size` | int | `2048` | Maximum number of characters per chunk. Used by `character`, `markdown`, `html`, and `recursive` chunkers. Larger values produce fewer, bigger chunks; smaller values produce more, finer-grained chunks. |
+| ↳ `overlap_size` | int | 1/8 of `chunk_size` | Number of overlapping characters between consecutive chunks. Used by `character`, `markdown`, `html`, and `recursive` chunkers. More overlap preserves cross-chunk context but increases total chunk count. Set to `0` for no overlap. |
+| ↳ `method` | string | `"percentile"` | Breakpoint detection method for the `semantic` chunker. Options: `percentile`, `standard_deviation`, `interquartile`, `gradient`. Controls how the chunker decides where to split based on embedding similarity. |
+| ↳ `threshold` | float | `0.95` | Similarity threshold for the `semantic` chunker. Higher values produce more splits (smaller chunks); lower values produce fewer splits (larger chunks). |
+| ↳ `pattern` | string | `""` | Regular expression pattern for the `regex` chunker. The document is split at each match of this pattern. |
+| `top_k` | int | `5` | Number of initial seed results to retrieve per search. Also caps the final scored results. Increasing `top_k` increases the overall context size sent to the LLM. |
+| `num_hops` | int | `2` | Number of graph hops to traverse from seed nodes during hybrid search. More hops expand the result set with related context. |
+| `num_seen_min` | int | `2` | Minimum occurrence count for a node to be included during hybrid search traversal. Higher values filter out loosely connected nodes, reducing context size. |
+| `community_level` | int | `2` | Community hierarchy level for community search. Higher levels retrieve broader, higher-order community summaries. |
+| `chunk_only` | bool | `true` | If true, hybrid search only retrieves document chunks, excluding entity data. |
+| `doc_only` | bool | `false` | If true, hybrid search retrieves whole documents instead of chunks. Significantly increases context size. |
+| `with_chunk` | bool | `true` | If true, community search also includes document chunks alongside community summaries. Increases context size. |
 | `doc_process_switch` | bool | `true` | Enable/disable document processing during knowledge graph build. |
 | `entity_extraction_switch` | bool | same as `doc_process_switch` | Enable/disable entity extraction during knowledge graph build. |
 | `community_detection_switch` | bool | same as `entity_extraction_switch` | Enable/disable community detection during knowledge graph build. |
@@ -552,7 +557,7 @@ In the `llm_config` section of `configs/server_config.json` file, copy JSON conf
 | Parameter | Type | Default | Description |
 | --- | --- | --- | --- |
 | `authentication_configuration` | object | — | Shared authentication credentials for all services. Service-level values take precedence. |
-| `token_limit` | int | — | Maximum token count for retrieved context. Inherited by all services if not set at service level. `0` or omitted means unlimited. |
+| `token_limit` | int | — | Hard cap on token count for retrieved context sent to the LLM. Context exceeding this limit is truncated. Inherited by all services if not set at service level. `0` or omitted means unlimited. |
 
 **`completion_service` parameters:**
 
@@ -564,7 +569,7 @@ In the `llm_config` section of `configs/server_config.json` file, copy JSON conf
 | `model_kwargs` | object | No | `{}` | Additional model parameters (e.g., `{"temperature": 0}`). |
 | `prompt_path` | string | No | `"./common/prompts/openai_gpt4/"` | Path to prompt template files. |
 | `base_url` | string | No | — | Custom API endpoint URL. |
-| `token_limit` | int | No | inherited from top-level | Max token count for retrieved context sent to the LLM. `0` or omitted means unlimited. |
+| `token_limit` | int | No | inherited from top-level | Hard cap on token count for retrieved context sent to the LLM. Context exceeding this limit is truncated. `0` or omitted means unlimited. |
 
 **`embedding_service` parameters:**
 
@@ -587,16 +592,16 @@ Chatbot LLM override. If not configured, inherits from `completion_service`. Con
 | `model_kwargs` | object | No | inherited from completion | Additional model parameters (e.g., `{"temperature": 0}`). |
 | `prompt_path` | string | No | inherited from completion | Path to prompt template files. |
 | `base_url` | string | No | inherited from completion | Custom API endpoint URL. |
-| `token_limit` | int | No | inherited from completion | Max token count for retrieved context sent to the chatbot LLM. `0` or omitted means unlimited. |
+| `token_limit` | int | No | inherited from completion | Hard cap on token count for retrieved context sent to the chatbot LLM. Context exceeding this limit is truncated. `0` or omitted means unlimited. |
 
 **`multimodal_service` parameters (optional):**
 
-Vision model for image processing during document ingestion. If not configured, inherits from `completion_service` with a default vision model derived per provider.
+Vision model for image processing during document ingestion. If not configured, inherits from `completion_service` — ensure the completion model supports vision input.
 
 | Parameter | Type | Required | Default | Description |
 | --- | --- | --- | --- | --- |
 | `llm_service` | string | No | inherited from completion | Multimodal LLM provider. |
-| `llm_model` | string | No | auto-derived per provider | Vision model name (e.g., `gpt-4o`). |
+| `llm_model` | string | No | inherited from completion | Vision model name (e.g., `gpt-4o`). |
 | `authentication_configuration` | object | No | inherited from completion | Service-specific auth credentials. Overrides top-level values. |
 | `model_kwargs` | object | No | inherited from completion | Additional model parameters. |
 | `prompt_path` | string | No | inherited from completion | Path to prompt template files. |
diff --git a/common/chunkers/character_chunker.py b/common/chunkers/character_chunker.py
index 6d4138a..abf2480 100644
--- a/common/chunkers/character_chunker.py
+++ b/common/chunkers/character_chunker.py
@@ -1,12 +1,12 @@
 from common.chunkers.base_chunker import BaseChunker
 
-_DEFAULT_FALLBACK_SIZE = 4096
+_DEFAULT_CHUNK_SIZE = 2048
 
 
 class CharacterChunker(BaseChunker):
-    def __init__(self, chunk_size=0, overlap_size=0):
-        self.chunk_size = chunk_size if chunk_size > 0 else _DEFAULT_FALLBACK_SIZE
-        self.overlap_size = overlap_size
+    def __init__(self, chunk_size=0, overlap_size=-1):
+        self.chunk_size = chunk_size if chunk_size > 0 else _DEFAULT_CHUNK_SIZE
+        self.overlap_size = overlap_size if overlap_size >= 0 else self.chunk_size // 8
 
     def chunk(self, input_string):
         if self.chunk_size <= self.overlap_size:
diff --git a/common/chunkers/html_chunker.py b/common/chunkers/html_chunker.py
index 326dff8..83b3477 100644
--- a/common/chunkers/html_chunker.py
+++ b/common/chunkers/html_chunker.py
@@ -20,7 +20,7 @@
 from langchain.text_splitter import RecursiveCharacterTextSplitter
 
 
-_DEFAULT_FALLBACK_SIZE = 4096
+_DEFAULT_CHUNK_SIZE = 2048
 
 
 class HTMLChunker(BaseChunker):
@@ -30,7 +30,7 @@ class HTMLChunker(BaseChunker):
     - Automatically detects which headers (h1-h6) are present in the HTML
     - Uses only the headers that exist in the document for optimal chunking
     - If custom headers are provided, uses those instead of auto-detection
-    - Supports chunk_size / chunk_overlap: when chunk_size > 0, oversized
+    - Supports chunk_size / overlap_size: when chunk_size > 0, oversized
       header-based chunks are further split with RecursiveCharacterTextSplitter
-    - When chunk_size is 0 (default), a fallback of 4096 is used so that
+    - When chunk_size is 0 (default), a fallback of 2048 is used so that
       headerless HTML documents are still split into reasonable chunks
@@ -39,11 +39,11 @@ class HTMLChunker(BaseChunker):
     def __init__(
         self,
         chunk_size: int = 0,
-        chunk_overlap: int = 0,
+        overlap_size: int = -1,
         headers: Optional[List[Tuple[str, str]]] = None,
     ):
-        self.chunk_size = chunk_size if chunk_size > 0 else _DEFAULT_FALLBACK_SIZE
-        self.chunk_overlap = chunk_overlap
+        self.chunk_size = chunk_size if chunk_size > 0 else _DEFAULT_CHUNK_SIZE
+        self.overlap_size = overlap_size if overlap_size >= 0 else self.chunk_size // 8
         self.headers = headers
 
     def _detect_headers(self, html_content: str) -> List[Tuple[str, str]]:
@@ -96,7 +96,7 @@ def chunk(self, input_string: str) -> List[str]:
             recursive_splitter = RecursiveCharacterTextSplitter(
                 separators=TEXT_SEPARATORS,
                 chunk_size=self.chunk_size,
-                chunk_overlap=self.chunk_overlap,
+                chunk_overlap=self.overlap_size,
             )
             final_chunks = []
             for chunk in initial_chunks:
diff --git a/common/chunkers/markdown_chunker.py b/common/chunkers/markdown_chunker.py
index 2d4c4ce..85c1a82 100644
--- a/common/chunkers/markdown_chunker.py
+++ b/common/chunkers/markdown_chunker.py
@@ -20,18 +20,18 @@
 # When chunk_size is not configured, cap any heading-section that exceeds this
 # so that form-based PDFs (tables/bold but no # headings) are not left as a
 # single multi-thousand-character chunk.
-_DEFAULT_FALLBACK_SIZE = 4096
+_DEFAULT_CHUNK_SIZE = 2048
 
 
 class MarkdownChunker(BaseChunker):
-    
+
     def __init__(
         self,
         chunk_size: int = 0,
-        chunk_overlap: int = 0
+        overlap_size: int = -1
     ):
-        self.chunk_size = chunk_size if chunk_size > 0 else _DEFAULT_FALLBACK_SIZE
-        self.chunk_overlap = chunk_overlap
+        self.chunk_size = chunk_size if chunk_size > 0 else _DEFAULT_CHUNK_SIZE
+        self.overlap_size = overlap_size if overlap_size >= 0 else self.chunk_size // 8
 
     def chunk(self, input_string):
         md_splitter = ExperimentalMarkdownSyntaxTextSplitter()
@@ -46,7 +46,7 @@ def chunk(self, input_string):
             recursive_splitter = RecursiveCharacterTextSplitter(
                 separators=TEXT_SEPARATORS,
                 chunk_size=self.chunk_size,
-                chunk_overlap=self.chunk_overlap,
+                chunk_overlap=self.overlap_size,
             )
             md_chunks = []
             for chunk in initial_chunks:
diff --git a/common/chunkers/recursive_chunker.py b/common/chunkers/recursive_chunker.py
index 4c8a324..69ee83a 100644
--- a/common/chunkers/recursive_chunker.py
+++ b/common/chunkers/recursive_chunker.py
@@ -16,13 +16,13 @@
 from common.chunkers.separators import TEXT_SEPARATORS
 from langchain.text_splitter import RecursiveCharacterTextSplitter
 
-_DEFAULT_FALLBACK_SIZE = 4096
+_DEFAULT_CHUNK_SIZE = 2048
 
 
 class RecursiveChunker(BaseChunker):
-    def __init__(self, chunk_size=0, overlap_size=0):
-        self.chunk_size = chunk_size if chunk_size > 0 else _DEFAULT_FALLBACK_SIZE
-        self.overlap_size = overlap_size
+    def __init__(self, chunk_size=0, overlap_size=-1):
+        self.chunk_size = chunk_size if chunk_size > 0 else _DEFAULT_CHUNK_SIZE
+        self.overlap_size = overlap_size if overlap_size >= 0 else self.chunk_size // 8
 
     def chunk(self, input_string):
         text_splitter = RecursiveCharacterTextSplitter(
diff --git a/common/config.py b/common/config.py
index 371e303..3dc3be1 100644
--- a/common/config.py
+++ b/common/config.py
@@ -12,6 +12,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
+import copy
 import json
 import logging
 import os
@@ -122,6 +123,73 @@ def _resolve_service_config(base_config, override=None):
     return result
 
 
+def resolve_llm_services(llm_cfg: dict) -> dict:
+    """
+    Resolve per-service configs from an llm_config dict.
+
+    Applies the same resolution chain as the get_xxx_config() getters but
+    operates on the provided dict instead of the global llm_config. This
+    allows both the on-disk config and a candidate config (from UI payload)
+    to be resolved with the same logic.
+
+    Resolution:
+      1. Inject top-level authentication_configuration and region_name into each service
+      2. completion_service: used as-is; embedding_service: fills missing keys from completion_service when providers match
+      3. chat_service / multimodal_service: completion_service base + overrides
+
+    When chat_service or multimodal_service is absent, the resolved config
+    falls back to completion_service (inherit).
+
+    Returns dict with keys: completion_service, embedding_service,
+    chat_service, multimodal_service — each a fully resolved config.
+    """
+    # Work on a deep copy to avoid mutating the input
+    cfg = copy.deepcopy(llm_cfg)
+
+    # Inject top-level auth into service configs (same as reload_llm_config)
+    top_auth = cfg.get("authentication_configuration", {})
+    if top_auth:
+        for svc_key in ["completion_service", "embedding_service", "multimodal_service", "chat_service"]:
+            if svc_key in cfg:
+                svc = cfg[svc_key]
+                if "authentication_configuration" not in svc:
+                    svc["authentication_configuration"] = top_auth.copy()
+                else:
+                    merged = top_auth.copy()
+                    merged.update(svc["authentication_configuration"])
+                    svc["authentication_configuration"] = merged
+
+    # Inject top-level region_name into service configs if missing
+    top_region = cfg.get("region_name")
+    if top_region:
+        for svc_key in ["completion_service", "embedding_service", "multimodal_service", "chat_service"]:
+            if svc_key in cfg and "region_name" not in cfg[svc_key]:
+                cfg[svc_key]["region_name"] = top_region
+
+    completion = cfg.get("completion_service", {})
+
+    # Resolve embedding: inherit provider-level config from completion
+    # when the embedding provider matches the completion provider.
+    # (embedding has a different schema, model_name vs llm_model, so its
+    # own keys are never overwritten; only keys it lacks are copied over.)
+    embedding = cfg.get("embedding_service", {}).copy()
+    embedding_provider = embedding.get("embedding_model_service", "").lower()
+    completion_provider = completion.get("llm_service", "").lower()
+    if embedding_provider and embedding_provider == completion_provider:
+        # Identity/schema keys that belong to the embedding service itself
+        embedding_own_keys = {"embedding_model_service", "model_name", "authentication_configuration", "token_limit"}
+        for k, v in completion.items():
+            if k not in embedding_own_keys and k not in embedding:
+                embedding[k] = v
+
+    return {
+        "completion_service": completion.copy(),
+        "embedding_service": embedding,
+        "chat_service": _resolve_service_config(completion, cfg.get("chat_service")),
+        "multimodal_service": _resolve_service_config(completion, cfg.get("multimodal_service")),
+    }
+
+
 def get_completion_config(graphname=None):
     """
     Return completion_service config for the given graph.
@@ -142,13 +210,24 @@ def get_completion_config(graphname=None):
     return result
 
 
-DEFAULT_MULTIMODAL_MODELS = {
-    "openai": "gpt-4o-mini",
-    "azure": "gpt-4o-mini",
-    "genai": "gemini-3.5-flash",
-    "vertexai": "gemini-3.5-flash",
-    "bedrock": "us.anthropic.claude-sonnet-4-5-20250929-v1:0",
-}
+def get_embedding_config(graphname=None):
+    """
+    Return embedding_service config for the given graph.
+
+    Resolution: merge graph-specific embedding_service overrides on top of
+    global embedding_service. Graph configs only store overrides, so unchanged
+    fields always inherit the latest global values.
+    """
+    graph_llm = _load_graph_llm_config(graphname)
+    override = graph_llm.get("embedding_service")
+    if override:
+        logger.debug(f"[get_embedding_config] graph={graphname} using graph-specific overrides")
+    result = _resolve_service_config(llm_config["embedding_service"], override)
+
+    if graphname:
+        result["graphname"] = graphname
+
+    return result
 
 
 def get_chat_config(graphname=None):
@@ -189,21 +268,6 @@ def get_chat_config(graphname=None):
     return result
 
 
-def _apply_default_multimodal_model(override, provider):
-    """Apply default vision model if llm_model is not explicitly set."""
-    if override and "llm_model" not in override:
-        default_model = DEFAULT_MULTIMODAL_MODELS.get(provider)
-        if default_model:
-            return {**override, "llm_model": default_model}
-        return override
-    if not override:
-        default_model = DEFAULT_MULTIMODAL_MODELS.get(provider)
-        if default_model:
-            return {"llm_model": default_model}
-        return None
-    return override
-
-
 def get_multimodal_config(graphname=None):
     """
     Return the multimodal/vision config for the given graph.
@@ -211,9 +275,10 @@ def get_multimodal_config(graphname=None):
     Resolution chain:
       1. Start with global completion_service
       2. Merge graph-specific completion_service overrides (shared base)
-      3. Merge multimodal_service overrides (graph-specific > global > default model)
+      3. Merge multimodal_service overrides (graph-specific > global)
 
-    Returns the merged config, or None if the provider doesn't support vision.
+    When no multimodal_service override exists ("inherit"), the completion
+    config is returned as-is — the completion model is used for vision.
     """
     graph_llm = _load_graph_llm_config(graphname)
 
@@ -223,17 +288,11 @@ def get_multimodal_config(graphname=None):
         graph_llm.get("completion_service"),
     )
 
-    # Find multimodal override: graph-specific > global > None
+    # Find multimodal override: graph-specific > global > None (inherit)
     mm_override = graph_llm.get("multimodal_service")
     if mm_override is None and "multimodal_service" in llm_config:
         mm_override = llm_config["multimodal_service"]
 
-    provider = (mm_override or {}).get("llm_service", base.get("llm_service", "")).lower()
-    mm_override = _apply_default_multimodal_model(mm_override, provider)
-
-    if mm_override is None:
-        return None
-
     return _resolve_service_config(base, mm_override)
 
 
@@ -301,6 +360,12 @@ def get_graphrag_config(graphname=None):
                 merged.update(svc["authentication_configuration"])
                 svc["authentication_configuration"] = merged
 
+# Inject top-level region_name into service configs if missing
+if "region_name" in llm_config:
+    for svc_key in ["completion_service", "embedding_service", "multimodal_service", "chat_service"]:
+        if svc_key in llm_config and "region_name" not in llm_config[svc_key]:
+            llm_config[svc_key]["region_name"] = llm_config["region_name"]
+
 _comp = llm_config.get("completion_service")
 if _comp is None:
     raise Exception("completion_service is not found in llm_config")
@@ -479,6 +544,12 @@ def reload_llm_config(new_llm_config: dict = None):
                         merged.update(svc["authentication_configuration"])
                         svc["authentication_configuration"] = merged
 
+        # Inject top-level region_name into service configs if missing
+        if "region_name" in new_llm_config:
+            for svc_key in ["completion_service", "embedding_service", "multimodal_service", "chat_service"]:
+                if svc_key in new_llm_config and "region_name" not in new_llm_config[svc_key]:
+                    new_llm_config[svc_key]["region_name"] = new_llm_config["region_name"]
+
         new_completion_config = new_llm_config.get("completion_service")
         new_embedding_config = new_llm_config.get("embedding_service")
 
diff --git a/common/utils/image_data_extractor.py b/common/utils/image_data_extractor.py
index 48f9b65..711c562 100644
--- a/common/utils/image_data_extractor.py
+++ b/common/utils/image_data_extractor.py
@@ -7,16 +7,31 @@
 logger = logging.getLogger(__name__)
 
 _multimodal_client = None
+_multimodal_provider = None
 
 def _get_client():
-    global _multimodal_client
+    global _multimodal_client, _multimodal_provider
     if _multimodal_client is None and get_multimodal_config():
         try:
-            _multimodal_client = get_llm_service(get_multimodal_config())
+            config = get_multimodal_config()
+            _multimodal_provider = config.get("llm_service", "").lower()
+            _multimodal_client = get_llm_service(config)
         except Exception:
             logger.warning("Failed to create multimodal LLM client")
     return _multimodal_client
 
+def _build_image_content_block(image_base64: str, media_type: str) -> dict:
+    """Build a LangChain image content block appropriate for the configured provider."""
+    if _multimodal_provider in ("genai", "vertexai"):
+        return {
+            "type": "image_url",
+            "image_url": {"url": f"data:{media_type};base64,{image_base64}"},
+        }
+    return {
+        "type": "image",
+        "source": {"type": "base64", "media_type": media_type, "data": image_base64},
+    }
+
 def describe_image_with_llm(file_path):
     """
     Read image file and convert to base64 to send to LLM.
@@ -49,10 +64,7 @@ def describe_image_with_llm(file_path):
                             "If the image has any logo, identify and describe the logo."
                         ),
                     },
-                    {
-                        "type": "image_url",
-                        "image_url": {"url": f"data:image/jpeg;base64,{image_base64}"},
-                    },
+                    _build_image_content_block(image_base64, "image/jpeg"),
                 ],
             ),
         ]
diff --git a/ecc/app/ecc_util.py b/ecc/app/ecc_util.py
index e17ce9f..a28567a 100644
--- a/ecc/app/ecc_util.py
+++ b/ecc/app/ecc_util.py
@@ -18,24 +18,24 @@ def get_chunker(chunker_type: str = "", graphname: str = None):
         )
     elif chunker_type == "character":
         chunker = character_chunker.CharacterChunker(
-            chunk_size=chunker_config.get("chunk_size", 4096),
-            overlap_size=chunker_config.get("overlap_size", 0),
+            chunk_size=chunker_config.get("chunk_size", 0),
+            overlap_size=chunker_config.get("overlap_size", -1),
         )
     elif chunker_type == "markdown":
         chunker = markdown_chunker.MarkdownChunker(
             chunk_size=chunker_config.get("chunk_size", 0),
-            chunk_overlap=chunker_config.get("overlap_size", 0),
+            overlap_size=chunker_config.get("overlap_size", -1),
         )
     elif chunker_type == "html":
         chunker = html_chunker.HTMLChunker(
             chunk_size=chunker_config.get("chunk_size", 0),
-            chunk_overlap=chunker_config.get("overlap_size", 0),
+            overlap_size=chunker_config.get("overlap_size", -1),
             headers=chunker_config.get("headers", None),
         )
     elif chunker_type == "recursive":
         chunker = recursive_chunker.RecursiveChunker(
-            chunk_size=chunker_config.get("chunk_size", 4096),
-            overlap_size=chunker_config.get("overlap_size", 0),
+            chunk_size=chunker_config.get("chunk_size", 0),
+            overlap_size=chunker_config.get("overlap_size", -1),
         )
     elif chunker_type == "single" or chunker_type == "image":
         # Single chunker: NEVER splits, always returns 1 chunk
diff --git a/ecc/app/main.py b/ecc/app/main.py
index 0db691b..d15ac75 100644
--- a/ecc/app/main.py
+++ b/ecc/app/main.py
@@ -35,7 +35,6 @@
     graphrag_config,
     embedding_service,
     get_llm_service,
-    llm_config,
     get_completion_config,
     get_graphrag_config,
     reload_db_config,
@@ -225,7 +224,7 @@ async def run_with_tracking(task_key: str, run_func, graphname: str, conn):
         llm_result = reload_llm_config()
         if llm_result["status"] == "success":
             LogWriter.info(f"LLM config reloaded: {llm_result['message']}")
-            completion_service = llm_config.get("completion_service", {})
+            completion_service = get_completion_config(graphname)
             ecc_model = completion_service.get("llm_model", "unknown")
             ecc_provider = completion_service.get("llm_service", "unknown")
             LogWriter.info(
diff --git a/graphrag-ui/src/pages/setup/GraphRAGConfig.tsx b/graphrag-ui/src/pages/setup/GraphRAGConfig.tsx
index 0e05e4e..dc33689 100644
--- a/graphrag-ui/src/pages/setup/GraphRAGConfig.tsx
+++ b/graphrag-ui/src/pages/setup/GraphRAGConfig.tsx
@@ -1,4 +1,4 @@
-import React, { useState, useEffect } from "react";
+import React, { useState, useEffect, useRef } from "react";
 import { Settings, Save, Loader2 } from "lucide-react";
 import { Input } from "@/components/ui/input";
 import { Button } from "@/components/ui/button";
@@ -14,7 +14,7 @@ import ConfigScopeToggle from "@/components/ConfigScopeToggle";
 const GraphRAGConfig = () => {
   const [selectedGraph, setSelectedGraph] = useState(sessionStorage.getItem("selectedGraph") || "");
   const [availableGraphs, setAvailableGraphs] = useState<string[]>([]);
-  const [reuseEmbedding, setReuseEmbedding] = useState(false);
+  const [reuseEmbedding, setReuseEmbedding] = useState(true);
   const [eccUrl, setEccUrl] = useState("http://graphrag-ecc:8001");
   const [chatHistoryUrl, setChatHistoryUrl] = useState("http://chat-history:8002");
 
@@ -35,11 +35,11 @@ const GraphRAGConfig = () => {
   const [maxConcurrency, setMaxConcurrency] = useState("10");
 
   // Chunker-specific settings
-  const [chunkSize, setChunkSize] = useState("1024");
-  const [overlapSize, setOverlapSize] = useState("0");
-  const [semanticMethod, setSemanticMethod] = useState("percentile");
-  const [semanticThreshold, setSemanticThreshold] = useState("0.95");
-  const [regexPattern, setRegexPattern] = useState("\\r?\\n");
+  const [chunkSize, setChunkSize] = useState("");
+  const [overlapSize, setOverlapSize] = useState("");
+  const [semanticMethod, setSemanticMethod] = useState("");
+  const [semanticThreshold, setSemanticThreshold] = useState("");
+  const [regexPattern, setRegexPattern] = useState("");
 
   const [isLoading, setIsLoading] = useState(false);
   const [isSaving, setIsSaving] = useState(false);
@@ -50,6 +50,10 @@ const GraphRAGConfig = () => {
   const [configScope, setConfigScope] = useState<"global" | "graph">("global");
   const [graphOverrides, setGraphOverrides] = useState<Record<string, any>>({});
 
+  // Track configs as loaded from API so we only save what's needed
+  const loadedGlobalConfig = useRef<Record<string, any>>({});
+  const loadedGraphOverrides = useRef<Record<string, any>>({});
+
   useEffect(() => {
     const site = JSON.parse(sessionStorage.getItem("site") || "{}");
     setAvailableGraphs(site.graphs || []);
@@ -59,7 +63,7 @@ const GraphRAGConfig = () => {
 
   const applyGraphragConfig = (graphragConfig: any) => {
     if (!graphragConfig) return;
-    setReuseEmbedding(graphragConfig.reuse_embedding || false);
+    setReuseEmbedding(graphragConfig.reuse_embedding ?? true);
     setEccUrl(graphragConfig.ecc || "http://graphrag-ecc:8001");
     setChatHistoryUrl(graphragConfig.chat_history_api || "http://chat-history:8002");
     setDefaultChunker(graphragConfig.chunker || "semantic");
@@ -67,17 +71,17 @@ const GraphRAGConfig = () => {
     setNumHops(String(graphragConfig.num_hops ?? 2));
     setNumSeenMin(String(graphragConfig.num_seen_min ?? 2));
     setCommunityLevel(String(graphragConfig.community_level ?? 2));
-    setDocOnly(graphragConfig.doc_only || false);
+    setDocOnly(graphragConfig.doc_only ?? false);
     setLoadBatchSize(String(graphragConfig.load_batch_size ?? 500));
     setUpsertDelay(String(graphragConfig.upsert_delay ?? 0));
     setMaxConcurrency(String(graphragConfig.default_concurrency ?? 10));
 
     const chunkerConfig = graphragConfig.chunker_config || {};
-    setChunkSize(String(chunkerConfig.chunk_size || 1024));
-    setOverlapSize(String(chunkerConfig.overlap_size || 0));
-    setSemanticMethod(chunkerConfig.method || "percentile");
-    setSemanticThreshold(String(chunkerConfig.threshold || 0.95));
-    setRegexPattern(chunkerConfig.pattern || "\\r?\\n");
+    setChunkSize(String(chunkerConfig.chunk_size ?? ""));
+    setOverlapSize(chunkerConfig.overlap_size != null ? String(chunkerConfig.overlap_size) : "");
+    setSemanticMethod(chunkerConfig.method || "");
+    setSemanticThreshold(chunkerConfig.threshold != null ? String(chunkerConfig.threshold) : "");
+    setRegexPattern(chunkerConfig.pattern != null ? chunkerConfig.pattern : "");
   };
 
   const fetchConfig = async (scope?: "global" | "graph", graphname?: string) => {
@@ -100,12 +104,17 @@ const GraphRAGConfig = () => {
 
       const data = await response.json();
 
+      const deepCopy = (obj: any) => JSON.parse(JSON.stringify(obj || {}));
+      loadedGlobalConfig.current = deepCopy(data.graphrag_config);
+
       if (effectiveScope === "graph" && data.graphrag_overrides) {
+        loadedGraphOverrides.current = deepCopy(data.graphrag_overrides);
         setGraphOverrides(data.graphrag_overrides);
         // Show per-graph values: merge global + overrides for display
         const merged = { ...data.graphrag_config, ...data.graphrag_overrides };
         applyGraphragConfig(merged);
       } else {
+        loadedGraphOverrides.current = {};
         setGraphOverrides({});
         applyGraphragConfig(data.graphrag_config);
       }
@@ -126,28 +135,20 @@ const GraphRAGConfig = () => {
     try {
       const creds = sessionStorage.getItem("creds");
       
-      // Prepare chunker config based on selected chunker type
-      const chunkerConfig: any = {};
-      
-      if (defaultChunker === "character" || defaultChunker === "markdown" || defaultChunker === "recursive") {
-        chunkerConfig.chunk_size = parseInt(chunkSize);
-        chunkerConfig.overlap_size = parseInt(overlapSize);
-      } else if (defaultChunker === "semantic") {
-        chunkerConfig.method = semanticMethod;
-        chunkerConfig.threshold = parseFloat(semanticThreshold);
-      } else if (defaultChunker === "regex") {
-        chunkerConfig.pattern = regexPattern;
-      } else if (defaultChunker === "html") {
-        // HTML chunker doesn't require specific config in the current implementation
-        // but we keep it consistent
-      }
-      
-      const graphragConfigData: any = {
+      // Build current UI state — only include non-empty fields
+      const currentChunkerConfig: any = {};
+      if (chunkSize !== "") currentChunkerConfig.chunk_size = parseInt(chunkSize);
+      if (overlapSize !== "") currentChunkerConfig.overlap_size = parseInt(overlapSize);
+      if (semanticMethod !== "") currentChunkerConfig.method = semanticMethod;
+      if (semanticThreshold !== "") currentChunkerConfig.threshold = parseFloat(semanticThreshold);
+      if (regexPattern !== "") currentChunkerConfig.pattern = regexPattern;
+
+      const currentConfig: any = {
         reuse_embedding: reuseEmbedding,
         ecc: eccUrl,
         chat_history_api: chatHistoryUrl,
         chunker: defaultChunker,
-        chunker_config: chunkerConfig,
+        chunker_config: currentChunkerConfig,
         top_k: parseInt(topK),
         num_hops: parseInt(numHops),
         num_seen_min: parseInt(numSeenMin),
@@ -158,6 +159,87 @@ const GraphRAGConfig = () => {
         default_concurrency: parseInt(maxConcurrency),
       };
 
+      // Display defaults — used to avoid saving values the user never changed
+      const displayDefaults: Record<string, any> = {
+        reuse_embedding: true,
+        ecc: "http://graphrag-ecc:8001",
+        chat_history_api: "http://chat-history:8002",
+        chunker: "semantic",
+        top_k: 5,
+        num_hops: 2,
+        num_seen_min: 2,
+        community_level: 2,
+        doc_only: false,
+        load_batch_size: 500,
+        upsert_delay: 0,
+        default_concurrency: 10,
+      };
+
+      // Determine which config to diff against based on scope
+      const globalCfg = loadedGlobalConfig.current;
+      const globalChunker = globalCfg.chunker_config || {};
+      const graphragConfigData: any = {};
+
+      // Helper: should a key be saved?
+      // - If the key was already present in the reference config (loaded/overrides), always save it
+      // - Otherwise, save it only if the value differs from both the reference config and the display default
+      const shouldSave = (key: string, current: any, reference: Record<string, any>, wasInRef: boolean) => {
+        if (wasInRef) return true;
+        const diffFromRef = JSON.stringify(current) !== JSON.stringify(reference[key]);
+        const matchesDefault = JSON.stringify(current) === JSON.stringify(displayDefaults[key]);
+        return diffFromRef && !matchesDefault;
+      };
+
+      if (configScope === "graph") {
+        // Graph scope: save values that differ from effective global (loaded or display default)
+        // or were already overridden per-graph
+        const overrides = loadedGraphOverrides.current;
+        const overridesChunker = overrides.chunker_config || {};
+
+        for (const key of Object.keys(currentConfig)) {
+          if (key === "chunker_config") continue;
+          const effectiveGlobal = key in globalCfg ? globalCfg[key] : displayDefaults[key];
+          const diffFromGlobal = JSON.stringify(currentConfig[key]) !== JSON.stringify(effectiveGlobal);
+          const wasOverridden = key in overrides;
+          if (wasOverridden || diffFromGlobal) {
+            graphragConfigData[key] = currentConfig[key];
+          }
+        }
+
+        const chunkerConfig: any = {};
+        for (const key of Object.keys(currentChunkerConfig)) {
+          const effectiveGlobal = key in globalChunker ? globalChunker[key] : undefined;
+          const diffFromGlobal = JSON.stringify(currentChunkerConfig[key]) !== JSON.stringify(effectiveGlobal);
+          const wasOverridden = key in overridesChunker;
+          if (wasOverridden || diffFromGlobal) {
+            chunkerConfig[key] = currentChunkerConfig[key];
+          }
+        }
+        if (Object.keys(chunkerConfig).length > 0 || "chunker_config" in overrides) {
+          graphragConfigData.chunker_config = chunkerConfig;
+        }
+      } else {
+        // Global scope: save loaded keys + user changes (skip display defaults for unloaded keys)
+        for (const key of Object.keys(currentConfig)) {
+          if (key === "chunker_config") continue;
+          if (shouldSave(key, currentConfig[key], globalCfg, key in globalCfg)) {
+            graphragConfigData[key] = currentConfig[key];
+          }
+        }
+
+        const chunkerConfig: any = {};
+        for (const key of Object.keys(currentChunkerConfig)) {
+          const wasLoaded = key in globalChunker;
+          const changed = JSON.stringify(currentChunkerConfig[key]) !== JSON.stringify(globalChunker[key]);
+          if (wasLoaded || changed) {
+            chunkerConfig[key] = currentChunkerConfig[key];
+          }
+        }
+        if (Object.keys(chunkerConfig).length > 0 || "chunker_config" in globalCfg) {
+          graphragConfigData.chunker_config = chunkerConfig;
+        }
+      }
+
       if (configScope === "graph") {
         graphragConfigData.scope = "graph";
         graphragConfigData.graphname = selectedGraph;
@@ -408,8 +490,11 @@ const GraphRAGConfig = () => {
                 </p>
               </div>
 
-              {/* Settings for character/markdown/recursive chunkers */}
-              {(defaultChunker === "character" || defaultChunker === "markdown" || defaultChunker === "recursive") && (
+              {/* Character/Markdown/Recursive chunker settings */}
+              <div className="border border-gray-200 dark:border-[#3D3D3D] rounded-lg p-4">
+                <h3 className="text-sm font-medium mb-3 text-black dark:text-white">
+                  Character / Markdown / Recursive Chunker
+                </h3>
                 <div className="grid grid-cols-2 gap-4">
                   <div>
                     <label className="block text-sm font-medium mb-2 text-black dark:text-white">
@@ -418,7 +503,7 @@ const GraphRAGConfig = () => {
                     <Input
                       type="number"
                       className="dark:border-[#3D3D3D] dark:bg-background"
-                      placeholder="1024"
+                      placeholder="2048"
                       value={chunkSize}
                       onChange={(e) => setChunkSize(e.target.value)}
                     />
@@ -426,7 +511,6 @@ const GraphRAGConfig = () => {
                       Maximum size of each chunk
                     </p>
                   </div>
-
                   <div>
                     <label className="block text-sm font-medium mb-2 text-black dark:text-white">
                       Overlap Size
@@ -434,39 +518,42 @@ const GraphRAGConfig = () => {
                     <Input
                       type="number"
                       className="dark:border-[#3D3D3D] dark:bg-background"
-                      placeholder="0"
+                      placeholder="1/8 of chunk size"
                       value={overlapSize}
                       onChange={(e) => setOverlapSize(e.target.value)}
                     />
                     <p className="text-xs text-gray-500 dark:text-gray-400 mt-1">
-                      Overlap between consecutive chunks
+                      Overlap between consecutive chunks. Defaults to 1/8 of chunk size if empty.
                     </p>
                   </div>
                 </div>
-              )}
+              </div>
 
-              {/* Settings for semantic chunker */}
-              {defaultChunker === "semantic" && (
+              {/* Semantic chunker settings */}
+              <div className="border border-gray-200 dark:border-[#3D3D3D] rounded-lg p-4">
+                <h3 className="text-sm font-medium mb-3 text-black dark:text-white">
+                  Semantic Chunker
+                </h3>
                 <div className="grid grid-cols-2 gap-4">
                   <div>
                     <label className="block text-sm font-medium mb-2 text-black dark:text-white">
                       Semantic Method
                     </label>
-                    <Select value={semanticMethod} onValueChange={setSemanticMethod}>
+                    <Select value={semanticMethod || "percentile"} onValueChange={(v) => setSemanticMethod(v)}>
                       <SelectTrigger className="dark:border-[#3D3D3D] dark:bg-background">
-                        <SelectValue placeholder="Select method" />
+                        <SelectValue placeholder="Percentile (default)" />
                       </SelectTrigger>
                       <SelectContent>
                         <SelectItem value="percentile">Percentile</SelectItem>
                         <SelectItem value="standard_deviation">Standard Deviation</SelectItem>
                         <SelectItem value="interquartile">Interquartile</SelectItem>
+                        <SelectItem value="gradient">Gradient</SelectItem>
                       </SelectContent>
                     </Select>
                     <p className="text-xs text-gray-500 dark:text-gray-400 mt-1">
                       Breakpoint detection method
                     </p>
                   </div>
-
                   <div>
                     <label className="block text-sm font-medium mb-2 text-black dark:text-white">
                       Semantic Threshold
@@ -484,10 +571,13 @@ const GraphRAGConfig = () => {
                     </p>
                   </div>
                 </div>
-              )}
+              </div>
 
-              {/* Settings for regex chunker */}
-              {defaultChunker === "regex" && (
+              {/* Regex chunker settings */}
+              <div className="border border-gray-200 dark:border-[#3D3D3D] rounded-lg p-4">
+                <h3 className="text-sm font-medium mb-3 text-black dark:text-white">
+                  Regex Chunker
+                </h3>
                 <div>
                   <label className="block text-sm font-medium mb-2 text-black dark:text-white">
                     Regex Pattern
@@ -503,16 +593,7 @@ const GraphRAGConfig = () => {
                     Regular expression pattern to split on
                   </p>
                 </div>
-              )}
-
-              {/* Info for HTML chunker */}
-              {defaultChunker === "html" && (
-                <div className="p-4 rounded-lg bg-blue-50 dark:bg-blue-900/20 text-blue-800 dark:text-blue-200">
-                  <p className="text-sm">
-                    HTML chunker uses the document structure to split content. No additional configuration needed.
-                  </p>
-                </div>
-              )}
+              </div>
             </div>
           </div>
 
diff --git a/graphrag-ui/src/pages/setup/LLMConfig.tsx b/graphrag-ui/src/pages/setup/LLMConfig.tsx
index aa5596f..836e17c 100644
--- a/graphrag-ui/src/pages/setup/LLMConfig.tsx
+++ b/graphrag-ui/src/pages/setup/LLMConfig.tsx
@@ -64,7 +64,7 @@ const PROVIDER_FIELDS: Record<string, ProviderConfig> = {
       { key: "AWS_SECRET_ACCESS_KEY", label: "AWS Secret Access Key", type: "password", required: true }
     ],
     configFields: [
-      { key: "region_name", label: "AWS Region", type: "text", required: true, placeholder: "us-east-1" }
+      { key: "region_name", label: "AWS Region", type: "text", required: false, placeholder: "us-east-1" }
     ]
   },
   groq: {
@@ -107,6 +107,20 @@ const PROVIDER_FIELDS: Record<string, ProviderConfig> = {
   }
 };
 
+// Single provider list shared across all service Select dropdowns
+const LLM_PROVIDERS = [
+  { value: "openai", label: "OpenAI" },
+  { value: "azure", label: "Azure OpenAI" },
+  { value: "genai", label: "Google GenAI (Gemini)" },
+  { value: "vertexai", label: "Google Vertex AI" },
+  { value: "bedrock", label: "AWS Bedrock" },
+  { value: "groq", label: "Groq" },
+  { value: "ollama", label: "Ollama" },
+  { value: "sagemaker", label: "AWS SageMaker" },
+  { value: "huggingface", label: "HuggingFace" },
+  { value: "watsonx", label: "IBM WatsonX" },
+] as const;
+
 const LLMConfig = () => {
   const [selectedGraph, setSelectedGraph] = useState(sessionStorage.getItem("selectedGraph") || "");
   const [availableGraphs, setAvailableGraphs] = useState<string[]>([]);
@@ -119,15 +133,10 @@ const LLMConfig = () => {
   const [messageType, setMessageType] = useState<"success" | "error" | "">("");
   const [testResults, setTestResults] = useState<any>(null);
   const [connectionTested, setConnectionTested] = useState(false);
-  
-  // Single provider state
-  const [singleProvider, setSingleProvider] = useState("openai");
-  const [singleConfig, setSingleConfig] = useState<Record<string, string>>({});
-  const [singleDefaultModel, setSingleDefaultModel] = useState("");
-  const [singleEmbeddingModel, setSingleEmbeddingModel] = useState("");
-  const [multimodalModel, setMultimodalModel] = useState("");
-
-  // Multi-provider state
+
+  const [useCustomMultimodal, setUseCustomMultimodal] = useState(false);
+
+  // Canonical per-service state — both single and multi-provider UIs read/write these
   const [completionProvider, setCompletionProvider] = useState("openai");
   const [completionConfig, setCompletionConfig] = useState<Record<string, string>>({});
   const [completionDefaultModel, setCompletionDefaultModel] = useState("");
@@ -183,124 +192,118 @@ const LLMConfig = () => {
       const llmConfig = data.llm_config;
       setLlmConfigAccess(data.llm_config_access === "chatbot_only" ? "chatbot_only" : "full");
 
+      // Store graph overrides when in per-graph scope
+      if (data.graph_overrides) {
+        setGraphOverrides(data.graph_overrides);
+      } else {
+        setGraphOverrides({});
+      }
+
+      // Detect providers (needed by chat/multimodal fallback below)
+      const completionProv = llmConfig.completion_service?.llm_service?.toLowerCase();
+      const embeddingProv = llmConfig.embedding_service?.embedding_model_service?.toLowerCase();
+      const multimodalProv = llmConfig.multimodal_service?.llm_service?.toLowerCase();
+      const chatProv = llmConfig.chat_service?.llm_service?.toLowerCase();
+      const defaultProv = completionProv || "openai";
+
+      // All config field keys that any provider might use
+      const allConfigKeys = ["base_url", "azure_deployment", "region_name", "project", "location", "endpoint_name", "endpoint_url"];
+
+      // Build the base config: top-level auth + completion_service fields.
+      // Every service inherits missing keys from this base.
+      const baseConfig: Record<string, string> = {};
+      // Layer 1: top-level auth
+      if (llmConfig.authentication_configuration) {
+        for (const [key, value] of Object.entries(llmConfig.authentication_configuration)) {
+          if (typeof value === "string") baseConfig[key] = value;
+        }
+      }
+      // Layer 2: completion_service config fields + auth
+      if (llmConfig.completion_service) {
+        for (const key of allConfigKeys) {
+          if (llmConfig.completion_service[key]) baseConfig[key] = llmConfig.completion_service[key];
+        }
+        if (llmConfig.completion_service.authentication_configuration) {
+          for (const [key, value] of Object.entries(llmConfig.completion_service.authentication_configuration)) {
+            if (typeof value === "string") baseConfig[key] = value;
+          }
+        }
+      }
+
+      // Helper: load a service config, inheriting all missing keys from baseConfig
+      const loadServiceConfigResolved = (svc: any) => {
+        // Start with base config as defaults
+        const cfg: Record<string, string> = { ...baseConfig };
+        // Override with service-specific config fields
+        if (svc) {
+          for (const key of allConfigKeys) {
+            if (svc[key]) cfg[key] = svc[key];
+          }
+          // Override with service-specific auth
+          if (svc.authentication_configuration) {
+            for (const [key, value] of Object.entries(svc.authentication_configuration)) {
+              if (typeof value === "string") cfg[key] = value;
+            }
+          }
+        }
+        return cfg;
+      };
+
       // Parse per-graph chatbot config (chatbot_only mode)
       if (data.global_chat_info) {
         setGlobalChatInfo(data.global_chat_info);
       }
       if (data.chatbot_config) {
         setUseCustomChatbot(true);
-        setChatbotProvider(data.chatbot_config.llm_service?.toLowerCase() || "openai");
+        setChatbotProvider(data.chatbot_config.llm_service?.toLowerCase() || defaultProv);
         setChatbotModelName(data.chatbot_config.llm_model || "");
         setChatbotTemperature(String(data.chatbot_config.model_kwargs?.temperature ?? "0"));
-        // Load provider-specific config fields + masked auth
-        const cfg: Record<string, string> = {};
-        for (const key of ["base_url", "azure_deployment", "region_name", "project", "location", "endpoint_name", "endpoint_url"]) {
-          if (data.chatbot_config[key]) cfg[key] = data.chatbot_config[key];
-        }
-        if (data.chatbot_config.authentication_configuration) {
-          for (const [key, value] of Object.entries(data.chatbot_config.authentication_configuration)) {
-            if (typeof value === "string") cfg[key] = value;
-          }
-        }
-        setChatbotProviderConfig(cfg);
+        // Resolve chatbot config: base config + chatbot overrides
+        setChatbotProviderConfig(loadServiceConfigResolved(data.chatbot_config));
       } else {
         setUseCustomChatbot(false);
       }
 
-      // Store graph overrides when in per-graph scope
-      if (data.graph_overrides) {
-        setGraphOverrides(data.graph_overrides);
-      } else {
-        setGraphOverrides({});
-      }
-
       const currentDefaultModel = llmConfig.completion_service?.llm_model || "";
-      setSingleDefaultModel(currentDefaultModel);
+      setCompletionDefaultModel(currentDefaultModel);
+
+      const allSameProvider =
+        completionProv === embeddingProv &&
+        (!multimodalProv || completionProv === multimodalProv) &&
+        (!chatProv || completionProv === chatProv);
+
+      setUseMultipleProviders(!allSameProvider);
 
       // Load chat_service config for full mode (superadmin)
+      // Chat inherits from base (completion) when not explicitly set
       if (llmConfig.chat_service) {
         setUseCustomChatbot(true);
-        setChatbotProvider(llmConfig.chat_service.llm_service?.toLowerCase() || "openai");
+        setChatbotProvider(chatProv || defaultProv);
         setChatbotModelName(llmConfig.chat_service.llm_model || "");
         setChatbotTemperature(String(llmConfig.chat_service.model_kwargs?.temperature ?? "0"));
-        const chatCfg: Record<string, string> = {};
-        for (const key of ["base_url", "azure_deployment", "region_name", "project", "location", "endpoint_name", "endpoint_url"]) {
-          if (llmConfig.chat_service[key]) chatCfg[key] = llmConfig.chat_service[key];
-        }
-        if (llmConfig.chat_service.authentication_configuration) {
-          for (const [key, value] of Object.entries(llmConfig.chat_service.authentication_configuration)) {
-            if (typeof value === "string") chatCfg[key] = value;
-          }
-        }
-        setChatbotProviderConfig(chatCfg);
+        setChatbotProviderConfig(loadServiceConfigResolved(llmConfig.chat_service));
       } else {
         setUseCustomChatbot(false);
-        setChatbotProvider("openai");
+        setChatbotProvider(defaultProv);
         setChatbotModelName("");
         setChatbotTemperature("0");
-        setChatbotProviderConfig({});
+        setChatbotProviderConfig({ ...baseConfig });
       }
 
-      // Detect if using multiple providers
-      const completionProv = llmConfig.completion_service?.llm_service?.toLowerCase();
-      const embeddingProv = llmConfig.embedding_service?.embedding_model_service?.toLowerCase();
-      const multimodalProv = llmConfig.multimodal_service?.llm_service?.toLowerCase();
-      const chatProv = llmConfig.chat_service?.llm_service?.toLowerCase();
+      // Canonical per-service state — both single and multi-provider UIs read these
+      setCompletionProvider(completionProv || "openai");
+      setCompletionDefaultModel(llmConfig.completion_service?.llm_model || "");
+      setCompletionConfig(loadServiceConfigResolved(llmConfig.completion_service));
 
-      const allSameProvider =
-        completionProv === embeddingProv &&
-        (!multimodalProv || completionProv === multimodalProv) &&
-        (!chatProv || completionProv === chatProv);
-      
-      setUseMultipleProviders(!allSameProvider);
+      setEmbeddingProvider(embeddingProv || completionProv || "openai");
+      setEmbeddingModel(llmConfig.embedding_service?.model_name || "");
+      setEmbeddingConfig(loadServiceConfigResolved(llmConfig.embedding_service));
 
-      // Helper: load config fields + masked auth fields from a service config
-      const loadServiceConfig = (svc: any, configKeys: string[]) => {
-        const cfg: Record<string, string> = {};
-        for (const key of configKeys) {
-          if (svc?.[key]) cfg[key] = svc[key];
-        }
-        // Load masked auth fields from authentication_configuration
-        if (svc?.authentication_configuration) {
-          for (const [key, value] of Object.entries(svc.authentication_configuration)) {
-            if (typeof value === "string") cfg[key] = value;
-          }
-        }
-        return cfg;
-      };
-
-      const completionConfigKeys = ["base_url", "azure_deployment", "region_name", "project", "location", "endpoint_name", "endpoint_url"];
-      const embeddingConfigKeys = ["base_url", "azure_deployment", "region_name"];
-
-      if (!allSameProvider) {
-        // Multi-provider mode - Load from backend
-        setCompletionProvider(completionProv || "openai");
-        setCompletionDefaultModel(llmConfig.completion_service?.llm_model || "");
-        setCompletionConfig(loadServiceConfig(llmConfig.completion_service, completionConfigKeys));
-
-        setEmbeddingProvider(embeddingProv || "openai");
-        setEmbeddingModel(llmConfig.embedding_service?.model_name || "");
-        setEmbeddingConfig(loadServiceConfig(llmConfig.embedding_service, embeddingConfigKeys));
-
-        setMultimodalProvider(multimodalProv || "openai");
-        setMultimodalModelName(llmConfig.multimodal_service?.llm_model || "");
-        setMultimodalConfig(loadServiceConfig(llmConfig.multimodal_service, ["azure_deployment"]));
-      } else {
-        // Single provider mode - Load from backend
-        setSingleProvider(completionProv || "openai");
-        setSingleDefaultModel(llmConfig.completion_service?.llm_model || "");
-        setSingleEmbeddingModel(llmConfig.embedding_service?.model_name || "");
-        setMultimodalModel(llmConfig.multimodal_service?.llm_model || "");
-        // Load config + auth from completion_service (single provider shares auth)
-        const singleCfg = loadServiceConfig(llmConfig.completion_service, completionConfigKeys);
-        // Also load top-level authentication_configuration (used in single-provider mode)
-        if (llmConfig.authentication_configuration) {
-          for (const [key, value] of Object.entries(llmConfig.authentication_configuration)) {
-            if (typeof value === "string" && !singleCfg[key]) singleCfg[key] = value;
-          }
-        }
-        setSingleConfig(singleCfg);
-      }
+      setMultimodalProvider(multimodalProv || completionProv || "openai");
+      const mmModel = llmConfig.multimodal_service?.llm_model || "";
+      setMultimodalModelName(mmModel);
+      setMultimodalConfig(loadServiceConfigResolved(llmConfig.multimodal_service));
+      setUseCustomMultimodal(!!mmModel || !!multimodalProv);
     } catch (error: any) {
       console.error("Error fetching config:", error);
       setMessage(`Failed to load configuration: ${error.message}`);
@@ -318,19 +321,20 @@ const LLMConfig = () => {
   };
 
   // Update config when provider changes - CLEAR ALL FIELDS
-  const handleProviderChange = (newProvider: string, target: 'single' | 'completion' | 'embedding' | 'multimodal') => {
-    if (target === 'single') {
-      setSingleProvider(newProvider);
-      setSingleConfig({});
-      // Clear model names when switching provider
-      setSingleDefaultModel("");
-      setSingleEmbeddingModel("");
-      setMultimodalModel("");
-    } else if (target === 'completion') {
+  const handleProviderChange = (newProvider: string, target: 'completion' | 'embedding' | 'multimodal') => {
+    if (target === 'completion') {
       setCompletionProvider(newProvider);
       setCompletionConfig({});
-      // Clear model names when switching provider
       setCompletionDefaultModel("");
+      // In single-provider mode, all services share the same provider
+      if (!useMultipleProviders) {
+        setEmbeddingProvider(newProvider);
+        setEmbeddingConfig({});
+        setEmbeddingModel("");
+        setMultimodalProvider(newProvider);
+        setMultimodalConfig({});
+        setMultimodalModelName("");
+      }
     } else if (target === 'embedding') {
       setEmbeddingProvider(newProvider);
       setEmbeddingConfig({});
@@ -373,6 +377,103 @@ const LLMConfig = () => {
     return serviceConfig;
   };
 
+  /**
+   * Build the candidate LLM config payload.
+   * Used by both test-connection and save — same structure, single source of truth.
+   * Inherited services (multimodal, chatbot) are set to null when not customized.
+   */
+  const buildLLMConfigPayload = (): any => {
+    let llmConfigData: any;
+
+    if (useMultipleProviders) {
+      const completionServiceConfig: any = {
+        llm_service: completionProvider,
+        llm_model: completionDefaultModel,
+        authentication_configuration: buildAuthConfig(completionProvider, completionConfig),
+        model_kwargs: { temperature: 0 },
+        prompt_path: `./common/prompts/${getPromptPath(completionProvider)}/`,
+        ...buildServiceConfig(completionProvider, completionConfig)
+      };
+
+      llmConfigData = {
+        graphname: selectedGraph || undefined,
+        completion_service: completionServiceConfig,
+        embedding_service: {
+          embedding_model_service: embeddingProvider,
+          model_name: embeddingModel,
+          authentication_configuration: buildAuthConfig(embeddingProvider, embeddingConfig),
+          ...buildServiceConfig(embeddingProvider, embeddingConfig)
+        },
+      };
+
+      if (useCustomMultimodal && multimodalModelName.trim()) {
+        llmConfigData.multimodal_service = {
+          llm_service: multimodalProvider,
+          llm_model: multimodalModelName,
+          authentication_configuration: buildAuthConfig(multimodalProvider, multimodalConfig),
+          model_kwargs: { temperature: 0 },
+          ...buildServiceConfig(multimodalProvider, multimodalConfig)
+        };
+      } else {
+        llmConfigData.multimodal_service = null;
+      }
+
+      if (useCustomChatbot) {
+        llmConfigData.chat_service = {
+          llm_service: chatbotProvider,
+          llm_model: chatbotModelName,
+          authentication_configuration: buildAuthConfig(chatbotProvider, chatbotProviderConfig),
+          model_kwargs: { temperature: parseFloat(chatbotTemperature) || 0 },
+          ...buildServiceConfig(chatbotProvider, chatbotProviderConfig),
+        };
+      } else {
+        llmConfigData.chat_service = null;
+      }
+    } else {
+      const completionServiceConfig: any = {
+        llm_service: completionProvider,
+        llm_model: completionDefaultModel,
+        model_kwargs: { temperature: 0 },
+        prompt_path: `./common/prompts/${getPromptPath(completionProvider)}/`,
+        ...buildServiceConfig(completionProvider, completionConfig)
+      };
+
+      llmConfigData = {
+        graphname: selectedGraph || undefined,
+        authentication_configuration: buildAuthConfig(completionProvider, completionConfig),
+        completion_service: completionServiceConfig,
+        embedding_service: {
+          embedding_model_service: completionProvider,
+          model_name: embeddingModel,
+        },
+      };
+
+      if (useCustomMultimodal && multimodalModelName.trim()) {
+        llmConfigData.multimodal_service = {
+          llm_model: multimodalModelName,
+        };
+      } else {
+        llmConfigData.multimodal_service = null;
+      }
+
+      if (useCustomChatbot) {
+        const chatTemp = parseFloat(chatbotTemperature) || 0;
+        llmConfigData.chat_service = {
+          ...(chatbotModelName.trim() ? { llm_model: chatbotModelName } : {}),
+          model_kwargs: { temperature: chatTemp },
+        };
+      } else {
+        llmConfigData.chat_service = null;
+      }
+    }
+
+    if (configScope === "graph") {
+      llmConfigData.scope = "graph";
+    }
+
+    return llmConfigData;
+  };
+
   const handleSave = async () => {
     setIsSaving(true);
     setMessage("");
@@ -420,85 +521,7 @@ const LLMConfig = () => {
         return;
       }
 
-      if (useMultipleProviders) {
-        const completionServiceConfig: any = {
-          llm_service: completionProvider,
-          llm_model: completionDefaultModel,
-          authentication_configuration: buildAuthConfig(completionProvider, completionConfig),
-          model_kwargs: { temperature: 0 },
-          prompt_path: `./common/prompts/${getPromptPath(completionProvider)}/`,
-          ...buildServiceConfig(completionProvider, completionConfig)
-        };
-        
-        llmConfigData = {
-          graphname: selectedGraph || undefined,
-          completion_service: completionServiceConfig,
-          embedding_service: {
-            embedding_model_service: embeddingProvider,
-            model_name: embeddingModel,
-            authentication_configuration: buildAuthConfig(embeddingProvider, embeddingConfig),
-            ...buildServiceConfig(embeddingProvider, embeddingConfig)
-          },
-          multimodal_service: {
-            llm_service: multimodalProvider,
-            llm_model: multimodalModelName,
-            authentication_configuration: buildAuthConfig(multimodalProvider, multimodalConfig),
-            model_kwargs: { temperature: 0 },
-            ...buildServiceConfig(multimodalProvider, multimodalConfig)
-          },
-        };
-
-        // Save chat_service if not inheriting from completion service
-        if (useCustomChatbot) {
-          llmConfigData.chat_service = {
-            llm_service: chatbotProvider,
-            llm_model: chatbotModelName,
-            authentication_configuration: buildAuthConfig(chatbotProvider, chatbotProviderConfig),
-            model_kwargs: { temperature: parseFloat(chatbotTemperature) || 0 },
-            ...buildServiceConfig(chatbotProvider, chatbotProviderConfig),
-          };
-        } else {
-          llmConfigData.chat_service = null;
-        }
-      } else {
-        const completionServiceConfig: any = {
-          llm_service: singleProvider,
-          llm_model: singleDefaultModel,
-          model_kwargs: { temperature: 0 },
-          prompt_path: `./common/prompts/${getPromptPath(singleProvider)}/`,
-          ...buildServiceConfig(singleProvider, singleConfig)
-        };
-        
-        llmConfigData = {
-          graphname: selectedGraph || undefined,
-          authentication_configuration: buildAuthConfig(singleProvider, singleConfig),
-          completion_service: completionServiceConfig,
-          embedding_service: {
-            embedding_model_service: singleProvider,
-            model_name: singleEmbeddingModel,
-          },
-          multimodal_service: {
-            llm_service: singleProvider,
-            llm_model: multimodalModel,
-            model_kwargs: { temperature: 0 },
-            ...buildServiceConfig(singleProvider, singleConfig)
-          },
-        };
-
-        // Save chat_service with just the model name (same provider as completion)
-        if (chatbotModelName.trim()) {
-          llmConfigData.chat_service = {
-            llm_model: chatbotModelName,
-          };
-        } else {
-          llmConfigData.chat_service = null;
-        }
-      }
-
-      // Add scope for superadmin per-graph saves
-      if (configScope === "graph") {
-        llmConfigData.scope = "graph";
-      }
+      llmConfigData = buildLLMConfigPayload();
 
       const response = await fetch("/ui/config/llm", {
         method: "POST",
@@ -519,6 +542,9 @@ const LLMConfig = () => {
       setMessageType("success");
       setTestResults(null);
       setConnectionTested(false);
+
+      // Refetch to sync all state with the saved config
+      fetchConfig(configScope === "graph" ? "graph" : "global", selectedGraph || undefined);
     } catch (error: any) {
       console.error("Error saving config:", error);
       setMessage(`❌ Error: ${error.message}`);
@@ -553,112 +579,43 @@ const LLMConfig = () => {
         return null;
       };
 
+      const failValidation = (msg: string) => {
+        setMessage(`❌ ${msg}`);
+        setMessageType("error");
+        setIsTesting(false);
+      };
+
       if (useMultipleProviders) {
         const completionError = validateProvider(completionProvider, completionConfig, "Completion Service");
-        if (completionError) {
-          setMessage(`❌ ${completionError}`);
-          setMessageType("error");
-          setIsTesting(false);
-          return;
-        }
-        
+        if (completionError) { failValidation(completionError); return; }
+        if (!completionDefaultModel.trim()) { failValidation("Model Name is required for Completion Service"); return; }
+
         const embeddingError = validateProvider(embeddingProvider, embeddingConfig, "Embedding Service");
-        if (embeddingError) {
-          setMessage(`❌ ${embeddingError}`);
-          setMessageType("error");
-          setIsTesting(false);
-          return;
+        if (embeddingError) { failValidation(embeddingError); return; }
+        if (!embeddingModel.trim()) { failValidation("Model Name is required for Embedding Service"); return; }
+
+        if (useCustomMultimodal) {
+          const multimodalError = validateProvider(multimodalProvider, multimodalConfig, "Multimodal Service");
+          if (multimodalError) { failValidation(multimodalError); return; }
+          if (!multimodalModelName.trim()) { failValidation("Model Name is required for Multimodal Service"); return; }
         }
 
-        const multimodalError = validateProvider(multimodalProvider, multimodalConfig, "Multimodal Service");
-        if (multimodalError) {
-          setMessage(`❌ ${multimodalError}`);
-          setMessageType("error");
-          setIsTesting(false);
-          return;
+        if (useCustomChatbot) {
+          const chatbotError = validateProvider(chatbotProvider, chatbotProviderConfig, "Chatbot Service");
+          if (chatbotError) { failValidation(chatbotError); return; }
+          if (!chatbotModelName.trim()) { failValidation("Model Name is required for Chatbot Service"); return; }
         }
       } else {
-        const singleError = validateProvider(singleProvider, singleConfig, singleProvider);
-        if (singleError) {
-          setMessage(`❌ ${singleError}`);
-          setMessageType("error");
-          setIsTesting(false);
-          return;
-        }
+        const singleError = validateProvider(completionProvider, completionConfig, completionProvider);
+        if (singleError) { failValidation(singleError); return; }
+        if (!completionDefaultModel.trim()) { failValidation("Completion Model is required"); return; }
+        if (!embeddingModel.trim()) { failValidation("Embedding Model is required"); return; }
+        if (useCustomMultimodal && !multimodalModelName.trim()) { failValidation("Multimodal Model is required when not inheriting from completion"); return; }
+        if (useCustomChatbot && !chatbotModelName.trim()) { failValidation("Chatbot Model is required when not inheriting from completion"); return; }
       }
       
       const creds = sessionStorage.getItem("creds");
-      let llmConfigData: any;
-
-      if (useMultipleProviders) {
-        llmConfigData = {
-          graphname: selectedGraph || undefined,
-          completion_service: {
-            llm_service: completionProvider,
-            llm_model: completionDefaultModel,
-            authentication_configuration: buildAuthConfig(completionProvider, completionConfig),
-            ...buildServiceConfig(completionProvider, completionConfig)
-          },
-          embedding_service: {
-            embedding_model_service: embeddingProvider,
-            model_name: embeddingModel,
-            authentication_configuration: buildAuthConfig(embeddingProvider, embeddingConfig),
-            ...buildServiceConfig(embeddingProvider, embeddingConfig)
-          },
-        };
-        
-        llmConfigData.multimodal_service = {
-          llm_service: multimodalProvider,
-          llm_model: multimodalModelName,
-          authentication_configuration: buildAuthConfig(multimodalProvider, multimodalConfig),
-          ...buildServiceConfig(multimodalProvider, multimodalConfig)
-        };
-      } else {
-        llmConfigData = {
-          graphname: selectedGraph || undefined,
-          authentication_configuration: buildAuthConfig(singleProvider, singleConfig),
-          completion_service: {
-            llm_service: singleProvider,
-            llm_model: singleDefaultModel,
-            ...buildServiceConfig(singleProvider, singleConfig)
-          },
-          embedding_service: {
-            embedding_model_service: singleProvider,
-            model_name: singleEmbeddingModel,
-          },
-          multimodal_service: {
-            llm_service: singleProvider,
-            llm_model: multimodalModel,
-            ...buildServiceConfig(singleProvider, singleConfig)
-          },
-        };
-        
-      }
-
-      // Add chat_service to test config if custom chatbot is configured
-      // Add chat_service to test config if not inheriting
-      if (useCustomChatbot) {
-        if (useMultipleProviders) {
-          const chatbotError = validateProvider(chatbotProvider, chatbotProviderConfig, "Chatbot Service");
-          if (chatbotError) {
-            setMessage(`❌ ${chatbotError}`);
-            setMessageType("error");
-            setIsTesting(false);
-            return;
-          }
-          llmConfigData.chat_service = {
-            llm_service: chatbotProvider,
-            llm_model: chatbotModelName,
-            authentication_configuration: buildAuthConfig(chatbotProvider, chatbotProviderConfig),
-            model_kwargs: { temperature: parseFloat(chatbotTemperature) || 0 },
-            ...buildServiceConfig(chatbotProvider, chatbotProviderConfig),
-          };
-        } else if (chatbotModelName.trim()) {
-          llmConfigData.chat_service = {
-            llm_model: chatbotModelName,
-          };
-        }
-      }
+      const llmConfigData = buildLLMConfigPayload();
 
       const response = await fetch("/ui/config/llm/test", {
         method: "POST",
@@ -919,16 +876,9 @@ const LLMConfig = () => {
                         <SelectValue />
                       </SelectTrigger>
                       <SelectContent>
-                        <SelectItem value="openai">OpenAI</SelectItem>
-                        <SelectItem value="azure">Azure OpenAI</SelectItem>
-                        <SelectItem value="genai">Google GenAI (Gemini)</SelectItem>
-                        <SelectItem value="vertexai">Google Vertex AI</SelectItem>
-                        <SelectItem value="bedrock">AWS Bedrock</SelectItem>
-                        <SelectItem value="groq">Groq</SelectItem>
-                        <SelectItem value="ollama">Ollama</SelectItem>
-                        <SelectItem value="sagemaker">AWS SageMaker</SelectItem>
-                        <SelectItem value="huggingface">HuggingFace</SelectItem>
-                        <SelectItem value="watsonx">IBM WatsonX</SelectItem>
+                        {LLM_PROVIDERS.map((p) => (
+                          <SelectItem key={p.value} value={p.value}>{p.label}</SelectItem>
+                        ))}
                       </SelectContent>
                     </Select>
                   </div>
@@ -1102,6 +1052,13 @@ const LLMConfig = () => {
                 checked={useMultipleProviders}
                 onChange={(e) => {
                   setUseMultipleProviders(e.target.checked);
+                  if (!e.target.checked) {
+                    // Switching to single-provider mode: align embedding and multimodal with the completion provider/config
+                    setEmbeddingProvider(completionProvider);
+                    setEmbeddingConfig({ ...completionConfig });
+                    setMultimodalProvider(completionProvider);
+                    setMultimodalConfig({ ...completionConfig });
+                  }
                   clearTestResults();
                 }}
                 className="h-4 w-4 rounded border-gray-300 dark:border-[#3D3D3D]"
@@ -1132,37 +1089,34 @@ const LLMConfig = () => {
                       <label className="block text-sm font-medium mb-2 text-black dark:text-white">
                         Provider
                       </label>
-                      <Select value={singleProvider} onValueChange={(value) => handleProviderChange(value, 'single')}>
+                      <Select value={completionProvider} onValueChange={(value) => handleProviderChange(value, 'completion')}>
                         <SelectTrigger className="dark:border-[#3D3D3D] dark:bg-background">
                           <SelectValue placeholder="Select provider" />
                         </SelectTrigger>
                         <SelectContent>
-                          <SelectItem value="openai">OpenAI</SelectItem>
-                          <SelectItem value="azure">Azure OpenAI</SelectItem>
-                          <SelectItem value="genai">Google GenAI (Gemini)</SelectItem>
-                          <SelectItem value="vertexai">Google Vertex AI</SelectItem>
-                          <SelectItem value="bedrock">AWS Bedrock</SelectItem>
-                          <SelectItem value="ollama">Ollama</SelectItem>
+                          {LLM_PROVIDERS.map((p) => (
+                            <SelectItem key={p.value} value={p.value}>{p.label}</SelectItem>
+                          ))}
                         </SelectContent>
                       </Select>
                       <p className="text-xs text-gray-500 dark:text-gray-400 mt-1">
-                        Only providers supporting both completion and embedding services are shown
+                        This provider is used for all services (completion, chatbot, embedding, multimodal)
                       </p>
                     </div>
 
-                    {renderProviderFields(singleProvider, singleConfig, setSingleConfig)}
+                    {renderProviderFields(completionProvider, completionConfig, setCompletionConfig)}
 
                     <div>
                       <label className="block text-sm font-medium mb-2 text-black dark:text-white">
-                        Completion Model
+                        Completion Model <span className="text-red-500">*</span>
                       </label>
                       <Input
                         type="text"
                         className="dark:border-[#3D3D3D] dark:bg-background"
-                        placeholder={getModelPlaceholder(singleProvider, 'llm')}
-                        value={singleDefaultModel}
+                        placeholder={getModelPlaceholder(completionProvider, 'llm')}
+                        value={completionDefaultModel}
                         onChange={(e) => {
-                          setSingleDefaultModel(e.target.value);
+                          setCompletionDefaultModel(e.target.value);
                           clearTestResults();
                         }}
                       />
@@ -1171,6 +1125,8 @@ const LLMConfig = () => {
                       </p>
                     </div>
 
+                    <hr className="border-gray-200 dark:border-[#3D3D3D]" />
+
                     <div>
                       <label className="block text-sm font-medium mb-2 text-black dark:text-white">
                         Chatbot Model
@@ -1183,9 +1139,6 @@ const LLMConfig = () => {
                           checked={!useCustomChatbot}
                           onChange={(e) => {
                             setUseCustomChatbot(!e.target.checked);
-                            if (e.target.checked) {
-                              setChatbotModelName("");
-                            }
                             clearTestResults();
                           }}
                         />
@@ -1197,7 +1150,7 @@ const LLMConfig = () => {
                         <Input
                           type="text"
                           className="dark:border-[#3D3D3D] dark:bg-background"
-                          placeholder={getModelPlaceholder(singleProvider, 'llm')}
+                          placeholder={getModelPlaceholder(completionProvider, 'llm')}
                           value={chatbotModelName}
                           onChange={(e) => {
                             setChatbotModelName(e.target.value);
@@ -1212,39 +1165,84 @@ const LLMConfig = () => {
 
                     <div>
                       <label className="block text-sm font-medium mb-2 text-black dark:text-white">
-                        Embedding Model
+                        Chatbot Temperature
                       </label>
                       <Input
-                        type="text"
+                        type="number"
                         className="dark:border-[#3D3D3D] dark:bg-background"
-                        placeholder={getModelPlaceholder(singleProvider, 'embedding')}
-                        value={singleEmbeddingModel}
-                        onChange={(e) => {
-                          setSingleEmbeddingModel(e.target.value);
-                          clearTestResults();
-                        }}
+                        placeholder="0"
+                        min="0"
+                        max="2"
+                        step="0.1"
+                        value={chatbotTemperature}
+                        onChange={(e) => { setChatbotTemperature(e.target.value); clearTestResults(); }}
                       />
                       <p className="text-xs text-gray-500 dark:text-gray-400 mt-1">
-                        Used for generating vector embeddings of document chunks
+                        Controls randomness of chatbot responses (0 = deterministic, higher = more creative)
                       </p>
                     </div>
 
+                    <hr className="border-gray-200 dark:border-[#3D3D3D]" />
+
                     <div>
                       <label className="block text-sm font-medium mb-2 text-black dark:text-white">
                         Multimodal Model
                       </label>
+                      <div className="flex items-center space-x-2 mb-2">
+                        <input
+                          type="checkbox"
+                          id="inheritMultimodalModel"
+                          className="rounded border-gray-300 dark:border-[#3D3D3D]"
+                          checked={!useCustomMultimodal}
+                          onChange={(e) => {
+                            setUseCustomMultimodal(!e.target.checked);
+                            clearTestResults();
+                          }}
+                        />
+                        <label htmlFor="inheritMultimodalModel" className="text-sm text-black dark:text-white">
+                          Use same model as completion service
+                        </label>
+                      </div>
+                      {!useCustomMultimodal && (
+                        <p className="text-xs text-amber-600 dark:text-amber-400 mb-2">
+                          Ensure your completion model supports vision input. Use "Test Connection" to verify.
+                        </p>
+                      )}
+                      {useCustomMultimodal && (
+                        <Input
+                          type="text"
+                          className="dark:border-[#3D3D3D] dark:bg-background"
+                          placeholder={getModelPlaceholder(completionProvider, 'multimodal')}
+                          value={multimodalModelName}
+                          onChange={(e) => {
+                            setMultimodalModelName(e.target.value);
+                            clearTestResults();
+                          }}
+                        />
+                      )}
+                      <p className="text-xs text-gray-500 dark:text-gray-400 mt-1">
+                        Used for processing images and multimodal content
+                      </p>
+                    </div>
+
+                    <hr className="border-gray-200 dark:border-[#3D3D3D]" />
+
+                    <div>
+                      <label className="block text-sm font-medium mb-2 text-black dark:text-white">
+                        Embedding Model <span className="text-red-500">*</span>
+                      </label>
                       <Input
                         type="text"
                         className="dark:border-[#3D3D3D] dark:bg-background"
-                        placeholder={getModelPlaceholder(singleProvider, 'multimodal')}
-                        value={multimodalModel}
+                        placeholder={getModelPlaceholder(completionProvider, 'embedding')}
+                        value={embeddingModel}
                         onChange={(e) => {
-                          setMultimodalModel(e.target.value);
+                          setEmbeddingModel(e.target.value);
                           clearTestResults();
                         }}
                       />
                       <p className="text-xs text-gray-500 dark:text-gray-400 mt-1">
-                        Used for processing images and multimodal content
+                        Used for generating vector embeddings of document chunks
                       </p>
                     </div>
                   </div>
@@ -1275,16 +1273,9 @@ const LLMConfig = () => {
                         <SelectValue placeholder="Select provider" />
                       </SelectTrigger>
                       <SelectContent>
-                        <SelectItem value="openai">OpenAI</SelectItem>
-                        <SelectItem value="azure">Azure OpenAI</SelectItem>
-                        <SelectItem value="genai">Google GenAI (Gemini)</SelectItem>
-                        <SelectItem value="vertexai">Google Vertex AI</SelectItem>
-                        <SelectItem value="bedrock">AWS Bedrock</SelectItem>
-                        <SelectItem value="sagemaker">AWS SageMaker</SelectItem>
-                        <SelectItem value="groq">Groq</SelectItem>
-                        <SelectItem value="ollama">Ollama</SelectItem>
-                        <SelectItem value="huggingface">HuggingFace</SelectItem>
-                        <SelectItem value="watsonx">IBM WatsonX</SelectItem>
+                        {LLM_PROVIDERS.map((p) => (
+                          <SelectItem key={p.value} value={p.value}>{p.label}</SelectItem>
+                        ))}
                       </SelectContent>
                     </Select>
                   </div>
@@ -1293,7 +1284,7 @@ const LLMConfig = () => {
 
                   <div>
                     <label className="block text-sm font-medium mb-2 text-black dark:text-white">
-                      Completion Model
+                      Model Name <span className="text-red-500">*</span>
                     </label>
                     <Input
                       type="text"
@@ -1330,10 +1321,6 @@ const LLMConfig = () => {
                       checked={!useCustomChatbot}
                       onChange={(e) => {
                         setUseCustomChatbot(!e.target.checked);
-                        if (e.target.checked) {
-                          setChatbotModelName("");
-                          setChatbotProviderConfig({});
-                        }
                         clearTestResults();
                       }}
                     />
@@ -1361,16 +1348,9 @@ const LLMConfig = () => {
                             <SelectValue />
                           </SelectTrigger>
                           <SelectContent>
-                            <SelectItem value="openai">OpenAI</SelectItem>
-                            <SelectItem value="azure">Azure OpenAI</SelectItem>
-                            <SelectItem value="genai">Google GenAI (Gemini)</SelectItem>
-                            <SelectItem value="vertexai">Google Vertex AI</SelectItem>
-                            <SelectItem value="bedrock">AWS Bedrock</SelectItem>
-                            <SelectItem value="groq">Groq</SelectItem>
-                            <SelectItem value="ollama">Ollama</SelectItem>
-                            <SelectItem value="sagemaker">AWS SageMaker</SelectItem>
-                            <SelectItem value="huggingface">HuggingFace</SelectItem>
-                            <SelectItem value="watsonx">IBM WatsonX</SelectItem>
+                            {LLM_PROVIDERS.map((p) => (
+                              <SelectItem key={p.value} value={p.value}>{p.label}</SelectItem>
+                            ))}
                           </SelectContent>
                         </Select>
                       </div>
@@ -1410,62 +1390,84 @@ const LLMConfig = () => {
                 </div>
               </div>
 
-              {/* Embedding Service Provider */}
+              {/* Multimodal Service Provider */}
               <div className="bg-white dark:bg-shadeA border border-gray-300 dark:border-[#3D3D3D] rounded-lg p-6">
                 <h2 className="text-lg font-semibold mb-4 text-black dark:text-white">
-                  Embedding Service
+                  Multimodal Service
                 </h2>
-                <p className="text-sm text-gray-600 dark:text-[#D9D9D9] mb-6">
-                  Configure the provider for generating embeddings.
+                <p className="text-sm text-gray-600 dark:text-[#D9D9D9] mb-4">
+                  Configure the provider for processing images and multimodal content (vision tasks).
                 </p>
 
                 <div className="space-y-4">
-                  <div>
-                    <label className="block text-sm font-medium mb-2 text-black dark:text-white">
-                      Provider
-                    </label>
-                    <Select value={embeddingProvider} onValueChange={(value) => handleProviderChange(value, 'embedding')}>
-                      <SelectTrigger className="dark:border-[#3D3D3D] dark:bg-background">
-                        <SelectValue placeholder="Select provider" />
-                      </SelectTrigger>
-                      <SelectContent>
-                        <SelectItem value="openai">OpenAI</SelectItem>
-                        <SelectItem value="azure">Azure OpenAI</SelectItem>
-                        <SelectItem value="genai">Google GenAI</SelectItem>
-                        <SelectItem value="vertexai">Google Vertex AI</SelectItem>
-                        <SelectItem value="bedrock">AWS Bedrock</SelectItem>
-                        <SelectItem value="ollama">Ollama</SelectItem>
-                      </SelectContent>
-                    </Select>
-                  </div>
-
-                  {renderProviderFields(embeddingProvider, embeddingConfig, setEmbeddingConfig)}
-
-                  <div>
-                    <label className="block text-sm font-medium mb-2 text-black dark:text-white">
-                      Embedding Model
-                    </label>
-                    <Input
-                      type="text"
-                      className="dark:border-[#3D3D3D] dark:bg-background"
-                      placeholder={getModelPlaceholder(embeddingProvider, 'embedding')}
-                      value={embeddingModel}
+                  <div className="flex items-center space-x-2">
+                    <input
+                      type="checkbox"
+                      id="inheritMultimodalService"
+                      className="rounded border-gray-300 dark:border-[#3D3D3D]"
+                      checked={!useCustomMultimodal}
                       onChange={(e) => {
-                        setEmbeddingModel(e.target.value);
+                        setUseCustomMultimodal(!e.target.checked);
                         clearTestResults();
                       }}
                     />
+                    <label htmlFor="inheritMultimodalService" className="text-sm font-medium text-black dark:text-white">
+                      Inherit from completion service
+                    </label>
                   </div>
+                  {!useCustomMultimodal && (
+                    <p className="text-xs text-amber-600 dark:text-amber-400">
+                      Ensure your completion model supports vision input. Use "Test Connection" to verify.
+                    </p>
+                  )}
+
+                  {useCustomMultimodal && (
+                    <>
+                      <div>
+                        <label className="block text-sm font-medium mb-2 text-black dark:text-white">
+                          Provider
+                        </label>
+                        <Select value={multimodalProvider} onValueChange={(value) => handleProviderChange(value, 'multimodal')}>
+                          <SelectTrigger className="dark:border-[#3D3D3D] dark:bg-background">
+                            <SelectValue placeholder="Select provider" />
+                          </SelectTrigger>
+                          <SelectContent>
+                            {LLM_PROVIDERS.map((p) => (
+                              <SelectItem key={p.value} value={p.value}>{p.label}</SelectItem>
+                            ))}
+                          </SelectContent>
+                        </Select>
+                      </div>
+
+                      {renderProviderFields(multimodalProvider, multimodalConfig, setMultimodalConfig)}
+
+                      <div>
+                        <label className="block text-sm font-medium mb-2 text-black dark:text-white">
+                          Model Name <span className="text-red-500">*</span>
+                        </label>
+                        <Input
+                          type="text"
+                          className="dark:border-[#3D3D3D] dark:bg-background"
+                          placeholder={getModelPlaceholder(multimodalProvider, 'multimodal')}
+                          value={multimodalModelName}
+                          onChange={(e) => {
+                            setMultimodalModelName(e.target.value);
+                            clearTestResults();
+                          }}
+                        />
+                      </div>
+                    </>
+                  )}
                 </div>
               </div>
 
-              {/* Multimodal Service Provider */}
+              {/* Embedding Service Provider */}
               <div className="bg-white dark:bg-shadeA border border-gray-300 dark:border-[#3D3D3D] rounded-lg p-6">
                 <h2 className="text-lg font-semibold mb-4 text-black dark:text-white">
-                  Multimodal Service
+                  Embedding Service
                 </h2>
                 <p className="text-sm text-gray-600 dark:text-[#D9D9D9] mb-6">
-                  Configure the provider for processing images and multimodal content (vision tasks).
+                  Configure the provider for generating embeddings.
                 </p>
 
                 <div className="space-y-4">
@@ -1473,35 +1475,31 @@ const LLMConfig = () => {
                     <label className="block text-sm font-medium mb-2 text-black dark:text-white">
                       Provider
                     </label>
-                    <Select value={multimodalProvider} onValueChange={(value) => handleProviderChange(value, 'multimodal')}>
+                    <Select value={embeddingProvider} onValueChange={(value) => handleProviderChange(value, 'embedding')}>
                       <SelectTrigger className="dark:border-[#3D3D3D] dark:bg-background">
                         <SelectValue placeholder="Select provider" />
                       </SelectTrigger>
                       <SelectContent>
-                        <SelectItem value="openai">OpenAI</SelectItem>
-                        <SelectItem value="azure">Azure OpenAI</SelectItem>
-                        <SelectItem value="genai">Google GenAI (Gemini)</SelectItem>
-                        <SelectItem value="vertexai">Google Vertex AI</SelectItem>
+                        {LLM_PROVIDERS.map((p) => (
+                          <SelectItem key={p.value} value={p.value}>{p.label}</SelectItem>
+                        ))}
                       </SelectContent>
                     </Select>
-                    <p className="text-xs text-gray-500 dark:text-gray-400 mt-1">
-                      Only OpenAI, Azure, GenAI, VertexAI support vision
-                    </p>
                   </div>
 
-                  {renderProviderFields(multimodalProvider, multimodalConfig, setMultimodalConfig)}
+                  {renderProviderFields(embeddingProvider, embeddingConfig, setEmbeddingConfig)}
 
                   <div>
                     <label className="block text-sm font-medium mb-2 text-black dark:text-white">
-                      Model Name
+                      Model Name <span className="text-red-500">*</span>
                     </label>
                     <Input
                       type="text"
                       className="dark:border-[#3D3D3D] dark:bg-background"
-                      placeholder={getModelPlaceholder(multimodalProvider, 'multimodal')}
-                      value={multimodalModelName}
+                      placeholder={getModelPlaceholder(embeddingProvider, 'embedding')}
+                      value={embeddingModel}
                       onChange={(e) => {
-                        setMultimodalModelName(e.target.value);
+                        setEmbeddingModel(e.target.value);
                         clearTestResults();
                       }}
                     />
@@ -1535,7 +1533,7 @@ const LLMConfig = () => {
                     ? "bg-green-50 dark:bg-green-900/20 text-green-700 dark:text-green-300"
                     : "bg-red-50 dark:bg-red-900/20 text-red-700 dark:text-red-300"
                 }`}>
-                  <strong>Default LLM Model:</strong> {testResults.completion.message}
+                  <strong>Completion Model:</strong> {testResults.completion.message}
                 </div>
               )}
               
@@ -1545,27 +1543,27 @@ const LLMConfig = () => {
                     ? "bg-green-50 dark:bg-green-900/20 text-green-700 dark:text-green-300"
                     : "bg-red-50 dark:bg-red-900/20 text-red-700 dark:text-red-300"
                 }`}>
-                  <strong>Chatbot LLM Model:</strong> {testResults.chatbot.message}
+                  <strong>Chatbot Model:</strong> {testResults.chatbot.message}
                 </div>
               )}
-              
-              {testResults.embedding && testResults.embedding.status !== "not_tested" && (
+
+              {testResults.multimodal && testResults.multimodal.status !== "not_tested" && (
                 <div className={`p-3 rounded-lg text-sm ${
-                  testResults.embedding.status === "success"
+                  testResults.multimodal.status === "success"
                     ? "bg-green-50 dark:bg-green-900/20 text-green-700 dark:text-green-300"
                     : "bg-red-50 dark:bg-red-900/20 text-red-700 dark:text-red-300"
                 }`}>
-                  <strong>Embedding Model:</strong> {testResults.embedding.message}
+                  <strong>Multimodal Model:</strong> {testResults.multimodal.message}
                 </div>
               )}
-              
-              {testResults.multimodal && testResults.multimodal.status !== "not_tested" && (
+
+              {testResults.embedding && testResults.embedding.status !== "not_tested" && (
                 <div className={`p-3 rounded-lg text-sm ${
-                  testResults.multimodal.status === "success"
+                  testResults.embedding.status === "success"
                     ? "bg-green-50 dark:bg-green-900/20 text-green-700 dark:text-green-300"
                     : "bg-red-50 dark:bg-red-900/20 text-red-700 dark:text-red-300"
                 }`}>
-                  <strong>Multimodal Model:</strong> {testResults.multimodal.message}
+                  <strong>Embedding Model:</strong> {testResults.embedding.message}
                 </div>
               )}
             </div>
diff --git a/graphrag/app/agent/agent_generation.py b/graphrag/app/agent/agent_generation.py
index d6b3461..22d10d4 100644
--- a/graphrag/app/agent/agent_generation.py
+++ b/graphrag/app/agent/agent_generation.py
@@ -26,10 +26,10 @@
 logger = logging.getLogger(__name__)
 
 class TigerGraphAgentGenerator:
-    def __init__(self, llm_model):
-        self.llm = llm_model
-        llm_config = getattr(llm_model, "config", {})
-        self.token_calculator = get_token_calculator(token_limit=llm_config.get("token_limit"), model_name=llm_config.get("llm_model"))
+    def __init__(self, llm_service):
+        self.llm = llm_service
+        svc_config = getattr(llm_service, "config", {})
+        self.token_calculator = get_token_calculator(token_limit=svc_config.get("token_limit"), model_name=svc_config.get("llm_model"))
 
     def generate_answer(self, question: str, context: str | dict, query: str = "") -> dict:
         """Generate an answer based on the question and context.
diff --git a/graphrag/app/routers/root.py b/graphrag/app/routers/root.py
index f96bb40..e986194 100644
--- a/graphrag/app/routers/root.py
+++ b/graphrag/app/routers/root.py
@@ -5,7 +5,7 @@
 from fastapi.responses import FileResponse, HTMLResponse
 from fastapi.security import HTTPBasic, HTTPBasicCredentials
 
-from common.config import llm_config, service_status
+from common.config import get_completion_config, service_status
 
 logger = logging.getLogger(__name__)
 router = APIRouter()
@@ -13,7 +13,7 @@
 
 @router.get("/")
 def read_root():
-    return {"config": llm_config["model_name"]}
+    return {"config": get_completion_config().get("llm_model", "unknown")}
 
 
 @router.get("/health")
diff --git a/graphrag/app/routers/ui.py b/graphrag/app/routers/ui.py
index 9bd22b8..30971cd 100644
--- a/graphrag/app/routers/ui.py
+++ b/graphrag/app/routers/ui.py
@@ -51,7 +51,7 @@
 from pyTigerGraph import TigerGraphConnection
 from tools.validation_utils import MapQuestionToSchemaException
 
-from common.config import db_config, graphrag_config, embedding_service, llm_config, service_status, SERVER_CONFIG, get_chat_config, validate_graphname
+from common.config import db_config, graphrag_config, embedding_service, llm_config, service_status, get_chat_config, get_completion_config, get_embedding_config, get_multimodal_config, validate_graphname, get_llm_service, resolve_llm_services
 from common.db.connections import get_db_connection_pwd_manual
 from common.logs.log import req_id_cv
 from common.logs.logwriter import LogWriter
@@ -181,14 +181,6 @@ def _require_roles(credentials: HTTPBasicCredentials, allowed_roles: set[str]) -
     return roles
 
 
-def _create_llm_service(provider: str, config: dict):
-    """Instantiate an LLM provider, returning None for unsupported providers."""
-    try:
-        return get_llm_service(config)
-    except Exception:
-        return None
-
-
 def _create_embedding_service(provider: str, config: dict):
     from common.embeddings.embedding_services import (
         OpenAI_Embedding, AzureOpenAI_Ada002, GenAI_Embedding,
@@ -1118,7 +1110,7 @@ async def chat(
             status_code=503,
             detail=service_status["embedding_store"]["error"]
         )
-    
+
     await websocket.accept()
 
     # AUTH with proper error handling and timeout
@@ -1817,7 +1809,7 @@ async def save_llm_config(
     Save LLM configuration and reload services.
     """
     try:
-        graphname = llm_config_data.pop("graphname", None)
+        graphname = llm_config_data.get("graphname")
         llm_access_mode = _resolve_llm_config_access(credentials, graphname)
         graphs = auth(credentials.username, credentials.password)[0]
         auth_header = "Basic " + base64.b64encode(
@@ -1837,10 +1829,7 @@ async def save_llm_config(
             # Save and reload in graphrag service
             from common.config import reload_llm_config
 
-            scope = llm_config_data.pop("scope", None)
-
-            # Substitute masked sentinel values with real stored values
-            _unmask_auth(llm_config_data, llm_config)
+            candidate, graphname, scope = _prepare_llm_config(llm_config_data)
 
             if llm_access_mode == "chatbot_only" or (llm_access_mode == "full" and scope == "graph"):
                 # Per-graph save: write only overrides to graph config file.
@@ -1864,20 +1853,34 @@ async def save_llm_config(
 
                     graph_llm = graph_server_config.setdefault("llm_config", {})
 
-                    # Also unmask against the graph's own stored config
-                    _unmask_auth(llm_config_data, graph_llm)
-
                     if llm_access_mode == "chatbot_only":
                         # Graph admin: only chat_service
                         svc_keys = ["chat_service"]
                     else:
                         # Superadmin per-graph: all services
-                        svc_keys = ["completion_service", "chat_service", "multimodal_service"]
+                        svc_keys = ["completion_service", "embedding_service", "chat_service", "multimodal_service"]
+
+                    # Resolve both candidate and global to get fully expanded configs,
+                    # then store only the delta as the graph override.
+                    resolved_candidate = resolve_llm_services(candidate)
+                    resolved_global = resolve_llm_services(llm_config)
 
                     for svc_key in svc_keys:
-                        incoming = llm_config_data.get(svc_key)
+                        incoming = candidate.get(svc_key)
                         if incoming:
-                            graph_llm[svc_key] = incoming
+                            rc = resolved_candidate.get(svc_key, {})
+                            rg = resolved_global.get(svc_key, {})
+                            # Compute delta: keys whose resolved values differ
+                            delta = {}
+                            for k, v in rc.items():
+                                if k == "authentication_configuration":
+                                    continue
+                                if rg.get(k) != v:
+                                    delta[k] = v
+                            if delta:
+                                graph_llm[svc_key] = delta
+                            else:
+                                graph_llm.pop(svc_key, None)
                         else:
                             # Revert to inherit: remove override
                             graph_llm.pop(svc_key, None)
@@ -1890,7 +1893,7 @@ async def save_llm_config(
                 result = {"status": "success"}
             else:
                 # Superadmin global save
-                result = reload_llm_config(llm_config_data)
+                result = reload_llm_config(candidate)
 
             if result["status"] != "success":
                 raise HTTPException(status_code=500, detail=result["message"])
@@ -1917,247 +1920,167 @@ async def test_llm_config(
     Test LLM configuration by making actual API calls to the provider.
     Tests completion, embedding, and multimodal services.
     """
+    test_results = {
+        "completion": {"status": "not_tested", "message": ""},
+        "chatbot": {"status": "not_tested", "message": ""},
+        "embedding": {"status": "not_tested", "message": ""},
+        "multimodal": {"status": "not_tested", "message": ""}
+    }
     try:
-        graphname = llm_test_config.pop("graphname", None)
+        graphname = llm_test_config.get("graphname")
         llm_access_mode = _resolve_llm_config_access(credentials, graphname)
-        # Substitute masked sentinel values with real stored values
-        _unmask_auth(llm_test_config, llm_config)
-        from common import config as cfg
-
-        test_results = {
-            "completion": {"status": "not_tested", "message": ""},
-            "chatbot": {"status": "not_tested", "message": ""},
-            "embedding": {"status": "not_tested", "message": ""},
-            "multimodal": {"status": "not_tested", "message": ""}
-        }
+
+        # Build candidate config — same preparation as save
+        candidate, graphname, scope = _prepare_llm_config(llm_test_config)
+        # Resolve partial service configs into full configs for testing
+        # (same resolution logic used when parsing config from disk)
+        test_configs = resolve_llm_services(candidate)
 
         # Graph admins (chatbot_only) can only test chat_service
         if llm_access_mode == "chatbot_only":
-            if "chat_service" in llm_test_config:
+            if "chat_service" in candidate:
                 try:
-                    test_chat_config = llm_test_config["chat_service"].copy()
-                    provider = test_chat_config.get("llm_service", "openai").lower()
-                    model = test_chat_config.get("llm_model", "gpt-4o-mini")
-
-                    if "authentication_configuration" not in test_chat_config:
-                        test_chat_config["authentication_configuration"] = {}
-
-                    if hasattr(cfg, 'completion_config') and cfg.completion_config:
-                        for key in ["model_kwargs", "prompt_path", "base_url", "token_limit"]:
-                            if key not in test_chat_config and key in cfg.completion_config:
-                                test_chat_config[key] = cfg.completion_config[key]
-
-                    if "model_kwargs" not in test_chat_config:
-                        test_chat_config["model_kwargs"] = {"temperature": 0}
-                    if "prompt_path" not in test_chat_config:
-                        test_chat_config["prompt_path"] = "common/prompts/openai_gpt4/"
-
-                    llm_service = _create_llm_service(provider, test_chat_config)
-                    if llm_service:
-                        response = llm_service.llm.invoke("Say 'Connection successful' in 2 words")
-                        if not response or not str(response).strip():
-                            raise ValueError("LLM returned an empty response")
-                        test_results["chatbot"]["status"] = "success"
-                        test_results["chatbot"]["message"] = f"Chatbot LLM ({model}) connected successfully"
-                    else:
-                        test_results["chatbot"]["status"] = "error"
-                        test_results["chatbot"]["message"] = f"Provider '{provider}' not supported"
+                    test_config = test_configs["chat_service"]
+                    model = test_config.get("llm_model", "")
+                    llm_service = get_llm_service(test_config)
+                    response = llm_service.llm.invoke("Say 'Connection successful' in 2 words")
+                    if not response or not str(response).strip():
+                        raise ValueError("LLM returned an empty response")
+                    test_results["chatbot"]["status"] = "success"
+                    test_results["chatbot"]["message"] = f"Chatbot LLM ({model}) connected successfully"
                 except Exception as e:
                     test_results["chatbot"]["status"] = "error"
                     test_results["chatbot"]["message"] = f"Chatbot test failed: {str(e)}"
                     logger.error(f"Chatbot test failed for graph {graphname}: {str(e)}")
 
-            overall_status = "success" if test_results["chatbot"]["status"] == "success" else "error"
+            # Status is always one of "success", "error", or "not_tested"; report it as-is.
+            overall_status = test_results["chatbot"]["status"]
             return {
                 "status": overall_status,
                 "message": "Connection test completed",
                 "results": {"chatbot": test_results["chatbot"]}
             }
 
-        # Full access: test all services
-        # Test Completion Service (Default LLM Model)
-        if "completion_service" in llm_test_config or "llm_service" in llm_test_config:
+        # Full access: test all services from the resolved test configs
+
+        # Test Completion Service
+        if "completion_service" in test_configs:
             try:
-                if "completion_service" in llm_test_config:
-                    test_completion_config = llm_test_config["completion_service"].copy()
-                    provider = test_completion_config.get("llm_service", "openai").lower()
-                    model = test_completion_config.get("llm_model", "gpt-4o-mini")
-                else:
-                    test_completion_config = {
-                        "llm_service": llm_test_config.get("llm_service", "openai"),
-                        "llm_model": llm_test_config.get("llm_model", "gpt-4o-mini"),
-                        "authentication_configuration": llm_test_config.get("authentication_configuration", {})
-                    }
-                    provider = test_completion_config["llm_service"].lower()
-                    model = test_completion_config["llm_model"]
-                
-                # Ensure authentication_configuration exists (may be at top level in single-provider mode)
-                if "authentication_configuration" not in test_completion_config:
-                    test_completion_config["authentication_configuration"] = llm_test_config.get("authentication_configuration", {})
-                
-                # Merge with existing config to get model_kwargs and prompt_path
-                if hasattr(cfg, 'completion_config') and cfg.completion_config:
-                    for key in ["model_kwargs", "prompt_path", "base_url", "token_limit"]:
-                        if key not in test_completion_config and key in cfg.completion_config:
-                            test_completion_config[key] = cfg.completion_config[key]
-                
-                # Ensure required fields exist
-                if "model_kwargs" not in test_completion_config:
-                    test_completion_config["model_kwargs"] = {"temperature": 0}
-                if "prompt_path" not in test_completion_config:
-                    test_completion_config["prompt_path"] = "common/prompts/openai_gpt4/"
-                
-                llm_service = _create_llm_service(provider, test_completion_config)
-                
-                if llm_service:
-                    response = llm_service.llm.invoke("Say 'Connection successful' in 2 words")
-                    if not response or not str(response).strip():
-                        raise ValueError("LLM returned an empty response")
-                    test_results["completion"]["status"] = "success"
-                    test_results["completion"]["message"] = f"✅ Default LLM model ({model}) connected successfully"
-                else:
-                    test_results["completion"]["status"] = "error"
-                    test_results["completion"]["message"] = f"Provider '{provider}' not supported for completion"
-                    
+                test_config = test_configs["completion_service"]
+                model = test_config.get("llm_model", "")
+                llm_service = get_llm_service(test_config)
+                response = llm_service.llm.invoke("Say 'Connection successful' in 2 words")
+                if not response or not str(response).strip():
+                    raise ValueError("LLM returned an empty response")
+                test_results["completion"]["status"] = "success"
+                test_results["completion"]["message"] = f"Completion model ({model}) connected successfully"
             except Exception as e:
                 test_results["completion"]["status"] = "error"
-                test_results["completion"]["message"] = f"❌ Completion test failed: {str(e)}"
+                test_results["completion"]["message"] = f"Completion test failed: {str(e)}"
                 logger.error(f"Completion test failed: {str(e)}")
-        
-        # Test Chatbot Service (if different model is provided)
-        if "chatbot_service" in llm_test_config:
+
+        # Test Chatbot Service (only if custom config provided in candidate;
+        # when inheriting from completion, the completion test already covers it)
+        if "chat_service" in candidate:
             try:
-                test_chatbot_config = llm_test_config["chatbot_service"].copy()
-                provider = test_chatbot_config.get("llm_service", "openai").lower()
-                model = test_chatbot_config.get("llm_model", "gpt-4o-mini")
-                
-                # Ensure authentication_configuration exists
-                if "authentication_configuration" not in test_chatbot_config:
-                    test_chatbot_config["authentication_configuration"] = llm_test_config.get("authentication_configuration", {})
-                
-                # Merge with existing config to get model_kwargs and prompt_path
-                if hasattr(cfg, 'completion_config') and cfg.completion_config:
-                    for key in ["model_kwargs", "prompt_path", "base_url", "token_limit"]:
-                        if key not in test_chatbot_config and key in cfg.completion_config:
-                            test_chatbot_config[key] = cfg.completion_config[key]
-                
-                # Ensure required fields exist
-                if "model_kwargs" not in test_chatbot_config:
-                    test_chatbot_config["model_kwargs"] = {"temperature": 0}
-                if "prompt_path" not in test_chatbot_config:
-                    test_chatbot_config["prompt_path"] = "common/prompts/openai_gpt4/"
-                
-                llm_service = _create_llm_service(provider, test_chatbot_config)
-                
-                if llm_service:
-                    response = llm_service.llm.invoke("Say 'Connection successful' in 2 words")
-                    if not response or not str(response).strip():
-                        raise ValueError("LLM returned an empty response")
-                    test_results["chatbot"]["status"] = "success"
-                    test_results["chatbot"]["message"] = f"✅ Chatbot LLM model ({model}) connected successfully"
-                else:
-                    test_results["chatbot"]["status"] = "error"
-                    test_results["chatbot"]["message"] = f"Provider '{provider}' not supported for chatbot"
-                    
+                test_config = test_configs["chat_service"]
+                model = test_config.get("llm_model", "")
+                llm_service = get_llm_service(test_config)
+                response = llm_service.llm.invoke("Say 'Connection successful' in 2 words")
+                if not response or not str(response).strip():
+                    raise ValueError("LLM returned an empty response")
+                test_results["chatbot"]["status"] = "success"
+                test_results["chatbot"]["message"] = f"Chatbot model ({model}) connected successfully"
             except Exception as e:
                 test_results["chatbot"]["status"] = "error"
-                test_results["chatbot"]["message"] = f"❌ Chatbot test failed: {str(e)}"
+                test_results["chatbot"]["message"] = f"Chatbot test failed: {str(e)}"
                 logger.error(f"Chatbot test failed: {str(e)}")
-        
+
         # Test Embedding Service
-        if "embedding_service" in llm_test_config:
+        if "embedding_service" in test_configs:
             try:
-                test_embedding_config = llm_test_config["embedding_service"].copy()
-                provider = test_embedding_config.get("embedding_model_service", "openai").lower()
-                model = test_embedding_config.get("model_name", "text-embedding-3-small")
-                
-                # Ensure authentication_configuration exists
-                if "authentication_configuration" not in test_embedding_config:
-                    test_embedding_config["authentication_configuration"] = llm_test_config.get("authentication_configuration", {})
-                
-                # Merge with existing config
-                if hasattr(cfg, 'embedding_config') and cfg.embedding_config:
-                    for key in ["dimensions", "token_limit"]:
-                        if key not in test_embedding_config and key in cfg.embedding_config:
-                            test_embedding_config[key] = cfg.embedding_config[key]
-                
-                embedding_service_test = _create_embedding_service(provider, test_embedding_config)
-                
-                if embedding_service_test:
-                    # Test with a simple text
-                    embeddings = embedding_service_test.embed_query("test connection")
-                    if embeddings and len(embeddings) > 0:
-                        test_results["embedding"]["status"] = "success"
-                        test_results["embedding"]["message"] = f"✅ Embedding model ({model}) connected successfully"
-                    else:
-                        test_results["embedding"]["status"] = "error"
-                        test_results["embedding"]["message"] = "❌ Embedding returned empty result"
-                else:
-                    test_results["embedding"]["status"] = "error"
-                    test_results["embedding"]["message"] = f"Provider '{provider}' not supported for embeddings"
-                    
+                test_config = test_configs["embedding_service"]
+                provider = test_config.get("embedding_model_service", "openai").lower()
+                model = test_config.get("model_name", "")
+                embedding_service_test = _create_embedding_service(provider, test_config)
+                if not embedding_service_test:
+                    raise ValueError(f"Provider '{provider}' not supported for embeddings")
+                embeddings = embedding_service_test.embed_query("test connection")
+                if not embeddings:
+                    raise ValueError("Embedding returned empty result")
+                test_results["embedding"]["status"] = "success"
+                test_results["embedding"]["message"] = f"Embedding model ({model}) connected successfully"
             except Exception as e:
                 test_results["embedding"]["status"] = "error"
-                test_results["embedding"]["message"] = f"❌ Embedding test failed: {str(e)}"
+                test_results["embedding"]["message"] = f"Embedding test failed: {str(e)}"
                 logger.error(f"Embedding test failed: {str(e)}")
-        
-        # Test Multimodal Service
-        if "multimodal_service" in llm_test_config:
+
+        # Test Multimodal Service — verifies the model supports vision
+        # When multimodal_service is absent (inheriting), use completion_service
+        # config — that's what will be used at runtime after save.
+        multimodal_config = test_configs.get("multimodal_service") or test_configs.get("completion_service")
+        if multimodal_config:
+            model = ""
             try:
-                test_multimodal_config = llm_test_config["multimodal_service"].copy()
-                provider = test_multimodal_config.get("llm_service", "openai").lower()
-                model = test_multimodal_config.get("llm_model", "gpt-4o")
-                
-                # Ensure authentication_configuration exists
-                if "authentication_configuration" not in test_multimodal_config:
-                    test_multimodal_config["authentication_configuration"] = llm_test_config.get("authentication_configuration", {})
-                
-                # Merge with existing config to get model_kwargs and prompt_path
-                if hasattr(cfg, 'multimodal_config') and cfg.multimodal_config:
-                    for key in ["model_kwargs", "prompt_path", "base_url", "token_limit"]:
-                        if key not in test_multimodal_config and key in cfg.multimodal_config:
-                            test_multimodal_config[key] = cfg.multimodal_config[key]
-                elif hasattr(cfg, 'completion_config') and cfg.completion_config:
-                    # Fallback to completion config
-                    for key in ["model_kwargs", "prompt_path", "base_url", "token_limit"]:
-                        if key not in test_multimodal_config and key in cfg.completion_config:
-                            test_multimodal_config[key] = cfg.completion_config[key]
-                
-                # Ensure required fields exist
-                if "model_kwargs" not in test_multimodal_config:
-                    test_multimodal_config["model_kwargs"] = {"temperature": 0}
-                if "prompt_path" not in test_multimodal_config:
-                    test_multimodal_config["prompt_path"] = "common/prompts/openai_gpt4/"
-                
-                multimodal_service = _create_llm_service(provider, test_multimodal_config)
-                
-                if multimodal_service:
-                    response = multimodal_service.llm.invoke("Say 'Connection successful' in 2 words")
-                    if not response or not str(response).strip():
-                        raise ValueError("Multimodal LLM returned an empty response")
-                    test_results["multimodal"]["status"] = "success"
-                    test_results["multimodal"]["message"] = f"✅ Multimodal model ({model}) connected successfully"
+                from langchain_core.messages import HumanMessage
+                test_config = multimodal_config
+                model = test_config.get("llm_model", "")
+                llm_service = get_llm_service(test_config)
+                # Send a small 20x20 red PNG to verify the model accepts image input.
+                # Some providers (e.g. Bedrock) reject 1x1 images.
+                TEST_IMAGE_B64 = (
+                    "iVBORw0KGgoAAAANSUhEUgAAABQAAAAUCAIAAAAC64paAAAAKUlEQVR4"
+                    "nGP8z0A+YKJAL8OoZhIBE6kakMGoZhIBE6kakMGoZhIBRZoBIpwBJy3"
+                    "phGMAAAAASUVORK5CYII="
+                )
+                provider = test_config.get("llm_service", "").lower()
+                # Google GenAI/VertexAI only accept image_url format;
+                # Bedrock/Anthropic-native providers prefer type:"image" with source.
+                if provider in ("genai", "vertexai"):
+                    image_block = {
+                        "type": "image_url",
+                        "image_url": {"url": f"data:image/png;base64,{TEST_IMAGE_B64}"},
+                    }
                 else:
-                    test_results["multimodal"]["status"] = "error"
-                    test_results["multimodal"]["message"] = f"Provider '{provider}' not supported for multimodal"
-                    
+                    image_block = {
+                        "type": "image",
+                        "source": {
+                            "type": "base64",
+                            "media_type": "image/png",
+                            "data": TEST_IMAGE_B64,
+                        },
+                    }
+                vision_message = HumanMessage(
+                    content=[
+                        {"type": "text", "text": "Describe this image in one word."},
+                        image_block,
+                    ]
+                )
+                response = llm_service.llm.invoke([vision_message])
+                if not response or not str(getattr(response, "content", response)).strip():
+                    raise ValueError("Multimodal LLM returned an empty response")
+                test_results["multimodal"]["status"] = "success"
+                test_results["multimodal"]["message"] = f"Multimodal model ({model}) connected and supports vision"
             except Exception as e:
                 test_results["multimodal"]["status"] = "error"
-                test_results["multimodal"]["message"] = f"❌ Multimodal test failed: {str(e)}"
+                test_results["multimodal"]["message"] = (
+                    f"Multimodal test failed for model ({model}): {str(e)}. "
+                    "Please ensure the model supports vision input (e.g., GPT-4o, Claude 3.5+, Gemini)."
+                )
                 logger.error(f"Multimodal test failed: {str(e)}")
-        
+
         # Determine overall status
         all_success = all(result["status"] == "success" for result in test_results.values() if result["status"] != "not_tested")
         any_error = any(result["status"] == "error" for result in test_results.values())
-        
+
         overall_status = "success" if all_success and not any_error else "error" if any_error else "partial"
-        
+
         return {
             "status": overall_status,
             "message": "Connection test completed",
             "results": test_results
         }
-        
+
     except HTTPException:
         raise
     except Exception as e:
@@ -2172,37 +2095,131 @@ async def test_llm_config(
 MASKED_SECRET = "********"
 
 
+def _prepare_llm_config(llm_config_data: dict):
+    """
+    Shared preparation for both save and test endpoints. Mutates llm_config_data in place.
+
+    1. Pop metadata keys (graphname, scope)
+    2. Unmask MASKED_SECRET values via the resolved config getters
+    3. Strip null top-level values (null = inherit, key should be absent)
+    4. Promote authentication_configuration and region_name from completion_service
+       to the top level if missing, then strip identical per-service copies
+
+    Returns (candidate_config, graphname, scope); candidate_config is save-ready.
+    reload_llm_config() and resolve_llm_services() inject the top-level values
+    back into service configs at runtime.
+    """
+    graphname = llm_config_data.pop("graphname", None)
+    scope = llm_config_data.pop("scope", None)
+
+    # Resolve masked secrets from disk before modifying the payload
+    _unmask_auth(llm_config_data, graphname)
+
+    # Strip null values — null means "inherit from base", key should be absent
+    for key in list(llm_config_data.keys()):
+        if llm_config_data[key] is None:
+            del llm_config_data[key]
+
+    # Normalize auth: ensure top-level authentication_configuration exists.
+    # If missing, promote from completion_service so future config files
+    # always have auth at the top level.
+    if "authentication_configuration" not in llm_config_data:
+        completion_svc = llm_config_data.get("completion_service")
+        if isinstance(completion_svc, dict) and "authentication_configuration" in completion_svc:
+            llm_config_data["authentication_configuration"] = completion_svc["authentication_configuration"]
+
+    # Strip per-service auth if identical to top-level (redundant on disk;
+    # reload_llm_config injects top-level auth into services on load)
+    top_auth = llm_config_data.get("authentication_configuration")
+    if top_auth:
+        for svc_key in ["completion_service", "embedding_service", "multimodal_service", "chat_service"]:
+            svc = llm_config_data.get(svc_key)
+            if isinstance(svc, dict) and svc.get("authentication_configuration") == top_auth:
+                del svc["authentication_configuration"]
+
+    # Normalize region_name: promote from completion_service to top level,
+    # strip per-service copies if identical (same pattern as auth).
+    if "region_name" not in llm_config_data:
+        completion_svc = llm_config_data.get("completion_service")
+        if isinstance(completion_svc, dict) and "region_name" in completion_svc:
+            llm_config_data["region_name"] = completion_svc["region_name"]
+
+    top_region = llm_config_data.get("region_name")
+    if top_region:
+        for svc_key in ["completion_service", "embedding_service", "multimodal_service", "chat_service"]:
+            svc = llm_config_data.get(svc_key)
+            if isinstance(svc, dict) and svc.get("region_name") == top_region:
+                del svc["region_name"]
+
+    return llm_config_data, graphname, scope
+
+
+
 def _mask_secret_values(auth_config: dict) -> dict:
     """Replace all values in an authentication_configuration dict with the masked sentinel."""
     return {k: MASKED_SECRET for k in auth_config}
 
 
-def _unmask_auth(incoming: dict, stored_config: dict):
+def _unmask_auth(incoming: dict, graphname: str = None):
     """
     In-place: replace MASKED_SECRET values in incoming authentication_configuration
-    with the real values from stored_config.
+    with real values resolved through the full config chain via getters.
 
-    Works on both top-level and per-service authentication_configuration.
+    Uses get_xxx_config(graphname) which resolves:
+      Layer 1 (base) → Layer 2 (global service) → Layer 3 (graph base) → Layer 4 (graph service)
     """
-    def _unmask_dict(incoming_auth, stored_auth):
-        if not isinstance(incoming_auth, dict) or not isinstance(stored_auth, dict):
-            return
-        for k, v in incoming_auth.items():
-            if v == MASKED_SECRET:
-                incoming_auth[k] = stored_auth.get(k, "")
+    # Use completion_service as the primary source for top-level auth resolution
+    # (backward compat: base bootstraps from completion_service)
+    resolved_completion = get_completion_config(graphname) or {}
+
+    # Resolved configs for each service (lazy — only built if needed)
+    _resolved_cache = {}
+    def _get_resolved(svc_key):
+        if svc_key not in _resolved_cache:
+            getter = {
+                "completion_service": get_completion_config,
+                "embedding_service": get_embedding_config,
+                "chat_service": get_chat_config,
+                "multimodal_service": get_multimodal_config,
+            }.get(svc_key)
+            if getter:
+                result = getter(graphname)
+                _resolved_cache[svc_key] = result if result else {}
+            else:
+                _resolved_cache[svc_key] = {}
+        return _resolved_cache[svc_key]
+
+    def _resolve_real_value(key, svc_key=None):
+        """Find real value for an auth key using the resolved config chain."""
+        # Check the specific service first
+        if svc_key:
+            resolved = _get_resolved(svc_key)
+            val = resolved.get("authentication_configuration", {}).get(key, "")
+            if val and val != MASKED_SECRET:
+                return val
+        # Fallback to completion (which has full base resolution)
+        val = resolved_completion.get("authentication_configuration", {}).get(key, "")
+        if val and val != MASKED_SECRET:
+            return val
+        return ""
 
     # Top-level authentication_configuration
     if "authentication_configuration" in incoming:
-        stored_top = stored_config.get("authentication_configuration", {})
-        _unmask_dict(incoming["authentication_configuration"], stored_top)
+        auth = incoming["authentication_configuration"]
+        if isinstance(auth, dict):
+            for k, v in auth.items():
+                if v == MASKED_SECRET:
+                    auth[k] = _resolve_real_value(k)
 
     # Per-service authentication_configuration
     for svc_key in ["completion_service", "embedding_service", "multimodal_service", "chat_service"]:
         svc = incoming.get(svc_key)
-        if svc and "authentication_configuration" in svc:
-            stored_svc = stored_config.get(svc_key, {})
-            stored_svc_auth = stored_svc.get("authentication_configuration", {})
-            _unmask_dict(svc["authentication_configuration"], stored_svc_auth)
+        if isinstance(svc, dict) and "authentication_configuration" in svc:
+            auth = svc["authentication_configuration"]
+            if isinstance(auth, dict):
+                for k, v in auth.items():
+                    if v == MASKED_SECRET:
+                        auth[k] = _resolve_real_value(k, svc_key)
 
 
 def _strip_auth(config: dict) -> dict:
@@ -2248,7 +2265,7 @@ async def get_config(
                         graph_chat_service["authentication_configuration"] = _mask_secret_values(graph_chat_service["authentication_configuration"])
 
             # Global chat info for "Inherited from" display
-            global_chat = llm_config.get("chat_service", llm_config.get("completion_service", {}))
+            global_chat = get_chat_config()
             global_chat_info = {
                 "llm_service": global_chat.get("llm_service", ""),
                 "llm_model": global_chat.get("llm_model", ""),
diff --git a/graphrag/app/supportai/supportai_ingest.py b/graphrag/app/supportai/supportai_ingest.py
index 4ba69f1..e312f25 100644
--- a/graphrag/app/supportai/supportai_ingest.py
+++ b/graphrag/app/supportai/supportai_ingest.py
@@ -39,7 +39,7 @@ def chunk_document(self, document, chunker, chunker_params):
             from common.chunkers.character_chunker import CharacterChunker
 
             chunker = CharacterChunker(
-                chunker_params["chunk_size"], chunker_params.get("overlap", 0)
+                chunker_params.get("chunk_size", 0), chunker_params.get("overlap_size", -1)
             )
         elif chunker.lower() == "semantic":
             from common.chunkers.semantic_chunker import SemanticChunker
@@ -54,7 +54,7 @@ def chunk_document(self, document, chunker, chunker_params):
 
             chunker = HTMLChunker(
                 chunk_size=chunker_params.get("chunk_size", 0),
-                chunk_overlap=chunker_params.get("overlap_size", 0),
+                overlap_size=chunker_params.get("overlap_size", -1),
                 headers=chunker_params.get("headers", None),
             )
         elif chunker.lower() == "markdown":
@@ -62,7 +62,7 @@ def chunk_document(self, document, chunker, chunker_params):
 
             chunker = MarkdownChunker(
                 chunk_size=chunker_params.get("chunk_size", 0),
-                chunk_overlap=chunker_params.get("overlap_size", 0)
+                overlap_size=chunker_params.get("overlap_size", -1)
             )
         else:
             raise ValueError(f"Chunker {chunker} not supported")
diff --git a/graphrag/tests/test_character_chunker.py b/graphrag/tests/test_character_chunker.py
index f132ce7..8b60b06 100644
--- a/graphrag/tests/test_character_chunker.py
+++ b/graphrag/tests/test_character_chunker.py
@@ -5,7 +5,7 @@
 class TestCharacterChunker(unittest.TestCase):
     def test_chunk_without_overlap(self):
         """Test chunking without overlap."""
-        chunker = CharacterChunker(chunk_size=4)
+        chunker = CharacterChunker(chunk_size=4, overlap_size=0)
         input_string = "abcdefghijkl"
         expected_chunks = ["abcd", "efgh", "ijkl"]
         self.assertEqual(chunker.chunk(input_string), expected_chunks)
@@ -33,7 +33,7 @@ def test_empty_input_string(self):
 
     def test_input_shorter_than_chunk_size(self):
         """Test input string shorter than chunk size."""
-        chunker = CharacterChunker(chunk_size=10)
+        chunker = CharacterChunker(chunk_size=10, overlap_size=0)
         input_string = "abc"
         expected_chunks = ["abc"]
         self.assertEqual(chunker.chunk(input_string), expected_chunks)
@@ -46,24 +46,27 @@ def test_last_chunk_shorter_than_chunk_size(self):
         self.assertEqual(chunker.chunk(input_string), expected_chunks)
 
     def test_chunk_size_equals_overlap_size(self):
-        """Test when chunk size equals overlap size."""
+        """Test when chunk size equals overlap size raises on chunk()."""
+        chunker = CharacterChunker(chunk_size=4, overlap_size=4)
         with self.assertRaises(ValueError):
-            CharacterChunker(chunk_size=4, overlap_size=4)
+            chunker.chunk("abcdefgh")
 
     def test_overlap_larger_than_chunk_should_raise_error(self):
-        """Test initialization with overlap size larger than chunk size should raise an error."""
+        """Test overlap size larger than chunk size raises on chunk()."""
+        chunker = CharacterChunker(chunk_size=3, overlap_size=4)
         with self.assertRaises(ValueError):
-            CharacterChunker(chunk_size=3, overlap_size=4)
+            chunker.chunk("abcdefgh")
 
-    def test_chunk_size_zero_should_raise_error(self):
-        """Test initialization with a chunk size of zero should raise an error."""
-        with self.assertRaises(ValueError):
-            CharacterChunker(chunk_size=0, overlap_size=0)
+    def test_chunk_size_zero_uses_default(self):
+        """Test that chunk_size=0 falls back to default values."""
+        chunker = CharacterChunker(chunk_size=0)
+        self.assertEqual(chunker.chunk_size, 2048)
+        self.assertEqual(chunker.overlap_size, 256)
 
-    def test_chunk_size_negative_should_raise_error(self):
-        """Test initialization with a negative chunk size."""
-        with self.assertRaises(ValueError):
-            CharacterChunker(chunk_size=-1)
+    def test_chunk_size_negative_uses_default(self):
+        """Test that negative chunk_size falls back to default values."""
+        chunker = CharacterChunker(chunk_size=-1)
+        self.assertEqual(chunker.chunk_size, 2048)
 
 
 if __name__ == "__main__":
diff --git a/report-service/app/routers/root.py b/report-service/app/routers/root.py
index 2e618a2..01a567a 100644
--- a/report-service/app/routers/root.py
+++ b/report-service/app/routers/root.py
@@ -5,7 +5,7 @@
 from fastapi import APIRouter, Request, Depends, Response
 from typing import Annotated
 
-from common.config import get_completion_config, get_llm_service
+from common.config import get_completion_config, get_embedding_config, get_llm_service
 from common.py_schemas import ReportCreationRequest
 
 from report_agent.agent import TigerGraphReportAgent
@@ -19,17 +19,15 @@
 
 @router.get("/")
 def read_root():
-    return {"config": llm_config["model_name"]}
+    return {"config": get_completion_config().get("llm_model", "unknown")}
 
 
 @router.get("/health")
 async def health():
     return {
         "status": "healthy",
-        "llm_completion_model": llm_config["completion_service"]["llm_model"],
-        "embedding_service": llm_config["embedding_service"][
-            "embedding_model_service"
-        ],
+        "llm_completion_model": get_completion_config().get("llm_model", "unknown"),
+        "embedding_service": get_embedding_config().get("embedding_model_service", "unknown"),
     }
 
 def retrieve_template(template_name: str):