
Commit 4972093

chengbiao-jin and claude committed
fix: config reload race conditions, chatbot model selection, and add UI for top_k/num_hops
- Fix chatbot agent using wrong model (llm_model instead of chat_model)
- Ensure get_completion_config always returns chat_model with llm_model fallback
- Restore startup validation for llm_service and llm_model
- Add _config_file_lock to prevent concurrent config file overwrites
- Replace clear()+update() with atomic dict updates in reload functions
- Load community summarization prompt at call time instead of import time
- Add top_k and num_hops fields to GraphRAG config UI
- Fix ECC URL defaults to match docker-compose service names
- Document all supported config parameters in README
- Bump TigerGraph version to 4.2.2

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 9c9f9eb commit 4972093

12 files changed

Lines changed: 299 additions & 115 deletions

File tree

README.md

Lines changed: 98 additions & 7 deletions
@@ -33,6 +33,8 @@
 - [GraphRAG configuration](#graphrag-configuration)
 - [Chat configuration](#chat-configuration)
 - [LLM provider configuration](#llm-provider-configuration)
+- [Supported parameters](#supported-parameters)
+- [Provider examples](#provider-examples)
 - [OpenAI](#openai)
 - [Google GenAI](#google-genai)
 - [GCP VertexAI](#gcp-vertexai)
@@ -198,10 +200,10 @@ Run command `docker compose down` and wait for all the service containers to sto

 If you prefer to start a TigerGraph Community Edition instance without a license key, please make sure the container can be accessed from the GraphRAG containers by adding `--network graphrag_default`:
 ```
-docker run -d -p 14240:14240 --name tigergraph --ulimit nofile=1000000:1000000 --init --network graphrag_default -t tigergraph/community:4.2.1
+docker run -d -p 14240:14240 --name tigergraph --ulimit nofile=1000000:1000000 --init --network graphrag_default -t tigergraph/community:4.2.2
 ```

-> Use **tigergraph/tigergraph:4.2.1** if Enterprise Edition is preferred.
+> Use **tigergraph/tigergraph:4.2.2** if Enterprise Edition is preferred.
 > Setting up **DNS** or `/etc/hosts` properly is an alternative solution to ensure containers can connect to each other.
 > Or modify `hostname` in the `db_config` section of `configs/server_config.json` and replace `http://tigergraph` with your tigergraph container IP address, e.g., `http://172.19.0.2`.

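As an illustrative aside (not part of this commit's diff), overriding the hostname as the note above describes amounts to editing one key in the `db_config` section, using the example container IP:

```json
"db_config": {
    "hostname": "http://172.19.0.2"
}
```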
@@ -419,6 +421,8 @@ Copy the below into `configs/server_config.json` and edit the `hostname` and `ge
 "hostname": "http://tigergraph",
 "restppPort": "9000",
 "gsPort": "14240",
+"username": "tigergraph",
+"password": "tigergraph",
 "getToken": false,
 "default_timeout": 300,
 "default_mem_threshold": 5000,
@@ -427,22 +431,64 @@ Copy the below into `configs/server_config.json` and edit the `hostname` and `ge
 }
 ```

+| Parameter | Type | Default | Description |
+| --- | --- | --- | --- |
+| `hostname` | string | `"http://tigergraph"` | TigerGraph server URL. |
+| `restppPort` | string | `"9000"` | RESTPP port for TigerGraph API requests. |
+| `gsPort` | string | `"14240"` | GSQL port for TigerGraph admin operations. |
+| `username` | string | `"tigergraph"` | TigerGraph database username. |
+| `password` | string | `"tigergraph"` | TigerGraph database password. |
+| `getToken` | bool | `false` | Set to `true` if token authentication is enabled on TigerGraph. |
+| `graphname` | string | `""` | Default graph name. Usually left empty (selected at runtime). |
+| `apiToken` | string | `""` | Pre-generated API token. If set, token-based auth is used instead of username/password. |
+| `default_timeout` | int | `300` | Default query timeout in seconds. |
+| `default_mem_threshold` | int | `5000` | Memory threshold (MB) for query execution. |
+| `default_thread_limit` | int | `8` | Max threads for query execution. |
+
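As a reference sketch (not part of this commit's diff), a complete `db_config` block assembled from the defaults in the table above would look like this:

```json
"db_config": {
    "hostname": "http://tigergraph",
    "restppPort": "9000",
    "gsPort": "14240",
    "username": "tigergraph",
    "password": "tigergraph",
    "getToken": false,
    "graphname": "",
    "apiToken": "",
    "default_timeout": 300,
    "default_mem_threshold": 5000,
    "default_thread_limit": 8
}
```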
 ### GraphRAG configuration
 Copy the below code into `configs/server_config.json`. You shouldn’t need to change anything unless you change the port of the chat history service in the Docker Compose file.

-`reuse_embedding` to `true` will skip re-generating the embedding if it already exists.
-`ecc` and `chat_history_api` are the addresses of internal components of GraphRAG.If you use the Docker Compose file as is, you don’t need to change them.
-
 ```json
 {
 "graphrag_config": {
 "reuse_embedding": false,
-"ecc": "http://eventual-consistency-service:8001",
-"chat_history_api": "http://chat-history:8002"
+"ecc": "http://graphrag-ecc:8001",
+"chat_history_api": "http://chat-history:8002",
+"chunker": "semantic",
+"extractor": "llm",
+"top_k": 5,
+"num_hops": 2
 }
 }
 ```

+| Parameter | Type | Default | Description |
+| --- | --- | --- | --- |
+| `reuse_embedding` | bool | `true` | Skip re-generating the embedding if it already exists on a vertex. |
+| `ecc` | string | `"http://graphrag-ecc:8001"` | URL of the eventual consistency checker (ECC) service. No change needed when using the provided Docker Compose file. |
+| `chat_history_api` | string | `"http://chat-history:8002"` | URL of the chat history service. No change needed when using the provided Docker Compose file. |
+| `chunker` | string | `"semantic"` | Default document chunker. Options: `semantic`, `character`, `regex`, `markdown`, `html`, `recursive`. |
+| `extractor` | string | `"llm"` | Entity extraction method. Options: `llm`, `graphrag`. |
+| `chunker_config` | object | `{}` | Chunker-specific settings. For `character`/`markdown`/`recursive`: `chunk_size`, `overlap_size`. For `semantic`: `method`, `threshold`. For `regex`: `pattern`. |
+| `top_k` | int | `5` | Number of top similar results to retrieve during search. |
+| `num_hops` | int | `2` | Number of graph hops to traverse when expanding retrieved results. |
+| `num_seen_min` | int | `2` | Minimum number of times a node must appear across retrievals to be included. |
+| `community_level` | int | `2` | Community hierarchy level used for community search. |
+| `chunk_only` | bool | `true` | If true, hybrid search only retrieves document chunks (not entities). |
+| `doc_only` | bool | `false` | If true, hybrid search only retrieves from document chunks, skipping entity traversal. |
+| `with_chunk` | bool | `true` | If true, community search also includes document chunks in results. |
+| `doc_process_switch` | bool | `true` | Enable/disable document processing during knowledge graph build. |
+| `entity_extraction_switch` | bool | same as `doc_process_switch` | Enable/disable entity extraction during knowledge graph build. |
+| `community_detection_switch` | bool | same as `entity_extraction_switch` | Enable/disable community detection during knowledge graph build. |
+| `load_batch_size` | int | `500` | Batch size for upserting vertices during document loading. |
+| `upsert_delay` | int | `0` | Delay in seconds between upsert batches. |
+| `tg_concurrency` | int | `10` | Max concurrent requests to TigerGraph during processing. |
+| `process_interval_seconds` | int | `300` | Interval for background consistency processing (when enabled). |
+| `cleanup_interval_seconds` | int | `300` | Interval for background cleanup (when enabled). |
+| `checker_batch_size` | int | `100` | Number of vertices to scan per batch during background consistency checking. (Also accepts legacy key `batch_size`.) |
+| `enable_consistency_checker` | bool | `false` | Enable the background consistency checker. |
+| `graph_names` | list | `[]` | Graphs to monitor when consistency checker is enabled. |
+
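As a sketch of how the optional keys above combine (not part of this commit's diff), the fragment below switches to a character chunker and enables the background consistency checker; the chunk sizes and graph name are illustrative placeholders, not recommended values:

```json
"graphrag_config": {
    "reuse_embedding": true,
    "ecc": "http://graphrag-ecc:8001",
    "chat_history_api": "http://chat-history:8002",
    "chunker": "character",
    "chunker_config": { "chunk_size": 1000, "overlap_size": 100 },
    "extractor": "llm",
    "top_k": 5,
    "num_hops": 2,
    "enable_consistency_checker": true,
    "graph_names": ["MyGraph"],
    "process_interval_seconds": 300,
    "checker_batch_size": 100
}
```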
 ### Chat configuration
 Copy the below code into `configs/server_config.json`. You shouldn’t need to change anything unless you change the port of the chat history service in the Docker Compose file.

@@ -464,6 +510,51 @@ Copy the below code into `configs/server_config.json`. You shouldn’t need to c
 ### LLM provider configuration
 In the `llm_config` section of the `configs/server_config.json` file, copy the JSON config template from below for your LLM provider and fill out the appropriate fields. Only one provider is needed.

+#### Supported parameters
+
+**Top-level `llm_config` parameters:**
+
+| Parameter | Type | Default | Description |
+| --- | --- | --- | --- |
+| `authentication_configuration` | object | | Shared authentication credentials. Merged into all service configs (service-specific values take precedence). |
+| `token_limit` | int | | Shared token limit propagated to `completion_service` and `embedding_service` if they don't define their own. Use `0` or negative for unlimited. |
+
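For illustration (not part of this commit's diff), shared credentials and a token limit placed at the top of `llm_config` are inherited by both services unless they set their own values; the key name below assumes an OpenAI-style provider:

```json
"llm_config": {
    "authentication_configuration": { "OPENAI_API_KEY": "<your-api-key>" },
    "token_limit": 0
}
```

The `completion_service` and `embedding_service` blocks described next sit alongside these keys.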
+**`completion_service` parameters:**
+
+| Parameter | Type | Required | Default | Description |
+| --- | --- | --- | --- | --- |
+| `llm_service` | string | **Yes** | | LLM provider. Options: `openai`, `azure`, `vertexai`, `genai`, `bedrock`, `sagemaker`, `groq`, `ollama`, `huggingface`, `watsonx`. |
+| `llm_model` | string | **Yes** | | Model name for ECC/GraphRAG tasks (e.g., `gpt-4.1-mini`). |
+| `chat_model` | string | No | same as `llm_model` | Model name for the chatbot. If not set, falls back to `llm_model`. Allows using a different (e.g., cheaper/faster) model for chat vs. ingestion. |
+| `authentication_configuration` | object | No | inherited from top-level | Service-specific auth credentials (overrides top-level). |
+| `model_kwargs` | object | No | `{}` | Additional keyword arguments passed to the LLM (e.g., `{"temperature": 0}`). |
+| `prompt_path` | string | No | `"./common/prompts/openai_gpt4/"` | Path to prompt template files. |
+| `base_url` | string | No | | Custom API base URL (for self-hosted or proxy endpoints). |
+| `token_limit` | int | No | inherited from top-level | Max token limit for this service. |
+
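A minimal `completion_service` sketch (not part of this commit's diff) that uses a separate, cheaper model for chat; the model names are placeholders:

```json
"completion_service": {
    "llm_service": "openai",
    "llm_model": "gpt-4.1-mini",
    "chat_model": "gpt-4o-mini",
    "model_kwargs": { "temperature": 0 },
    "prompt_path": "./common/prompts/openai_gpt4/"
}
```

If `chat_model` is omitted, the chatbot falls back to `llm_model`, as described above.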
+**`embedding_service` parameters:**
+
+| Parameter | Type | Required | Default | Description |
+| --- | --- | --- | --- | --- |
+| `embedding_model_service` | string | **Yes** | | Embedding provider. Options: `openai`, `azure`, `vertexai`, `genai`, `bedrock`, `ollama`. |
+| `model_name` | string | **Yes** | | Embedding model name (e.g., `text-embedding-3-small`). |
+| `dimensions` | int | No | `1536` | Embedding vector dimensions. |
+| `authentication_configuration` | object | No | inherited from top-level | Service-specific auth credentials (overrides top-level). |
+
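Correspondingly, a minimal `embedding_service` sketch (not part of this commit's diff) using the values documented above:

```json
"embedding_service": {
    "embedding_model_service": "openai",
    "model_name": "text-embedding-3-small",
    "dimensions": 1536
}
```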
+**`multimodal_service` parameters (optional):**
+
+Used for vision/image description tasks during document ingestion. If not configured, a default vision model is auto-derived from the completion service provider.
+
+| Parameter | Type | Required | Default | Description |
+| --- | --- | --- | --- | --- |
+| `llm_service` | string | No | same as completion | Multimodal LLM provider. |
+| `llm_model` | string | No | auto-derived per provider | Vision model name (e.g., `gpt-4o`). |
+| `authentication_configuration` | object | No | inherited from top-level | Service-specific auth credentials. |
+| `model_kwargs` | object | No | `{}` | Additional keyword arguments. |
+| `prompt_path` | string | No | `"./common/prompts/openai_gpt4/"` | Path to prompt template files. |
+
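And a minimal `multimodal_service` sketch (not part of this commit's diff); this block is optional, and the model name follows the `gpt-4o` example above:

```json
"multimodal_service": {
    "llm_service": "openai",
    "llm_model": "gpt-4o"
}
```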
+#### Provider examples
+
 #### OpenAI
 In addition to the `OPENAI_API_KEY`, `llm_model` and `model_name` can be edited to match your specific configuration details.

