# AI Instructions for Redstring UI React

## Project Overview
Redstring is a semantic knowledge graph application that allows users to create, connect, and explore concepts through visual nodes and relationships. The system integrates with semantic web technologies including Wikidata, DBpedia, and Wikipedia.

## Recent Enhancements (Latest)

### Bridge + Wizard Revival (2025-11)
- **Goal**: Reactivate The Wizard via the HTTP bridge instead of relying on spatial LLM outputs.
- **Core Files**: `src/ai/BridgeClient.jsx` (state sync), `src/components/panel/views/LeftAIView.jsx` (AI panel), `bridge-daemon.js` (Orchestrator HTTP server), `src/services/bridgeConfig.js` (URL helpers).
- **Run Loop**:
  1. `npm install` (first time).
  2. `npm run dev` (Vite UI on :4000) in one shell.
  3. `npm run bridge` (starts `bridge-daemon.js` on :3001, auto-kills stale listeners). Use `npm run bridge -- --kill-only` to stop it.
  4. Verify the daemon with `curl http://localhost:3001/health` and `curl http://localhost:3001/api/bridge/health`.
  5. In the UI, click the 🔑 icon in the AI panel to store your Anthropic/OpenRouter key locally (never committed), or configure a local LLM server (Ollama, LM Studio, etc.) for privacy and offline use.
- **Using The Wizard**:
  - Bridge client now mounts automatically via `<BridgeClient />` in `src/App.jsx` and streams graph state + pending actions to the daemon.
  - The left AI panel replays bridge telemetry, reconnects via the refresh icon, and calls `/api/ai/chat` or `/api/ai/agent` with your stored key.
  - Status messages surface in-panel whenever the bridge reconnects or health checks fail.
- **Troubleshooting**:
  - If the refresh icon reports an unreachable bridge, rerun `npm run bridge` and ensure nothing else is bound to :3001.
  - The bridge expects your API key in the `Authorization` header; the UI handles this once the key is stored locally.
  - For local LLM servers, ensure the server is running and accessible at the configured endpoint (e.g., `http://localhost:11434` for Ollama).
- **The Wizard's Magic: LLM + Auto-Layout Pipeline**:
  - **Core Philosophy**: LLM handles semantics (what nodes/edges), auto-layout handles spatial positioning (where to place them).
  - **Flow**: User message → `/api/ai/agent` → LLM produces graphSpec → enqueued with layoutAlgorithm → Executor calls `graphLayoutService.applyLayout()` → positioned ops → Committer → UI.
  - **Key Files**: `bridge-daemon.js` (lines 911-1011: graphSpec routing), `src/services/orchestrator/roleRunners.js` (lines 41-132: Executor's `create_subgraph` + auto-layout), `src/services/graphLayoutService.js` (5 deterministic layouts).
  - **LLM Guidance**: `AGENT_PLANNER_PROMPT` explicitly forbids x/y coordinates and instructs the LLM to choose a `layoutAlgorithm` (force/hierarchical/radial/grid/circular) based on the graph's structure.
  - **Result**: Wizard generates complex multi-node graphs with deterministic spatial layout, no LLM spatial hallucination.
- **The Wizard Can Now "See" (2025-11)**:
  - **New Capability**: `read_graph_structure` tool allows Wizard to read semantic graph data (nodes, edges, relationships) without spatial coordinates.
  - **Implementation**: `src/services/bridgeStoreAccessor.js` provides read-only access to mirrored UI state, `src/services/orchestrator/roleRunners.js` handles `read_graph_structure` execution, `src/services/Committer.js` sends results to chat.
  - **Usage**: When the user asks "what's in this graph?" or "show me what you made", the LLM uses `intent: "analyze"`, which triggers `read_graph_structure`; the LLM receives the semantic data and responds with a node/edge summary.
  - **Schema**: `src/services/toolValidator.js` defines validation for `read_graph_structure` (graph_id, include_edges, include_descriptions).
  - **Allowlists**: Added to Planner, Executor, and Auditor allowlists in `src/services/roles.js`.
  - **Result**: Wizard can verify its creations, answer questions about graph contents, and make informed decisions about future additions.
- **Connection Definitions (2025-11)**:
  - **New Capability**: Wizard proactively creates connection definition nodes for all specific relationship types.
  - **How It Works**: Edges can have `definitionNodeIds` that point to node prototypes (NOT instances). When rendering, the edge uses that prototype's color and name. Definition nodes are NOT placed in the graph - they exist as reusable type definitions.
  - **GraphSpec Format**: LLM includes `definitionNode` in edge specs: `{"source":"A","target":"B","type":"orbits","definitionNode":{"name":"Orbital Relationship","color":"#FDB813","description":"what this means"}}`
  - **Deduplication**: `src/services/orchestrator/roleRunners.js` (lines 131-161, 373-403) searches existing prototypes by name before creating new ones, preventing duplicates across graphs.
  - **LLM Guidance**: `AGENT_PLANNER_PROMPT` instructs the LLM to ALWAYS define specific relationships ("orbits", "eats", "manages") but SKIP generic ones ("connects", "relates to").
  - **Result**: All meaningful relationships are first-class semantic concepts with custom colors, automatically deduplicated and reusable across the entire universe.
- **Define Connections Tool (2025-11)**:
  - **Goal**: Label edges that lack definition nodes in the active graph without touching the layout.
  - **Usage**: When the user asks to "define", "label", or "name" connections, the planner queues the `define_connections` tool instead of adding nodes.
  - **Implementation**: `src/services/orchestrator/roleRunners.js` inspects `bridgeStoreData.graphEdges`, creates/reuses definition prototypes, and emits `updateEdgeDefinition` ops for the selected edges.
  - **Options**: The tool accepts `limit` (max edges per run) and `includeGeneralTypes` if the user wants even vague links labeled.
  - **UI Feedback**: When no edges need definitions, the tool emits a friendly read response instead of mutating the graph.
- **Active Graph Awareness**:
  - Treat words like "here", "this graph", or "current graph" as references to the active graph.
  - Always mention the active graph’s name when acting on it so the user knows what’s being modified.
  - If the user asks what tools are available, mention the list (`create_graph`, `create_subgraph`, `define_connections`, `read_graph_structure`, `verify_state`) before continuing.
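
The split described above (the LLM chooses nodes, edges, and a `layoutAlgorithm`, and never emits coordinates) can be checked mechanically before a graphSpec is enqueued. The following is a minimal sketch under stated assumptions, not the routing code in `bridge-daemon.js`: the `validateGraphSpec` helper and the top-level field names `nodes` and `edges` are hypothetical; only the edge shape with `definitionNode` and the five layout names come from the notes above.

```javascript
// Hypothetical helper: checks an LLM-produced graphSpec before it is enqueued.
// The top-level field names (nodes, edges) are assumed for illustration.
const LAYOUT_ALGORITHMS = ['force', 'hierarchical', 'radial', 'grid', 'circular'];

function validateGraphSpec(spec) {
  const errors = [];
  const nodes = Array.isArray(spec.nodes) ? spec.nodes : [];
  const edges = Array.isArray(spec.edges) ? spec.edges : [];

  if (!LAYOUT_ALGORITHMS.includes(spec.layoutAlgorithm)) {
    errors.push(`unknown layoutAlgorithm: ${spec.layoutAlgorithm}`);
  }

  // The LLM handles semantics only; positions come from auto-layout.
  for (const node of nodes) {
    if ('x' in node || 'y' in node) {
      errors.push(`node "${node.name}" must not carry x/y coordinates`);
    }
  }

  // Every edge must reference nodes that exist in the same spec.
  const names = new Set(nodes.map((n) => n.name));
  for (const edge of edges) {
    if (!names.has(edge.source) || !names.has(edge.target)) {
      errors.push(`edge ${edge.source} -> ${edge.target} references a missing node`);
    }
  }

  // Keep one definition node per distinct name, mirroring the
  // "search by name before creating" deduplication rule.
  const definitions = new Map();
  for (const edge of edges) {
    const def = edge.definitionNode;
    if (def && !definitions.has(def.name)) {
      definitions.set(def.name, def);
    }
  }

  return { ok: errors.length === 0, errors, definitions: [...definitions.values()] };
}

const orbit = { name: 'Orbital Relationship', color: '#FDB813', description: 'what this means' };
const result = validateGraphSpec({
  layoutAlgorithm: 'radial',
  nodes: [{ name: 'Sun' }, { name: 'Earth' }, { name: 'Mars', x: 10, y: 20 }],
  edges: [
    { source: 'Earth', target: 'Sun', type: 'orbits', definitionNode: orbit },
    { source: 'Mars', target: 'Sun', type: 'orbits', definitionNode: { ...orbit } },
  ],
});
console.log(result.ok, result.errors, result.definitions.length);
```

In this example the spec is rejected because "Mars" carries coordinates, and the two `orbits` edges collapse to a single definition node.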

### Auto Layout & Graph Generation System (2025-01)
- **New Feature**: Comprehensive auto-layout and graph generation system in Debug menu
- **Access**: Redstring Menu → Debug → Generate Test Graph
- **Key Components**:
  - `src/services/graphLayoutService.js` - 5 layout algorithms (force-directed, hierarchical, radial, grid, circular)
  - `src/services/autoGraphGenerator.js` - JSON-LD & Simple JSON parsers, intelligent prototype reuse
  - `src/components/AutoGraphModal.jsx` - Configuration UI modal
- **Features**:
  - Parses Simple JSON and JSON-LD/RDF formats with auto-detection
  - Intelligently reuses existing prototypes (searches by name before creating)
  - Creates positioned instances using selected layout algorithm
  - Four sample data templates for quick testing
  - Respects three-layer architecture (prototypes→instances→graphs)
  - Layout options: force-directed, hierarchical, radial, grid, circular
  - Target modes: new graph, add to current, or replace existing
- **Usage Flow**: Select data source (sample/custom) → Choose layout → Select target → Generate
- **Documentation**: See `AUTO_LAYOUT_GUIDE.md` (complete guide) and `AUTOGRAPH_IMPLEMENTATION_SUMMARY.md` (implementation details)
- **Prototype Intelligence**: Searches for existing prototypes by name to maintain semantic consistency across the universe
- **Layout Quality**: All algorithms include collision avoidance, bounds management, and configurable parameters
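
To make "deterministic layout" concrete, here is a minimal sketch of a circular layout. It is illustrative only and is not the code in `src/services/graphLayoutService.js`; the function name `circularLayout` and its option names are assumptions, and it omits the collision avoidance and bounds management mentioned above.

```javascript
// Illustrative only: places nodes evenly on a circle, starting at 12 o'clock.
// Option names (centerX, centerY, radius) are assumptions.
function circularLayout(nodeIds, { centerX = 0, centerY = 0, radius = 300 } = {}) {
  const positions = new Map();
  const step = (2 * Math.PI) / Math.max(nodeIds.length, 1);
  nodeIds.forEach((id, index) => {
    const angle = index * step - Math.PI / 2;
    positions.set(id, {
      x: Math.round(centerX + radius * Math.cos(angle)),
      y: Math.round(centerY + radius * Math.sin(angle)),
    });
  });
  return positions;
}

const layout = circularLayout(['a', 'b', 'c', 'd'], { radius: 100 });
console.log(Object.fromEntries(layout));
```

Because the positions depend only on the node order and the options, the same input always yields the same coordinates, which is the property the Wizard pipeline relies on.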

### Node Drag Performance Optimization (2025-01)
- **Major Performance Fix**: Optimized node dragging to eliminate THREE critical bottlenecks
- **Dimension Caching**:
  - Two-layer cache (NodeCanvas + utils.js) based on node content, not position
  - Cache key: `${prototypeId}-${name}-${thumbnailSrc}` (ignores x, y, scale)
  - Eliminates roughly 99% of dimension calculations during drag (from 100+ per frame to near zero once the cache is warm)
  - LRU cache with automatic eviction to prevent memory leaks
- **SaveCoordinator Optimization**:
  - Skip expensive state hash calculation during 'move' phase
  - Only compute hash when drag ends (phase: 'end')
  - Reduces hash calculations from 60+ per second to 1 per drag
  - Throttle console logs to once per second during drag
- **Hover State Optimization**:
  - Clear hover states once at drag start (handleMouseDown)
  - Skip per-frame hover state clearing during drag (180+ setState calls eliminated)
  - Skip selection box calculations during node drag
  - Hover vision aid automatically disabled during drag, re-enabled after
- **Drag Signal Pattern** (critical for future changes):
  - START: Scale set to 1.1 (implicit)
  - MOVE: `{ isDragging: true, phase: 'move' }` (60+ times/sec)
  - END: `{ phase: 'end', isDragging: false, finalize: true }` (once)
- **Performance Gains**: Per-frame drag cost drops from 25-40ms to 1-3ms (reported as 30-70x faster dragging)
- **Key Files**:
  - `src/NodeCanvas.jsx` (lines 1469-1505, 5225, 5425-5426, 5429, 5739-5742): Dimension cache + hover optimization
  - `src/utils.js` (lines 73-122, 293-313): LRU cache with hash calculation
  - `src/services/SaveCoordinator.js` (lines 81-126): Skip hash during 'move' phase
  - `DRAG_PERFORMANCE_COMPLETE.md`: Complete technical documentation
- **Critical Principles**:
  - Never compute expensive operations during transient 'move' phase - defer to 'end' phase
  - Clear state once at start, not repeatedly during operation
  - Cache based on content, not identity
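
The "cache based on content" principle can be sketched as a small LRU cache keyed by the template string given above. This is a hedged illustration, not the implementation in `src/NodeCanvas.jsx` or `src/utils.js`: the `DimensionCache` class, its `get(node, compute)` signature, and the `measure` function are hypothetical.

```javascript
// Sketch of a content-keyed LRU cache; class and method names are hypothetical.
class DimensionCache {
  constructor(maxEntries = 500) {
    this.maxEntries = maxEntries;
    this.entries = new Map(); // Map iterates in insertion order, oldest first
    this.misses = 0;
  }

  // The key depends on content only, so moving or scaling a node never invalidates it.
  static keyFor(node) {
    return `${node.prototypeId}-${node.name}-${node.thumbnailSrc}`;
  }

  get(node, compute) {
    const key = DimensionCache.keyFor(node);
    if (this.entries.has(key)) {
      const cached = this.entries.get(key);
      this.entries.delete(key); // re-insert to mark as most recently used
      this.entries.set(key, cached);
      return cached;
    }
    this.misses += 1;
    const value = compute(node);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // Evict the least recently used entry (first key in iteration order).
      this.entries.delete(this.entries.keys().next().value);
    }
    return value;
  }
}

const cache = new DimensionCache(2);
const measure = (n) => ({ width: 40 + n.name.length * 10, height: 60 });
const node = { prototypeId: 'p1', name: 'Sun', thumbnailSrc: '', x: 0, y: 0 };

cache.get(node, measure);                       // miss: computed once
cache.get({ ...node, x: 250, y: 90 }, measure); // hit: position is not part of the key
console.log(cache.misses); // 1
```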

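The drag signal pattern and the "defer to the 'end' phase" principle can be illustrated as follows. This is a sketch under assumptions, not the code in `src/services/SaveCoordinator.js`: the `DragAwareSaver` class and its `onStateChange(state, context)` method are hypothetical, and `JSON.stringify` stands in for whatever state hash the real coordinator uses. Only the shape of the MOVE and END signals comes from the list above.

```javascript
// Hypothetical coordinator: hashing is skipped while the drag reports phase 'move'.
class DragAwareSaver {
  constructor(hashState) {
    this.hashState = hashState;
    this.hashCalls = 0;
    this.lastHash = null;
  }

  // Returns true when the caller should persist the state.
  onStateChange(state, context = {}) {
    if (context.isDragging && context.phase === 'move') {
      return false; // transient frame: no hashing, no save
    }
    // Drag ended (or a non-drag change): compute the hash exactly once.
    this.hashCalls += 1;
    const hash = this.hashState(state);
    const changed = hash !== this.lastHash;
    this.lastHash = hash;
    return changed;
  }
}

const saver = new DragAwareSaver((s) => JSON.stringify(s));
const state = { nodes: [{ id: 'n1', x: 0, y: 0 }] };

// Simulate one second of dragging at 60 frames per second.
for (let frame = 1; frame <= 60; frame += 1) {
  state.nodes[0].x = frame;
  saver.onStateChange(state, { isDragging: true, phase: 'move' });
}
const shouldSave = saver.onStateChange(state, { phase: 'end', isDragging: false, finalize: true });
console.log(saver.hashCalls, shouldSave); // 1 true
```
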
### Local-First Storage Architecture (2025-01)
- **Critical Architectural Change**: Rewrote `forceSave` to support multi-storage sync instead of Git-centric priority system
- **Multi-Storage Sync**: When saving, system saves to ALL enabled storage locations to keep them in sync:
  - Local file (if linked and has handle)
  - Git repository (if linked and authenticated)
  - Browser storage (always, as cache)
- **Source of Truth**: Only matters when LOADING data (to resolve conflicts), not when SAVING
- **Git Opt-In**: Git only activates when user explicitly links a repository - no automatic creation
- **Independent Local Storage**: Local `.redstring` files work completely independently without Git interference
- **User Privacy**: GitHub cannot access data unless user explicitly enables Git federation
- **Resilience**: If one storage method fails, others still succeed
- **Key Changes**: `src/services/universeBackend.js` (lines 932-1073) - Complete rewrite of `forceSave` method
- **Philosophy**: Storage options are additive, not exclusive. Git federation is a feature you opt into.
- **Documentation**: `GIT_FEDERATION.md` - Local-First Storage Architecture section
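
The "save to all enabled storages, tolerate individual failures" behaviour can be sketched with `Promise.allSettled`. This is an assumption-laden illustration, not the `forceSave` method in `src/services/universeBackend.js`: the `saveToAllStorages` function and the `{ name, enabled, save }` target shape are hypothetical.

```javascript
// Sketch only: target objects and their { name, enabled, save } shape are hypothetical.
async function saveToAllStorages(data, targets) {
  const enabled = targets.filter((target) => target.enabled);
  // allSettled never rejects, so one failing backend cannot block the others.
  const outcomes = await Promise.allSettled(
    enabled.map((target) => Promise.resolve().then(() => target.save(data)))
  );
  return enabled.map((target, index) => ({
    name: target.name,
    ok: outcomes[index].status === 'fulfilled',
    error: outcomes[index].status === 'rejected' ? String(outcomes[index].reason) : null,
  }));
}

const targets = [
  { name: 'local-file', enabled: true, save: async () => 'written' },
  { name: 'git', enabled: true, save: async () => { throw new Error('not authenticated'); } },
  { name: 'browser', enabled: true, save: async () => 'cached' },
];

saveToAllStorages({ graphs: [] }, targets).then((report) => console.log(report));
```

Here the Git save fails while the local file and browser saves still succeed, matching the resilience rule above.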

### Format Versioning System (2025-01)
- **Comprehensive Versioning**: Implemented versioning system for `.redstring` file format to protect user data during updates
- **Automatic Migration**: Files from older versions (v1.0.0, v2.0.0-semantic) are automatically migrated to current version (v3.0.0)
- **Version Validation**: Files are validated before import with clear error messages for incompatible versions
- **User Feedback**: Migration progress is shown to users with clear status messages
- **Developer Documentation**: Complete guide in `REDSTRING_FORMAT_VERSIONING.md` for maintaining and extending the system
- **Key Files**:
  - `src/formats/redstringFormat.js`: Version constants, validation, and migration logic
  - `src/GitNativeFederation.jsx`: UI integration with validation messages
  - `GIT_FEDERATION.md`: User-facing documentation
  - `REDSTRING_FORMAT_VERSIONING.md`: Developer guide and API reference
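
Automatic migration of this kind is commonly structured as a chain of single-step upgrades. The sketch below illustrates that idea under stated assumptions and is not the logic in `src/formats/redstringFormat.js`: the `format` field name, the version strings without a `v` prefix, and the placeholder transforms are all hypothetical; only the three version identifiers come from the notes above.

```javascript
// Illustrative migration chain. The `format` field name and the
// per-step transforms are assumptions, not the real file schema.
const CURRENT_VERSION = '3.0.0';

const MIGRATIONS = new Map([
  ['1.0.0', (file) => ({ ...file, format: '2.0.0-semantic' })],
  ['2.0.0-semantic', (file) => ({ ...file, format: '3.0.0' })],
]);

function migrateToCurrent(file) {
  let current = file;
  const applied = [];
  while (current.format !== CURRENT_VERSION) {
    const step = MIGRATIONS.get(current.format);
    if (!step) {
      // Clear error for versions that cannot be migrated.
      throw new Error(`Unsupported .redstring version: ${current.format}`);
    }
    applied.push(current.format);
    current = step(current);
  }
  return { file: current, applied };
}

const migrated = migrateToCurrent({ format: '1.0.0', graphs: [] });
console.log(migrated.file.format, migrated.applied);
```

Each step only knows how to reach the next version, so supporting a future format means adding one entry rather than rewriting older migrations.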

### Enhanced Semantic Search System
- **Comprehensive DBpedia Search**: Implemented `comprehensiveDBpediaSearch()` function that explores all DBpedia relationships for an entity, including:
  - Main entity properties and metadata
  - Related entities through `wikiPageWikiLink` properties
  - Entities in the same categories
  - Property categorization (relationships, categories, attributes, external links)

- **Enhanced Semantic Search**: Created `enhancedSemanticSearch()` function that integrates multiple sources:
  - DBpedia (primary source, 70% of results)
  - Wikidata (secondary source, 20% of results)
  - Wikipedia (tertiary source, 10% of results)
  - Results in format compatible with semantic discovery interface

- **SameAs Consolidation**: Implemented `consolidateSameAsResults()` to merge duplicate entities across different sources (Wikidata/DBpedia)

- **Property-Based Relationships**: Enhanced `findRelatedThroughDBpediaProperties()` to find semantically related entities through shared properties
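
The idea behind sameAs consolidation can be sketched as follows. This is not the project's `consolidateSameAsResults()`: the helper name `mergeBySameAs`, the `{ name, source, uri, sameAs }` result shape, and the Wikidata URI in the example (a placeholder, not a real identifier) are assumptions made for illustration.

```javascript
// Hypothetical result shape: { name, source, uri, sameAs: [uri, ...] }.
// Entities are merged when they share any URI (their own or a sameAs link).
// Single greedy pass: it does not merge two groups that only become linked later.
function mergeBySameAs(results) {
  const merged = [];
  for (const entity of results) {
    const uris = new Set([entity.uri, ...(entity.sameAs || [])]);
    const existing = merged.find((m) => [...uris].some((uri) => m.uris.has(uri)));
    if (existing) {
      uris.forEach((uri) => existing.uris.add(uri));
      if (!existing.sources.includes(entity.source)) existing.sources.push(entity.source);
    } else {
      merged.push({ name: entity.name, uris, sources: [entity.source] });
    }
  }
  return merged;
}

const consolidated = mergeBySameAs([
  { name: 'LittleBigPlanet', source: 'dbpedia', uri: 'http://dbpedia.org/resource/LittleBigPlanet',
    sameAs: ['http://www.wikidata.org/entity/Q_PLACEHOLDER'] },
  { name: 'LittleBigPlanet', source: 'wikidata', uri: 'http://www.wikidata.org/entity/Q_PLACEHOLDER' },
  { name: 'Sackboy', source: 'dbpedia', uri: 'http://dbpedia.org/resource/Sackboy' },
]);
console.log(consolidated.map((e) => `${e.name}: ${e.sources.join('+')}`));
```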

### Semantic Discovery Integration
- Updated semantic discovery interface to use enhanced search functions
- Replaced `knowledgeFederation.importKnowledgeCluster()` with `enhancedSemanticSearch()`
- Added comprehensive logging for debugging and monitoring
- Increased timeout and result limits for better coverage

### Error Handling & Performance
- Fixed stale node reference warnings in semantic discovery
- Added session-based warning tracking to prevent repeated console spam
- Implemented timeout management for external API calls
- Added fallback mechanisms for failed searches
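
Session-based warning tracking can be as simple as remembering which warnings have already been emitted. A minimal sketch, assuming a hypothetical `createSessionWarner` helper and key format (neither is taken from the codebase):

```javascript
// Hypothetical helper: emits each distinct warning at most once per session.
function createSessionWarner(log = console.warn) {
  const seen = new Set();
  return function warnOnce(key, message) {
    if (seen.has(key)) return false; // already reported in this session
    seen.add(key);
    log(message);
    return true;
  };
}

const warnOnce = createSessionWarner();
warnOnce('stale-node:n42', 'Stale node reference n42, skipping');
warnOnce('stale-node:n42', 'Stale node reference n42, skipping'); // suppressed
```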

## Core Architecture

### Semantic Web Integration
- **Wikidata**: Direct SPARQL queries with fuzzy search support
- **DBpedia**: Property-based relationship discovery and category matching
- **Wikipedia**: API integration for article summaries and search
- **Local Knowledge Graph**: Semantic similarity algorithms and relationship mapping

### Search Strategies
1. **Direct Entity Search**: Exact and fuzzy matching across sources
2. **Property-Based Search**: Finding entities through shared properties
3. **Category-Based Search**: Discovering concepts in related categories
4. **Relationship Traversal**: Following semantic connections between entities

### Data Flow
1. User searches for concept (e.g., "LittleBigPlanet")
2. Enhanced search queries multiple sources in parallel
3. Results are consolidated and deduplicated
4. SameAs relationships are merged
5. Results are formatted for semantic discovery interface
6. User sees comprehensive list of related concepts with connection information
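
Steps 2 and 3 of this flow (parallel queries, then consolidation) can be sketched as below. The helpers `withTimeout` and `searchAllSources`, the `{ name, query }` source shape, and deduplication by lower-cased name are assumptions for illustration; the real entry point is `enhancedSemanticSearch()`, whose internals are not shown here.

```javascript
// Hypothetical helpers; the source objects stand in for the
// DBpedia, Wikidata, and Wikipedia queries.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

async function searchAllSources(entityName, sources, timeoutMs = 25000) {
  // Query every source in parallel; a slow or failing source does not block the rest.
  const settled = await Promise.allSettled(
    sources.map((s) =>
      withTimeout(Promise.resolve().then(() => s.query(entityName)), timeoutMs, s.name)
    )
  );
  const byName = new Map();
  const failed = [];
  settled.forEach((outcome, index) => {
    if (outcome.status === 'rejected') {
      failed.push(sources[index].name); // fall back to the remaining sources
      return;
    }
    for (const entity of outcome.value) {
      const key = entity.name.toLowerCase(); // assumed deduplication key
      if (!byName.has(key)) byName.set(key, { ...entity, sources: [] });
      byName.get(key).sources.push(sources[index].name);
    }
  });
  return { entities: [...byName.values()], failed };
}

// Usage with stand-in sources:
// searchAllSources('LittleBigPlanet', [{ name: 'dbpedia', query: async () => [] }])
//   .then((result) => console.log(result.entities.length, result.failed));
```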

## Key Functions

### `enhancedSemanticSearch(entityName, options)`
- Main entry point for semantic discovery
- Returns structured results with entities and relationships
- Handles timeouts and fallbacks gracefully

### `comprehensiveDBpediaSearch(entityName, options)`
- Deep exploration of DBpedia entity relationships
- Categorizes properties by type (relationships, categories, attributes)
- Finds related entities through multiple strategies

### `findRelatedThroughDBpediaProperties(entityName, options)`
- Discovers entities linked through `wikiPageWikiLink` properties
- Creates semantic relationship network
- Provides connection context for each related entity

## Usage Examples

### Basic Semantic Search
```javascript
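// Top-level await requires an ES module context.
import { enhancedSemanticSearch } from './services/semanticWebQuery.js';

const results = await enhancedSemanticSearch('LittleBigPlanet', {
  timeout: 25000,         // milliseconds
  limit: 50,              // result limit
  includeWikipedia: true  // also query Wikipedia
});

// metadata carries the totals for the consolidated result set
console.log(`Found ${results.metadata.totalEntities} entities`);
console.log(`Found ${results.metadata.totalRelationships} relationships`);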
```

### Comprehensive DBpedia Exploration
```javascript
import { comprehensiveDBpediaSearch } from './services/semanticWebQuery.js';

const results = await comprehensiveDBpediaSearch('LittleBigPlanet');
console.log(`Properties: ${results.properties.length}`);
console.log(`Categories: ${results.categories.length}`);
console.log(`Related Entities: ${results.relatedEntities.length}`);
```

## Performance Characteristics
- **DBpedia Search**: ~2-5 seconds, 30-50 related entities
- **Wikidata Search**: ~1-3 seconds, 10-20 entities
- **Wikipedia Search**: ~1-2 seconds, 1-5 articles
- **Total Enhanced Search**: ~5-10 seconds, 40-70 total entities

## Troubleshooting

### Common Issues
1. **Timeout Errors**: Increase timeout in options or check network connectivity
2. **Limited Results**: Verify entity name spelling and check if entity exists in DBpedia
3. **Stale Node Warnings**: These are now handled gracefully with session tracking

### Debug Mode
Enable detailed logging by setting the `debug` key in `localStorage`:
```javascript
// In browser console
localStorage.setItem('debug', 'semanticWebQuery:*');
```

## Future Enhancements
- **Caching Layer**: Implement result caching for frequently searched entities
- **Incremental Search**: Progressive disclosure of results as they load
- **Semantic Clustering**: Group related entities by semantic similarity
- **Cross-Language Support**: Extend search to multiple languages
- **Real-time Updates**: Live updates from semantic web sources

## Local LLM Integration (2025-01)
- **New Capability**: Support for local LLM servers (Ollama, LM Studio, LocalAI, vLLM, etc.) via OpenAI-compatible API format.
- **Implementation**: `src/services/agent/llmCaller.js` supports `provider: 'local'` or `provider: 'openai'` with custom endpoints, `src/services/apiKeyManager.js` includes local provider presets, `src/ai/components/APIKeySetup.jsx` provides UI for local LLM configuration.
- **Provider Presets**: Ollama (port 11434), LM Studio (port 1234), LocalAI (port 8080), vLLM (port 8000), and custom OpenAI-compatible servers.
- **Configuration**: Users select "💻 Local LLM Server" from provider dropdown, choose a preset or configure manually, enter endpoint URL and model name, test connection, and save.
- **Benefits**: Complete data privacy (no data leaves user's machine), offline capability, zero API costs, lower latency, no rate limits.
- **API Compatibility**: All local providers use OpenAI `/v1/chat/completions` format, making them compatible with existing Wizard agent code.
- **Documentation**: See `LOCAL_LLM_SETUP.md` for detailed setup instructions for each provider.
- **Key Files**: `src/services/agent/llmCaller.js` (OpenAI-compatible endpoint support), `src/services/apiKeyManager.js` (local provider presets), `src/ai/components/APIKeySetup.jsx` (local LLM UI), `LOCAL_LLM_SETUP.md` (setup guide).
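
Because every local provider speaks the OpenAI `/v1/chat/completions` format, one request builder can serve all presets. The sketch below is illustrative and is not the code in `src/services/agent/llmCaller.js`: the `buildChatRequest` helper, the preset keys, and the model name `llama3` are placeholders; the ports come from the preset list above.

```javascript
// Ports come from the preset list above; the helper and preset keys are hypothetical.
const LOCAL_PRESETS = {
  ollama: 'http://localhost:11434',
  lmstudio: 'http://localhost:1234',
  localai: 'http://localhost:8080',
  vllm: 'http://localhost:8000',
};

function buildChatRequest(preset, model, userMessage, endpoint = LOCAL_PRESETS[preset]) {
  if (!endpoint) throw new Error(`Unknown preset "${preset}" and no endpoint given`);
  return {
    // Strip trailing slashes so custom endpoints do not produce "//v1/...".
    url: `${endpoint.replace(/\/+$/, '')}/v1/chat/completions`,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        model,
        messages: [{ role: 'user', content: userMessage }],
      }),
    },
  };
}

const request = buildChatRequest('ollama', 'llama3', 'Summarize the active graph');
console.log(request.url); // http://localhost:11434/v1/chat/completions

// Sending it (requires the local server to be running):
// fetch(request.url, request.options).then((res) => res.json()).then(console.log);
```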

## Integration Points
- **Semantic Discovery Panel**: Primary interface for concept exploration
- **Node Canvas**: Visual representation of discovered concepts
- **Knowledge Federation**: Legacy integration maintained for compatibility
- **External APIs**: Wikidata, DBpedia, Wikipedia endpoints
- **Local LLM Servers**: Ollama, LM Studio, LocalAI, vLLM, and custom OpenAI-compatible endpoints

This enhanced semantic search system provides significantly better results for queries like "LittleBigPlanet" by leveraging the rich property-based relationships in DBpedia and consolidating results across multiple semantic web sources.