Agents provide enhanced chat experiences by automatically selecting between Simple Chat and RAG Chat based on document availability. All routing is handled by LangGraph on the backend; the frontend never decides the path.
Processing Flow
1. Agent Selection and Routing
- User selects an agent in the interface
- Frontend emits a single socket event containing the query and agent ID
- Node.js backend receives the request (see the handler sketch below)
- Backend retrieves the agent data from MongoDB
- Agent ID and configuration are fully loaded
- Agent metadata (tools, prompt, document presence) is verified
- Document availability is detected automatically
- LangGraph analyzes the agent configuration and requirements to select a processing path
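A minimal sketch of the receiving side, assuming a Socket.IO server, a MongoDB `agents` collection, and a hypothetical `agent:query` event name (none of these identifiers are confirmed here):

```typescript
import { Server } from "socket.io";
import { MongoClient, ObjectId } from "mongodb";

const mongo = new MongoClient(process.env.MONGO_URI ?? "mongodb://localhost:27017");
const io = new Server(3001);

io.on("connection", (socket) => {
  // A single event carries both the user query and the selected agent ID.
  socket.on("agent:query", async ({ query, agentId }: { query: string; agentId: string }) => {
    const agents = mongo.db("app").collection("agents");

    // Load the full agent record: custom prompt, enabled tools, linked documents.
    const agent = await agents.findOne({ _id: new ObjectId(agentId) });
    if (!agent) {
      socket.emit("agent:error", { message: "Agent not found" });
      return;
    }

    // Document availability decides RAG vs. Simple Chat downstream.
    const hasDocuments = Array.isArray(agent.documents) && agent.documents.length > 0;
    socket.emit("agent:ack", { agentId, hasDocuments });
  });
});
```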
2. Document-Based Path Detection
With Documents (RAG Chat):
- Agent’s custom system prompt
- Document retrieval from Pinecone
- Web Analysis Tool (if supported)
- Image Generation Tool (OpenAI models only)
- Web Search Tool (SearxNG; not supported for GPT-4o latest, DeepSeek, Qwen)

Without Documents (Simple Chat):
- Agent’s system prompt
- Standard chat processing
- Image Generation Tool (OpenAI models only)
- Web Analysis Tool
- Web Search Tool (SearxNG)

Routing:
- LangGraph automatically detects document presence
- Backend determines the appropriate processing path (see the routing sketch below)
- No frontend decision logic is required
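A sketch of how this routing could be wired with LangGraph.js conditional edges; the node names (`ragChat`, `simpleChat`) and state shape are illustrative assumptions, not the project's actual graph:

```typescript
import { StateGraph, Annotation, START, END } from "@langchain/langgraph";

// Illustrative state: the loaded agent config plus the user query.
const ChatState = Annotation.Root({
  query: Annotation<string>(),
  agentPrompt: Annotation<string>(),
  documentIds: Annotation<string[]>(),
  answer: Annotation<string>(),
});

const graph = new StateGraph(ChatState)
  .addNode("simpleChat", async (state) => ({ answer: `simple: ${state.query}` }))
  .addNode("ragChat", async (state) => ({ answer: `rag: ${state.query}` }))
  // Backend-only decision: document presence selects the path.
  .addConditionalEdges(START, (state) =>
    state.documentIds.length > 0 ? "ragChat" : "simpleChat"
  )
  .addEdge("simpleChat", END)
  .addEdge("ragChat", END)
  .compile();
```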
3. Unified Context Assembly
Combined Context Elements:
- Agent Prompt: custom agent instructions
- Chat History: previous conversation messages
- User Query: current user input
- Documents (if applicable): retrieved vector chunks

Context Management:
- Overflow is handled with a rolling-window strategy (sketched below)
- Context is trimmed automatically as limits are approached
- Priority is given to recent messages and relevant documents
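One way to implement the rolling-window trim, as a sketch; the token budget and the crude 4-characters-per-token estimate are assumptions:

```typescript
interface Message { role: "system" | "user" | "assistant"; content: string; }

// Rough token estimate (~4 characters per token); a real tokenizer is more accurate.
const estimateTokens = (text: string) => Math.ceil(text.length / 4);

function assembleContext(
  agentPrompt: string,
  history: Message[],
  userQuery: string,
  documentChunks: string[],
  maxTokens = 8000
): Message[] {
  // Fixed parts keep priority: agent prompt, retrieved documents, current query.
  const fixed: Message[] = [
    { role: "system", content: agentPrompt },
    ...documentChunks.map((c) => ({ role: "system" as const, content: c })),
    { role: "user", content: userQuery },
  ];
  let budget = maxTokens - fixed.reduce((n, m) => n + estimateTokens(m.content), 0);

  // Rolling window: walk history from most recent, keep while budget remains.
  const kept: Message[] = [];
  for (let i = history.length - 1; i >= 0; i--) {
    const cost = estimateTokens(history[i].content);
    if (cost > budget) break;
    kept.unshift(history[i]);
    budget -= cost;
  }
  // Final order: prompt, document chunks, trimmed history, then the query last.
  return [fixed[0], ...fixed.slice(1, -1), ...kept, fixed[fixed.length - 1]];
}
```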
4. Single LLM Call Processing
Execution:
- LangGraph assembles the complete context
- A single LLM call is made with all required information
- The response is generated and streamed to the frontend
- Results are stored in MongoDB via the Cost Callback tracker

Stored Data (record shape sketched below):
- LLM response content
- Agent ID and configuration
- Token cost and usage metrics
- Processing time and metadata
- Tool activations (if any)
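A sketch of what the persisted record might look like; the field and collection names are assumptions:

```typescript
import { MongoClient } from "mongodb";

// Hypothetical shape of a persisted interaction record.
interface InteractionRecord {
  agentId: string;
  response: string;            // LLM response content
  inputTokens: number;         // usage metrics from the cost callback
  outputTokens: number;
  costUsd: number;
  processingMs: number;        // processing time
  toolsActivated: string[];    // e.g. ["web_search", "image_generation"]
  createdAt: Date;
}

async function persistInteraction(mongo: MongoClient, record: InteractionRecord) {
  // Written once per interaction, after the single LLM call completes.
  await mongo.db("app").collection<InteractionRecord>("interactions").insertOne(record);
}
```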
5. RAG Implementation (Document-Based Agents)
Document Processing:
- Text extraction from uploaded files
- Content split into optimized chunks
- Embedding generation using the configured model
- Parallel storage in Pinecone and S3

Query Processing:
- Query embedding generated
- Pinecone retrieves similar chunks
- Chunks combined with the agent prompt
- Enhanced context sent to the LLM

Key Characteristics:
- Total LLM calls: one call carrying all context
- The same embedding model is used for upload and retrieval
- Top-k vector search with agent-level metadata filtering (see the retrieval sketch below)
- Automatic context size management
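A retrieval sketch using the Pinecone Node.js client; the index name (`agent-documents`), metadata fields (`agentId`, `text`), and `topK` value are assumptions:

```typescript
import { Pinecone } from "@pinecone-database/pinecone";

async function retrieveChunks(
  queryEmbedding: number[], // produced by the same embedding model used at upload
  agentId: string,
  topK = 5
): Promise<string[]> {
  const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
  const index = pc.index("agent-documents"); // assumed index name

  // Top-k similarity search, filtered to this agent's documents only.
  const result = await index.query({
    vector: queryEmbedding,
    topK,
    filter: { agentId: { $eq: agentId } },
    includeMetadata: true,
  });

  // The chunk text is assumed to live in a `text` metadata field.
  return result.matches.flatMap((m) =>
    typeof m.metadata?.text === "string" ? [m.metadata.text] : []
  );
}
```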
Architecture

*(Diagram: Agent Processing Architecture)*
Tool Activation Matrix
| Tool | Simple Chat | RAG Chat | Model Requirement |
|---|---|---|---|
| Document Retrieval | ❌ | ✅ | Any |
| Web Search (SearxNG) | ✅ | ✅ | All except GPT-4o latest, DeepSeek, Qwen |
| Web Analysis Tool | ✅ | ✅ | Any |
| Image Generation | ✅ | ✅ | OpenAI models only |
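The matrix can be enforced with a small gating function on the backend; this sketch encodes the rows above, and the model-name matching heuristics are assumptions:

```typescript
type Tool = "document_retrieval" | "web_search" | "web_analysis" | "image_generation";

// Encodes the Tool Activation Matrix above; model-name checks are illustrative.
function allowedTools(model: string, hasDocuments: boolean): Tool[] {
  const tools: Tool[] = ["web_analysis"]; // available on any model, both paths

  if (hasDocuments) tools.push("document_retrieval"); // RAG Chat only

  // Web search (SearxNG) is unavailable for GPT-4o latest, DeepSeek, and Qwen.
  if (!/4o-latest|deepseek|qwen/i.test(model)) tools.push("web_search");

  // Image generation is limited to OpenAI models (detection heuristic assumed).
  if (/^(gpt-|chatgpt|o\d)/i.test(model)) tools.push("image_generation");

  return tools;
}
```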
Key Components
| Component | Purpose |
|---|---|
| LangGraph Router | Automatic path selection and routing |
| Agent Repository | Agent configuration and prompt retrieval |
| Document Detector | Identifies RAG vs. Simple Chat requirements |
| Pinecone Client | Vector retrieval for document context |
| Cost Tracker | Token usage and pricing tracking |
| MongoDB Handler | Response and metrics persistence |
Backend Intelligence
Automatic Detection
The LangGraph backend identifies:
- Agent Configuration: custom prompts and settings
- Document Availability: RAG vs. Simple Chat routing
- Tool Requirements: web search and image generation needs
- Model Capabilities: validates feature support
Decision Flow

Request received → agent configuration loaded from MongoDB → documents linked? (yes → RAG Chat, no → Simple Chat) → tools gated by model capability → single LLM call → response streamed and persisted.
Cost Optimization
Single Call Efficiency
Token Reduction:
- Previous: multiple calls with repeated context
- Current: a single call with optimized context
- Savings: 40-60% reduction in token usage

Cost Tracking:
- Input tokens measured
- Output tokens tracked
- Cost calculated per interaction (see the sketch below)
- Stored in MongoDB for reporting
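A per-interaction cost computation sketch; the price table is a placeholder, not the project's actual rates:

```typescript
// Placeholder per-million-token prices; real rates vary by model and provider.
const PRICES: Record<string, { inputPerM: number; outputPerM: number }> = {
  "gpt-4o": { inputPerM: 2.5, outputPerM: 10 },
};

function interactionCostUsd(model: string, inputTokens: number, outputTokens: number): number {
  const p = PRICES[model];
  if (!p) throw new Error(`No pricing configured for ${model}`);
  return (inputTokens / 1_000_000) * p.inputPerM + (outputTokens / 1_000_000) * p.outputPerM;
}

// Example: 1,200 input + 300 output tokens on gpt-4o
// → 1200/1e6 * 2.5 + 300/1e6 * 10 = $0.006
```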
Web Search Independence
SearxNG Integration
Benefits:
- No dependency on OpenAI search features
- Works across multiple model providers
- Self-hosted for privacy and control
- Consistent search experience (see the query sketch below)

Model Compatibility:
- Supported: most models, including GPT-4, Claude, Gemini
- Not supported: GPT-4o latest, DeepSeek, Qwen
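Querying a self-hosted SearxNG instance is a plain HTTP call; this sketch assumes the instance has the JSON output format (`format=json`) enabled in its settings and runs at a local URL:

```typescript
// Assumes a self-hosted SearxNG instance with JSON output enabled in settings.yml.
async function searxngSearch(query: string, baseUrl = "http://localhost:8080") {
  const url = new URL("/search", baseUrl);
  url.searchParams.set("q", query);
  url.searchParams.set("format", "json");

  const res = await fetch(url);
  if (!res.ok) throw new Error(`SearxNG returned ${res.status}`);

  // Each result carries at least a title, url, and content snippet.
  const data = (await res.json()) as { results: { title: string; url: string; content: string }[] };
  return data.results;
}
```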
Troubleshooting
Agent RAG Not Triggering
Potential Issues:
- No documents linked to the agent
- Embeddings not properly generated
- Pinecone index misconfigured
- Document metadata filters incorrect

Resolution Steps:
- Verify the agent has linked documents in the database
- Check the embedding generation logs
- Validate the Pinecone connection and index
- Review the metadata filtering logic
- Test document retrieval independently (a diagnostic sketch follows)
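A standalone diagnostic along the lines of the last step, assuming the same index and metadata conventions as the retrieval sketch above:

```typescript
import { Pinecone } from "@pinecone-database/pinecone";

async function diagnoseRetrieval(agentId: string, queryEmbedding: number[]) {
  const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
  const index = pc.index("agent-documents"); // assumed index name

  // 1. Confirm the index actually contains vectors.
  const stats = await index.describeIndexStats();
  console.log("total vectors:", stats.totalRecordCount);

  // 2. Query with and without the agent filter to isolate metadata issues.
  const unfiltered = await index.query({ vector: queryEmbedding, topK: 3 });
  const filtered = await index.query({
    vector: queryEmbedding,
    topK: 3,
    filter: { agentId: { $eq: agentId } },
  });
  console.log("matches (no filter):", unfiltered.matches.length);
  console.log("matches (agent filter):", filtered.matches.length);
  // Matches without the filter but none with it → metadata filter mismatch.
}
```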