Sophia AI - System Overview & Architecture
Introduction
Sophia is an advanced AI service developed by SoftInstigate. It leverages Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) to provide intelligent support services for organizations and educational institutions.
System Architecture
The system consists of two main components:
Backend (sophia-restheart)
- Built on RESTHeart framework
- Java 21 with Maven build system
- Integration with AWS Bedrock using Claude model by Anthropic
- MongoDB database for data persistence
- Vector store for embedding storage and semantic search
- LangChain4j integration for AI/ML operations
Frontend (sophia-web)
- Angular web application
- Responsive design with customizable themes
- Real-time chat interface
- Multi-language support (Italian/English)
- Embeddable via iframe
Key Features
AI-Powered Conversations
Sophia uses Anthropic's Claude model, accessed through AWS Bedrock, to hold context-aware conversations. It follows a Retrieval-Augmented Generation (RAG) approach: relevant document segments are retrieved from the knowledge base at question time and passed to the model together with the question, so that answers are grounded in the organization's own content. Administrators can customize prompt templates to tailor the AI's behavior and tone to specific use cases, and the system manages chat history to keep the conversation context across turns.
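As an illustration of how retrieved segments, chat history, and a customizable template might come together, here is a minimal sketch in plain Java. It is not Sophia's actual implementation: the class name, the `{{context}}`, `{{history}}`, and `{{question}}` placeholders, and the template wording are all assumptions made for this example.

```java
import java.util.List;

/** Hypothetical helper that fills a prompt template for a RAG-style request. */
final class RagPromptSketch {

    /** One chat turn; the role names ("user", "assistant") are illustrative. */
    record Turn(String role, String text) {}

    // Assumed template; the placeholder names are invented for this sketch.
    static final String TEMPLATE = """
            You are a support assistant. Answer using only the context below.

            Context:
            {{context}}

            Conversation so far:
            {{history}}

            Question: {{question}}
            """;

    /** Keeps only the last maxTurns turns so the prompt size stays bounded. */
    static List<Turn> recentHistory(List<Turn> history, int maxTurns) {
        int from = Math.max(0, history.size() - maxTurns);
        return history.subList(from, history.size());
    }

    /** Substitutes the retrieved segments, the history, and the question into the template. */
    static String buildPrompt(String template, List<String> segments,
                              List<Turn> history, String question) {
        String context = String.join("\n---\n", segments);
        StringBuilder turns = new StringBuilder();
        for (Turn t : history) {
            turns.append(t.role()).append(": ").append(t.text()).append('\n');
        }
        // Simple sequential substitution; a real template engine would also
        // escape placeholder-like text inside the inserted values.
        return template
                .replace("{{context}}", context)
                .replace("{{history}}", turns.toString().stripTrailing())
                .replace("{{question}}", question);
    }
}
```

A caller would pass the segments returned by the semantic search and the trimmed history, then send the resulting string to the model. Trimming the history to a fixed number of turns is only one possible policy; the source does not say how Sophia bounds it.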
Knowledge Base Management
The platform manages documents in several file formats: TXT, MD, PDF, and HTML. Uploaded documents are automatically split into segments, converted to vector embeddings, and indexed to enable semantic search. Tag-based content partitioning lets administrators control access to specific information sets based on user roles and permissions.
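To make the combination of semantic search and tag-based partitioning concrete, the following sketch ranks segments by cosine similarity after filtering them by tag. It is an in-memory illustration under stated assumptions: the `Segment` record, the tag names, and the tiny two-dimensional embeddings are hypothetical, and a production system would delegate this work to its vector store.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Set;

/** Hypothetical in-memory search: filter by tag first, then rank by cosine similarity. */
final class TaggedVectorSearchSketch {

    /** A document segment with its embedding and the tags that partition access to it. */
    record Segment(String text, double[] embedding, Set<String> tags) {}

    /** Cosine similarity of two vectors of equal length. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) throw new IllegalArgumentException("dimension mismatch");
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) return 0; // treat zero vectors as unrelated
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns up to topK segments sharing at least one tag with allowedTags, most similar first. */
    static List<Segment> search(List<Segment> index, double[] query,
                                Set<String> allowedTags, int topK) {
        return index.stream()
                .filter(s -> s.tags().stream().anyMatch(allowedTags::contains))
                .sorted(Comparator
                        .comparingDouble((Segment s) -> cosine(query, s.embedding()))
                        .reversed())
                .limit(topK)
                .toList();
    }
}
```

Filtering before ranking means a segment outside the caller's tags can never appear in the results, however similar it is to the query.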
Security & Authentication
Authentication is based on JSON Web Tokens (JWT): API tokens are validated on every system interaction. The platform provides domain-specific access controls for knowledge bases, allowing organizations to segment information by user group or department. User session management secures access and maintains audit trails for administrative oversight.
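The signature check at the core of JWT validation can be sketched with the Java standard library alone. This example assumes HS256 (HMAC-SHA256 with a shared secret); the source does not state which signing algorithm Sophia uses, and the class and method names are hypothetical. A complete validator would also check the `alg` header and claims such as `exp`, which this sketch deliberately omits.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical HS256 signature check; algorithm and claim validation are omitted. */
final class JwtSignatureSketch {

    static byte[] hmacSha256(byte[] secret, String data) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            return mac.doFinal(data.getBytes(StandardCharsets.US_ASCII));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable or key rejected", e);
        }
    }

    /** Builds a signed token; used here only to produce input for the check below. */
    static String sign(String headerJson, String payloadJson, byte[] secret) {
        Base64.Encoder enc = Base64.getUrlEncoder().withoutPadding();
        String signingInput = enc.encodeToString(headerJson.getBytes(StandardCharsets.UTF_8))
                + "." + enc.encodeToString(payloadJson.getBytes(StandardCharsets.UTF_8));
        return signingInput + "." + enc.encodeToString(hmacSha256(secret, signingInput));
    }

    /** Returns true only if the token has three parts and its signature matches. */
    static boolean hasValidSignature(String token, byte[] secret) {
        String[] parts = token.split("\\.", -1);
        if (parts.length != 3) return false;
        try {
            byte[] expected = hmacSha256(secret, parts[0] + "." + parts[1]);
            byte[] actual = Base64.getUrlDecoder().decode(parts[2]);
            return MessageDigest.isEqual(expected, actual); // constant-time comparison
        } catch (IllegalArgumentException e) {
            return false; // signature part is not valid base64url
        }
    }
}
```

Comparing with `MessageDigest.isEqual` rather than `Arrays.equals` avoids leaking, through timing, how many leading bytes of a forged signature were correct.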
Real-time Communication
Real-time communication runs over WebSocket-based messaging. MongoDB Change Streams provide live updates, so users receive responses as soon as they are generated. Responses are streamed, allowing users to read an AI answer while it is still being composed, and connection status monitoring keeps the communication channel reliable.
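A client consuming such a stream could look roughly like the sketch below, which uses the WebSocket client from the Java standard library (`java.net.http`). The wire format is an assumption: the example treats every incoming text frame as a plain-text chunk of the answer, whereas the actual Sophia message format is not described here. The class names and the URL supplied by the caller are hypothetical.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.WebSocket;
import java.util.concurrent.CompletionStage;
import java.util.function.Consumer;

/** Hypothetical client that renders an answer incrementally as chunks arrive. */
final class StreamingChatClientSketch {

    /** Appends each received chunk and reports the text accumulated so far. */
    static final class AnswerAssembler implements WebSocket.Listener {
        private final StringBuilder answer = new StringBuilder();
        private final Consumer<String> onUpdate;

        AnswerAssembler(Consumer<String> onUpdate) {
            this.onUpdate = onUpdate;
        }

        /** Adds one chunk and notifies the callback with the full text so far. */
        void append(CharSequence chunk) {
            answer.append(chunk);
            onUpdate.accept(answer.toString());
        }

        @Override
        public CompletionStage<?> onText(WebSocket ws, CharSequence data, boolean last) {
            append(data);
            ws.request(1); // ask for the next frame
            return null;   // the data buffer may be reclaimed immediately
        }
    }

    /** Opens the connection and blocks until the handshake completes. */
    static WebSocket connect(String url, Consumer<String> onUpdate) {
        return HttpClient.newHttpClient()
                .newWebSocketBuilder()
                .buildAsync(URI.create(url), new AnswerAssembler(onUpdate))
                .join();
    }
}
```

With this design a user interface only needs to redraw the text passed to the callback, which is what lets the answer appear to grow while it is being generated.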
Technical Stack
Backend Technologies
The backend is built on the RESTHeart framework running on Java 21. MongoDB is the primary database and is compatible with MongoDB Atlas for cloud deployments. AI capabilities are provided by LangChain4j 0.36.2, which connects to AWS Bedrock to access the Claude language model. The application is built and managed with Maven, and security is handled through JWT authentication.
Frontend Technologies
The user interface is developed with the Angular framework and TypeScript, which provides type safety and improves maintainability. Styling is written in SCSS. Markdown rendering and code highlighting are handled by ngx-markdown with PrismJS integration, and FontAwesome supplies the icons used throughout the interface. The frontend build is managed through Angular CLI.
Deployment Architecture
Production Deployment
Sophia is designed for flexible deployment scenarios, supporting both cloud-native and on-premises installations. The system can be deployed using Docker containers for consistent environments across development, staging, and production. Load balancing capabilities ensure high availability and optimal performance under varying user loads.
Scalability Considerations
The architecture supports horizontal scaling through MongoDB clustering and RESTHeart’s stateless design. Vector search capabilities scale with document volume, while the frontend can be served via CDN for global performance optimization.
Integration Points
The system offers three integration options: REST API endpoints for backend integration, WebSocket connections for real-time features, and iframe embedding for frontend integration. Authentication can be integrated with existing organizational systems through JWT token validation.
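For backend integration, a call to the REST API with a JWT would typically carry the token in an `Authorization: Bearer` header. The sketch below builds such a request with the Java standard HTTP client. The `/chats` path, the JSON body, and the class name are placeholders chosen for illustration; they are not documented Sophia endpoints.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

/** Hypothetical request builder; the "/chats" path is a placeholder, not a documented endpoint. */
final class SophiaApiRequestSketch {

    /** Builds a JSON POST request that carries the JWT as a Bearer token. */
    static HttpRequest chatRequest(String baseUrl, String jwt, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/chats"))
                .timeout(Duration.ofSeconds(30))
                .header("Authorization", "Bearer " + jwt)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }
}
```

The request can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`.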