Sophia AI - System Overview & Architecture

Introduction

Sophia is an advanced AI service developed by SoftInstigate. It leverages Large Language Models (LLMs) with Retrieval-Augmented Generation (RAG) to provide intelligent support services for organizations and educational institutions. Sophia also supports Agentic Mode, where the AI autonomously uses tools to gather information before answering complex questions.

System Architecture

The system consists of two main components:

Backend (sophia-restheart)

  • Built on the RESTHeart framework

  • Java 25 with Maven build system

  • Integration with AWS Bedrock, using the Claude model by Anthropic

  • MongoDB database for data persistence

  • Vector store for embedding storage and semantic search

  • LangChain4j integration for AI/ML operations

  • Agentic loop with tool-use support (search, file retrieval)

Frontend (sophia-web)

  • Angular web application

  • Responsive design with customizable themes

  • Real-time chat interface with agentic event streaming

  • Admin panel for managing contexts, knowledge, segments, and API tokens

  • Multi-language support (Italian/English)

  • Embeddable via iframe

Key Features

AI-Powered Conversations

Sophia uses the Claude model on AWS Bedrock to deliver context-aware conversations. It follows a Retrieval-Augmented Generation (RAG) approach: for each question, relevant document segments are retrieved from the knowledge base and passed to the model together with the prompt. Administrators can customize prompt templates to tailor the AI’s behavior and tone, and the system carries conversation context across turns by managing the chat history.
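
To make the interplay of prompt template, retrieved documents, and chat history concrete, here is a minimal sketch in Python (chosen for brevity; the actual backend is Java with LangChain4j). The template text, the placeholder names `{context}`, `{history}`, and `{question}`, and the `max_history_turns` parameter are all assumptions made for illustration, not Sophia's actual configuration.

```python
from typing import List, Tuple

# Hypothetical template; the placeholder names are assumptions for this sketch.
DEFAULT_TEMPLATE = (
    "You are a helpful assistant. Answer using only the context below.\n\n"
    "Context:\n{context}\n\n"
    "Conversation so far:\n{history}\n\n"
    "Question: {question}"
)


def build_prompt(
    question: str,
    segments: List[str],
    history: List[Tuple[str, str]],
    template: str = DEFAULT_TEMPLATE,
    max_history_turns: int = 4,
) -> str:
    """Fill a prompt template with retrieved segments and recent chat history."""
    # Keep only the most recent turns so the prompt stays bounded.
    recent = history[-max_history_turns:] if max_history_turns > 0 else []
    history_text = "\n".join(f"{role}: {text}" for role, text in recent) or "(none)"
    context_text = "\n---\n".join(segments) or "(no documents retrieved)"
    return template.format(
        context=context_text, history=history_text, question=question
    )
```

Trimming the history to the last few turns is one simple way to keep the prompt within the model's input limit; other strategies, such as summarizing older turns, would fit the same interface.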

Agentic Mode

When enabled on a context, Sophia operates as an autonomous agent that iteratively uses tools — searching the knowledge base, reading files, and gathering information — before composing a final response. The agent loop runs up to a configurable number of iterations, and tool execution events can be streamed to the client in real time so users see the reasoning process as it happens.
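
The bounded agent loop described above can be sketched as follows. This is an illustrative Python outline, not Sophia's implementation: `ModelTurn`, `run_agent`, the scripted stand-in for the LLM, and the fallback message are hypothetical names and choices.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class ModelTurn:
    """One model response: either a tool request or a final answer."""
    tool_name: Optional[str] = None
    tool_args: Optional[dict] = None
    answer: Optional[str] = None


Transcript = List[Tuple[str, str]]


def run_agent(
    question: str,
    model: Callable[[Transcript], ModelTurn],
    tools: Dict[str, Callable[..., str]],
    max_iterations: int = 5,
) -> str:
    """Call tools until the model answers or the iteration limit is hit."""
    transcript: Transcript = [("user", question)]
    for _ in range(max_iterations):
        turn = model(transcript)
        if turn.answer is not None:
            return turn.answer
        tool = tools.get(turn.tool_name or "")
        if tool is None:
            observation = f"unknown tool: {turn.tool_name}"
        else:
            observation = tool(**(turn.tool_args or {}))
        # Feed the tool result back so the next model call can use it.
        transcript.append(("tool", f"{turn.tool_name}: {observation}"))
    return "I could not complete the request within the iteration limit."


def scripted_model(transcript: Transcript) -> ModelTurn:
    """Stand-in for the LLM: search once, then answer from the result."""
    if not any(role == "tool" for role, _ in transcript):
        return ModelTurn(tool_name="search", tool_args={"query": transcript[0][1]})
    return ModelTurn(answer=f"Based on: {transcript[-1][1]}")


tools = {"search": lambda query: f"1 document matching '{query}'"}
print(run_agent("opening hours", scripted_model, tools))
```

The iteration cap guarantees termination even if the model keeps requesting tools, which is why it makes sense for the limit to be configurable per context.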

Knowledge Base Management

The platform manages documents in TXT, MD, PDF, and HTML formats. Uploaded documents are automatically converted into vector embeddings with Amazon Titan Embed Text v2 so that they can be found through semantic search. The admin panel provides a web interface for uploading, browsing (flat and tree views), filtering, and deleting documents.
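
As a rough illustration of the ingestion step, the sketch below checks the file type and splits text into overlapping segments that would then be sent to the embedding model. The source does not describe how Sophia segments documents, so the fixed-size strategy, the `max_chars` and `overlap` values, and the function names are assumptions. The embedding call itself is left out.

```python
from pathlib import PurePath
from typing import List

# File types listed in the documentation.
SUPPORTED_EXTENSIONS = {".txt", ".md", ".pdf", ".html"}


def is_supported(filename: str) -> bool:
    """Return True if the file extension is one of the supported types."""
    return PurePath(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def split_into_segments(text: str, max_chars: int = 200, overlap: int = 40) -> List[str]:
    """Split text into fixed-size segments that share `overlap` characters."""
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must be non-negative and smaller than max_chars")
    segments = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        segments.append(text[start:end])
        if end == len(text):
            break
        # Step back by `overlap` so neighbouring segments share some context.
        start = end - overlap
    return segments
```

Overlap helps a sentence that straddles a boundary remain retrievable from at least one segment, at the cost of storing some text twice.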

Context-Based Knowledge Segregation

Contexts are the core abstraction for knowledge partitioning. Each context defines a set of tags that act as mandatory filters on every vector search — ensuring that users in one context cannot access documents belonging to another. This provides enterprise-grade data isolation without requiring separate deployments.
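
The effect of a mandatory tag filter can be shown with a small in-memory example. This is a sketch under stated assumptions: the `Segment` type, the toy two-dimensional embeddings, and the rule that a segment is visible when it shares at least one tag with the context are illustrative choices, since the source does not specify the exact matching semantics.

```python
import math
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence, Set


@dataclass(frozen=True)
class Segment:
    text: str
    tags: FrozenSet[str]
    embedding: Sequence[float]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(
    query_embedding: Sequence[float],
    segments: List[Segment],
    context_tags: Set[str],
    top_k: int = 3,
) -> List[Segment]:
    """Rank only the segments visible to the context (assumed: any shared tag)."""
    # The tag filter runs before ranking, so hidden segments can never surface.
    visible = [s for s in segments if s.tags & context_tags]
    ranked = sorted(
        visible, key=lambda s: cosine(query_embedding, s.embedding), reverse=True
    )
    return ranked[:top_k]


segments = [
    Segment("HR leave policy", frozenset({"hr"}), (1.0, 0.0)),
    Segment("Sales pricing sheet", frozenset({"sales"}), (0.9, 0.1)),
    Segment("HR onboarding guide", frozenset({"hr"}), (0.0, 1.0)),
]
for hit in search((1.0, 0.0), segments, {"hr"}):
    print(hit.text)
```

In this example the sales document is excluded even though its embedding is close to the query, because the filter is applied before similarity ranking rather than after it.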

Security & Authentication

Authentication uses cookie-based sessions for the web interface and JWT bearer tokens for API and MCP access. Role-based access control distinguishes between admin users (who manage contexts, knowledge, and tokens) and regular users (who interact with the chat). API tokens can be bound to specific contexts and tags, ensuring that programmatic access respects the same knowledge segregation boundaries.

Admin Panel

A web-based administration interface provides full management capabilities:

  • Contexts: Create, edit, and delete contexts with prompt templates, tag filters, RAG options, and agentic mode settings

  • Knowledge: Upload, browse, tag, and delete documents with flat and tree views

  • Segments: Inspect text segments, test semantic search, and debug retrieval quality

  • API Tokens: Issue, revoke, and delete JWT tokens with auto-generated MCP configuration snippets

Real-time Communication

WebSocket-based messaging with MongoDB Change Streams provides live updates. Streaming response delivery allows users to see AI responses as they’re composed. In agentic mode, tool execution events are streamed in real time, showing tool names, arguments, results, and durations.
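
A tool execution event carrying the fields mentioned above (tool name, arguments, result, duration) might be produced along these lines. The event type names, the JSON field names such as `durationMs`, and the generator-based delivery are assumptions for this sketch. The real system delivers events over WebSocket, which is not modelled here.

```python
import json
import time
from typing import Any, Callable, Dict, Iterator


def run_tool_with_events(
    name: str, arguments: Dict[str, Any], tool: Callable[..., Any]
) -> Iterator[str]:
    """Run one tool call, yielding JSON events before and after execution."""
    # Emitted first so the client can show that the tool is running.
    yield json.dumps({"type": "tool_start", "tool": name, "arguments": arguments})
    started = time.perf_counter()
    result = tool(**arguments)
    duration_ms = round((time.perf_counter() - started) * 1000, 3)
    yield json.dumps(
        {"type": "tool_end", "tool": name, "result": result, "durationMs": duration_ms}
    )


for event in run_tool_with_events("search", {"query": "refund policy"}, lambda query: [query.upper()]):
    print(event)
```

Yielding the start event before the tool runs is what lets a client display progress while a slow tool is still executing, instead of only seeing the finished result.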

Technical Stack

Backend Technologies

The backend is built on the RESTHeart framework and runs on Java 25. MongoDB serves as the primary database, with Atlas compatibility for cloud deployments. LangChain4j integrates with AWS Bedrock for Claude LLM access and Amazon Titan for embeddings. Security is handled through JWT authentication with cookie-based session support.

Frontend Technologies

The frontend is built with Angular 21 using standalone components and signals for state management. Tailwind CSS v4 provides styling, with spartan-ng UI components. The admin panel uses responsive layouts with mobile support.

Deployment Architecture

Production Deployment

Sophia is deployed as a managed RESTHeart Cloud service. The system uses Docker containers for consistent environments. Because it runs as a cloud service, infrastructure concerns (scaling, monitoring, updates) are handled automatically.

Integration Points

  • REST API endpoints for backend integration

  • WebSocket connections for real-time chat

  • MCP server for AI client integration (Claude Desktop, Cursor, VS Code)

  • iframe embedding for frontend integration

  • JWT tokens for programmatic access
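
For programmatic access, a client combines the REST API with a JWT bearer token. The sketch below only builds the request and does not send it. The `/chat` path, the JSON field names, and the example host are hypothetical placeholders, not documented Sophia endpoints, so consult the actual API reference for the real ones.

```python
import json
import urllib.request


def build_chat_request(
    base_url: str, token: str, context_id: str, question: str
) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST request."""
    # Hypothetical payload shape; the field names are assumptions.
    body = json.dumps({"context": context_id, "prompt": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/chat",  # hypothetical endpoint path
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_chat_request(
    "https://sophia.example.com/", "<api-token>", "helpdesk", "What are the opening hours?"
)
print(request.get_method(), request.full_url)
```

Passing the result to `urllib.request.urlopen` would send it; that step is left out because the endpoint is only a placeholder.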