System Architecture

Understand how Cogira is built: a modular platform designed to stay flexible, scale easily, and give you full control over your AI setup.

Architecture Overview

How It All Fits Together

Frontend Layer

A modern, fast web application that provides a smooth interface for managing documents, research, and AI conversations. It offers a clear overview of datasets, files, users, groups, AI models, and system settings.

API & Business Logic

The backend provides the main API and handles authentication, workspaces, datasets, and communication with AI providers and databases. It acts as the central coordination layer of the system.

Worker Infrastructure

Background workers handle long-running tasks such as processing documents, splitting them into chunks, generating embeddings, and making them searchable. Because document processing can be resource-intensive, it runs in an isolated worker layer so that working with already processed data stays fast and responsive.
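
The worker steps described above (split, embed, index) can be sketched roughly as follows. This is a minimal illustration, not Cogira's actual implementation: the names `Chunk`, `split_into_chunks`, and `process_document`, the character-based window sizes, and the plain dictionary used as an index are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Chunk:
    doc_id: str
    index: int
    text: str


def split_into_chunks(doc_id: str, text: str, size: int = 200, overlap: int = 50) -> list[Chunk]:
    """Split text into fixed-size character windows that overlap by `overlap` characters."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    step = size - overlap
    chunks: list[Chunk] = []
    for index, start in enumerate(range(0, len(text), step)):
        chunks.append(Chunk(doc_id, index, text[start:start + size]))
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def process_document(
    doc_id: str,
    text: str,
    embed: Callable[[str], list[float]],  # hypothetical embedding function
    index: dict,                          # stand-in for a vector database
    size: int = 200,
    overlap: int = 50,
) -> int:
    """Chunk a document, embed each chunk, and store the result so it can be searched."""
    chunks = split_into_chunks(doc_id, text, size, overlap)
    for chunk in chunks:
        index[(chunk.doc_id, chunk.index)] = (embed(chunk.text), chunk.text)
    return len(chunks)
```

In a real deployment, `process_document` would presumably be triggered by a job queue rather than called directly, which is what keeps heavy processing away from the request path.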

Data Layer

A relational database stores structured data, a vector database provides fast semantic search, and the storage layer offers S3-compatible file storage. Both the vector database and storage layer can use Cogira’s managed services or be replaced with your own infrastructure if needed.

Built for Flexibility

Cogira is designed to avoid lock-in. Most parts of the system can be replaced or configured to match your needs.

AI Model Freedom

Use your own API keys for providers like OpenAI, Google Vertex AI, or Voyage AI, or run local models. Models can be switched across different tasks without any code changes.
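
One way to picture switching models per task without code changes is a task-to-model mapping that lives in configuration. The sketch below is an assumption about how such a mapping could look, not Cogira's actual settings format; the task names, provider identifiers, and model names are placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelChoice:
    provider: str  # e.g. "openai", "vertex", "voyage", "local" (illustrative identifiers)
    model: str     # placeholder model name, not a real model ID


# Hypothetical defaults; in practice these would come from workspace settings.
DEFAULT_MODELS = {
    "chat": ModelChoice("openai", "example-chat-model"),
    "embedding": ModelChoice("voyage", "example-embedding-model"),
}


def resolve_model(task: str, overrides: Optional[dict] = None) -> ModelChoice:
    """Return the model configured for a task, letting overrides replace defaults."""
    merged = {**DEFAULT_MODELS, **(overrides or {})}
    try:
        return merged[task]
    except KeyError:
        raise ValueError(f"no model configured for task {task!r}") from None
```

Swapping the embedding task to a local model is then a configuration change only, for example `resolve_model("embedding", {"embedding": ModelChoice("local", "example-local-model")})`.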

Vector Database Choice

Cogira uses managed Qdrant by default and also supports Weaviate. Other vector databases can be integrated in the future.
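
Supporting several vector databases usually comes down to coding against a small common interface. The sketch below shows that idea with a hypothetical `VectorStore` protocol and an in-memory stand-in that ranks by cosine similarity. It does not use the Qdrant or Weaviate client libraries, and the method names are assumptions for illustration.

```python
import math
from typing import Protocol


class VectorStore(Protocol):
    """Hypothetical interface that each database adapter would implement."""

    def upsert(self, item_id: str, vector: list[float]) -> None: ...

    def search(self, query: list[float], limit: int) -> list[tuple[str, float]]: ...


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    """Toy implementation, useful for tests; a real adapter would call a database."""

    def __init__(self) -> None:
        self._vectors: dict[str, list[float]] = {}

    def upsert(self, item_id: str, vector: list[float]) -> None:
        self._vectors[item_id] = vector

    def search(self, query: list[float], limit: int) -> list[tuple[str, float]]:
        scored = [(item_id, cosine_similarity(query, vec)) for item_id, vec in self._vectors.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)  # most similar first
        return scored[:limit]
```

Code that depends only on `VectorStore` does not need to change when the backing database does.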

S3-Compatible Storage

Any S3-compatible service can be used for file storage. Files and documents are stored separately from the application itself.
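
With S3-compatible services, switching providers is typically a matter of pointing the client at a different endpoint. The helper below is a sketch under that assumption: the `STORAGE_*` setting names are hypothetical, and the returned keys match the keyword arguments that `boto3.client("s3", ...)` accepts.

```python
from typing import Mapping


def s3_client_kwargs(settings: Mapping[str, str]) -> dict[str, str]:
    """Build S3 client keyword arguments from settings (setting names are illustrative)."""
    required = ("STORAGE_ENDPOINT_URL", "STORAGE_ACCESS_KEY", "STORAGE_SECRET_KEY")
    missing = [name for name in required if not settings.get(name)]
    if missing:
        raise ValueError(f"missing storage settings: {', '.join(missing)}")
    return {
        "endpoint_url": settings["STORAGE_ENDPOINT_URL"],  # the only provider-specific part
        "aws_access_key_id": settings["STORAGE_ACCESS_KEY"],
        "aws_secret_access_key": settings["STORAGE_SECRET_KEY"],
        "region_name": settings.get("STORAGE_REGION", "us-east-1"),
    }
```

If boto3 is installed, a client could then be created with `boto3.client("s3", **s3_client_kwargs(os.environ))`.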

Platform Capabilities

This architecture enables more than simple document search: it also supports advanced AI workflows.

Isolated Workspaces

Each workspace is fully separated with its own data, permissions, and AI settings. Users can belong to multiple workspaces with role-based access control.
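
A role-based check across multiple workspaces can be sketched as below. The role names (`VIEWER`, `EDITOR`, `ADMIN`), their ordering, and the membership table are assumptions for illustration; the source does not specify which roles Cogira defines.

```python
from enum import IntEnum


class Role(IntEnum):
    # Hypothetical roles, ordered from least to most privileged.
    VIEWER = 1
    EDITOR = 2
    ADMIN = 3


# Maps (user, workspace) to the role held in that workspace.
Memberships = dict[tuple[str, str], Role]


def has_access(user: str, workspace: str, required: Role, memberships: Memberships) -> bool:
    """Return True if the user holds at least the required role in the workspace."""
    role = memberships.get((user, workspace))
    return role is not None and role >= required
```

Because the lookup is keyed by workspace, the same user can be an admin in one workspace and a viewer in another, and has no access at all to workspaces they are not a member of.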

Streaming Responses

Search results and chat responses are streamed in real time, allowing users to see output as it is generated. Per-operation cost tracking is also supported.
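
Streaming and per-operation cost tracking combine naturally in a generator that forwards each piece of output as it arrives while updating a running total. This is a minimal sketch: `OperationCost`, the flat per-token price, and counting one token per streamed piece are simplifying assumptions, not Cogira's actual accounting.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class OperationCost:
    operation: str
    price_per_1k_tokens: float = 0.0  # hypothetical flat price
    output_tokens: int = 0

    @property
    def total(self) -> float:
        return self.output_tokens / 1000 * self.price_per_1k_tokens


def stream_with_cost(tokens: Iterable[str], cost: OperationCost) -> Iterator[str]:
    """Yield tokens as they arrive and count them against the given operation."""
    for token in tokens:
        cost.output_tokens += 1
        yield token  # the caller can forward this to the client immediately
```

The caller iterates over `stream_with_cost(...)` to send output to the user and reads `cost.total` once the stream ends.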

Agentic Deep Research

Complex queries are automatically broken down, multiple data sources are searched, and results are combined into structured answers with APA 7 citations.
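
The decompose, search, and combine flow described above might look roughly like the sketch below. Everything here is illustrative: `deep_research`, the injected `decompose` function (which in practice would likely be a model call), and the hit format with `title` and `text` keys are assumptions, and citation formatting is left out.

```python
from typing import Callable

Hit = dict[str, str]                   # assumed shape: {"title": ..., "text": ...}
SearchFn = Callable[[str], list[Hit]]  # one search function per data source


def deep_research(
    query: str,
    decompose: Callable[[str], list[str]],
    sources: dict[str, SearchFn],
) -> list[dict[str, str]]:
    """Break a query into sub-questions, search every source, and merge unique hits."""
    findings: list[dict[str, str]] = []
    seen: set[tuple[str, str]] = set()
    for sub_question in decompose(query):
        for source_name, search in sources.items():
            for hit in search(sub_question):
                key = (source_name, hit["title"])
                if key in seen:
                    continue  # the same document was already found for an earlier sub-question
                seen.add(key)
                findings.append({"question": sub_question, "source": source_name, **hit})
    return findings
```

Each finding keeps its source and the sub-question that surfaced it, which is the information a later step would need to build a structured, cited answer.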

Multimodal Document Processing

Document processing understands structure, extracts tables and images, and can generate descriptions for visual elements, making everything searchable.