aPersona System Architecture

Overview

aPersona is a fully local, AI-powered personal assistant designed to work entirely offline while providing intelligent, context-aware assistance based on your personal files and behavior patterns.

Core Principles

  • 100% Local: No data leaves your device
  • Privacy-First: All processing happens on your machine
  • Adaptive Learning: Continuously improves based on your interactions
  • Context-Aware: Understands your personal documents and preferences

System Architecture

Backend (Python FastAPI)

```text
backend/
├── app/
│   ├── api/           # REST API endpoints
│   ├── core/          # Core configuration and security
│   ├── db/            # Database models and connections
│   └── services/      # Business logic services
├── ai_core/           # AI/ML components
│   ├── embeddings/    # Text embedding service
│   ├── llm/           # Local LLM integration (Ollama)
│   ├── rag/           # Retrieval-Augmented Generation
│   └── auto_learning/ # Adaptive learning engine
└── requirements.txt
```

Key Components

  1. FastAPI Application: RESTful API server
  2. SQLAlchemy ORM: Database management with SQLite
  3. Authentication: JWT-based user authentication
  4. File Processing: Multi-format document processing
  5. Vector Database: ChromaDB for semantic search
  6. Local LLM: Ollama integration for AI responses

Frontend (React + TypeScript)

```text
frontend/
├── src/
│   ├── components/    # Reusable UI components
│   ├── pages/         # Page-level components
│   ├── services/      # API service layer
│   ├── store/         # State management (Zustand)
│   └── utils/         # Utility functions
├── index.html
└── package.json
```

Key Technologies

  1. React 18: Modern UI framework
  2. TypeScript: Type-safe development
  3. TailwindCSS: Utility-first styling
  4. Vite: Fast build tool and dev server
  5. React Query: Server state management
  6. Zustand: Client state management

AI Core Components

1. Embedding Service (ai_core/embeddings/)

  • Purpose: Convert text to numerical vectors for semantic search
  • Model: SentenceTransformers (all-MiniLM-L6-v2)
  • Features:
    • Caching for performance
    • Batch processing
    • Similarity computation
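The similarity computation the service exposes is standard cosine similarity over the MiniLM vectors. A minimal pure-Python sketch (the real service operates on `sentence-transformers` output; `cosine_similarity` here is an illustrative helper, not the actual method name):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Vectors pointing the same way score 1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```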

2. Vector Store (ai_core/rag/)

  • Purpose: Store and search document embeddings
  • Technology: ChromaDB with persistent storage
  • Capabilities:
    • Semantic similarity search
    • Metadata filtering
    • User-specific collections
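The query pattern ChromaDB provides — similarity ranking plus metadata filtering over per-user collections — can be illustrated with a toy in-memory stand-in (`InMemoryVectorStore` and its method names are illustrative, not the real ChromaDB API):

```python
import math

def _cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

class InMemoryVectorStore:
    """Toy stand-in for a ChromaDB collection: vectors plus metadata."""

    def __init__(self):
        self.records = []  # (embedding, text, metadata) triples

    def add(self, embedding, text, metadata):
        self.records.append((embedding, text, metadata))

    def search_similar(self, query_embedding, top_k=3, where=None):
        """Rank by cosine similarity, optionally filtering on metadata first."""
        candidates = [
            (emb, text, meta) for emb, text, meta in self.records
            if where is None or all(meta.get(k) == v for k, v in where.items())
        ]
        candidates.sort(key=lambda r: _cosine(query_embedding, r[0]), reverse=True)
        return [(text, meta) for _, text, meta in candidates[:top_k]]

store = InMemoryVectorStore()
store.add([1.0, 0.0], "tax return 2023", {"user_id": 1})
store.add([0.0, 1.0], "holiday photos", {"user_id": 1})
store.add([0.9, 0.1], "tax return 2022", {"user_id": 2})

# The metadata filter keeps the search inside user 1's collection.
print(store.search_similar([1.0, 0.1], where={"user_id": 1}))
```

The same `where`-style filter is what keeps one user's documents out of another user's search results.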

3. LLM Integration (ai_core/llm/)

  • Purpose: Local language model integration
  • Technology: Ollama (supports Mistral, LLaMA, etc.)
  • Features:
    • Streaming responses
    • Context management
    • Error handling
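Ollama streams generations as newline-delimited JSON objects, each carrying a partial `response` field and a final object with `done: true`. A sketch of reassembling such a stream (the HTTP request itself is omitted; `assemble_stream` is a hypothetical helper, shown here on a canned stream):

```python
import json

def assemble_stream(ndjson_lines):
    """Join the incremental "response" fields of an Ollama-style NDJSON
    stream (one JSON object per line, last object has "done": true)."""
    parts = []
    for line in ndjson_lines:
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)

# A canned stream, shaped like /api/generate output:
stream = [
    '{"response": "Hello", "done": false}',
    '{"response": ", world", "done": false}',
    '{"response": "!", "done": true}',
]
print(assemble_stream(stream))  # Hello, world!
```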

4. File Processing (ai_core/file_processing/)

  • Supported Formats: PDF, DOCX, TXT, Images (OCR), Markdown
  • Features:
    • Content extraction
    • Auto-categorization
    • Metadata extraction
    • Text chunking for embeddings
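Chunking for embeddings can be sketched as fixed-size windows with overlap, so a sentence cut at one boundary still appears whole in the neighbouring chunk. The sizes below are illustrative defaults, not the project's actual settings:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split extracted text into overlapping chunks for embedding."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap  # each window starts this far after the last
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

chunks = chunk_text("a" * 1200, chunk_size=500, overlap=50)
print(len(chunks), [len(c) for c in chunks])  # 3 [500, 500, 300]
```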

Auto-Learning System

The auto-learning module is the heart of aPersona's intelligence, continuously adapting to user behavior and preferences.

Learning Components

1. Interaction Analysis

```python
class LearningEngine:
    async def analyze_user_interactions(self, user_id: int):
        """Analyze patterns in a user's queries and responses:

        - frequency patterns
        - topic preferences
        - response quality metrics
        - search patterns
        - time-based usage patterns
        """
        ...
```

2. Preference Learning

The system learns user preferences across multiple dimensions:

  • Response Style: Concise vs. detailed responses
  • Topic Interests: Frequently discussed subjects
  • Time Patterns: When user is most active
  • File Usage: Most accessed documents
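One plausible way to fold such signals into a stable preference score is an exponential moving average, which nudges the score toward recent observations without letting any single interaction dominate. This is an assumption about the mechanism, not the engine's documented update rule:

```python
def update_preference(current: float, observation: float,
                      learning_rate: float = 0.2) -> float:
    """Exponential moving average: blend the latest observation into
    the learned score, weighted by the learning rate."""
    return (1 - learning_rate) * current + learning_rate * observation

# Repeated "detailed response preferred" signals (1.0) pull the score up slowly.
score = 0.5
for _ in range(3):
    score = update_preference(score, 1.0)
print(round(score, 3))  # 0.744
```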

3. Adaptive Prompting

```python
async def generate_personalized_prompt(self, user_id: int, base_prompt: str) -> str:
    """Create a personalized system prompt from learned preferences:

    - the user's communication style
    - preferred response length
    - topic expertise areas
    - context preferences
    """
    ...
```

4. Proactive Suggestions

The system generates intelligent suggestions:

  • Reminder Optimization: Suggests optimal reminder times
  • File Organization: Proposes file organization improvements
  • Content Discovery: Recommends related documents
  • Workflow Improvements: Suggests process optimizations

Learning Data Flow

```mermaid
graph TD
    A[User Interaction] --> B[Store Interaction Data]
    B --> C[Analyze Patterns]
    C --> D[Update Preferences]
    D --> E[Generate Personalized Prompts]
    E --> F[Improve Responses]
    F --> G[Collect Feedback]
    G --> A
```

Learning Metrics

  1. Confidence Scores: How certain the system is about preferences
  2. Success Rates: Effectiveness of learned patterns
  3. Usage Counts: Frequency of pattern application
  4. Feedback Integration: How user satisfaction is folded back into learned patterns

Data Storage

Database Schema

Core Tables

  1. Users: User accounts and authentication
  2. UserFiles: Uploaded files and metadata
  3. UserInteractions: All user-AI interactions
  4. UserPreferences: Learned user preferences
  5. LearningPatterns: Detected behavioral patterns
  6. Reminders: User reminders and notifications

Vector Storage

  • ChromaDB Collections: Document embeddings with metadata
  • User-Specific Collections: Isolated data per user
  • Embedding Cache: Local cache for faster processing

Security & Privacy

Data Protection

  1. Local Storage: All data remains on user's device
  2. Encrypted Authentication: JWT tokens with secure hashing
  3. No External APIs: No cloud dependencies
  4. User Data Isolation: Multi-user support with data separation
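The backend presumably uses a JWT library for this, but the signing-and-verification shape can be sketched with the standard library alone. The secret and claims below are illustrative; the scheme is HS256-style HMAC signing:

```python
import base64
import hashlib
import hmac
import json

SECRET = b"local-only-secret"  # illustrative; real secret lives in local config

def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_token(payload: dict) -> str:
    """HS256-style token: header.payload.signature, HMAC-signed locally."""
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64(json.dumps(payload).encode())
    sig = _b64(hmac.new(SECRET, f"{header}.{body}".encode(), hashlib.sha256).digest())
    return f"{header}.{body}.{sig}"

def verify_token(token: str) -> bool:
    """Recompute the signature and compare in constant time."""
    header, body, sig = token.split(".")
    expected = _b64(hmac.new(SECRET, f"{header}.{body}".encode(), hashlib.sha256).digest())
    return hmac.compare_digest(sig, expected)

token = sign_token({"sub": 1})
parts = token.split(".")
tampered = ".".join([parts[0], parts[1], "A" * len(parts[2])])
print(verify_token(token), verify_token(tampered))  # True False
```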

File Security

  1. Access Controls: User-based file access
  2. Secure Upload: File validation and sanitization
  3. Safe Processing: Sandboxed file processing
  4. Cleanup: Temporary file management

RAG (Retrieval-Augmented Generation) System

How It Works

  1. Document Ingestion:

    • Files are processed and chunked
    • Text is converted to embeddings
    • Metadata is extracted and stored
  2. Query Processing:

    • User query is embedded
    • Semantic search finds relevant chunks
    • Context is assembled for LLM
  3. Response Generation:

    • LLM receives query + relevant context
    • Personalized prompts are applied
    • Response is generated and returned
  4. Learning Loop:

    • User feedback is collected
    • Patterns are analyzed
    • System adapts for future queries

Context Assembly

```python
def assemble_context(user_id, query_embedding, user_preferences, base_prompt):
    # Find relevant documents
    relevant_docs = vector_store.search_similar(query_embedding)

    # Apply user preferences
    context = personalize_context(relevant_docs, user_preferences)

    # Generate a personalized system prompt
    system_prompt = generate_personalized_prompt(user_id, base_prompt)

    return context, system_prompt
```

Performance Optimizations

Embedding Cache

  • Local caching of text embeddings
  • Significant performance improvement for repeated content
  • Automatic cache management
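A content-hash-keyed cache captures the idea: identical text is embedded once, and every later request for it is a cache hit. The class and its names are illustrative, not the project's actual cache:

```python
import hashlib

class EmbeddingCache:
    """Cache embeddings keyed by a hash of the text, so re-indexing
    identical content never recomputes the vector."""

    def __init__(self, embed_fn):
        self._embed = embed_fn
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, text: str):
        key = hashlib.sha256(text.encode()).hexdigest()
        if key in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[key] = self._embed(text)
        return self._store[key]

calls = []
def fake_embed(text):
    calls.append(text)            # records each (expensive) model call
    return [float(len(text))]     # stand-in for a real embedding

cache = EmbeddingCache(fake_embed)
cache.get("hello")
cache.get("hello")  # served from cache; the model is not called again
print(len(calls), cache.hits, cache.misses)  # 1 1 1
```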

Batch Processing

  • Process multiple files simultaneously
  • Batch embedding generation
  • Efficient database operations

Background Tasks

  • Asynchronous file processing
  • Background learning analysis
  • Scheduled maintenance tasks
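Asynchronous file processing amounts to scheduling per-file coroutines concurrently instead of handling files one at a time inside a request. A minimal `asyncio` sketch, with the processing body reduced to a placeholder:

```python
import asyncio

async def process_file(name: str) -> str:
    # Placeholder for extraction + chunking + embedding of one file.
    await asyncio.sleep(0)
    return f"processed {name}"

async def main():
    # Files are processed concurrently rather than sequentially.
    return await asyncio.gather(*(process_file(n) for n in ["a.pdf", "b.md"]))

print(asyncio.run(main()))  # ['processed a.pdf', 'processed b.md']
```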

Deployment Architecture

Local Development

```bash
# Backend
cd backend && python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
uvicorn app.main:app --reload

# Frontend
cd frontend && npm install
npm run dev

# AI Services
ollama serve
ollama pull mistral
ollama pull nomic-embed-text
```

Production Deployment

  • Containerization: Docker support for easy deployment
  • Service Management: Systemd service files
  • Automatic Updates: Self-updating mechanisms
  • Backup System: Automated data backups

Extending the System

Adding New File Types

  1. Implement processor in ai_core/file_processing/
  2. Add MIME type mapping
  3. Update file upload validation
  4. Test with sample files
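The MIME-type mapping in step 2 can be a simple registry that dispatches each upload to its processor. The decorator and names below are illustrative, not the project's actual API:

```python
# Hypothetical registry: each processor maps raw bytes to extracted text.
PROCESSORS = {}

def register_processor(mime_type: str):
    """Decorator mapping a MIME type to its content extractor."""
    def wrap(fn):
        PROCESSORS[mime_type] = fn
        return fn
    return wrap

@register_processor("text/markdown")
def process_markdown(data: bytes) -> str:
    return data.decode("utf-8")

def extract_content(mime_type: str, data: bytes) -> str:
    """Dispatch to the registered processor, or reject unknown types."""
    if mime_type not in PROCESSORS:
        raise ValueError(f"unsupported file type: {mime_type}")
    return PROCESSORS[mime_type](data)

print(extract_content("text/markdown", b"# Notes"))  # # Notes
```

Adding a format then only requires registering one more processor; upload validation can check membership in `PROCESSORS`.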

Adding New Learning Patterns

  1. Extend LearningEngine class
  2. Add new pattern types
  3. Implement analysis logic
  4. Update preference storage

Custom LLM Integration

  1. Implement LLM client interface
  2. Add configuration options
  3. Update prompt generation
  4. Test with target model
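The client interface from step 1 can be expressed as a `typing.Protocol`, so any backend exposing the same methods slots in behind the existing call sites. The method names are assumptions about the interface, not the project's actual one:

```python
from typing import Iterable, Protocol

class LLMClient(Protocol):
    """Minimal interface a new backend would satisfy to replace Ollama."""

    def generate(self, prompt: str, system: str = "") -> str: ...
    def stream(self, prompt: str, system: str = "") -> Iterable[str]: ...

class EchoClient:
    """Trivial conforming implementation, used here only as a placeholder."""

    def generate(self, prompt: str, system: str = "") -> str:
        return "".join(self.stream(prompt, system))

    def stream(self, prompt: str, system: str = ""):
        yield from ("echo: ", prompt)

client: LLMClient = EchoClient()
print(client.generate("hi"))  # echo: hi
```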

Monitoring & Analytics

System Health

  • AI service availability
  • Database performance
  • File processing status
  • Memory and disk usage

User Analytics

  • Interaction frequency
  • Learning effectiveness
  • Feature usage patterns
  • System performance metrics

Future Enhancements

Planned Features

  1. Multi-modal Support: Image understanding and generation
  2. Voice Interface: Speech-to-text and text-to-speech
  3. Advanced Scheduling: Calendar integration and smart scheduling
  4. Team Features: Shared knowledge bases (while maintaining privacy)
  5. Mobile App: Native mobile applications
  6. Plugin System: Extensible plugin architecture

Research Areas

  1. Federated Learning: Improve models without data sharing
  2. Advanced RAG: More sophisticated retrieval strategies
  3. Multi-agent Systems: Specialized AI agents for different tasks
  4. Continuous Learning: Real-time model adaptation

This architecture ensures aPersona remains a powerful, private, and continuously improving personal AI assistant that truly understands and adapts to each user's unique needs and preferences.