# aPersona System Architecture
## Overview
aPersona is a fully local, AI-powered personal assistant designed to work entirely offline while providing intelligent, context-aware assistance based on your personal files and behavior patterns.
## Core Principles
- **100% Local**: No data leaves your device
- **Privacy-First**: All processing happens on your machine
- **Adaptive Learning**: Continuously improves based on your interactions
- **Context-Aware**: Understands your personal documents and preferences
## System Architecture
### Backend (Python FastAPI)
```
backend/
├── app/
│   ├── api/            # REST API endpoints
│   ├── core/           # Core configuration and security
│   ├── db/             # Database models and connections
│   └── services/       # Business logic services
├── ai_core/            # AI/ML components
│   ├── embeddings/     # Text embedding service
│   ├── llm/            # Local LLM integration (Ollama)
│   ├── rag/            # Retrieval-Augmented Generation
│   └── auto_learning/  # Adaptive learning engine
└── requirements.txt
```
#### Key Components
1. **FastAPI Application**: RESTful API server
2. **SQLAlchemy ORM**: Database management with SQLite
3. **Authentication**: JWT-based user authentication
4. **File Processing**: Multi-format document processing
5. **Vector Database**: ChromaDB for semantic search
6. **Local LLM**: Ollama integration for AI responses
### Frontend (React + TypeScript)
```
frontend/
├── src/
│   ├── components/     # Reusable UI components
│   ├── pages/          # Page-level components
│   ├── services/       # API service layer
│   ├── store/          # State management (Zustand)
│   └── utils/          # Utility functions
├── index.html
└── package.json
```
#### Key Technologies
1. **React 18**: Modern UI framework
2. **TypeScript**: Type-safe development
3. **TailwindCSS**: Utility-first styling
4. **Vite**: Fast build tool and dev server
5. **React Query**: Server state management
6. **Zustand**: Client state management
### AI Core Components
#### 1. Embedding Service (`ai_core/embeddings/`)
- **Purpose**: Convert text to numerical vectors for semantic search
- **Model**: SentenceTransformers (all-MiniLM-L6-v2)
- **Features** (see the sketch below):
  - Caching for performance
  - Batch processing
  - Similarity computation
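A minimal sketch of such a service, assuming the `sentence-transformers` package; the `EmbeddingService` name and the hash-keyed in-memory cache are illustrative assumptions, not the exact implementation:
```python
import hashlib

import numpy as np
from sentence_transformers import SentenceTransformer

class EmbeddingService:
    """SentenceTransformers wrapper with a simple in-memory cache."""

    def __init__(self, model_name: str = "all-MiniLM-L6-v2"):
        self.model = SentenceTransformer(model_name)
        self._cache: dict[str, np.ndarray] = {}

    def embed(self, texts: list[str]) -> list[np.ndarray]:
        # Batch-encode only the texts that are not already cached.
        missing = [t for t in texts if self._key(t) not in self._cache]
        if missing:
            vectors = self.model.encode(missing, batch_size=32)
            for text, vector in zip(missing, vectors):
                self._cache[self._key(text)] = vector
        return [self._cache[self._key(t)] for t in texts]

    @staticmethod
    def similarity(a: np.ndarray, b: np.ndarray) -> float:
        # Cosine similarity between two embedding vectors.
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    @staticmethod
    def _key(text: str) -> str:
        return hashlib.sha256(text.encode("utf-8")).hexdigest()
```
Keying the cache on a content hash means unchanged or re-uploaded text never hits the model twice.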
#### 2. Vector Store (`ai_core/rag/`)
- **Purpose**: Store and search document embeddings
- **Technology**: ChromaDB with persistent storage
- **Capabilities** (see the sketch below):
  - Semantic similarity search
  - Metadata filtering
  - User-specific collections
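A hedged sketch of the store, assuming ChromaDB's `PersistentClient` API; the per-user collection naming scheme is an assumption drawn from the isolation point above:
```python
import chromadb

class VectorStore:
    """Per-user ChromaDB collections backed by persistent local storage."""

    def __init__(self, persist_dir: str = "./chroma_data"):
        self.client = chromadb.PersistentClient(path=persist_dir)

    def _collection(self, user_id: int):
        # One isolated collection per user (assumed naming scheme).
        return self.client.get_or_create_collection(name=f"user_{user_id}")

    def add_chunks(self, user_id: int, ids, embeddings, documents, metadatas):
        self._collection(user_id).add(
            ids=ids,
            embeddings=embeddings,
            documents=documents,
            metadatas=metadatas,
        )

    def search_similar(self, user_id: int, query_embedding, k: int = 5, where=None):
        # Semantic search with optional metadata filtering,
        # e.g. where={"category": "finance"}.
        return self._collection(user_id).query(
            query_embeddings=[query_embedding],
            n_results=k,
            where=where,
        )
```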
#### 3. LLM Integration (`ai_core/llm/`)
- **Purpose**: Local language model integration
- **Technology**: Ollama (supports Mistral, LLaMA, etc.)
- **Features** (see the sketch below):
  - Streaming responses
  - Context management
  - Error handling
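A sketch of streaming generation against Ollama's local HTTP API (`POST /api/generate` on port 11434 is Ollama's default); the helper name and error strategy are assumptions:
```python
import json

import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def stream_response(prompt: str, system: str = "", model: str = "mistral"):
    """Yield response fragments from a local Ollama model as they arrive."""
    payload = {"model": model, "prompt": prompt, "system": system, "stream": True}
    try:
        with requests.post(OLLAMA_URL, json=payload, stream=True, timeout=120) as r:
            r.raise_for_status()
            for line in r.iter_lines():
                if not line:
                    continue
                chunk = json.loads(line)  # Ollama streams one JSON object per line
                if chunk.get("done"):
                    break
                yield chunk.get("response", "")
    except requests.RequestException as exc:
        # Error handling: fail loudly when the local model is unreachable.
        raise RuntimeError(f"Ollama request failed: {exc}") from exc
```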
#### 4. File Processing (`ai_core/file_processing/`)
- **Supported Formats**: PDF, DOCX, TXT, Images (OCR), Markdown
- **Features** (see the chunking sketch below):
  - Content extraction
  - Auto-categorization
  - Metadata extraction
  - Text chunking for embeddings
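To make the chunking step concrete, a simple character-window sketch with overlap; the sizes are illustrative defaults, and the real chunker may split on sentences or tokens instead:
```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split extracted text into overlapping chunks for embedding."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping some shared context
    return chunks
```
The overlap keeps a sentence that straddles a boundary retrievable from either neighboring chunk.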
## Auto-Learning System
The auto-learning module is the heart of aPersona's intelligence, continuously adapting to user behavior and preferences.
### Learning Components
#### 1. Interaction Analysis
```python
class LearningEngine:
    async def analyze_user_interactions(self, user_id: int) -> dict:
        """Analyze patterns in user queries and responses:

        - Frequency patterns
        - Topic preferences
        - Response quality metrics
        - Search patterns
        - Time-based usage patterns
        """
        ...
```
#### 2. Preference Learning
The system learns user preferences across multiple dimensions:
- **Response Style**: Concise vs. detailed responses
- **Topic Interests**: Frequently discussed subjects
- **Time Patterns**: When user is most active
- **File Usage**: Most accessed documents
#### 3. Adaptive Prompting
```python
async def generate_personalized_prompt(self, user_id: int, base_prompt: str) -> str:
    """Create a personalized system prompt based on learned preferences:

    - The user's communication style
    - Preferred response length
    - Topic expertise areas
    - Context preferences
    """
    ...
```
#### 4. Proactive Suggestions
The system generates intelligent suggestions:
- **Reminder Optimization**: Suggests optimal reminder times
- **File Organization**: Proposes file organization improvements
- **Content Discovery**: Recommends related documents
- **Workflow Improvements**: Suggests process optimizations
### Learning Data Flow
```mermaid
graph TD
    A[User Interaction] --> B[Store Interaction Data]
    B --> C[Analyze Patterns]
    C --> D[Update Preferences]
    D --> E[Generate Personalized Prompts]
    E --> F[Improve Responses]
    F --> G[Collect Feedback]
    G --> A
```
### Learning Metrics
1. **Confidence Scores**: How certain the system is about preferences (see the update sketch after this list)
2. **Success Rates**: Effectiveness of learned patterns
3. **Usage Counts**: Frequency of pattern application
4. **Feedback Integration**: User satisfaction incorporation
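As a concrete illustration of how feedback could move a confidence score, a hedged sketch using an exponential moving average; the actual update rule is an assumption:
```python
def update_confidence(current: float, feedback_positive: bool,
                      learning_rate: float = 0.1) -> float:
    """Nudge a preference's confidence toward the latest feedback signal."""
    target = 1.0 if feedback_positive else 0.0
    # Exponential moving average: old evidence decays, recent feedback dominates.
    return (1 - learning_rate) * current + learning_rate * target
```
For example, `update_confidence(0.6, feedback_positive=True)` returns 0.64: confidence rises gradually rather than flipping on a single interaction.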
## Data Storage
### Database Schema
#### Core Tables
1. **Users**: User accounts and authentication
2. **UserFiles**: Uploaded files and metadata
3. **UserInteractions**: All user-AI interactions (modeled in the sketch below)
4. **UserPreferences**: Learned user preferences
5. **LearningPatterns**: Detected behavioral patterns
6. **Reminders**: User reminders and notifications
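A hedged SQLAlchemy sketch of the `UserInteractions` table; column names and the feedback encoding are illustrative assumptions, not the actual schema:
```python
from datetime import datetime

from sqlalchemy import Column, DateTime, ForeignKey, Integer, Text
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class UserInteraction(Base):
    """One row per user-AI exchange (assumed shape)."""
    __tablename__ = "user_interactions"

    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey("users.id"), nullable=False)
    query = Column(Text, nullable=False)
    response = Column(Text, nullable=False)
    feedback = Column(Integer, nullable=True)  # e.g. -1 / 0 / +1 rating
    created_at = Column(DateTime, default=datetime.utcnow)
```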
#### Vector Storage
- **ChromaDB Collections**: Document embeddings with metadata
- **User-Specific Collections**: Isolated data per user
- **Embedding Cache**: Local cache for faster processing
## Security & Privacy
### Data Protection
1. **Local Storage**: All data remains on user's device
2. **Secure Authentication**: Signed JWT tokens and salted password hashing (sketched below)
3. **No External APIs**: No cloud dependencies
4. **User Data Isolation**: Multi-user support with data separation
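A minimal sketch of the token and password flow, assuming PyJWT and passlib (a common FastAPI pairing); the library choice and claim layout are assumptions:
```python
from datetime import datetime, timedelta

import jwt  # PyJWT
from passlib.context import CryptContext

SECRET_KEY = "change-me"   # loaded from local configuration in practice
ALGORITHM = "HS256"

pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")

def hash_password(password: str) -> str:
    return pwd_context.hash(password)

def verify_password(plain: str, hashed: str) -> bool:
    return pwd_context.verify(plain, hashed)

def create_access_token(user_id: int, expires_minutes: int = 30) -> str:
    payload = {
        "sub": str(user_id),
        "exp": datetime.utcnow() + timedelta(minutes=expires_minutes),
    }
    return jwt.encode(payload, SECRET_KEY, algorithm=ALGORITHM)
```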
### File Security
1. **Access Controls**: User-based file access
2. **Secure Upload**: File validation and sanitization
3. **Safe Processing**: Sandboxed file processing
4. **Cleanup**: Temporary file management
## RAG (Retrieval-Augmented Generation) System
### How It Works
1. **Document Ingestion**:
   - Files are processed and chunked
   - Text is converted to embeddings
   - Metadata is extracted and stored
2. **Query Processing**:
   - User query is embedded
   - Semantic search finds relevant chunks
   - Context is assembled for LLM
3. **Response Generation**:
   - LLM receives query + relevant context
   - Personalized prompts are applied
   - Response is generated and returned
4. **Learning Loop**:
   - User feedback is collected
   - Patterns are analyzed
   - System adapts for future queries
### Context Assembly
```python
def assemble_context(user_id, query_embedding, user_preferences, base_prompt):
    # Find relevant documents via semantic search
    relevant_docs = vector_store.search_similar(query_embedding)

    # Apply user preferences to rank and trim the context
    context = personalize_context(relevant_docs, user_preferences)

    # Generate a personalized system prompt
    system_prompt = generate_personalized_prompt(user_id, base_prompt)

    return context, system_prompt
```
## Performance Optimizations
### Embedding Cache
- Local caching of text embeddings
- Significant performance improvement for repeated content
- Automatic cache management
### Batch Processing
- Process multiple files simultaneously
- Batch embedding generation
- Efficient database operations
### Background Tasks
- Asynchronous file processing (see the sketch below)
- Background learning analysis
- Scheduled maintenance tasks
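A sketch of the upload path using FastAPI's built-in `BackgroundTasks`; the route, storage path, and hard-coded user id are placeholders:
```python
from fastapi import APIRouter, BackgroundTasks, UploadFile

router = APIRouter()

async def process_file(path: str, user_id: int) -> None:
    # Extraction, chunking, and embedding run after the response is sent.
    ...

@router.post("/files/upload")
async def upload_file(file: UploadFile, background_tasks: BackgroundTasks):
    path = f"./uploads/{file.filename}"  # illustrative storage location
    with open(path, "wb") as out:
        out.write(await file.read())
    # Queue the heavy processing so the request returns immediately.
    background_tasks.add_task(process_file, path, user_id=1)  # user id from auth in practice
    return {"status": "processing"}
```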
## Deployment Architecture
### Local Development
```bash
# Backend
cd backend && python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
uvicorn app.main:app --reload

# Frontend
cd frontend && npm install
npm run dev

# AI Services
ollama serve
ollama pull mistral
ollama pull nomic-embed-text
```
### Production Deployment
- **Containerization**: Docker support for easy deployment
- **Service Management**: Systemd service files
- **Automatic Updates**: Self-updating mechanisms
- **Backup System**: Automated data backups
## Extending the System
### Adding New File Types
1. Implement a processor in `ai_core/file_processing/` (see the interface sketch below)
2. Add MIME type mapping
3. Update file upload validation
4. Test with sample files
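A hedged sketch of what the processor contract could look like, using `typing.Protocol`; the interface name and attributes are assumptions:
```python
from pathlib import Path
from typing import Protocol

class FileProcessor(Protocol):
    """Assumed contract for format-specific processors."""
    suffixes: tuple[str, ...]

    def extract_text(self, path: Path) -> str: ...

class MarkdownProcessor:
    suffixes = (".md", ".markdown")

    def extract_text(self, path: Path) -> str:
        return path.read_text(encoding="utf-8")
```
Any class with matching attributes satisfies the protocol, so new formats plug in without touching existing processors.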
### Adding New Learning Patterns
1. Extend `LearningEngine` class
2. Add new pattern types
3. Implement analysis logic
4. Update preference storage
### Custom LLM Integration
1. Implement LLM client interface
2. Add configuration options
3. Update prompt generation
4. Test with target model
## Monitoring & Analytics
### System Health
- AI service availability
- Database performance
- File processing status
- Memory and disk usage
### User Analytics
- Interaction frequency
- Learning effectiveness
- Feature usage patterns
- System performance metrics
## Future Enhancements
### Planned Features
1. **Multi-modal Support**: Image understanding and generation
2. **Voice Interface**: Speech-to-text and text-to-speech
3. **Advanced Scheduling**: Calendar integration and smart scheduling
4. **Team Features**: Shared knowledge bases (while maintaining privacy)
5. **Mobile App**: Native mobile applications
6. **Plugin System**: Extensible plugin architecture
### Research Areas
1. **Federated Learning**: Improve models without data sharing
2. **Advanced RAG**: More sophisticated retrieval strategies
3. **Multi-agent Systems**: Specialized AI agents for different tasks
4. **Continuous Learning**: Real-time model adaptation
This architecture ensures aPersona remains a powerful, private, and continuously improving personal AI assistant that truly understands and adapts to each user's unique needs and preferences.