aPersona System Architecture
Overview
aPersona is a fully local, AI-powered personal assistant designed to work entirely offline while providing intelligent, context-aware assistance based on your personal files and behavior patterns.
Core Principles
- 100% Local: No data leaves your device
- Privacy-First: All processing happens on your machine
- Adaptive Learning: Continuously improves based on your interactions
- Context-Aware: Understands your personal documents and preferences
System Architecture
Backend (Python FastAPI)
    backend/
    ├── app/
    │   ├── api/            # REST API endpoints
    │   ├── core/           # Core configuration and security
    │   ├── db/             # Database models and connections
    │   └── services/       # Business logic services
    ├── ai_core/            # AI/ML components
    │   ├── embeddings/     # Text embedding service
    │   ├── llm/            # Local LLM integration (Ollama)
    │   ├── rag/            # Retrieval-Augmented Generation
    │   └── auto_learning/  # Adaptive learning engine
    └── requirements.txt
Key Components
- FastAPI Application: RESTful API server
- SQLAlchemy ORM: Database management with SQLite
- Authentication: JWT-based user authentication
- File Processing: Multi-format document processing
- Vector Database: ChromaDB for semantic search
- Local LLM: Ollama integration for AI responses
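For orientation, a minimal FastAPI entry point matching the layout above might look like the sketch below; the route and title are illustrative, not aPersona's actual code.

    from fastapi import FastAPI

    # Minimal entry point in the style of app/main.py; illustrative only
    app = FastAPI(title="aPersona API")

    @app.get("/health")
    def health() -> dict:
        # Simple liveness probe for the local server
        return {"status": "ok"}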
Frontend (React + TypeScript)
    frontend/
    ├── src/
    │   ├── components/   # Reusable UI components
    │   ├── pages/        # Page-level components
    │   ├── services/     # API service layer
    │   ├── store/        # State management (Zustand)
    │   └── utils/        # Utility functions
    ├── index.html
    └── package.json
Key Technologies
- React 18: Modern UI framework
- TypeScript: Type-safe development
- TailwindCSS: Utility-first styling
- Vite: Fast build tool and dev server
- React Query: Server state management
- Zustand: Client state management
AI Core Components
1. Embedding Service (ai_core/embeddings/)
- Purpose: Convert text to numerical vectors for semantic search
- Model: SentenceTransformers (all-MiniLM-L6-v2)
- Features:
- Caching for performance
- Batch processing
- Similarity computation
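A minimal sketch of such a service, assuming the sentence-transformers package; the class shape and cache strategy are illustrative, not aPersona's actual code.

    from sentence_transformers import SentenceTransformer

    class EmbeddingService:
        """Illustrative embedding service with a simple in-memory cache."""

        def __init__(self, model_name: str = "all-MiniLM-L6-v2"):
            self.model = SentenceTransformer(model_name)
            self._cache: dict[str, list[float]] = {}

        def embed(self, text: str) -> list[float]:
            # Serve repeated texts from the cache instead of re-encoding
            if text not in self._cache:
                self._cache[text] = self.model.encode(text).tolist()
            return self._cache[text]

        def embed_batch(self, texts: list[str]) -> list[list[float]]:
            # encode() batches internally, far faster than per-text calls
            return [vec.tolist() for vec in self.model.encode(texts)]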
2. Vector Store (ai_core/rag/)
- Purpose: Store and search document embeddings
- Technology: ChromaDB with persistent storage
- Capabilities:
- Semantic similarity search
- Metadata filtering
- User-specific collections
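The snippet below sketches this usage with ChromaDB's Python client; the collection name and metadata fields are assumptions for illustration.

    import chromadb

    # Persistent storage keeps embeddings on disk between runs
    client = chromadb.PersistentClient(path="./chroma_data")

    # One collection per user isolates personal data
    collection = client.get_or_create_collection(name="user_42_documents")

    collection.add(
        ids=["budget.pdf-chunk-0"],
        documents=["Quarterly budget notes..."],
        metadatas=[{"category": "finance", "filename": "budget.pdf"}],
    )

    # Semantic similarity search restricted by a metadata filter
    results = collection.query(
        query_texts=["What did I plan to spend on travel?"],
        n_results=3,
        where={"category": "finance"},
    )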
3. LLM Integration (ai_core/llm/)
- Purpose: Local language model integration
- Technology: Ollama (supports Mistral, LLaMA, etc.)
- Features:
- Streaming responses
- Context management
- Error handling
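Assuming the official ollama Python client, streaming a response might look like this sketch; the model name and error handling are illustrative.

    import ollama

    def ask(prompt: str, model: str = "mistral") -> str:
        """Stream a response from a locally running Ollama server."""
        reply = []
        try:
            stream = ollama.chat(
                model=model,
                messages=[{"role": "user", "content": prompt}],
                stream=True,
            )
            for chunk in stream:
                # Each chunk carries an incremental piece of the answer
                reply.append(chunk["message"]["content"])
        except Exception as exc:
            # e.g. Ollama not running, or the model not yet pulled
            return f"LLM unavailable: {exc}"
        return "".join(reply)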
4. File Processing (ai_core/file_processing/)
- Supported Formats: PDF, DOCX, TXT, Images (OCR), Markdown
- Features:
- Content extraction
- Auto-categorization
- Metadata extraction
- Text chunking for embeddings
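As one concrete example of the chunking step, a fixed-size chunker with overlap (the sizes here are assumptions, not aPersona's settings):

    def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
        """Split text into overlapping chunks sized for embedding models."""
        chunks = []
        start = 0
        while start < len(text):
            end = min(start + chunk_size, len(text))
            chunks.append(text[start:end])
            if end == len(text):
                break
            # Overlap preserves context across chunk boundaries
            start = end - overlap
        return chunks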
Auto-Learning System
The auto-learning module is the heart of aPersona's intelligence, continuously adapting to user behavior and preferences.
Learning Components
1. Interaction Analysis
    class LearningEngine:
        async def analyze_user_interactions(self, user_id: int):
            # Analyzes patterns in user queries and responses
            ...

The analysis covers:
- Frequency patterns
- Topic preferences
- Response quality metrics
- Search patterns
- Time-based usage patterns
2. Preference Learning
The system learns user preferences across multiple dimensions:
- Response Style: Concise vs. detailed responses
- Topic Interests: Frequently discussed subjects
- Time Patterns: When user is most active
- File Usage: Most accessed documents
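One plausible shape for a learned-preference record, with field names assumed for illustration, ties these dimensions to the learning metrics described later:

    from dataclasses import dataclass

    @dataclass
    class UserPreference:
        """Illustrative learned-preference record; field names are assumptions."""
        user_id: int
        dimension: str     # e.g. "response_style", "topic_interest"
        value: str         # e.g. "concise", "machine learning"
        confidence: float  # how certain the system is (0.0-1.0)
        usage_count: int   # how often this preference has been applied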
3. Adaptive Prompting
    async def generate_personalized_prompt(self, user_id: int, base_prompt: str):
        # Creates a personalized system prompt based on learned preferences
        ...

The prompt reflects:
- User's communication style
- Preferred response length
- Topic expertise areas
- Context preferences
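A rough sketch of that assembly, with the wording and preference keys assumed for illustration:

    def build_system_prompt(base_prompt: str, prefs: dict) -> str:
        """Fold learned preferences into the system prompt."""
        style = prefs.get("response_style", "balanced")
        topics = ", ".join(prefs.get("topic_interests", [])) or "general topics"
        return (
            f"{base_prompt}\n"
            f"Prefer a {style} answer style. "
            f"The user frequently asks about: {topics}."
        )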
4. Proactive Suggestions
The system generates intelligent suggestions:
- Reminder Optimization: Suggests optimal reminder times
- File Organization: Proposes file organization improvements
- Content Discovery: Recommends related documents
- Workflow Improvements: Suggests process optimizations
Learning Data Flow
    graph TD
        A[User Interaction] --> B[Store Interaction Data]
        B --> C[Analyze Patterns]
        C --> D[Update Preferences]
        D --> E[Generate Personalized Prompts]
        E --> F[Improve Responses]
        F --> G[Collect Feedback]
        G --> A
Learning Metrics
- Confidence Scores: How certain the system is about preferences
- Success Rates: Effectiveness of learned patterns
- Usage Counts: Frequency of pattern application
- Feedback Integration: User satisfaction incorporation
Data Storage
Database Schema
Core Tables
- Users: User accounts and authentication
- UserFiles: Uploaded files and metadata
- UserInteractions: All user-AI interactions
- UserPreferences: Learned user preferences
- LearningPatterns: Detected behavioral patterns
- Reminders: User reminders and notifications
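Two of these tables, sketched as SQLAlchemy models with column names assumed from the descriptions above (not the actual schema):

    from sqlalchemy import Column, DateTime, ForeignKey, Integer, String, Text, func
    from sqlalchemy.orm import declarative_base

    Base = declarative_base()

    class User(Base):
        __tablename__ = "users"
        id = Column(Integer, primary_key=True)
        email = Column(String, unique=True, nullable=False)
        hashed_password = Column(String, nullable=False)

    class UserInteraction(Base):
        __tablename__ = "user_interactions"
        id = Column(Integer, primary_key=True)
        user_id = Column(Integer, ForeignKey("users.id"), nullable=False)
        query = Column(Text, nullable=False)
        response = Column(Text)
        created_at = Column(DateTime, server_default=func.now())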
Vector Storage
- ChromaDB Collections: Document embeddings with metadata
- User-Specific Collections: Isolated data per user
- Embedding Cache: Local cache for faster processing
Security & Privacy
Data Protection
- Local Storage: All data remains on user's device
- Encrypted Authentication: JWT tokens with secure hashing
- No External APIs: No cloud dependencies
- User Data Isolation: Multi-user support with data separation
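A minimal sketch of the token flow, assuming python-jose and passlib (common FastAPI choices, not confirmed as aPersona's exact stack):

    from datetime import datetime, timedelta, timezone
    from jose import jwt
    from passlib.context import CryptContext

    SECRET_KEY = "change-me"  # generated and stored locally, never transmitted
    ALGORITHM = "HS256"
    pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")

    def hash_password(password: str) -> str:
        return pwd_context.hash(password)

    def create_access_token(user_id: int, minutes: int = 30) -> str:
        expires = datetime.now(timezone.utc) + timedelta(minutes=minutes)
        claims = {"sub": str(user_id), "exp": expires}
        return jwt.encode(claims, SECRET_KEY, algorithm=ALGORITHM)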
File Security
- Access Controls: User-based file access
- Secure Upload: File validation and sanitization
- Safe Processing: Sandboxed file processing
- Cleanup: Temporary file management
RAG (Retrieval-Augmented Generation) System
How It Works
1. Document Ingestion:
   - Files are processed and chunked
   - Text is converted to embeddings
   - Metadata is extracted and stored
2. Query Processing:
   - The user query is embedded
   - Semantic search finds relevant chunks
   - Context is assembled for the LLM
3. Response Generation:
   - The LLM receives the query plus relevant context
   - Personalized prompts are applied
   - The response is generated and returned
4. Learning Loop:
   - User feedback is collected
   - Patterns are analyzed
   - The system adapts for future queries
Context Assembly
    def assemble_context(user_id, query_embedding, user_preferences, base_prompt):
        # Find documents relevant to the query via semantic search
        relevant_docs = vector_store.search_similar(query_embedding)
        # Filter and re-rank according to learned user preferences
        context = personalize_context(relevant_docs, user_preferences)
        # Build a system prompt tailored to this user
        system_prompt = generate_personalized_prompt(user_id, base_prompt)
        return context, system_prompt
Performance Optimizations
Embedding Cache
- Local caching of text embeddings
- Significant performance improvement for repeated content
- Automatic cache management
Batch Processing
- Process multiple files simultaneously
- Batch embedding generation
- Efficient database operations
Background Tasks
- Asynchronous file processing
- Background learning analysis
- Scheduled maintenance tasks
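For instance, file processing can be handed off with FastAPI's built-in BackgroundTasks; the endpoint and function names here are illustrative.

    from fastapi import BackgroundTasks, FastAPI

    app = FastAPI()

    def process_file(path: str) -> None:
        ...  # extract text, chunk, embed, and index the file

    @app.post("/files")
    async def upload_file(path: str, background: BackgroundTasks) -> dict:
        # Return immediately; processing continues after the response is sent
        background.add_task(process_file, path)
        return {"status": "processing"}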
Deployment Architecture
Local Development
    # Backend
    cd backend && python -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt
    uvicorn app.main:app --reload

    # Frontend
    cd frontend && npm install
    npm run dev

    # AI Services
    ollama serve
    ollama pull mistral
    ollama pull nomic-embed-text
Production Deployment
- Containerization: Docker support for easy deployment
- Service Management: Systemd service files
- Automatic Updates: Self-updating mechanisms
- Backup System: Automated data backups
Extending the System
Adding New File Types
- Implement a processor in ai_core/file_processing/ (see the sketch below)
- Add MIME type mapping
- Update file upload validation
- Test with sample files
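Since this document doesn't show the actual processor interface, the sketch below assumes a minimal base class for illustration:

    from abc import ABC, abstractmethod

    class BaseFileProcessor(ABC):
        """Hypothetical processor interface; the real one may differ."""
        mime_types: list[str] = []

        @abstractmethod
        def extract_text(self, path: str) -> str:
            """Return the plain-text content of the file."""

    class MarkdownProcessor(BaseFileProcessor):
        mime_types = ["text/markdown"]

        def extract_text(self, path: str) -> str:
            with open(path, encoding="utf-8") as f:
                return f.read()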
Adding New Learning Patterns
- Extend the LearningEngine class
- Add new pattern types
- Implement analysis logic
- Update preference storage
Custom LLM Integration
- Implement the LLM client interface (see the sketch below)
- Add configuration options
- Update prompt generation
- Test with target model
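A minimal client interface, assumed here as a Protocol since the actual interface isn't shown in this document:

    from typing import Iterator, Protocol

    class LLMClient(Protocol):
        """Hypothetical interface a custom LLM backend would implement."""

        def generate(self, prompt: str, system: str | None = None) -> str:
            ...

        def stream(self, prompt: str, system: str | None = None) -> Iterator[str]:
            ...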
Monitoring & Analytics
System Health
- AI service availability
- Database performance
- File processing status
- Memory and disk usage
User Analytics
- Interaction frequency
- Learning effectiveness
- Feature usage patterns
- System performance metrics
Future Enhancements
Planned Features
- Multi-modal Support: Image understanding and generation
- Voice Interface: Speech-to-text and text-to-speech
- Advanced Scheduling: Calendar integration and smart scheduling
- Team Features: Shared knowledge bases (while maintaining privacy)
- Mobile App: Native mobile applications
- Plugin System: Extensible plugin architecture
Research Areas
- Federated Learning: Improve models without data sharing
- Advanced RAG: More sophisticated retrieval strategies
- Multi-agent Systems: Specialized AI agents for different tasks
- Continuous Learning: Real-time model adaptation
This architecture ensures aPersona remains a powerful, private, and continuously improving personal AI assistant that truly understands and adapts to each user's unique needs and preferences.