Initial commit: aPersona - AI-powered personal assistant with local LLM, RAG system, auto-learning engine, and privacy-first design

Mehmet Oezdag 2025-06-08 17:50:50 +02:00
commit 9110e97fe2
40 changed files with 4463 additions and 0 deletions

118
.gitignore vendored Normal file

@@ -0,0 +1,118 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# Virtual environments
venv/
env/
ENV/
.venv/
.env/
# PyCharm
.idea/
# VSCode
.vscode/
# Jupyter Notebook
.ipynb_checkpoints
# Environment variables
.env
.env.local
.env.development.local
.env.test.local
.env.production.local
# Node.js
node_modules/
npm-debug.log*
yarn-debug.log*
yarn-error.log*
lerna-debug.log*
# Runtime data
pids/
*.pid
*.seed
*.pid.lock
# Coverage directory used by tools like istanbul
coverage/
*.lcov
# Build outputs
dist/
build/
# Database
*.db
*.sqlite
*.sqlite3
# Logs
logs/
*.log
# Data directories (user uploaded files)
data/uploads/*
data/processed/*
data/vectors/*
data/embeddings_cache/*
# Keep the directories but ignore contents
!data/uploads/.gitkeep
!data/processed/.gitkeep
!data/vectors/.gitkeep
!data/embeddings_cache/.gitkeep
# AI model files (if downloaded locally)
*.gguf
*.bin
*.safetensors
models/
# Temporary files
*.tmp
*.temp
.DS_Store
Thumbs.db
# Certificates and keys
*.pem
*.key
*.crt
# Editor files
*~
*.swp
*.swo
# OS generated files
.DS_Store
.DS_Store?
._*
.Spotlight-V100
.Trashes
ehthumbs.db
Thumbs.db

225
QUICK_START.md Normal file

@@ -0,0 +1,225 @@
# aPersona Quick Start Guide
## Prerequisites
Before you begin, ensure you have the following installed:
- **Python 3.11+**: [Download here](https://python.org/downloads/)
- **Node.js 18+**: [Download here](https://nodejs.org/)
- **Ollama**: [Install guide](https://ollama.ai/download)
## 🚀 Automated Setup (Recommended)
The easiest way to get started is using our setup script:
```bash
# Make the setup script executable
chmod +x setup.sh
# Run the setup script
./setup.sh
```
This script will:
- Check your system requirements
- Install dependencies for both backend and frontend
- Set up the AI models
- Create necessary directories and configuration files
## 🔧 Manual Setup
If you prefer to set up manually:
### 1. Clone and Setup Backend
```bash
# Navigate to backend directory
cd backend
# Create virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Create environment file
cp .env.example .env # Edit with your preferences
```
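For reference, a minimal `.env` could look like the following. The keys mirror the defaults in the backend settings (`app/core/config.py`); treat the values as placeholders:
```bash
SECRET_KEY=change-me-in-production
DATABASE_URL=sqlite:///./apersona.db
OLLAMA_BASE_URL=http://localhost:11434
DEFAULT_LLM_MODEL=mistral
DEBUG=True
```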
### 2. Setup Frontend
```bash
# Navigate to frontend directory
cd frontend
# Install dependencies
npm install
# Start the development server to verify the install
npm run dev
```
### 3. Setup AI Services
```bash
# Start Ollama service
ollama serve
# In another terminal, pull required models
ollama pull mistral # Main LLM model
ollama pull nomic-embed-text # Embedding model
```
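You can verify that both models are available:
```bash
ollama list
```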
## 🏃‍♂️ Running the Application
### Start the Backend
```bash
cd backend
source venv/bin/activate # If not already activated
uvicorn app.main:app --reload
```
The backend will be available at: `http://localhost:8000`
### Start the Frontend
```bash
cd frontend
npm run dev
```
The frontend will be available at: `http://localhost:3000`
### Start Ollama (if not running)
```bash
ollama serve
```
## 🎯 First Steps
1. **Open your browser** and go to `http://localhost:3000`
2. **Create an account** using the registration form
3. **Upload some documents** to get started:
- PDFs, Word documents, text files, or images
- The system will automatically process and categorize them
4. **Start chatting** with your AI assistant:
- Ask questions about your uploaded files
- The AI will provide context-aware responses
- Give feedback to help the system learn your preferences
## 🔍 Verify Everything is Working
### Check System Health
Visit: `http://localhost:8000/health`
You should see:
```json
{
"status": "healthy",
"services": {
"database": "healthy",
"ollama": "healthy",
"embeddings": "healthy",
"vector_store": "healthy"
}
}
```
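The same check works from the terminal:
```bash
curl http://localhost:8000/health
```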
### Check API Documentation
Visit: `http://localhost:8000/docs`
This will show the interactive API documentation.
## 🐛 Troubleshooting
### Common Issues
#### 1. Ollama Not Running
```bash
# Error: Connection refused to Ollama
# Solution: Start Ollama service
ollama serve
```
#### 2. Models Not Downloaded
```bash
# Error: Model not found
# Solution: Download required models
ollama pull mistral
ollama pull nomic-embed-text
```
#### 3. Port Already in Use
```bash
# Backend port 8000 in use
uvicorn app.main:app --reload --port 8001
# Frontend port 3000 in use
npm run dev -- --port 3001
```
#### 4. Python Dependencies Issues
```bash
# Create fresh virtual environment
rm -rf venv
python3 -m venv venv
source venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
```
#### 5. Node Dependencies Issues
```bash
# Clear cache and reinstall
rm -rf node_modules package-lock.json
npm install
```
### Performance Tips
1. **First Run**: The first time you upload files and ask questions, it may take longer as models are loading and caches are being built.
2. **Memory Usage**: The system uses local AI models which require significant RAM. Ensure you have at least 8GB RAM available.
3. **Storage**: Vector embeddings and model files require disk space. Ensure you have at least 5GB free disk space.
## 📊 System Requirements
### Minimum Requirements
- **RAM**: 8GB
- **Storage**: 5GB free space
- **CPU**: Multi-core processor (4+ cores recommended)
- **OS**: Windows 10+, macOS 10.14+, Linux (Ubuntu 18.04+)
### Recommended Requirements
- **RAM**: 16GB+
- **Storage**: 10GB+ free space
- **CPU**: 8+ cores
- **GPU**: NVIDIA GPU with CUDA support (optional, for faster processing)
## 🎉 You're Ready!
Once everything is running:
1. **Upload your documents** (PDFs, Word docs, images, etc.)
2. **Ask questions** about your content
3. **Set reminders** and let the AI help organize your life
4. **Watch it learn** and adapt to your preferences over time
## 🆘 Need Help?
- Check the [Architecture Documentation](docs/ARCHITECTURE.md) for technical details
- Review the API documentation at `http://localhost:8000/docs`
- Ensure all services are running with the health check endpoint
## 🔒 Privacy Note
Remember: **All your data stays local**. aPersona runs entirely on your machine without any cloud dependencies. Your files, conversations, and personal information never leave your device.

126
README.md Normal file

@@ -0,0 +1,126 @@
# aPersona - AI-Powered Personal Assistant
A fully local, offline AI-powered personal assistant that learns from your personal files, preferences, and behavior to act as your intelligent secretary.
## 🔹 Key Features
- **100% Local & Offline**: No cloud dependencies, complete data privacy
- **User Authentication**: Secure local user management
- **File Analysis**: Automatic categorization of documents, images, PDFs
- **Semantic Search**: Vector-based search through your personal data
- **Local LLM Integration**: Powered by Ollama with RAG capabilities
- **Auto-Learning**: Adaptive behavior based on user interactions
- **Smart Reminders**: Context-aware suggestions and notifications
- **Personal Context**: Deep understanding of your preferences and habits
## 🛠 Technology Stack
### Backend
- **FastAPI**: Modern Python web framework
- **SQLAlchemy**: Database ORM
- **ChromaDB**: Vector database for embeddings
- **SentenceTransformers**: Text embeddings
- **Ollama**: Local LLM runtime
### Frontend
- **React**: Modern UI framework
- **TailwindCSS**: Utility-first CSS framework
- **Vite**: Fast build tool
### AI/ML
- **Hugging Face Transformers**: Pre-trained models
- **PyTorch**: ML framework
- **Pillow**: Image processing
- **PyPDF2**: PDF text extraction
## 📁 Project Structure
```
apersona/
├── backend/ # FastAPI backend
│ ├── app/
│ │ ├── api/ # API routes
│ │ ├── core/ # Core configuration
│ │ ├── db/ # Database models
│ │ ├── services/ # Business logic
│ │ └── main.py # FastAPI app
│ ├── ai_core/ # AI/ML components
│ │ ├── embeddings/ # Text embeddings
│ │ ├── llm/ # LLM integration
│ │ ├── rag/ # RAG system
│ │ └── auto_learning/ # Adaptive learning
│ └── requirements.txt
├── frontend/ # React frontend
│ ├── src/
│ │ ├── components/ # React components
│ │ ├── pages/ # Page components
│ │ ├── services/ # API services
│ │ └── utils/ # Utility functions
│ └── package.json
├── data/ # Local data storage
│ ├── uploads/ # User uploaded files
│ ├── processed/ # Processed files
│ └── vectors/ # Vector embeddings
└── docs/ # Documentation
```
## 🚀 Quick Start
### Prerequisites
- Python 3.11+
- Node.js 18+
- Ollama installed locally
### Backend Setup
```bash
cd backend
pip install -r requirements.txt
uvicorn app.main:app --reload
```
### Frontend Setup
```bash
cd frontend
npm install
npm run dev
```
### AI Setup
```bash
# Install Ollama models
ollama pull mistral
ollama pull nomic-embed-text
```
## 🧠 Auto-Learning System
The auto-learning module continuously adapts to user behavior through:
- **Interaction Patterns**: Learning from user queries and responses
- **Preference Tracking**: Monitoring file usage and search patterns
- **Context Building**: Understanding user's work and personal contexts
- **Response Optimization**: Improving answer relevance over time
- **Proactive Suggestions**: Anticipating user needs based on patterns
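In code, the core feedback loop looks roughly like the sketch below. The weights match the engine's `feedback_weights`; the clamping bounds and function name are illustrative, not the engine's actual API:
```python
# Sketch of the feedback loop: each rating nudges a preference's confidence.
FEEDBACK_WEIGHTS = {-1: -0.2, 0: 0.0, 1: 0.1}  # negative, neutral, positive

def nudge_confidence(confidence: float, feedback: int) -> float:
    """Apply one feedback signal, keeping confidence within [0.0, 1.0]."""
    return min(max(confidence + FEEDBACK_WEIGHTS[feedback], 0.0), 1.0)

# Starting at 0.5: three positive ratings -> 0.8, then one negative -> 0.6
confidence = 0.5
for rating in (1, 1, 1, -1):
    confidence = nudge_confidence(confidence, rating)
print(round(confidence, 2))  # 0.6
```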
## 🔒 Privacy & Security
- All data stored locally
- No external API calls
- Encrypted user authentication
- Secure file handling
- Optional data anonymization
## 📚 Documentation
- [API Documentation](./docs/api.md)
- [AI Integration Guide](./docs/ai-integration.md)
- [Auto-Learning Architecture](./docs/auto-learning.md)
- [Deployment Guide](./docs/deployment.md)
## 🤝 Contributing
This is a personal project focused on privacy and local execution. Feel free to fork and adapt for your needs.
## 📄 License
MIT License - See LICENSE file for details


@@ -0,0 +1,357 @@
from typing import Dict, List, Any, Optional, Tuple
from collections import defaultdict, Counter
import json
import numpy as np
from datetime import datetime, timedelta
import asyncio
from sqlalchemy.orm import Session
from app.db.models import UserInteraction, UserPreference, LearningPattern, User
from app.core.config import settings
import logging
logger = logging.getLogger(__name__)
class LearningEngine:
def __init__(self):
self.user_patterns = defaultdict(dict)
self.feedback_weights = {
-1: -0.2, # Negative feedback
0: 0.0, # Neutral feedback
1: 0.1 # Positive feedback
}
async def analyze_user_interactions(self, db: Session, user_id: int) -> Dict[str, Any]:
"""Analyze user interaction patterns to extract learning insights"""
try:
# Get recent interactions (last 30 days)
cutoff_date = datetime.utcnow() - timedelta(days=30)
interactions = db.query(UserInteraction).filter(
UserInteraction.user_id == user_id,
UserInteraction.created_at >= cutoff_date
).all()
if not interactions:
return {}
analysis = {
'interaction_frequency': self._analyze_frequency_patterns(interactions),
'topic_preferences': self._analyze_topic_preferences(interactions),
'response_quality': self._analyze_response_quality(interactions),
'search_patterns': self._analyze_search_patterns(interactions),
'time_patterns': self._analyze_time_patterns(interactions)
}
return analysis
except Exception as e:
logger.error(f"Failed to analyze user interactions: {e}")
return {}
def _analyze_frequency_patterns(self, interactions: List[UserInteraction]) -> Dict[str, Any]:
"""Analyze how frequently user interacts with the system"""
if not interactions:
return {}
# Group interactions by day
daily_counts = defaultdict(int)
for interaction in interactions:
day = interaction.created_at.date()
daily_counts[day] += 1
# Calculate patterns
counts = list(daily_counts.values())
return {
'avg_daily_interactions': np.mean(counts) if counts else 0,
'max_daily_interactions': max(counts) if counts else 0,
'active_days': len(daily_counts),
'total_interactions': len(interactions)
}
def _analyze_topic_preferences(self, interactions: List[UserInteraction]) -> Dict[str, Any]:
"""Analyze what topics user asks about most frequently"""
topic_keywords = []
successful_topics = []
for interaction in interactions:
if interaction.query:
# Extract keywords from query (simple approach)
words = interaction.query.lower().split()
# Filter out common stop words
stop_words = {'the', 'a', 'an', 'and', 'or', 'but', 'in', 'on', 'at', 'to', 'for', 'of', 'with', 'by', 'is', 'are', 'was', 'were', 'what', 'how', 'when', 'where', 'why', 'who'}
keywords = [word for word in words if len(word) > 3 and word not in stop_words]
topic_keywords.extend(keywords)
# Track successful topics (positive feedback)
if interaction.user_feedback and interaction.user_feedback > 0:
successful_topics.extend(keywords)
# Count frequencies
topic_counts = Counter(topic_keywords)
successful_counts = Counter(successful_topics)
return {
'most_common_topics': dict(topic_counts.most_common(10)),
'successful_topics': dict(successful_counts.most_common(5)),
'topic_diversity': len(set(topic_keywords))
}
def _analyze_response_quality(self, interactions: List[UserInteraction]) -> Dict[str, Any]:
"""Analyze response quality based on user feedback"""
feedback_scores = []
response_times = []
helpful_count = 0
total_feedback = 0
for interaction in interactions:
if interaction.user_feedback is not None:
feedback_scores.append(interaction.user_feedback)
total_feedback += 1
if interaction.was_helpful is not None:
if interaction.was_helpful:
helpful_count += 1
if interaction.response_time:
response_times.append(interaction.response_time)
return {
'avg_feedback_score': np.mean(feedback_scores) if feedback_scores else 0,
'feedback_distribution': dict(Counter(feedback_scores)),
'helpfulness_rate': helpful_count / total_feedback if total_feedback > 0 else 0,
'avg_response_time': np.mean(response_times) if response_times else 0,
'total_feedback_count': total_feedback
}
def _analyze_search_patterns(self, interactions: List[UserInteraction]) -> Dict[str, Any]:
"""Analyze search and file usage patterns"""
search_terms = []
used_files = []
for interaction in interactions:
if interaction.search_terms:
search_terms.extend(interaction.search_terms)
if interaction.used_files:
used_files.extend(interaction.used_files)
return {
'common_search_terms': dict(Counter(search_terms).most_common(10)),
'frequently_used_files': dict(Counter(used_files).most_common(10)),
'search_diversity': len(set(search_terms))
}
def _analyze_time_patterns(self, interactions: List[UserInteraction]) -> Dict[str, Any]:
"""Analyze when user is most active"""
hours = []
days_of_week = []
for interaction in interactions:
hours.append(interaction.created_at.hour)
days_of_week.append(interaction.created_at.weekday())
return {
'peak_hours': dict(Counter(hours).most_common(5)),
'active_days_of_week': dict(Counter(days_of_week).most_common()),
'activity_distribution': {
'morning': sum(1 for h in hours if 6 <= h < 12),
'afternoon': sum(1 for h in hours if 12 <= h < 18),
'evening': sum(1 for h in hours if 18 <= h < 24),
'night': sum(1 for h in hours if 0 <= h < 6)
}
}
async def update_user_preferences(self, db: Session, user_id: int, analysis: Dict[str, Any]):
"""Update user preferences based on learning analysis"""
try:
# Update or create preferences based on analysis
preferences_to_update = [
('response_style', 'preferred_length', self._infer_response_length_preference(analysis)),
('topics', 'interests', analysis.get('topic_preferences', {}).get('most_common_topics', {})),
('interaction', 'peak_hours', analysis.get('time_patterns', {}).get('peak_hours', {})),
('quality', 'feedback_history', analysis.get('response_quality', {}))
]
for pref_type, pref_key, pref_value in preferences_to_update:
if pref_value:
# Check if preference exists
existing_pref = db.query(UserPreference).filter(
UserPreference.user_id == user_id,
UserPreference.preference_type == pref_type,
UserPreference.preference_key == pref_key
).first()
confidence_score = self._calculate_confidence_score(analysis, pref_type)
if existing_pref:
# Update existing preference
existing_pref.preference_value = pref_value
existing_pref.confidence_score = confidence_score
existing_pref.updated_at = datetime.utcnow()
else:
# Create new preference
new_pref = UserPreference(
user_id=user_id,
preference_type=pref_type,
preference_key=pref_key,
preference_value=pref_value,
confidence_score=confidence_score
)
db.add(new_pref)
db.commit()
logger.info(f"Updated preferences for user {user_id}")
except Exception as e:
logger.error(f"Failed to update user preferences: {e}")
db.rollback()
def _infer_response_length_preference(self, analysis: Dict[str, Any]) -> str:
"""Infer user's preferred response length based on interaction patterns"""
response_quality = analysis.get('response_quality', {})
avg_feedback = response_quality.get('avg_feedback_score', 0)
# Simple heuristic: strong positive feedback favors detailed answers, negative favors concise ones
if avg_feedback > 0.5:
return 'detailed'
elif avg_feedback < -0.2:
return 'concise'
else:
return 'balanced'
def _calculate_confidence_score(self, analysis: Dict[str, Any], preference_type: str) -> float:
"""Calculate confidence score for a preference based on data volume and consistency"""
base_confidence = 0.5
# Factor in number of interactions
total_interactions = analysis.get('interaction_frequency', {}).get('total_interactions', 0)
interaction_factor = min(total_interactions / 100, 1.0) * 0.3
# Factor in feedback consistency
response_quality = analysis.get('response_quality', {})
feedback_count = response_quality.get('total_feedback_count', 0)
feedback_factor = min(feedback_count / 20, 1.0) * 0.2
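# Example: 50 recent interactions and 10 feedback events give
# 0.5 + min(50/100, 1) * 0.3 + min(10/20, 1) * 0.2 = 0.5 + 0.15 + 0.10 = 0.75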
return min(base_confidence + interaction_factor + feedback_factor, 1.0)
async def generate_personalized_prompt(self, db: Session, user_id: int, base_prompt: str) -> str:
"""Generate personalized system prompt based on user preferences"""
try:
# Get user preferences
preferences = db.query(UserPreference).filter(
UserPreference.user_id == user_id
).all()
# Build personalization context
personalization = []
for pref in preferences:
if pref.confidence_score > 0.6: # Only use high-confidence preferences
if pref.preference_type == 'response_style':
if pref.preference_key == 'preferred_length':
if pref.preference_value == 'concise':
personalization.append("Provide concise, direct answers.")
elif pref.preference_value == 'detailed':
personalization.append("Provide detailed, comprehensive explanations.")
elif pref.preference_type == 'topics':
topics = list(pref.preference_value.keys())[:3] # Top 3 topics
if topics:
personalization.append(f"User frequently asks about: {', '.join(topics)}")
# Combine base prompt with personalization
if personalization:
personal_context = "\n".join(personalization)
return f"{base_prompt}\n\nPersonalization context:\n{personal_context}"
return base_prompt
except Exception as e:
logger.error(f"Failed to generate personalized prompt: {e}")
return base_prompt
async def suggest_proactive_actions(self, db: Session, user_id: int) -> List[Dict[str, Any]]:
"""Suggest proactive actions based on user patterns"""
try:
analysis = await self.analyze_user_interactions(db, user_id)
suggestions = []
# Suggest reminders based on time patterns
time_patterns = analysis.get('time_patterns', {})
peak_hours = time_patterns.get('peak_hours', {})
if peak_hours:
# Pick the hour with the highest interaction count, not the latest hour
most_active_hour = max(peak_hours, key=peak_hours.get)
suggestions.append({
'type': 'reminder_optimization',
'message': f"You're most active at {most_active_hour}:00. Would you like to schedule important reminders around this time?",
'confidence': 0.7
})
# Suggest file organization based on usage patterns
search_patterns = analysis.get('search_patterns', {})
frequent_files = search_patterns.get('frequently_used_files', {})
if frequent_files:
suggestions.append({
'type': 'file_organization',
'message': "I noticed you frequently access certain files. Would you like me to create quick access shortcuts?",
'confidence': 0.6
})
# Suggest topic exploration based on interests
topic_prefs = analysis.get('topic_preferences', {})
common_topics = topic_prefs.get('most_common_topics', {})
if common_topics:
top_topic = max(common_topics.keys(), key=common_topics.get)
suggestions.append({
'type': 'content_discovery',
'message': f"You seem interested in {top_topic}. Would you like me to search for related documents in your files?",
'confidence': 0.5
})
return suggestions
except Exception as e:
logger.error(f"Failed to generate proactive suggestions: {e}")
return []
async def record_learning_pattern(self, db: Session, user_id: int, pattern_type: str, pattern_data: Dict[str, Any]):
"""Record a new learning pattern for future reference"""
try:
pattern = LearningPattern(
user_id=user_id,
pattern_type=pattern_type,
pattern_data=pattern_data,
confidence_score=0.5,
usage_count=1,
success_rate=0.0
)
db.add(pattern)
db.commit()
logger.info(f"Recorded learning pattern for user {user_id}: {pattern_type}")
except Exception as e:
logger.error(f"Failed to record learning pattern: {e}")
db.rollback()
async def update_pattern_success(self, db: Session, pattern_id: int, was_successful: bool):
"""Update the success rate of a learning pattern"""
try:
pattern = db.query(LearningPattern).filter(LearningPattern.id == pattern_id).first()
if pattern:
pattern.usage_count += 1
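# usage_count was just incremented, so (usage_count - 1) recovers the previous
# sample count; e.g. a 0.5 rate over 4 prior uses is 2 successes, and one more
# success yields 3 / 5 = 0.6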
current_success_rate = pattern.success_rate * (pattern.usage_count - 1)
if was_successful:
current_success_rate += 1
pattern.success_rate = current_success_rate / pattern.usage_count
pattern.updated_at = datetime.utcnow()
db.commit()
logger.info(f"Updated pattern {pattern_id} success rate: {pattern.success_rate}")
except Exception as e:
logger.error(f"Failed to update pattern success: {e}")
db.rollback()
# Global instance
learning_engine = LearningEngine()


@@ -0,0 +1,197 @@
from sentence_transformers import SentenceTransformer
import numpy as np
from typing import List, Union, Dict, Any
import hashlib
import os
import pickle
from pathlib import Path
from app.core.config import settings
import logging
logger = logging.getLogger(__name__)
class EmbeddingService:
def __init__(self, model_name: str = None):
self.model_name = model_name or settings.EMBEDDING_MODEL
self.model = None
self.cache_dir = Path("../data/embeddings_cache")
self.cache_dir.mkdir(parents=True, exist_ok=True)
self._load_model()
def _load_model(self):
"""Load the SentenceTransformer model"""
try:
logger.info(f"Loading embedding model: {self.model_name}")
self.model = SentenceTransformer(self.model_name)
logger.info("Embedding model loaded successfully")
except Exception as e:
logger.error(f"Failed to load embedding model: {e}")
raise Exception(f"Could not initialize embedding model: {e}")
def _get_cache_key(self, text: str) -> str:
"""Generate cache key for text"""
return hashlib.md5(f"{self.model_name}:{text}".encode()).hexdigest()
def _get_cached_embedding(self, cache_key: str) -> np.ndarray:
"""Get embedding from cache if available"""
cache_file = self.cache_dir / f"{cache_key}.pkl"
if cache_file.exists():
try:
with open(cache_file, 'rb') as f:
return pickle.load(f)
except Exception as e:
logger.warning(f"Failed to load cached embedding: {e}")
return None
def _cache_embedding(self, cache_key: str, embedding: np.ndarray):
"""Cache embedding for future use"""
cache_file = self.cache_dir / f"{cache_key}.pkl"
try:
with open(cache_file, 'wb') as f:
pickle.dump(embedding, f)
except Exception as e:
logger.warning(f"Failed to cache embedding: {e}")
def encode_text(self, text: str, use_cache: bool = True) -> np.ndarray:
"""Generate embedding for a single text"""
if not text or not text.strip():
return np.zeros(384) # Default embedding size for all-MiniLM-L6-v2
cache_key = self._get_cache_key(text)
# Check cache first
if use_cache:
cached_embedding = self._get_cached_embedding(cache_key)
if cached_embedding is not None:
return cached_embedding
try:
# Generate embedding
embedding = self.model.encode(text, convert_to_numpy=True)
# Cache for future use
if use_cache:
self._cache_embedding(cache_key, embedding)
return embedding
except Exception as e:
logger.error(f"Failed to generate embedding: {e}")
return np.zeros(384)
def encode_texts(self, texts: List[str], use_cache: bool = True, batch_size: int = 32) -> List[np.ndarray]:
"""Generate embeddings for multiple texts"""
if not texts:
return []
embeddings = []
texts_to_encode = []
cache_keys = []
indices_to_encode = []
# Check cache for each text
for i, text in enumerate(texts):
if not text or not text.strip():
embeddings.append(np.zeros(384))
cache_keys.append(None)  # keep cache_keys index-aligned with texts
continue
cache_key = self._get_cache_key(text)
cache_keys.append(cache_key)
if use_cache:
cached_embedding = self._get_cached_embedding(cache_key)
if cached_embedding is not None:
embeddings.append(cached_embedding)
continue
# Need to encode this text
texts_to_encode.append(text)
indices_to_encode.append(i)
embeddings.append(None) # Placeholder
# Encode texts that weren't cached
if texts_to_encode:
try:
new_embeddings = self.model.encode(
texts_to_encode,
convert_to_numpy=True,
batch_size=batch_size
)
# Cache and place new embeddings
for idx, embedding in zip(indices_to_encode, new_embeddings):
embeddings[idx] = embedding
if use_cache:
self._cache_embedding(cache_keys[idx], embedding)
except Exception as e:
logger.error(f"Failed to generate batch embeddings: {e}")
# Fill with zeros for failed embeddings
for idx in indices_to_encode:
embeddings[idx] = np.zeros(384)
return embeddings
def compute_similarity(self, embedding1: np.ndarray, embedding2: np.ndarray) -> float:
"""Compute cosine similarity between two embeddings"""
try:
# Normalize embeddings
norm1 = np.linalg.norm(embedding1)
norm2 = np.linalg.norm(embedding2)
if norm1 == 0 or norm2 == 0:
return 0.0
# Cosine similarity
similarity = np.dot(embedding1, embedding2) / (norm1 * norm2)
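# Result ranges from -1.0 (opposite) to 1.0 (identical direction)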
return float(similarity)
except Exception as e:
logger.error(f"Failed to compute similarity: {e}")
return 0.0
def find_most_similar(
self,
query_embedding: np.ndarray,
candidate_embeddings: List[np.ndarray],
top_k: int = 5
) -> List[Dict[str, Any]]:
"""Find most similar embeddings to query"""
similarities = []
for i, candidate in enumerate(candidate_embeddings):
similarity = self.compute_similarity(query_embedding, candidate)
similarities.append({
'index': i,
'similarity': similarity
})
# Sort by similarity (descending)
similarities.sort(key=lambda x: x['similarity'], reverse=True)
return similarities[:top_k]
def get_model_info(self) -> Dict[str, Any]:
"""Get information about the loaded model"""
if not self.model:
return {}
return {
'model_name': self.model_name,
'max_sequence_length': getattr(self.model, 'max_seq_length', 'unknown'),
'embedding_dimension': self.model.get_sentence_embedding_dimension(),
}
def clear_cache(self):
"""Clear the embedding cache"""
try:
for cache_file in self.cache_dir.glob("*.pkl"):
cache_file.unlink()
logger.info("Embedding cache cleared")
except Exception as e:
logger.error(f"Failed to clear cache: {e}")
# Global instance
embedding_service = EmbeddingService()


@@ -0,0 +1,316 @@
import os
import magic
from pathlib import Path
from typing import Dict, List, Any, Optional, Tuple
import logging
import hashlib
from datetime import datetime
# File processing imports
import PyPDF2
from docx import Document
from PIL import Image
import pytesseract
logger = logging.getLogger(__name__)
class FileProcessor:
def __init__(self):
self.supported_types = {
'application/pdf': self._process_pdf,
'application/vnd.openxmlformats-officedocument.wordprocessingml.document': self._process_docx,
'text/plain': self._process_text,
'text/markdown': self._process_text,
'image/jpeg': self._process_image,
'image/png': self._process_image,
'image/gif': self._process_image,
'image/bmp': self._process_image,
'image/tiff': self._process_image,
}
# Categories for auto-classification
self.categories = {
'work': ['project', 'meeting', 'report', 'presentation', 'proposal', 'contract', 'invoice'],
'personal': ['diary', 'journal', 'note', 'reminder', 'todo', 'list'],
'financial': ['budget', 'expense', 'income', 'tax', 'receipt', 'bank', 'investment'],
'education': ['course', 'study', 'lecture', 'assignment', 'exam', 'research'],
'health': ['medical', 'doctor', 'prescription', 'health', 'fitness', 'exercise'],
'travel': ['itinerary', 'booking', 'ticket', 'hotel', 'flight', 'vacation'],
'legal': ['contract', 'agreement', 'legal', 'law', 'court', 'document'],
'technical': ['code', 'programming', 'software', 'api', 'documentation', 'manual']
}
async def process_file(self, file_path: str, original_name: str) -> Dict[str, Any]:
"""Process a file and extract relevant information"""
try:
# Detect file MIME type
mime_type = magic.from_file(file_path, mime=True)
file_size = os.path.getsize(file_path)
# Extract content based on file type
content_info = {
'original_name': original_name,
'file_path': file_path,
'mime_type': mime_type,
'file_size': file_size,
'processed_at': datetime.utcnow().isoformat(),
'extracted_text': '',
'content_summary': '',
'categories': [],
'metadata': {}
}
if mime_type in self.supported_types:
processor = self.supported_types[mime_type]
extracted_data = await processor(file_path)
content_info.update(extracted_data)
else:
logger.warning(f"Unsupported file type: {mime_type}")
content_info['error'] = f"Unsupported file type: {mime_type}"
# Auto-categorize content
if content_info['extracted_text']:
content_info['categories'] = self._categorize_content(content_info['extracted_text'])
content_info['content_summary'] = self._generate_summary(content_info['extracted_text'])
return content_info
except Exception as e:
logger.error(f"Failed to process file {file_path}: {e}")
return {
'original_name': original_name,
'file_path': file_path,
'error': str(e),
'processed_at': datetime.utcnow().isoformat()
}
async def _process_pdf(self, file_path: str) -> Dict[str, Any]:
"""Extract text from PDF files"""
try:
extracted_text = ""
metadata = {}
with open(file_path, 'rb') as file:
pdf_reader = PyPDF2.PdfReader(file)
# Extract metadata
if pdf_reader.metadata:
metadata = {
'title': pdf_reader.metadata.get('/Title', ''),
'author': pdf_reader.metadata.get('/Author', ''),
'subject': pdf_reader.metadata.get('/Subject', ''),
'creator': pdf_reader.metadata.get('/Creator', ''),
'creation_date': str(pdf_reader.metadata.get('/CreationDate', '')),
}
metadata['page_count'] = len(pdf_reader.pages)
# Extract text from all pages
for page_num, page in enumerate(pdf_reader.pages):
try:
text = page.extract_text()
if text:
extracted_text += f"\n--- Page {page_num + 1} ---\n{text}\n"
except Exception as e:
logger.warning(f"Failed to extract text from page {page_num + 1}: {e}")
continue
return {
'extracted_text': extracted_text.strip(),
'metadata': metadata,
'file_type': 'pdf'
}
except Exception as e:
logger.error(f"Failed to process PDF {file_path}: {e}")
return {'extracted_text': '', 'error': str(e), 'file_type': 'pdf'}
async def _process_docx(self, file_path: str) -> Dict[str, Any]:
"""Extract text from DOCX files"""
try:
extracted_text = ""
metadata = {}
doc = Document(file_path)
# Extract metadata
core_props = doc.core_properties
metadata = {
'title': core_props.title or '',
'author': core_props.author or '',
'subject': core_props.subject or '',
'created': str(core_props.created) if core_props.created else '',
'modified': str(core_props.modified) if core_props.modified else '',
'keywords': core_props.keywords or '',
}
# Extract text from paragraphs
paragraphs = []
for paragraph in doc.paragraphs:
if paragraph.text.strip():
paragraphs.append(paragraph.text.strip())
extracted_text = '\n'.join(paragraphs)
metadata['paragraph_count'] = len(paragraphs)
return {
'extracted_text': extracted_text,
'metadata': metadata,
'file_type': 'docx'
}
except Exception as e:
logger.error(f"Failed to process DOCX {file_path}: {e}")
return {'extracted_text': '', 'error': str(e), 'file_type': 'docx'}
async def _process_text(self, file_path: str) -> Dict[str, Any]:
"""Process plain text files"""
try:
with open(file_path, 'r', encoding='utf-8', errors='ignore') as file:
content = file.read()
# Basic text file metadata
lines = content.split('\n')
words = content.split()
metadata = {
'line_count': len(lines),
'word_count': len(words),
'character_count': len(content)
}
return {
'extracted_text': content,
'metadata': metadata,
'file_type': 'text'
}
except Exception as e:
logger.error(f"Failed to process text file {file_path}: {e}")
return {'extracted_text': '', 'error': str(e), 'file_type': 'text'}
async def _process_image(self, file_path: str) -> Dict[str, Any]:
"""Extract text from images using OCR"""
try:
extracted_text = ""
metadata = {}
# Open image and extract metadata
with Image.open(file_path) as img:
metadata = {
'format': img.format,
'mode': img.mode,
'size': img.size,
'width': img.width,
'height': img.height
}
# Extract EXIF data if available
exif_data = img._getexif() if hasattr(img, '_getexif') else None
if exif_data:
metadata['exif'] = {str(k): str(v) for k, v in exif_data.items()}
# Perform OCR to extract text
try:
extracted_text = pytesseract.image_to_string(img)
if extracted_text.strip():
metadata['has_text'] = True
else:
metadata['has_text'] = False
except Exception as ocr_error:
logger.warning(f"OCR failed for {file_path}: {ocr_error}")
metadata['ocr_error'] = str(ocr_error)
return {
'extracted_text': extracted_text.strip(),
'metadata': metadata,
'file_type': 'image'
}
except Exception as e:
logger.error(f"Failed to process image {file_path}: {e}")
return {'extracted_text': '', 'error': str(e), 'file_type': 'image'}
def _categorize_content(self, text: str) -> List[str]:
"""Automatically categorize content based on keywords"""
text_lower = text.lower()
detected_categories = []
for category, keywords in self.categories.items():
# Count how many keywords from this category appear in the text
keyword_count = sum(1 for keyword in keywords if keyword in text_lower)
# Add the category when at least 20% of its keywords match (minimum of one)
threshold = max(1, len(keywords) * 0.2)
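# e.g. the 'work' list has 7 keywords, so threshold = max(1, 1.4) = 1.4
# and at least 2 keyword hits are required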
if keyword_count >= threshold:
detected_categories.append(category)
# If no categories detected, try to infer from common patterns
if not detected_categories:
if any(word in text_lower for word in ['meeting', 'project', 'deadline', 'task']):
detected_categories.append('work')
elif any(word in text_lower for word in ['personal', 'diary', 'thought', 'feeling']):
detected_categories.append('personal')
else:
detected_categories.append('general')
return detected_categories
def _generate_summary(self, text: str, max_length: int = 200) -> str:
"""Generate a simple summary of the content"""
if not text or len(text) <= max_length:
return text
# Simple extractive summary: take first few sentences
sentences = text.split('.')
summary = ""
for sentence in sentences:
if len(summary + sentence) <= max_length:
summary += sentence + "."
else:
break
return summary.strip() if summary else text[:max_length] + "..."
def chunk_text(self, text: str, chunk_size: int = 1000, overlap: int = 100) -> List[str]:
"""Split text into overlapping chunks for embedding"""
if not text:
return []
words = text.split()
chunks = []
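# Step by (chunk_size - overlap) so consecutive chunks share `overlap` words:
# with the defaults (1000, 100), chunks start at words 0, 900, 1800, ...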
for i in range(0, len(words), chunk_size - overlap):
chunk_words = words[i:i + chunk_size]
chunk_text = ' '.join(chunk_words)
if chunk_text.strip():
chunks.append(chunk_text.strip())
return chunks
def get_file_hash(self, file_path: str) -> str:
"""Generate hash of file content for deduplication"""
try:
hash_sha256 = hashlib.sha256()
with open(file_path, 'rb') as f:
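# Stream the file in 4 KB chunks so large files never load fully into memory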
for chunk in iter(lambda: f.read(4096), b""):
hash_sha256.update(chunk)
return hash_sha256.hexdigest()
except Exception as e:
logger.error(f"Failed to generate hash for {file_path}: {e}")
return ""
def is_supported_file(self, file_path: str) -> bool:
"""Check if file type is supported"""
try:
mime_type = magic.from_file(file_path, mime=True)
return mime_type in self.supported_types
except Exception:
return False
# Global instance
file_processor = FileProcessor()


@@ -0,0 +1,146 @@
import httpx
import json
from typing import Dict, List, Any, Optional, AsyncGenerator
from app.core.config import settings
import logging
logger = logging.getLogger(__name__)
class OllamaClient:
def __init__(self, base_url: str = None, model: str = None):
self.base_url = base_url or settings.OLLAMA_BASE_URL
self.model = model or settings.DEFAULT_LLM_MODEL
self.client = httpx.AsyncClient(timeout=60.0)
async def chat(
self,
messages: List[Dict[str, str]],
system_prompt: Optional[str] = None,
temperature: float = 0.7,
max_tokens: int = 2000
) -> str:
"""Send chat messages to Ollama and get response"""
try:
# Prepend the system prompt without mutating the caller's message list
if system_prompt:
messages = [{"role": "system", "content": system_prompt}] + messages
payload = {
"model": self.model,
"messages": messages,
"options": {
"temperature": temperature,
"num_predict": max_tokens
},
"stream": False
}
response = await self.client.post(
f"{self.base_url}/api/chat",
json=payload
)
response.raise_for_status()
result = response.json()
return result.get("message", {}).get("content", "")
except httpx.RequestError as e:
logger.error(f"Request error communicating with Ollama: {e}")
raise Exception(f"Failed to communicate with local LLM: {e}")
except httpx.HTTPStatusError as e:
logger.error(f"HTTP error from Ollama: {e}")
raise Exception(f"LLM service error: {e}")
async def chat_stream(
self,
messages: List[Dict[str, str]],
system_prompt: Optional[str] = None,
temperature: float = 0.7
) -> AsyncGenerator[str, None]:
"""Stream chat response from Ollama"""
try:
# Prepend the system prompt without mutating the caller's message list
if system_prompt:
messages = [{"role": "system", "content": system_prompt}] + messages
payload = {
"model": self.model,
"messages": messages,
"options": {
"temperature": temperature
},
"stream": True
}
async with self.client.stream(
"POST",
f"{self.base_url}/api/chat",
json=payload
) as response:
response.raise_for_status()
async for line in response.aiter_lines():
if line:
try:
data = json.loads(line)
if "message" in data and "content" in data["message"]:
yield data["message"]["content"]
except json.JSONDecodeError:
continue
except httpx.RequestError as e:
logger.error(f"Request error streaming from Ollama: {e}")
raise Exception(f"Failed to stream from local LLM: {e}")
async def generate_embedding(self, text: str) -> List[float]:
"""Generate embeddings using Ollama (if supported by model)"""
try:
payload = {
"model": "nomic-embed-text", # Use embedding-specific model
"prompt": text
}
response = await self.client.post(
f"{self.base_url}/api/embeddings",
json=payload
)
response.raise_for_status()
result = response.json()
return result.get("embedding", [])
except httpx.RequestError as e:
logger.error(f"Request error getting embeddings from Ollama: {e}")
return []
except httpx.HTTPStatusError as e:
logger.error(f"HTTP error getting embeddings from Ollama: {e}")
return []
async def check_health(self) -> bool:
"""Check if Ollama service is available"""
try:
response = await self.client.get(f"{self.base_url}/api/tags")
return response.status_code == 200
except Exception:
return False
async def list_models(self) -> List[str]:
"""List available models in Ollama"""
try:
response = await self.client.get(f"{self.base_url}/api/tags")
response.raise_for_status()
result = response.json()
models = result.get("models", [])
return [model["name"] for model in models]
except httpx.RequestError as e:
logger.error(f"Request error listing models from Ollama: {e}")
return []
async def close(self):
"""Close the HTTP client"""
await self.client.aclose()
# Global instance
ollama_client = OllamaClient()


@@ -0,0 +1,241 @@
import chromadb
from chromadb.config import Settings
from typing import List, Dict, Any, Optional, Tuple
import uuid
from pathlib import Path
from app.core.config import settings
import logging
logger = logging.getLogger(__name__)
class VectorStore:
def __init__(self, persist_directory: str = None, collection_name: str = None):
self.persist_directory = persist_directory or str(settings.VECTOR_DB_DIR)
self.collection_name = collection_name or settings.VECTOR_COLLECTION_NAME
self.client = None
self.collection = None
self._initialize_client()
def _initialize_client(self):
"""Initialize ChromaDB client and collection"""
try:
# Create persistent client
self.client = chromadb.PersistentClient(
path=self.persist_directory,
settings=Settings(anonymized_telemetry=False)
)
# Get or create collection
self.collection = self.client.get_or_create_collection(
name=self.collection_name,
metadata={"hnsw:space": "cosine"} # Use cosine similarity
)
logger.info(f"Vector store initialized with collection: {self.collection_name}")
except Exception as e:
logger.error(f"Failed to initialize vector store: {e}")
raise Exception(f"Could not initialize vector database: {e}")
def add_documents(
self,
documents: List[str],
embeddings: List[List[float]],
metadatas: List[Dict[str, Any]],
ids: Optional[List[str]] = None
) -> List[str]:
"""Add documents with embeddings to the vector store"""
try:
if not documents or not embeddings:
return []
# Generate IDs if not provided
if ids is None:
ids = [str(uuid.uuid4()) for _ in documents]
# Ensure all lists have the same length
if not (len(documents) == len(embeddings) == len(metadatas) == len(ids)):
raise ValueError("Documents, embeddings, metadatas, and ids must have the same length")
# Add to collection
self.collection.add(
documents=documents,
embeddings=embeddings,
metadatas=metadatas,
ids=ids
)
logger.info(f"Added {len(documents)} documents to vector store")
return ids
except Exception as e:
logger.error(f"Failed to add documents to vector store: {e}")
raise Exception(f"Could not add documents to vector database: {e}")
def search_similar(
self,
query_embedding: List[float],
n_results: int = 5,
where: Optional[Dict[str, Any]] = None,
include: List[str] = None
) -> Dict[str, List[Any]]:
"""Search for similar documents using embedding"""
try:
if include is None:
include = ["documents", "metadatas", "distances"]
results = self.collection.query(
query_embeddings=[query_embedding],
n_results=n_results,
where=where,
include=include
)
# Flatten results since we only query with one embedding
flattened_results = {}
for key, values in results.items():
if values and len(values) > 0:
flattened_results[key] = values[0]
else:
flattened_results[key] = []
return flattened_results
except Exception as e:
logger.error(f"Failed to search vector store: {e}")
return {"documents": [], "metadatas": [], "distances": []}
def search_by_text(
self,
query_text: str,
n_results: int = 5,
where: Optional[Dict[str, Any]] = None
) -> Dict[str, List[Any]]:
"""Search for similar documents using text query"""
try:
results = self.collection.query(
query_texts=[query_text],
n_results=n_results,
where=where,
include=["documents", "metadatas", "distances"]
)
# Flatten results
flattened_results = {}
for key, values in results.items():
if values and len(values) > 0:
flattened_results[key] = values[0]
else:
flattened_results[key] = []
return flattened_results
except Exception as e:
logger.error(f"Failed to search vector store by text: {e}")
return {"documents": [], "metadatas": [], "distances": []}
def get_documents_by_ids(self, ids: List[str]) -> Dict[str, List[Any]]:
"""Retrieve documents by their IDs"""
try:
results = self.collection.get(
ids=ids,
include=["documents", "metadatas"]
)
return results
except Exception as e:
logger.error(f"Failed to get documents by IDs: {e}")
return {"documents": [], "metadatas": []}
def update_document(
self,
document_id: str,
document: Optional[str] = None,
embedding: Optional[List[float]] = None,
metadata: Optional[Dict[str, Any]] = None
) -> bool:
"""Update an existing document"""
try:
update_data = {"ids": [document_id]}
if document is not None:
update_data["documents"] = [document]
if embedding is not None:
update_data["embeddings"] = [embedding]
if metadata is not None:
update_data["metadatas"] = [metadata]
self.collection.update(**update_data)
logger.info(f"Updated document: {document_id}")
return True
except Exception as e:
logger.error(f"Failed to update document {document_id}: {e}")
return False
def delete_documents(self, ids: List[str]) -> bool:
"""Delete documents by their IDs"""
try:
self.collection.delete(ids=ids)
logger.info(f"Deleted {len(ids)} documents from vector store")
return True
except Exception as e:
logger.error(f"Failed to delete documents: {e}")
return False
def delete_by_metadata(self, where: Dict[str, Any]) -> bool:
"""Delete documents by metadata criteria"""
try:
self.collection.delete(where=where)
logger.info(f"Deleted documents matching criteria: {where}")
return True
except Exception as e:
logger.error(f"Failed to delete documents by metadata: {e}")
return False
def get_collection_stats(self) -> Dict[str, Any]:
"""Get statistics about the collection"""
try:
count = self.collection.count()
return {
"collection_name": self.collection_name,
"total_documents": count,
"persist_directory": self.persist_directory
}
except Exception as e:
logger.error(f"Failed to get collection stats: {e}")
return {}
def clear_collection(self) -> bool:
"""Clear all documents from the collection"""
try:
# Delete the collection and recreate it
self.client.delete_collection(self.collection_name)
self.collection = self.client.get_or_create_collection(
name=self.collection_name,
metadata={"hnsw:space": "cosine"}
)
logger.info(f"Cleared collection: {self.collection_name}")
return True
except Exception as e:
logger.error(f"Failed to clear collection: {e}")
return False
def create_user_collection(self, user_id: int, collection_name: str = None) -> 'VectorStore':
"""Create a user-specific collection"""
if collection_name is None:
collection_name = f"user_{user_id}_documents"
return VectorStore(
persist_directory=self.persist_directory,
collection_name=collection_name
)
# Global instance
vector_store = VectorStore()

198
backend/app/api/auth.py Normal file

@@ -0,0 +1,198 @@
from fastapi import APIRouter, Depends, HTTPException, status
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
from sqlalchemy.orm import Session
from datetime import timedelta
from pydantic import BaseModel, EmailStr
from typing import Optional
from app.db.database import get_db
from app.db.models import User
from app.core.security import verify_password, get_password_hash, create_access_token, decode_token
from app.core.config import settings
import logging
logger = logging.getLogger(__name__)
router = APIRouter()
security = HTTPBearer()
# Pydantic models
class UserRegistration(BaseModel):
username: str
email: EmailStr
password: str
fullName: Optional[str] = None
class UserLogin(BaseModel):
username: str
password: str
class UserResponse(BaseModel):
id: int
username: str
email: str
fullName: Optional[str] = None
createdAt: str
class Config:
from_attributes = True
class TokenResponse(BaseModel):
access_token: str
token_type: str = "bearer"
user: UserResponse
# Dependency to get current user
async def get_current_user(
credentials: HTTPAuthorizationCredentials = Depends(security),
db: Session = Depends(get_db)
) -> User:
"""Get current authenticated user"""
token = credentials.credentials
username = decode_token(token)
if username is None:
raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Invalid authentication credentials",
headers={"WWW-Authenticate": "Bearer"},
)
user = db.query(User).filter(User.username == username).first()
if user is None:
raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="User not found",
headers={"WWW-Authenticate": "Bearer"},
)
return user
@router.post("/register", response_model=TokenResponse)
async def register_user(user_data: UserRegistration, db: Session = Depends(get_db)):
"""Register a new user"""
try:
# Check if username already exists
existing_user = db.query(User).filter(User.username == user_data.username).first()
if existing_user:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail="Username already registered"
)
# Check if email already exists
existing_email = db.query(User).filter(User.email == user_data.email).first()
if existing_email:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail="Email already registered"
)
# Create new user
hashed_password = get_password_hash(user_data.password)
new_user = User(
username=user_data.username,
email=user_data.email,
hashed_password=hashed_password,
full_name=user_data.fullName
)
db.add(new_user)
db.commit()
db.refresh(new_user)
# Create access token
access_token = create_access_token(
subject=new_user.username,
expires_delta=timedelta(minutes=settings.ACCESS_TOKEN_EXPIRE_MINUTES)
)
logger.info(f"New user registered: {new_user.username}")
return TokenResponse(
access_token=access_token,
user=UserResponse(
id=new_user.id,
username=new_user.username,
email=new_user.email,
fullName=new_user.full_name,
createdAt=new_user.created_at.isoformat()
)
)
except HTTPException:
raise
except Exception as e:
logger.error(f"Registration error: {e}")
db.rollback()
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
detail="Registration failed"
)
@router.post("/login", response_model=TokenResponse)
async def login_user(login_data: UserLogin, db: Session = Depends(get_db)):
"""Authenticate user and return token"""
try:
# Find user
user = db.query(User).filter(User.username == login_data.username).first()
if not user:
raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Invalid username or password"
)
# Verify password
if not verify_password(login_data.password, user.hashed_password):
raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Invalid username or password"
)
# Update last login
from datetime import datetime
user.last_login = datetime.utcnow()
db.commit()
# Create access token
access_token = create_access_token(
subject=user.username,
expires_delta=timedelta(minutes=settings.ACCESS_TOKEN_EXPIRE_MINUTES)
)
logger.info(f"User logged in: {user.username}")
return TokenResponse(
access_token=access_token,
user=UserResponse(
id=user.id,
username=user.username,
email=user.email,
fullName=user.full_name,
createdAt=user.created_at.isoformat()
)
)
except HTTPException:
raise
except Exception as e:
logger.error(f"Login error: {e}")
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
detail="Login failed"
)
@router.post("/logout")
async def logout_user(current_user: User = Depends(get_current_user)):
"""Logout user (client-side token removal)"""
logger.info(f"User logged out: {current_user.username}")
return {"message": "Successfully logged out"}
@router.get("/me", response_model=UserResponse)
async def get_current_user_info(current_user: User = Depends(get_current_user)):
"""Get current user information"""
return UserResponse(
id=current_user.id,
username=current_user.username,
email=current_user.email,
fullName=current_user.full_name,
createdAt=current_user.created_at.isoformat()
)

57
backend/app/core/config.py Normal file

@@ -0,0 +1,57 @@
from pydantic_settings import BaseSettings
from typing import List
import os
from pathlib import Path
class Settings(BaseSettings):
# App Configuration
APP_NAME: str = "aPersona"
APP_VERSION: str = "1.0.0"
DEBUG: bool = True
API_V1_STR: str = "/api/v1"
# Security
SECRET_KEY: str = "your-secret-key-change-in-production"
ACCESS_TOKEN_EXPIRE_MINUTES: int = 60 * 24 * 8 # 8 days
ALGORITHM: str = "HS256"
# Database
DATABASE_URL: str = "sqlite:///./apersona.db"
# File Storage
UPLOAD_DIR: Path = Path("../data/uploads")
PROCESSED_DIR: Path = Path("../data/processed")
VECTOR_DB_DIR: Path = Path("../data/vectors")
MAX_FILE_SIZE: int = 100 * 1024 * 1024 # 100MB
# AI Configuration
OLLAMA_BASE_URL: str = "http://localhost:11434"
DEFAULT_LLM_MODEL: str = "mistral"
EMBEDDING_MODEL: str = "all-MiniLM-L6-v2"
VECTOR_COLLECTION_NAME: str = "apersona_documents"
# CORS
BACKEND_CORS_ORIGINS: List[str] = [
"http://localhost:3000",
"http://localhost:5173",
"http://127.0.0.1:3000",
"http://127.0.0.1:5173",
]
# Auto-Learning Configuration
LEARNING_UPDATE_INTERVAL: int = 3600 # 1 hour in seconds
MIN_INTERACTIONS_FOR_LEARNING: int = 10
FEEDBACK_WEIGHT: float = 0.1
def __init__(self, **kwargs):
super().__init__(**kwargs)
# Create directories if they don't exist
for directory in [self.UPLOAD_DIR, self.PROCESSED_DIR, self.VECTOR_DB_DIR]:
directory.mkdir(parents=True, exist_ok=True)
class Config:
env_file = ".env"
settings = Settings()

45
backend/app/core/security.py Normal file

@@ -0,0 +1,45 @@
from datetime import datetime, timedelta
from typing import Optional, Union, Any
from jose import jwt
from passlib.context import CryptContext
from app.core.config import settings
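# bcrypt via passlib; deprecated="auto" marks non-default schemes for rehash on verify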
pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")
def create_access_token(
subject: Union[str, Any], expires_delta: Optional[timedelta] = None
) -> str:
"""Create JWT access token"""
if expires_delta:
expire = datetime.utcnow() + expires_delta
else:
expire = datetime.utcnow() + timedelta(
minutes=settings.ACCESS_TOKEN_EXPIRE_MINUTES
)
to_encode = {"exp": expire, "sub": str(subject)}
encoded_jwt = jwt.encode(to_encode, settings.SECRET_KEY, algorithm=settings.ALGORITHM)
return encoded_jwt
def verify_password(plain_password: str, hashed_password: str) -> bool:
"""Verify a password against its hash"""
return pwd_context.verify(plain_password, hashed_password)
def get_password_hash(password: str) -> str:
"""Generate password hash"""
return pwd_context.hash(password)
def decode_token(token: str) -> Optional[str]:
"""Decode JWT token and return subject"""
try:
payload = jwt.decode(
token, settings.SECRET_KEY, algorithms=[settings.ALGORITHM]
)
token_data = payload.get("sub")
return token_data
except jwt.JWTError:
return None

22
backend/app/db/database.py Normal file

@@ -0,0 +1,22 @@
from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
from app.core.config import settings
engine = create_engine(
settings.DATABASE_URL,
connect_args={"check_same_thread": False} # Only for SQLite
)
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
Base = declarative_base()
def get_db():
"""Dependency to get database session"""
db = SessionLocal()
try:
yield db
finally:
db.close()

149
backend/app/db/models.py Normal file

@@ -0,0 +1,149 @@
from sqlalchemy import Boolean, Column, Integer, String, DateTime, Text, Float, ForeignKey, JSON
from sqlalchemy.orm import relationship
from sqlalchemy.sql import func
from app.db.database import Base
class User(Base):
__tablename__ = "users"
id = Column(Integer, primary_key=True, index=True)
username = Column(String, unique=True, index=True, nullable=False)
email = Column(String, unique=True, index=True, nullable=False)
hashed_password = Column(String, nullable=False)
full_name = Column(String, nullable=True)
is_active = Column(Boolean, default=True)
created_at = Column(DateTime(timezone=True), server_default=func.now())
last_login = Column(DateTime(timezone=True), nullable=True)
# Relationships
files = relationship("UserFile", back_populates="owner")
interactions = relationship("UserInteraction", back_populates="user")
preferences = relationship("UserPreference", back_populates="user")
reminders = relationship("Reminder", back_populates="user")
class UserFile(Base):
__tablename__ = "user_files"
id = Column(Integer, primary_key=True, index=True)
filename = Column(String, nullable=False)
original_name = Column(String, nullable=False)
file_path = Column(String, nullable=False)
file_type = Column(String, nullable=False) # pdf, txt, docx, image, etc.
file_size = Column(Integer, nullable=False)
mime_type = Column(String, nullable=True)
# Content analysis
content_summary = Column(Text, nullable=True)
extracted_text = Column(Text, nullable=True)
categories = Column(JSON, nullable=True) # List of auto-detected categories
tags = Column(JSON, nullable=True) # User-defined tags
# Metadata
created_at = Column(DateTime(timezone=True), server_default=func.now())
updated_at = Column(DateTime(timezone=True), onupdate=func.now())
last_accessed = Column(DateTime(timezone=True), nullable=True)
access_count = Column(Integer, default=0)
# Relationships
owner_id = Column(Integer, ForeignKey("users.id"))
owner = relationship("User", back_populates="files")
class UserInteraction(Base):
__tablename__ = "user_interactions"
id = Column(Integer, primary_key=True, index=True)
interaction_type = Column(String, nullable=False) # query, file_upload, search, etc.
query = Column(Text, nullable=True)
response = Column(Text, nullable=True)
context = Column(JSON, nullable=True) # Additional context data
# Quality metrics
response_time = Column(Float, nullable=True)
user_feedback = Column(Integer, nullable=True) # -1, 0, 1 (negative, neutral, positive)
was_helpful = Column(Boolean, nullable=True)
# Learning data
used_files = Column(JSON, nullable=True) # List of file IDs used in response
search_terms = Column(JSON, nullable=True)
created_at = Column(DateTime(timezone=True), server_default=func.now())
# Relationships
user_id = Column(Integer, ForeignKey("users.id"))
user = relationship("User", back_populates="interactions")
class UserPreference(Base):
__tablename__ = "user_preferences"
id = Column(Integer, primary_key=True, index=True)
preference_type = Column(String, nullable=False) # response_style, categories, etc.
preference_key = Column(String, nullable=False)
preference_value = Column(JSON, nullable=False)
confidence_score = Column(Float, default=0.5) # How confident we are about this preference
created_at = Column(DateTime(timezone=True), server_default=func.now())
updated_at = Column(DateTime(timezone=True), onupdate=func.now())
# Relationships
user_id = Column(Integer, ForeignKey("users.id"))
user = relationship("User", back_populates="preferences")
class Reminder(Base):
__tablename__ = "reminders"
id = Column(Integer, primary_key=True, index=True)
title = Column(String, nullable=False)
description = Column(Text, nullable=True)
reminder_time = Column(DateTime(timezone=True), nullable=False)
is_completed = Column(Boolean, default=False)
is_recurring = Column(Boolean, default=False)
recurrence_pattern = Column(String, nullable=True) # daily, weekly, monthly
# Context for AI suggestions
context_files = Column(JSON, nullable=True) # Related file IDs
auto_generated = Column(Boolean, default=False) # Was this generated by AI?
priority = Column(Integer, default=1) # 1-5 priority scale
created_at = Column(DateTime(timezone=True), server_default=func.now())
updated_at = Column(DateTime(timezone=True), onupdate=func.now())
# Relationships
user_id = Column(Integer, ForeignKey("users.id"))
user = relationship("User", back_populates="reminders")
class LearningPattern(Base):
__tablename__ = "learning_patterns"
id = Column(Integer, primary_key=True, index=True)
pattern_type = Column(String, nullable=False) # time_based, topic_based, etc.
pattern_data = Column(JSON, nullable=False)
confidence_score = Column(Float, default=0.0)
usage_count = Column(Integer, default=0)
success_rate = Column(Float, default=0.0)
created_at = Column(DateTime(timezone=True), server_default=func.now())
updated_at = Column(DateTime(timezone=True), onupdate=func.now())
# Relationships
user_id = Column(Integer, ForeignKey("users.id"))
class DocumentEmbedding(Base):
__tablename__ = "document_embeddings"
id = Column(Integer, primary_key=True, index=True)
file_id = Column(Integer, ForeignKey("user_files.id"))
chunk_index = Column(Integer, nullable=False) # For large documents split into chunks
chunk_text = Column(Text, nullable=False)
embedding_id = Column(String, nullable=False) # ID in vector database
created_at = Column(DateTime(timezone=True), server_default=func.now())
# Relationships
file = relationship("UserFile")
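
A short bootstrap sketch for these models, assuming the password helper is importable from `app.core.security`:

```python
from app.core.security import get_password_hash  # assumed module path
from app.db.database import Base, SessionLocal, engine
from app.db.models import User

Base.metadata.create_all(bind=engine)  # create all tables declared above

db = SessionLocal()
try:
    user = User(
        username="alice",
        email="alice@example.com",
        hashed_password=get_password_hash("s3cret"),
    )
    db.add(user)
    db.commit()
    db.refresh(user)  # pulls server defaults such as created_at
finally:
    db.close()
```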

221
backend/app/main.py Normal file
View File

@ -0,0 +1,221 @@
from fastapi import FastAPI, Request, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from fastapi.middleware.trustedhost import TrustedHostMiddleware
from fastapi.responses import JSONResponse
import time
import logging
from contextlib import asynccontextmanager
from app.core.config import settings
from app.db.database import engine
from app.db.models import Base
# Import routers
from app.api.auth import router as auth_router
# from app.api.files import router as files_router
# from app.api.chat import router as chat_router
# from app.api.reminders import router as reminders_router
# from app.api.search import router as search_router
logger = logging.getLogger(__name__)
@asynccontextmanager
async def lifespan(app: FastAPI):
"""Lifespan context manager for startup and shutdown events"""
# Startup
logger.info("Starting aPersona backend...")
# Create database tables
Base.metadata.create_all(bind=engine)
logger.info("Database tables created")
# Initialize AI components
try:
from ai_core.embeddings.embedding_service import embedding_service
from ai_core.rag.vector_store import vector_store
from ai_core.llm.ollama_client import ollama_client
# Test Ollama connection
is_healthy = await ollama_client.check_health()
if is_healthy:
logger.info("Ollama connection established")
else:
logger.warning("Ollama service not available - some features may be limited")
# Initialize vector store
stats = vector_store.get_collection_stats()
logger.info(f"Vector store initialized: {stats}")
# Test embedding service
embedding_info = embedding_service.get_model_info()
logger.info(f"Embedding service ready: {embedding_info}")
except Exception as e:
logger.error(f"Failed to initialize AI components: {e}")
yield
# Shutdown
logger.info("Shutting down aPersona backend...")
try:
await ollama_client.close()
    except Exception:
pass
# Create FastAPI app
app = FastAPI(
title=settings.APP_NAME,
version=settings.APP_VERSION,
description="AI-powered personal assistant that works completely offline",
lifespan=lifespan
)
# Add CORS middleware
app.add_middleware(
CORSMiddleware,
allow_origins=settings.BACKEND_CORS_ORIGINS,
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
# Add trusted host middleware for security
app.add_middleware(
TrustedHostMiddleware,
allowed_hosts=["localhost", "127.0.0.1", "*.localhost"]
)
# Request timing middleware
@app.middleware("http")
async def add_process_time_header(request: Request, call_next):
"""Add processing time to response headers"""
start_time = time.time()
response = await call_next(request)
process_time = time.time() - start_time
response.headers["X-Process-Time"] = str(process_time)
return response
# Global exception handler
@app.exception_handler(Exception)
async def global_exception_handler(request: Request, exc: Exception):
"""Global exception handler for unhandled errors"""
logger.error(f"Unhandled error for {request.url}: {exc}")
return JSONResponse(
status_code=500,
content={
"detail": "Internal server error",
"error": str(exc) if settings.DEBUG else "An unexpected error occurred"
}
)
# Health check endpoint
@app.get("/health")
async def health_check():
"""Health check endpoint"""
try:
from ai_core.llm.ollama_client import ollama_client
ollama_healthy = await ollama_client.check_health()
return {
"status": "healthy",
"app_name": settings.APP_NAME,
"version": settings.APP_VERSION,
"services": {
"database": "healthy",
"ollama": "healthy" if ollama_healthy else "unhealthy",
"embeddings": "healthy",
"vector_store": "healthy"
}
}
except Exception as e:
logger.error(f"Health check failed: {e}")
return JSONResponse(
status_code=503,
content={
"status": "unhealthy",
"error": str(e)
}
)
# Root endpoint
@app.get("/")
async def root():
"""Root endpoint"""
return {
"message": f"Welcome to {settings.APP_NAME}",
"version": settings.APP_VERSION,
"description": "AI-powered personal assistant - fully local and private",
"endpoints": {
"health": "/health",
"docs": "/docs",
"api": settings.API_V1_STR
}
}
# System info endpoint
@app.get(f"{settings.API_V1_STR}/system/info")
async def get_system_info():
"""Get system information and capabilities"""
try:
from ai_core.embeddings.embedding_service import embedding_service
from ai_core.rag.vector_store import vector_store
from ai_core.llm.ollama_client import ollama_client
# Get AI service information
embedding_info = embedding_service.get_model_info()
vector_stats = vector_store.get_collection_stats()
available_models = await ollama_client.list_models()
return {
"app_info": {
"name": settings.APP_NAME,
"version": settings.APP_VERSION,
"debug": settings.DEBUG
},
"ai_services": {
"embedding_model": embedding_info,
"vector_store": vector_stats,
"available_llm_models": available_models,
"current_llm_model": settings.DEFAULT_LLM_MODEL
},
"capabilities": {
"file_processing": [
"PDF", "DOCX", "TXT", "Images (OCR)",
"Markdown", "PNG", "JPEG", "GIF"
],
"ai_features": [
"Semantic search", "Auto-categorization",
"Smart reminders", "Personalized responses",
"Learning from interactions"
]
}
}
except Exception as e:
logger.error(f"Failed to get system info: {e}")
raise HTTPException(status_code=500, detail="Failed to retrieve system information")
# Include API routers
app.include_router(auth_router, prefix=f"{settings.API_V1_STR}/auth", tags=["authentication"])
# app.include_router(files_router, prefix=f"{settings.API_V1_STR}/files", tags=["files"])
# app.include_router(chat_router, prefix=f"{settings.API_V1_STR}/chat", tags=["chat"])
# app.include_router(reminders_router, prefix=f"{settings.API_V1_STR}/reminders", tags=["reminders"])
# app.include_router(search_router, prefix=f"{settings.API_V1_STR}/search", tags=["search"])
if __name__ == "__main__":
import uvicorn
uvicorn.run(
"app.main:app",
host="0.0.0.0",
port=8000,
reload=settings.DEBUG,
log_level="info" if not settings.DEBUG else "debug"
)
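
A quick smoke test against the endpoints above, using `httpx` (already pinned in the backend requirements); the URLs assume the default dev setup:

```python
import asyncio

import httpx

async def smoke_test() -> None:
    async with httpx.AsyncClient(base_url="http://localhost:8000") as client:
        health = await client.get("/health")
        print(health.json()["services"])       # per-service status map
        info = await client.get("/api/v1/system/info")
        print(info.json()["capabilities"])     # supported file types and AI features

asyncio.run(smoke_test())
```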

41
backend/requirements.txt Normal file
View File

@ -0,0 +1,41 @@
# FastAPI and Web Server
fastapi==0.104.1
uvicorn[standard]==0.24.0
python-multipart==0.0.6
# Database and ORM
sqlalchemy==2.0.23
alembic==1.12.1
# sqlite3 ships with the Python standard library (no pip package needed)
# Authentication and Security
python-jose[cryptography]==3.3.0
passlib[bcrypt]==1.7.4
# AI and ML
torch==2.1.1
transformers==4.35.2
sentence-transformers==2.2.2
chromadb==0.4.15
ollama==0.1.8
huggingface-hub==0.19.4
# File Processing
PyPDF2==3.0.1
python-docx==1.1.0
Pillow==10.1.0
python-magic==0.4.27
# Utilities
pydantic==2.5.0
python-dotenv==1.0.0
httpx==0.25.2
aiofiles==23.2.1
schedule==1.2.0
# Development
pytest==7.4.3
pytest-asyncio==0.21.1
black==23.11.0
isort==5.12.0

0
data/embeddings_cache/.gitkeep Normal file
View File

0
data/processed/.gitkeep Normal file
View File

0
data/uploads/.gitkeep Normal file
View File

0
data/vectors/.gitkeep Normal file
View File

351
docs/ARCHITECTURE.md Normal file
View File

@ -0,0 +1,351 @@
# aPersona System Architecture
## Overview
aPersona is a fully local, AI-powered personal assistant designed to work entirely offline while providing intelligent, context-aware assistance based on your personal files and behavior patterns.
## Core Principles
- **100% Local**: No data leaves your device
- **Privacy-First**: All processing happens on your machine
- **Adaptive Learning**: Continuously improves based on your interactions
- **Context-Aware**: Understands your personal documents and preferences
## System Architecture
### Backend (Python FastAPI)
```
backend/
├── app/
│ ├── api/ # REST API endpoints
│ ├── core/ # Core configuration and security
│ ├── db/ # Database models and connections
│ └── services/ # Business logic services
├── ai_core/ # AI/ML components
│ ├── embeddings/ # Text embedding service
│ ├── llm/ # Local LLM integration (Ollama)
│ ├── rag/ # Retrieval-Augmented Generation
│ └── auto_learning/ # Adaptive learning engine
└── requirements.txt
```
#### Key Components
1. **FastAPI Application**: RESTful API server
2. **SQLAlchemy ORM**: Database management with SQLite
3. **Authentication**: JWT-based user authentication
4. **File Processing**: Multi-format document processing
5. **Vector Database**: ChromaDB for semantic search
6. **Local LLM**: Ollama integration for AI responses
### Frontend (React + TypeScript)
```
frontend/
├── src/
│ ├── components/ # Reusable UI components
│ ├── pages/ # Page-level components
│ ├── services/ # API service layer
│ ├── store/ # State management (Zustand)
│ └── utils/ # Utility functions
├── index.html
└── package.json
```
#### Key Technologies
1. **React 18**: Modern UI framework
2. **TypeScript**: Type-safe development
3. **TailwindCSS**: Utility-first styling
4. **Vite**: Fast build tool and dev server
5. **React Query**: Server state management
6. **Zustand**: Client state management
### AI Core Components
#### 1. Embedding Service (`ai_core/embeddings/`)
- **Purpose**: Convert text to numerical vectors for semantic search
- **Model**: SentenceTransformers (all-MiniLM-L6-v2)
- **Features**:
- Caching for performance
- Batch processing
- Similarity computation
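
A minimal sketch of the embedding step (the caching and batch plumbing are omitted):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dimensional vectors

texts = ["quarterly tax documents", "photos from the hiking trip"]
embeddings = model.encode(texts, batch_size=32, convert_to_tensor=True)

# Cosine similarity is the basis for semantic search
print(float(util.cos_sim(embeddings[0], embeddings[1])))
```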
#### 2. Vector Store (`ai_core/rag/`)
- **Purpose**: Store and search document embeddings
- **Technology**: ChromaDB with persistent storage
- **Capabilities**:
- Semantic similarity search
- Metadata filtering
- User-specific collections
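
A sketch of the ChromaDB flow; the collection name is illustrative (one collection per user):

```python
import chromadb

client = chromadb.PersistentClient(path="data/vectors")  # persistent storage
collection = client.get_or_create_collection("user_1_documents")  # illustrative name

collection.add(
    ids=["file42-chunk0"],
    documents=["Lease agreement for the apartment at ..."],
    metadatas=[{"file_id": 42, "file_type": "pdf"}],
)

# Semantic similarity search with metadata filtering
results = collection.query(
    query_texts=["when does my lease end?"],
    n_results=5,
    where={"file_type": "pdf"},
)
```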
#### 3. LLM Integration (`ai_core/llm/`)
- **Purpose**: Local language model integration
- **Technology**: Ollama (supports Mistral, LLaMA, etc.)
- **Features**:
- Streaming responses
- Context management
- Error handling
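
For reference, a minimal non-streaming call against Ollama's REST API, which the client wraps:

```python
import httpx

response = httpx.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "mistral",
        "prompt": "Summarize my meeting notes in two sentences.",
        "stream": False,  # set True to consume the answer chunk by chunk
    },
    timeout=120.0,
)
print(response.json()["response"])
```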
#### 4. File Processing (`ai_core/file_processing/`)
- **Supported Formats**: PDF, DOCX, TXT, Images (OCR), Markdown
- **Features**:
- Content extraction
- Auto-categorization
- Metadata extraction
- Text chunking for embeddings
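
The chunking step, roughly (chunk size and overlap are illustrative values, not the project's settings):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split extracted text into overlapping chunks for embedding."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start : start + chunk_size])
        start += chunk_size - overlap  # overlap preserves context across boundaries
    return chunks
```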
## Auto-Learning System
The auto-learning module is the heart of aPersona's intelligence, continuously adapting to user behavior and preferences.
### Learning Components
#### 1. Interaction Analysis
```python
class LearningEngine:
    async def analyze_user_interactions(self, user_id: int):
        """Analyze patterns in a user's queries and responses:

        - frequency patterns
        - topic preferences
        - response quality metrics
        - search patterns
        - time-based usage patterns
        """
```
#### 2. Preference Learning
The system learns user preferences across multiple dimensions:
- **Response Style**: Concise vs. detailed responses
- **Topic Interests**: Frequently discussed subjects
- **Time Patterns**: When user is most active
- **File Usage**: Most accessed documents
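
One plausible update rule for a preference's confidence score is an exponential moving average over feedback; the following is a sketch, not the engine's actual formula:

```python
def update_confidence(confidence: float, feedback: int, alpha: float = 0.2) -> float:
    """Nudge a confidence score toward observed feedback.

    feedback is -1, 0, or 1 (matching UserInteraction.user_feedback);
    alpha controls how quickly old evidence is forgotten.
    """
    target = (feedback + 1) / 2  # map -1..1 onto 0..1
    return (1 - alpha) * confidence + alpha * target

print(update_confidence(0.5, 1))  # 0.5 -> 0.6 after positive feedback
```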
#### 3. Adaptive Prompting
```python
async def generate_personalized_prompt(self, user_id: int, base_prompt: str):
    """Create a personalized system prompt from learned preferences:

    - the user's communication style
    - preferred response length
    - topic expertise areas
    - context preferences
    """
```
#### 4. Proactive Suggestions
The system generates intelligent suggestions:
- **Reminder Optimization**: Suggests optimal reminder times
- **File Organization**: Proposes file organization improvements
- **Content Discovery**: Recommends related documents
- **Workflow Improvements**: Suggests process optimizations
### Learning Data Flow
```mermaid
graph TD
A[User Interaction] --> B[Store Interaction Data]
B --> C[Analyze Patterns]
C --> D[Update Preferences]
D --> E[Generate Personalized Prompts]
E --> F[Improve Responses]
F --> G[Collect Feedback]
G --> A
```
### Learning Metrics
1. **Confidence Scores**: How certain the system is about preferences
2. **Success Rates**: Effectiveness of learned patterns
3. **Usage Counts**: Frequency of pattern application
4. **Feedback Integration**: User satisfaction incorporation
## Data Storage
### Database Schema
#### Core Tables
1. **Users**: User accounts and authentication
2. **UserFiles**: Uploaded files and metadata
3. **UserInteractions**: All user-AI interactions
4. **UserPreferences**: Learned user preferences
5. **LearningPatterns**: Detected behavioral patterns
6. **Reminders**: User reminders and notifications
#### Vector Storage
- **ChromaDB Collections**: Document embeddings with metadata
- **User-Specific Collections**: Isolated data per user
- **Embedding Cache**: Local cache for faster processing
## Security & Privacy
### Data Protection
1. **Local Storage**: All data remains on user's device
2. **Encrypted Authentication**: JWT tokens with secure hashing
3. **No External APIs**: No cloud dependencies
4. **User Data Isolation**: Multi-user support with data separation
### File Security
1. **Access Controls**: User-based file access
2. **Secure Upload**: File validation and sanitization
3. **Safe Processing**: Sandboxed file processing
4. **Cleanup**: Temporary file management
## RAG (Retrieval-Augmented Generation) System
### How It Works
1. **Document Ingestion**:
- Files are processed and chunked
- Text is converted to embeddings
- Metadata is extracted and stored
2. **Query Processing**:
- User query is embedded
- Semantic search finds relevant chunks
- Context is assembled for LLM
3. **Response Generation**:
- LLM receives query + relevant context
- Personalized prompts are applied
- Response is generated and returned
4. **Learning Loop**:
- User feedback is collected
- Patterns are analyzed
- System adapts for future queries
### Context Assembly
```python
def assemble_context(user_id, query_embedding, user_preferences, base_prompt):
    # Find relevant document chunks via semantic search
    relevant_docs = vector_store.search_similar(query_embedding)
    # Filter and re-rank the chunks according to learned preferences
    context = personalize_context(relevant_docs, user_preferences)
    # Build a system prompt tailored to this user
    system_prompt = generate_personalized_prompt(user_id, base_prompt)
    return context, system_prompt
```
## Performance Optimizations
### Embedding Cache
- Local caching of text embeddings
- Significant performance improvement for repeated content
- Automatic cache management
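
A minimal sketch of such a cache, keyed by a content hash (the real service may store vectors differently):

```python
import hashlib
import json
from pathlib import Path

CACHE_DIR = Path("data/embeddings_cache")

def cached_embedding(text: str, embed_fn) -> list[float]:
    """Return a cached embedding if present, otherwise compute and store it."""
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    cache_file = CACHE_DIR / f"{hashlib.sha256(text.encode()).hexdigest()}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text())
    vector = embed_fn(text)  # e.g. model.encode(text).tolist()
    cache_file.write_text(json.dumps(vector))
    return vector
```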
### Batch Processing
- Process multiple files simultaneously
- Batch embedding generation
- Efficient database operations
### Background Tasks
- Asynchronous file processing
- Background learning analysis
- Scheduled maintenance tasks
## Deployment Architecture
### Local Development
```bash
# Backend
cd backend && python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
uvicorn app.main:app --reload
# Frontend
cd frontend && npm install
npm run dev
# AI Services
ollama serve
ollama pull mistral
ollama pull nomic-embed-text
```
### Production Deployment
- **Containerization**: Docker support for easy deployment
- **Service Management**: Systemd service files
- **Automatic Updates**: Self-updating mechanisms
- **Backup System**: Automated data backups
## Extending the System
### Adding New File Types
1. Implement processor in `ai_core/file_processing/`
2. Add MIME type mapping
3. Update file upload validation
4. Test with sample files
### Adding New Learning Patterns
1. Extend `LearningEngine` class
2. Add new pattern types
3. Implement analysis logic
4. Update preference storage
### Custom LLM Integration
1. Implement LLM client interface
2. Add configuration options
3. Update prompt generation
4. Test with target model
## Monitoring & Analytics
### System Health
- AI service availability
- Database performance
- File processing status
- Memory and disk usage
### User Analytics
- Interaction frequency
- Learning effectiveness
- Feature usage patterns
- System performance metrics
## Future Enhancements
### Planned Features
1. **Multi-modal Support**: Image understanding and generation
2. **Voice Interface**: Speech-to-text and text-to-speech
3. **Advanced Scheduling**: Calendar integration and smart scheduling
4. **Team Features**: Shared knowledge bases (while maintaining privacy)
5. **Mobile App**: Native mobile applications
6. **Plugin System**: Extensible plugin architecture
### Research Areas
1. **Federated Learning**: Improve models without data sharing
2. **Advanced RAG**: More sophisticated retrieval strategies
3. **Multi-agent Systems**: Specialized AI agents for different tasks
4. **Continuous Learning**: Real-time model adaptation
This architecture ensures aPersona remains a powerful, private, and continuously improving personal AI assistant that truly understands and adapts to each user's unique needs and preferences.

110
frontend/index.html Normal file
View File

@ -0,0 +1,110 @@
<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<link rel="icon" type="image/svg+xml" href="/vite.svg" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta name="description" content="aPersona - Your AI-powered personal assistant that works completely offline" />
<meta name="theme-color" content="#000000" />
<!-- Preload fonts -->
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;500;600;700&family=JetBrains+Mono:wght@400;500&display=swap" rel="stylesheet">
<title>aPersona - AI Personal Assistant</title>
<style>
/* CSS custom properties for theme */
:root {
--background: 0 0% 100%;
--foreground: 222.2 84% 4.9%;
--card: 0 0% 100%;
--card-foreground: 222.2 84% 4.9%;
--popover: 0 0% 100%;
--popover-foreground: 222.2 84% 4.9%;
--primary: 221.2 83.2% 53.3%;
--primary-foreground: 210 40% 98%;
--secondary: 210 40% 96%;
--secondary-foreground: 222.2 84% 4.9%;
--muted: 210 40% 96%;
--muted-foreground: 215.4 16.3% 46.9%;
--accent: 210 40% 96%;
--accent-foreground: 222.2 84% 4.9%;
--destructive: 0 84.2% 60.2%;
--destructive-foreground: 210 40% 98%;
--border: 214.3 31.8% 91.4%;
--input: 214.3 31.8% 91.4%;
--ring: 221.2 83.2% 53.3%;
--radius: 0.5rem;
}
.dark {
--background: 222.2 84% 4.9%;
--foreground: 210 40% 98%;
--card: 222.2 84% 4.9%;
--card-foreground: 210 40% 98%;
--popover: 222.2 84% 4.9%;
--popover-foreground: 210 40% 98%;
--primary: 217.2 91.2% 59.8%;
--primary-foreground: 222.2 84% 4.9%;
--secondary: 217.2 32.6% 17.5%;
--secondary-foreground: 210 40% 98%;
--muted: 217.2 32.6% 17.5%;
--muted-foreground: 215 20.2% 65.1%;
--accent: 217.2 32.6% 17.5%;
--accent-foreground: 210 40% 98%;
--destructive: 0 62.8% 30.6%;
--destructive-foreground: 210 40% 98%;
--border: 217.2 32.6% 17.5%;
--input: 217.2 32.6% 17.5%;
--ring: 224.3 76.3% 94.1%;
}
/* Loading styles */
#loading {
position: fixed;
top: 0;
left: 0;
width: 100%;
height: 100%;
background: hsl(var(--background));
display: flex;
flex-direction: column;
align-items: center;
justify-content: center;
z-index: 9999;
}
.loading-spinner {
width: 40px;
height: 40px;
border: 3px solid hsl(var(--muted));
border-top: 3px solid hsl(var(--primary));
border-radius: 50%;
animation: spin 1s linear infinite;
}
@keyframes spin {
0% { transform: rotate(0deg); }
100% { transform: rotate(360deg); }
}
.loading-text {
margin-top: 1rem;
color: hsl(var(--muted-foreground));
font-family: 'Inter', sans-serif;
}
</style>
</head>
<body>
<div id="root">
<!-- Loading screen -->
<div id="loading">
<div class="loading-spinner"></div>
<div class="loading-text">Loading aPersona...</div>
</div>
</div>
<script type="module" src="/src/main.tsx"></script>
</body>
</html>

57
frontend/package.json Normal file
View File

@ -0,0 +1,57 @@
{
"name": "apersona-frontend",
"private": true,
"version": "1.0.0",
"type": "module",
"description": "aPersona AI Assistant Frontend",
"scripts": {
"dev": "vite",
"build": "tsc && vite build",
"lint": "eslint . --ext ts,tsx --report-unused-disable-directives --max-warnings 0",
"preview": "vite preview"
},
"dependencies": {
"react": "^18.2.0",
"react-dom": "^18.2.0",
"react-router-dom": "^6.19.0",
"axios": "^1.6.0",
"zustand": "^4.4.6",
"@tanstack/react-query": "^5.8.4",
"react-hook-form": "^7.47.0",
"@hookform/resolvers": "^3.3.2",
"zod": "^3.22.4",
"date-fns": "^2.30.0",
"lucide-react": "^0.294.0",
"react-dropzone": "^14.2.3",
"react-markdown": "^9.0.1",
"react-syntax-highlighter": "^15.5.0",
"recharts": "^2.8.0",
"sonner": "^1.2.4",
"clsx": "^2.0.0",
"tailwind-merge": "^2.0.0",
"@radix-ui/react-dialog": "^1.0.5",
"@radix-ui/react-dropdown-menu": "^2.0.6",
"@radix-ui/react-tabs": "^1.0.4",
"@radix-ui/react-toast": "^1.1.5",
"@radix-ui/react-tooltip": "^1.0.7",
"@radix-ui/react-avatar": "^1.0.4",
"@radix-ui/react-badge": "^1.0.4",
"@radix-ui/react-progress": "^1.0.3"
},
"devDependencies": {
"@types/react": "^18.2.37",
"@types/react-dom": "^18.2.15",
"@types/react-syntax-highlighter": "^15.5.10",
"@typescript-eslint/eslint-plugin": "^6.10.0",
"@typescript-eslint/parser": "^6.10.0",
"@vitejs/plugin-react": "^4.1.1",
"autoprefixer": "^10.4.16",
"eslint": "^8.53.0",
"eslint-plugin-react-hooks": "^4.6.0",
"eslint-plugin-react-refresh": "^0.4.4",
"postcss": "^8.4.31",
"tailwindcss": "^3.3.5",
"typescript": "^5.2.2",
"vite": "^4.5.0"
}
}

View File

@ -0,0 +1 @@

41
frontend/src/App.tsx Normal file
View File

@ -0,0 +1,41 @@
import { Routes, Route } from 'react-router-dom'
import { useQuery } from '@tanstack/react-query'
import Layout from './components/Layout'
import Dashboard from './pages/Dashboard'
import Chat from './pages/Chat'
import Files from './pages/Files'
import Reminders from './pages/Reminders'
import Settings from './pages/Settings'
import Login from './pages/Login'
import { useAuthStore } from './store/authStore'
import { api } from './services/api'
function App() {
const { user } = useAuthStore()
// Check system health on app load
const { data: systemInfo } = useQuery({
queryKey: ['system-info'],
queryFn: api.getSystemInfo,
refetchInterval: 30000, // Check every 30 seconds
})
// If user is not authenticated, show login
if (!user) {
return <Login />
}
return (
<Layout systemInfo={systemInfo}>
<Routes>
<Route path="/" element={<Dashboard />} />
<Route path="/chat" element={<Chat />} />
<Route path="/files" element={<Files />} />
<Route path="/reminders" element={<Reminders />} />
<Route path="/settings" element={<Settings />} />
</Routes>
</Layout>
)
}
export default App

173
frontend/src/components/Layout.tsx Normal file
View File

@ -0,0 +1,173 @@
import React, { useState } from 'react'
import { Link, useLocation } from 'react-router-dom'
import {
Home,
MessageSquare,
Files,
Bell,
Settings,
Menu,
X,
Brain,
User,
LogOut
} from 'lucide-react'
import { useAuthStore } from '../store/authStore'
import { SystemInfo } from '../services/api'
interface LayoutProps {
children: React.ReactNode
systemInfo?: SystemInfo
}
const navigation = [
{ name: 'Dashboard', href: '/', icon: Home },
{ name: 'Chat', href: '/chat', icon: MessageSquare },
{ name: 'Files', href: '/files', icon: Files },
{ name: 'Reminders', href: '/reminders', icon: Bell },
{ name: 'Settings', href: '/settings', icon: Settings },
]
export default function Layout({ children, systemInfo }: LayoutProps) {
const [sidebarOpen, setSidebarOpen] = useState(false)
const location = useLocation()
const { user, logout } = useAuthStore()
const handleLogout = () => {
logout()
window.location.reload()
}
return (
<div className="h-screen flex overflow-hidden bg-background">
{/* Sidebar */}
<div className={`${
sidebarOpen ? 'translate-x-0' : '-translate-x-full'
} fixed inset-y-0 left-0 z-50 w-64 bg-card transition-transform lg:translate-x-0 lg:static lg:inset-0`}>
<div className="flex h-full flex-col">
{/* Logo and close button */}
<div className="flex h-16 shrink-0 items-center justify-between px-4 border-b">
<div className="flex items-center gap-2">
<Brain className="w-8 h-8 text-primary" />
<span className="text-xl font-bold text-foreground">aPersona</span>
</div>
<button
type="button"
className="lg:hidden"
onClick={() => setSidebarOpen(false)}
>
<X className="h-6 w-6" />
</button>
</div>
{/* Navigation */}
<nav className="flex-1 space-y-1 px-4 py-4">
{navigation.map((item) => {
const Icon = item.icon
const isActive = location.pathname === item.href
return (
<Link
key={item.name}
to={item.href}
className={`sidebar-item ${isActive ? 'active' : ''}`}
onClick={() => setSidebarOpen(false)}
>
<Icon className="h-5 w-5" />
{item.name}
</Link>
)
})}
</nav>
{/* System status */}
{systemInfo && (
<div className="px-4 py-3 border-t">
<div className="text-xs text-muted-foreground mb-2">System Status</div>
<div className="space-y-1 text-xs">
<div className="flex justify-between">
<span>LLM:</span>
<span className="text-green-500">
{systemInfo.ai_services.current_llm_model}
</span>
</div>
<div className="flex justify-between">
<span>Docs:</span>
<span className="text-green-500">
{systemInfo.ai_services.vector_store?.total_documents || 0}
</span>
</div>
</div>
</div>
)}
{/* User info */}
<div className="px-4 py-3 border-t">
<div className="flex items-center justify-between">
<div className="flex items-center gap-2">
<div className="w-8 h-8 bg-primary rounded-full flex items-center justify-center">
<User className="w-4 h-4 text-primary-foreground" />
</div>
<div className="text-sm">
<div className="font-medium">{user?.username}</div>
<div className="text-muted-foreground text-xs">{user?.email}</div>
</div>
</div>
<button
onClick={handleLogout}
className="p-1 rounded-md hover:bg-accent"
title="Logout"
>
<LogOut className="w-4 h-4" />
</button>
</div>
</div>
</div>
</div>
{/* Main content */}
<div className="flex flex-1 flex-col overflow-hidden">
{/* Header */}
<header className="h-16 flex items-center justify-between px-4 lg:px-6 border-b bg-card">
<button
type="button"
className="lg:hidden"
onClick={() => setSidebarOpen(true)}
>
<Menu className="h-6 w-6" />
</button>
<div className="flex items-center gap-4">
{/* Status indicators */}
<div className="hidden md:flex items-center gap-2 text-xs text-muted-foreground">
<div className="flex items-center gap-1">
<div className="w-2 h-2 bg-green-500 rounded-full"></div>
<span>AI Online</span>
</div>
<div className="flex items-center gap-1">
<div className="w-2 h-2 bg-blue-500 rounded-full"></div>
<span>Local</span>
</div>
<div className="flex items-center gap-1">
<div className="w-2 h-2 bg-purple-500 rounded-full"></div>
<span>Private</span>
</div>
</div>
</div>
</header>
{/* Page content */}
<main className="flex-1 overflow-auto">
{children}
</main>
</div>
{/* Mobile sidebar overlay */}
{sidebarOpen && (
<div
className="fixed inset-0 bg-black/20 z-40 lg:hidden"
onClick={() => setSidebarOpen(false)}
/>
)}
</div>
)
}

129
frontend/src/index.css Normal file
View File

@ -0,0 +1,129 @@
@import 'tailwindcss/base';
@import 'tailwindcss/components';
@import 'tailwindcss/utilities';
@layer base {
* {
@apply border-border;
}
body {
@apply bg-background text-foreground;
font-feature-settings: 'rlig' 1, 'calt' 1;
}
}
@layer components {
/* Custom component styles */
.chat-message {
@apply p-4 rounded-lg mb-4 max-w-3xl;
}
.chat-message.user {
@apply bg-primary text-primary-foreground ml-auto;
}
.chat-message.assistant {
@apply bg-muted text-muted-foreground mr-auto;
}
.file-upload-area {
@apply border-2 border-dashed border-border rounded-lg p-8 text-center cursor-pointer hover:border-primary transition-colors;
}
.file-upload-area.dragover {
@apply border-primary bg-primary/10;
}
.sidebar-item {
@apply flex items-center gap-3 px-3 py-2 rounded-md text-sm font-medium transition-colors hover:bg-accent hover:text-accent-foreground;
}
.sidebar-item.active {
@apply bg-accent text-accent-foreground;
}
.card {
@apply rounded-lg border bg-card text-card-foreground shadow-sm;
}
.input {
@apply flex h-10 w-full rounded-md border border-input bg-background px-3 py-2 text-sm ring-offset-background file:border-0 file:bg-transparent file:text-sm file:font-medium placeholder:text-muted-foreground focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-ring focus-visible:ring-offset-2 disabled:cursor-not-allowed disabled:opacity-50;
}
.button {
@apply inline-flex items-center justify-center whitespace-nowrap rounded-md text-sm font-medium ring-offset-background transition-colors focus-visible:outline-none focus-visible:ring-2 focus-visible:ring-ring focus-visible:ring-offset-2 disabled:pointer-events-none disabled:opacity-50;
}
.button-primary {
@apply button bg-primary text-primary-foreground hover:bg-primary/90 h-10 px-4 py-2;
}
.button-secondary {
@apply button bg-secondary text-secondary-foreground hover:bg-secondary/80 h-10 px-4 py-2;
}
.button-outline {
@apply button border border-input bg-background hover:bg-accent hover:text-accent-foreground h-10 px-4 py-2;
}
.button-ghost {
@apply button hover:bg-accent hover:text-accent-foreground h-10 px-4 py-2;
}
}
/* Custom scrollbar */
.custom-scrollbar::-webkit-scrollbar {
width: 6px;
}
.custom-scrollbar::-webkit-scrollbar-track {
background: hsl(var(--muted));
}
.custom-scrollbar::-webkit-scrollbar-thumb {
background: hsl(var(--muted-foreground));
border-radius: 3px;
}
.custom-scrollbar::-webkit-scrollbar-thumb:hover {
background: hsl(var(--foreground));
}
/* Animation for typing indicator */
.typing-indicator {
display: inline-flex;
align-items: center;
gap: 2px;
}
.typing-dot {
width: 6px;
height: 6px;
border-radius: 50%;
background-color: hsl(var(--muted-foreground));
animation: typing 1.4s ease-in-out infinite;
}
.typing-dot:nth-child(1) {
animation-delay: 0ms;
}
.typing-dot:nth-child(2) {
animation-delay: 200ms;
}
.typing-dot:nth-child(3) {
animation-delay: 400ms;
}
@keyframes typing {
0%, 60%, 100% {
transform: translateY(0);
opacity: 0.4;
}
30% {
transform: translateY(-10px);
opacity: 1;
}
}

43
frontend/src/main.tsx Normal file
View File

@ -0,0 +1,43 @@
import React from 'react'
import ReactDOM from 'react-dom/client'
import { BrowserRouter } from 'react-router-dom'
import { QueryClient, QueryClientProvider } from '@tanstack/react-query'
import { Toaster } from 'sonner'
import App from './App.tsx'
import './index.css'
// Create a query client for React Query
const queryClient = new QueryClient({
defaultOptions: {
queries: {
staleTime: 1000 * 60 * 5, // 5 minutes
retry: 1,
},
},
})
// Remove loading screen after React app loads
const removeLoadingScreen = () => {
const loadingElement = document.getElementById('loading')
if (loadingElement) {
loadingElement.remove()
}
}
ReactDOM.createRoot(document.getElementById('root')!).render(
<React.StrictMode>
<QueryClientProvider client={queryClient}>
<BrowserRouter>
<App />
<Toaster
position="top-right"
richColors
closeButton
/>
</BrowserRouter>
</QueryClientProvider>
</React.StrictMode>,
)
// Remove loading screen once React has mounted
setTimeout(removeLoadingScreen, 100)

10
frontend/src/pages/Chat.tsx Normal file
View File

@ -0,0 +1,10 @@
export default function Chat() {
return (
<div className="p-6">
<h1 className="text-2xl font-bold text-foreground">Chat</h1>
<p className="text-muted-foreground">Chat with your AI assistant (coming soon)</p>
</div>
)
}

219
frontend/src/pages/Dashboard.tsx Normal file
View File

@ -0,0 +1,219 @@
import { useQuery } from '@tanstack/react-query'
import {
MessageSquare,
Files,
Bell,
Activity,
Upload,
Search,
Plus
} from 'lucide-react'
import { Link } from 'react-router-dom'
import { api } from '../services/api'
export default function Dashboard() {
const { data: usageStats } = useQuery({
queryKey: ['usage-stats'],
queryFn: api.getUsageStats,
})
const { data: filesData } = useQuery({
queryKey: ['files', 1, 5],
queryFn: () => api.getFiles(1, 5),
})
const { data: reminders } = useQuery({
queryKey: ['reminders'],
queryFn: api.getReminders,
})
const { data: suggestions } = useQuery({
queryKey: ['suggestions'],
queryFn: api.getProactiveSuggestions,
})
const stats = [
{
name: 'Total Conversations',
value: usageStats?.total_conversations || 0,
icon: MessageSquare,
color: 'text-blue-500',
},
{
name: 'Files Uploaded',
value: filesData?.total || 0,
icon: Files,
color: 'text-green-500',
},
{
name: 'Active Reminders',
value: reminders?.filter((r: any) => !r.isCompleted).length || 0,
icon: Bell,
color: 'text-yellow-500',
},
{
name: 'Learning Score',
value: Math.round((usageStats?.learning_score || 0) * 100),
icon: Activity,
color: 'text-purple-500',
},
]
const quickActions = [
{
name: 'Start Chat',
description: 'Ask me anything about your files',
href: '/chat',
icon: MessageSquare,
color: 'bg-blue-500',
},
{
name: 'Upload Files',
description: 'Add documents for analysis',
href: '/files',
icon: Upload,
color: 'bg-green-500',
},
{
name: 'Search',
description: 'Find content in your files',
href: '/files',
icon: Search,
color: 'bg-purple-500',
},
{
name: 'Add Reminder',
description: 'Create a new reminder',
href: '/reminders',
icon: Plus,
color: 'bg-orange-500',
},
]
return (
<div className="p-6 space-y-6">
{/* Header */}
<div>
<h1 className="text-2xl font-bold text-foreground">Dashboard</h1>
<p className="text-muted-foreground">
Welcome back! Here's what's happening with your AI assistant.
</p>
</div>
{/* Stats Grid */}
<div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-4 gap-6">
{stats.map((stat) => {
const Icon = stat.icon
return (
<div key={stat.name} className="card p-6">
<div className="flex items-center justify-between">
<div>
<p className="text-sm font-medium text-muted-foreground">
{stat.name}
</p>
<p className="text-3xl font-bold text-foreground">
{stat.value}
</p>
</div>
<Icon className={`h-8 w-8 ${stat.color}`} />
</div>
</div>
)
})}
</div>
{/* Quick Actions */}
<div className="card">
<div className="p-6 border-b">
<h2 className="text-lg font-semibold text-foreground">Quick Actions</h2>
<p className="text-sm text-muted-foreground">
Get started with these common tasks
</p>
</div>
<div className="p-6">
<div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-4 gap-4">
{quickActions.map((action) => {
const Icon = action.icon
return (
<Link
key={action.name}
to={action.href}
className="flex items-center space-x-3 p-4 rounded-lg border hover:bg-accent transition-colors"
>
<div className={`p-2 rounded-md ${action.color} text-white`}>
<Icon className="h-5 w-5" />
</div>
<div>
<p className="font-medium text-foreground">{action.name}</p>
<p className="text-xs text-muted-foreground">
{action.description}
</p>
</div>
</Link>
)
})}
</div>
</div>
</div>
{/* Recent Activity */}
<div className="grid grid-cols-1 lg:grid-cols-2 gap-6">
{/* Recent Files */}
<div className="card">
<div className="p-6 border-b">
<h2 className="text-lg font-semibold text-foreground">Recent Files</h2>
</div>
<div className="p-6">
{(filesData?.files?.length ?? 0) > 0 ? (
<div className="space-y-3">
{filesData.files.slice(0, 5).map((file: any) => (
<div key={file.id} className="flex items-center space-x-3">
<Files className="h-4 w-4 text-muted-foreground" />
<div className="flex-1 min-w-0">
<p className="text-sm font-medium text-foreground truncate">
{file.originalName}
</p>
<p className="text-xs text-muted-foreground">
{new Date(file.createdAt).toLocaleDateString()}
</p>
</div>
</div>
))}
</div>
) : (
<p className="text-sm text-muted-foreground">
No files uploaded yet. Start by uploading some documents.
</p>
)}
</div>
</div>
{/* AI Suggestions */}
<div className="card">
<div className="p-6 border-b">
<h2 className="text-lg font-semibold text-foreground">AI Suggestions</h2>
</div>
<div className="p-6">
{(suggestions?.length ?? 0) > 0 ? (
<div className="space-y-3">
{suggestions.slice(0, 3).map((suggestion: any, index: number) => (
<div key={index} className="p-3 bg-muted rounded-lg">
<p className="text-sm text-foreground">{suggestion.message}</p>
<p className="text-xs text-muted-foreground mt-1">
Confidence: {Math.round(suggestion.confidence * 100)}%
</p>
</div>
))}
</div>
) : (
<p className="text-sm text-muted-foreground">
I'll provide personalized suggestions as you use the system more.
</p>
)}
</div>
</div>
</div>
</div>
)
}

10
frontend/src/pages/Files.tsx Normal file
View File

@ -0,0 +1,10 @@
export default function Files() {
return (
<div className="p-6">
<h1 className="text-2xl font-bold text-foreground">Files</h1>
<p className="text-muted-foreground">Manage your uploaded files (coming soon)</p>
</div>
)
}

191
frontend/src/pages/Login.tsx Normal file
View File

@ -0,0 +1,191 @@
import React, { useState } from 'react'
import { Brain, Eye, EyeOff } from 'lucide-react'
import { useAuthStore } from '../store/authStore'
import { api } from '../services/api'
import { toast } from 'sonner'
export default function Login() {
const [isLogin, setIsLogin] = useState(true)
const [showPassword, setShowPassword] = useState(false)
const [loading, setLoading] = useState(false)
const { setUser, setToken } = useAuthStore()
const [formData, setFormData] = useState({
username: '',
email: '',
password: '',
fullName: '',
})
const handleSubmit = async (e: React.FormEvent) => {
e.preventDefault()
setLoading(true)
try {
if (isLogin) {
const response = await api.login(formData.username, formData.password)
setToken(response.access_token)
setUser(response.user)
toast.success('Welcome back!')
} else {
const response = await api.register({
username: formData.username,
email: formData.email,
password: formData.password,
fullName: formData.fullName,
})
setToken(response.access_token)
setUser(response.user)
toast.success('Account created successfully!')
}
} catch (error: any) {
toast.error(error.response?.data?.detail || 'Authentication failed')
} finally {
setLoading(false)
}
}
const handleInputChange = (e: React.ChangeEvent<HTMLInputElement>) => {
setFormData(prev => ({
...prev,
[e.target.name]: e.target.value
}))
}
return (
<div className="min-h-screen flex items-center justify-center bg-background px-4">
<div className="max-w-md w-full space-y-8">
{/* Logo and title */}
<div className="text-center">
<div className="flex justify-center">
<Brain className="w-16 h-16 text-primary" />
</div>
<h2 className="mt-6 text-3xl font-bold text-foreground">
{isLogin ? 'Welcome back' : 'Create account'}
</h2>
<p className="mt-2 text-sm text-muted-foreground">
Your AI assistant that keeps everything local and private
</p>
</div>
{/* Form */}
<form className="mt-8 space-y-6" onSubmit={handleSubmit}>
<div className="space-y-4">
{!isLogin && (
<div>
<label htmlFor="fullName" className="block text-sm font-medium text-foreground">
Full Name
</label>
<input
id="fullName"
name="fullName"
type="text"
className="input mt-1"
placeholder="Your full name"
value={formData.fullName}
onChange={handleInputChange}
/>
</div>
)}
<div>
<label htmlFor="username" className="block text-sm font-medium text-foreground">
Username
</label>
<input
id="username"
name="username"
type="text"
required
className="input mt-1"
placeholder="Your username"
value={formData.username}
onChange={handleInputChange}
/>
</div>
{!isLogin && (
<div>
<label htmlFor="email" className="block text-sm font-medium text-foreground">
Email
</label>
<input
id="email"
name="email"
type="email"
required
className="input mt-1"
placeholder="your@email.com"
value={formData.email}
onChange={handleInputChange}
/>
</div>
)}
<div>
<label htmlFor="password" className="block text-sm font-medium text-foreground">
Password
</label>
<div className="relative mt-1">
<input
id="password"
name="password"
type={showPassword ? 'text' : 'password'}
required
className="input pr-10"
placeholder="Your password"
value={formData.password}
onChange={handleInputChange}
/>
<button
type="button"
className="absolute inset-y-0 right-0 pr-3 flex items-center"
onClick={() => setShowPassword(!showPassword)}
>
{showPassword ? (
<EyeOff className="h-4 w-4 text-muted-foreground" />
) : (
<Eye className="h-4 w-4 text-muted-foreground" />
)}
</button>
</div>
</div>
</div>
<div>
<button
type="submit"
disabled={loading}
className="button-primary w-full"
>
{loading ? (
<div className="flex items-center justify-center">
<div className="w-4 h-4 border-2 border-primary-foreground border-t-transparent rounded-full animate-spin mr-2"></div>
{isLogin ? 'Signing in...' : 'Creating account...'}
</div>
) : (
isLogin ? 'Sign in' : 'Create account'
)}
</button>
</div>
<div className="text-center">
<button
type="button"
className="text-sm text-primary hover:text-primary/80"
onClick={() => setIsLogin(!isLogin)}
>
{isLogin ? "Don't have an account? Sign up" : 'Already have an account? Sign in'}
</button>
</div>
</form>
{/* Privacy notice */}
<div className="text-center text-xs text-muted-foreground">
<p>🔒 All data is stored locally on your device</p>
<p>No cloud services • Complete privacy • Full control</p>
</div>
</div>
</div>
)
}

10
frontend/src/pages/Reminders.tsx Normal file
View File

@ -0,0 +1,10 @@
export default function Reminders() {
return (
<div className="p-6">
<h1 className="text-2xl font-bold text-foreground">Reminders</h1>
<p className="text-muted-foreground">Manage your reminders and notifications (coming soon)</p>
</div>
)
}

10
frontend/src/pages/Settings.tsx Normal file
View File

@ -0,0 +1,10 @@
export default function Settings() {
return (
<div className="p-6">
<h1 className="text-2xl font-bold text-foreground">Settings</h1>
<p className="text-muted-foreground">Configure your AI assistant settings (coming soon)</p>
</div>
)
}

235
frontend/src/services/api.ts Normal file
View File

@ -0,0 +1,235 @@
import axios from 'axios'
const BASE_URL = import.meta.env.VITE_API_URL || 'http://localhost:8000'
// Create axios instance
const apiClient = axios.create({
baseURL: BASE_URL,
timeout: 30000,
})
// Add auth token to requests
apiClient.interceptors.request.use((config) => {
const token = localStorage.getItem('auth-storage')
if (token) {
try {
const parsed = JSON.parse(token)
if (parsed.state?.token) {
config.headers.Authorization = `Bearer ${parsed.state.token}`
}
} catch (error) {
console.error('Failed to parse auth token:', error)
}
}
return config
})
// Handle auth errors
apiClient.interceptors.response.use(
(response) => response,
(error) => {
if (error.response?.status === 401) {
// Clear auth on 401
localStorage.removeItem('auth-storage')
window.location.reload()
}
return Promise.reject(error)
}
)
// Types
export interface SystemInfo {
app_info: {
name: string
version: string
debug: boolean
}
ai_services: {
embedding_model: any
vector_store: any
available_llm_models: string[]
current_llm_model: string
}
capabilities: {
file_processing: string[]
ai_features: string[]
}
}
export interface ChatMessage {
id: string
role: 'user' | 'assistant'
content: string
timestamp: string
metadata?: any
}
export interface FileInfo {
id: number
filename: string
originalName: string
fileType: string
fileSize: number
contentSummary?: string
categories: string[]
tags: string[]
createdAt: string
lastAccessed?: string
}
export interface Reminder {
id: number
title: string
description?: string
reminderTime: string
isCompleted: boolean
isRecurring: boolean
priority: number
autoGenerated: boolean
}
export interface SearchResult {
id: string
content: string
metadata: any
similarity: number
source: string
}
// API methods
export const api = {
// System endpoints
async getHealth() {
const response = await apiClient.get('/health')
return response.data
},
async getSystemInfo(): Promise<SystemInfo> {
const response = await apiClient.get('/api/v1/system/info')
return response.data
},
// Auth endpoints
async login(username: string, password: string) {
const response = await apiClient.post('/api/v1/auth/login', {
username,
password,
})
return response.data
},
async register(userData: {
username: string
email: string
password: string
fullName?: string
}) {
const response = await apiClient.post('/api/v1/auth/register', userData)
return response.data
},
async logout() {
await apiClient.post('/api/v1/auth/logout')
},
// Chat endpoints
async sendMessage(message: string): Promise<{ response: string; used_files?: string[] }> {
const response = await apiClient.post('/api/v1/chat/message', {
message,
})
return response.data
},
async getChatHistory(): Promise<ChatMessage[]> {
const response = await apiClient.get('/api/v1/chat/history')
return response.data
},
async provideFeedback(messageId: string, feedback: number) {
await apiClient.post(`/api/v1/chat/feedback/${messageId}`, {
feedback,
})
},
// File endpoints
async uploadFile(file: File, tags?: string[]) {
const formData = new FormData()
formData.append('file', file)
if (tags) {
formData.append('tags', JSON.stringify(tags))
}
const response = await apiClient.post('/api/v1/files/upload', formData, {
headers: {
'Content-Type': 'multipart/form-data',
},
})
return response.data
},
async getFiles(page = 1, limit = 20, category?: string): Promise<{ files: FileInfo[]; total: number }> {
const params = new URLSearchParams({
page: page.toString(),
limit: limit.toString()
})
if (category) params.append('category', category)
const response = await apiClient.get(`/api/v1/files?${params}`)
return response.data
},
async deleteFile(fileId: number) {
await apiClient.delete(`/api/v1/files/${fileId}`)
},
async updateFileTags(fileId: number, tags: string[]) {
const response = await apiClient.patch(`/api/v1/files/${fileId}/tags`, {
tags,
})
return response.data
},
// Search endpoints
async searchFiles(query: string, limit = 10): Promise<SearchResult[]> {
const response = await apiClient.get('/api/v1/search', {
params: { query, limit },
})
return response.data
},
// Reminders endpoints
async getReminders(): Promise<Reminder[]> {
const response = await apiClient.get('/api/v1/reminders')
return response.data
},
async createReminder(reminder: Omit<Reminder, 'id' | 'autoGenerated'>) {
const response = await apiClient.post('/api/v1/reminders', reminder)
return response.data
},
async updateReminder(reminderId: number, updates: Partial<Reminder>) {
const response = await apiClient.patch(`/api/v1/reminders/${reminderId}`, updates)
return response.data
},
async deleteReminder(reminderId: number) {
await apiClient.delete(`/api/v1/reminders/${reminderId}`)
},
async getProactiveSuggestions() {
const response = await apiClient.get('/api/v1/suggestions')
return response.data
},
// Analytics endpoints
async getUsageStats() {
const response = await apiClient.get('/api/v1/analytics/usage')
return response.data
},
async getInteractionHistory() {
const response = await apiClient.get('/api/v1/analytics/interactions')
return response.data
},
}

42
frontend/src/store/authStore.ts Normal file
View File

@ -0,0 +1,42 @@
import { create } from 'zustand'
import { persist } from 'zustand/middleware'
interface User {
id: number
username: string
email: string
fullName?: string
createdAt: string
}
interface AuthState {
user: User | null
token: string | null
setUser: (user: User) => void
setToken: (token: string) => void
logout: () => void
isAuthenticated: () => boolean
}
export const useAuthStore = create<AuthState>()(
persist(
(set, get) => ({
user: null,
token: null,
setUser: (user: User) => set({ user }),
setToken: (token: string) => set({ token }),
logout: () => set({ user: null, token: null }),
isAuthenticated: () => {
const { user, token } = get()
return !!(user && token)
},
}),
{
name: 'auth-storage',
partialize: (state) => ({
user: state.user,
token: state.token
}),
}
)
)

75
frontend/tailwind.config.js Normal file
View File

@ -0,0 +1,75 @@
/** @type {import('tailwindcss').Config} */
export default {
content: [
"./index.html",
"./src/**/*.{js,ts,jsx,tsx}",
],
theme: {
extend: {
colors: {
border: "hsl(var(--border))",
input: "hsl(var(--input))",
ring: "hsl(var(--ring))",
background: "hsl(var(--background))",
foreground: "hsl(var(--foreground))",
primary: {
DEFAULT: "hsl(var(--primary))",
foreground: "hsl(var(--primary-foreground))",
},
secondary: {
DEFAULT: "hsl(var(--secondary))",
foreground: "hsl(var(--secondary-foreground))",
},
destructive: {
DEFAULT: "hsl(var(--destructive))",
foreground: "hsl(var(--destructive-foreground))",
},
muted: {
DEFAULT: "hsl(var(--muted))",
foreground: "hsl(var(--muted-foreground))",
},
accent: {
DEFAULT: "hsl(var(--accent))",
foreground: "hsl(var(--accent-foreground))",
},
popover: {
DEFAULT: "hsl(var(--popover))",
foreground: "hsl(var(--popover-foreground))",
},
card: {
DEFAULT: "hsl(var(--card))",
foreground: "hsl(var(--card-foreground))",
},
},
borderRadius: {
lg: "var(--radius)",
md: "calc(var(--radius) - 2px)",
sm: "calc(var(--radius) - 4px)",
},
fontFamily: {
sans: ["Inter", "system-ui", "sans-serif"],
mono: ["JetBrains Mono", "monospace"],
},
animation: {
"fade-in": "fadeIn 0.5s ease-in-out",
"slide-up": "slideUp 0.3s ease-out",
"pulse-subtle": "pulseSubtle 2s infinite",
},
keyframes: {
fadeIn: {
"0%": { opacity: "0" },
"100%": { opacity: "1" },
},
slideUp: {
"0%": { transform: "translateY(10px)", opacity: "0" },
"100%": { transform: "translateY(0)", opacity: "1" },
},
pulseSubtle: {
"0%, 100%": { opacity: "1" },
"50%": { opacity: "0.8" },
},
},
},
},
plugins: [],
}

36
frontend/tsconfig.json Normal file
View File

@ -0,0 +1,36 @@
{
"compilerOptions": {
"target": "ES2020",
"useDefineForClassFields": true,
"lib": ["ES2020", "DOM", "DOM.Iterable"],
"module": "ESNext",
"skipLibCheck": true,
/* Bundler mode */
"moduleResolution": "bundler",
"allowImportingTsExtensions": true,
"resolveJsonModule": true,
"isolatedModules": true,
"noEmit": true,
"jsx": "react-jsx",
/* Linting */
"strict": true,
"noUnusedLocals": true,
"noUnusedParameters": true,
"noFallthroughCasesInSwitch": true,
/* Path mapping */
"baseUrl": ".",
"paths": {
"@/*": ["./src/*"]
},
/* Additional type checking */
"allowSyntheticDefaultImports": true,
"esModuleInterop": true,
"forceConsistentCasingInFileNames": true
},
"include": ["src"],
"references": [{ "path": "./tsconfig.node.json" }]
}

10
frontend/tsconfig.node.json Normal file
View File

@ -0,0 +1,10 @@
{
"compilerOptions": {
"composite": true,
"skipLibCheck": true,
"module": "ESNext",
"moduleResolution": "bundler",
"allowSyntheticDefaultImports": true
},
"include": ["vite.config.ts"]
}

26
frontend/vite.config.ts Normal file
View File

@ -0,0 +1,26 @@
import { defineConfig } from 'vite'
import react from '@vitejs/plugin-react'
import path from 'path'
// https://vitejs.dev/config/
export default defineConfig({
plugins: [react()],
resolve: {
alias: {
'@': path.resolve(__dirname, './src'),
},
},
server: {
port: 3000,
proxy: {
'/api': {
target: 'http://localhost:8000',
changeOrigin: true,
},
},
},
build: {
outDir: 'dist',
sourcemap: true,
},
})

225
setup.sh Executable file
View File

@ -0,0 +1,225 @@
#!/bin/bash
# aPersona Setup Script
# This script helps you set up the aPersona AI assistant locally
set -e
echo "🤖 Welcome to aPersona Setup!"
echo "=========================================="
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Function to print colored output
print_status() {
echo -e "${BLUE}[INFO]${NC} $1"
}
print_success() {
echo -e "${GREEN}[SUCCESS]${NC} $1"
}
print_warning() {
echo -e "${YELLOW}[WARNING]${NC} $1"
}
print_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
# Check if Python 3.11+ is installed
check_python() {
print_status "Checking Python installation..."
if command -v python3 &> /dev/null; then
python_version=$(python3 --version | cut -d' ' -f2)
major_version=$(echo $python_version | cut -d'.' -f1)
minor_version=$(echo $python_version | cut -d'.' -f2)
if [ "$major_version" -eq 3 ] && [ "$minor_version" -ge 11 ]; then
print_success "Python $python_version found"
else
print_error "Python 3.11+ required. Found Python $python_version"
exit 1
fi
else
print_error "Python 3 not found. Please install Python 3.11+"
exit 1
fi
}
# Check if Node.js 18+ is installed
check_node() {
print_status "Checking Node.js installation..."
if command -v node &> /dev/null; then
node_version=$(node --version | cut -d'v' -f2)
major_version=$(echo $node_version | cut -d'.' -f1)
if [ "$major_version" -ge 18 ]; then
print_success "Node.js $node_version found"
else
print_error "Node.js 18+ required. Found Node.js $node_version"
exit 1
fi
else
print_error "Node.js not found. Please install Node.js 18+"
exit 1
fi
}
# Check if Ollama is installed
check_ollama() {
print_status "Checking Ollama installation..."
if command -v ollama &> /dev/null; then
print_success "Ollama found"
# Check if Ollama service is running
if curl -s http://localhost:11434/api/tags > /dev/null 2>&1; then
print_success "Ollama service is running"
else
print_warning "Ollama service is not running. Please start it with: ollama serve"
fi
else
print_warning "Ollama not found. Installing Ollama..."
curl -fsSL https://ollama.ai/install.sh | sh
print_success "Ollama installed. Please start it with: ollama serve"
fi
}
# Setup Python backend
setup_backend() {
print_status "Setting up Python backend..."
cd backend
# Create virtual environment if it doesn't exist
if [ ! -d "venv" ]; then
print_status "Creating Python virtual environment..."
python3 -m venv venv
print_success "Virtual environment created"
fi
# Activate virtual environment
source venv/bin/activate
# Install requirements
print_status "Installing Python dependencies..."
pip install --upgrade pip
pip install -r requirements.txt
print_success "Backend dependencies installed"
cd ..
}
# Setup React frontend
setup_frontend() {
print_status "Setting up React frontend..."
cd frontend
# Install npm dependencies
print_status "Installing Node.js dependencies..."
npm install
print_success "Frontend dependencies installed"
cd ..
}
# Create necessary directories
create_directories() {
print_status "Creating data directories..."
mkdir -p data/uploads
mkdir -p data/processed
mkdir -p data/vectors
mkdir -p data/embeddings_cache
print_success "Data directories created"
}
# Install Ollama models
install_models() {
print_status "Installing AI models..."
if command -v ollama &> /dev/null; then
print_status "Downloading Mistral model (this may take a while)..."
ollama pull mistral
print_status "Downloading embedding model..."
ollama pull nomic-embed-text
print_success "AI models installed"
else
print_warning "Ollama not available. Please install models manually after setting up Ollama"
fi
}
# Create environment file
create_env() {
print_status "Creating environment configuration..."
if [ ! -f "backend/.env" ]; then
cat > backend/.env << EOF
# aPersona Environment Configuration
# Security
SECRET_KEY=your-secret-key-change-in-production-$(openssl rand -hex 32)
# Database
DATABASE_URL=sqlite:///./apersona.db
# AI Services
OLLAMA_BASE_URL=http://localhost:11434
DEFAULT_LLM_MODEL=mistral
EMBEDDING_MODEL=all-MiniLM-L6-v2
# Development
DEBUG=true
EOF
print_success "Environment file created"
else
print_warning "Environment file already exists"
fi
}
# Main setup function
main() {
echo "Starting aPersona setup process..."
echo ""
# System checks
check_python
check_node
check_ollama
echo ""
# Setup components
create_directories
create_env
setup_backend
setup_frontend
install_models
echo ""
echo "=========================================="
print_success "aPersona setup completed successfully!"
echo ""
echo "📋 Next steps:"
echo " 1. Start Ollama service: ollama serve"
echo " 2. Start the backend: cd backend && source venv/bin/activate && uvicorn app.main:app --reload"
echo " 3. Start the frontend: cd frontend && npm run dev"
echo " 4. Open http://localhost:3000 in your browser"
echo ""
echo "💡 For more information, check the README.md file"
echo "🔒 Your data stays completely local and private!"
}
# Run main function
main