Bike4Mind Architecture Overview
Introduction
Bike4Mind is built on a modern, serverless-first AWS architecture designed for scalability, reliability, and AI-powered functionality. This document provides a comprehensive overview of the current system architecture and its components.
High-Level Architecture
Core Technology Stack
- Infrastructure: AWS Serverless (Lambda, API Gateway, S3, SQS, EventBridge)
- Framework: SST (Serverless Stack) for Infrastructure as Code
- Frontend: Next.js with API Routes pattern
- Runtime: Node.js 20.x
- Database: MongoDB (managed)
- Real-time: WebSocket API with EventBridge
- AI/ML: Multi-LLM support (OpenAI, Anthropic, Bedrock, Gemini)
Architecture Principles
- Serverless-First: Leverages AWS Lambda for compute with automatic scaling
- Event-Driven: Uses queues and events for asynchronous processing
- Microservices: Modular design with clear separation of concerns
- Repository Pattern: Business logic abstracted through repositories
- Queue-Based AI: Long-running AI tasks processed asynchronously
- Real-time Updates: WebSocket connections for live user feedback
Infrastructure Components
AWS Serverless Foundation
```mermaid
graph TB
    User[User] --> CF[CloudFront]
    CF --> APIG[API Gateway]
    CF --> S3[S3 Static Assets]
    APIG --> Lambda[Lambda Functions]
    Lambda --> Mongo[(MongoDB)]
    Lambda --> S3Buckets[S3 Buckets]
    Lambda --> SQS[SQS Queues]
    Lambda --> WS[WebSocket API]
    SQS --> Workers[Worker Lambdas]
    Workers --> Bedrock[AWS Bedrock]
    Workers --> OpenAI[OpenAI API]
```
Domain & DNS Management
- Custom Domains: Configurable domain management with Route53
- SSL/TLS: Automatic certificate management via ACM
- CDN: CloudFront distribution with custom cache policies
- HSTS: Security headers policy for enhanced security
VPC Configuration
- Flexible VPC: Supports both existing and new VPC configurations
- Developer VPC: Shared development environment
- Production Isolation: Separate VPC for production workloads
- NAT Gateway: Single NAT gateway for cost optimization
Data Architecture
Primary Database (MongoDB)
The system uses MongoDB as the primary database with a rich data model:
Core Models
- UserModel: User accounts, preferences, and authentication
- SessionModel: Conversation sessions with AI agents
- QuestModel: AI-powered task execution tracking
- AgentModel: AI agent configurations and capabilities
- OrganizationModel: Multi-tenant organization support
Content Models
- FabFileModel: User-uploaded files with processing metadata
- MementoModel: Knowledge base entries and memories
- ProjectModel: User projects and collaboration
- ReportModel: Generated reports and analytics
System Models
- ApiKeyModel: API key management for external services
- CreditTransactionModel: Usage tracking and billing
- McpServerModel: Model Context Protocol server configurations
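To make the relationships between these models concrete, here is a minimal TypeScript sketch of how two of them might be typed. The field names beyond the model names are illustrative assumptions, not the project's actual schema.

```typescript
// Hypothetical shapes for two core models; fields are assumptions.
interface SessionModel {
  id: string;
  userId: string;
  agentId?: string;
  createdAt: Date;
}

interface QuestModel {
  id: string;
  sessionId: string; // a quest belongs to a conversation session
  status: "queued" | "running" | "completed" | "failed";
  progress: number; // 0..1
}

// Example: a quest linked to its parent session
const session: SessionModel = { id: "s1", userId: "u1", createdAt: new Date() };
const quest: QuestModel = {
  id: "q1",
  sessionId: session.id,
  status: "queued",
  progress: 0,
};
```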
File Storage (S3 Buckets)
Production Buckets
- fabFilesBucket: Core user files with automated processing pipeline
- generatedImagesBucket: AI-generated images with CORS for web access
- appFilesBucket: Application files and user uploads
Processing Buckets
- historyImportBucket: Temporary storage for data imports (auto-cleanup)
Bucket Features
- Versioning: Configurable versioning for data protection
- Lifecycle Policies: Automatic cleanup and archival
- Event Triggers: S3 events trigger processing pipelines
- CORS Configuration: Web-friendly access patterns
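The bucket features above can be pictured as a single configuration object. This is a hedged sketch of what such a configuration might look like; the property names are assumptions and do not reflect the project's actual SST definitions.

```typescript
// Illustrative bucket configuration combining the features listed above
// (versioning, lifecycle cleanup, event triggers, CORS). Names are assumed.
const historyImportBucketConfig = {
  versioned: false, // temporary import data does not need version history
  cors: [{ allowedMethods: ["GET", "PUT"], allowedOrigins: ["*"] }],
  lifecycleRules: [{ prefix: "imports/", expirationDays: 7 }], // auto-cleanup
  eventTriggers: [{ event: "s3:ObjectCreated:*", handler: "processImport" }],
};
```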
AI & Processing Pipeline
Multi-LLM Support
The system supports multiple AI providers through a unified interface:
- OpenAI GPT-4: Primary reasoning and conversation
- Anthropic Claude: Large context analysis and safety
- Amazon Bedrock: AWS-native AI services
- Google Gemini: Alternative reasoning capabilities
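The "unified interface" idea can be sketched as a provider registry: callers select a provider by key and never touch vendor-specific APIs. This is a minimal illustration, not the actual ChatCompletionService API.

```typescript
// Minimal sketch of a unified multi-LLM interface; all names are illustrative.
interface LlmProvider {
  name: string;
  complete(prompt: string): string;
}

const providers: Record<string, LlmProvider> = {
  openai: { name: "openai", complete: p => `[openai] ${p}` },
  anthropic: { name: "anthropic", complete: p => `[anthropic] ${p}` },
};

// Callers pick a provider by key; swapping vendors requires no caller changes.
function complete(provider: string, prompt: string): string {
  const impl = providers[provider];
  if (!impl) throw new Error(`unknown provider: ${provider}`);
  return impl.complete(prompt);
}

const reply = complete("anthropic", "hello");
// reply → "[anthropic] hello"
```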
Queue-Based Processing
```mermaid
graph LR
    Upload[File Upload] --> Chunk[Chunk Queue]
    Chunk --> Vector[Vector Queue]
    Vector --> Search[Searchable Content]
    Quest[Quest Request] --> QuestQ[Quest Queue]
    QuestQ --> AI[AI Processing]
    AI --> Result[Quest Result]
    Image[Image Request] --> ImageQ[Image Queue]
    ImageQ --> Bedrock[AWS Bedrock]
    Bedrock --> Generated[Generated Image]
```
Current AI Queues
| Queue | Timeout | Purpose | DLQ |
|---|---|---|---|
| questStartQueue | 10 min | AI quest execution | ✓ |
| imageGenerationQueue | 10 min | Image generation | ✓ |
| imageEditQueue | 10 min | Image editing | ✓ |
| notebookSummarizationQueue | 2 min | Content summarization | - |
| notebookTaggingQueue | 2 min | Content tagging | - |
| fabFileChunkQueue | 13 min | File chunking | ✓ |
| fabFileVectQueue | 5 min | Vectorization | ✓ |
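Workers for queues with a DLQ typically report per-message failures so that SQS redrives only the failed records (and eventually routes them to the dead-letter queue). A sketch of that pattern, with types mirroring the AWS SQS partial-batch-response shape and purely illustrative handler logic:

```typescript
// Sketch of an SQS worker Lambda with partial batch failure reporting.
// Only failed messageIds are returned, so successful records are not redriven.
interface SQSRecord { messageId: string; body: string; }
interface SQSBatchResponse { batchItemFailures: { itemIdentifier: string }[]; }

function handleBatch(
  records: SQSRecord[],
  work: (body: string) => void
): SQSBatchResponse {
  const batchItemFailures: { itemIdentifier: string }[] = [];
  for (const record of records) {
    try {
      work(record.body); // e.g. run chunking or vectorization
    } catch {
      batchItemFailures.push({ itemIdentifier: record.messageId });
    }
  }
  return { batchItemFailures };
}

const result = handleBatch(
  [{ messageId: "m1", body: "ok" }, { messageId: "m2", body: "boom" }],
  body => { if (body === "boom") throw new Error("processing failed"); }
);
// result.batchItemFailures → [{ itemIdentifier: "m2" }]
```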
Agent Architecture Foundation
Model Context Protocol (MCP)
- MCP Integration: Standardized tool calling via MCP protocol
- Lambda-based Tools: Tools implemented as Lambda functions
- Dynamic Tool Discovery: Runtime tool registration and execution
- Environment Management: Secure environment variable handling
Quest System
- Structured Execution: AI tasks organized as quests with master plans
- Progress Tracking: Real-time quest status and progress updates
- Error Handling: Comprehensive error recovery and retry logic
- User Feedback: Integration points for human-in-the-loop workflows
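The combination of structured execution and real-time progress tracking can be sketched as a quest object that notifies listeners (e.g. WebSocket subscribers) on every status change. All names here are illustrative assumptions, not the actual QuestModel API.

```typescript
// Sketch of quest status transitions with progress updates pushed to listeners.
type QuestStatus = "queued" | "running" | "completed" | "failed";

class Quest {
  status: QuestStatus = "queued";
  progress = 0;
  private listeners: ((q: Quest) => void)[] = [];

  // e.g. a WebSocket broadcaster registers here
  onUpdate(fn: (q: Quest) => void) { this.listeners.push(fn); }

  update(status: QuestStatus, progress: number) {
    this.status = status;
    this.progress = progress;
    this.listeners.forEach(fn => fn(this));
  }
}

const updates: number[] = [];
const quest = new Quest();
quest.onUpdate(q => updates.push(q.progress));
quest.update("running", 0.5);
quest.update("completed", 1);
// updates → [0.5, 1]
```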
API Architecture
Next.js API Routes
The API follows a file-based routing pattern in packages/client/pages/api/:

```
/api/
├── ai/                  # AI services
│   ├── transcribe.ts       # Audio transcription
│   ├── generate-image.ts   # Image generation
│   └── infer.ts            # LLM inference
├── users/               # User management
├── organizations/       # Organization CRUD
├── api-keys/            # API key management
├── subscriptions/       # Billing & subscriptions
└── [type]/[id]/         # Generic resource endpoints
```
Middleware Stack
- baseApi(): Express-like middleware with authentication and CORS
- asyncHandler: Comprehensive error handling and logging
- Authentication: JWT-based with session management
- Authorization: CASL-based permissions with user scoping
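The baseApi()/asyncHandler idea is essentially middleware composition: wrappers around a handler apply authentication and error handling uniformly. This is a minimal sketch under assumed names, not the project's actual middleware implementation.

```typescript
// Sketch of composing auth and error-handling middleware around a handler.
type Handler = (req: { user?: string; token?: string }) => { status: number; body: string };

// Reject unauthenticated requests; attach the decoded user otherwise.
const withAuth = (next: Handler): Handler => req => {
  if (!req.token) return { status: 401, body: "unauthorized" };
  return next({ ...req, user: "user-from-jwt" }); // JWT decoding elided
};

// Convert thrown errors into a 500 response instead of crashing.
const withErrors = (next: Handler): Handler => req => {
  try { return next(req); }
  catch (e) { return { status: 500, body: String(e) }; }
};

const handler: Handler = req => ({ status: 200, body: `hello ${req.user}` });
const api = withErrors(withAuth(handler));

const ok = api({ token: "jwt" }); // → { status: 200, body: "hello user-from-jwt" }
const denied = api({});           // → { status: 401, body: "unauthorized" }
```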
External API Integrations
- Authentication: Google OAuth, GitHub OAuth, Okta
- Payments: Stripe with webhook handling
- Communication: AWS SES, Slack webhooks, Twilio SMS
- AI Services: OpenAI, Anthropic, Google AI APIs
Real-Time & Communication
WebSocket API
```mermaid
graph TB
    Client[Client] --> WS[WebSocket API]
    WS --> Connect[Connect Handler]
    WS --> Disconnect[Disconnect Handler]
    WS --> Heartbeat[Heartbeat Handler]
    WS --> Subscribe[Subscribe Handler]
    WS --> Unsubscribe[Unsubscribe Handler]
    Subscribe --> Mongo[(MongoDB)]
    Subscribe --> Fanout[Subscriber Fanout Service]
    Fanout --> Client
```
WebSocket Handlers
- Connection Management: $connect, $disconnect, heartbeat
- Data Subscriptions: Real-time query subscriptions
- Quest Updates: Live progress updates for AI tasks
- Error Notifications: Real-time error reporting
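A WebSocket API of this shape usually dispatches on the message's route key. The following sketch shows that dispatch; the route keys match the handlers above, but the payload shapes and return values are illustrative assumptions.

```typescript
// Sketch of routing WebSocket messages to per-route handlers.
type WsHandler = (connectionId: string, payload?: unknown) => string;

const routes: Record<string, WsHandler> = {
  $connect: id => `connected:${id}`,
  $disconnect: id => `disconnected:${id}`,
  heartbeat: id => `pong:${id}`,
  subscribe: (id, payload) =>
    `subscribed:${id}:${(payload as { query: string }).query}`,
};

function dispatch(routeKey: string, connectionId: string, payload?: unknown): string {
  const handler = routes[routeKey];
  if (!handler) throw new Error(`no handler for ${routeKey}`);
  return handler(connectionId, payload);
}

const ack = dispatch("subscribe", "c1", { query: "quests" });
// ack → "subscribed:c1:quests"
```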
Subscriber Fanout Service
- ECS Fargate Service: Dedicated service for database change streaming
- MongoDB Change Streams: Real-time database change detection
- WebSocket Broadcasting: Efficient message distribution to connected clients
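The fanout logic reduces to a mapping from watched collections to subscribed connections. In the real service the change source is a MongoDB change stream running on Fargate; this sketch simulates that source with a direct call and uses illustrative names throughout.

```typescript
// Sketch of subscriber fanout: broadcast each database change to every
// connection subscribed to that collection.
const subscriptions = new Map<string, Set<string>>(); // collection -> connection ids

function subscribe(collection: string, connectionId: string) {
  if (!subscriptions.has(collection)) subscriptions.set(collection, new Set());
  subscriptions.get(collection)!.add(connectionId);
}

function fanout(
  collection: string,
  change: object,
  send: (conn: string, msg: object) => void
) {
  for (const conn of subscriptions.get(collection) ?? []) send(conn, change);
}

subscribe("quests", "connA");
subscribe("quests", "connB");
subscribe("users", "connC");

const delivered: string[] = [];
fanout("quests", { op: "update" }, conn => delivered.push(conn));
// delivered → ["connA", "connB"]
```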
Event-Driven Architecture
EventBus Integration
- Central Event Routing: AWS EventBridge for application events
- Event Handlers: Modular event processing
- Retry Logic: Built-in retry and dead letter queue handling
Event Types
- Stripe Events: Payment processing and subscription updates
- Email Events: Automated email sending and notifications
- Dice Roll Events: Gaming and engagement features
- System Events: Monitoring and alerting
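Modular event processing of this kind can be sketched as a handler registry keyed by source and detail type, with unmatched events falling through to a dead-letter path. The envelope fields follow the EventBridge convention; the handler names and return values are illustrative.

```typescript
// Sketch of routing EventBridge-style events to modular handlers.
interface AppEvent { source: string; detailType: string; detail: unknown; }

const handlers = new Map<string, (e: AppEvent) => string>();
handlers.set("stripe.payment_succeeded", () => "credit user account");
handlers.set("email.send", () => "queue outbound email");

function route(event: AppEvent): string {
  const handler = handlers.get(`${event.source}.${event.detailType}`);
  return handler ? handler(event) : "dead-letter"; // unmatched events go to DLQ
}

const action = route({ source: "stripe", detailType: "payment_succeeded", detail: {} });
// action → "credit user account"
```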
Business Logic Architecture
Repository Pattern
Business logic is organized using the repository pattern in b4m-core/packages/core/:

```
b4m-core/packages/core/
├── database/   # Data models and repositories
├── services/   # Business logic services
├── common/     # Shared types and interfaces
└── mcp/        # Model Context Protocol implementation
```
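The repository pattern separates business logic from storage: services depend on an interface, so the MongoDB implementation can be swapped for an in-memory one in tests. A minimal sketch under assumed names (this is not the actual b4m-core API):

```typescript
// Repository-pattern sketch: storage behind an interface, logic in a service.
interface User { id: string; email: string; }

interface UserRepository {
  findById(id: string): User | undefined;
  save(user: User): void;
}

// Test double; production would use a MongoDB-backed implementation.
class InMemoryUserRepository implements UserRepository {
  private users = new Map<string, User>();
  findById(id: string) { return this.users.get(id); }
  save(user: User) { this.users.set(user.id, user); }
}

// The service holds business logic and is storage-agnostic.
class UserService {
  constructor(private repo: UserRepository) {}
  register(id: string, email: string): User {
    const user = { id, email };
    this.repo.save(user);
    return user;
  }
}

const repo = new InMemoryUserRepository();
const service = new UserService(repo);
const created = service.register("u1", "a@example.com");
```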
Service Layer
Core Services
- ChatCompletionService: AI interaction orchestration
- OrganizationService: Multi-tenant organization management
- UserService: User lifecycle and preferences
- SessionService: Conversation session management
Specialized Services
- FabFileService: File processing and management
- AdminSettingsService: System configuration
- ProjectService: Project collaboration features
- AuthService: Authentication and authorization
Permission System
- CASL Integration: Attribute-based access control
- User Scoping: Automatic data filtering based on permissions
- Organization Isolation: Multi-tenant data separation
- Role-Based Access: Flexible role and permission management
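The combination of user scoping and organization isolation can be illustrated as CASL-style abilities: each ability pairs an action with a condition, and the same conditions drive both permission checks and query filtering. This is a simplified sketch, not CASL itself.

```typescript
// Sketch of attribute-based access control with org isolation and owner scoping.
interface Doc { id: string; ownerId: string; orgId: string; }

type Condition = (doc: Doc) => boolean;
interface Ability { action: "read" | "update"; condition: Condition; }

function abilitiesFor(userId: string, orgId: string): Ability[] {
  return [
    { action: "read", condition: d => d.orgId === orgId },      // org isolation
    { action: "update", condition: d => d.ownerId === userId }, // owner-only writes
  ];
}

function can(abilities: Ability[], action: Ability["action"], doc: Doc): boolean {
  return abilities.some(a => a.action === action && a.condition(doc));
}

const abilities = abilitiesFor("u1", "org1");
const mine = { id: "d1", ownerId: "u1", orgId: "org1" };
const theirs = { id: "d2", ownerId: "u2", orgId: "org2" };

const checks = [
  can(abilities, "read", mine),    // true: same org
  can(abilities, "read", theirs),  // false: different org
  can(abilities, "update", mine),  // true: owner
];
```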
Monitoring & Observability
Logging Infrastructure
```mermaid
graph LR
    Lambda[Lambda Functions] --> CW[CloudWatch Logs]
    CW --> Filter[Log Filters]
    Filter --> Slack[Slack Notifications]
    Filter --> Analysis[Log Analysis]
    CW --> Metrics[CloudWatch Metrics]
    Metrics --> Alarms[CloudWatch Alarms]
    Alarms --> SNS[SNS Notifications]
```
Log Management
- CloudWatch Integration: Centralized logging with retention policies
- Error Filtering: Automatic error detection and alerting
- Slack Integration: Real-time error notifications
- Log Analysis: Structured logging for debugging and analytics
Analytics & Reporting
- Event Tracking: User activity and system usage analytics
- Cron Jobs: Automated daily and weekly reporting
- Performance Metrics: System performance and health monitoring
Security & Compliance
Authentication & Authorization
- Multi-Provider OAuth: Google, GitHub, Okta integration
- JWT Tokens: Secure session management
- API Key Management: Secure external API access
- Permission Scoping: Fine-grained access control
Data Protection
- Encryption: Data encryption at rest and in transit
- Secret Management: AWS Secrets Manager integration
- Audit Logging: Comprehensive audit trail
- Backup Strategy: Automated backup and recovery procedures
Deployment & DevOps
Infrastructure as Code
- SST Framework: Declarative infrastructure management
- Environment Management: Separate staging and production environments
- Secret Rotation: Automated secret rotation and monitoring
- Database Migrations: Automated schema updates
CI/CD Pipeline
- Seed Integration: Automated deployment pipeline
- Environment Promotion: Staged deployment process
- Rollback Capability: Quick rollback for failed deployments
- Health Checks: Automated deployment verification
Future Architecture Considerations
Agent-Ready Components
The current architecture provides a strong foundation for advanced AI agent capabilities:
- Tool Execution: MCP protocol support with Lambda-based tools
- Memory Systems: Rich MongoDB models with file processing
- Async Processing: Comprehensive queue system for AI tasks
- Real-time Communication: WebSocket infrastructure for agent updates
- Multi-LLM Support: Abstracted chat completion service
- Permission System: Scoped agent actions with CASL authorization
- Event System: Loose coupling for agent event handling
Scalability Considerations
- Serverless Scaling: Automatic scaling based on demand
- Queue Management: Backpressure handling and rate limiting
- Database Optimization: Indexing and query optimization
- CDN Strategy: Global content distribution
- Cost Optimization: Usage-based pricing and resource optimization
This architecture provides a robust, scalable foundation for Bike4Mind's current functionality while being well-positioned for future AI agent capabilities and enhanced automation features.