File Management Architecture

This document explains how files (knowledge base documents) are managed within Bike4Mind sessions (notebooks) and how they are used as context during AI interactions.

Overview

Bike4Mind allows users to attach files to their notebooks, which then serve as contextual knowledge for AI interactions. This system involves several key components working together to manage file state, display, and integration with the chat completion pipeline.

Key Components

1. Session Document Structure

Sessions store file references in the knowledgeIds array:

interface ISession {
  id: string;
  name: string;
  userId: string;
  knowledgeIds?: Array<IFabFileDocument['id']>; // Array of file IDs
  // ... other fields
}
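
To illustrate how the knowledgeIds array might be maintained, here is a minimal sketch of attaching and detaching a file ID without mutating the session object. `SessionLike`, `attachFile`, and `detachFile` are hypothetical names for this example and are not taken from the Bike4Mind codebase.

```typescript
// Hypothetical, trimmed-down stand-in for ISession (illustration only).
interface SessionLike {
  id: string;
  name: string;
  userId: string;
  knowledgeIds?: string[];
}

// Return a copy of the session with fileId attached; duplicates are ignored.
function attachFile(session: SessionLike, fileId: string): SessionLike {
  const current = session.knowledgeIds ?? [];
  if (current.indexOf(fileId) !== -1) return session;
  return { ...session, knowledgeIds: [...current, fileId] };
}

// Return a copy of the session with fileId removed.
function detachFile(session: SessionLike, fileId: string): SessionLike {
  const current = session.knowledgeIds ?? [];
  return { ...session, knowledgeIds: current.filter((id) => id !== fileId) };
}
```

Returning new objects rather than mutating in place keeps change detection simple for UI state, which is one reason this style is common in React applications.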

2. WorkBench Component

The WorkBench component (packages/client/app/components/Session/WorkBench.tsx) provides the UI for managing attached files:

  • Display: Shows attached files as interactive chips
  • Remove: Allows users to remove files from the session
  • Auto-hide: Automatically collapses after 10 seconds of inactivity
  • File Support: Indicates whether files are supported by the selected model

Key features:

  • Visual indicators for auto-detected file types
  • Model compatibility warnings
  • Progressive loading animation
  • Collapsed state for minimal UI footprint
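
The auto-hide behavior listed above can be reduced to a small timing check. The sketch below is an assumption about how such a check could look; `AUTO_HIDE_MS`, `shouldCollapse`, and `msUntilCollapse` are hypothetical names, and the real component may use a different mechanism.

```typescript
// 10 seconds of inactivity, matching the Auto-hide behavior described above.
const AUTO_HIDE_MS = 10000;

// True once the idle period has fully elapsed since the last interaction.
function shouldCollapse(
  lastActivityAt: number,
  now: number,
  idleMs: number = AUTO_HIDE_MS
): boolean {
  return now - lastActivityAt >= idleMs;
}

// Milliseconds left before collapsing; useful as a timer delay. Never negative.
function msUntilCollapse(
  lastActivityAt: number,
  now: number,
  idleMs: number = AUTO_HIDE_MS
): number {
  return Math.max(0, idleMs - (now - lastActivityAt));
}
```

Passing timestamps in explicitly, instead of reading the clock inside the functions, makes the logic easy to test.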

3. Sessions Context

The SessionsContext (packages/client/app/contexts/SessionsContext.tsx) manages the state of files:

// Zustand store for managing workbench state
const useWorkBenchStore = create<WorkBenchStore>((set, get) => ({
  sessionStates: {},
  setWorkBenchFiles: (sessionId, files) => { /* ... */ },
  // ... other methods
}));

Key responsibilities:

  • Fetches files from server when session changes
  • Maintains local state for quick UI updates
  • Synchronizes with server on changes
  • Handles file deduplication
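
The per-session state and deduplication responsibilities above could be sketched as a pure update function. This is not the actual store code: `FileLike`, `SessionStates`, `dedupeById`, and `withWorkBenchFiles` are hypothetical names, and the shape of `sessionStates` is assumed.

```typescript
// Hypothetical minimal file shape; the real IFabFileDocument has more fields.
interface FileLike {
  id: string;
  fileName: string;
}

// Assumed shape: file lists keyed by session ID.
type SessionStates = Record<string, { files: FileLike[] }>;

// Drop repeated IDs, keeping the first occurrence of each file.
function dedupeById(files: FileLike[]): FileLike[] {
  const seen: Record<string, boolean> = {};
  return files.filter((file) => {
    if (seen[file.id]) return false;
    seen[file.id] = true;
    return true;
  });
}

// Replace one session's files without touching any other session's entry.
function withWorkBenchFiles(
  states: SessionStates,
  sessionId: string,
  files: FileLike[]
): SessionStates {
  return { ...states, [sessionId]: { files: dedupeById(files) } };
}
```

A function like this could serve as the body of a store setter, since it takes the previous state and returns the next one.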

4. Chat Completion Integration

When a user sends a message, the attached files are processed:

  1. Session File IDs: The session's knowledgeIds are collected
  2. Message File IDs: IDs of files attached to the specific message are added
  3. System Files: Global and user-specific system file IDs are added
  4. Processing: All referenced files are fetched, converted to messages, and included as context
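
The collection steps above can be sketched as a single merge over the four sources. The deduplication step is an assumption added for illustration (the source does not say whether the service deduplicates at this point), and `FileIdSources` and `collectFileIds` are hypothetical names.

```typescript
// Hypothetical container for the four ID sources listed above.
interface FileIdSources {
  sessionFileIds: string[];
  messageFileIds: string[];
  userSystemFileIds: string[];
  globalSystemFileIds: string[];
}

// Merge all sources in a fixed order and keep the first occurrence of each ID.
// Deduplicating here is an assumption, not confirmed behavior.
function collectFileIds(sources: FileIdSources): string[] {
  const ordered = [
    ...sources.sessionFileIds,
    ...sources.messageFileIds,
    ...sources.userSystemFileIds,
    ...sources.globalSystemFileIds,
  ];
  return ordered.filter((id, index) => ordered.indexOf(id) === index);
}
```

Keeping the first occurrence preserves the source order, so session-level files stay ahead of system files when the same ID appears in both.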

File Management Flow

Architecture Overview

flowchart TB
subgraph "User Interface"
UI[User Selects Files]
WB[WorkBench Component]
CS[CollapsedWorkBench]
end

subgraph "State Management"
SC[SessionsContext]
WBS[WorkBenchStore]
LS[Local State]
end

subgraph "Session Storage"
SD[Session Document]
KI[knowledgeIds Array]
end

subgraph "Chat Completion"
CC[ChatCompletionService]
FP[File Processing]
CM[Context Messages]
end

subgraph "Database"
DB[(MongoDB)]
FF[FabFiles Collection]
end

%% User actions
UI -->|"Add File"| WB
UI -->|"Remove File"| WB
WB -->|"Display Files"| CS

%% State flow
WB -->|"Update"| SC
SC -->|"setWorkBenchFiles"| WBS
SC -->|"Update Session"| SD
SD -->|"Store"| KI

%% File fetching
SC -->|"fetchFiles"| FF
FF -->|"Return Files"| LS
LS -->|"Initialize"| WBS

%% Chat completion flow
KI -->|"Pass fabFileIds"| CC
CC -->|"Fetch & Convert"| FP
FP -->|"Generate Messages"| CM
CM -->|"Include in LLM Context"| CC

%% Database operations
SD <-->|"Save/Load"| DB
FF <-->|"Query"| DB

style UI fill:#e1f5e1
style WB fill:#e1f5e1
style CS fill:#e1f5e1
style CC fill:#ffe1e1
style FP fill:#ffe1e1
style CM fill:#ffe1e1
style DB fill:#e1e1ff

Sequence Diagram

The following sequence diagram shows the complete interaction flow:

sequenceDiagram
participant U as User
participant WB as WorkBench UI
participant SC as SessionsContext
participant DB as Database
participant CC as ChatCompletion
participant LLM as LLM Model

Note over U,LLM: File Addition Flow
U->>WB: Add file to notebook
WB->>SC: Update workBenchFiles
SC->>DB: Update session.knowledgeIds
DB-->>SC: Confirm update
SC-->>WB: Update UI state
WB-->>U: Show file chip

Note over U,LLM: File Removal Flow
U->>WB: Remove file
WB->>SC: Filter out file
SC->>DB: Update session.knowledgeIds
DB-->>SC: Confirm removal
SC-->>WB: Update UI state
WB-->>U: Hide file chip

Note over U,LLM: Chat Completion Flow
U->>CC: Send message
CC->>DB: Fetch session
DB-->>CC: Return session with knowledgeIds

par File Processing
CC->>DB: Fetch file contents
DB-->>CC: Return file data
and System Files
CC->>DB: Fetch system files
DB-->>CC: Return system data
and Message Files
CC->>DB: Fetch message files
DB-->>CC: Return message data
end

CC->>CC: Convert files to messages
CC->>CC: Build context with files
CC->>LLM: Send context + prompt
LLM-->>CC: Generate response
CC-->>U: Stream response

Detailed Flow Steps

1. Adding Files to Context

  1. User Action: User uploads or selects files to attach
  2. UI Update: WorkBench component displays new files immediately
  3. State Update: SessionsContext updates the WorkBenchStore
  4. Server Sync: Session document is updated with new knowledgeIds
  5. Persistence: Changes are saved to MongoDB

2. Removing Files from Context

  1. User Action: User clicks the delete button on a file chip
  2. Optimistic Update: File is removed from UI immediately
  3. State Update: WorkBenchFiles is filtered to exclude the removed file
  4. Server Update: Session's knowledgeIds array is updated
  5. Confirmation: Success toast shown to user
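
The optimistic removal described above can be modeled with two small pure functions: one computes the state to show immediately and a snapshot to fall back on, and the other picks the final state once the server call settles. `OptimisticRemoval`, `removeOptimistically`, and `settleRemoval` are hypothetical names, and the rollback-on-failure behavior is an assumption based on the best practices in this document.

```typescript
interface OptimisticRemoval {
  next: string[];     // IDs to display immediately
  rollback: string[]; // Snapshot to restore if the server update fails
}

// Compute the optimistic state without mutating the input array.
function removeOptimistically(
  knowledgeIds: string[],
  fileId: string
): OptimisticRemoval {
  return {
    next: knowledgeIds.filter((id) => id !== fileId),
    rollback: knowledgeIds.slice(),
  };
}

// Choose the final state once the (hypothetical) server call has settled.
function settleRemoval(removal: OptimisticRemoval, serverOk: boolean): string[] {
  return serverOk ? removal.next : removal.rollback;
}
```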

3. Using Files in Chat Completion

When a user sends a message:

// In ChatCompletionService.process()
const fabFileProcessingStartTime = Date.now();
const { promptMessages: fabMessages, convertedFabFiles } =
  await this.fabFilesToMessages(
    [
      ...sessionFabFileIds,     // Session-level files
      ...messageFileIds,        // Message-specific files
      ...enabledSystemFileIds,  // User's system files
      ...globalSystemFileIds    // Global system files
    ],
    quest,
    embeddingFactory,
    message,
    maxTokens,
    modelInfo
  );

The process:

  1. Collects all relevant file IDs from various sources
  2. Fetches file content from storage
  3. Converts files to appropriate message format
  4. Includes processed content in LLM context
  5. Manages token limits to prevent context overflow
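
As a rough illustration of the token-limit step, the sketch below includes files in order until a budget is used up. It is not the logic inside fabFilesToMessages: `ConvertedFile`, `estimateTokens`, and `fitToBudget` are hypothetical, and the four-characters-per-token estimate is a common rule of thumb, not a real tokenizer.

```typescript
// Hypothetical shape for a file whose content has already been converted to text.
interface ConvertedFile {
  id: string;
  text: string;
}

// Crude estimate (assumption): about 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Include files in order while they fit; record the IDs of those that do not.
function fitToBudget(
  files: ConvertedFile[],
  maxTokens: number
): { included: ConvertedFile[]; skippedIds: string[]; usedTokens: number } {
  const included: ConvertedFile[] = [];
  const skippedIds: string[] = [];
  let usedTokens = 0;
  files.forEach((file) => {
    const cost = estimateTokens(file.text);
    if (usedTokens + cost <= maxTokens) {
      included.push(file);
      usedTokens += cost;
    } else {
      skippedIds.push(file.id);
    }
  });
  return { included, skippedIds, usedTokens };
}
```

A production implementation would more likely truncate or summarize oversized files than skip them outright; skipping simply keeps the sketch short.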

File Types and Processing

Supported File Types

Files are processed based on their MIME type:

  • Text Files: text/plain, text/markdown, text/csv
  • Documents: application/pdf
  • Code: application/json
  • Web: text/html
  • Images: Supported only by vision-enabled models
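
The mapping above can be expressed as a lookup table. This is a sketch only: `FileCategory`, `MIME_CATEGORIES`, and `categorize` are hypothetical names, and treating every unlisted type as "unsupported" is an assumption.

```typescript
type FileCategory = "text" | "document" | "code" | "web" | "image" | "unsupported";

// MIME types taken from the list above, mapped to their category.
const MIME_CATEGORIES: Record<string, FileCategory> = {
  "text/plain": "text",
  "text/markdown": "text",
  "text/csv": "text",
  "application/pdf": "document",
  "application/json": "code",
  "text/html": "web",
};

function categorize(mimeType: string): FileCategory {
  // Drop parameters such as "; charset=utf-8" and normalize case.
  const normalized = mimeType.split(";")[0].trim().toLowerCase();
  if (normalized.indexOf("image/") === 0) return "image";
  if (Object.prototype.hasOwnProperty.call(MIME_CATEGORIES, normalized)) {
    return MIME_CATEGORIES[normalized];
  }
  return "unsupported"; // Assumption: anything not listed is not processed.
}
```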

Auto-Detection

Files without extensions are automatically detected as plain text and marked with a visual indicator.
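
A minimal sketch of this rule, assuming detection is based on the file name alone: `DetectedType` and `detectMimeType` are hypothetical names, and the `application/octet-stream` fallback for files that have an extension but no declared type is also an assumption.

```typescript
interface DetectedType {
  mimeType: string;
  autoDetected: boolean; // Drives the visual indicator in the UI
}

function detectMimeType(fileName: string, declaredMimeType?: string): DetectedType {
  // Look only at the last path segment so dots in directory names are ignored.
  const baseName = fileName.slice(fileName.lastIndexOf("/") + 1);
  const dot = baseName.lastIndexOf(".");
  // A leading dot (".env") or trailing dot ("notes.") does not count as an extension.
  const hasExtension = dot > 0 && dot < baseName.length - 1;
  if (!hasExtension) {
    return { mimeType: "text/plain", autoDetected: true };
  }
  return {
    mimeType: declaredMimeType ?? "application/octet-stream", // assumed fallback
    autoDetected: false,
  };
}
```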

Model Compatibility

The WorkBench component checks model capabilities:

  • Text models without vision support cannot process images
  • Image generation models have specific file type restrictions
  • Unsupported files are visually marked in the UI
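
The vision check above could look roughly like the following. `ModelCapabilities`, `AttachedFile`, `isFileSupported`, and `partitionBySupport` are hypothetical names; the restrictions for image generation models are not specified in this document, so the sketch does not model them.

```typescript
// Hypothetical capability flags; the real model metadata may differ.
interface ModelCapabilities {
  supportsVision: boolean;
}

interface AttachedFile {
  id: string;
  mimeType: string;
}

// Images require a vision-enabled model; all other types are assumed supported.
function isFileSupported(file: AttachedFile, model: ModelCapabilities): boolean {
  const isImage = file.mimeType.toLowerCase().indexOf("image/") === 0;
  return isImage ? model.supportsVision : true;
}

// Split files so the UI can mark the unsupported ones.
function partitionBySupport(
  files: AttachedFile[],
  model: ModelCapabilities
): { supported: AttachedFile[]; unsupported: AttachedFile[] } {
  return {
    supported: files.filter((file) => isFileSupported(file, model)),
    unsupported: files.filter((file) => !isFileSupported(file, model)),
  };
}
```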

Performance Optimizations

1. Progressive Loading

  • Files are fetched in parallel with other chat preparation steps
  • Non-blocking UI updates for better perceived performance

2. Caching

  • Local file metadata cached in WorkBenchStore
  • Reduces redundant server fetches
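
One way a metadata cache can reduce redundant fetches is to request only the IDs it has not seen yet. The sketch below assumes a simple ID-keyed cache; `FileMeta`, `MetaCache`, `missingIds`, and `mergeIntoCache` are hypothetical names and do not describe the actual WorkBenchStore internals.

```typescript
// Hypothetical cached metadata record.
interface FileMeta {
  id: string;
  fileName: string;
}

type MetaCache = Record<string, FileMeta>;

// IDs that still need a server round trip.
function missingIds(cache: MetaCache, requestedIds: string[]): string[] {
  return requestedIds.filter(
    (id) => !Object.prototype.hasOwnProperty.call(cache, id)
  );
}

// Return a new cache that also contains the freshly fetched records.
function mergeIntoCache(cache: MetaCache, fetched: FileMeta[]): MetaCache {
  const next: MetaCache = { ...cache };
  fetched.forEach((file) => {
    next[file.id] = file;
  });
  return next;
}
```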

3. Smart Context Management

  • Queries are classified by complexity to decide how much file processing is needed
  • Simple queries may skip extensive file processing
  • Token limits enforced to prevent context overflow

Best Practices

For Developers

  1. State Management

    • Always use WorkBenchStore for file state
    • Implement optimistic updates for better UX
    • Handle errors gracefully with rollback
  2. Performance

    • Batch file operations when possible
    • Use parallel processing for file fetching
    • Monitor token usage to prevent overflow
  3. Security

    • Validate file access permissions
    • Use proper scope filters when fetching files
    • Never expose internal file paths

For Users

  1. File Selection

    • Choose relevant files for your task
    • Remove unnecessary files to improve performance
    • Be aware of model limitations for file types
  2. Context Management

    • Monitor the file count indicator
    • Use the collapsed view to save screen space
    • Leverage auto-detection for text files

Troubleshooting

Common Issues

  1. Files Not Showing

    • Check session's knowledgeIds array
    • Verify file exists in database
    • Ensure proper permissions
  2. Model Compatibility Warnings

    • Verify model supports the file type
    • Consider switching to a compatible model
    • Remove unsupported files
  3. Performance Issues

    • Reduce number of attached files
    • Check file sizes
    • Monitor token usage

API Reference

Key Functions

// Add files to session
setWorkBenchFiles(sessionId: string, files: IFabFileDocument[])

// Remove file from session
handleRemove(fileId: string)

// Fetch files from server
fetchFiles(knowledgeIds: string[]): Promise<IFabFileDocument[]>

// Convert files to messages
fabFilesToMessages(
  fabFileIds: string[],
  quest: IChatHistoryItemDocument,
  embeddingFactory: EmbeddingFactory,
  message: string,
  max_tokens: number,
  modelInfo: ModelInfo
): Promise<{ promptMessages: IMessage[], convertedFabFiles: IFabFileDocument[] }>

Future Enhancements

  • Drag & Drop: Direct file upload via drag & drop
  • File Preview: Quick preview of file contents
  • Smart Suggestions: AI-powered file recommendations
  • Batch Operations: Select multiple files for bulk actions
  • File Search: Search within attached files
  • Version Control: Track file versions and changes