Extending the Image Generation System
This guide covers how to add new AI image providers to Bike4Mind's image generation system.
Architecture Overview
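The image generation system uses a provider pattern: each AI service is wrapped in a class that implements AIImageService and is registered with the provider factory. The diagram shows the extension points and the core components they connect to: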
graph TB
subgraph "Extension Points"
NewProvider[New Provider Service]
NewModels[New Model Definitions]
NewBackend[New Backend Info]
NewSchemas[New Parameter Schemas]
end
subgraph "Core System"
Factory[Provider Factory]
BaseService[AIImageService]
Models[Model Registry]
UI[Frontend Components]
end
NewProvider --> Factory
NewProvider -.implements.-> BaseService
NewModels --> Models
NewBackend --> Models
NewSchemas --> UI
Adding a New Provider
Step 1: Define Models
First, add your new models to the ImageModels enum:
// b4m-core/packages/core/common/models.ts
export enum ImageModels {
// Existing models...
GPT_IMAGE_1 = 'gpt-image-1',
FLUX_PRO = 'flux-pro',
// Your new models
MIDJOURNEY_V6 = 'midjourney-v6',
MIDJOURNEY_NIJI = 'midjourney-niji',
STABLE_DIFFUSION_XL = 'stable-diffusion-xl',
}
// Add model constraints
export const IMAGE_SIZE_CONSTRAINTS = {
// Existing constraints...
MIDJOURNEY: {
sizes: ['1024x1024', '1152x896', '896x1152', '1216x832', '832x1216'] as const,
defaultSize: '1024x1024',
aspectRatios: ['1:1', '4:3', '3:4', '16:9', '9:16'],
},
STABLE_DIFFUSION: {
minWidth: 512,
maxWidth: 1536,
minHeight: 512,
maxHeight: 1536,
stepSize: 64,
defaultSize: '1024x1024',
},
} as const;
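The range-style constraint above (minimum, maximum, and step size) only helps if something checks requested sizes against it. The following is a minimal sketch of such a check, not the project's actual validator. The names `RangeConstraint`, `SD_CONSTRAINT`, and `isSizeAllowed` are hypothetical, and it assumes that `stepSize` means both dimensions must be multiples of that value.

```typescript
// Assumed shape: mirrors the STABLE_DIFFUSION entry of IMAGE_SIZE_CONSTRAINTS.
interface RangeConstraint {
  minWidth: number;
  maxWidth: number;
  minHeight: number;
  maxHeight: number;
  stepSize: number;
}

const SD_CONSTRAINT: RangeConstraint = {
  minWidth: 512,
  maxWidth: 1536,
  minHeight: 512,
  maxHeight: 1536,
  stepSize: 64,
};

// Hypothetical helper: parse "WIDTHxHEIGHT" and check it against a range constraint.
function isSizeAllowed(size: string, constraint: RangeConstraint): boolean {
  const match = /^(\d+)x(\d+)$/.exec(size);
  if (!match) return false;
  const width = Number(match[1]);
  const height = Number(match[2]);
  const inRange =
    width >= constraint.minWidth &&
    width <= constraint.maxWidth &&
    height >= constraint.minHeight &&
    height <= constraint.maxHeight;
  // Assumption: stepSize requires each dimension to be a multiple of it.
  const onStep = width % constraint.stepSize === 0 && height % constraint.stepSize === 0;
  return inRange && onStep;
}
```

With these values, `isSizeAllowed('1024x1024', SD_CONSTRAINT)` passes, while `'1000x1024'` is rejected because 1000 is not a multiple of 64.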
Step 2: Create Parameter Schemas
Define validation schemas for your provider's parameters:
// b4m-core/packages/core/common/schemas/midjourney.ts
import { z } from 'zod';
import { ImageModels } from '../models';
export const MIDJOURNEY_STYLES = {
RAW: 'raw',
STYLIZE_LOW: 'stylize-low',
STYLIZE_MED: 'stylize-med',
STYLIZE_HIGH: 'stylize-high',
STYLIZE_VERY_HIGH: 'stylize-very-high',
} as const;
export const MIDJOURNEY_MODELS = [
ImageModels.MIDJOURNEY_V6,
ImageModels.MIDJOURNEY_NIJI,
] as const;
export const MidjourneyImageGenerationSchema = z.object({
prompt: z.string().min(1).max(4000),
model: z.enum(MIDJOURNEY_MODELS),
aspect_ratio: z.string().optional(),
stylize: z.number().min(0).max(1000).optional().default(100),
chaos: z.number().min(0).max(100).optional().default(0),
quality: z.number().min(0.25).max(2).optional().default(1),
seed: z.number().optional(),
stop: z.number().min(10).max(100).optional(),
style: z.nativeEnum(MIDJOURNEY_STYLES).optional(),
tile: z.boolean().optional().default(false),
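weird: z.number().min(0).max(3000).optional(),
// Number of images to request; generate() defaults this to 1
n: z.number().int().min(1).optional(),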
});
export type MidjourneyImageGenerationOptions = z.infer<typeof MidjourneyImageGenerationSchema>;
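The schema accepts any string for `aspect_ratio`, so a malformed value would only surface as a provider error. A stricter check could look like the sketch below. The helper name `parseAspectRatio` is hypothetical, and the "W:H with positive integers" format is an assumption based on the `aspectRatios` values listed in Step 1.

```typescript
// Hypothetical helper: accept "W:H" where both parts are positive integers.
function parseAspectRatio(value: string): { width: number; height: number } | null {
  const match = /^(\d+):(\d+)$/.exec(value.trim());
  if (!match) return null;
  const width = Number(match[1]);
  const height = Number(match[2]);
  // A zero on either side is not a meaningful ratio.
  if (width === 0 || height === 0) return null;
  return { width, height };
}
```

A predicate like `(v) => parseAspectRatio(v) !== null` could then be attached to the `aspect_ratio` field with Zod's `refine`, if that fits the project's validation style.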
Step 3: Implement Provider Service
Create a service class that extends AIImageService:
// b4m-core/packages/core/utils/imageGeneration/MidjourneyImageService.ts
import { AIImageService } from './AIImageService';
import { Logger } from '../logger';
import { ImageModels } from '@b4m-core/common/models';
import { MidjourneyImageGenerationOptions } from '@b4m-core/common/schemas/midjourney';
import axios from 'axios';
export class MidjourneyImageService extends AIImageService {
// Illustrative base URL; replace it with the endpoint of the API you integrate with
private baseUrl = 'https://api.midjourney.com/v1';
constructor(apiKey: string, logger: Logger) {
super(apiKey, logger);
}
async generate(
prompt: string,
options: MidjourneyImageGenerationOptions
): Promise<string[]> {
try {
const {
model,
aspect_ratio,
stylize,
chaos,
quality,
seed,
stop,
style,
tile,
weird,
n = 1,
} = options;
// Build Midjourney prompt with parameters
let fullPrompt = prompt;
if (aspect_ratio) fullPrompt += ` --ar ${aspect_ratio}`;
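// Skip unset values and defaults so "undefined" never leaks into the prompt
if (stylize !== undefined && stylize !== 100) fullPrompt += ` --stylize ${stylize}`;
if (chaos !== undefined && chaos > 0) fullPrompt += ` --chaos ${chaos}`;
if (quality !== undefined && quality !== 1) fullPrompt += ` --quality ${quality}`;
if (seed !== undefined) fullPrompt += ` --seed ${seed}`; // a seed of 0 is valid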
if (stop) fullPrompt += ` --stop ${stop}`;
if (style) fullPrompt += ` --style ${style}`;
if (tile) fullPrompt += ` --tile`;
if (weird) fullPrompt += ` --weird ${weird}`;
// Generate images in parallel
const images = await Promise.all(
Array.from({ length: n }).map(async () => {
// Submit generation request
const submitResponse = await axios.post(`${this.baseUrl}/imagine`, {
prompt: fullPrompt,
webhook_url: null, // Handle async via polling
}, {
headers: {
'Authorization': `Bearer ${this.apiKey}`,
'Content-Type': 'application/json',
},
});
const taskId = submitResponse.data.task_id;
// Poll for completion
return await this.pollForResult(taskId);
})
);
return images;
} catch (error) {
this.logger.error('Midjourney generation failed:', error);
throw error instanceof Error ? error : new Error('Midjourney generation error');
}
}
async edit(
image: string,
prompt: string,
options: any
): Promise<string> {
// Midjourney doesn't support direct editing - use vary or remix
throw new Error('Direct editing not supported - use vary/remix operations');
}
async variations(image: Buffer, options: any): Promise<string[]> {
// Upload image and create variations
const uploadResponse = await this.uploadImage(image);
const imageUrl = uploadResponse.data.url;
const response = await axios.post(`${this.baseUrl}/vary`, {
image_url: imageUrl,
variation_type: options.type || 'strong',
}, {
headers: {
'Authorization': `Bearer ${this.apiKey}`,
'Content-Type': 'application/json',
},
});
// pollForResult resolves to a single URL; wrap it to match Promise<string[]>
return [await this.pollForResult(response.data.task_id)];
}
private async pollForResult(taskId: string): Promise<string> {
const maxAttempts = 120; // 10 minutes with 5s intervals
const interval = 5000;
for (let attempt = 0; attempt < maxAttempts; attempt++) {
const response = await axios.get(`${this.baseUrl}/task/${taskId}`, {
headers: {
'Authorization': `Bearer ${this.apiKey}`,
},
});
const { status, result } = response.data;
if (status === 'completed' && result?.image_url) {
return result.image_url;
}
if (status === 'failed') {
throw new Error(`Generation failed: ${result?.error || 'Unknown error'}`);
}
if (status === 'moderated') {
throw new Error('Content was moderated - please adjust your prompt');
}
// Wait before next poll
await new Promise(resolve => setTimeout(resolve, interval));
}
throw new Error(`Generation timed out after ${maxAttempts} attempts`);
}
private async uploadImage(imageBuffer: Buffer): Promise<any> {
const formData = new FormData();
formData.append('image', new Blob([imageBuffer]), 'image.png');
return await axios.post(`${this.baseUrl}/upload`, formData, {
headers: {
'Authorization': `Bearer ${this.apiKey}`,
'Content-Type': 'multipart/form-data',
},
});
}
}
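The flag-building logic inside `generate` is easier to test when it lives in a pure function. The sketch below shows one way to extract it. `PromptFlags` and `buildMidjourneyPrompt` are hypothetical names, and the interface is an assumed subset of `MidjourneyImageGenerationOptions` with every field optional.

```typescript
// Assumed subset of MidjourneyImageGenerationOptions; all fields optional here.
interface PromptFlags {
  aspect_ratio?: string;
  stylize?: number;
  chaos?: number;
  quality?: number;
  seed?: number;
  stop?: number;
  style?: string;
  tile?: boolean;
  weird?: number;
}

// Hypothetical helper: append "--flag value" pairs, skipping unset values and defaults.
function buildMidjourneyPrompt(prompt: string, flags: PromptFlags = {}): string {
  const parts: string[] = [prompt.trim()];
  if (flags.aspect_ratio) parts.push(`--ar ${flags.aspect_ratio}`);
  if (flags.stylize !== undefined && flags.stylize !== 100) parts.push(`--stylize ${flags.stylize}`);
  if (flags.chaos !== undefined && flags.chaos > 0) parts.push(`--chaos ${flags.chaos}`);
  if (flags.quality !== undefined && flags.quality !== 1) parts.push(`--quality ${flags.quality}`);
  if (flags.seed !== undefined) parts.push(`--seed ${flags.seed}`); // seed 0 is kept
  if (flags.stop !== undefined) parts.push(`--stop ${flags.stop}`);
  if (flags.style) parts.push(`--style ${flags.style}`);
  if (flags.tile) parts.push('--tile');
  if (flags.weird) parts.push(`--weird ${flags.weird}`);
  return parts.join(' ');
}
```

Keeping this separate from the HTTP calls means the prompt format can be unit tested without mocking the network.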
Step 4: Update Provider Factory
Add your new provider to the factory:
// b4m-core/packages/core/utils/imageGeneration/index.ts
import { MidjourneyImageService } from './MidjourneyImageService';
type ImageServiceTypes = {
openai: OpenAIImageService;
bfl: BFLImageService;
midjourney: MidjourneyImageService; // Add your provider
test: TestImageService;
};
export function aiImageService<V extends ImageGenerationVendor>(
vendor: V,
apiKey: string,
logger: Logger
): ImageServiceTypes[V] {
switch (vendor) {
case 'openai':
return new OpenAIImageService(apiKey, logger) as ImageServiceTypes[V];
case 'bfl':
return new BFLImageService(apiKey, logger) as ImageServiceTypes[V];
case 'midjourney':
return new MidjourneyImageService(apiKey, logger) as ImageServiceTypes[V];
case 'test':
return new TestImageService(apiKey, logger) as ImageServiceTypes[V];
default:
throw new Error(`Unknown AI image generator vendor: ${vendor}`);
}
}
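One risk with a switch-based factory is adding a vendor to the type map but forgetting the matching `case`. A possible safeguard, sketched here with stub classes rather than the real services, is an exhaustiveness check using `never`. All names in this block (`StubOpenAIService`, `StubMidjourneyService`, `createStubService`) are hypothetical.

```typescript
// Stub services standing in for the real provider classes (hypothetical).
class StubOpenAIService {
  readonly vendor = 'openai';
}
class StubMidjourneyService {
  readonly vendor = 'midjourney';
}

type StubServiceTypes = {
  openai: StubOpenAIService;
  midjourney: StubMidjourneyService;
};
type StubVendor = keyof StubServiceTypes;

function createStubService<V extends StubVendor>(vendor: V): StubServiceTypes[V] {
  // Widen to the plain union so the switch can narrow it case by case.
  const v: StubVendor = vendor;
  switch (v) {
    case 'openai':
      return new StubOpenAIService() as StubServiceTypes[V];
    case 'midjourney':
      return new StubMidjourneyService() as StubServiceTypes[V];
    default: {
      // Compile-time check: this assignment fails if a vendor lacks a case.
      const unreachable: never = v;
      throw new Error(`Unknown AI image generator vendor: ${String(unreachable)}`);
    }
  }
}
```

If a third key is added to `StubServiceTypes` without a `case`, the `never` assignment stops compiling, which catches the omission before runtime.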
Step 5: Add Backend Model Info
Create a backend class to provide model information:
// b4m-core/packages/core/utils/llm/midjourneyBackend.ts
import { ICompletionBackend, ModelInfo, ModelBackend } from '@b4m-core/common';
import { ImageModels } from '@b4m-core/common/models';
export class MidjourneyBackend implements ICompletionBackend {
async getModelInfo(): Promise<ModelInfo[]> {
return [
{
id: ImageModels.MIDJOURNEY_V6,
type: 'image',
name: 'Midjourney v6',
backend: ModelBackend.Midjourney,
contextWindow: 4000,
max_tokens: 4000,
supportsImageVariation: true,
pricing: {
1000: { input: 20, output: 20 }, // $0.02 per image
},
description: 'Midjourney v6 - Premium AI art generation with advanced artistic styles',
supportsSafetyTolerance: false,
rank: 2,
logoFile: 'midjourney_logo.svg',
},
{
id: ImageModels.MIDJOURNEY_NIJI,
type: 'image',
name: 'Midjourney Niji',
backend: ModelBackend.Midjourney,
contextWindow: 4000,
max_tokens: 4000,
supportsImageVariation: true,
pricing: {
1000: { input: 20, output: 20 },
},
description: 'Midjourney Niji - Specialized for anime and manga-style artwork',
supportsSafetyTolerance: false,
rank: 3,
logoFile: 'midjourney_logo.svg',
},
];
}
// Implement other required methods...
async complete(): Promise<any> {
throw new Error('Text completion not supported by Midjourney');
}
async streamComplete(): Promise<any> {
throw new Error('Streaming not supported by Midjourney');
}
}
Step 6: Update Frontend Components
Add provider-specific controls to the UI:
// packages/client/app/components/Session/MidjourneyControls.tsx
import { FC } from 'react';
import { Box, Select, Option, Slider, Switch, Typography } from '@mui/joy';
import { useLLM } from '@client/app/contexts/LLMContext';
import { MIDJOURNEY_STYLES } from '@b4m-core/common/schemas/midjourney';
interface MidjourneyControlsProps {
model: string;
}
const MidjourneyControls: FC<MidjourneyControlsProps> = ({ model }) => {
const {
stylize = 100,
chaos = 0,
quality = 1,
weird = 0,
tile = false,
style,
setLLM
} = useLLM();
if (!model.startsWith('midjourney')) return null;
return (
<Box>
<Typography level="h4">Midjourney Settings</Typography>
{/* Stylization */}
<Box>
<Typography>Stylization: {stylize}</Typography>
<Slider
value={stylize}
min={0}
max={1000}
step={25}
onChange={(_, value) => setLLM({ stylize: value as number })}
/>
</Box>
{/* Chaos */}
<Box>
<Typography>Chaos: {chaos}</Typography>
<Slider
value={chaos}
min={0}
max={100}
step={5}
onChange={(_, value) => setLLM({ chaos: value as number })}
/>
</Box>
{/* Quality */}
<Box>
<Typography>Quality: {quality}</Typography>
<Slider
value={quality}
min={0.25}
max={2}
step={0.25}
onChange={(_, value) => setLLM({ quality: value as number })}
/>
</Box>
{/* Style */}
<Box>
<Typography>Style</Typography>
<Select
value={style || ''}
onChange={(_, value) => setLLM({ style: value || undefined })}
>
<Option value="">Default</Option>
{Object.entries(MIDJOURNEY_STYLES).map(([key, value]) => (
<Option key={key} value={value}>
{key.replace(/_/g, ' ')}
</Option>
))}
</Select>
</Box>
{/* Weird */}
<Box>
<Typography>Weird: {weird}</Typography>
<Slider
value={weird}
min={0}
max={3000}
step={100}
onChange={(_, value) => setLLM({ weird: value as number })}
/>
</Box>
{/* Tile */}
<Box>
<Typography>Seamless Tile</Typography>
<Switch
checked={tile}
onChange={(e) => setLLM({ tile: e.target.checked })}
/>
</Box>
</Box>
);
};
export default MidjourneyControls;
Step 7: Update Image Generation Service
Modify the service to handle your new provider:
// b4m-core/packages/core/services/llm/ImageGeneration.ts
export class ImageGenerationService {
async process({ body, logger }: ProcessImageParams) {
// ... existing code ...
// Determine provider
const isBFLModel = BFL_IMAGE_MODELS.includes(model as any);
const isMidjourneyModel = MIDJOURNEY_MODELS.includes(model as any);
let service;
if (isBFLModel) {
service = aiImageService('bfl', apiKeyTable.bfl!, logger);
} else if (isMidjourneyModel) {
service = aiImageService('midjourney', apiKeyTable.midjourney!, logger);
} else {
service = aiImageService('openai', apiKeyTable.openai!, logger);
}
// Provider-specific generation
if (isMidjourneyModel) {
images = await service.generate(prompt, {
model: model as any,
aspect_ratio,
stylize: body.stylize,
chaos: body.chaos,
quality: body.quality,
weird: body.weird,
tile: body.tile,
style: body.style,
n,
});
} else {
// Handle other providers...
}
}
}
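As more providers are added, the chain of `isXModel` booleans grows with each one. One alternative, shown as a sketch only, is a single function that maps a model id to its vendor. The model lists here are assumed stand-ins for `BFL_IMAGE_MODELS` and `MIDJOURNEY_MODELS`, and `resolveVendor` is a hypothetical name.

```typescript
// Assumed model lists; the real ones come from the schema modules.
const SKETCH_BFL_MODELS: readonly string[] = ['flux-pro'];
const SKETCH_MIDJOURNEY_MODELS: readonly string[] = ['midjourney-v6', 'midjourney-niji'];

type SketchVendor = 'openai' | 'bfl' | 'midjourney';

// Hypothetical helper: map a model id to its vendor.
// Unrecognized models fall back to OpenAI, matching the else branch in process().
function resolveVendor(model: string): SketchVendor {
  if (SKETCH_BFL_MODELS.includes(model)) return 'bfl';
  if (SKETCH_MIDJOURNEY_MODELS.includes(model)) return 'midjourney';
  return 'openai';
}
```

The result can be passed straight to the factory, so adding a provider touches this function and the factory but not the branching in `process`.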
Adding API Key Management
Step 1: Update Common Types
// b4m-core/packages/core/common/types/apiKeys.ts
export enum ApiKeyType {
openai = 'openai',
bfl = 'bfl',
midjourney = 'midjourney', // Add your provider
anthropic = 'anthropic',
gemini = 'gemini',
}
Step 2: Add API Key Validation
// b4m-core/packages/core/services/apiKeyService/index.ts
export const getEffectiveLLMApiKeys = async (
userId: string,
{ db }: GetEffectiveApiKeyAdapters
): Promise<{
openai?: string;
bfl?: string;
midjourney?: string; // Add your provider
anthropic?: string;
gemini?: string;
}> => {
const keys = await Promise.all([
getEffectiveApiKey(userId, { type: ApiKeyType.openai }, { db }),
getEffectiveApiKey(userId, { type: ApiKeyType.bfl }, { db }),
getEffectiveApiKey(userId, { type: ApiKeyType.midjourney }, { db }),
// ... other providers
]);
return {
openai: keys[0],
bfl: keys[1],
midjourney: keys[2],
// ... other providers
};
};
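Every field in the returned key table is optional, so callers that use a non-null assertion (as in `apiKeyTable.midjourney!`) will pass `undefined` to the provider when no key is configured. A small guard can turn that into a readable error. This is a sketch under assumptions: `ApiKeyTable` approximates the return type above, and `requireApiKey` is a hypothetical helper.

```typescript
// Assumed shape of the object returned by getEffectiveLLMApiKeys.
type ApiKeyTable = {
  openai?: string;
  bfl?: string;
  midjourney?: string;
};

// Hypothetical helper: fail with a clear message instead of relying on `!`.
function requireApiKey(table: ApiKeyTable, vendor: keyof ApiKeyTable): string {
  const key = table[vendor];
  if (!key) {
    throw new Error(`No API key configured for image provider "${vendor}"`);
  }
  return key;
}
```

The error message names the provider, which gives the UI something specific to show the user.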
Testing Your Provider
Unit Tests
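// b4m-core/packages/core/utils/imageGeneration/__tests__/MidjourneyImageService.test.ts
import axios from 'axios';
import { MidjourneyImageService } from '../MidjourneyImageService';
import { Logger } from '../../logger';
import { ImageModels } from '@b4m-core/common/models';
import { MidjourneyImageGenerationSchema } from '@b4m-core/common/schemas/midjourney';
// The service calls the API through axios, so axios is what gets mocked
jest.mock('axios');
const mockedAxios = axios as jest.Mocked<typeof axios>;
describe('MidjourneyImageService', () => {
let service: MidjourneyImageService;
let mockLogger: Logger;
beforeEach(() => {
jest.resetAllMocks();
mockLogger = new Logger({ level: 'info' });
service = new MidjourneyImageService('test-api-key', mockLogger);
});
test('should generate image with basic prompt', async () => {
// Mock the submit call, then a poll response that reports completion immediately
mockedAxios.post.mockResolvedValue({
data: { task_id: 'test-task-id', status: 'submitted' },
});
mockedAxios.get.mockResolvedValue({
data: { status: 'completed', result: { image_url: 'https://example.com/image.png' } },
});
const images = await service.generate('A beautiful sunset', {
model: ImageModels.MIDJOURNEY_V6,
stylize: 100,
n: 1,
});
expect(images).toHaveLength(1);
expect(typeof images[0]).toBe('string');
});
test('should handle parameter validation', () => {
// Prompt validation lives in the Zod schema: an empty prompt fails min(1)
const result = MidjourneyImageGenerationSchema.safeParse({
prompt: '',
model: ImageModels.MIDJOURNEY_V6,
});
expect(result.success).toBe(false);
});
});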
Integration Tests
// Test provider integration with the full system
describe('Midjourney Integration', () => {
test('should generate image through full pipeline', async () => {
const result = await handleImageGenerationCommand({
params: 'A futuristic cityscape',
currentSession: testSession,
workBenchFiles: [],
queryClient: testQueryClient,
model: ImageModels.MIDJOURNEY_V6,
stylize: 500,
chaos: 20,
});
expect(result).toBeDefined();
});
});
Best Practices
Error Handling
- Implement comprehensive error handling for API failures
- Provide user-friendly error messages
- Handle rate limiting with bounded retries and a backoff delay between attempts
- Log errors with sufficient context for debugging
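The retry advice above can be sketched as a small wrapper with exponential backoff. This is an illustration under assumptions, not part of the existing codebase: `backoffDelayMs` and `withRetry` are hypothetical names, and the default delays (500 ms base, 8 s cap) and attempt count are placeholder values.

```typescript
// Hypothetical helper: exponential backoff delay, capped at maxDelayMs.
function backoffDelayMs(attempt: number, baseDelayMs = 500, maxDelayMs = 8000): number {
  return Math.min(maxDelayMs, baseDelayMs * 2 ** attempt);
}

// Retry an async operation while shouldRetry(error) returns true, up to maxAttempts tries.
async function withRetry<T>(
  operation: () => Promise<T>,
  shouldRetry: (error: unknown) => boolean,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      const isLastAttempt = attempt === maxAttempts - 1;
      // Give up on the final attempt or when the error is not retryable.
      if (isLastAttempt || !shouldRetry(error)) break;
      await sleep(backoffDelayMs(attempt));
    }
  }
  throw lastError;
}
```

A provider could wrap its submit request in `withRetry` and pass a predicate that returns true only for rate-limit responses such as HTTP 429, so that validation errors fail immediately instead of being retried.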
Performance
- Implement connection pooling for HTTP requests
- Use explicit timeouts for long-running operations (the polling loop in Step 3 gives up after 120 attempts at 5-second intervals, about 10 minutes)
- Cache frequently accessed data
- Monitor and optimize API call patterns
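For the caching point, data that changes rarely (for example, the list returned by `getModelInfo()`) can be held in memory with an expiry. The class below is a minimal sketch: `TtlCache` is a hypothetical name, and the injectable clock exists only to make expiry testable.

```typescript
// Hypothetical in-memory cache with per-entry expiry; the clock is injectable for tests.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // drop stale entries lazily on read
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

This version never evicts entries that are not read again, so it suits a small, fixed set of keys rather than unbounded input such as prompts.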
Security
- Validate all input parameters
- Sanitize prompts for safety
- Implement proper API key management
- Use secure communication channels
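Prompt sanitization matters particularly for a provider like the one in Step 3, where parameters are appended to the prompt as `--flag` tokens: a user-supplied prompt containing `--ar 16:9` would otherwise be interpreted as a parameter. The sketch below strips such tokens and control characters. `sanitizePrompt` is a hypothetical helper, and the 4000-character limit mirrors the schema's `max(4000)`.

```typescript
// Hypothetical sanitizer: removes control characters and user-supplied "--flag" tokens
// so that only flags added by the service reach the provider.
function sanitizePrompt(raw: string, maxLength = 4000): string {
  // Replace ASCII control characters (including newlines and tabs) with spaces.
  const withoutControlChars = raw.replace(/[\u0000-\u001f\u007f]/g, ' ');
  // Remove tokens that start with "--" at the beginning or after whitespace.
  const withoutFlags = withoutControlChars.replace(/(^|\s)--\S*/g, ' ');
  // Collapse runs of whitespace, trim, and enforce the length limit.
  return withoutFlags.replace(/\s+/g, ' ').trim().slice(0, maxLength);
}
```

Note that only the flag token is removed; its value stays in the prompt as plain text, so `'a cat --ar 16:9'` becomes `'a cat 16:9'`. Hyphenated words such as `well-known` are left untouched.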
User Experience
- Provide real-time progress updates
- Show informative status messages
- Handle edge cases gracefully
- Implement proper loading states
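Status messages can be derived directly from the task status that the polling loop already reads. The mapping below is a sketch: `TaskStatus` and `describeTaskStatus` are hypothetical names, the `'processing'` status is an assumption (the service code above only shows `'submitted'`, `'completed'`, `'failed'`, and `'moderated'`), and the wording of each message is a placeholder.

```typescript
// 'processing' is an assumed intermediate status; the others appear in the service code.
type TaskStatus = 'submitted' | 'processing' | 'completed' | 'failed' | 'moderated';

// Hypothetical helper: turn a polled status into a user-facing message.
function describeTaskStatus(taskStatus: TaskStatus, attempt: number, intervalMs = 5000): string {
  // Elapsed time is estimated from the number of polls made so far.
  const elapsedSeconds = Math.round((attempt * intervalMs) / 1000);
  switch (taskStatus) {
    case 'submitted':
      return 'Request queued...';
    case 'processing':
      return `Generating image (${elapsedSeconds}s elapsed)...`;
    case 'completed':
      return 'Image ready';
    case 'failed':
      return 'Generation failed, please try again';
    case 'moderated':
      return 'Content was moderated - please adjust your prompt';
  }
}
```

The polling loop could call this on each iteration and forward the result through a progress callback, which gives the frontend something to render while it waits.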
Deployment Checklist
- Add new models to enum and constraints
- Implement provider service class
- Update provider factory
- Add backend model information
- Create frontend parameter controls
- Update image generation service
- Add API key management
- Write comprehensive tests
- Update documentation
- Test in staging environment
- Deploy to production
- Monitor for issues
This guide provides a complete framework for adding new image generation providers to the system. Follow these patterns to ensure consistency and maintainability.