Cost Management

AI Controller provides cost management capabilities to help organizations control, monitor, and optimize their LLM expenditures.

Cost Management Overview

LLM services typically charge based on token usage, with different rates for different models and providers. AI Controller helps you manage these costs through:

  • Usage Tracking: Detailed monitoring of requests
  • Reporting: Basic usage metrics including number of requests, providers, models, and request lengths

Understanding LLM Costs

Token-Based Pricing

Most LLM providers use token-based pricing models. For a deeper understanding of different models and their pricing structures, see Models and Providers.

Provider    Model            Input Price (per 1K tokens)   Output Price (per 1K tokens)
OpenAI      GPT-4            $0.03                         $0.06
OpenAI      GPT-3.5-Turbo    $0.0015                       $0.002
Anthropic   Claude-3-Opus    $0.015                        $0.075
Anthropic   Claude-3-Sonnet  $0.003                        $0.015
Google      Gemini Pro       $0.00025                      $0.0005

Note: Prices are subject to change; always verify current pricing with providers.
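
To make the arithmetic concrete, here is a minimal Python sketch of token-based billing. The rates are copied from the illustrative table above (and will drift over time); the function and table names are ours, not part of AI Controller.

```python
# Illustrative rates from the table above; verify current provider pricing.
PRICES_PER_1K = {
    ("OpenAI", "GPT-4"): (0.03, 0.06),
    ("OpenAI", "GPT-3.5-Turbo"): (0.0015, 0.002),
    ("Anthropic", "Claude-3-Opus"): (0.015, 0.075),
}

def request_cost(provider: str, model: str,
                 input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request under token-based pricing."""
    input_rate, output_rate = PRICES_PER_1K[(provider, model)]
    return (input_tokens / 1000) * input_rate + (output_tokens / 1000) * output_rate

# A 1,500-token prompt with a 500-token completion:
print(request_cost("OpenAI", "GPT-4", 1500, 500))          # ~0.075
print(request_cost("OpenAI", "GPT-3.5-Turbo", 1500, 500))  # ~0.00325 (~23x cheaper)
```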

Cost Factors

Several factors affect your overall LLM costs:

  • Model selection: More capable models cost more
  • Request volume: Higher usage means higher costs
  • Prompt length: Longer prompts consume more input tokens
  • Response length: Longer responses consume more output tokens
  • Caching efficiency: Higher cache hit rates reduce costs
  • Provider selection: Different providers have different pricing
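
The sketch below combines several of these factors into a rough monthly estimate. It is a hypothetical helper, not an AI Controller API; the key idea is that only cache misses reach the provider and get billed.

```python
def estimated_monthly_spend(
    requests_per_month: int,
    avg_input_tokens: int,
    avg_output_tokens: int,
    input_rate_per_1k: float,
    output_rate_per_1k: float,
    cache_hit_rate: float = 0.0,
) -> float:
    """Rough monthly spend in dollars: only cache misses are billed."""
    per_request = (avg_input_tokens / 1000) * input_rate_per_1k \
                + (avg_output_tokens / 1000) * output_rate_per_1k
    billable_requests = requests_per_month * (1 - cache_hit_rate)
    return billable_requests * per_request

# 100k requests/month on GPT-3.5-Turbo, without and with a 40% cache hit rate:
print(estimated_monthly_spend(100_000, 800, 300, 0.0015, 0.002))       # ~180.0
print(estimated_monthly_spend(100_000, 800, 300, 0.0015, 0.002, 0.4))  # ~108.0
```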

Usage Logs and Cache Entries

AI Controller tracks usage through two related systems:

  1. Log Entries:
    • Include metadata such as timestamp, provider used, and user
    • Each log entry has a unique CorrelationId
  2. Cache Entries:
    • Include the Request (input) and Response (output)
    • Model information is indirectly available within the Request
    • Can be linked to Log entries via the CorrelationId
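
For example, the two systems can be joined on CorrelationId to see which user sent what to which model. The record shapes below are hypothetical; consult your AI Controller export format for the actual field names.

```python
# Hypothetical record shapes, for illustration only.
log_entries = [
    {"correlation_id": "abc-123", "timestamp": "2025-05-01T10:00:00Z",
     "provider": "openai", "user": "alice"},
]
cache_entries = [
    {"correlation_id": "abc-123",
     "request": {"model": "gpt-4", "messages": ["..."]},
     "response": {"content": "..."}},
]

# Join on CorrelationId; the model is read from inside the cached Request.
cache_by_id = {c["correlation_id"]: c for c in cache_entries}
for log in log_entries:
    cached = cache_by_id.get(log["correlation_id"])
    if cached:
        model = cached["request"].get("model")
        print(log["user"], log["provider"], model)
```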

Cost Optimization Strategies

AI Controller offers several features to reduce unnecessary spending. These strategies form part of its overall cost governance approach.

  • Response Caching: Serve repeated requests from cache instead of sending them to the provider. Savings scale with the hit rate: at a 50% hit rate, roughly half of all requests incur no provider cost, assuming requests of similar size (a minimal cache sketch follows this list). For detailed configuration, see Response Caching.

  • Model Access Control: Restrict which models each user group can use (an illustrative policy sketch follows this list):

    1. Configure default models based on cost/quality requirements
    2. Use the Rules Engine to restrict access to expensive models
    3. Monitor model usage patterns
    4. Consider which use cases require more expensive models
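
To illustrate why caching pays off, here is a minimal in-memory response cache keyed on a hash of the normalized request. This is a sketch of the general technique, not AI Controller's actual implementation (see Response Caching for real configuration); call_provider is a stand-in for a billable API call.

```python
import hashlib
import json

_cache: dict[str, str] = {}

def _cache_key(provider: str, model: str, prompt: str) -> str:
    # Normalize so trivially different spellings of a request share an entry.
    payload = json.dumps(
        {"provider": provider, "model": model, "prompt": prompt.strip()},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()

def call_provider(provider: str, model: str, prompt: str) -> str:
    # Stand-in for a real, billable LLM API call.
    return f"response from {model} to {prompt!r}"

def complete(provider: str, model: str, prompt: str) -> str:
    key = _cache_key(provider, model, prompt)
    if key in _cache:
        return _cache[key]  # cache hit: no tokens billed
    response = call_provider(provider, model, prompt)
    _cache[key] = response
    return response
```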
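And here is a sketch of the model-restriction logic behind steps 1 and 2. The Rules Engine has its own configuration format; this Python version, with made-up group and model names, only shows the shape of such a policy.

```python
# Hypothetical policy table: which models each user group may call.
ALLOWED_MODELS = {
    "engineering": {"gpt-4", "gpt-3.5-turbo"},
    "support":     {"gpt-3.5-turbo"},  # cheaper tier only
}
DEFAULT_MODEL = "gpt-3.5-turbo"

def resolve_model(group: str, requested: str | None) -> str:
    """Fall back to a cost-effective default; reject disallowed models."""
    model = requested or DEFAULT_MODEL
    if model not in ALLOWED_MODELS.get(group, set()):
        raise PermissionError(f"group {group!r} may not use model {model!r}")
    return model

print(resolve_model("support", None))        # gpt-3.5-turbo (default)
print(resolve_model("engineering", "gpt-4")) # gpt-4
```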

Updated: 2025-05-15