Model Rate Limit Error

Error Code: MODEL_RATE_LIMIT

This error occurs when you exceed the rate limits imposed by the model provider.

Common Causes

Too Many Requests: Exceeding requests per minute/hour limits
Token Limits: Exceeding token usage limits
Concurrent Requests: Too many simultaneous requests
Account Limits: Exceeding your account's usage limits

Solutions

1. Implement Exponential Backoff

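// Assumption: ErrorCode is exported alongside FerricLinkError; adjust the path for your crate version.
use ferriclink_core::{ErrorCode, FerricLinkError};
use std::future::Future;
use tokio::time::{sleep, Duration};

// `operation` returns a future, so async model calls can be retried without blocking the runtime.
async fn retry_with_backoff<F, Fut, T>(mut operation: F) -> Result<T, FerricLinkError>
where
    F: FnMut() -> Fut,
    Fut: Future<Output = Result<T, FerricLinkError>>,
{
    let mut delay = Duration::from_secs(1);
    let max_retries = 5;

    for attempt in 0..max_retries {
        match operation().await {
            Ok(result) => return Ok(result),
            // Retry only on rate-limit errors, and only while attempts remain
            Err(e)
                if e.error_code() == Some(ErrorCode::ModelRateLimit)
                    && attempt < max_retries - 1 =>
            {
                println!("Rate limited, waiting {:?} before retry {}", delay, attempt + 1);
                sleep(delay).await;
                delay *= 2; // Exponential backoff: 1s, 2s, 4s, 8s
            }
            // Any other error, or the final failed attempt, is returned unchanged
            Err(e) => return Err(e),
        }
    }

    // Not reached in practice: the last attempt always returns from inside the loop
    Err(FerricLinkError::model_rate_limit("Max retries exceeded"))
}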
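
The retry loop above doubles its delay without an upper bound, and every client that is limited at the same moment retries at the same moment. A common refinement is to cap the delay and scale it by a random fraction (jitter). The following is a minimal std-only sketch of that calculation; `backoff_delay` and `with_jitter` are hypothetical helper names, not part of FerricLink, and the jitter fraction is supplied by the caller so that no random-number crate is assumed.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `max_delay`.
fn backoff_delay(attempt: u32, base: Duration, max_delay: Duration) -> Duration {
    // checked_pow / checked_mul guard against overflow for large attempt counts.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(max_delay).min(max_delay)
}

/// Scale a delay by a caller-supplied fraction, clamped to [0.0, 1.0].
/// A NaN fraction is treated as 1.0 (no jitter).
fn with_jitter(delay: Duration, fraction: f64) -> Duration {
    let fraction = if fraction.is_nan() {
        1.0
    } else {
        fraction.clamp(0.0, 1.0)
    };
    delay.mul_f64(fraction)
}
```

In a retry loop, the sleep duration would then be `with_jitter(backoff_delay(attempt, base, max_delay), r)`, where `r` comes from whatever random source the application already uses.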

2. Rate Limiting with Tokens

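use std::sync::Arc;

use ferriclink_core::FerricLinkError;
use tokio::sync::Semaphore;
use tokio::time::{interval, Duration};

struct RateLimiter {
    semaphore: Arc<Semaphore>,
    max_requests_per_minute: usize,
}

impl RateLimiter {
    // Call from inside a Tokio runtime: `new` spawns the background refill task.
    fn new(max_requests_per_minute: usize) -> Self {
        let semaphore = Arc::new(Semaphore::new(max_requests_per_minute));

        // `acquire` forgets its permits, so top the semaphore back up once a minute.
        // The task holds only a weak reference and exits once the limiter is dropped.
        let weak = Arc::downgrade(&semaphore);
        tokio::spawn(async move {
            let mut ticker = interval(Duration::from_secs(60));
            ticker.tick().await; // The first tick completes immediately
            loop {
                ticker.tick().await;
                let Some(semaphore) = weak.upgrade() else { break };
                let used = max_requests_per_minute.saturating_sub(semaphore.available_permits());
                semaphore.add_permits(used);
            }
        });

        Self {
            semaphore,
            max_requests_per_minute,
        }
    }

    async fn acquire(&self) -> Result<(), FerricLinkError> {
        // Consume one permit for the current window; the refill task restores it later
        self.semaphore.acquire().await
            .map_err(|_| FerricLinkError::model_rate_limit("Failed to acquire rate limit permit"))?
            .forget();
        Ok(())
    }
}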
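
The semaphore above counts permits per minute. A token bucket is a related technique that refills continuously, which smooths out bursts at window boundaries. The following is a minimal std-only sketch under stated assumptions: `TokenBucket` is a hypothetical type, not part of FerricLink, and the current time is passed in explicitly so the behavior is easy to test.

```rust
use std::time::{Duration, Instant};

/// Token bucket: holds up to `capacity` tokens and refills at a steady rate.
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
    last_refill: Instant,
}

impl TokenBucket {
    /// Allow `capacity` requests per `per`, starting with a full bucket at `now`.
    fn new(capacity: u32, per: Duration, now: Instant) -> Self {
        let capacity = f64::from(capacity);
        Self {
            capacity,
            tokens: capacity,
            refill_per_sec: capacity / per.as_secs_f64(),
            last_refill: now,
        }
    }

    /// Try to take one token at time `now`; returns false if less than one token is available.
    fn try_acquire(&mut self, now: Instant) -> bool {
        // Add the tokens earned since the last call, never exceeding capacity.
        if now > self.last_refill {
            let elapsed = now.duration_since(self.last_refill).as_secs_f64();
            self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
            self.last_refill = now;
        }
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

A caller would check `bucket.try_acquire(Instant::now())` before each request and wait or back off when it returns `false`.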

3. Monitor Usage

use std::collections::HashMap;
use std::time::{Duration, Instant};

struct UsageMonitor {
    requests: HashMap<String, Vec<Instant>>,
    window: Duration,
}

impl UsageMonitor {
    fn new(window: Duration) -> Self {
        Self {
            requests: HashMap::new(),
            window,
        }
    }
    
    fn can_make_request(&mut self, key: &str, limit: usize) -> bool {
        let now = Instant::now();
        let window = self.window; // Copy out so the closure below does not borrow `self`
        let requests = self.requests.entry(key.to_string()).or_default();
        
        // Remove old requests outside the window
        requests.retain(|&time| now.duration_since(time) < window);
        
        if requests.len() < limit {
            requests.push(now);
            true
        } else {
            false
        }
    }
}
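
The monitor above counts requests, but the Common Causes list also mentions token usage limits. The same sliding-window idea can track tokens consumed instead of request counts. The following is a minimal std-only sketch; `TokenBudget` is a hypothetical type, the token counts are assumed to come from your own estimate or from the provider's response metadata, and the current time is passed in explicitly for testability.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Tracks tokens consumed within a sliding time window.
struct TokenBudget {
    window: Duration,
    limit: usize,
    used: VecDeque<(Instant, usize)>,
}

impl TokenBudget {
    fn new(limit: usize, window: Duration) -> Self {
        Self { window, limit, used: VecDeque::new() }
    }

    /// Tokens still available at `now`, after dropping entries older than the window.
    fn remaining(&mut self, now: Instant) -> usize {
        while let Some(&(at, _)) = self.used.front() {
            if now.saturating_duration_since(at) >= self.window {
                self.used.pop_front();
            } else {
                break;
            }
        }
        let spent: usize = self.used.iter().map(|&(_, n)| n).sum();
        self.limit.saturating_sub(spent)
    }

    /// Record `tokens` if they fit in the remaining budget; returns whether they did.
    fn try_spend(&mut self, tokens: usize, now: Instant) -> bool {
        if tokens <= self.remaining(now) {
            self.used.push_back((now, tokens));
            true
        } else {
            false
        }
    }
}
```

Entries are stored in arrival order, so expired usage can be dropped from the front of the queue without scanning the whole history.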

Prevention

Rate-limit requests on the client side so you stay under the provider's limits
Use exponential backoff for retries
Monitor your usage patterns
Consider upgrading your plan if you regularly hit account limits
Batch requests when possible
Cache responses when appropriate
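
As an illustration of the caching item, the following is a minimal std-only sketch of a response cache with a time-to-live, so that repeated identical prompts do not count against the rate limit. `ResponseCache` is a hypothetical type, not part of FerricLink. Keying on the raw prompt string is an assumption; a real key would likely also include the model name and parameters.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Caches responses by prompt and treats entries older than `ttl` as expired.
struct ResponseCache {
    ttl: Duration,
    entries: HashMap<String, (Instant, String)>,
}

impl ResponseCache {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    /// Return the cached response for `prompt` if it is still fresh at `now`.
    fn get(&self, prompt: &str, now: Instant) -> Option<&str> {
        self.entries
            .get(prompt)
            .filter(|(stored_at, _)| now.saturating_duration_since(*stored_at) < self.ttl)
            .map(|(_, response)| response.as_str())
    }

    /// Store or overwrite the response for `prompt`, stamped with `now`.
    fn insert(&mut self, prompt: &str, response: String, now: Instant) {
        self.entries.insert(prompt.to_string(), (now, response));
    }
}
```

Expired entries are only ignored on lookup, not removed; a long-running process would also want periodic eviction.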
