## Summary
Add an optional caching layer to avoid repeated identical API calls, reducing latency and API costs for cacheable requests.
## Use Cases

- **Identical prompts** - Same question asked multiple times
- **System prompt reuse** - Caching responses with identical system prompts
- **Development/testing** - Avoid hitting the API during debugging
- **Rate limiting** - Reduce API calls when approaching limits
## Proposed API

```csharp
// Enable caching via options
services.AddCompactifAI(options =>
{
    options.ApiKey = "...";
    options.EnableCaching = true;
    options.CacheDuration = TimeSpan.FromMinutes(5);
});

// Or per-request control
var response = await client.ChatAsync(
    "What is 2+2?",
    cacheOptions: new CacheOptions
    {
        Enabled = true,
        Duration = TimeSpan.FromHours(1)
    });
```
## Implementation Options

### Option 1: IMemoryCache Integration
```csharp
public class CachingCompactifAIClient : ICompactifAIClient
{
    private readonly IMemoryCache _cache;
    private readonly ICompactifAIClient _inner;
    private readonly TimeSpan _cacheDuration;

    public CachingCompactifAIClient(ICompactifAIClient inner, IMemoryCache cache, TimeSpan cacheDuration)
    {
        _inner = inner;
        _cache = cache;
        _cacheDuration = cacheDuration;
    }

    public async Task<ChatCompletionResponse> CreateChatCompletionAsync(ChatCompletionRequest request)
    {
        var cacheKey = GenerateCacheKey(request);

        // Serve a previously cached response for an identical request
        if (_cache.TryGetValue(cacheKey, out ChatCompletionResponse? cached) && cached is not null)
            return cached;

        var response = await _inner.CreateChatCompletionAsync(request);
        _cache.Set(cacheKey, response, _cacheDuration);
        return response;
    }
}
```
### Option 2: Decorator Pattern

Allow users to wrap the client with their own caching strategy.
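As a sketch of the wiring (assuming the `CachingCompactifAIClient` from Option 1; the `Decorate` call is Scrutor's extension method - without Scrutor, the same effect needs a manual re-registration of `ICompactifAIClient`):

```csharp
// Register the base client, then wrap it in the caching decorator.
// CachingCompactifAIClient and the 5-minute duration are illustrative.
services.AddMemoryCache();
services.AddCompactifAI(options => options.ApiKey = "...");
services.Decorate<ICompactifAIClient>((inner, sp) =>
    new CachingCompactifAIClient(
        inner,
        sp.GetRequiredService<IMemoryCache>(),
        TimeSpan.FromMinutes(5)));
```

This keeps caching out of the core client entirely, so users who never opt in pay no overhead.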
## Cache Key Generation

```csharp
private static string GenerateCacheKey(ChatCompletionRequest request)
{
    // Hash based on: model + messages + temperature + other deterministic params
    // Exclude: stream, user, etc.
}
```
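One possible shape for that stub, serializing the deterministic request fields and hashing them with SHA-256 (the property names on `ChatCompletionRequest` are assumptions, not the actual type):

```csharp
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;

// Sketch only: property names (Model, Messages, Temperature, MaxTokens)
// are assumed; stream/user fields are deliberately left out of the payload.
private static string GenerateCacheKey(ChatCompletionRequest request)
{
    var payload = JsonSerializer.Serialize(new
    {
        request.Model,
        request.Messages,
        request.Temperature,
        request.MaxTokens
    });
    var hash = SHA256.HashData(Encoding.UTF8.GetBytes(payload));
    return "compactifai:" + Convert.ToHexString(hash);
}
```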
## Expected Benefits

- **Reduced API costs** - Avoid paying for duplicate requests
- **Lower latency** - Near-instant responses for cached results
- **Offline development** - Work with cached responses during development
## Priority

🟢 P2 - Medium Impact
## Considerations

- Cache invalidation strategy
- Memory limits for cache size
- Consider distributed cache support (`IDistributedCache`)
- Non-deterministic responses (temperature > 0) may not be suitable for caching
- Should be opt-in to avoid unexpected behavior
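For the distributed-cache consideration, a minimal sketch using `IDistributedCache` (the string extension methods come from `Microsoft.Extensions.Caching.Distributed`; assumes `ChatCompletionResponse` round-trips through `System.Text.Json`):

```csharp
using System.Text.Json;
using Microsoft.Extensions.Caching.Distributed;

// Sketch only: store responses as JSON so multiple app instances share hits.
public static async Task<ChatCompletionResponse> GetOrAddAsync(
    IDistributedCache cache,
    string cacheKey,
    Func<Task<ChatCompletionResponse>> fetch,
    TimeSpan duration)
{
    var cachedJson = await cache.GetStringAsync(cacheKey);
    if (cachedJson is not null)
        return JsonSerializer.Deserialize<ChatCompletionResponse>(cachedJson)!;

    var response = await fetch();
    await cache.SetStringAsync(
        cacheKey,
        JsonSerializer.Serialize(response),
        new DistributedCacheEntryOptions { AbsoluteExpirationRelativeToNow = duration });
    return response;
}
```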