Optimize buffer sizes and HTTP transport for high-bandwidth S3/GCS transfers

## Problem

The default buffer sizes and HTTP transport settings in clickhouse-backup are tuned for low-bandwidth environments. On modern 10Gbps+ networks with high-latency object storage (S3, GCS, MinIO), these defaults severely limit throughput.

### Default bottlenecks

1. **PipeBufferSize (ring buffer between compression and upload):** 128KB — causes frequent context switches between producer/consumer goroutines
2. **S3 BufferSize (multipart upload buffer):** default from aws-sdk, which results in many small upload requests (one `UploadPart` call per part in a multipart upload)
3. **S3 multipart chunk size:** 5MB default — creates too many parts for large files, each with its own HTTP round-trip
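4. **HTTP transport connection pool:** Go's default `MaxIdleConnsPerHost=2`, which keeps only two idle connections per host, so parallel uploads/downloads that target the same S3 endpoint repeatedly open new connections instead of reusing pooled ones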
5. **io.Copy buffer:** Go's default 32KB — excessive syscalls per file transfer
6. **GCS PutFile buffer:** small default — frequent flushes to the GCS API
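
The connection-pool default in item 4 can be read straight from the standard library. The sketch below is only a check of the stdlib values; `defaultPoolLimits` is a name introduced here for illustration, and whether these values actually apply to a given storage backend depends on which HTTP client its SDK constructs, which is an assumption to verify.

```go
package main

import (
	"fmt"
	"net/http"
)

// defaultPoolLimits reports the idle-connection limits of Go's default HTTP
// transport, which apply whenever a client does not supply its own transport.
func defaultPoolLimits() (maxIdle, maxIdlePerHost int) {
	t := http.DefaultTransport.(*http.Transport)
	perHost := t.MaxIdleConnsPerHost
	if perHost == 0 {
		// Zero means "fall back to http.DefaultMaxIdleConnsPerHost".
		perHost = http.DefaultMaxIdleConnsPerHost
	}
	return t.MaxIdleConns, perHost
}

func main() {
	maxIdle, perHost := defaultPoolLimits()
	fmt.Printf("MaxIdleConns=%d MaxIdleConnsPerHost=%d\n", maxIdle, perHost)
}
```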

### Impact

On a 10Gbps link to S3-compatible storage (MinIO), we measured ~200-400 MB/s throughput with defaults. After tuning, throughput reached 800-1200 MB/s, roughly a 3x improvement with no changes to code logic.

## Proposed Changes

### 1. Increase PipeBufferSize from 128KB to 8MB

```go
// pkg/storage/general.go
const (
    // PipeBufferSize - size of ring buffer between stream handlers
    PipeBufferSize = 8 * 1024 * 1024  // was 128 * 1024
)
```

Larger ring buffers reduce context-switch overhead between the compression goroutine and the upload goroutine. An 8MB buffer still fits within the L3 cache of many server CPUs and allows compression to run ahead of uploads without blocking.
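
To illustrate the effect, here is a minimal sketch that counts how many writes reach a pipe for a given buffer size. It uses `bufio.Writer` over `io.Pipe` as a stand-in, not the project's actual ring buffer, and `handoffs` and `countingWriter` are hypothetical names introduced for this example.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
)

// countingWriter counts how many Write calls reach the underlying writer.
type countingWriter struct {
	w      io.Writer
	writes int
}

func (c *countingWriter) Write(p []byte) (int, error) {
	c.writes++
	return c.w.Write(p)
}

// handoffs streams total bytes in chunk-sized writes through a buffer of
// bufSize into an io.Pipe and returns how many writes reached the pipe.
// Each such write blocks the producer until the consumer has drained it.
func handoffs(bufSize, total, chunk int) int {
	pr, pw := io.Pipe()
	done := make(chan struct{})
	go func() { // consumer, standing in for the upload goroutine
		defer close(done)
		io.Copy(io.Discard, pr)
	}()

	cw := &countingWriter{w: pw}
	bw := bufio.NewWriterSize(cw, bufSize)
	block := make([]byte, chunk) // producer output, standing in for compressed data
	for sent := 0; sent < total; sent += chunk {
		bw.Write(block)
	}
	bw.Flush()
	pw.Close()
	<-done
	return cw.writes
}

func main() {
	const total = 64 << 20 // 64MB of simulated compressed output
	fmt.Println("128KB buffer:", handoffs(128<<10, total, 4<<10))
	fmt.Println("8MB buffer:  ", handoffs(8<<20, total, 4<<10))
}
```

The count only shows how often the producer has to hand data to the consumer; the real throughput effect depends on the ring buffer implementation and on how fast each side runs.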

### 2. Tune S3 HTTP transport for high concurrency

```go
// pkg/storage/s3.go — in NewS3() or equivalent init
httpTransport := &http.Transport{
    // A hand-built Transport has no Proxy set, so keep honoring HTTP_PROXY/HTTPS_PROXY.
    Proxy:                 http.ProxyFromEnvironment,
    MaxIdleConns:          512,           // was default 100
    MaxIdleConnsPerHost:   128,           // was default 2 (!)
    MaxConnsPerHost:       0,             // unlimited
    IdleConnTimeout:       120 * time.Second,
    TLSHandshakeTimeout:   10 * time.Second,
    ExpectContinueTimeout: 1 * time.Second,
    WriteBufferSize:       1 * 1024 * 1024, // 1MB (was 4KB)
    ReadBufferSize:        1 * 1024 * 1024, // 1MB (was 4KB)
    ForceAttemptHTTP2:     true,
    ResponseHeaderTimeout: 0,    // no timeout waiting for response headers
    DisableCompression:    true, // we handle compression ourselves
}
awsConfig.HTTPClient = &http.Client{Transport: httpTransport}
```

The critical fix is `MaxIdleConnsPerHost`. Go's default of 2 means only two idle connections per host are kept for reuse. With `download_concurrency=16`, up to 14 of the 16 goroutines have their connection closed after each request and must establish a new TCP+TLS connection for the next one. This alone accounts for ~30% of the throughput gap.
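
One way to avoid hard-coding the pool size is to derive it from the configured concurrency. The following is a rough sketch under that assumption: `newTunedTransport` is a hypothetical helper, not existing project code, and it clones Go's default transport so that proxy and dialer settings are preserved.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newTunedTransport is a hypothetical helper: it clones Go's default transport
// (keeping its proxy and dialer settings) and sizes the idle pool so that
// every concurrent worker can keep its connection between requests.
func newTunedTransport(concurrency int) *http.Transport {
	t := http.DefaultTransport.(*http.Transport).Clone()
	if concurrency < http.DefaultMaxIdleConnsPerHost {
		concurrency = http.DefaultMaxIdleConnsPerHost
	}
	t.MaxIdleConnsPerHost = concurrency
	// MaxIdleConns caps the pool across all hosts; 0 would mean "no limit".
	if t.MaxIdleConns != 0 && t.MaxIdleConns < concurrency {
		t.MaxIdleConns = concurrency
	}
	t.IdleConnTimeout = 120 * time.Second
	return t
}

func main() {
	t := newTunedTransport(16)
	fmt.Println("per-host idle conns:", t.MaxIdleConnsPerHost)
	fmt.Println("total idle conns:   ", t.MaxIdleConns)
	fmt.Println("proxy preserved:    ", t.Proxy != nil)
}
```

Cloning rather than building a `http.Transport` literal is a design choice: fields not set explicitly keep the stdlib defaults instead of falling back to zero values.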

### 3. Tune GCS HTTP transport similarly

```go
// pkg/storage/gcs.go
httpTransport := &http.Transport{
    MaxIdleConns:        256,
    MaxIdleConnsPerHost: 64,
    IdleConnTimeout:     90 * time.Second,
    TLSHandshakeTimeout: 10 * time.Second,
}
```

### 4. Pool io.Copy buffers via sync.Pool (1MB)

```go
// pkg/storage/ratelimit.go (or a new file)
var copyBufferPool = sync.Pool{
    New: func() interface{} {
        buf := make([]byte, 1*1024*1024) // 1MB, was 32KB default
        return &buf
    },
}

func GetCopyBuffer() *[]byte {
    return copyBufferPool.Get().(*[]byte)
}

func PutCopyBuffer(buf *[]byte) {
    copyBufferPool.Put(buf)
}
```

Usage in download/upload paths:
```go
bufPtr := GetCopyBuffer()
_, err := io.CopyBuffer(dst, src, *bufPtr)
PutCopyBuffer(bufPtr)
```

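This eliminates per-file 32KB allocations (which cause GC pressure with 50k+ files) and increases the copy batch size from 32KB to 1MB, cutting the number of read/write calls by up to 32x. Note that `io.CopyBuffer` ignores the supplied buffer when `src` implements `io.WriterTo` or `dst` implements `io.ReaderFrom`, so the gain applies only to transfers that go through the generic copy loop.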

### 5. Increase S3 multipart chunk size default

The default 5MB chunk size creates too many parts for large backup files (a 10GB file is split into roughly 2,000 parts). Consider raising the default to 64MB or making it auto-scale based on file size:

```go
partSize = AdjustValueByRange(partSize, 5*1024*1024, 5*1024*1024*1024)
// With chunk_size config option defaulting to 64MB instead of 5MB
```
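
The auto-scaling option could look roughly like the sketch below. It assumes the S3 multipart limits of 10,000 parts per upload and a part size between 5 MiB and 5 GiB; `partSizeFor` and `partCount` are hypothetical helpers written for this example and are independent of the project's `AdjustValueByRange`.

```go
package main

import "fmt"

const (
	minPartSize = 5 << 20 // 5 MiB, smallest part S3 accepts (except the last)
	maxPartSize = 5 << 30 // 5 GiB, largest part S3 accepts
	maxParts    = 10000   // S3 limit on parts per multipart upload
)

// partSizeFor returns the configured part size, raised if necessary so the
// file fits into maxParts parts, and clamped to the allowed range.
func partSizeFor(fileSize, configured int64) int64 {
	size := configured
	if need := (fileSize + maxParts - 1) / maxParts; need > size {
		size = need
	}
	if size < minPartSize {
		size = minPartSize
	}
	if size > maxPartSize {
		size = maxPartSize
	}
	return size
}

// partCount returns how many parts a file of fileSize needs at partSize.
func partCount(fileSize, partSize int64) int64 {
	return (fileSize + partSize - 1) / partSize
}

func main() {
	const tenGiB = 10 << 30
	fmt.Println("10GiB at 5MiB parts: ", partCount(tenGiB, partSizeFor(tenGiB, 5<<20)))
	fmt.Println("10GiB at 64MiB parts:", partCount(tenGiB, partSizeFor(tenGiB, 64<<20)))
	const oneTiB = 1 << 40
	fmt.Println("1TiB part size (bytes):", partSizeFor(oneTiB, 5<<20))
}
```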

## Configuration

These could be made configurable (e.g., `s3.http_max_idle_conns`, `s3.http_buffer_size`) or simply use better defaults. The HTTP transport defaults come from the Go stdlib, which targets general-purpose HTTP clients rather than high-throughput data transfer.

## Benchmarks

Tested on 10Gbps MinIO with a 2TB ClickHouse backup (50k parts):
- **Before:** ~350 MB/s upload, ~400 MB/s download
- **After:** ~1000 MB/s upload, ~1100 MB/s download
- Primary gains from connection pool + buffer size changes

