Problem
When a download is run with --resume against a local backup directory that exists but has no download.state2 file, clickhouse-backup fails with:
Error: backup is already exists
Root cause
The download logic checks if the backup already exists locally:
    for i := range localBackups {
        if backupName == localBackups[i].BackupName {
            if !b.resume {
                return ErrBackupIsAlreadyExists
            } else {
                // resume path requires download.state2 to exist
            }
        }
    }
When --resume is set but download.state2 doesn't exist (e.g., the previous download crashed before creating the state file, or the state file was manually deleted), the code falls through to an error path.
When this happens
- Previous download was killed before writing any state
- User manually deleted the state file but kept the partial data
- Previous download used a different version that didn't create state files
- Backup directory was created by a failed download that crashed on metadata
Proposed Fix
Make the state file optional for resume. If it doesn't exist, resume from scratch — the conservative validation (see #1376) will detect which parts are complete and which need re-downloading:
    if !b.resume {
        return ErrBackupIsAlreadyExists
    } else {
        isResumeExists = true
        _, checkDownloadErr := os.Stat(path.Join(b.DefaultDataPath, "backup", backupName, "download.state2"))
        if errors.Is(checkDownloadErr, os.ErrNotExist) {
            // No state file from a previous download: this is OK.
            // Resume will re-validate and re-download any incomplete parts.
            // The state file is an optimization, not a requirement.
            log.Warn().Msgf("%s already exists but no download.state2 found, will resume download from scratch", backupName)
        } else {
            log.Warn().Msgf("%s already exists will try to resume download", backupName)
        }
    }
Behavior after fix
| State file exists? | Behavior |
|---|---|
| Yes | Normal resume: bbolt state tracks completed parts |
| No | Resume from scratch: validates each part locally, re-downloads incomplete ones |
| No + manifest | Fast resume: manifest enables local-only validation (zero remote calls) |
The key insight is that the state file is an optimization for skipping validation, not a requirement for correctness. The conservative resume validation (checking actual file existence and sizes) is the real safety mechanism.