English | 中文版
[TOC]
Users can configure multiple save conditions; when any condition is met Redis triggers a BGSAVE.
Source schema:
struct redisServer {
struct saveparam *saveparams; /* configured "save" parameters: <seconds> <changes> */
int saveparamslen; /* number of save parameter entries */
}dirty: number of changes since the last successful persistencelastsave: time of the last successful persistence
Source:
struct redisServer {
long long dirty; /* changes since last successful persistence */
time_t lastsave; /* time of last successful persistence */
}The serverCron runs by default every 100ms to check whether any save condition is satisfied.
Source:
TODOTwo commands can generate an RDB file:
SAVE— create RDB file synchronously (blocking)BGSAVE— create RDB file in background (forks a child, non-blocking)
Notes:
- Redis prefers AOF for restoring the dataset; RDB is used only when AOF is disabled.
- While
BGSAVEruns, incomingSAVEandBGSAVEcommands are rejected;BGREWRITEAOFis delayed untilBGSAVEcompletes. - While
BGREWRITEAOFruns, incomingBGSAVEcommands are rejected.
Redis saves RDB via int rdbSave(char *filename) or int rdbSaveBackground(char *filename).
/** @brief save RDB to disk @param filename RDB path */
int rdbSave(char *filename) {
char tmpfile[256];
FILE *fp;
rio rdb;
int error;
snprintf(tmpfile,256,"temp-%d.rdb", (int) getpid());
fp = fopen(tmpfile,"w"); /* create temporary file temp-<pid>.rdb */
if (!fp) {
redisLog(REDIS_WARNING, "Failed opening .rdb for saving: %s",
strerror(errno));
return REDIS_ERR;
}
rioInitWithFile(&rdb,fp);
if (rdbSaveRio(&rdb,&error) == REDIS_ERR) { /* write RDB to temp file */
errno = error;
goto werr;
}
/* flush and sync */
if (fflush(fp) == EOF) goto werr;
if (fsync(fileno(fp)) == -1) goto werr;
if (fclose(fp) == EOF) goto werr;
/* atomic rename */
if (rename(tmpfile,filename) == -1) {
redisLog(REDIS_WARNING,"Error moving temp DB file on the final destination: %s", strerror(errno));
unlink(tmpfile);
return REDIS_ERR;
}
redisLog(REDIS_NOTICE,"DB saved on disk");
server.dirty = 0;
server.lastsave = time(NULL);
server.lastbgsave_status = REDIS_OK;
return REDIS_OK;
werr:
redisLog(REDIS_WARNING,"Write error saving DB on disk: %s", strerror(errno));
fclose(fp);
unlink(tmpfile);
return REDIS_ERR;
}Redis loads RDB using int rdbLoad(char *filename).
/** @brief load RDB file @param filename path */
int rdbLoad(char *filename) {
...
}Server startup load flow:
a=>operation: server start
load=>operation: run loading routine
isAOFEnable=>condition: is AOF enabled?
loadAOF=>operation: load AOF file
loadRDB=>operation: load RDB file
a->load->isAOFEnable
isAOFEnable(yes)->loadAOF
isAOFEnable(no)->loadRDB
An RDB file layout:
| REDIS | db_version | databases | EOF | check_sum |
|---|---|---|---|---|
| 5byte | 4byte | variable | 1byte | 8bytes |
REDIS: "REDIS" magic header (5 bytes)db_version: version string (4 bytes)databases: zero or more databases and their key-value pairs (variable length)EOF: end-of-file markercheck_sum: 8-byte checksum
databases structure (repeated per DB):
| SELECTDB | db_number | key_value_pairs |
|---|---|---|
| 1byte | 1/2/5byte | variable |
SELECTDB: marker indicating a DB section followsdb_number: DB id (1,2 or 5 bytes depending on size)key_value_pairs: one or more key/value entries
Key/value entry without expiration:
| TYPE | key | value |
|---|---|---|
| 1byte | var | var |
TYPE: REDIS_RDB_TYPE_XXXkey: stringvalue: encoded per type (see below)
Key/value with expiration (ms):
| EXPIRETIME_MS | ms | TYPE | key | value |
|---|---|---|---|---|
| 1byte | 8byte | 1byte | var | var |
EXPIRETIME_MS: marker indicating next field is an expiration in msms: signed integer expiration (ms)
RDB value may be any Redis type:
-
String (REDIS_RDB_TYPE_STRING)
Two encodings:
REDIS_ENCODING_INTorREDIS_ENCODING_RAW.When file compression is enabled:
- strings <= 20 bytes are saved as-is;
- strings > 20 bytes are compressed with LZF before saving.
When compression is disabled, strings are always stored raw.
-
List (REDIS_RDB_TYPE_LIST)
When encoded as
REDIS_ENCODING_LINKEDLISTthe format is:|list_length|item1|item2|...|itemN| -
Set (REDIS_RDB_TYPE_SET)
When encoded as
REDIS_ENCODING_HTthe format is:|set_size|elem1|elem2|...|elemN| -
Hash (REDIS_RDB_TYPE_HASH)
When encoded as
REDIS_ENCODING_HTthe format is:|hash_size|key_value_pair1|...|key_value_pairN| -
Sorted set (REDIS_RDB_TYPE_ZSET)
When encoded as
REDIS_ENCODING_SKIPLISTthe format is:|sorted_set_size|element1|element2|...|elementN| -
Intset (REDIS_RDB_TYPE_SET_INTSET)
The intset is converted to a string object and stored as a string.
-
Ziplist encodings (list/hash/zset ziplist types)
Ziplist objects are converted to a string object; when reading, the string is parsed back and converted to the appropriate ziplist type indicated by
TYPE.
An empty RDB file contains the four parts: header, version, EOF and checksum as described above.
When a DB is saved it includes the SELECTDB marker, DB number and its key/value entries; keys with expiration include the expire marker and ms value.
AOF persistence records write commands executed by the server to reconstruct the dataset:
graph LR
Client--write command-->Server--append executed command-->AOF file
- Append command to AOF buffer
- Write buffer to AOF file
- Sync file to disk
| appendfsync | behavior of flushAppendOnlyFile |
|---|---|
| always | write and fsync all aof_buf data each loop (safest, slowest) |
| everysec | write aof_buf each loop; if last fsync > 1s spawn a thread to fsync (good balance) |
| no | write aof_buf to file but do not fsync (rely on OS), fastest but least safe |
always: slowest but safest — at worst only one event loop of commands is lost on crash.everysec: writes every loop and fsyncs at most once per second in a background thread; typically loses at most one second of commands on crash.no: fastest writes but relies on OS flush; accumulated data may cause long sync times when OS flush occurs.
RDB loads by replaying commands from AOF as follows:
load=>operation: server startup load routine
create_cli=>operation: create pseudo-client
read_wcmd=>operation: parse and read a write command from AOF
exec_wcmd=>operation: execute the write command using pseudo-client
exec_fin=>condition: all AOF commands processed?
load_fin=>operation: finished loading
load->create_cli->read_wcmd->exec_fin
exec_fin(yes)->load_fin
exec_fin(no)->read_wcmd
- Create a fake client without a network socket.
- Parse and read one write command from the AOF.
- Use the fake client to execute the command.
- Repeat until all commands are processed.
Redis can rewrite the AOF in the background to produce a compact equivalent AOF without redundant commands.
The rewrite runs in a child process so the parent can continue serving clients without locks.
To keep the child consistent with the parent's ongoing writes, Redis uses an AOF rewrite buffer that records commands executed after the fork. The parent appends each write to both the normal AOF buffer and the rewrite buffer.
graph LR
Client --send command--> serv
subgraph Server
serv(command processor) --append command--> AOF buffer
serv(command processor) --append command--> AOF rewrite buffer
end
When rewrite completes:
- Append the rewrite buffer contents to the new AOF so it reflects the current DB state.
- Atomically rename the new AOF over the old one.
[1] Huang Jianhong. Redis Design and Implementation