Previously, in the index updater we used the purge_seq value from clouseau. In some cases that can return an older value (0 on a new index) than what is in the purge checkpoint doc created in `maybe_create_local_purge_doc/2`. (In that function we initialize the checkpoint with the db purge sequence and call `clouseau_rpc:set_purge_seq/1` to also set the clouseau purge seq value.) A purge sequence older than the current minimum db purge sequence would result in an `invalid_start_purge_seq` error being thrown during purged infos folding.

In general, if a client updates a purge checkpoint, it should not query purged infos with a sequence value below that checkpoint: if that is the lowest current purge checkpoint value and compaction runs, compaction could have removed all the purged infos below it.
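To make the failure mode concrete, here is a minimal toy model in Python. It is not CouchDB code: `ToyDb`, its methods, and the compaction rule (drop every purge info at or below the lowest checkpoint) are simplifying assumptions made for illustration only. It shows a client that checkpoints at purge seq 3 and then, after compaction, tries to fold purged infos starting from a stale value of 0.

```python
class InvalidStartPurgeSeq(Exception):
    """Toy counterpart of the invalid_start_purge_seq error."""


class ToyDb:
    """Hypothetical, heavily simplified model of a db with purge infos."""

    def __init__(self):
        self.purge_seq = 0
        self.purge_infos = []        # list of (purge_seq, doc_id)
        self.checkpoints = {}        # client name -> checkpointed purge seq
        self.oldest_purge_seq = 0    # lowest start seq a fold may still use

    def purge(self, doc_id):
        self.purge_seq += 1
        self.purge_infos.append((self.purge_seq, doc_id))

    def set_checkpoint(self, client, seq):
        self.checkpoints[client] = seq

    def compact(self):
        # Assumption: compaction may drop everything at or below the
        # lowest checkpoint held by any client.
        if not self.checkpoints:
            return
        min_seq = min(self.checkpoints.values())
        self.purge_infos = [(s, d) for s, d in self.purge_infos if s > min_seq]
        self.oldest_purge_seq = min_seq

    def fold_purge_infos(self, start_seq):
        # A start seq below what compaction kept cannot be served.
        if start_seq < self.oldest_purge_seq:
            raise InvalidStartPurgeSeq(start_seq)
        return [(s, d) for s, d in self.purge_infos if s > start_seq]


db = ToyDb()
for doc_id in ("a", "b", "c"):
    db.purge(doc_id)

# The index checkpoints at the db purge seq, then compaction runs.
db.set_checkpoint("search-index", db.purge_seq)
db.compact()

# A stale start seq (0, as on a new index) is now rejected.
try:
    db.fold_purge_infos(0)
    stale_start_ok = True
except InvalidStartPurgeSeq:
    stale_start_ok = False
```

In this model, folding from the checkpointed value instead of the stale one succeeds, which is the invariant the paragraph above describes.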
Is the checkpoint document updated after the clouseau index has committed the purges up to that point to disk?
In the index update process, the checkpoint document is updated after set_purge is called on the clouseau index, but the commit has not been called yet at that point. The commit is called a bit later, after the bulk of the docs has been updated.
OK, the purge checkpoint is updated at the end of the updater process, so if clouseau crashes before the commit, those purges have not happened and would not be retried, as the next update attempt will start from the purge checkpoint seq.
Yup, that's why I suggested a companion clouseau PR in our DM -- to commit when the purge seq is set. That has to happen in two places: when we first initialize the purge sequence (I think this part may be missing in nouveau, but that's a separate PR) and when we update it after purging a new batch.
This is a companion PR for Dreyfus [1]. The purge sequence value, when set via the API, is not committed; it is stored only in memory until commit time. With the PR [1], if we update the checkpoint document on the db side, purges get removed by the compactor, and the index service then crashes before it commits, it's possible that the index would miss some purges. To keep them in sync, always commit after updating the purge sequence value. We'll first set the purge sequence in the index, then update the checkpoint document. We'll make sure to call set_purge_seq only when the purge sequence gets updated or initialized and when the index is created (and initialized).

[1] apache/couchdb#5920
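The ordering argument above can be sketched with a small Python toy model. Everything here is hypothetical: `ToyIndex`, its `commit` flag, and `run` are stand-ins invented for illustration, not the clouseau or dreyfus API. The model only captures one property: a purge seq that is set but not committed is lost when the index service restarts.

```python
class ToyIndex:
    """Hypothetical stand-in for an index service holding a purge seq."""

    def __init__(self):
        self.committed_purge_seq = 0   # survives a crash
        self.pending_purge_seq = 0     # in memory only

    def set_purge_seq(self, seq, commit):
        self.pending_purge_seq = seq
        if commit:
            # The behaviour proposed above: commit as part of the set.
            self.committed_purge_seq = seq

    def crash_and_restart(self):
        # Anything not committed is lost on restart.
        self.pending_purge_seq = self.committed_purge_seq


def run(commit_on_set):
    """Set the index purge seq, update the checkpoint, then crash."""
    index = ToyIndex()
    db_purge_seq = 5

    # Step 1: set the purge seq in the index first.
    index.set_purge_seq(db_purge_seq, commit=commit_on_set)
    # Step 2: only then update the checkpoint document on the db side.
    checkpoint = db_purge_seq

    index.crash_and_restart()
    return index.pending_purge_seq, checkpoint


without_commit = run(commit_on_set=False)
with_commit = run(commit_on_set=True)
```

In this sketch, `without_commit` leaves the index purge seq below the checkpoint after the restart, which is the window in which the compactor could already have removed purge infos the index still needs. With the commit folded into the set, the two values stay equal.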
Companion clouseau PR to commit when setting the purge_seq: cloudant-labs/clouseau#143

Highlighting the two places we set the purge sequence:

couchdb/src/dreyfus/src/dreyfus_util.erl, lines 353 to 357 in 45b0fbc
couchdb/src/dreyfus/src/dreyfus_index_updater.erl, lines 115 to 117 in 45b0fbc

In both cases we first set the purge sequence and only then update the checkpoint document. If the index commits right after updating the purge_seq, we ensure the purge_seq won't regress on restart in the case where the index goes down with a pending (in-memory only) purge_seq in clouseau.