The current inventory collection is an input to FM analysis. Presently, when we create a sitrep, we record the collection ID of the inventory that was current when the sitrep was produced. I think that diagnosis engines and such will eventually want to be able to compare (and especially, diff) the current inventory with the one that was input to the prior sitrep, so that we can see if the state of the system has changed. Unfortunately, inventory collections are large and get deleted very aggressively, so it is possible for a sitrep to be produced where the parent sitrep's inventory doesn't exist at all. This is also an issue when trying to interpret old sitreps in e.g. OMDB or archived debugging records. I wrote a bit about this in RFD 603, but didn't come up with a solution...
I wonder if we ought to be copying the entire contents of the inventory collection into a big pile of `fm_inv_$WHATEVER` tables, or something, so that the sitrep can be completely self-describing. In general, this is the approach that feels conceptually correct: the sitrep should contain all the data necessary to understand the sitrep. But, on the other hand, actually doing it in practice seems horrifically gross because the inventory is already a huge pile of tables, and we'd essentially have to duplicate all the schemas and model types and such, and then make sure we can reassemble them into the same domain types representing an inventory as the ones that come from the actual inventory tables. All of this would also require updating as more stuff is added to the inventory. So, um. Man. That sucks. And, in addition to being painful to implement, it also means we're duplicating all this stuff on disk in CRDB.
Alternatively, we might consider modifying the inventory collection GC logic to retain inventory collections which were input to at least the current sitrep and possibly some buffer of prior ones. This is also a bit sketchy, especially because, well...we were deleting them for a reason. Hm.
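To make the GC option a bit more concrete, here's a rough sketch of what "retain the collections that were input to the current sitrep plus some buffer of prior ones" could look like as a pruning policy. Everything here is hypothetical: `CollectionId`, `collections_to_prune`, and all of the parameters are made-up names for illustration, and this doesn't reflect how the actual inventory GC decides what to delete. It just assumes we keep the newest `keep_latest` collections as a baseline and additionally pin whatever the most recent sitreps reference.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for an inventory collection ID (not the real type).
type CollectionId = u64;

/// Returns the collections that a GC pass could delete.
///
/// Assumptions for this sketch (none of this is the real GC policy):
/// - `collections_oldest_first` lists every stored collection, oldest first.
/// - The newest `keep_latest` collections are always retained.
/// - `sitrep_inputs_newest_first` lists the collection ID recorded by each
///   sitrep, starting with the current sitrep and walking back through its
///   ancestors.
/// - The current sitrep's input is always pinned, along with the inputs of
///   up to `prior_sitreps_to_pin` earlier sitreps.
fn collections_to_prune(
    collections_oldest_first: &[CollectionId],
    keep_latest: usize,
    sitrep_inputs_newest_first: &[CollectionId],
    prior_sitreps_to_pin: usize,
) -> Vec<CollectionId> {
    // Collections referenced by the current sitrep and the buffer of prior
    // sitreps must survive, no matter how old they are.
    let pinned: HashSet<CollectionId> = sitrep_inputs_newest_first
        .iter()
        .take(prior_sitreps_to_pin.saturating_add(1))
        .copied()
        .collect();

    // Everything except the newest `keep_latest` collections is a candidate.
    let n_candidates = collections_oldest_first.len().saturating_sub(keep_latest);

    collections_oldest_first[..n_candidates]
        .iter()
        .copied()
        .filter(|id| !pinned.contains(id))
        .collect()
}
```

For example, with collections `[1, 2, 3, 4, 5, 6]`, `keep_latest = 2`, sitrep inputs `[4, 2, 1]` (newest first), and a buffer of one prior sitrep, this pins 4 and 2 and would prune only 1 and 3. The obvious downside is the one above: whatever gets pinned sticks around on disk for as long as the sitreps referencing it stay inside the buffer.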
@smklein and @jgallagher, any thoughts?