concepts.data_storage
Justin Joyce edited this page Jul 15, 2021 · 2 revisions
The datasets are laid out like this:
path/to/dataset          <-- this is the value in the dataset params
  /year_2021
    /month_03
      /day_20            <-- datasets are for specific dates; this part
                             is added automatically if not specified
        /as_at_<id>      <-- datasets can't be deleted; if you have a
                             correction, you create the dataset again.
                             Each run puts the data into a timestamped
                             frame. Usually, you read from the latest
                             frame when you read the dataset
          /blob          <-- the data is split into 64MB blobs, which
                             help keep memory requirements low and
                             help to avoid small-file problems
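
The layout above can be sketched in a few lines of Python. This is only an illustration, not the library's own API: the function names `dated_prefix` and `latest_frame` are hypothetical, and the sketch assumes that `as_at_<id>` identifiers sort lexicographically in the order the frames were created (which would hold for zero-padded timestamps, but is not confirmed by this page).

```python
import datetime
from typing import Iterable, Optional


def dated_prefix(dataset: str, date: datetime.date) -> str:
    """Append the year/month/day segments shown in the layout to a dataset path."""
    return f"{dataset.rstrip('/')}/year_{date:%Y}/month_{date:%m}/day_{date:%d}"


def latest_frame(blob_paths: Iterable[str]) -> Optional[str]:
    """Return the greatest 'as_at_<id>' segment found in the paths, or None.

    Assumption: frame ids sort lexicographically in creation order.
    """
    frames = {
        part
        for path in blob_paths
        for part in path.split("/")
        if part.startswith("as_at_")
    }
    return max(frames) if frames else None


if __name__ == "__main__":
    prefix = dated_prefix("path/to/dataset", datetime.date(2021, 3, 20))
    print(prefix)  # path/to/dataset/year_2021/month_03/day_20

    # Hypothetical blob listing: two runs for the same date, so two frames.
    blobs = [
        f"{prefix}/as_at_20210320-0900/blob-0000",
        f"{prefix}/as_at_20210320-1200/blob-0000",
        f"{prefix}/as_at_20210320-1200/blob-0001",
    ]
    print(latest_frame(blobs))  # as_at_20210320-1200
```

Because a correction is written as a new frame rather than over the old one, a reader that picks the greatest frame sees the corrected data while the earlier frame stays in place.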