Why these changes are being introduced:
There are a couple of primary ways in which a read method can fail based on the table requested:
1. The table name is invalid and will never work.
2. The table name is valid, but does not yet exist in the DuckDB context.
For the first, we want to raise an error immediately. For the second, there is a bit more nuance depending
on the table requested.
How this addresses that need:
For TIMDEXDataset.read_* methods, we operate from the assumption that `records` and `current_records` should always be available, so if either is missing we raise an error indicating that a metadata rebuild is required.
But for TIMDEXEmbeddings, and potentially other data sources as added, we may legitimately not have data yet.
As such, we'll want to log warnings and suggest a refresh, but just return an empty set.
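
A minimal sketch of the behavior described above, not the actual implementation. The names `read_table`, `MetadataRebuildRequiredError`, `REQUIRED_TABLES`, and `OPTIONAL_TABLES` are hypothetical, and a plain dict stands in for the DuckDB context:

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical table groupings, for illustration only.
REQUIRED_TABLES = {"records", "current_records"}  # assumed to always exist
OPTIONAL_TABLES = {"embeddings"}  # may legitimately have no data yet
VALID_TABLES = REQUIRED_TABLES | OPTIONAL_TABLES


class MetadataRebuildRequiredError(RuntimeError):
    """Hypothetical error: a required table is missing from the context."""


def read_table(table: str, context: dict[str, list[dict]]) -> list[dict]:
    """Read rows for `table` from `context`, a stand-in for the DuckDB context."""
    # Case 1: the table name is invalid and will never work, so fail immediately.
    if table not in VALID_TABLES:
        raise ValueError(f"Invalid table name: '{table}'")

    # Case 2: the table name is valid, but the table does not exist yet.
    if table not in context:
        if table in REQUIRED_TABLES:
            raise MetadataRebuildRequiredError(
                f"Table '{table}' not found; a metadata rebuild is required."
            )
        logger.warning(
            "Table '%s' not found; a refresh may be needed. Returning no rows.",
            table,
        )
        return []

    return context[table]
```

Here an empty list stands in for the "empty set" of results; the real read methods may return a different empty container or iterator.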
Side effects of this change:
* Applications like TIM, which now attempts to read embeddings for its `reindex-source` CLI command, will
no longer encounter an error if embeddings don't yet exist.
* In the rare edge case of a brand new dataset, we have better error raising and logging.
Relevant ticket(s):
* https://mitlibraries.atlassian.net/browse/USE-306