The query object [WIP]
The query object provides a smart abstraction to a database, allowing data to be fetched to the frontend efficiently and transparently. Conceptually, it's an interface to a database:
from dj.query import QueryObject
qo = QueryObject('http://localhost:31415/')
rows = qo.execute('SELECT * FROM table1')
for row in rows:
    print(row)
Now imagine three different scenarios:
Imagine that table1 is sufficiently small, say, 10k rows of 1KB each, and a user executes a query like this:
SELECT
country
, COUNT(*)
FROM
table1
GROUP BY
country
In this case the server should return all the data to the frontend, and the query object should execute the aggregation on the client side. If the user runs a second query on the object, the query object sends a request with an If-None-Match header. If the content has not changed, the server returns a 304 Not Modified status, and the frontend can reuse the data already present in the query object.
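A minimal sketch of this small-table path, assuming the server tags each response with an ETag. `FakeServer`, `CachedTable`, and their methods are hypothetical names used only for illustration, not part of `dj.query`, and the server is simulated in-process so the example runs without a network:

```python
from collections import Counter


class FakeServer:
    """In-process stand-in for the backend (hypothetical)."""

    def __init__(self, rows):
        self.rows = rows
        self.etag = "v1"

    def get(self, if_none_match=None):
        # Mimic an HTTP conditional request: 304 when the ETag still matches.
        if if_none_match == self.etag:
            return 304, self.etag, None
        return 200, self.etag, list(self.rows)


class CachedTable:
    """Client-side copy of a small table, revalidated before each query."""

    def __init__(self, server):
        self.server = server
        self.etag = None
        self.rows = []
        self.downloads = 0

    def refresh(self):
        status, etag, rows = self.server.get(if_none_match=self.etag)
        if status == 200:
            self.etag, self.rows = etag, rows
            self.downloads += 1
        # On 304 the cached rows are still valid and are kept as-is.

    def count_by(self, column):
        # Client-side equivalent of SELECT column, COUNT(*) ... GROUP BY column.
        self.refresh()
        return Counter(row[column] for row in self.rows)


server = FakeServer([
    {"username": "ann", "country": "US", "age": 34},
    {"username": "bob", "country": "BR", "age": 25},
    {"username": "cat", "country": "US", "age": 19},
])
table = CachedTable(server)
by_country = table.count_by("country")        # first query downloads the table
by_country_again = table.count_by("country")  # second query gets a 304
```

Only the first call transfers rows; the second call costs one round trip for revalidation and then aggregates the cached copy.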
For huge tables, if the result set is too big (say, larger than 100 MB), the response is just a lazy iterator over the data. As the client iterates over the object, rows are streamed from the server using a persistent connection (WebSocket? Keep-alive?).
Further queries need to be processed by the backend, since the query object holds no data.
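The lazy iterator could look roughly like the generator below. This is a sketch under assumptions: it pulls fixed-size pages through a hypothetical `fetch_page(offset, limit)` callback, which stands in for the persistent connection whose transport is still undecided, and `fake_fetch` simulates the backend:

```python
from typing import Callable, Iterator, List


def lazy_rows(fetch_page: Callable[[int, int], List[dict]],
              page_size: int = 2) -> Iterator[dict]:
    """Yield rows one at a time, requesting the next page only when needed."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            return  # an empty page signals the end of the result set
        yield from page
        offset += len(page)


# Hypothetical backend that records every request so laziness is observable.
DATA = [{"id": i} for i in range(5)]
request_log = []


def fake_fetch(offset, limit):
    request_log.append(offset)
    return DATA[offset:offset + limit]


rows = lazy_rows(fake_fetch)
first = next(rows)                   # triggers only the first page request
seen_after_first = list(request_log)
rest = list(rows)                    # drains the remaining pages
```

Because nothing is retained after a row is yielded, the client holds no data to reuse, which is why later queries go back to the backend.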
For intermediate-sized tables, the result set returned by the server will most likely contain only the rows for the executed query. There might still be room for some optimization, if the backend has metadata about the columns.
For example, suppose a user issues the following query:
SELECT
username
FROM
table1
WHERE
country = 'US' AND
age > 21
If the backend knows the distribution of the age column (for example its min, max, and first and second moments), it might omit that filter and return the rows filtered only by country. The query object will apply the age filter on the client side, and further queries with a different age condition (or no condition at all) can reuse the data already on the client.
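One way the backend might make that decision is sketched below. It uses only the column's min and max and assumes values are spread uniformly between them; both the uniform assumption and the 0.5 cutoff are illustrative choices, and the function names are hypothetical:

```python
def estimated_selectivity(col_min, col_max, threshold):
    """Estimate the fraction of rows with value > threshold.

    Assumes values are spread uniformly between col_min and col_max,
    a simplification chosen for this sketch.
    """
    if threshold >= col_max:
        return 0.0
    if threshold < col_min:
        return 1.0
    return (col_max - threshold) / (col_max - col_min)


def should_omit_filter(col_min, col_max, threshold, cutoff=0.5):
    # Omit the filter when it would keep most rows anyway; the cutoff
    # is an arbitrary illustration, not a tuned constant.
    return estimated_selectivity(col_min, col_max, threshold) >= cutoff


# Rows the server returned after filtering by country only (made-up data).
us_rows = [
    {"username": "ann", "age": 34},
    {"username": "cat", "age": 19},
    {"username": "dan", "age": 52},
]

omit = should_omit_filter(col_min=18, col_max=90, threshold=21)
over_21 = [r["username"] for r in us_rows if r["age"] > 21]
over_40 = [r["username"] for r in us_rows if r["age"] > 40]  # no new request
```

A real backend would likely use richer statistics (the moments mentioned above, or histograms), but the trade-off is the same: send somewhat more rows now so that later age conditions can be answered locally.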
The query object can also reuse the client side data when more restrictive conditions are applied. For example, if the user filters by age > 30, the query object should be able to reuse the data from the query above, after confirming with the server that it hasn't changed.
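The reuse check amounts to asking whether the new condition implies the cached one. A minimal sketch for conditions of the form `column > value` follows; `GreaterThan`, `implies`, and `answer_from_cache` are hypothetical helpers, and the revalidation round trip to the server is left out:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GreaterThan:
    """Predicate of the form `column > value` (hypothetical helper type)."""
    column: str
    value: float


def implies(new: GreaterThan, cached: GreaterThan) -> bool:
    """True when every row matching `new` also matches `cached`."""
    return new.column == cached.column and new.value >= cached.value


def answer_from_cache(cached_pred, cached_rows, new_pred):
    """Return rows for new_pred from the cache, or None if the backend is needed."""
    if not implies(new_pred, cached_pred):
        return None
    return [r for r in cached_rows if r[new_pred.column] > new_pred.value]


# Rows cached from an earlier query with `age > 21` (made-up data).
cached_pred = GreaterThan("age", 21)
cached_rows = [
    {"username": "ann", "age": 34},
    {"username": "bob", "age": 25},
    {"username": "dan", "age": 52},
]

narrower = answer_from_cache(cached_pred, cached_rows, GreaterThan("age", 30))
wider = answer_from_cache(cached_pred, cached_rows, GreaterThan("age", 18))
```

Here `narrower` is served from the cache, while `wider` is `None` because rows with ages between 18 and 21 were never fetched.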
The query object will be implemented in WebAssembly (even though the example is in Python). This allows filtering, grouping, and other operations to be performed quickly on the object.
It needs a SQL parser, in order to understand the queries and reuse cached data.