📝 NOTE: This content was pulled from PR 308
Documents represent a single row or record of data in Astra DB Serverless.
Use the Collection class to work with documents.
If you haven’t already, consult the Collections reference topic for details on how to get a Collection object.
- Python
-
Date and datetime objects, which are instances of the Python standard library datetime.datetime and datetime.date classes, can be used anywhere in documents.
import datetime

from astrapy import DataAPIClient

client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

collection.insert_one({"when": datetime.datetime.now()})
collection.insert_one({"date_of_birth": datetime.date(2000, 1, 1)})

collection.update_one(
    {"registered_at": datetime.date(1999, 11, 14)},
    {"$set": {"message": "happy Sunday!"}},
)

print(
    collection.find_one(
        {"date_of_birth": {"$lt": datetime.date(2001, 1, 1)}},
        projection={"_id": False},
    )
)
# will print:
#    {'date_of_birth': datetime.datetime(2000, 1, 1, 0, 0)}
Note: As shown in the example, read operations from a collection always return the datetime class regardless of whether a date or a datetime was provided in the insertion.
- TypeScript
-
Hello TypeScript world. TBD.
- Java
-
Hello Java world. TBD.
- cURL
-
Hello cURL world. TBD.
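The date handling described in the Python tab above can be pictured locally with the standard library alone. The following is a minimal sketch, not AstraPy's implementation: the helper name as_returned_by_reads is hypothetical, and it assumes a date maps to midnight of the same day, which is what the printed output in the Python example shows. Time zones are not considered.

```python
import datetime


def as_returned_by_reads(value):
    """Hypothetical helper: mimic locally the type a read would hand back."""
    # datetime.datetime is a subclass of datetime.date, so test it first.
    if isinstance(value, datetime.datetime):
        return value
    if isinstance(value, datetime.date):
        # Assumption: a plain date comes back as midnight of that day.
        return datetime.datetime.combine(value, datetime.time.min)
    raise TypeError("expected a date or a datetime")


date_of_birth = as_returned_by_reads(datetime.date(2000, 1, 1))
print(repr(date_of_birth))  # datetime.datetime(2000, 1, 1, 0, 0)
```

Code that compares values read from a collection with values held in memory may want to normalize both sides in a similar way before comparing.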
Documents in a collection are always identified by an ID that is unique within the collection.
The ID can be any of several types, such as a string, integer, or datetime. However, the uuid and ObjectId types are recommended.
The Data API supports uuid identifiers up to version 8, as well as ObjectId identifiers as provided by the bson library.
These identifiers can appear anywhere in the document, not only in its _id field. Moreover, different types of identifier can appear in different parts of the same document, and they can be part of filtering clauses and update/replace directives just like any other data type.
One of the optional settings of a collection is the "default ID type": that is, it is possible to specify what kind of identifiers the server should supply for documents without an explicit _id field. (For details, see the create_collection method and Data API createCollection command in the Collections reference.) Regardless of the defaultId setting, however, identifiers of any type can be explicitly provided for documents (for example, during insertions) and will be honored by the insertion process.
- Python
-
from astrapy.ids import (
    ObjectId,
    uuid1,
    uuid3,
    uuid4,
    uuid5,
    uuid6,
    uuid7,
    uuid8,
    UUID,
)
AstraPy recognizes uuid versions 1 through 8 (with the exception of 2) as provided by the uuid and uuid6 Python libraries, as well as the ObjectId from the bson package. Furthermore, for convenience, these same utilities are exposed as shown in the example above.
You can then generate new identifiers with statements such as new_id = uuid8() or new_obj_id = ObjectId(). Keep in mind that all uuid versions are instances of the same class (UUID), which exposes a version property, should you need to access it.
Here is a short example of the concepts:
from astrapy import DataAPIClient
from astrapy.ids import ObjectId, uuid8, UUID

client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

collection.insert_one({"_id": uuid8(), "tag": "new_id_v_8"})
collection.insert_one(
    {"_id": UUID("018e77bc-648d-8795-a0e2-1cad0fdd53f5"), "tag": "id_v_8"}
)
collection.insert_one({"_id": ObjectId(), "tag": "new_obj_id"})
collection.insert_one(
    {"_id": ObjectId("6601fb0f83ffc5f51ba22b88"), "tag": "obj_id"}
)

collection.find_one_and_update(
    {"_id": ObjectId("6601fb0f83ffc5f51ba22b88")},
    {"$set": {"item_inventory_id": UUID("1eeeaf80-e333-6613-b42f-f739b95106e6")}},
)
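The version property mentioned above can be tried without a database. This sketch uses only the standard-library uuid module, on the assumption that the UUID class exposed by AstraPy behaves like the standard one. The two identifier strings are the ones from the example; nothing here calls the Data API.

```python
import uuid

# Identifier strings taken from the example above.
id_v8 = uuid.UUID("018e77bc-648d-8795-a0e2-1cad0fdd53f5")
id_v6 = uuid.UUID("1eeeaf80-e333-6613-b42f-f739b95106e6")

# A freshly generated random identifier (version 4).
new_id = uuid.uuid4()

# All three are instances of the same class; only `version` tells them apart.
for label, value in [("v8", id_v8), ("v6", id_v6), ("new", new_id)]:
    print(label, type(value).__name__, "version:", value.version)
```

Checking the version this way can be handy when a collection mixes identifiers of several kinds and downstream code needs to branch on them.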
- TypeScript
-
Hello TypeScript world. TBD.
- Java
-
Hello Java world. TBD.
- cURL
-
Hello cURL world. TBD.
Insert a single document into a collection.
- Python
-
insert_result = collection.insert_one({"name": "Jane Doe"})
Insert a document with an associated vector.
insert_result = collection.insert_one( {"name": "Jane Doe"}, vector=[.08, .68, .30], )
Parameters:
| Name | Type | Description |
|---|---|---|
| document | Dict[str, Any] | The dictionary expressing the document to insert. The _id field of the document can be left out, in which case it will be created automatically. |
| vector | Optional[Iterable[float]] | A vector (a list of numbers appropriate for the collection) for the document. Passing this parameter is equivalent to providing the vector in the "$vector" field of the document itself; however, the two are mutually exclusive. |
| max_time_ms | Optional[int] | A timeout, in milliseconds, for the underlying HTTP request. |
Example:
from astrapy import DataAPIClient

client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

# Insert a document with a specific ID
response1 = collection.insert_one(
    {
        "_id": 101,
        "name": "John Doe",
    },
    vector=[.12, .52, .32],
)

# Insert a document without specifying an ID
# so that _id is generated automatically
response2 = collection.insert_one(
    {
        "name": "Jane Doe",
    },
    vector=[.08, .68, .30],
)
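The relationship between the vector parameter and the "$vector" field can be sketched as a small local function. This is only an illustration of the "equivalent but mutually exclusive" rule stated in the parameter description: attach_vector is a hypothetical helper, not part of AstraPy, and the exception type is an assumption.

```python
from typing import Any, Dict, List, Optional


def attach_vector(
    document: Dict[str, Any],
    vector: Optional[List[float]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: return a copy of `document` carrying its vector."""
    if vector is None:
        return dict(document)
    if "$vector" in document:
        # Both forms were supplied at once, which the description rules out.
        raise ValueError("Pass the vector as a parameter or as '$vector', not both.")
    return {**document, "$vector": list(vector)}


via_parameter = attach_vector({"name": "Jane Doe"}, vector=[0.08, 0.68, 0.30])
via_field = attach_vector({"name": "Jane Doe", "$vector": [0.08, 0.68, 0.30]})
print(via_parameter == via_field)  # True
```

Either form ends up describing the same document, so the choice is mostly a matter of where the vector is computed in the calling code.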
- TypeScript
-
Collection<Schema>.insertOne( document: MaybeId<Schema>, options?: InsertOneOptions, ): Promise<InsertOneResult<Schema>>
Parameters:
| Name | Type | Description |
|---|---|---|
| document | MaybeId<Schema> | The document to insert. If the document does not have an _id field, the server generates one. |
| options? | InsertOneOptions | The options for this operation. |
Options (InsertOneOptions):
| Name | Type | Description |
|---|---|---|
| vector? | number[] | The vector for the document. Equivalent to providing the vector in the $vector field of the document itself; however, the two are mutually exclusive. |
| maxTimeMS? | number | The maximum time in milliseconds that the client should wait for the operation to complete. |
Returns:
Promise<InsertOneResult<Schema>> - A promise that resolves to the inserted ID.
Example:
import { DataApiClient } from '@datastax/astra-db-ts';

// Reference an untyped collection
const db = new DataApiClient('TOKEN').db('API_ENDPOINT');
const collection = db.collection('my_collection');

(async () => {
  // Insert a document with a specific ID
  await collection.insertOne({ _id: '1', name: 'John Doe' });

  // Insert a document with an autogenerated ID
  await collection.insertOne({ name: 'Jane Doe' });

  // Insert a document with a vector
  await collection.insertOne({ name: 'Jane Doe' }, { vector: [.12, .52, .32] });
  await collection.insertOne({ name: 'Jane Doe', $vector: [.12, .52, .32] });
})();
- Java
-
TBD
- cURL
-
curl -s --location \
--request POST ${ASTRA_DB_API_ENDPOINT}/api/json/v1/${ASTRA_DB_KEYSPACE}/${ASTRA_DB_COLLECTION} \
--header "Token: ${ASTRA_DB_APPLICATION_TOKEN}" \
--header "Content-Type: application/json" \
--header "Accept: application/json" \
--data '{
  "insertOne": {
    "document": {
      "_id": "1",
      "purchase_type": "Online",
      "$vector": [0.25, 0.25, 0.25, 0.25, 0.25],
      "customer": {
        "name": "Jim A.",
        "phone": "123-456-1111",
        "age": 51,
        "credit_score": 782,
        "address": {
          "address_line": "1234 Broadway",
          "city": "New York",
          "state": "NY"
        }
      },
      "purchase_date": {"$date": 1690045891},
      "seller": {
        "name": "Jon B.",
        "location": "Manhattan NYC"
      },
      "items": [
        {
          "car" : "BMW 330i Sedan",
          "color": "Silver"
        },
        "Extended warranty - 5 years"
      ],
      "amount": 47601,
      "status" : "active",
      "preferred_customer" : true
    }
  }
}' | json_pp
Properties:
| Name | Type | Description |
|---|---|---|
| insertOne | command | Data API designation that a single document is inserted. |
Response
{ "status": { "insertedIds": [ "1" ] } }
Insert multiple documents into a collection.
- Python
-
response = collection.insert_many(
    [
        {
            "_id": 101,
            "name": "John Doe",
        },
        {
            # ID is generated automatically
            "name": "Jane Doe",
        },
    ],
    vectors=[
        [.12, .52, .32],
        [.08, .68, .30],
    ],
    ordered=True,
)
Returns:
InsertManyResult - An object representing the response from the database after the insert operation. It includes information about the success of the operation and details of the inserted documents.
Example response
InsertManyResult(raw_results=[{'status': {'insertedIds': [101, '81077d86-05dc-43ca-877d-8605dce3ca4d']}}], inserted_ids=[101, '81077d86-05dc-43ca-877d-8605dce3ca4d'])
Parameters:
| Name | Type | Description |
|---|---|---|
| documents | Iterable[Dict[str, Any]] | An iterable of dictionaries, each a document to insert. Documents may specify their _id field or leave it out, in which case it will be added automatically. |
| vectors | Optional[Iterable[Optional[Iterable[float]]]] | An optional list of vectors (as many vectors as the provided documents) to associate with the documents when inserting. Each vector is added to the corresponding document prior to insertion in the database. The list can be a mixture of None and vectors, in which case some documents will not have a vector, unless it is already specified in their "$vector" field. Passing vectors this way is equivalent to setting the "$vector" field of the documents; however, the two are mutually exclusive. |
| ordered | bool | If True (default), the insertions are processed sequentially. If False, they can occur in arbitrary order and possibly concurrently. |
| chunk_size | Optional[int] | How many documents to include in a single API request. Exceeding the server maximum allowed value results in an error. Leave it unspecified (recommended) to use the system default. |
| concurrency | Optional[int] | Maximum number of concurrent requests to the API at a given time. It cannot be more than one for ordered insertions. |
| max_time_ms | Optional[int] | A timeout, in milliseconds, for the operation. |
Note: Unless there are specific reasons not to, prefer ordered=False, as it results in a much higher insert throughput than an equivalent ordered insertion.
Example:
from astrapy import DataAPIClient

client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

collection.insert_many([{"a": 10}, {"a": 5}, {"b": [True, False, False]}])

collection.insert_many(
    [{"seq": i} for i in range(50)],
    ordered=False,
    concurrency=5,
)

# The following are three equivalent statements:
collection.insert_many(
    [{"tag": "a"}, {"tag": "b"}],
    vectors=[[1, 2], [3, 4]],
)
collection.insert_many(
    [{"tag": "a", "$vector": [1, 2]}, {"tag": "b"}],
    vectors=[None, [3, 4]],
)
collection.insert_many(
    [
        {"tag": "a", "$vector": [1, 2]},
        {"tag": "b", "$vector": [3, 4]},
    ]
)
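To picture what a chunk size means for a large insertion, here is a minimal local sketch of splitting a list of documents into per-request batches. The helper chunked is hypothetical and does not reproduce AstraPy internals; the value 20 is used only for illustration.

```python
from typing import Any, Dict, Iterator, List


def chunked(
    documents: List[Dict[str, Any]],
    chunk_size: int,
) -> Iterator[List[Dict[str, Any]]]:
    """Hypothetical helper: yield consecutive slices of at most `chunk_size` documents."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    for start in range(0, len(documents), chunk_size):
        yield documents[start:start + chunk_size]


documents = [{"seq": i} for i in range(50)]
batches = list(chunked(documents, chunk_size=20))
print([len(batch) for batch in batches])  # [20, 20, 10]
```

With an unordered insertion, batches like these could be sent concurrently, which is where the concurrency setting comes into play; an ordered insertion has to send them one after another.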
- TypeScript
-
Collection<Schema>.insertMany( documents: MaybeId<Schema>[], options?: InsertManyOptions, ): Promise<InsertManyResult<Schema>>
Parameters:
| Name | Type | Description |
|---|---|---|
| documents | MaybeId<Schema>[] | The documents to insert. If any document does not have an _id field, the server generates one. |
| options? | InsertManyOptions | The options for this operation. |
Options (InsertManyOptions):
| Name | Type | Description |
|---|---|---|
| ordered? | boolean | You may set the ordered option to true to stop the operation after the first error; otherwise all documents may be parallelized and processed in arbitrary order, which can improve performance considerably. |
| concurrency? | number | You can set the concurrency option to control how many network requests are made in parallel on unordered insertions. Defaults to 8. Not available for ordered insertions. |
| chunkSize? | number | Controls how many documents are sent with each network request. Defaults to 20. |
| vectors? | (number[] \| null \| undefined)[] | An array of vectors to associate with each document. If a vector is null or undefined, the document will not have a vector. If provided, the array must have as many entries as there are documents. Equivalent to providing the vector in the $vector field of the documents themselves; however, the two are mutually exclusive. |
Note: Unless there are specific reasons not to, leave ordered set to false, as it results in a much higher insert throughput than an equivalent ordered insertion.
Returns:
Promise<InsertManyResult<Schema>> - A promise that resolves to the inserted IDs.
Example:
import { DataApiClient, InsertManyError } from '@datastax/astra-db-ts';

// Reference an untyped collection
const db = new DataApiClient('TOKEN').db('API_ENDPOINT');
const collection = db.collection('my_collection');

(async () => {
  try {
    // Insert many documents
    await collection.insertMany([
      { _id: '1', name: 'John Doe' },
      { name: 'Jane Doe' }, // Will autogen ID
    ], { ordered: true });

    // Insert many with vectors
    await collection.insertMany([
      { name: 'John Doe', $vector: [.12, .52, .32] },
      { name: 'Jane Doe' },
      { name: 'Jane Doe', $vector: [.32, .52, .12] },
    ]);
    await collection.insertMany([
      { name: 'John Doe' },
      { name: 'Jane Doe' },
      { name: 'Dane Joe' },
    ], {
      vectors: [
        [.12, .52, .32],
        null,
        [.32, .52, .12],
      ],
      ordered: true,
    });
  } catch (e) {
    if (e instanceof InsertManyError) {
      console.log(e.insertedIds);
    }
  }
})();
- Java
-
TBD
- cURL
-
The following Data API insertMany command adds 20 documents to a collection.
curl -s --location \
--request POST ${ASTRA_DB_API_ENDPOINT}/api/json/v1/${ASTRA_DB_KEYSPACE}/${ASTRA_DB_COLLECTION} \
--header "Token: ${ASTRA_DB_APPLICATION_TOKEN}" \
--header "Content-Type: application/json" \
--header "Accept: application/json" \
--data '{ "insertMany": { "documents": [
{ "_id": "2", "purchase_type": "Online", "$vector": [0.1, 0.15, 0.3, 0.12, 0.05], "customer": { "name": "Jack B.", "phone": "123-456-2222", "age": 34, "credit_score": 700, "address": { "address_line": "888 Broadway", "city": "New York", "state": "NY" } }, "purchase_date": {"$date": 1690391491}, "seller": { "name": "Tammy S.", "location": "Staten Island NYC" }, "items": [ { "car" : "Tesla Model 3", "color": "White" }, "Extended warranty - 10 years", "Service - 5 years" ], "amount": 53990, "status" : "active" },
{ "_id": "3", "purchase_type": "Online", "$vector": [0.15, 0.1, 0.1, 0.35, 0.55], "customer": { "name": "Jill D.", "phone": "123-456-3333", "age": 30, "credit_score": 742, "address": { "address_line": "12345 Broadway", "city": "New York", "state": "NY" } }, "purchase_date": {"$date": 1690564291}, "seller": { "name": "Jasmine S.", "location": "Brooklyn NYC" }, "items": "Extended warranty - 10 years", "amount": 4600, "status" : "active" },
{ "_id": "4", "purchase_type": "In Person", "$vector": [0.25, 0.25, 0.25, 0.25, 0.26], "customer": { "name": "Lester M.", "phone": "123-456-4444", "age": 40, "credit_score": 802, "address": { "address_line": "12346 Broadway", "city": "New York", "state": "NY" } }, "purchase_date": {"$date": 1690909891}, "seller": { "name": "Jon B.", "location": "Manhattan NYC" }, "items": [ { "car" : "BMW 330i Sedan", "color": "Red" }, "Extended warranty - 5 years", "Service - 5 years" ], "amount": 48510, "status" : "active" },
{ "_id": "5", "purchase_type": "Online", "$vector": [0.25, 0.045, 0.38, 0.31, 0.67], "customer": { "name": "David C.", "phone": "123-456-5555", "age": 50, "credit_score": 800, "address": { "address_line": "32345 Main Ave", "city": "Jersey City", "state": "NJ" } }, "purchase_date": {"$date": 1690996291}, "seller": { "name": "Jim A.", "location": "Jersey City NJ" }, "items": [ { "car" : "Tesla Model S", "color": "Red" }, "Extended warranty - 5 years" ], "amount": 94990, "status" : "active" },
{ "_id": "6", "purchase_type": "In Person", "$vector": [0.11, 0.02, 0.78, 0.10, 0.27], "customer": { "name": "Chris E.", "phone": "123-456-6666", "age": 43, "credit_score": 764, "address": { "address_line": "32346 Broadway", "city": "New York", "state": "NY" } }, "purchase_date": {"$date": 1691860291}, "seller": { "name": "Jim A.", "location": "Jersey City NJ" }, "items": [ { "car" : "Tesla Model X", "color": "Blue" } ], "amount": 109990, "status" : "active" },
{ "_id": "7", "purchase_type": "Online", "$vector": [0.21, 0.22, 0.33, 0.44, 0.53], "customer": { "name": "Jeff G.", "phone": "123-456-7777", "age": 66, "credit_score": 802, "address": { "address_line": "22999 Broadway", "city": "New York", "state": "NY" } }, "purchase_date": {"$date": 1692119491}, "seller": { "name": "Jasmine S.", "location": "Brooklyn NYC" }, "items": [{ "car" : "BMW M440i Gran Coupe", "color": "Black" }, "Extended warranty - 5 years"], "amount": 61050, "status" : "active" },
{ "_id": "8", "purchase_type": "In Person", "$vector": [0.3, 0.23, 0.15, 0.17, 0.4], "customer": { "name": "Harold S.", "phone": "123-456-8888", "age": 29, "credit_score": 710, "address": { "address_line": "1234 Main St", "city": "Orange", "state": "NJ" } }, "purchase_date": {"$date": 1693329091}, "seller": { "name": "Tammy S.", "location": "Staten Island NYC" }, "items": [{ "car" : "BMW X3 SUV", "color": "Black" }, "Extended warranty - 5 years" ], "amount": 46900, "status" : "active" },
{ "_id": "9", "purchase_type": "Online", "$vector": [0.1, 0.15, 0.3, 0.12, 0.06], "customer": { "name": "Richard Z.", "phone": "123-456-9999", "age": 22, "credit_score": 690, "address": { "address_line": "22345 Broadway", "city": "New York", "state": "NY" } }, "purchase_date": {"$date": 1693588291}, "seller": { "name": "Jasmine S.", "location": "Brooklyn NYC" }, "items": [{ "car" : "Tesla Model 3", "color": "White" }, "Extended warranty - 5 years" ], "amount": 53990, "status" : "active" },
{ "_id": "10", "purchase_type": "In Person", "$vector": [0.25, 0.045, 0.38, 0.31, 0.68], "customer": { "name": "Eric B.", "phone": null, "age": 54, "credit_score": 780, "address": { "address_line": "9999 River Rd", "city": "Fair Haven", "state": "NJ" } }, "purchase_date": {"$date": 1694797891}, "seller": { "name": "Jim A.", "location": "Jersey City NJ" }, "items": [{ "car" : "Tesla Model S", "color": "Black" } ], "amount": 93800, "status" : "active" },
{ "_id": "11", "purchase_type": "Online", "$vector": [0.44, 0.11, 0.33, 0.22, 0.88], "customer": { "name": "Ann J.", "phone": "123-456-1112", "age": 47, "credit_score": 660, "address": { "address_line": "99 Elm St", "city": "Fair Lawn", "state": "NJ" } }, "purchase_date": {"$date": 1695921091}, "seller": { "name": "Jim A.", "location": "Jersey City NJ" }, "items": [{ "car" : "Tesla Model Y", "color": "White" }, "Extended warranty - 5 years" ], "amount": 57500, "status" : "active" },
{ "_id": "12", "purchase_type": "In Person", "$vector": [0.33, 0.44, 0.55, 0.77, 0.66], "customer": { "name": "John T.", "phone": "123-456-1123", "age": 55, "credit_score": 786, "address": { "address_line": "23 Main Blvd", "city": "Staten Island", "state": "NY" } }, "purchase_date": {"$date": 1696180291}, "seller": { "name": "Tammy S.", "location": "Staten Island NYC" }, "items": [{ "car" : "BMW 540i xDrive Sedan", "color": "Black" }, "Extended warranty - 5 years" ], "amount": 64900, "status" : "active" },
{ "_id": "13", "purchase_type": "Online", "$vector": [0.1, 0.15, 0.3, 0.12, 0.07], "customer": { "name": "Aaron W.", "phone": "123-456-1133", "age": 60, "credit_score": 702, "address": { "address_line": "1234 4th Ave", "city": "New York", "state": "NY" } },
"purchase_date": {"$date": 1697389891}, "seller": { "name": "Jon B.", "location": "Manhattan NYC" }, "items": [{ "car" : "Tesla Model 3", "color": "White" }, "Extended warranty - 5 years" ], "amount": 55000, "status" : "active" }, { "_id": "14", "purchase_type": "In Person", "$vector": [0.11, 0.02, 0.78, 0.21, 0.27], "customer": { "name": "Kris S.", "phone": "123-456-1144", "age": 44, "credit_score": 702, "address": { "address_line": "1414 14th Pl", "city": "Brooklyn", "state": "NY" } }, "purchase_date": {"$date": 1698513091}, "seller": { "name": "Jasmine S.", "location": "Brooklyn NYC" }, "items": [{ "car" : "Tesla Model X", "color": "White" } ], "amount": 110400, "status" : "active" }, { "_id": "15", "purchase_type": "Online", "$vector": [0.1, 0.15, 0.3, 0.12, 0.08], "customer": { "name": "Maddy O.", "phone": "123-456-1155", "age": 41, "credit_score": 782, "address": { "address_line": "1234 Maple Ave", "city": "West New York", "state": "NJ" } }, "purchase_date": {"$date": 1701191491}, "seller": { "name": "Jim A.", "location": "Jersey City NJ" }, "items": { "car" : "Tesla Model 3", "color": "White" }, "amount": 52990, "status" : "active" }, { "_id": "16", "purchase_type": "In Person", "$vector": [0.44, 0.11, 0.33, 0.22, 0.88], "customer": { "name": "Tim C.", "phone": "123-456-1166", "age": 38, "credit_score": 700, "address": { "address_line": "1234 Main St", "city": "Staten Island", "state": "NY" } }, "purchase_date": {"$date": 1701450691}, "seller": { "name": "Tammy S.", "location": "Staten Island NYC" }, "items": [{ "car" : "Tesla Model Y", "color": "White" }, "Extended warranty - 5 years" ], "amount": 58990, "status" : "active" }, { "_id": "17", "purchase_type": "Online", "$vector": [0.1, 0.15, 0.3, 0.12, 0.09], "customer": { "name": "Yolanda Z.", "phone": "123-456-1177", "age": 61, "credit_score": 694, "address": { "address_line": "1234 Main St", "city": "Hoboken", "state": "NJ" } }, "purchase_date": {"$date": 1702660291}, "seller": { "name": "Jim A.", 
"location": "Jersey City NJ" }, "items": [{ "car" : "Tesla Model 3", "color": "Blue" }, "Extended warranty - 5 years" ], "amount": 54900, "status" : "active" }, { "_id": "18", "purchase_type": "Online", "$vector": [0.15, 0.17, 0.15, 0.43, 0.55], "customer": { "name": "Thomas D.", "phone": "123-456-1188", "age": 45, "credit_score": 724, "address": { "address_line": "98980 20th St", "city": "New York", "state": "NY" } }, "purchase_date": {"$date": 1703092291}, "seller": { "name": "Jon B.", "location": "Manhattan NYC" }, "items": [{ "car" : "BMW 750e xDrive Sedan", "color": "Black" }, "Extended warranty - 5 years" ], "amount": 106900, "status" : "active" }, { "_id": "19", "purchase_type": "Online", "$vector": [0.25, 0.25, 0.25, 0.25, 0.27], "customer": { "name": "Vivian W.", "phone": "123-456-1199", "age": 20, "credit_score": 698, "address": { "address_line": "5678 Elm St", "city": "Hartford", "state": "CT" } }, "purchase_date": {"$date": 1704215491}, "seller": { "name": "Jasmine S.", "location": "Brooklyn NYC" }, "items": [{ "car" : "BMW 330i Sedan", "color": "White" }, "Extended warranty - 5 years" ], "amount": 46980, "status" : "active" }, { "_id": "20", "purchase_type": "In Person", "$vector": [0.44, 0.11, 0.33, 0.22, 0.88], "customer": { "name": "Leslie E.", "phone": null, "age": 44, "credit_score": 782, "address": { "address_line": "1234 Main St", "city": "Newark", "state": "NJ" } }, "purchase_date": {"$date": 1705338691}, "seller": { "name": "Jim A.", "location": "Jersey City NJ" }, "items": [{ "car" : "Tesla Model Y", "color": "Black" }, "Extended warranty - 5 years" ], "amount": 59800, "status" : "active" }, { "_id": "21", "purchase_type": "In Person", "$vector": [0.21, 0.22, 0.33, 0.44, 0.53], "customer": { "name": "Rachel I.", "phone": null, "age": 62, "credit_score": 786, "address": { "address_line": "1234 Park Ave", "city": "New York", "state": "NY" } }, "purchase_date": {"$date": 1706202691}, "seller": { "name": "Jon B.", "location": "Manhattan NYC" }, 
"items": [{ "car" : "BMW M440i Gran Coupe", "color": "Silver" }, "Extended warranty - 5 years", "Gap Insurance - 5 years" ], "amount": 65250, "status" : "active" } ], "options": { "ordered": false } } }' | json_pp
Response
{ "status" : { "insertedIds" : [ "4", "7", "10", "13", "16", "19", "21", "18", "6", "12", "15", "9", "3", "11", "2", "17", "14", "8", "20", "5" ] } }
Properties:
| Name | Type | Description |
|---|---|---|
| insertMany | command | Data API designation that many documents (up to 20 at a time) are being inserted. |
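The request body sent with the insertMany command can also be assembled programmatically rather than typed by hand. This is a sketch using only the standard json module: build_insert_many_payload is a hypothetical helper, the 20-document cap reflects the "up to 20 at a time" statement above, and the two sample documents are trimmed versions of those in the cURL example.

```python
import json
from typing import Any, Dict, List

MAX_DOCUMENTS_PER_COMMAND = 20  # "up to 20 at a time", per the text above


def build_insert_many_payload(
    documents: List[Dict[str, Any]],
    ordered: bool = False,
) -> str:
    """Hypothetical helper: serialize documents into an insertMany command body."""
    if len(documents) > MAX_DOCUMENTS_PER_COMMAND:
        raise ValueError("too many documents for a single insertMany command")
    command = {
        "insertMany": {
            "documents": documents,
            "options": {"ordered": ordered},
        }
    }
    return json.dumps(command)


payload = build_insert_many_payload([
    {"_id": "2", "purchase_type": "Online", "$vector": [0.1, 0.15, 0.3, 0.12, 0.05]},
    {"_id": "3", "purchase_type": "Online", "$vector": [0.15, 0.1, 0.1, 0.35, 0.55]},
])
print(payload)
```

The resulting string could then be passed as the --data argument of a request shaped like the one shown in the cURL example.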
Retrieve a single document from a collection using various options.
- Python
-
Retrieve a single document from a collection by its _id.
document = collection.find_one({"_id": 101})
Retrieve a single document from a collection by any attribute, as long as it is covered by the collection’s indexing configuration.
Tip: As noted in the Indexing option section of the Collections reference topic, any field that is part of a subsequent filter or sort operation must be an indexed field. If you elected not to index certain or all fields when you created the collection, you cannot reference those fields in a filter/sort query.
document = collection.find_one({"location": "warehouse_C"})
Retrieve a single document from a collection by an arbitrary filtering clause.
document = collection.find_one({"tag": {"$exists": True}})
Retrieve the most similar document to a given vector.
result = collection.find_one({}, vector=[.12, .52, .32])
Retrieve only specific fields from a document.
result = collection.find_one({"_id": 101}, projection={"name": True})
Returns:
Union[Dict[str, Any], None] - Either the found document as a dictionary or None if no matching document is found.
Example response
{'_id': 101, 'name': 'John Doe', '$vector': [0.12, 0.52, 0.32]}
Parameters:
| Name | Type | Description |
|---|---|---|
| filter | Optional[Dict[str, Any]] | A predicate expressed as a dictionary according to the Data API filter syntax. Examples are {}, {"name": "John"}, {"price": {"$lte": 100}}, {"$and": [{"name": "John"}, {"price": {"$lte": 100}}]}. See Data API operators for the full list of operators. |
| projection | Optional[Union[Iterable[str], Dict[str, bool]]] | Used to select a subset of fields in the documents being returned. The projection can be: an iterable over the field names to return; a dictionary {field_name: True} to positively select certain fields; or a dictionary {field_name: False} if one wants to discard some fields from the response. The default is to return the whole documents. |
| vector | Optional[Iterable[float]] | A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to perform vector search, that is, Approximate Nearest Neighbors (ANN) search, extracting the most similar document in the collection matching the filter. This parameter cannot be used together with sort. See the find method for more details on this parameter. |
| include_similarity | Optional[bool] | A boolean to request the numeric value of the similarity to be returned as an added "$similarity" key in the returned document. Can only be used for vector ANN search, i.e. when either vector is supplied or the sort parameter has the shape {"$vector": …}. |
| sort | Optional[Dict[str, Any]] | With this dictionary parameter one can control the order in which the documents are returned. See the Note about sorting for details. |
| max_time_ms | Optional[int] | A timeout, in milliseconds, for the underlying HTTP request. |
Example:
from astrapy import DataAPIClient
import astrapy

client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

collection.find_one({})
# prints: {'_id': '68d1e515-...', 'seq': 37}
collection.find_one({"seq": 10})
# prints: {'_id': 'd560e217-...', 'seq': 10}
collection.find_one({"seq": 1011})
# (returns None for no matches)
collection.find_one({}, projection={"seq": False})
# prints: {'_id': '68d1e515-...'}
collection.find_one(
    {},
    sort={"seq": astrapy.constants.SortDocuments.DESCENDING},
)
# prints: {'_id': '97e85f81-...', 'seq': 69}
collection.find_one({}, vector=[1, 0])
# prints: {'_id': '...', 'tag': 'D', '$vector': [4.0, 1.0]}
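To get a feel for how filter dictionaries read, the sketch below evaluates a small subset of the syntax against plain Python dictionaries. It is a teaching aid under stated assumptions, not the Data API's implementation: the function matches is hypothetical, only equality, $exists, $lt, $lte, and $and are covered, and nested fields are ignored.

```python
from typing import Any, Dict

_MISSING = object()  # sentinel for "field not present"


def matches(document: Dict[str, Any], filter_clause: Dict[str, Any]) -> bool:
    """Illustrative check of a local dict against a small subset of the filter syntax."""
    for key, condition in filter_clause.items():
        if key == "$and":
            if not all(matches(document, clause) for clause in condition):
                return False
        elif isinstance(condition, dict):
            # Operator form, such as {"price": {"$lt": 100}}.
            for operator, operand in condition.items():
                if operator == "$exists":
                    if (key in document) != operand:
                        return False
                elif operator == "$lt":
                    if key not in document or not document[key] < operand:
                        return False
                elif operator == "$lte":
                    if key not in document or not document[key] <= operand:
                        return False
                else:
                    raise NotImplementedError(f"{operator} is not covered by this sketch")
        elif document.get(key, _MISSING) != condition:
            # Plain form, such as {"name": "John"}: simple equality.
            return False
    return True


docs = [
    {"name": "John", "price": 80},
    {"name": "John", "price": 120},
    {"name": "Ann"},
]
both = {"$and": [{"name": "John"}, {"price": {"$lte": 100}}]}
print([doc for doc in docs if matches(doc, both)])
```

An empty filter matches every document, which is why find_one({}) simply returns some document from the collection.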
- TypeScript
-
TBD
- Java
-
TBD
- cURL
-
This Data API findOne command retrieves a document based on a filter using a specific _id value.
curl -s --location \
--request POST ${ASTRA_DB_API_ENDPOINT}/api/json/v1/${ASTRA_DB_KEYSPACE}/${ASTRA_DB_COLLECTION} \
--header "Token: ${ASTRA_DB_APPLICATION_TOKEN}" \
--header "Content-Type: application/json" \
--header "Accept: application/json" \
--data '{
  "findOne": {
    "filter": {"_id" : "14"}
  }
}' | json_pp
Result:
{ "data" : { "document" : { "$vector" : [ 0.11, 0.02, 0.78, 0.21, 0.27 ], "_id" : "14", "amount" : 110400, "customer" : { "address" : { "address_line" : "1414 14th Pl", "city" : "Brooklyn", "state" : "NY" }, "age" : 44, "credit_score" : 702, "name" : "Kris S.", "phone" : "123-456-1144" }, "items" : [ { "car" : "Tesla Model X", "color" : "White" } ], "purchase_date" : { "$date" : 1698513091 }, "purchase_type" : "In Person", "seller" : { "location" : "Brooklyn NYC", "name" : "Jasmine S." }, "status" : "active" } } }
Iterate over documents in a collection matching a given filter.
- Python
-
doc_iterator = collection.find({"category": "house_appliance"}, limit=10)
Iterate over the documents most similar to a given query vector.
doc_iterator = collection.find({}, vector=[0.55, -0.40, 0.08], limit=5)
Returns:
Cursor - A cursor for iterating over documents. An AstraPy cursor can be used in a for loop, and provides a few additional features.
Example response
Cursor("vector_collection", new, retrieved: 0)
Parameters:
| Name | Type | Description |
|---|---|---|
| filter | Optional[Dict[str, Any]] | A predicate expressed as a dictionary according to the Data API filter syntax. Examples are {}, {"name": "John"}, {"price": {"$lte": 100}}, {"$and": [{"name": "John"}, {"price": {"$lte": 100}}]}. See Data API operators for the full list of operators. |
| projection | Optional[Union[Iterable[str], Dict[str, bool]]] | Used to select a subset of fields in the documents being returned. The projection can be: an iterable over the field names to return; a dictionary {field_name: True} to positively select certain fields; or a dictionary {field_name: False} if one wants to discard some fields from the response. The default is to return the whole documents. |
| skip | Optional[int] | With this integer parameter, what would be the first skip documents returned by the query are discarded, and the results start from the (skip+1)-th document. This parameter can be used only in conjunction with an explicit sort criterion of the ascending/descending type (i.e. it cannot be used when not sorting, nor with vector-based ANN search). |
| limit | Optional[int] | This (integer) parameter sets a limit over how many documents are returned. Once limit is reached (or the cursor is exhausted for lack of matching documents), nothing more is returned. |
| vector | Optional[Iterable[float]] | A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to perform vector search; that is, Approximate Nearest Neighbors (ANN) search. When running similarity search on a collection, no other sorting criteria can be specified. Moreover, there is an upper bound to the number of documents that can be returned. For details, see the Data API Limits. |
| include_similarity | Optional[bool] | A boolean to request the numeric value of the similarity to be returned as an added "$similarity" key in each returned document. Can only be used for vector ANN search, i.e. when either vector is supplied or the sort parameter has the shape {"$vector": …}. |
| sort | Optional[Dict[str, Any]] | With this dictionary parameter one can control the order in which the documents are returned. See the Note about sorting, as well as the one about upper bounds, for details. |
| max_time_ms | Optional[int] | A timeout, in milliseconds, for each single one of the underlying HTTP requests used to fetch documents as the cursor is iterated over. |
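The interplay of sort, skip, and limit described in the parameter table can be emulated on an in-memory list. The sketch below is a local illustration only: local_find is a hypothetical function, ASCENDING and DESCENDING are local markers rather than the AstraPy constants, and every document is assumed to contain all sort fields.

```python
from typing import Any, Dict, List, Optional

ASCENDING, DESCENDING = 1, -1  # local markers, not imported from AstraPy


def local_find(
    documents: List[Dict[str, Any]],
    sort: Dict[str, int],
    skip: int = 0,
    limit: Optional[int] = None,
) -> List[Dict[str, Any]]:
    """Hypothetical in-memory emulation of sort, then skip, then limit."""
    ordered = list(documents)
    # Apply the sort keys from last to first: Python's sort is stable,
    # so the first key ends up being the primary criterion.
    for field, direction in reversed(list(sort.items())):
        ordered.sort(key=lambda doc: doc[field], reverse=(direction == DESCENDING))
    end = None if limit is None else skip + limit
    return ordered[skip:end]


docs = [
    {"field": 2, "subfield": 1},
    {"field": 1, "subfield": 2},
    {"field": 1, "subfield": 1},
    {"field": 2, "subfield": 0},
]
page = local_find(docs, sort={"field": ASCENDING, "subfield": ASCENDING}, skip=1, limit=2)
print(page)  # [{'field': 1, 'subfield': 2}, {'field': 2, 'subfield': 0}]
```

The emulation requires a sort on purpose, mirroring the rule that skip is only meaningful together with an explicit ascending/descending sort.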
- TypeScript
-
TBD
- Java
-
TBD
- cURL
-
TBD
- Python
-
When no particular order is required:
sort={} # (default when parameter not provided)
When sorting by a certain value in ascending/descending order:
from astrapy.constants import SortDocuments
sort={"field": SortDocuments.ASCENDING}
sort={"field": SortDocuments.DESCENDING}
When sorting first by "field" and then by "subfield" (while modern Python versions preserve the order of dictionaries, it is suggested for clarity to employ a collections.OrderedDict in these cases):
sort={
    "field": SortDocuments.ASCENDING,
    "subfield": SortDocuments.ASCENDING,
}
When running a vector similarity (ANN) search:
sort={"$vector": [0.4, 0.15, -0.5]}
Note: Some combinations of arguments impose an implicit upper bound on the number of documents that are returned by the Data API. More specifically:
- Vector ANN searches cannot return more than a certain number of documents; currently, 1000 per search operation.
- When using a sort criterion of the ascending/descending type, the Data API returns a smaller number of documents, currently set to 20, and stops there. The returned documents are the top results across the whole collection according to the requested criterion.
Keep in mind these provisions even when subsequently running a command such as .distinct() on a cursor.
Tip: When not specifying sorting criteria at all (by vector or otherwise), the cursor can scroll through an arbitrary number of documents as the Data API and the client periodically exchange new chunks of documents.
The behavior of the cursor, in the case that documents have been added/removed after the find was started, depends on database internals. It is neither guaranteed nor excluded that such "real-time" changes in the data would be picked up by the cursor.
Example:
from astrapy import DataAPIClient
import astrapy
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
filter = {"seq": {"$exists": True}}
for doc in collection.find(filter, projection={"seq": True}, limit=5):
    print(doc["seq"])
# will print e.g.:
#   37
#   35
#   10
#   36
#   27
cursor1 = collection.find(
    {},
    limit=4,
    sort={"seq": astrapy.constants.SortDocuments.DESCENDING},
)
[doc["_id"] for doc in cursor1]
# prints: ['97e85f81-...', '1581efe4-...', '...', '...']
cursor2 = collection.find({}, limit=3)
cursor2.distinct("seq")
# prints: [37, 35, 10]
collection.insert_many([
    {"tag": "A", "$vector": [4, 5]},
    {"tag": "B", "$vector": [3, 4]},
    {"tag": "C", "$vector": [3, 2]},
    {"tag": "D", "$vector": [4, 1]},
    {"tag": "E", "$vector": [2, 5]},
])
ann_tags = [
    document["tag"]
    for document in collection.find(
        {},
        limit=3,
        vector=[3, 3],
    )
]
ann_tags
# prints: ['A', 'B', 'C']
# (assuming the collection has metric VectorMetric.COSINE)
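To see why the vector query in the example yields the tags A, B and C under the cosine metric, the ranking can be reproduced locally. The following is a minimal brute-force sketch in plain Python; it is not how the Data API performs ANN search, and `cosine_similarity` and `rank_by_similarity` are hypothetical helper names introduced only for illustration.

```python
import math
from typing import Any, Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    documents: List[Dict[str, Any]], query: Sequence[float], limit: int
) -> List[Dict[str, Any]]:
    """Exact (brute-force) ranking, most similar documents first."""
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(doc["$vector"], query),
        reverse=True,
    )
    return ranked[:limit]


# Same documents and query vector as in the example above.
documents = [
    {"tag": "A", "$vector": [4, 5]},
    {"tag": "B", "$vector": [3, 4]},
    {"tag": "C", "$vector": [3, 2]},
    {"tag": "D", "$vector": [4, 1]},
    {"tag": "E", "$vector": [2, 5]},
]
top_tags = [doc["tag"] for doc in rank_by_similarity(documents, [3, 3], limit=3)]
print(top_tags)  # ['A', 'B', 'C']
```

With a different metric (for instance Euclidean distance) the ordering could differ, which is why the example states its assumption about the collection metric.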
-
- TypeScript
-
Hello TypeScript world. TBD.
- Java
-
Hello Java world. TBD.
- cURL
-
Hello cURL world. TBD.
Locate a document matching a filter condition and apply changes to it, returning the document itself.
- Python
-
collection.find_one_and_update( {"Marco": {"$exists": True}}, {"$set": {"title": "Mr."}}, )
Locate and update a document, returning the document itself, creating a new one if nothing is found.
collection.find_one_and_update( {"Marco": {"$exists": True}}, {"$set": {"title": "Mr."}}, upsert=True, )
Returns:
Dict[str, Any] - The document that was found, either before or after the update (or a projection thereof, as requested). If no matches are found, None is returned.
Example response
{'_id': 999, 'Marco': 'Polo'}
Parameters:
Name Type Description
filter
Dict[str, Any]
A predicate expressed as a dictionary according to the Data API filter syntax. Examples are {}, {"name": "John"}, {"price": {"$lte": 100}}, {"$and": [{"name": "John"}, {"price": {"$lte": 100}}]}. See Data API operators for the full list of operators.
update
Dict[str, Any]
The update prescription to apply to the document, expressed as a dictionary as per Data API syntax. Examples are: {"$set": {"field": "value"}}, {"$inc": {"counter": 10}} and {"$unset": {"field": ""}}. See Data API operators for the full syntax.
projection
Optional[ProjectionType]
Used to select a subset of fields in the document being returned. The projection can be: an iterable over the field names to return; a dictionary {field_name: True} to positively select certain fields; or a dictionary {field_name: False} if one wants to discard some fields from the response. The default is to return the whole document.
vector
Optional[Iterable[float]]
A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to use vector search (that is, Approximate Nearest Neighbors, or ANN, search) as the sorting criterion. In this way, the matched document (if any) will be the one that is most similar to the provided vector. This parameter cannot be used together with sort. See the find method for more details on this parameter.
sort
Optional[Dict[str, Any]]
With this dictionary parameter one can control the sorting order of the documents matching the filter, effectively determining what document will come first and hence be the updated one. See the find method for more on sorting.
upsert
bool = False
This parameter controls the behavior in absence of matches. If True, a new document (resulting from applying the update to an empty document) is inserted if no matches are found on the collection. If False, the operation silently does nothing in case of no matches.
return_document
str
A flag controlling what document is returned: if set to ReturnDocument.BEFORE, or the string "before", the document found on database is returned; if set to ReturnDocument.AFTER, or the string "after", the new document is returned. The default is "before".
max_time_ms
Optional[int]
A timeout, in milliseconds, for the underlying HTTP request.
Example:
from astrapy import DataAPIClient
import astrapy
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
collection.insert_one({"Marco": "Polo"})
collection.find_one_and_update(
    {"Marco": {"$exists": True}},
    {"$set": {"title": "Mr."}},
)
# prints: {'_id': 'a80106f2-...', 'Marco': 'Polo'}
collection.find_one_and_update(
    {"title": "Mr."},
    {"$inc": {"rank": 3}},
    projection=["title", "rank"],
    return_document=astrapy.constants.ReturnDocument.AFTER,
)
# prints: {'_id': 'a80106f2-...', 'title': 'Mr.', 'rank': 3}
collection.find_one_and_update(
    {"name": "Johnny"},
    {"$set": {"rank": 0}},
    return_document=astrapy.constants.ReturnDocument.AFTER,
)
# (returns None for no matches)
collection.find_one_and_update(
    {"name": "Johnny"},
    {"$set": {"rank": 0}},
    upsert=True,
    return_document=astrapy.constants.ReturnDocument.AFTER,
)
# prints: {'_id': 'cb4ef2ab-...', 'name': 'Johnny', 'rank': 0}
- TypeScript
-
Hello TypeScript world. TBD.
- Java
-
Hello Java world. TBD.
- cURL
-
Hello DevOps cURL world. TBD.
Update a single document on the collection as requested.
- Python
-
update_result = collection.update_one( {"_id": 456}, {"$set": {"name": "John Smith"}}, )
Update a single document on the collection, inserting a new one if no match is found.
update_result = collection.update_one( {"_id": 456}, {"$set": {"name": "John Smith"}}, upsert=True, )
Returns:
UpdateResult - An object representing the response from the database after the update operation. It includes information about the operation.
Example response
UpdateResult(raw_results=[{'data': {'document': {'_id': '1', 'name': 'John Doe'}}, 'status': {'matchedCount': 1, 'modifiedCount': 1}}], update_info={'n': 1, 'updatedExisting': True, 'ok': 1.0, 'nModified': 1})
Parameters:
Name Type Description
filter
Dict[str, Any]
A predicate expressed as a dictionary according to the Data API filter syntax. Examples are {}, {"name": "John"}, {"price": {"$lte": 100}}, {"$and": [{"name": "John"}, {"price": {"$lte": 100}}]}. See Data API operators for the full list of operators.
update
Dict[str, Any]
The update prescription to apply to the document, expressed as a dictionary as per Data API syntax. Examples are: {"$set": {"field": "value"}}, {"$inc": {"counter": 10}} and {"$unset": {"field": ""}}. See Data API operators for the full syntax.
vector
Optional[Iterable[float]]
A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to use vector search (that is, Approximate Nearest Neighbors, or ANN, search) as the sorting criterion. In this way, the matched document (if any) will be the one that is most similar to the provided vector. This parameter cannot be used together with sort. See the find method for more details on this parameter.
sort
Optional[Dict[str, Any]]
With this dictionary parameter one can control the sorting order of the documents matching the filter, effectively determining what document will come first and hence be the updated one. See the find method for more on sorting.
upsert
bool = False
This parameter controls the behavior in absence of matches. If True, a new document (resulting from applying the update to an empty document) is inserted if no matches are found on the collection. If False, the operation silently does nothing in case of no matches.
max_time_ms
Optional[int]
A timeout, in milliseconds, for the underlying HTTP request.
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
collection.insert_one({"Marco": "Polo"})
collection.update_one({"Marco": {"$exists": True}}, {"$inc": {"rank": 3}})
# prints: UpdateResult(raw_results=..., update_info={'n': 1, 'updatedExisting': True, 'ok': 1.0, 'nModified': 1})
collection.update_one({"Mirko": {"$exists": True}}, {"$inc": {"rank": 3}})
# prints: UpdateResult(raw_results=..., update_info={'n': 0, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0})
collection.update_one(
    {"Mirko": {"$exists": True}},
    {"$inc": {"rank": 3}},
    upsert=True,
)
# prints: UpdateResult(raw_results=..., update_info={'n': 1, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0, 'upserted': '2a45ff60-...'})
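The upsert rule for update_one (a new document resulting from applying the update to an empty document) can be illustrated locally. The sketch below is a plain-Python approximation, not the Data API implementation: `apply_update` is a hypothetical helper, it handles only top-level fields for the three operators named in the parameter description, and it ignores the _id that the database assigns to the new document.

```python
import copy
from typing import Any, Dict


def apply_update(document: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `document` with $set, $inc and $unset applied."""
    result = copy.deepcopy(document)
    for field, value in update.get("$set", {}).items():
        result[field] = value
    for field, amount in update.get("$inc", {}).items():
        # Assumption: a missing field counts as 0 before incrementing.
        result[field] = result.get(field, 0) + amount
    for field in update.get("$unset", {}):
        result.pop(field, None)
    return result


# Upsert with no match: the update is applied to an empty document.
upserted = apply_update({}, {"$inc": {"rank": 3}})
print(upserted)  # {'rank': 3}

# Regular update of an existing document (the input is left untouched).
original = {"Marco": "Polo"}
updated = apply_update(original, {"$set": {"title": "Mr."}})
print(updated)  # {'Marco': 'Polo', 'title': 'Mr.'}
```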
- TypeScript
-
Hello TypeScript world. TBD.
- Java
-
Hello Java world. TBD.
- cURL
-
Hello cURL world. TBD.
Update multiple documents in a collection.
- Python
-
results = collection.update_many( {"name": {"$exists": False}}, {"$set": {"name": "unknown"}}, )
Update multiple documents in a collection, inserting a new one if no matches are found.
results = collection.update_many( {"name": {"$exists": False}}, {"$set": {"name": "unknown"}}, upsert=True, )
Returns:
UpdateResult - An object representing the response from the database after the update operation. It includes information about the operation.
Example response
UpdateResult(raw_results=[{'status': {'matchedCount': 2, 'modifiedCount': 2}}], update_info={'n': 2, 'updatedExisting': True, 'ok': 1.0, 'nModified': 2})
Parameters:
Name Type Description
filter
Dict[str, Any]
A predicate expressed as a dictionary according to the Data API filter syntax. Examples are {}, {"name": "John"}, {"price": {"$lte": 100}}, {"$and": [{"name": "John"}, {"price": {"$lte": 100}}]}. See Data API operators for the full list of operators.
update
Dict[str, Any]
The update prescription to apply to the documents, expressed as a dictionary as per Data API syntax. Examples are: {"$set": {"field": "value"}}, {"$inc": {"counter": 10}} and {"$unset": {"field": ""}}. See Data API operators for the full syntax.
upsert
bool = False
This parameter controls the behavior in absence of matches. If True, a single new document (resulting from applying update to an empty document) is inserted if no matches are found on the collection. If False, the operation silently does nothing in case of no matches.
max_time_ms
Optional[int]
A timeout, in milliseconds, for the operation.
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
collection.insert_many([{"c": "red"}, {"c": "green"}, {"c": "blue"}])
collection.update_many({"c": {"$ne": "green"}}, {"$set": {"nongreen": True}})
# prints: UpdateResult(raw_results=..., update_info={'n': 2, 'updatedExisting': True, 'ok': 1.0, 'nModified': 2})
collection.update_many({"c": "orange"}, {"$set": {"is_also_fruit": True}})
# prints: UpdateResult(raw_results=..., update_info={'n': 0, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0})
collection.update_many(
    {"c": "orange"},
    {"$set": {"is_also_fruit": True}},
    upsert=True,
)
# prints: UpdateResult(raw_results=..., update_info={'n': 1, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0, 'upserted': '46643050-...'})
- TypeScript
-
Hello TypeScript world. TBD.
- Java
-
Hello Java world. TBD.
- cURL
-
Use the Data API updateMany command to update multiple documents in a collection.
In this example, the JSON payload uses the $set update operator to change a status to "inactive" for those documents that have an "active" status.
The updateMany command includes pagination support in case more documents matching the filter are on a subsequent page. For more, see the pagination note after the cURL example.
Example:
curl -s --location \
--request POST ${ASTRA_DB_API_ENDPOINT}/api/json/v1/${ASTRA_DB_KEYSPACE}/${ASTRA_DB_COLLECTION} \
--header "Token: ${ASTRA_DB_APPLICATION_TOKEN}" \
--header "Content-Type: application/json" \
--header "Accept: application/json" \
--data '{
  "updateMany": {
    "filter": {"status" : "active" },
    "update" : {"$set" : { "status" : "inactive"}}
  }
}' | json_pp
Result:
{
   "status" : {
      "matchedCount" : 20,
      "modifiedCount" : 20,
      "moreData" : true
   }
}
Name Type Description
updateMany
command
Updates multiple documents in the database’s collection.
filter
object
Defines the criteria for selecting documents to which the command applies. The filter looks for documents where:
* status: The key being evaluated in each document; a property within the documents in the database.
* active: The value that the status property must match for the document to be selected. In this case, it’s targeting documents that currently have a status of active.
update
object
Specifies the modifications to be applied to all documents that match the criteria set by the filter.
$set
operator
An update operator indicating that the operation should overwrite the value of a property (or properties) in the selected documents.
status
String
Specifies the property in the document to update. In this example, the status of all selected documents changes from active to inactive.
Note
In the updateMany response, check whether a nextPageState ID was returned. The updateMany command includes pagination support: you can update one page of matching documents at a time. If there is a subsequent page with matching documents to update, the response returns a nextPageState ID. You would then submit the updateMany command again and include the pageState ID in the new request to update the next page of documents that matched the filter:
{
  "updateMany": {
    "filter": { "active_user": true },
    "update": { "$set": { "new_data": "new_data_value" } },
    "options": { "pageState": "<id-value-from-prior-response>" }
  }
}
Follow the sequence of one or more updateMany commands until all pages with documents matching the filter have the update applied.
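The pagination sequence described in the note can be sketched as a loop. This is a minimal illustration rather than a reference client: `update_many_all_pages` and its `post_command` argument are hypothetical names, `post_command` stands for whatever function sends the JSON payload to the collection endpoint and returns the parsed response, and the sketch assumes the next-page ID is reported as `nextPageState` inside the `status` object.

```python
from typing import Any, Callable, Dict, Optional


def update_many_all_pages(
    post_command: Callable[[Dict[str, Any]], Dict[str, Any]],
    filter: Dict[str, Any],
    update: Dict[str, Any],
) -> int:
    """Resubmit updateMany until no further page is reported.

    Returns the sum of modifiedCount over all pages.
    """
    total_modified = 0
    page_state: Optional[str] = None
    while True:
        command: Dict[str, Any] = {"filter": filter, "update": update}
        if page_state is not None:
            # Pass the ID from the prior response to continue with the next page.
            command["options"] = {"pageState": page_state}
        response = post_command({"updateMany": command})
        status = response.get("status", {})
        total_modified += status.get("modifiedCount", 0)
        # Assumption: the next-page ID is found under status.nextPageState.
        page_state = status.get("nextPageState")
        if page_state is None:
            break
    return total_modified
```

Injecting `post_command` keeps the loop independent of any particular HTTP library, so it can be exercised with a stub before being pointed at a real endpoint.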
Get a list of the distinct values of a certain key in a collection.
- Python
-
collection.distinct("category")
Get the distinct values in a subset of documents, with a key defined by a dot-syntax path.
collection.distinct( "food.allergies", filter={"registered_for_dinner": True}, )
Returns:
List[Any] - A list of the distinct values encountered. Documents that lack the requested key are ignored.
Example response
['home_appliance', None, 'sports_equipment', {'cat_id': 54, 'cat_name': 'gardening_gear'}]
Parameters:
Name Type Description
key
str
The name of the field whose value is inspected across documents. Keys can use dot-notation to descend to deeper document levels. Examples of acceptable key values: "field", "field.subfield", "field.3", and "field.3.subfield". If lists are encountered and no numeric index is specified, all items in the list are visited.
filter
Optional[Dict[str, Any]]
A predicate expressed as a dictionary according to the Data API filter syntax. Examples are {}, {"name": "John"}, {"price": {"$lte": 100}}, {"$and": [{"name": "John"}, {"price": {"$lte": 100}}]}. See Data API operators for the full list of operators.
max_time_ms
Optional[int]
A timeout, in milliseconds, for the operation.
Note
Keep in mind that distinct is a client-side operation, which effectively browses all required documents using the logic of the find method and collects the unique values found for key. As such, there may be performance, latency and ultimately billing implications if the number of matching documents is large.
For details on the behavior of "distinct" in conjunction with real-time changes in the collection contents, see the discussion in the Sort examples values section.
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
collection.insert_many(
    [
        {"name": "Marco", "food": ["apple", "orange"], "city": "Helsinki"},
        {"name": "Emma", "food": {"likes_fruit": True, "allergies": []}},
    ]
)
collection.distinct("name")
# prints: ['Marco', 'Emma']
collection.distinct("city")
# prints: ['Helsinki']
collection.distinct("food")
# prints: ['apple', 'orange', {'likes_fruit': True, 'allergies': []}]
collection.distinct("food.1")
# prints: ['orange']
collection.distinct("food.allergies")
# prints: []
collection.distinct("food.likes_fruit")
# prints: [True]
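Because distinct runs on the client, its dot-notation rules can be approximated in a few lines of plain Python. The sketch below is an illustration under stated assumptions, not the astrapy implementation: `distinct_values` and `_extract` are hypothetical helpers, and uniqueness is decided by simple equality so that dictionaries can be compared. It reproduces the results listed in the example above.

```python
from typing import Any, Dict, Iterable, Iterator, List


def _extract(value: Any, path: List[str]) -> Iterator[Any]:
    """Yield every value reachable from `value` by following `path`."""
    if not path:
        # End of the path: a list contributes its items, anything else itself.
        if isinstance(value, list):
            yield from value
        else:
            yield value
        return
    head, rest = path[0], path[1:]
    if isinstance(value, dict):
        if head in value:
            yield from _extract(value[head], rest)
    elif isinstance(value, list):
        if head.isdigit():
            index = int(head)
            if index < len(value):
                yield from _extract(value[index], rest)
        else:
            # No numeric index specified: visit all items in the list.
            for item in value:
                yield from _extract(item, path)


def distinct_values(documents: Iterable[Dict[str, Any]], key: str) -> List[Any]:
    """Collect unique values for `key`; documents lacking the key are ignored."""
    seen: List[Any] = []
    for document in documents:
        for value in _extract(document, key.split(".")):
            if value not in seen:
                seen.append(value)
    return seen


documents = [
    {"name": "Marco", "food": ["apple", "orange"], "city": "Helsinki"},
    {"name": "Emma", "food": {"likes_fruit": True, "allergies": []}},
]
print(distinct_values(documents, "food"))
# ['apple', 'orange', {'likes_fruit': True, 'allergies': []}]
print(distinct_values(documents, "food.1"))  # ['orange']
```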
- TypeScript
-
Hello TypeScript world. TBD.
- Java
-
Hello Java world. TBD.
- cURL
-
Hello cURL world. TBD.
Get the count of all documents in a collection.
- Python
-
collection.count_documents({}, upper_bound=500)
Get the count of the documents in a collection matching a condition.
collection.count_documents({"seq":{"$gt": 15}}, upper_bound=50)
Returns:
int - The exact count of the documents counted as requested, unless it exceeds the caller-provided or API-set upper bound. In case of overflow, an exception is raised.
Example response
320
Parameters:
Name Type Description
filter
Optional[Dict[str, Any]]
A predicate expressed as a dictionary according to the Data API filter syntax. Examples are {}, {"name": "John"}, {"price": {"$lte": 100}}, {"$and": [{"name": "John"}, {"price": {"$lte": 100}}]}. See Data API operators for the full list of operators.
upper_bound
int
A required ceiling on the result of the count operation. If the actual number of documents exceeds this value, an exception will be raised. Furthermore, if the actual number of documents exceeds the maximum count that the Data API can reach (regardless of upper_bound), an exception will be raised.
max_time_ms
Optional[int]
A timeout, in milliseconds, for the underlying HTTP request.
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
collection.insert_many([{"seq": i} for i in range(20)])
collection.count_documents({}, upper_bound=100)
# prints: 20
collection.count_documents({"seq": {"$gt": 15}}, upper_bound=100)
# prints: 4
collection.count_documents({}, upper_bound=10)
# Raises: astrapy.exceptions.TooManyDocumentsToCountException
- TypeScript
-
Hello TypeScript world. TBD.
- Java
-
Hello Java world. TBD.
- cURL
-
Use the Data API estimatedDocumentCount command to return the approximate number of documents in the collection.
Tip
In the estimatedDocumentCount command’s response, the document count is based on the current system statistics at the time the request was received by the database server. Due to potentially in-progress updates (document additions and/or deletions), the actual number of documents in the collection may be lower or higher in the database.
curl -s --location \
--request POST ${ASTRA_DB_API_ENDPOINT}/api/json/v1/${ASTRA_DB_KEYSPACE}/${ASTRA_DB_COLLECTION} \
--header "Token: ${ASTRA_DB_APPLICATION_TOKEN}" \
--header "Content-Type: application/json" \
--header "Accept: application/json" \
--data '{
  "estimatedDocumentCount": {
  }
}' | json_pp
Result:
{
   "status": {
      "count": 21
   }
}
Properties:
Name Type Description
estimatedDocumentCount
command
Returns an estimated count of documents within the context of the specified collection.
The object is empty ({ }), meaning there are no filters or options for this implementation of the estimatedDocumentCount command.
Locate a document matching a filter condition and replace it with a new document, returning the document itself.
- Python
-
collection.find_one_and_replace( {"_id": "rule1"}, {"text": "some animals are more equal!"}, )
Locate and replace a document, returning the document itself, additionally creating it if nothing is found.
collection.find_one_and_replace( {"_id": "rule1"}, {"text": "some animals are more equal!"}, upsert=True, )
Returns:
Dict[str, Any] - The document that was found, either before or after the replacement (or a projection thereof, as requested). If no matches are found, None is returned.
Example response
{'_id': 'rule1', 'text': 'all animals are equal'}
Parameters:
Name Type Description
filter
Dict[str, Any]
A predicate expressed as a dictionary according to the Data API filter syntax. Examples are {}, {"name": "John"}, {"price": {"$lte": 100}}, {"$and": [{"name": "John"}, {"price": {"$lte": 100}}]}. See Data API operators for the full list of operators.
replacement
Dict[str, Any]
The new document to write into the collection.
projection
Optional[ProjectionType]
Used to select a subset of fields in the document being returned. The projection can be: an iterable over the field names to return; a dictionary {field_name: True} to positively select certain fields; or a dictionary {field_name: False} if one wants to discard some fields from the response. The default is to return the whole document.
vector
Optional[Iterable[float]]
A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to use vector search (that is, Approximate Nearest Neighbors, or ANN, search) as the sorting criterion. In this way, the matched document (if any) will be the one that is most similar to the provided vector. This parameter cannot be used together with sort. See the find method for more details on this parameter.
sort
Optional[Dict[str, Any]]
With this dictionary parameter one can control the sorting order of the documents matching the filter, effectively determining what document will come first and hence be the replaced one. See the find method for more on sorting.
upsert
bool = False
This parameter controls the behavior in absence of matches. If True, replacement is inserted as a new document if no matches are found on the collection. If False, the operation silently does nothing in case of no matches.
return_document
str
A flag controlling what document is returned: if set to ReturnDocument.BEFORE, or the string "before", the document found on database is returned; if set to ReturnDocument.AFTER, or the string "after", the new document is returned. The default is "before".
max_time_ms
Optional[int]
A timeout, in milliseconds, for the underlying HTTP request.
Example:
from astrapy import DataAPIClient
import astrapy
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
collection.insert_one({"_id": "rule1", "text": "all animals are equal"})
collection.find_one_and_replace(
    {"_id": "rule1"},
    {"text": "some animals are more equal!"},
)
# prints: {'_id': 'rule1', 'text': 'all animals are equal'}
collection.find_one_and_replace(
    {"text": "some animals are more equal!"},
    {"text": "and the pigs are the rulers"},
    return_document=astrapy.constants.ReturnDocument.AFTER,
)
# prints: {'_id': 'rule1', 'text': 'and the pigs are the rulers'}
collection.find_one_and_replace(
    {"_id": "rule2"},
    {"text": "F=ma^2"},
    return_document=astrapy.constants.ReturnDocument.AFTER,
)
# (returns None for no matches)
collection.find_one_and_replace(
    {"_id": "rule2"},
    {"text": "F=ma"},
    upsert=True,
    return_document=astrapy.constants.ReturnDocument.AFTER,
    projection={"_id": False},
)
# prints: {'text': 'F=ma'}
- TypeScript
-
Hello TypeScript world. TBD.
- Java
-
Hello Java world. TBD.
- cURL
-
Hello cURL world. TBD.
Replace a document in the collection with a new one.
- Python
-
replace_result = collection.replace_one( {"Marco": {"$exists": True}}, {"Buda": "Pest"}, )
Replace a document in the collection with a new one, creating a new one if no match is found.
replace_result = collection.replace_one( {"Marco": {"$exists": True}}, {"Buda": "Pest"}, upsert=True, )
Returns:
UpdateResult - An object representing the response from the database after the replace operation. It includes information about the operation.
Example response
UpdateResult(raw_results=[{'data': {'document': {'_id': '1', 'Marco': 'Polo'}}, 'status': {'matchedCount': 1, 'modifiedCount': 1}}], update_info={'n': 1, 'updatedExisting': True, 'ok': 1.0, 'nModified': 1})
Parameters:
Name Type Description
filter
Dict[str, Any]
A predicate expressed as a dictionary according to the Data API filter syntax. Examples are {}, {"name": "John"}, {"price": {"$lte": 100}}, {"$and": [{"name": "John"}, {"price": {"$lte": 100}}]}. See Data API operators for the full list of operators.
replacement
Dict[str, Any]
The new document to write into the collection.
vector
Optional[Iterable[float]]
A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to use vector search (that is, Approximate Nearest Neighbors, or ANN, search) as the sorting criterion. In this way, the matched document (if any) will be the one that is most similar to the provided vector. This parameter cannot be used together with sort. See the find method for more details on this parameter.
sort
Optional[Dict[str, Any]]
With this dictionary parameter one can control the sorting order of the documents matching the filter, effectively determining what document will come first and hence be the replaced one. See the find method for more on sorting.
upsert
bool = False
This parameter controls the behavior in absence of matches. If True, replacement is inserted as a new document if no matches are found on the collection. If False, the operation silently does nothing in case of no matches.
max_time_ms
Optional[int]
A timeout, in milliseconds, for the underlying HTTP request.
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
collection.insert_one({"Marco": "Polo"})
collection.replace_one({"Marco": {"$exists": True}}, {"Buda": "Pest"})
# prints: UpdateResult(raw_results=..., update_info={'n': 1, 'updatedExisting': True, 'ok': 1.0, 'nModified': 1})
collection.find_one({"Buda": "Pest"})
# prints: {'_id': '8424905a-...', 'Buda': 'Pest'}
collection.replace_one({"Mirco": {"$exists": True}}, {"Oh": "yeah?"})
# prints: UpdateResult(raw_results=..., update_info={'n': 0, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0})
collection.replace_one({"Mirco": {"$exists": True}}, {"Oh": "yeah?"}, upsert=True)
# prints: UpdateResult(raw_results=..., update_info={'n': 1, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0, 'upserted': '931b47d6-...'})
- TypeScript
-
Hello TypeScript world. TBD.
- Java
-
Hello Java world. TBD.
- cURL
-
Hello cURL world. TBD.
Locate a document matching a filter condition and delete it, returning the document itself.
- Python
-
collection.find_one_and_delete({"status": "stale_entry"})
Returns:
Dict[str, Any] - The document that was just deleted (or a projection thereof, as requested). If no matches are found, None is returned.
Example response
{'_id': 199, 'status': 'stale_entry', 'request_id': 'A4431'}
Parameters:
Name Type Description
filter
Optional[Dict[str, Any]]
A predicate expressed as a dictionary according to the Data API filter syntax. Examples are {}, {"name": "John"}, {"price": {"$lte": 100}}, {"$and": [{"name": "John"}, {"price": {"$lte": 100}}]}. See Data API operators for the full list of operators.
projection
Optional[Union[Iterable[str], Dict[str, bool]]]
Used to select a subset of fields in the document being returned. The projection can be: an iterable over the field names to return; a dictionary {field_name: True} to positively select certain fields; or a dictionary {field_name: False} if one wants to discard some fields from the response. The default is to return the whole document.
vector
Optional[Iterable[float]]
A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to perform vector search (that is, Approximate Nearest Neighbors, or ANN, search), extracting the most similar document in the collection matching the filter. This parameter cannot be used together with sort. See the find method for more details on this parameter.
sort
Optional[Dict[str, Any]]
With this dictionary parameter one can control the sorting order of the documents matching the filter, effectively determining what document will come first and hence be the deleted one. See the find method for more on sorting.
max_time_ms
Optional[int]
A timeout, in milliseconds, for the underlying HTTP request.
Example:
from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
collection.insert_many(
    [
        {"species": "swan", "class": "Aves"},
        {"species": "frog", "class": "Amphibia"},
    ],
)
collection.find_one_and_delete(
    {"species": {"$ne": "frog"}},
    projection=["species"],
)
# prints: {'_id': '5997fb48-...', 'species': 'swan'}
collection.find_one_and_delete({"species": {"$ne": "frog"}})
# (returns None for no matches)
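The three projection forms described in the parameter table (an iterable of field names, a {field_name: True} dictionary, or a {field_name: False} dictionary) can be mimicked locally to see what shape of document to expect. The following is a rough sketch for top-level fields only; `project` is a hypothetical helper, not part of astrapy, and it assumes, as the example outputs on this page suggest, that _id is returned unless explicitly switched off. Mixed or _id-only projections are not modeled precisely.

```python
from typing import Any, Dict, Iterable, Union

ProjectionSpec = Union[Iterable[str], Dict[str, bool]]


def project(document: Dict[str, Any], projection: ProjectionSpec) -> Dict[str, Any]:
    """Apply an inclusion or exclusion projection to top-level fields."""
    if not isinstance(projection, dict):
        # An iterable of names is shorthand for {name: True, ...}.
        projection = {field: True for field in projection}
    keep_id = projection.get("_id", True)
    included = {f for f, flag in projection.items() if flag and f != "_id"}
    excluded = {f for f, flag in projection.items() if not flag}
    if included:
        # Inclusion mode: only the selected fields, plus "_id" unless switched off.
        result = {f: v for f, v in document.items() if f in included}
        if keep_id and "_id" in document:
            result = {"_id": document["_id"], **result}
    else:
        # Exclusion mode: everything except the discarded fields.
        result = {f: v for f, v in document.items() if f not in excluded}
    return result


swan = {"_id": "5997fb48-...", "species": "swan", "class": "Aves"}
print(project(swan, ["species"]))
# {'_id': '5997fb48-...', 'species': 'swan'}
print(project({"_id": "rule2", "text": "F=ma"}, {"_id": False}))
# {'text': 'F=ma'}
```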
- TypeScript
-
Hello TypeScript world. TBD.
- Java
-
Hello Java world. TBD.
- cURL
-
Hello cURL world. TBD.
Locate and delete a single document from a collection.
- Python
-
response = collection.delete_one({ "_id": "1" })
Locate and delete a single document from a collection by any attribute (as long as it is covered by the collection’s indexing configuration).
delete_result = collection.delete_one({"location": "warehouse_C"})
Locate and delete a single document from a collection by an arbitrary filtering clause.
delete_result = collection.delete_one({"tag": {"$exists": True}})
Delete the most similar document to a given vector.
result = collection.delete_one({}, vector=[.12, .52, .32])
Returns:
DeleteResult - An object representing the response from the database after the delete operation. It includes information about the success of the operation.
Example response
DeleteResult(raw_results=[{'status': {'deletedCount': 1}}], deleted_count=1)
Parameters:
Name Type Description
filter
Optional[Dict[str, Any]]
A predicate expressed as a dictionary according to the Data API filter syntax. Examples are {}, {"name": "John"}, {"price": {"$lte": 100}}, {"$and": [{"name": "John"}, {"price": {"$lte": 100}}]}. See Data API operators for the full list of operators.
vector
Optional[Iterable[float]]
A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to use vector search (that is, Approximate Nearest Neighbors, or ANN, search) as the sorting criterion. In this way, the matched document (if any) will be the one that is most similar to the provided vector. This parameter cannot be used together with sort. See the find method for more details on this parameter.
sort
Optional[Dict[str, Any]]
With this dictionary parameter one can control the sorting order of the documents matching the filter, effectively determining what document will come first and hence be the deleted one. See the find method for more on sorting.
max_time_ms
Optional[int]
A timeout, in milliseconds, for the underlying HTTP request.
Example:
from astrapy import DataAPIClient
import astrapy
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection
collection.insert_many([{"seq": 1}, {"seq": 0}, {"seq": 2}])
collection.delete_one({"seq": 1})
# prints: DeleteResult(raw_results=..., deleted_count=1)
collection.distinct("seq")
# prints: [0, 2]
collection.delete_one(
    {"seq": {"$exists": True}},
    sort={"seq": astrapy.constants.SortDocuments.DESCENDING},
)
# prints: DeleteResult(raw_results=..., deleted_count=1)
collection.distinct("seq")
# prints: [0]
collection.delete_one({"seq": 2})
# prints: DeleteResult(raw_results=..., deleted_count=0)
- TypeScript
-
Hello TypeScript world. TBD.
- Java
-
Hello Java world. TBD.
- cURL
-
Hello cURL world. TBD.
Delete multiple documents from a collection.
- Python
-
delete_result = collection.delete_many({"status": "processed"})
Returns:
DeleteResult - An object representing the response from the database after the delete operation. It includes information about the success of the operation.
Example response
DeleteResult(raw_results=[{'status': {'deletedCount': 2}}], deleted_count=2)
Parameters:
Name Type Description
filter
Optional[Dict[str, Any]]
A predicate expressed as a dictionary according to the Data API filter syntax. Examples are {}, {"name": "John"}, {"price": {"$lte": 100}}, {"$and": [{"name": "John"}, {"price": {"$lte": 100}}]}. See Data API operators for the full list of operators. The delete_many method does not accept an empty filter: see delete_all to completely erase all contents of a collection.
max_time_ms
Optional[int]
A timeout, in milliseconds, for the operation.
Note
This method does not accept an empty filter clause: use the delete_all method to delete all documents in the collection.
Example:
from astrapy import DataAPIClient

client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

collection.insert_many([{"seq": 1}, {"seq": 0}, {"seq": 2}])

collection.delete_many({"seq": {"$lte": 1}})
# prints: DeleteResult(raw_results=..., deleted_count=2)
collection.distinct("seq")
# prints: [2]
collection.delete_many({"seq": {"$lte": 1}})
# prints: DeleteResult(raw_results=..., deleted_count=0)
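Because delete_many rejects an empty filter, code that builds filters dynamically may want to catch the empty case before any request is sent. The following is a minimal sketch of such a guard. The helper delete_many_checked is hypothetical (it is not part of astrapy) and assumes only that the collection object exposes delete_many as described above.

```python
from typing import Any, Dict


def delete_many_checked(collection: Any, filter: Dict[str, Any]) -> Any:
    """Hypothetical guard around collection.delete_many (not an astrapy function)."""
    if not filter:
        # An empty filter would target every document, so require an
        # explicit delete_all call for that case instead.
        raise ValueError(
            "Empty filter passed to delete_many; call collection.delete_all() "
            "to erase the whole collection."
        )
    return collection.delete_many(filter)
```

With a non-empty filter, such as {"seq": {"$lte": 1}}, the helper simply forwards the call and returns whatever delete_many returns.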
- TypeScript
-
Hello TypeScript world. TBD.
- Java
-
Hello Java world. TBD.
- cURL
-
Hello cURL world. TBD.
Execute a (reusable) list of write operations on a collection with a single command.
- Python
-
bw_results = collection.bulk_write(
    [
        InsertMany([{"a": 1}, {"a": 2}]),
        ReplaceOne(
            {"z": 9},
            replacement={"z": 9, "replaced": True},
            upsert=True,
        ),
    ],
)
Returns:
BulkWriteResult - A single object summarizing the whole list of requested operations. The keys in the map attributes of the result (when present) are the integer indices of the corresponding operation in the requests iterable.
Example response
BulkWriteResult(bulk_api_results={0: ..., 1: ...}, deleted_count=0, inserted_count=3, matched_count=0, modified_count=0, upserted_count=1, upserted_ids={1: '2addd676-...'})
Parameters:
Name | Type | Description
requests | Iterable[BaseOperation] | An iterable over concrete subclasses of BaseOperation, such as InsertMany or ReplaceOne. Each such object represents an operation ready to be executed on a collection, and is instantiated with the same parameters one would pass to the corresponding collection method.
ordered | bool | Whether to launch the requests one after the other or in arbitrary order, possibly concurrently. For performance reasons, prefer ordered=False when it is compatible with the needs of the application flow.
concurrency | Optional[int] | Maximum number of concurrent operations executing at a given time. It cannot be more than one for ordered bulk writes.
max_time_ms | Optional[int] | A timeout, in milliseconds, for the whole bulk write. If the method call times out, there is no guarantee about what portion of the bulk write has been received and successfully executed by the Data API.
Example:
from astrapy import DataAPIClient
from astrapy.operations import (
    InsertOne,
    InsertMany,
    UpdateOne,
    UpdateMany,
    ReplaceOne,
    DeleteOne,
    DeleteMany,
)

client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

op1 = InsertMany([{"a": 1}, {"a": 2}])
op2 = ReplaceOne({"z": 9}, replacement={"z": 9, "replaced": True}, upsert=True)
collection.bulk_write([op1, op2])
# prints: BulkWriteResult(bulk_api_results={0: ..., 1: ...}, deleted_count=0, inserted_count=3, matched_count=0, modified_count=0, upserted_count=1, upserted_ids={1: '2addd676-...'})
collection.count_documents({}, upper_bound=100)
# prints: 3
collection.distinct("replaced")
# prints: [True]
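Since the map attributes of the result are keyed by the index of each operation in requests, those keys can be used to trace an upserted ID back to the operation that produced it. The sketch below illustrates this with plain Python data. The helper pair_ids_with_requests is hypothetical, the strings are stand-ins for real operation objects, and the requests are assumed to be kept in a list (or another indexable sequence) rather than a one-shot iterable.

```python
from typing import Any, Dict, List, Sequence, Tuple


def pair_ids_with_requests(
    requests: Sequence[Any],
    upserted_ids: Dict[int, Any],
) -> List[Tuple[Any, Any]]:
    """Hypothetical helper: pair each upserted ID with the request at the same index."""
    return [
        (requests[index], upserted_id)
        for index, upserted_id in sorted(upserted_ids.items())
    ]


# Plain strings stand in for the operation objects; the ID is the
# truncated placeholder from the example response above.
requests = ["InsertMany", "ReplaceOne"]
upserted_ids = {1: "2addd676-..."}
print(pair_ids_with_requests(requests, upserted_ids))
# prints: [('ReplaceOne', '2addd676-...')]
```

Here the key 1 points at the second request, the ReplaceOne with upsert=True, which is the only operation in the example that can produce an upserted ID.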
- TypeScript
-
Hello TypeScript world. TBD.
- Java
-
Hello Java world. TBD.
- cURL
-
Hello cURL world. TBD.
Delete all documents in a collection.
- Python
-
result = collection.delete_all()
Returns:
Dict - A dictionary in the form {"ok": 1} if the method succeeds.
Example response
{'ok': 1}
Parameters:
Name | Type | Description
max_time_ms | Optional[int] | A timeout, in milliseconds, for the underlying HTTP request.
Example:
from astrapy import DataAPIClient

client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

collection.distinct("seq")
# prints: [2, 1, 0]
collection.count_documents({}, upper_bound=100)
# prints: 4
collection.delete_all()
# prints: {'ok': 1}
collection.count_documents({}, upper_bound=100)
# prints: 0
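Since delete_all reports success only through the returned dictionary, calling code may want to check that value rather than discard it. The following is a minimal sketch of one way to do so. The wrapper delete_all_checked is hypothetical (it is not part of astrapy) and relies only on the {"ok": 1} response form documented above.

```python
from typing import Any


def delete_all_checked(collection: Any) -> None:
    """Hypothetical wrapper: call delete_all and fail loudly on an unexpected response."""
    result = collection.delete_all()
    # The reference above documents {"ok": 1} as the response on success.
    if result != {"ok": 1}:
        raise RuntimeError(f"Unexpected delete_all response: {result!r}")
```

Whether an exception is the right reaction depends on the application; logging the response and continuing may be preferable in some flows.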
- TypeScript
-
Hello TypeScript world. TBD.
- Java
-
Hello Java world. TBD.
- cURL
-
Hello cURL world. TBD.