Documents reference

📝 NOTE: This content was pulled from PR 308

Documents represent a single row or record of data in Astra DB Serverless.

Use the Collection class to work with documents.

If you haven’t already, consult the Collections reference topic for details on how to get a Collection object.

Working with dates

Python

Date and datetime objects, which are instances of the Python standard library datetime.datetime and datetime.date classes, can be used anywhere in documents.

Read operations from a collection always return the datetime class regardless of whether a date or a datetime was provided in the insertion.

import datetime

from astrapy import DataAPIClient

client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

collection.insert_one({"when": datetime.datetime.now()})
collection.insert_one({"date_of_birth": datetime.date(2000, 1, 1)})

collection.update_one(
    {"registered_at": datetime.date(1999, 11, 14)},
    {"$set": {"message": "happy Sunday!"}},
)

print(
    collection.find_one(
        {"date_of_birth": {"$lt": datetime.date(2001, 1, 1)}},
        projection={"_id": False},
    )
)
# will print:
#    {'date_of_birth': datetime.datetime(2000, 1, 1, 0, 0)}

Note	As shown in the example, read operations from a collection always return the `datetime` class regardless of whether a `date` or a `datetime` was provided in the insertion.
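
The conversion to `datetime` is easier to see with the wire format in mind. The following is a minimal sketch, assuming dates travel as a `{"$date": <milliseconds since the epoch>}` wrapper; the helper names `to_date_payload` and `from_date_payload` are hypothetical and are not part of AstraPy. Once a plain `date` is reduced to a timestamp, nothing records that it was not a `datetime`, so a `datetime` is all that can be rebuilt on the way back.

```python
import datetime

UTC = datetime.timezone.utc


def to_date_payload(value):
    """Hypothetical helper: wrap a date or datetime as {"$date": <ms since epoch>}."""
    # datetime is a subclass of date, so it has to be checked first.
    if isinstance(value, datetime.datetime):
        moment = value if value.tzinfo else value.replace(tzinfo=UTC)
    else:
        # A plain date carries no time of day: midnight UTC is assumed here.
        moment = datetime.datetime(value.year, value.month, value.day, tzinfo=UTC)
    return {"$date": int(moment.timestamp() * 1000)}


def from_date_payload(payload):
    """Hypothetical helper: rebuild a datetime from the wrapper."""
    return datetime.datetime.fromtimestamp(payload["$date"] / 1000, tz=UTC)


payload = to_date_payload(datetime.date(2000, 1, 1))
restored = from_date_payload(payload)
print(payload, type(restored).__name__)
```

The sketch returns timezone-aware values for determinism; the client output shown above is a naive `datetime`.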

TypeScript

Hello TypeScript world. TBD.

Java

Hello Java world. TBD.

cURL

Hello cURL world. TBD.

Working with document IDs

Documents in a collection are always identified by an ID that is unique within the collection. The ID can be any of several types, such as a string, integer, or datetime. However, the uuid and ObjectId types are recommended.

The Data API supports uuid identifiers up to version 8, as well as ObjectId identifiers as provided by the bson library. These can appear anywhere in the document, not only in its _id field. Different types of identifier can appear in different parts of the same document, and they can be used in filtering clauses and update/replace directives just like any other data type.

One of the optional settings of a collection is the "default ID type": that is, it is possible to specify what kind of identifiers the server should supply for documents without an explicit _id field. (For details, see the create_collection method and Data API createCollection command in the Collections reference.) Regardless of the defaultId setting, however, identifiers of any type can be explicitly provided for documents (for example, during insertions) and will be honored by the insertion process.

Python

from astrapy.ids import (
    ObjectId,
    uuid1,
    uuid3,
    uuid4,
    uuid5,
    uuid6,
    uuid7,
    uuid8,
    UUID,
)

AstraPy recognizes uuid versions 1 through 8 (with the exception of 2) as provided by the uuid and uuid6 Python libraries, as well as the ObjectId from the bson package. For convenience, these same utilities are exposed by the astrapy.ids module, as shown in the import statement above.

You can then generate new identifiers with statements such as new_id = uuid8() or new_obj_id = ObjectId(). Keep in mind that all uuid versions are instances of the same class (UUID), which exposes a version property, should you need to access it.

Here is a short example of the concepts:

from astrapy import DataAPIClient
from astrapy.ids import ObjectId, uuid8, UUID
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

collection.insert_one({"_id": uuid8(), "tag": "new_id_v_8"})
collection.insert_one(
    {"_id": UUID("018e77bc-648d-8795-a0e2-1cad0fdd53f5"), "tag": "id_v_8"}
)
collection.insert_one({"_id": ObjectId(), "tag": "new_obj_id"})
collection.insert_one(
    {"_id": ObjectId("6601fb0f83ffc5f51ba22b88"), "tag": "obj_id"}
)
collection.find_one_and_update(
    {"_id": ObjectId("6601fb0f83ffc5f51ba22b88")},
    {"$set": {"item_inventory_id": UUID("1eeeaf80-e333-6613-b42f-f739b95106e6")}},
)
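
The `version` property mentioned earlier can be checked without a database connection. This is a small sketch that uses the standard library `uuid.UUID` class, on the assumption that `astrapy.ids.UUID` behaves the same way (the text describes it as drawn from the `uuid` library). The two literal identifiers are the ones used in the example above.

```python
import uuid

# Identifiers copied from the example above, parsed with the standard library class.
id_v8 = uuid.UUID("018e77bc-648d-8795-a0e2-1cad0fdd53f5")
id_v6 = uuid.UUID("1eeeaf80-e333-6613-b42f-f739b95106e6")
# A freshly generated random identifier for comparison.
fresh = uuid.uuid4()

for identifier in (id_v8, id_v6, fresh):
    # Every identifier is an instance of the same class; only the version differs.
    print(type(identifier).__name__, identifier, "version", identifier.version)

versions = [id_v8.version, id_v6.version, fresh.version]
```

Because all versions share one class, code that needs to treat them differently has to branch on `version` rather than on the type.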

TypeScript

Hello TypeScript world. TBD.

Java

Hello Java world. TBD.

cURL

Hello cURL world. TBD.

Insert a single document

Insert a single document into a collection.

Python

insert_result = collection.insert_one({"name": "Jane Doe"})

Insert a document with an associated vector.

insert_result = collection.insert_one(
    {"name": "Jane Doe"},
    vector=[.08, .68, .30],
)

Parameters:

Name	Type	Description
document	`Dict[str, Any]`	The dictionary expressing the document to insert. The `_id` field of the document can be left out, in which case it is created automatically.
vector	`Optional[Iterable[float]]`	A vector (a list of numbers appropriate for the collection) for the document. Passing this parameter is equivalent to providing the vector in the "$vector" field of the document itself; however, the two are mutually exclusive.
max_time_ms	`Optional[int]`	A timeout, in milliseconds, for the underlying HTTP request.

Example:

from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

# Insert a document with a specific ID
response1 = collection.insert_one(
    {
        "_id": 101,
        "name": "John Doe",
    },
    vector=[.12, .52, .32],
)

# Insert a document without specifying an ID
# so that _id is generated automatically
response2 = collection.insert_one(
    {
        "name": "Jane Doe",
    },
    vector=[.08, .68, .30],
)

TypeScript

Collection<Schema>.insertOne(
  document: MaybeId<Schema>,
  options?: InsertOneOptions,
): Promise<InsertOneResult<Schema>>

Parameters:

Name	Type	Description
document	`MaybeId<Schema>`	The document to insert. If the document does not have an `_id` field, the server generates one.
options?	`InsertOneOptions`	The options for this operation.

Options (InsertOneOptions):

Name	Type	Description
vector?	`number[]`	The vector for the document. Equivalent to providing the vector in the `$vector` field of the document itself; however, the two are mutually exclusive.
maxTimeMS?	`number`	The maximum time in milliseconds that the client should wait for the operation to complete.

Returns:

Promise<InsertOneResult<Schema>> - A promise that resolves to the inserted ID.

Example:

import { DataApiClient } from '@datastax/astra-db-ts';

// Reference an untyped collection
const db = new DataApiClient('TOKEN').db('API_ENDPOINT');
const collection = db.collection('my_collection');

(async () => {
  // Insert a document with a specific ID
  await collection.insertOne({ _id: '1', name: 'John Doe' });

  // Insert a document with an autogenerated ID
  await collection.insertOne({ name: 'Jane Doe' });

  // Insert a document with a vector
  await collection.insertOne({ name: 'Jane Doe' }, { vector: [.12, .52, .32] });
  await collection.insertOne({ name: 'Jane Doe', $vector: [.12, .52, .32] });
})();

Java

TBD

cURL

curl -s --location \
--request POST ${ASTRA_DB_API_ENDPOINT}/api/json/v1/${ASTRA_DB_KEYSPACE}/${ASTRA_DB_COLLECTION} \
--header "Token: ${ASTRA_DB_APPLICATION_TOKEN}" \
--header "Content-Type: application/json" \
--header "Accept: application/json" \
--data '{
  "insertOne": {
    "document": {
      "_id": "1",
      "purchase_type": "Online",
      "$vector": [0.25, 0.25, 0.25, 0.25, 0.25],
      "customer": {
        "name": "Jim A.",
        "phone": "123-456-1111",
        "age": 51,
        "credit_score": 782,
        "address": {
          "address_line": "1234 Broadway",
          "city": "New York",
          "state": "NY"
        }
      },
      "purchase_date": {"$date": 1690045891},
      "seller": {
        "name": "Jon B.",
        "location": "Manhattan NYC"
      },
      "items": [
        {
          "car" : "BMW 330i Sedan",
          "color": "Silver"
        },
        "Extended warranty - 5 years"
      ],
      "amount": 47601,
      "status" : "active",
      "preferred_customer" : true
    }
  }
}' | json_pp

Properties:

Name	Type	Description
insertOne	command	Data API designation that a single document is inserted.

Response

{
    "status": {
        "insertedIds": [
            "1"
        ]
    }
}
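
For readers who script this call from Python without a client library, the same request can be assembled with the standard library. This is a sketch under stated assumptions: the helper name `build_insert_one_request` is hypothetical, the endpoint, keyspace, and collection values are placeholders, and the URL path and `Token` header simply mirror the cURL command above. The request is only prepared, not sent.

```python
import json
import urllib.request


def build_insert_one_request(endpoint, keyspace, collection, token, document):
    """Hypothetical helper: prepare (but do not send) an insertOne request."""
    url = f"{endpoint}/api/json/v1/{keyspace}/{collection}"
    body = json.dumps({"insertOne": {"document": document}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Token": token,
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


request = build_insert_one_request(
    "https://example.invalid",  # placeholder for ASTRA_DB_API_ENDPOINT
    "my_keyspace",              # placeholder for ASTRA_DB_KEYSPACE
    "my_collection",            # placeholder for ASTRA_DB_COLLECTION
    "TOKEN",                    # placeholder for ASTRA_DB_APPLICATION_TOKEN
    {"_id": "1", "purchase_type": "Online", "$vector": [0.25, 0.25, 0.25, 0.25, 0.25]},
)
print(request.get_method(), request.full_url)
```

Passing the prepared object to `urllib.request.urlopen` would send it; the JSON reply would then carry the new identifier under `status.insertedIds`, as in the response above.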

Insert many documents

Insert multiple documents into a collection.

Python

response = collection.insert_many(
    [
        {
            "_id": 101,
            "name": "John Doe",
        },
        {
            # ID is generated automatically
            "name": "Jane Doe",
        },
    ],
    vectors=[
        [.12, .52, .32],
        [.08, .68, .30],
    ],
    ordered=True,
)

Returns:

InsertManyResult - An object representing the response from the database after the insert operation. It includes information about the success of the operation and details of the inserted documents.

Example response

InsertManyResult(raw_results=[{'status': {'insertedIds': [101, '81077d86-05dc-43ca-877d-8605dce3ca4d']}}], inserted_ids=[101, '81077d86-05dc-43ca-877d-8605dce3ca4d'])

Parameters:

Name	Type	Description
documents	`Iterable[Dict[str, Any]]`	An iterable of dictionaries, each a document to insert. Documents may specify their `_id` field or leave it out, in which case it is added automatically.
vectors	`Optional[Iterable[Optional[Iterable[float]]]]`	An optional list of vectors (as many vectors as the provided documents) to associate with the documents when inserting. Each vector is added to the corresponding document before insertion into the database. The list can be a mixture of None and vectors, in which case some documents will not have a vector, unless it is already specified in their "$vector" field. Passing vectors this way is equivalent to setting the "$vector" field of the documents; however, the two are mutually exclusive for any given document.
ordered	`bool`	If True (default), the insertions are processed sequentially. If False, they can occur in arbitrary order and possibly concurrently.
chunk_size	`Optional[int]`	How many documents to include in a single API request. Exceeding the server maximum allowed value results in an error. Leave it unspecified (recommended) to use the system default.
concurrency	`Optional[int]`	Maximum number of concurrent requests to the API at a given time. It cannot be more than one for ordered insertions.
max_time_ms	`Optional[int]`	A timeout, in milliseconds, for the operation.

Note	Unless there are specific reasons not to, it is recommended to prefer `ordered = False` as it will result in a much higher insert throughput than an equivalent ordered insertion.
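
How `chunk_size`, `ordered`, and `concurrency` interact can be pictured with a small stand-alone model. This is only a sketch of the behavior the table describes, not AstraPy's implementation: `insert_in_chunks` and `fake_send` are hypothetical names, `fake_send` stands in for one API request, and the default chunk size of 20 is an assumption made for the illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(documents, chunk_size):
    """Cut the document list into consecutive slices of at most chunk_size items."""
    return [documents[i:i + chunk_size] for i in range(0, len(documents), chunk_size)]


def insert_in_chunks(documents, send_chunk, chunk_size=20, ordered=True, concurrency=1):
    """Hypothetical model: one call to send_chunk per chunk of documents."""
    if ordered and concurrency > 1:
        raise ValueError("Ordered insertions cannot use a concurrency above one.")
    chunks = split_into_chunks(documents, chunk_size)
    if ordered:
        # One request at a time, in the order the documents were given.
        return [send_chunk(chunk) for chunk in chunks]
    # Unordered: up to `concurrency` requests are in flight at once.
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(send_chunk, chunks))


def fake_send(chunk):
    """Stand-in for an API request: report which documents the chunk carried."""
    return [doc["seq"] for doc in chunk]


results = insert_in_chunks(
    [{"seq": i} for i in range(50)],
    fake_send,
    ordered=False,
    concurrency=5,
)
print([len(chunk_result) for chunk_result in results])
```

With 50 documents and chunks of 20, three requests are needed, which is why an unordered insertion with several workers can finish sooner than a sequential one.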

Example:

from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

collection.insert_many([{"a": 10}, {"a": 5}, {"b": [True, False, False]}])

collection.insert_many(
    [{"seq": i} for i in range(50)],
    ordered=False,
    concurrency=5,
)

# The following are three equivalent statements:
collection.insert_many(
    [{"tag": "a"}, {"tag": "b"}],
    vectors=[[1, 2], [3, 4]],
)

collection.insert_many(
    [{"tag": "a", "$vector": [1, 2]}, {"tag": "b"}],
    vectors=[None, [3, 4]],
)

collection.insert_many(
    [
        {"tag": "a", "$vector": [1, 2]},
        {"tag": "b", "$vector": [3, 4]},
    ]
)
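
The equivalence of the three statements can be checked offline with a small model of how the `vectors` parameter is folded into the documents. This is a sketch only: `attach_vectors` is a hypothetical helper written for this illustration, not a function provided by AstraPy.

```python
def attach_vectors(documents, vectors=None):
    """Hypothetical helper: return the documents as they would be sent."""
    if vectors is None:
        return [dict(doc) for doc in documents]
    vectors = list(vectors)
    if len(vectors) != len(documents):
        raise ValueError("Provide exactly one entry in vectors per document.")
    merged = []
    for doc, vec in zip(documents, vectors):
        if vec is None:
            # No vector passed for this document: keep it as it is.
            merged.append(dict(doc))
        elif "$vector" in doc:
            # The two ways of passing a vector are mutually exclusive.
            raise ValueError("Vector given both as a parameter and in $vector.")
        else:
            merged.append({**doc, "$vector": list(vec)})
    return merged


first = attach_vectors([{"tag": "a"}, {"tag": "b"}], vectors=[[1, 2], [3, 4]])
second = attach_vectors(
    [{"tag": "a", "$vector": [1, 2]}, {"tag": "b"}],
    vectors=[None, [3, 4]],
)
third = attach_vectors(
    [{"tag": "a", "$vector": [1, 2]}, {"tag": "b", "$vector": [3, 4]}]
)
print(first == second == third)
```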

TypeScript

Collection<Schema>.insertMany(
  documents: MaybeId<Schema>[],
  options?: InsertManyOptions,
): Promise<InsertManyResult<Schema>>

Parameters:

Name	Type	Description
documents	`MaybeId<Schema>[]`	The documents to insert. If any document does not have an `_id` field, the server generates one.
options?	`InsertManyOptions`	The options for this operation.

Options (InsertManyOptions):

Name	Type	Description
ordered?	`boolean`	Set to `true` to stop the operation after the first error. Otherwise, documents may be processed in parallel and in arbitrary order, which can improve performance considerably.
concurrency?	`number`	Controls how many network requests are made in parallel on unordered insertions. Defaults to `8`. Not available for ordered insertions.
chunkSize?	`number`	Controls how many documents are sent in each network request. Defaults to `20`.
vectors?	`(number[] | null | undefined)[]`	An array of vectors to associate with each document. If a vector is `null` or `undefined`, the document will not have a vector. If provided, the array must have as many entries as there are documents. Equivalent to providing the vector in the `$vector` field of the documents themselves; however, the two are mutually exclusive.

Note	Unless there are specific reasons not to, leave `ordered` set to `false`, as it results in a much higher insert throughput than an equivalent ordered insertion.

Returns:

Promise<InsertManyResult<Schema>> - A promise that resolves to the inserted IDs.

Example:

import { DataApiClient, InsertManyError } from '@datastax/astra-db-ts';

// Reference an untyped collection
const db = new DataApiClient('TOKEN').db('API_ENDPOINT');
const collection = db.collection('my_collection');

(async () => {
  try {
    // Insert many documents
    await collection.insertMany([
      { _id: '1', name: 'John Doe' },
      { name: 'Jane Doe' }, // Will autogen ID
    ], { ordered: true });

    // Insert many with vectors
    await collection.insertMany([
      { name: 'John Doe', $vector: [.12, .52, .32] },
      { name: 'Jane Doe' },
      { name: 'Jane Doe', $vector: [.32, .52, .12] },
    ]);

    await collection.insertMany([
      { name: 'John Doe' },
      { name: 'Jane Doe' },
      { name: 'Dane Joe' },
    ], {
      vectors: [
        [.12, .52, .32],
        null,
        [.32, .52, .12],
      ],
      ordered: true,
    });
  } catch (e) {
    if (e instanceof InsertManyError) {
      console.log(e.insertedIds);
    }
  }
})();

Java

TBD

cURL

The following Data API insertMany command adds 20 documents to a collection.

curl -s --location \
--request POST ${ASTRA_DB_API_ENDPOINT}/api/json/v1/${ASTRA_DB_KEYSPACE}/${ASTRA_DB_COLLECTION} \
--header "Token: ${ASTRA_DB_APPLICATION_TOKEN}" \
--header "Content-Type: application/json" \
--header "Accept: application/json" \
--data '{
  "insertMany": {
    "documents": [
      {
        "_id": "2",
        "purchase_type": "Online",
        "$vector": [0.1, 0.15, 0.3, 0.12, 0.05],
        "customer": {
          "name": "Jack B.",
          "phone": "123-456-2222",
        "age": 34,
        "credit_score": 700,
          "address": {
            "address_line": "888 Broadway",
            "city": "New York",
            "state": "NY"
          }
        },
        "purchase_date": {"$date": 1690391491},
        "seller": {
          "name": "Tammy S.",
          "location": "Staten Island NYC"
        },
        "items": [
            {
          "car" : "Tesla Model 3",
          "color": "White"
            },
            "Extended warranty - 10 years",
            "Service - 5 years"
        ],
        "amount": 53990,
      "status" : "active"
      },
      {
        "_id": "3",
        "purchase_type": "Online",
        "$vector": [0.15, 0.1, 0.1, 0.35, 0.55],
        "customer": {
          "name": "Jill D.",
          "phone": "123-456-3333",
        "age": 30,
        "credit_score": 742,
          "address": {
            "address_line": "12345 Broadway",
            "city": "New York",
            "state": "NY"
          }
        },
        "purchase_date": {"$date": 1690564291},
        "seller": {
          "name": "Jasmine S.",
          "location": "Brooklyn NYC"
        },
        "items": "Extended warranty - 10 years",
        "amount": 4600,
      "status" : "active"
      },
      {
        "_id": "4",
        "purchase_type": "In Person",
        "$vector": [0.25, 0.25, 0.25, 0.25, 0.26],
        "customer": {
          "name": "Lester M.",
          "phone": "123-456-4444",
        "age": 40,
        "credit_score": 802,
          "address": {
            "address_line": "12346 Broadway",
            "city": "New York",
            "state": "NY"
          }
        },
        "purchase_date": {"$date": 1690909891},
        "seller": {
          "name": "Jon B.",
          "location": "Manhattan NYC"
        },
        "items": [
            {
          "car" : "BMW 330i Sedan",
          "color": "Red"
            },
            "Extended warranty - 5 years",
            "Service - 5 years"
        ],
        "amount": 48510,
      "status" : "active"
      },
      {
        "_id": "5",
        "purchase_type": "Online",
        "$vector": [0.25, 0.045, 0.38, 0.31, 0.67],
        "customer": {
          "name": "David C.",
          "phone": "123-456-5555",
        "age": 50,
        "credit_score": 800,
          "address": {
            "address_line": "32345 Main Ave",
            "city": "Jersey City",
            "state": "NJ"
          }
        },
        "purchase_date": {"$date": 1690996291},
        "seller": {
          "name": "Jim A.",
          "location": "Jersey City NJ"
        },
        "items": [
          {
          "car" : "Tesla Model S",
          "color": "Red"
            },
            "Extended warranty - 5 years"
        ],
        "amount": 94990,
      "status" : "active"
      },
      {
        "_id": "6",
        "purchase_type": "In Person",
        "$vector": [0.11, 0.02, 0.78, 0.10, 0.27],
        "customer": {
          "name": "Chris E.",
          "phone": "123-456-6666",
        "age": 43,
        "credit_score": 764,
          "address": {
            "address_line": "32346 Broadway",
            "city": "New York",
            "state": "NY"
          }
        },
        "purchase_date": {"$date": 1691860291},
        "seller": {
          "name": "Jim A.",
          "location": "Jersey City NJ"
        },
        "items": [
          {
          "car" : "Tesla Model X",
          "color": "Blue"
            }
        ],
        "amount": 109990,
      "status" : "active"
      },
      {
        "_id": "7",
        "purchase_type": "Online",
        "$vector": [0.21, 0.22, 0.33, 0.44, 0.53],
        "customer": {
          "name": "Jeff G.",
          "phone": "123-456-7777",
        "age": 66,
        "credit_score": 802,
          "address": {
            "address_line": "22999 Broadway",
            "city": "New York",
            "state": "NY"
          }
        },
        "purchase_date": {"$date": 1692119491},
        "seller": {
          "name": "Jasmine S.",
          "location": "Brooklyn NYC"
        },
        "items": [{
          "car" : "BMW M440i Gran Coupe",
          "color": "Black"
            },
            "Extended warranty - 5 years"],
        "amount": 61050,
      "status" : "active"
      },
      {
        "_id": "8",
        "purchase_type": "In Person",
        "$vector": [0.3, 0.23, 0.15, 0.17, 0.4],
        "customer": {
          "name": "Harold S.",
          "phone": "123-456-8888",
        "age": 29,
        "credit_score": 710,
          "address": {
            "address_line": "1234 Main St",
            "city": "Orange",
            "state": "NJ"
          }
        },
        "purchase_date": {"$date": 1693329091},
        "seller": {
          "name": "Tammy S.",
          "location": "Staten Island NYC"
        },
        "items": [{
          "car" : "BMW X3 SUV",
          "color": "Black"
            },
            "Extended warranty - 5 years"
        ],
        "amount": 46900,
      "status" : "active"
      },
      {
        "_id": "9",
        "purchase_type": "Online",
        "$vector": [0.1, 0.15, 0.3, 0.12, 0.06],
        "customer": {
          "name": "Richard Z.",
          "phone": "123-456-9999",
        "age": 22,
        "credit_score": 690,
          "address": {
            "address_line": "22345 Broadway",
            "city": "New York",
            "state": "NY"
          }
        },
        "purchase_date": {"$date": 1693588291},
        "seller": {
          "name": "Jasmine S.",
          "location": "Brooklyn NYC"
        },
        "items": [{
          "car" : "Tesla Model 3",
          "color": "White"
            },
            "Extended warranty - 5 years"
        ],
        "amount": 53990,
      "status" : "active"
      },
      {
        "_id": "10",
        "purchase_type": "In Person",
        "$vector": [0.25, 0.045, 0.38, 0.31, 0.68],
        "customer": {
          "name": "Eric B.",
          "phone": null,
        "age": 54,
        "credit_score": 780,
          "address": {
            "address_line": "9999 River Rd",
            "city": "Fair Haven",
            "state": "NJ"
          }
        },
        "purchase_date": {"$date": 1694797891},
        "seller": {
          "name": "Jim A.",
          "location": "Jersey City NJ"
        },
        "items": [{
          "car" : "Tesla Model S",
          "color": "Black"
            }
        ],
        "amount": 93800,
      "status" : "active"
      },
      {
        "_id": "11",
        "purchase_type": "Online",
        "$vector": [0.44, 0.11, 0.33, 0.22, 0.88],
        "customer": {
          "name": "Ann J.",
          "phone": "123-456-1112",
        "age": 47,
        "credit_score": 660,
          "address": {
            "address_line": "99 Elm St",
            "city": "Fair Lawn",
            "state": "NJ"
          }
        },
        "purchase_date": {"$date": 1695921091},
        "seller": {
          "name": "Jim A.",
          "location": "Jersey City NJ"
        },
        "items": [{
          "car" : "Tesla Model Y",
          "color": "White"
            },
            "Extended warranty - 5 years"
        ],
        "amount": 57500,
      "status" : "active"
      },
      {
        "_id": "12",
        "purchase_type": "In Person",
        "$vector": [0.33, 0.44, 0.55, 0.77, 0.66],
        "customer": {
          "name": "John T.",
          "phone": "123-456-1123",
        "age": 55,
        "credit_score": 786,
          "address": {
            "address_line": "23 Main Blvd",
            "city": "Staten Island",
            "state": "NY"
          }
        },
        "purchase_date": {"$date": 1696180291},
        "seller": {
          "name": "Tammy S.",
          "location": "Staten Island NYC"
        },
        "items": [{
          "car" : "BMW 540i xDrive Sedan",
          "color": "Black"
            },
            "Extended warranty - 5 years"
        ],
        "amount": 64900,
      "status" : "active"
      },
      {
        "_id": "13",
        "purchase_type": "Online",
        "$vector": [0.1, 0.15, 0.3, 0.12, 0.07],
        "customer": {
          "name": "Aaron W.",
          "phone": "123-456-1133",
        "age": 60,
        "credit_score": 702,
          "address": {
            "address_line": "1234 4th Ave",
            "city": "New York",
            "state": "NY"
          }
        },
        "purchase_date": {"$date": 1697389891},
        "seller": {
          "name": "Jon B.",
          "location": "Manhattan NYC"
        },
        "items": [{
          "car" : "Tesla Model 3",
          "color": "White"
            },
            "Extended warranty - 5 years"
        ],
        "amount": 55000,
      "status" : "active"
      },
      {
        "_id": "14",
        "purchase_type": "In Person",
        "$vector": [0.11, 0.02, 0.78, 0.21, 0.27],
        "customer": {
          "name": "Kris S.",
          "phone": "123-456-1144",
        "age": 44,
        "credit_score": 702,
          "address": {
            "address_line": "1414 14th Pl",
            "city": "Brooklyn",
            "state": "NY"
          }
        },
        "purchase_date": {"$date": 1698513091},
        "seller": {
          "name": "Jasmine S.",
          "location": "Brooklyn NYC"
        },
        "items": [{
          "car" : "Tesla Model X",
          "color": "White"
            }
        ],
        "amount": 110400,
      "status" : "active"
      },
      {
        "_id": "15",
        "purchase_type": "Online",
        "$vector": [0.1, 0.15, 0.3, 0.12, 0.08],
        "customer": {
          "name": "Maddy O.",
          "phone": "123-456-1155",
        "age": 41,
        "credit_score": 782,
          "address": {
            "address_line": "1234 Maple Ave",
            "city": "West New York",
            "state": "NJ"
          }
        },
        "purchase_date": {"$date": 1701191491},
        "seller": {
          "name": "Jim A.",
          "location": "Jersey City NJ"
        },
        "items": {
          "car" : "Tesla Model 3",
          "color": "White"
            },
        "amount": 52990,
      "status" : "active"
      },
      {
        "_id": "16",
        "purchase_type": "In Person",
        "$vector": [0.44, 0.11, 0.33, 0.22, 0.88],
        "customer": {
          "name": "Tim C.",
          "phone": "123-456-1166",
        "age": 38,
        "credit_score": 700,
          "address": {
            "address_line": "1234 Main St",
            "city": "Staten Island",
            "state": "NY"
          }
        },
        "purchase_date": {"$date": 1701450691},
        "seller": {
          "name": "Tammy S.",
          "location": "Staten Island NYC"
        },
        "items": [{
          "car" : "Tesla Model Y",
          "color": "White"
            },
            "Extended warranty - 5 years"
        ],
        "amount": 58990,
      "status" : "active"
      },
      {
        "_id": "17",
        "purchase_type": "Online",
        "$vector": [0.1, 0.15, 0.3, 0.12, 0.09],
        "customer": {
          "name": "Yolanda Z.",
          "phone": "123-456-1177",
        "age": 61,
        "credit_score": 694,
          "address": {
            "address_line": "1234 Main St",
            "city": "Hoboken",
            "state": "NJ"
          }
        },
        "purchase_date": {"$date": 1702660291},
        "seller": {
          "name": "Jim A.",
          "location": "Jersey City NJ"
        },
        "items": [{
          "car" : "Tesla Model 3",
          "color": "Blue"
            },
            "Extended warranty - 5 years"
        ],
        "amount": 54900,
      "status" : "active"
      },
      {
        "_id": "18",
        "purchase_type": "Online",
        "$vector": [0.15, 0.17, 0.15, 0.43, 0.55],
        "customer": {
          "name": "Thomas D.",
          "phone": "123-456-1188",
        "age": 45,
        "credit_score": 724,
          "address": {
            "address_line": "98980 20th St",
            "city": "New York",
            "state": "NY"
          }
        },
        "purchase_date": {"$date": 1703092291},
        "seller": {
          "name": "Jon B.",
          "location": "Manhattan NYC"
        },
        "items": [{
          "car" : "BMW 750e xDrive Sedan",
          "color": "Black"
            },
            "Extended warranty - 5 years"
        ],
        "amount": 106900,
      "status" : "active"
      },
      {
        "_id": "19",
        "purchase_type": "Online",
        "$vector": [0.25, 0.25, 0.25, 0.25, 0.27],
        "customer": {
          "name": "Vivian W.",
          "phone": "123-456-1199",
        "age": 20,
        "credit_score": 698,
          "address": {
            "address_line": "5678 Elm St",
            "city": "Hartford",
            "state": "CT"
          }
        },
        "purchase_date": {"$date": 1704215491},
        "seller": {
          "name": "Jasmine S.",
          "location": "Brooklyn NYC"
        },
        "items": [{
          "car" : "BMW 330i Sedan",
          "color": "White"
            },
            "Extended warranty - 5 years"
        ],
        "amount": 46980,
      "status" : "active"
      },
      {
        "_id": "20",
        "purchase_type": "In Person",
        "$vector": [0.44, 0.11, 0.33, 0.22, 0.88],
        "customer": {
          "name": "Leslie E.",
          "phone": null,
        "age": 44,
        "credit_score": 782,
          "address": {
            "address_line": "1234 Main St",
            "city": "Newark",
            "state": "NJ"
          }
        },
        "purchase_date": {"$date": 1705338691},
        "seller": {
          "name": "Jim A.",
          "location": "Jersey City NJ"
        },
        "items": [{
          "car" : "Tesla Model Y",
          "color": "Black"
            },
            "Extended warranty - 5 years"
        ],
        "amount": 59800,
      "status" : "active"
      },
      {
        "_id": "21",
        "purchase_type": "In Person",
        "$vector": [0.21, 0.22, 0.33, 0.44, 0.53],
        "customer": {
          "name": "Rachel I.",
          "phone": null,
        "age": 62,
        "credit_score": 786,
          "address": {
            "address_line": "1234 Park Ave",
            "city": "New York",
            "state": "NY"
          }
        },
        "purchase_date": {"$date": 1706202691},
        "seller": {
          "name": "Jon B.",
          "location": "Manhattan NYC"
        },
        "items": [{
          "car" : "BMW M440i Gran Coupe",
          "color": "Silver"
            },
            "Extended warranty - 5 years",
            "Gap Insurance - 5 years"
        ],
        "amount": 65250,
      "status" : "active"
      }
    ],
    "options": {
        "ordered": false
    }
  }
}' | json_pp

Response

{
   "status" : {
      "insertedIds" : [
         "4",
         "7",
         "10",
         "13",
         "16",
         "19",
         "21",
         "18",
         "6",
         "12",
         "15",
         "9",
         "3",
         "11",
         "2",
         "17",
         "14",
         "8",
         "20",
         "5"
      ]
   }
}
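
Because the request set `ordered` to `false`, the identifiers come back in arbitrary order, so a positional comparison with the submitted documents is not meaningful. A short sketch of an order-independent check follows; `missing_ids` is a hypothetical helper, and the response dictionary is the one shown above.

```python
def missing_ids(expected_ids, response):
    """Hypothetical helper: list the expected IDs absent from insertedIds."""
    inserted = set(response.get("status", {}).get("insertedIds", []))
    return [doc_id for doc_id in expected_ids if doc_id not in inserted]


# The 20 "_id" values sent in the request above: "2" through "21".
expected = [str(n) for n in range(2, 22)]

response = {
    "status": {
        "insertedIds": [
            "4", "7", "10", "13", "16", "19", "21", "18", "6", "12",
            "15", "9", "3", "11", "2", "17", "14", "8", "20", "5",
        ]
    }
}

print(missing_ids(expected, response))  # prints: []
```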

Properties:

Name	Type	Description
insertMany	command	Data API designation that many documents (up to 20 at a time) are being inserted.

Find a document

Retrieve a single document from a collection using various options.

Python

Retrieve a single document from a collection by its _id.

document = collection.find_one({"_id": 101})

Retrieve a single document from a collection by any attribute, as long as it is covered by the collection’s indexing configuration.

Tip	As noted in The Indexing option in the Collections reference topic, any field that is part of a subsequent filter or sort operation must be an indexed field. If you chose not to index some or all fields when you created the collection, you cannot reference those fields in a filter or sort query.

document = collection.find_one({"location": "warehouse_C"})

Retrieve a single document from a collection by an arbitrary filtering clause.

document = collection.find_one({"tag": {"$exists": True}})

Retrieve the most similar document to a given vector.

result = collection.find_one({}, vector=[.12, .52, .32])

Retrieve only specific fields from a document.

result = collection.find_one({"_id": 101}, projection={"name": True})

Returns:

Union[Dict[str, Any], None] - Either the found document as a dictionary or None if no matching document is found.

Example response

{'_id': 101, 'name': 'John Doe', '$vector': [0.12, 0.52, 0.32]}

Parameters:

Name	Type	Description
filter	`Optional[Dict[str, Any]]`	A predicate expressed as a dictionary according to the Data API filter syntax. Examples are `{}`, `{"name": "John"}`, `{"price": {"$le": 100}}`, `{"$and": [{"name": "John"}, {"price": {"$le": 100}}]}`. See Data API operators for the full list of operators.
projection	`Optional[Union[Iterable[str], Dict[str, bool]]]`	Used to select a subset of fields in the documents being returned. The projection can be: an iterable over the field names to return; a dictionary {field_name: True} to positively select certain fields; or a dictionary {field_name: False} if one wants to discard some fields from the response. The default is to return the whole documents.
vector	`Optional[Iterable[float]]`	A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to perform vector search, that is, Approximate Nearest Neighbors (ANN) search, extracting the most similar document in the collection matching the filter. This parameter cannot be used together with `sort`. See the `find` method for more details on this parameter.
include_similarity	`Optional[bool]`	A boolean to request the numeric value of the similarity to be returned as an added "$similarity" key in the returned document. Can only be used for vector ANN search, i.e. when either `vector` is supplied or the `sort` parameter has the shape {"$vector": …}.
sort	`Optional[Dict[str, Any]]`	With this dictionary parameter one can control the order the documents are returned. See the Note about sorting for details.
max_time_ms	`Optional[int]`	A timeout, in milliseconds, for the underlying HTTP request.

Example:

from astrapy import DataAPIClient
import astrapy
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

collection.find_one({})
# prints: {'_id': '68d1e515-...', 'seq': 37}
collection.find_one({"seq": 10})
# prints: {'_id': 'd560e217-...', 'seq': 10}
collection.find_one({"seq": 1011})
# (returns None for no matches)
collection.find_one({}, projection={"seq": False})
# prints: {'_id': '68d1e515-...'}
collection.find_one(
    {},
    sort={"seq": astrapy.constants.SortDocuments.DESCENDING},
)
# prints: {'_id': '97e85f81-...', 'seq': 69}
collection.find_one({}, vector=[1, 0])
# prints: {'_id': '...', 'tag': 'D', '$vector': [4.0, 1.0]}
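
The three projection forms listed in the parameters table can be pictured with a small local model. This is only a sketch that operates on plain dictionaries: `apply_projection` is a hypothetical helper, not part of astrapy. It handles top-level fields only and assumes, as in the examples on this page, that `_id` is kept unless it is explicitly excluded.

```python
from typing import Any, Dict, Iterable, Union

Projection = Union[Iterable[str], Dict[str, bool]]


def apply_projection(document: Dict[str, Any], projection: Projection) -> Dict[str, Any]:
    """Local, simplified model of projection on top-level fields (hypothetical helper)."""
    if not isinstance(projection, dict):
        # An iterable of field names is treated like {name: True, ...}.
        projection = {name: True for name in projection}
    included = {name for name, keep in projection.items() if keep}
    excluded = {name for name, keep in projection.items() if not keep}
    if included:
        # Inclusion mode: keep the selected fields, plus "_id" unless excluded.
        wanted = included | ({"_id"} - excluded)
        return {k: v for k, v in document.items() if k in wanted}
    # Exclusion mode: drop only the listed fields.
    return {k: v for k, v in document.items() if k not in excluded}


doc = {"_id": 101, "name": "John Doe", "seq": 37}
print(apply_projection(doc, ["name"]))        # {'_id': 101, 'name': 'John Doe'}
print(apply_projection(doc, {"name": True}))  # {'_id': 101, 'name': 'John Doe'}
print(apply_projection(doc, {"seq": False}))  # {'_id': 101, 'name': 'John Doe'}
```

The actual projection is applied by the Data API; this model is only meant to make the difference between the inclusion and exclusion forms concrete.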

TypeScript

TBD

Java

TBD

cURL

This Data API findOne command retrieves a document based on a filter using a specific _id value.

curl -s --location \
--request POST ${ASTRA_DB_API_ENDPOINT}/api/json/v1/${ASTRA_DB_KEYSPACE}/${ASTRA_DB_COLLECTION} \
--header "Token: ${ASTRA_DB_APPLICATION_TOKEN}" \
--header "Content-Type: application/json" \
--header "Accept: application/json" \
--data '{
  "findOne": {
    "filter": {"_id" : "14"}
  }
}' | json_pp

Result:

{
   "data" : {
      "document" : {
         "$vector" : [
            0.11,
            0.02,
            0.78,
            0.21,
            0.27
         ],
         "_id" : "14",
         "amount" : 110400,
         "customer" : {
            "address" : {
               "address_line" : "1414 14th Pl",
               "city" : "Brooklyn",
               "state" : "NY"
            },
            "age" : 44,
            "credit_score" : 702,
            "name" : "Kris S.",
            "phone" : "123-456-1144"
         },
         "items" : [
            {
               "car" : "Tesla Model X",
               "color" : "White"
            }
         ],
         "purchase_date" : {
            "$date" : 1698513091
         },
         "purchase_type" : "In Person",
         "seller" : {
            "location" : "Brooklyn NYC",
            "name" : "Jasmine S."
         },
         "status" : "active"
      }
   }
}

Find documents using filtering options

Iterate over documents in a collection matching a given filter.

Python

doc_iterator = collection.find({"category": "house_appliance"}, limit=10)

Iterate over the documents most similar to a given query vector.

doc_iterator = collection.find({}, vector=[0.55, -0.40, 0.08], limit=5)

Returns:

Cursor - A cursor for iterating over documents. An AstraPy cursor can be used in a for loop, and provides a few additional features.

Example response

Cursor("vector_collection", new, retrieved: 0)

Parameters:

Name	Type	Description
filter	`Optional[Dict[str, Any]]`	A predicate expressed as a dictionary according to the Data API filter syntax. Examples are `{}`, `{"name": "John"}`, `{"price": {"$lte": 100}}`, `{"$and": [{"name": "John"}, {"price": {"$lte": 100}}]}`. See Data API operators for the full list of operators.
projection	`Optional[Union[Iterable[str], Dict[str, bool]]]`	Used to select a subset of fields in the documents being returned. The projection can be: an iterable over the field names to return; a dictionary {field_name: True} to positively select certain fields; or a dictionary {field_name: False} if one wants to discard some fields from the response. The default is to return the whole documents.
skip	`Optional[int]`	With this integer parameter, what would be the first `skip` documents returned by the query are discarded, and the results start from the (skip+1)-th document. This parameter can be used only in conjunction with an explicit `sort` criterion of the ascending/descending type (i.e. it cannot be used when not sorting, nor with vector-based ANN search).
limit	`Optional[int]`	This (integer) parameter sets a limit over how many documents are returned. Once `limit` is reached (or the cursor is exhausted for lack of matching documents), nothing more is returned.
vector	`Optional[Iterable[float]]`	A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to perform vector search; that is, Approximate Nearest Neighbors (ANN) search. When running similarity search on a collection, no other sorting criteria can be specified. Moreover, there is an upper bound to the number of documents that can be returned. For details, see the Data API Limits.
include_similarity	`Optional[bool]`	A boolean to request the numeric value of the similarity to be returned as an added "$similarity" key in each returned document. Can only be used for vector ANN search, i.e. when either `vector` is supplied or the `sort` parameter has the shape {"$vector": …}.
sort	`Optional[Dict[str, Any]]`	With this dictionary parameter one can control the order the documents are returned. See the Note about sorting, as well as the one about upper bounds, for details.
max_time_ms	`Optional[int]`	A timeout, in milliseconds, for each single one of the underlying HTTP requests used to fetch documents as the cursor is iterated over.

TypeScript

TBD

Java

TBD

cURL

TBD

Example values for sort operations

Python

When no particular order is required:

sort={}  # (default when parameter not provided)

When sorting by a certain value in ascending/descending order:

from astrapy.constants import SortDocuments
sort={"field": SortDocuments.ASCENDING}
sort={"field": SortDocuments.DESCENDING}

When sorting first by "field" and then by "subfield" (while modern Python versions preserve the order of dictionaries, it is suggested for clarity to employ a collections.OrderedDict in these cases):

sort={
    "field": SortDocuments.ASCENDING,
    "subfield": SortDocuments.ASCENDING,
}
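
The `collections.OrderedDict` form suggested for multi-field sorting might look like the following sketch. To stay runnable without a database, it uses local stand-ins `ASCENDING` and `DESCENDING` in place of `SortDocuments.ASCENDING` and `SortDocuments.DESCENDING`. Their numeric values (1 and -1) are an assumption, so prefer the astrapy constants in real code.

```python
import json
from collections import OrderedDict

# Assumed stand-ins for astrapy.constants.SortDocuments.ASCENDING / DESCENDING.
ASCENDING = 1
DESCENDING = -1

# The insertion order states the priority explicitly: "field" first, then "subfield".
sort = OrderedDict(
    [
        ("field", ASCENDING),
        ("subfield", DESCENDING),
    ]
)

# An OrderedDict serializes to JSON like a plain dict, with its key order preserved.
print(json.dumps(sort))  # {"field": 1, "subfield": -1}
```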

When running a vector similarity (ANN) search:

sort={"$vector": [0.4, 0.15, -0.5]}

Note

Some combinations of arguments impose an implicit upper bound on the number of documents that are returned by the Data API. More specifically:

Vector ANN searches cannot return more than a certain number of documents; currently, 1000 per search operation.
When using a sort criterion of the ascending/descending type, the Data API returns a smaller number of documents, currently set to 20, and stops there. The returned documents are the top results across the whole collection according to the requested criterion.

Keep in mind these provisions even when subsequently running a command such as .distinct() on a cursor.

Tip

When not specifying sorting criteria at all (by vector or otherwise), the cursor can scroll through an arbitrary number of documents as the Data API and the client periodically exchange new chunks of documents.

The behavior of the cursor when documents are added or removed after the find has started depends on database internals. It is neither guaranteed nor excluded that such "real-time" changes in the data will be picked up by the cursor.

Example:

from astrapy import DataAPIClient
import astrapy
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

filter = {"seq": {"$exists": True}}
for doc in collection.find(filter, projection={"seq": True}, limit=5):
    print(doc["seq"])
# will print e.g.:
#   37
#   35
#   10
#   36
#   27
cursor1 = collection.find(
    {},
    limit=4,
    sort={"seq": astrapy.constants.SortDocuments.DESCENDING},
)
[doc["_id"] for doc in cursor1]
# prints: ['97e85f81-...', '1581efe4-...', '...', '...']
cursor2 = collection.find({}, limit=3)
cursor2.distinct("seq")
# prints: [37, 35, 10]
collection.insert_many([
    {"tag": "A", "$vector": [4, 5]},
    {"tag": "B", "$vector": [3, 4]},
    {"tag": "C", "$vector": [3, 2]},
    {"tag": "D", "$vector": [4, 1]},
    {"tag": "E", "$vector": [2, 5]},
])
ann_tags = [
    document["tag"]
    for document in collection.find(
        {},
        limit=3,
        vector=[3, 3],
    )
]
ann_tags
# prints: ['A', 'B', 'C']
# (assuming the collection has metric VectorMetric.COSINE)
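
The ordering in the vector search example above can be checked locally. The sketch below ranks the same five vectors by exact cosine similarity to the query `[3, 3]`. It is a brute-force computation, not the approximate (ANN) index the database uses, and `cosine_similarity` is a helper defined here for illustration only.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Same tags and vectors as in the insert_many call of the example.
vectors = {"A": [4, 5], "B": [3, 4], "C": [3, 2], "D": [4, 1], "E": [2, 5]}
query = [3, 3]

# Sort the tags from most to least similar to the query vector.
ranked = sorted(
    vectors,
    key=lambda tag: cosine_similarity(query, vectors[tag]),
    reverse=True,
)
print(ranked[:3])  # ['A', 'B', 'C']
```

With a different metric on the collection the ranking may differ, which is why the example states the cosine assumption.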

TypeScript

Hello TypeScript world. TBD.

Java

Hello Java world. TBD.

cURL

Hello cURL world. TBD.

Find and update a document

Locate a document matching a filter condition and apply changes to it, returning the document itself.

Python

collection.find_one_and_update(
    {"Marco": {"$exists": True}},
    {"$set": {"title": "Mr."}},
)

Locate and update a document, returning the document itself, creating a new one if nothing is found.

collection.find_one_and_update(
    {"Marco": {"$exists": True}},
    {"$set": {"title": "Mr."}},
    upsert=True,
)

Returns:

Union[Dict[str, Any], None] - The document that was found, either before or after the update (or a projection thereof, as requested). If no matches are found, None is returned.

Example response

{'_id': 999, 'Marco': 'Polo'}

Parameters:

Name	Type	Description
filter	`Dict[str, Any]`	A predicate expressed as a dictionary according to the Data API filter syntax. Examples are `{}`, `{"name": "John"}`, `{"price": {"$lte": 100}}`, `{"$and": [{"name": "John"}, {"price": {"$lte": 100}}]}`. See Data API operators for the full list of operators.
update	`Dict[str, Any]`	The update prescription to apply to the document, expressed as a dictionary as per Data API syntax. Examples are: `{"$set": {"field": "value"}}`, `{"$inc": {"counter": 10}}` and `{"$unset": {"field": ""}}`. See Data API operators for the full syntax.
projection	`Optional[ProjectionType]`	Used to select a subset of fields in the document being returned. The projection can be: an iterable over the field names to return; a dictionary {field_name: True} to positively select certain fields; or a dictionary {field_name: False} if one wants to discard some fields from the response. The default is to return the whole document.
vector	`Optional[Iterable[float]]`	A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to use vector search, that is, Approximate Nearest Neighbors (ANN) search, as the sorting criterion. In this way, the matched document (if any) will be the one that is most similar to the provided vector. This parameter cannot be used together with `sort`. See the `find` method for more details on this parameter.
sort	`Optional[Dict[str, Any]]`	With this dictionary parameter one can control the sorting order of the documents matching the filter, effectively determining what document will come first and hence be the updated one. See the `find` method for more on sorting.
upsert	`bool = False`	This parameter controls the behavior in absence of matches. If True, a new document (resulting from applying the `update` to an empty document) is inserted if no matches are found on the collection. If False, the operation silently does nothing in case of no matches.
return_document	`str`	A flag controlling what document is returned: if set to `ReturnDocument.BEFORE`, or the string "before", the document found on database is returned; if set to `ReturnDocument.AFTER`, or the string "after", the new document is returned. The default is "before".
max_time_ms	`Optional[int]`	A timeout, in milliseconds, for the underlying HTTP request.

Example:

from astrapy import DataAPIClient
import astrapy
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

collection.insert_one({"Marco": "Polo"})

collection.find_one_and_update(
    {"Marco": {"$exists": True}},
    {"$set": {"title": "Mr."}},
)
# prints: {'_id': 'a80106f2-...', 'Marco': 'Polo'}
collection.find_one_and_update(
    {"title": "Mr."},
    {"$inc": {"rank": 3}},
    projection=["title", "rank"],
    return_document=astrapy.constants.ReturnDocument.AFTER,
)
# prints: {'_id': 'a80106f2-...', 'title': 'Mr.', 'rank': 3}
collection.find_one_and_update(
    {"name": "Johnny"},
    {"$set": {"rank": 0}},
    return_document=astrapy.constants.ReturnDocument.AFTER,
)
# (returns None for no matches)
collection.find_one_and_update(
    {"name": "Johnny"},
    {"$set": {"rank": 0}},
    upsert=True,
    return_document=astrapy.constants.ReturnDocument.AFTER,
)
# prints: {'_id': 'cb4ef2ab-...', 'name': 'Johnny', 'rank': 0}
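
To make the difference between the "before" and "after" documents concrete, the following sketch applies update prescriptions to a plain dictionary. `apply_update` is a hypothetical local helper, not astrapy code. It supports only `$set`, `$inc` and `$unset` on top-level fields, and it assumes that `$inc` on a missing field starts from 0, which matches the `rank` value shown in the example above.

```python
import copy
from typing import Any, Dict


def apply_update(document: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new document with $set, $inc and $unset applied to top-level fields."""
    result = copy.deepcopy(document)
    for field, value in update.get("$set", {}).items():
        result[field] = value
    for field, amount in update.get("$inc", {}).items():
        # Assumption: a missing field is treated as 0 before incrementing.
        result[field] = result.get(field, 0) + amount
    for field in update.get("$unset", {}):
        result.pop(field, None)
    return result


before = {"_id": "a80106f2-...", "Marco": "Polo"}
after = apply_update(before, {"$set": {"title": "Mr."}})
after = apply_update(after, {"$inc": {"rank": 3}})

print(before)  # {'_id': 'a80106f2-...', 'Marco': 'Polo'}
print(after)   # {'_id': 'a80106f2-...', 'Marco': 'Polo', 'title': 'Mr.', 'rank': 3}
```

In these terms, `return_document="before"` corresponds to `before` and `return_document="after"` to `after`.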

TypeScript

Hello TypeScript world. TBD.

Java

Hello Java world. TBD.

cURL

Hello DevOps cURL world. TBD.

Update a document

Update a single document on the collection as requested.

Python

update_result = collection.update_one(
    {"_id": 456},
    {"$set": {"name": "John Smith"}},
)

Update a single document on the collection, inserting a new one if no match is found.

update_result = collection.update_one(
    {"_id": 456},
    {"$set": {"name": "John Smith"}},
    upsert=True,
)

Returns:

UpdateResult - An object representing the response from the database after the update operation. It includes information about the operation.

Example response

UpdateResult(raw_results=[{'data': {'document': {'_id': '1', 'name': 'John Doe'}}, 'status': {'matchedCount': 1, 'modifiedCount': 1}}], update_info={'n': 1, 'updatedExisting': True, 'ok': 1.0, 'nModified': 1})

Parameters:

Name	Type	Description
filter	`Dict[str, Any]`	A predicate expressed as a dictionary according to the Data API filter syntax. Examples are `{}`, `{"name": "John"}`, `{"price": {"$lte": 100}}`, `{"$and": [{"name": "John"}, {"price": {"$lte": 100}}]}`. See Data API operators for the full list of operators.
update	`Dict[str, Any]`	The update prescription to apply to the document, expressed as a dictionary as per Data API syntax. Examples are: `{"$set": {"field": "value"}}`, `{"$inc": {"counter": 10}}` and `{"$unset": {"field": ""}}`. See Data API operators for the full syntax.
vector	`Optional[Iterable[float]]`	A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to use vector search, that is, Approximate Nearest Neighbors (ANN) search, as the sorting criterion. In this way, the matched document (if any) will be the one that is most similar to the provided vector. This parameter cannot be used together with `sort`. See the `find` method for more details on this parameter.
sort	`Optional[Dict[str, Any]]`	With this dictionary parameter one can control the sorting order of the documents matching the filter, effectively determining what document will come first and hence be the updated one. See the `find` method for more on sorting.
upsert	`bool = False`	This parameter controls the behavior in absence of matches. If True, a new document (resulting from applying the `update` to an empty document) is inserted if no matches are found on the collection. If False, the operation silently does nothing in case of no matches.
max_time_ms	`Optional[int]`	A timeout, in milliseconds, for the underlying HTTP request.

Example:

from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

collection.insert_one({"Marco": "Polo"})

collection.update_one({"Marco": {"$exists": True}}, {"$inc": {"rank": 3}})
# prints: UpdateResult(raw_results=..., update_info={'n': 1, 'updatedExisting': True, 'ok': 1.0, 'nModified': 1})
collection.update_one({"Mirko": {"$exists": True}}, {"$inc": {"rank": 3}})
# prints: UpdateResult(raw_results=..., update_info={'n': 0, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0})
collection.update_one(
    {"Mirko": {"$exists": True}},
    {"$inc": {"rank": 3}},
    upsert=True,
)
# prints: UpdateResult(raw_results=..., update_info={'n': 1, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0, 'upserted': '2a45ff60-...'})
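
The three `update_info` shapes printed above can be told apart programmatically. The sketch below classifies them using only the keys visible in those outputs. `describe_update` is a hypothetical helper written for this page, not an astrapy function.

```python
from typing import Any, Dict


def describe_update(update_info: Dict[str, Any]) -> str:
    """Classify an update outcome from an update_info dictionary (hypothetical helper)."""
    if "upserted" in update_info:
        # An upsert took place: the key holds the ID of the new document.
        return f"inserted new document {update_info['upserted']}"
    if update_info.get("updatedExisting"):
        return f"modified {update_info.get('nModified', 0)} existing document(s)"
    return "no matching document"


# The dictionaries below are copied from the example outputs on this page.
print(describe_update({"n": 1, "updatedExisting": True, "ok": 1.0, "nModified": 1}))
print(describe_update({"n": 0, "updatedExisting": False, "ok": 1.0, "nModified": 0}))
print(describe_update(
    {"n": 1, "updatedExisting": False, "ok": 1.0, "nModified": 0, "upserted": "2a45ff60-..."}
))
```

Assuming the result object exposes the dictionary as the `update_info` attribute, as its printed form suggests, a call would look like `describe_update(update_result.update_info)`.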

TypeScript

Hello TypeScript world. TBD.

Java

Hello Java world. TBD.

cURL

Hello cURL world. TBD.

Update multiple documents

Update multiple documents in a collection.

Python

results = collection.update_many(
    {"name": {"$exists": False}},
    {"$set": {"name": "unknown"}},
)

Update multiple documents in a collection, inserting a new one if no matches are found.

results = collection.update_many(
    {"name": {"$exists": False}},
    {"$set": {"name": "unknown"}},
    upsert=True,
)

Returns:

UpdateResult - An object representing the response from the database after the update operation. It includes information about the operation.

Example response

UpdateResult(raw_results=[{'status': {'matchedCount': 2, 'modifiedCount': 2}}], update_info={'n': 2, 'updatedExisting': True, 'ok': 1.0, 'nModified': 2})

Parameters:

Name	Type	Description
filter	`Dict[str, Any]`	A predicate expressed as a dictionary according to the Data API filter syntax. Examples are `{}`, `{"name": "John"}`, `{"price": {"$lte": 100}}`, `{"$and": [{"name": "John"}, {"price": {"$lte": 100}}]}`. See Data API operators for the full list of operators.
update	`Dict[str, Any]`	The update prescription to apply to the documents, expressed as a dictionary as per Data API syntax. Examples are: `{"$set": {"field": "value"}}`, `{"$inc": {"counter": 10}}` and `{"$unset": {"field": ""}}`. See Data API operators for the full syntax.
upsert	`bool = False`	This parameter controls the behavior in absence of matches. If True, a single new document (resulting from applying `update` to an empty document) is inserted if no matches are found on the collection. If False, the operation silently does nothing in case of no matches.
max_time_ms	`Optional[int]`	A timeout, in milliseconds, for the operation.

Example:

from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

collection.insert_many([{"c": "red"}, {"c": "green"}, {"c": "blue"}])

collection.update_many({"c": {"$ne": "green"}}, {"$set": {"nongreen": True}})
# prints: UpdateResult(raw_results=..., update_info={'n': 2, 'updatedExisting': True, 'ok': 1.0, 'nModified': 2})
collection.update_many({"c": "orange"}, {"$set": {"is_also_fruit": True}})
# prints: UpdateResult(raw_results=..., update_info={'n': 0, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0})
collection.update_many(
    {"c": "orange"},
    {"$set": {"is_also_fruit": True}},
    upsert=True,
)
# prints: UpdateResult(raw_results=..., update_info={'n': 1, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0, 'upserted': '46643050-...'})

TypeScript

Hello TypeScript world. TBD.

Java

Hello Java world. TBD.

cURL

Use the Data API updateMany command to update multiple documents in a collection.

In this example, the JSON payload uses the $set update operator to change a status to "inactive" for those documents that have an "active" status.

The updateMany command supports pagination for cases where more documents matching the filter remain on a subsequent page. For more, see the pagination note after the cURL example.

Example:

curl -s --location \
--request POST ${ASTRA_DB_API_ENDPOINT}/api/json/v1/${ASTRA_DB_KEYSPACE}/${ASTRA_DB_COLLECTION} \
--header "Token: ${ASTRA_DB_APPLICATION_TOKEN}" \
--header "Content-Type: application/json" \
--header "Accept: application/json" \
--data '{
  "updateMany": {
    "filter": {"status" : "active" },
    "update" : {"$set" : { "status" : "inactive"}}
  }
}' | json_pp

Result:

{
   "status" : {
      "matchedCount" : 20,
      "modifiedCount" : 20,
      "moreData" : true
   }
}

Name	Type	Description
updateMany	command	Updates multiple documents in the database’s collection.
filter	object	Defines the criteria for selecting the documents to which the command applies. In this example, `status` is the property evaluated in each document, and `active` is the value that property must match for the document to be selected.
update	object	Specifies the modifications to be applied to all documents that match the criteria set by the filter.
$set	operator	An update operator indicating that the operation should overwrite the value of a property (or properties) in the selected documents.
status	String	Specifies the property to update in each selected document. In this example, its value changes from `active` to `inactive`.

Note

In the updateMany response, check whether a nextPageState ID was returned. The updateMany command supports pagination and updates one page of matching documents at a time. If a subsequent page of matching documents remains, the response includes a nextPageState ID. Submit the updateMany command again, passing that ID as pageState in the new request, to update the next page of documents that match the filter:

{
    "updateMany": {
        "filter": {
            "active_user": true
        },
        "update": {
            "$set": {
                "new_data": "new_data_value"
            }
        },
        "options": {
            "pageState": "<id-value-from-prior-response>"
        }
    }
}

Continue issuing updateMany commands until all pages of documents matching the filter have been updated.
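
The paging sequence described in this note can be sketched as a loop. The example below is a sketch under stated assumptions: `update_all_pages` and `fake_send` are hypothetical names, the transport is a local fake rather than a real HTTP call, and the next-page ID is assumed to arrive as `nextPageState` inside the `status` object of the response.

```python
from typing import Any, Callable, Dict, List, Optional

Command = Dict[str, Any]


def update_all_pages(
    send_command: Callable[[Command], Dict[str, Any]],
    doc_filter: Dict[str, Any],
    update: Dict[str, Any],
) -> int:
    """Repeat updateMany until no further page is reported; return the total modifiedCount."""
    total_modified = 0
    page_state: Optional[str] = None
    while True:
        body: Dict[str, Any] = {"filter": doc_filter, "update": update}
        if page_state is not None:
            body["options"] = {"pageState": page_state}
        status = send_command({"updateMany": body}).get("status", {})
        total_modified += status.get("modifiedCount", 0)
        # Assumption: the ID for the next page is returned as status.nextPageState.
        page_state = status.get("nextPageState")
        if not page_state:
            return total_modified


# Fake transport standing in for the HTTP POST: it serves two pages of results.
pages = {
    None: {"status": {"matchedCount": 20, "modifiedCount": 20, "moreData": True, "nextPageState": "page-2"}},
    "page-2": {"status": {"matchedCount": 5, "modifiedCount": 5}},
}
sent: List[Command] = []


def fake_send(command: Command) -> Dict[str, Any]:
    sent.append(command)
    state = command["updateMany"].get("options", {}).get("pageState")
    return pages[state]


total = update_all_pages(fake_send, {"status": "active"}, {"$set": {"status": "inactive"}})
print(total)  # 25
```

In real use, `send_command` would wrap the POST request shown in the cURL example. Check the exact response shape of your Data API version before relying on the assumed field names.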

Find distinct values across documents

Get a list of the distinct values of a certain key in a collection.

Python

collection.distinct("category")

Get the distinct values in a subset of documents, with a key defined by a dot-syntax path.

collection.distinct(
    "food.allergies",
    filter={"registered_for_dinner": True},
)

Returns:

List[Any] - A list of the distinct values encountered. Documents that lack the requested key are ignored.

Example response

['home_appliance', None, 'sports_equipment', {'cat_id': 54, 'cat_name': 'gardening_gear'}]

Parameters:

Name	Type	Description
key	`str`	The name of the field whose value is inspected across documents. Keys can use dot-notation to descend to deeper document levels. Example of acceptable `key` values: `"field"`, `"field.subfield"`, `"field.3"`, and `"field.3.subfield"`. If lists are encountered and no numeric index is specified, all items in the list are visited.
filter	`Optional[Dict[str, Any]]`	A predicate expressed as a dictionary according to the Data API filter syntax. Examples are `{}`, `{"name": "John"}`, `{"price": {"$lte": 100}}`, `{"$and": [{"name": "John"}, {"price": {"$lte": 100}}]}`. See Data API operators for the full list of operators.
max_time_ms	`Optional[int]`	A timeout, in milliseconds, for the operation.

Note

Keep in mind that distinct is a client-side operation, which effectively browses all required documents using the logic of the find method and collects the unique values found for key. As such, there may be performance, latency, and ultimately billing implications if the number of matching documents is large.

For details on the behavior of "distinct" in conjunction with real-time changes in the collection contents, see the discussion in the Example values for sort operations section.

Example:

from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

collection.insert_many(
    [
        {"name": "Marco", "food": ["apple", "orange"], "city": "Helsinki"},
        {"name": "Emma", "food": {"likes_fruit": True, "allergies": []}},
    ]
)

collection.distinct("name")
# prints: ['Marco', 'Emma']
collection.distinct("city")
# prints: ['Helsinki']
collection.distinct("food")
# prints: ['apple', 'orange', {'likes_fruit': True, 'allergies': []}]
collection.distinct("food.1")
# prints: ['orange']
collection.distinct("food.allergies")
# prints: []
collection.distinct("food.likes_fruit")
# prints: [True]
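
Because distinct collects its values on the client side, the dot-notation rules described for `key` can be illustrated on plain dictionaries. The sketch below is a simplified local model, not astrapy's implementation: `local_distinct` and `_walk` are hypothetical helpers that follow the rules stated in the parameters table (numeric path segments index into lists, and lists without a numeric index have all their items visited).

```python
from typing import Any, Dict, Iterable, Iterator, List


def _walk(value: Any, parts: List[str]) -> Iterator[Any]:
    """Yield every value reachable from `value` by following the path `parts`."""
    if not parts:
        # End of the path: a list contributes its items, anything else itself.
        if isinstance(value, list):
            yield from value
        else:
            yield value
        return
    head, rest = parts[0], parts[1:]
    if isinstance(value, dict):
        if head in value:
            yield from _walk(value[head], rest)
    elif isinstance(value, list):
        if head.isdigit():
            index = int(head)
            if index < len(value):
                yield from _walk(value[index], rest)
        else:
            # No numeric index given: visit all items of the list.
            for item in value:
                yield from _walk(item, parts)
    # Scalars cannot be descended into, so they yield nothing.


def local_distinct(documents: Iterable[Dict[str, Any]], key: str) -> List[Any]:
    """Collect distinct values for `key`, preserving first-seen order."""
    seen: List[Any] = []
    for document in documents:
        for value in _walk(document, key.split(".")):
            # A list is used (not a set) because values may be unhashable dicts.
            if value not in seen:
                seen.append(value)
    return seen


docs = [
    {"name": "Marco", "food": ["apple", "orange"], "city": "Helsinki"},
    {"name": "Emma", "food": {"likes_fruit": True, "allergies": []}},
]
print(local_distinct(docs, "food"))
print(local_distinct(docs, "food.1"))          # ['orange']
print(local_distinct(docs, "food.allergies"))  # []
```

The model reproduces the outputs of the example above, which is the only behavior it is meant to illustrate.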

TypeScript

Hello TypeScript world. TBD.

Java

Hello Java world. TBD.

cURL

Hello cURL world. TBD.

Count documents in a collection

Get the count of all documents in a collection.

Python

collection.count_documents({}, upper_bound=500)

Get the count of the documents in a collection matching a condition.

collection.count_documents({"seq":{"$gt": 15}}, upper_bound=50)

Returns:

int - The exact count of the documents counted as requested, unless it exceeds the caller-provided or API-set upper bound. In case of overflow, an exception is raised.

Example response

Parameters:

Name	Type	Description
filter	`Optional[Dict[str, Any]]`	A predicate expressed as a dictionary according to the Data API filter syntax. Examples are `{}`, `{"name": "John"}`, `{"price": {"$lte": 100}}`, `{"$and": [{"name": "John"}, {"price": {"$lte": 100}}]}`. See Data API operators for the full list of operators.
upper_bound	`int`	A required ceiling on the result of the count operation. If the actual number of documents exceeds this value, an exception will be raised. Furthermore, if the actual number of documents exceeds the maximum count that the Data API can reach (regardless of upper_bound), an exception will be raised.
max_time_ms	`Optional[int]`	A timeout, in milliseconds, for the underlying HTTP request.

Example:

from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

collection.insert_many([{"seq": i} for i in range(20)])

collection.count_documents({}, upper_bound=100)
# prints: 20
collection.count_documents({"seq":{"$gt": 15}}, upper_bound=100)
# prints: 4
collection.count_documents({}, upper_bound=10)
# Raises: astrapy.exceptions.TooManyDocumentsToCountException
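
The role of `upper_bound` can be pictured with a local model of the same data. In this sketch, `bounded_count` and `TooManyDocuments` are hypothetical stand-ins for the library behavior described above. They work on an in-memory list and do not call the Data API.

```python
from typing import Any, Callable, Dict, Iterable


class TooManyDocuments(Exception):
    """Local stand-in for the exception raised when a count exceeds upper_bound."""


def bounded_count(
    documents: Iterable[Dict[str, Any]],
    predicate: Callable[[Dict[str, Any]], bool],
    upper_bound: int,
) -> int:
    """Count matching documents, failing as soon as the count exceeds upper_bound."""
    count = 0
    for document in documents:
        if predicate(document):
            count += 1
            if count > upper_bound:
                raise TooManyDocuments(f"more than {upper_bound} matching documents")
    return count


docs = [{"seq": i} for i in range(20)]
print(bounded_count(docs, lambda d: True, 100))           # 20
print(bounded_count(docs, lambda d: d["seq"] > 15, 100))  # 4
try:
    bounded_count(docs, lambda d: True, 10)
except TooManyDocuments as exc:
    print("count refused:", exc)
```

The point of the ceiling is that the caller decides up front how large a count is acceptable, and gets an exception rather than a partial number when it is exceeded.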

TypeScript

Hello TypeScript world. TBD.

Java

Hello Java world. TBD.

cURL

Use the Data API estimatedDocumentCount command to return the approximate number of documents in the collection.

Tip

In the estimatedDocumentCount command’s response, the document count is based on the current system statistics at the time the request was received by the database server. Due to potentially in-progress updates (document additions and/or deletions), the actual number of documents in the collection may be lower or higher in the database.

curl -s --location \
--request POST ${ASTRA_DB_API_ENDPOINT}/api/json/v1/${ASTRA_DB_KEYSPACE}/${ASTRA_DB_COLLECTION} \
--header "Token: ${ASTRA_DB_APPLICATION_TOKEN}" \
--header "Content-Type: application/json" \
--header "Accept: application/json" \
--data '{
            "estimatedDocumentCount": {
            }
}' | json_pp

Result:

{
    "status": {
        "count": 21
    }
}

Properties:

Name	Type	Description
estimatedDocumentCount	command	Returns an estimated count of documents within the context of the specified collection.

The estimatedDocumentCount object is empty ({ }), meaning there are no filters or options for this implementation of the command.

Find and replace a document

Locate a document matching a filter condition and replace it with a new document, returning the document itself.

Python

collection.find_one_and_replace(
    {"_id": "rule1"},
    {"text": "some animals are more equal!"},
)

Locate and replace a document, returning the document itself, additionally creating it if nothing is found.

collection.find_one_and_replace(
    {"_id": "rule1"},
    {"text": "some animals are more equal!"},
    upsert=True,
)

Returns:

Union[Dict[str, Any], None] - The document that was found, either before or after the replacement (or a projection thereof, as requested). If no matches are found, None is returned.

Example response

{'_id': 'rule1', 'text': 'all animals are equal'}

Parameters:

Name	Type	Description
filter	`Dict[str, Any]`	A predicate expressed as a dictionary according to the Data API filter syntax. Examples are `{}`, `{"name": "John"}`, `{"price": {"$lte": 100}}`, `{"$and": [{"name": "John"}, {"price": {"$lte": 100}}]}`. See Data API operators for the full list of operators.
replacement	`Dict[str, Any]`	The new document to write into the collection.
projection	`Optional[ProjectionType]`	Used to select a subset of fields in the document being returned. The projection can be: an iterable over the field names to return; a dictionary {field_name: True} to positively select certain fields; or a dictionary {field_name: False} if one wants to discard some fields from the response. The default is to return the whole document.
vector	`Optional[Iterable[float]]`	A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to use vector search, that is, Approximate Nearest Neighbors (ANN) search, as the sorting criterion. In this way, the matched document (if any) will be the one that is most similar to the provided vector. This parameter cannot be used together with `sort`. See the `find` method for more details on this parameter.
sort	`Optional[Dict[str, Any]]`	With this dictionary parameter one can control the sorting order of the documents matching the filter, effectively determining what document will come first and hence be the replaced one. See the `find` method for more on sorting.
upsert	`bool = False`	This parameter controls the behavior in absence of matches. If True, `replacement` is inserted as a new document if no matches are found on the collection. If False, the operation silently does nothing in case of no matches.
return_document	`str`	A flag controlling what document is returned: if set to `ReturnDocument.BEFORE`, or the string "before", the document found on database is returned; if set to `ReturnDocument.AFTER`, or the string "after", the new document is returned. The default is "before".
max_time_ms	`Optional[int]`	A timeout, in milliseconds, for the underlying HTTP request.

Example:

from astrapy import DataAPIClient
import astrapy
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

collection.insert_one({"_id": "rule1", "text": "all animals are equal"})

collection.find_one_and_replace(
    {"_id": "rule1"},
    {"text": "some animals are more equal!"},
)
# prints: {'_id': 'rule1', 'text': 'all animals are equal'}
collection.find_one_and_replace(
    {"text": "some animals are more equal!"},
    {"text": "and the pigs are the rulers"},
    return_document=astrapy.constants.ReturnDocument.AFTER,
)
# prints: {'_id': 'rule1', 'text': 'and the pigs are the rulers'}
collection.find_one_and_replace(
    {"_id": "rule2"},
    {"text": "F=ma^2"},
    return_document=astrapy.constants.ReturnDocument.AFTER,
)
# (returns None for no matches)
collection.find_one_and_replace(
    {"_id": "rule2"},
    {"text": "F=ma"},
    upsert=True,
    return_document=astrapy.constants.ReturnDocument.AFTER,
    projection={"_id": False},
)
# prints: {'text': 'F=ma'}
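
The first three calls of the example can be replayed against an in-memory list to show what "before" and "after" refer to. This is a hedged local model: `find_one_and_replace_local` is a hypothetical helper that matches on exact top-level equality only, ignores projection and upsert, and assumes the replaced document keeps its original `_id`, as the outputs above suggest.

```python
from typing import Any, Dict, List, Optional


def find_one_and_replace_local(
    documents: List[Dict[str, Any]],
    doc_filter: Dict[str, Any],
    replacement: Dict[str, Any],
    return_document: str = "before",
) -> Optional[Dict[str, Any]]:
    """Replace the first document matching doc_filter; return it before or after the change."""
    for position, document in enumerate(documents):
        if all(k in document and document[k] == v for k, v in doc_filter.items()):
            # Assumption: the replaced document keeps the original "_id".
            new_document = {"_id": document["_id"], **replacement}
            documents[position] = new_document
            return document if return_document == "before" else new_document
    # No match: nothing is changed and None is returned.
    return None


store = [{"_id": "rule1", "text": "all animals are equal"}]

first = find_one_and_replace_local(
    store, {"_id": "rule1"}, {"text": "some animals are more equal!"}
)
second = find_one_and_replace_local(
    store,
    {"text": "some animals are more equal!"},
    {"text": "and the pigs are the rulers"},
    return_document="after",
)
third = find_one_and_replace_local(
    store, {"_id": "rule2"}, {"text": "F=ma^2"}, return_document="after"
)
print(first)   # {'_id': 'rule1', 'text': 'all animals are equal'}
print(second)  # {'_id': 'rule1', 'text': 'and the pigs are the rulers'}
print(third)   # None
```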

TypeScript

Hello TypeScript world. TBD.

Java

Hello Java world. TBD.

cURL

Hello cURL world. TBD.

Replace a document

Replace a document in the collection with a new one.

Python

replace_result = collection.replace_one(
    {"Marco": {"$exists": True}},
    {"Buda": "Pest"},
)

Replace a document in the collection with a new one, creating a new one if no match is found.

replace_result = collection.replace_one(
    {"Marco": {"$exists": True}},
    {"Buda": "Pest"},
    upsert=True,
)

Returns:

UpdateResult - An object representing the response from the database after the replace operation. It includes information about the operation.

Example response

UpdateResult(raw_results=[{'data': {'document': {'_id': '1', 'Marco': 'Polo'}}, 'status': {'matchedCount': 1, 'modifiedCount': 1}}], update_info={'n': 1, 'updatedExisting': True, 'ok': 1.0, 'nModified': 1})

Parameters:

Name	Type	Description
filter	`Dict[str, Any]`	A predicate expressed as a dictionary according to the Data API filter syntax. Examples are `{}`, `{"name": "John"}`, `{"price": {"$lte": 100}}`, `{"$and": [{"name": "John"}, {"price": {"$lte": 100}}]}`. See Data API operators for the full list of operators.
replacement	`Dict[str, Any]`	The new document to write into the collection.
vector	`Optional[Iterable[float]]`	A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to use for vector search (that is, Approximate Nearest Neighbors, or ANN, search) as the sorting criterion. In this way, the matched document (if any) will be the one that is most similar to the provided vector. This parameter cannot be used together with `sort`. See the `find` method for more details on this parameter.
sort	`Optional[Dict[str, Any]]`	With this dictionary parameter one can control the sorting order of the documents matching the filter, effectively determining which document comes first and is therefore the one replaced. See the `find` method for more on sorting.
upsert	`bool = False`	This parameter controls the behavior in the absence of matches. If True, `replacement` is inserted as a new document if no matches are found in the collection. If False, the operation silently does nothing in case of no matches.
max_time_ms	`Optional[int]`	A timeout, in milliseconds, for the underlying HTTP request.

Example:

from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

collection.insert_one({"Marco": "Polo"})
collection.replace_one({"Marco": {"$exists": True}}, {"Buda": "Pest"})
# prints: UpdateResult(raw_results=..., update_info={'n': 1, 'updatedExisting': True, 'ok': 1.0, 'nModified': 1})
collection.find_one({"Buda": "Pest"})
# prints: {'_id': '8424905a-...', 'Buda': 'Pest'}
collection.replace_one({"Mirco": {"$exists": True}}, {"Oh": "yeah?"})
# prints: UpdateResult(raw_results=..., update_info={'n': 0, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0})
collection.replace_one({"Mirco": {"$exists": True}}, {"Oh": "yeah?"}, upsert=True)
# prints: UpdateResult(raw_results=..., update_info={'n': 1, 'updatedExisting': False, 'ok': 1.0, 'nModified': 0, 'upserted': '931b47d6-...'})
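The three `update_info` dictionaries in the example differ only in a few keys, so application code can branch on them. The sketch below works on plain dictionaries shaped like the ones shown; `describe_replace_outcome` is a hypothetical helper name, and the final "matched, but unchanged" branch is an assumption about a case the example does not show.

```python
from typing import Any, Dict


def describe_replace_outcome(update_info: Dict[str, Any]) -> str:
    """Summarize an `update_info` dictionary as a short label."""
    if "upserted" in update_info:
        # Only present when a new document was created through upsert=True.
        return f"upserted new document {update_info['upserted']}"
    if update_info.get("n", 0) == 0:
        return "no match, nothing written"
    if update_info.get("nModified", 0) > 0:
        return "replaced existing document"
    # Assumed case: a match whose replacement left the document unchanged.
    return "matched, but document unchanged"


print(describe_replace_outcome(
    {"n": 1, "updatedExisting": True, "ok": 1.0, "nModified": 1}
))
# prints: replaced existing document
print(describe_replace_outcome(
    {"n": 0, "updatedExisting": False, "ok": 1.0, "nModified": 0}
))
# prints: no match, nothing written
print(describe_replace_outcome(
    {"n": 1, "updatedExisting": False, "ok": 1.0, "nModified": 0, "upserted": "931b47d6-..."}
))
# prints: upserted new document 931b47d6-...
```

With a real `UpdateResult`, the dictionary to pass would be its `update_info` attribute, as displayed in the example response.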

TypeScript

Hello TypeScript world. TBD.

Java

Hello Java world. TBD.

cURL

Hello cURL world. TBD.

Find and delete a document

Locate a document matching a filter condition and delete it, returning the document itself.

Python

collection.find_one_and_delete({"status": "stale_entry"})

Returns:

Optional[Dict[str, Any]] - The document that was just deleted (or a projection thereof, as requested), or None if no matches are found.

Example response

{'_id': 199, 'status': 'stale_entry', 'request_id': 'A4431'}

Parameters:

Name	Type	Description
filter	`Optional[Dict[str, Any]]`	A predicate expressed as a dictionary according to the Data API filter syntax. Examples are `{}`, `{"name": "John"}`, `{"price": {"$lte": 100}}`, `{"$and": [{"name": "John"}, {"price": {"$lte": 100}}]}`. See Data API operators for the full list of operators.
projection	`Optional[Union[Iterable[str], Dict[str, bool]]]`	Used to select a subset of fields in the documents being returned. The projection can be: an iterable over the field names to return; a dictionary {field_name: True} to positively select certain fields; or a dictionary {field_name: False} if one wants to discard some fields from the response. The default is to return the whole documents.
vector	`Optional[Iterable[float]]`	A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to perform vector search (that is, Approximate Nearest Neighbors, or ANN, search), extracting the most similar document in the collection matching the filter. This parameter cannot be used together with `sort`. See the `find` method for more details on this parameter.
sort	`Optional[Dict[str, Any]]`	With this dictionary parameter one can control the sorting order of the documents matching the filter, effectively determining what document will come first and hence be the deleted one. See the `find` method for more on sorting.
max_time_ms	`Optional[int]`	A timeout, in milliseconds, for the underlying HTTP request.

Example:

from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

collection.insert_many(
    [
        {"species": "swan", "class": "Aves"},
        {"species": "frog", "class": "Amphibia"},
    ],
)
collection.find_one_and_delete(
    {"species": {"$ne": "frog"}},
    projection=["species"],
)
# prints: {'_id': '5997fb48-...', 'species': 'swan'}
collection.find_one_and_delete({"species": {"$ne": "frog"}})
# (returns None for no matches)
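Because `find_one_and_delete` returns None once nothing matches, it lends itself to a simple "claim and process" loop. The sketch below relies only on that documented behavior; `drain_matching` and `InMemoryCollection` are hypothetical names, and the in-memory class is a stand-in (equality filters only) so the loop can run without a database.

```python
from typing import Any, Callable, Dict, List, Optional


def drain_matching(
    collection: Any,
    criteria: Dict[str, Any],
    handle: Callable[[Dict[str, Any]], None],
    limit: int = 1000,
) -> int:
    """Delete matching documents one at a time, handing each to `handle`."""
    processed = 0
    while processed < limit:
        document = collection.find_one_and_delete(criteria)
        if document is None:  # nothing left that matches the filter
            break
        handle(document)
        processed += 1
    return processed


class InMemoryCollection:
    """Hypothetical stand-in exposing only find_one_and_delete."""

    def __init__(self, documents: List[Dict[str, Any]]) -> None:
        self.documents = list(documents)

    def find_one_and_delete(self, criteria: Dict[str, Any]) -> Optional[Dict[str, Any]]:
        for position, document in enumerate(self.documents):
            if all(document.get(key) == value for key, value in criteria.items()):
                return self.documents.pop(position)
        return None


demo = InMemoryCollection([
    {"_id": 1, "status": "stale_entry"},
    {"_id": 2, "status": "fresh"},
    {"_id": 3, "status": "stale_entry"},
])
handled: List[Dict[str, Any]] = []
print(drain_matching(demo, {"status": "stale_entry"}, handled.append))
# prints: 2
```

Note that each document is already deleted by the time `handle` runs, so a failure inside `handle` would lose that document unless the caller stores it elsewhere first.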

TypeScript

Hello TypeScript world. TBD.

Java

Hello Java world. TBD.

cURL

Hello cURL world. TBD.

Delete a document

Locate and delete a single document from a collection.

Python

result = collection.delete_one({"_id": "1"})

Locate and delete a single document from a collection by any attribute (as long as it is covered by the collection’s indexing configuration).

result = collection.delete_one({"location": "warehouse_C"})

Locate and delete a single document from a collection by an arbitrary filtering clause.

result = collection.delete_one({"tag": {"$exists": True}})

Delete the document most similar to a given vector.

result = collection.delete_one({}, vector=[.12, .52, .32])

Returns:

DeleteResult - An object representing the response from the database after the delete operation. It includes information about the success of the operation.

Example response

DeleteResult(raw_results=[{'status': {'deletedCount': 1}}], deleted_count=1)

Parameters:

Name	Type	Description
filter	`Optional[Dict[str, Any]]`	A predicate expressed as a dictionary according to the Data API filter syntax. Examples are `{}`, `{"name": "John"}`, `{"price": {"$lte": 100}}`, `{"$and": [{"name": "John"}, {"price": {"$lte": 100}}]}`. See Data API operators for the full list of operators.
vector	`Optional[Iterable[float]]`	A suitable vector, meaning a list of float numbers of the appropriate dimensionality, to use for vector search (that is, Approximate Nearest Neighbors, or ANN, search) as the sorting criterion. In this way, the matched document (if any) will be the one that is most similar to the provided vector. This parameter cannot be used together with `sort`. See the `find` method for more details on this parameter.
sort	`Optional[Dict[str, Any]]`	With this dictionary parameter one can control the sorting order of the documents matching the filter, effectively determining what document will come first and hence be the deleted one. See the `find` method for more on sorting.
max_time_ms	`Optional[int]`	A timeout, in milliseconds, for the underlying HTTP request.

Example:

from astrapy import DataAPIClient
import astrapy
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

collection.insert_many([{"seq": 1}, {"seq": 0}, {"seq": 2}])

collection.delete_one({"seq": 1})
# prints: DeleteResult(raw_results=..., deleted_count=1)
collection.distinct("seq")
# prints: [0, 2]
collection.delete_one(
    {"seq": {"$exists": True}},
    sort={"seq": astrapy.constants.SortDocuments.DESCENDING},
)
# prints: DeleteResult(raw_results=..., deleted_count=1)
collection.distinct("seq")
# prints: [0]
collection.delete_one({"seq": 2})
# prints: DeleteResult(raw_results=..., deleted_count=0)

TypeScript

Hello TypeScript world. TBD.

Java

Hello Java world. TBD.

cURL

Hello cURL world. TBD.

Delete documents

Delete multiple documents from a collection.

Python

delete_result = collection.delete_many({"status": "processed"})

Returns:

DeleteResult - An object representing the response from the database after the delete operation. It includes information about the success of the operation.

Example response

DeleteResult(raw_results=[{'status': {'deletedCount': 2}}], deleted_count=2)

Parameters:

Name	Type	Description
filter	`Optional[Dict[str, Any]]`	A predicate expressed as a dictionary according to the Data API filter syntax. Examples are `{"name": "John"}`, `{"price": {"$lte": 100}}`, `{"$and": [{"name": "John"}, {"price": {"$lte": 100}}]}`. See Data API operators for the full list of operators. The `delete_many` method does not accept an empty filter: see `delete_all` to completely erase all contents of a collection.
max_time_ms	`Optional[int]`	A timeout, in milliseconds, for the operation.

Note	This method does not accept an empty filter clause: use the `delete_all` method to delete all documents in the collection.

Example:

from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

collection.insert_many([{"seq": 1}, {"seq": 0}, {"seq": 2}])

collection.delete_many({"seq": {"$lte": 1}})
# prints: DeleteResult(raw_results=..., deleted_count=2)
collection.distinct("seq")
# prints: [2]
collection.delete_many({"seq": {"$lte": 1}})
# prints: DeleteResult(raw_results=..., deleted_count=0)
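Since `delete_many` does not accept an empty filter, code that builds filters dynamically may want to catch the empty case before any request is sent. The following is a minimal sketch of such a guard: `delete_matching` is a hypothetical helper, and `RecordingCollection` is a stand-in that only records the filters it receives, so the example runs without a database.

```python
from typing import Any, Dict, List


def delete_matching(collection: Any, criteria: Dict[str, Any]) -> Any:
    """Call delete_many only when the filter is non-empty."""
    if not criteria:
        raise ValueError(
            "empty filter: call delete_all() explicitly to erase the collection"
        )
    return collection.delete_many(criteria)


class RecordingCollection:
    """Hypothetical stand-in that only records the filters it receives."""

    def __init__(self) -> None:
        self.received: List[Dict[str, Any]] = []

    def delete_many(self, criteria: Dict[str, Any]) -> str:
        self.received.append(criteria)
        return "delete_many called"


demo = RecordingCollection()
delete_matching(demo, {"seq": {"$lte": 1}})  # forwarded to delete_many
try:
    delete_matching(demo, {})  # rejected before reaching the collection
except ValueError as error:
    print(error)
# prints: empty filter: call delete_all() explicitly to erase the collection
```

Raising instead of silently switching to `delete_all` keeps the destructive "erase everything" path an explicit choice of the caller.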

TypeScript

Hello TypeScript world. TBD.

Java

Hello Java world. TBD.

cURL

Hello cURL world. TBD.

Execute multiple write operations

Execute a (reusable) list of write operations on a collection with a single command.

Python

bw_results = collection.bulk_write(
    [
        InsertMany([{"a": 1}, {"a": 2}]),
        ReplaceOne(
            {"z": 9},
            replacement={"z": 9, "replaced": True},
            upsert=True,
        ),
    ],
)

Returns:

BulkWriteResult - A single object summarizing the whole list of requested operations. The keys in the map attributes of the result (when present) are the integer indices of the corresponding operation in the requests iterable.

Example response

BulkWriteResult(bulk_api_results={0: ..., 1: ...}, deleted_count=0, inserted_count=3, matched_count=0, modified_count=0, upserted_count=1, upserted_ids={1: '2addd676-...'})

Parameters:

Name	Type	Description
requests	`Iterable[BaseOperation]`	An iterable over concrete subclasses of `BaseOperation`, such as `InsertMany` or `ReplaceOne`. Each such object represents an operation ready to be executed on a collection, and is instantiated by passing the same parameters as one would the corresponding collection method.
ordered	`bool`	Whether to launch the `requests` one after the other or in arbitrary order, possibly in a concurrent fashion. For performance reasons, `ordered=False` should be preferred when compatible with the needs of the application flow.
concurrency	`Optional[int]`	Maximum number of concurrent operations executing at a given time. It cannot be more than one for ordered bulk writes.
max_time_ms	`Optional[int]`	A timeout, in milliseconds, for the whole bulk write. Remember that, if the method call times out, then there’s no guarantee about what portion of the bulk write has been received and successfully executed by the Data API.

Example:

from astrapy import DataAPIClient
from astrapy.operations import (
    InsertOne,
    InsertMany,
    UpdateOne,
    UpdateMany,
    ReplaceOne,
    DeleteOne,
    DeleteMany,
)
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

op1 = InsertMany([{"a": 1}, {"a": 2}])
op2 = ReplaceOne({"z": 9}, replacement={"z": 9, "replaced": True}, upsert=True)
collection.bulk_write([op1, op2])
# prints: BulkWriteResult(bulk_api_results={0: ..., 1: ...}, deleted_count=0, inserted_count=3, matched_count=0, modified_count=0, upserted_count=1, upserted_ids={1: '2addd676-...'})
collection.count_documents({}, upper_bound=100)
# prints: 3
collection.distinct("replaced")
# prints: [True]
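The `ordered` and `concurrency` parameters, and the way results are keyed by the index of each operation, can be pictured with a small standalone model. This is a sketch of the general idea only, not how astrapy executes bulk writes: `run_operations` is a hypothetical function, operations are modeled as plain callables, and the fallback of four workers is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List, Optional


def run_operations(
    operations: List[Callable[[], Any]],
    ordered: bool = True,
    concurrency: Optional[int] = None,
) -> Dict[int, Any]:
    """Run callables and return their results keyed by position in the list."""
    if ordered:
        if concurrency not in (None, 1):
            raise ValueError("ordered execution cannot use concurrency > 1")
        # One after the other, in list order.
        return {index: operation() for index, operation in enumerate(operations)}
    # Unordered: submit everything, completion order is not guaranteed.
    with ThreadPoolExecutor(max_workers=concurrency or 4) as pool:
        futures = {
            index: pool.submit(operation)
            for index, operation in enumerate(operations)
        }
        return {index: future.result() for index, future in futures.items()}


results = run_operations(
    [lambda: "inserted 2", lambda: "upserted 1"],
    ordered=False,
    concurrency=2,
)
print(results[1])
# prints: upserted 1
```

Whatever the execution order, each result stays addressable by the index of its operation, which is the same convention the map attributes of `BulkWriteResult` follow.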

TypeScript

Hello TypeScript world. TBD.

Java

Hello Java world. TBD.

cURL

Hello cURL world. TBD.

Delete all documents from a collection

Delete all documents in a collection.

Python

result = collection.delete_all()

Returns:

Dict - A dictionary in the form {"ok": 1} if the method succeeds.

Example response

{'ok': 1}

Parameters:

Name	Type	Description
max_time_ms	`Optional[int]`	A timeout, in milliseconds, for the underlying HTTP request.

Example:

from astrapy import DataAPIClient
client = DataAPIClient("TOKEN")
database = client.get_database_by_api_endpoint("01234567-...")
collection = database.my_collection

collection.distinct("seq")
# prints: [2, 1, 0]
collection.count_documents({}, upper_bound=100)
# prints: 4
collection.delete_all()
# prints: {'ok': 1}
collection.count_documents({}, upper_bound=100)
# prints: 0

TypeScript

Hello TypeScript world. TBD.

Java

Hello Java world. TBD.

cURL

Hello cURL world. TBD.

Next steps

See the Administration reference topic.
