Releases: pinecone-io/pinecone-python-client
Release v3.1.0
Listing vector ids by prefix in a namespace (for serverless indexes)
We've added SDK support for a new data plane endpoint that lists vector ids by prefix in a given namespace. Passing an empty string as the prefix lists all ids in the namespace.
The index client now has `list` and `list_paginated` methods. With clever assignment of vector ids, this can help you model hierarchical relationships between vectors, such as when you have embeddings for multiple chunks or fragments of the same document.
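As an illustration of that id-assignment idea, one hypothetical scheme encodes the document in a prefix and the chunk in a suffix, so all chunks of a document can later be retrieved by listing ids with that document's prefix. The `doc#chunk` convention and helper below are illustrative, not part of the SDK:

```python
def chunk_ids(doc_id, num_chunks):
    """Build ids like 'doc1#0', 'doc1#1', ... for each chunk of a document."""
    return [f"{doc_id}#{i}" for i in range(num_chunks)]

all_ids = chunk_ids("doc1", 3) + chunk_ids("doc2", 2)

# Listing with prefix='doc1#' would return exactly the ids that
# str.startswith selects here:
doc1_ids = [i for i in all_ids if i.startswith("doc1#")]
print(doc1_ids)  # ['doc1#0', 'doc1#1', 'doc1#2']
```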
The `list` method returns a generator that handles pagination on your behalf.
```python
from pinecone import Pinecone

pc = Pinecone(api_key='xxx')
index = pc.Index(host='hosturl')
namespace = 'example-namespace'

# To iterate over all result pages using a generator function
for ids in index.list(prefix='pref', limit=3, namespace=namespace):
    print(ids)  # ['pref1', 'pref2', 'pref3']

    # Now you can pass this id array to other methods, such as fetch or delete.
    vectors = index.fetch(ids=ids, namespace=namespace)
```

There is also an option to fetch each page of results yourself with `list_paginated`.
```python
from pinecone import Pinecone

pc = Pinecone(api_key='xxx')
index = pc.Index(host='hosturl')
namespace = 'foo-namespace'

# For manual control over pagination
results = index.list_paginated(
    prefix='pref',
    limit=3,
    namespace=namespace,
    pagination_token='eyJza2lwX3Bhc3QiOiI5IiwicHJlZml4IjpudWxsfQ=='
)
print(results.namespace)  # 'foo-namespace'
print([v.id for v in results.vectors])  # ['pref1', 'pref2', 'pref3']
print(results.pagination.next)  # 'eyJza2lwX3Bhc3QiOiI5IiwicHJlZml4IjpudWxsfQ=='
print(results.usage)  # { 'read_units': 1 }
```

Python 3.11 and 3.12 support
We adjusted our declared Python version support (from `>=3.8,<3.13` to `^3.8`) to make it easier for tools with more expansive statements about the Python versions they support to include the Pinecone SDK as a dependency. Alongside this change, we expanded our test matrix to include more robust testing with Python 3.11 and 3.12. Python 3.13 is still in alpha and is not yet part of our test matrix.
- Adjust supported python versions to ^3.8 by @jhamon in #312
- Update pytest-timeout to support python >= 3.12 by @mjvankampen in #314
Chores
- Sync models from pinecone-protos by @fsxfreak in #315
- Fix minor README docs issues in client reference by @austin-denoble in #316
New Contributors
- @mjvankampen made their first contribution in #314
Full Changelog: v3.0.3...v3.1.0
Release v3.0.3
Fixes
- gRPC: parse_query_response: Skip parsing empty Usage by @daverigby in #301
- Support overriding `additional_headers` with `PINECONE_ADDITIONAL_HEADERS` environment variable by @fsxfreak in #304
- upsert_from_dataframe: Hide all progress bars if `!show_progress` by @daverigby in #310
Chores
- Update github actions dependencies to fix warnings by @jhamon in #308
- Update generated openapi code by @jhamon in #309
Full Changelog: v3.0.2...v3.0.3
Release v3.0.2
Fixes
Create indexes using source_collection option in PodSpec
This release resolves a bug when passing source_collection as part of the PodSpec. This option is used when creating a new index from vector data stored in a collection. The value of this field should be a collection you have created previously from an index and that shows with pc.list_collections(). Currently collections and pod-based indexes are not portable across environments.
```python
from pinecone import Pinecone, PodSpec

pc = Pinecone(api_key='YOUR_API_KEY')
pc.create_index(
    name='my-index',
    dimension=1536,
    metric='cosine',
    spec=PodSpec(
        environment='us-east1-gcp',
        source_collection='collection-2024jan16',
    )
)
```

Pass optional GRPCClientConfig when using PineconeGRPC
This could be considered either a UX bug fix or a micro-feature, depending on your perspective. In 3.0.2 we updated the `pc.Index` helper method used to build instances of the `GRPCIndex` class; it now accepts an optional keyword param `grpc_config`. Before this fix, you had to import and instantiate `GRPCIndex` yourself in order to pass this configuration and customize some settings, which was a bit clunky.
```python
from pinecone.grpc import PineconeGRPC, GRPCClientConfig

pc = PineconeGRPC(api_key='YOUR_API_KEY')
grpc_config = GRPCClientConfig(
    timeout=10,
    secure=True,
    reuse_channel=True
)
index = pc.Index(
    name='my-index',
    host='host',
    grpc_config=grpc_config
)

# Now do data operations
index.upsert(...)
```

Pass optional pool_threads config on the index
Similar to the `grpc_config` option, some users requested the ability to pass `pool_threads` when targeting an index rather than only at client initialization. The optional configuration is now accepted in both places, with the value passed to `.Index()` taking precedence.
Now these are both valid approaches:
```python
from pinecone import Pinecone

pc = Pinecone(api_key='key', pool_threads=5)
index = pc.Index(host='host')
index.upsert(...)
```

```python
from pinecone import Pinecone

pc = Pinecone(api_key='key')
index = pc.Index(host='host', pool_threads=5)
index.upsert(...)
```

Debugging
This is probably only relevant for internal Pinecone employees or support agents, but the index client now accepts configuration to attach additional headers to each data plane request. This can help with tracing requests in logs.
```python
from pinecone import Pinecone

pc = Pinecone(api_key='xxx')
index = pc.Index(
    host='hosturl',
    additional_headers={ 'header-1': 'header-1-value' }
)

# Now do things
index.upsert(...)
```

The equivalent concept for `PineconeGRPC` is to pass `additional_metadata`. gRPC metadata fills a role similar to HTTP request headers and should not be confused with the metadata associated with vectors stored in your Pinecone indexes.
```python
from pinecone.grpc import PineconeGRPC, GRPCClientConfig

pc = PineconeGRPC(api_key='YOUR_API_KEY')
grpc_config = GRPCClientConfig(additional_metadata={'extra-header': 'value123'})
index = pc.Index(
    name='my-index',
    host='host',
    grpc_config=grpc_config
)

# do stuff
index.upsert(...)
```

Changelog
- README.md: Update install steps to escape brackets by @daverigby in #298
- Expose missing configurations for `grpc_config` and `pool_threads` by @jhamon in #296
- Integration tests for collections by @jhamon in #299
- Optional configs to pass `additional_headers` / `additional_metadata` to indexes by @jhamon in #297
New Contributors
- @daverigby made their first contribution in #298
Full Changelog: v3.0.1...v3.0.2
Release v3.0.1
This is a quick follow-up to the v3.0.0 release earlier this week. This release adds improved error messages to help guide people in addressing some of the breaking changes in v3, such as the migration of core functionality from attributes on the `pinecone` module into methods of the `Pinecone` class.
If you're updating from v2.2.x for the first time, you will still want to check out the v3.0.0 Migration Guide for a walkthrough of all the new features and changes. All of that information is still accurate for this release.
Release v3.0.0
- Existing users will want to check out the v3.0.0 Migration Guide for a walkthrough of all the new features and changes.
- New users should start with the README and Reference Docs
Serverless indexes are currently in public preview, so make sure to review the current limitations and test thoroughly before using in production.
Changes overview
- Deploy Pinecone's new serverless indexes. The `create_index` method has been refactored to accept a `PodSpec` or `ServerlessSpec` depending on how you would like to deploy your index. Many old properties such as `pod_type`, `replicas`, etc. have moved into `PodSpec` since they do not apply to serverless indexes.
- Understand cost. The quantity of read units consumed by each serverless `query` and `fetch` call is now returned with the response.
- Flexible API keys. The v3.0.0 Python SDK consumes the new Control Plane API hosted at `https://api.pinecone.io/`. This new API allows much more flexibility in how API keys are used, in contrast to the past, when a rigid 1:1 relationship was enforced between projects and environments.
- State encapsulation with classes. We've refactored away from global state variables set with `pinecone.init` into new `Pinecone` class instances that encapsulate their configuration state. This change enables users to interact with Pinecone using multiple API keys if they wish.
- Streamlined dependencies, smoother installs.
  - Removed many dependencies: `numpy`, `pyyaml`, `loguru`, `requests`, `dnspython`
  - Expanded `urllib3` support back to `1.26.x`
  - Everything gRPC-related has moved into a subpackage, `pinecone.grpc`, so that gRPC code is only imported when needed. For applications using REST, this means quicker startup and fewer dependency clashes with other packages.
- Richer responses. The `list_indexes` and `list_collections` methods now return an array with full descriptions of each resource, not merely an array of names.
- Migration to the Apache 2 open source license. We've moved from a proprietary EULA to a more welcoming Apache 2 license to make it easier than ever for people to incorporate the Pinecone Python SDK into their projects.
- Bug fixes:
  - Removed code that was erroneously parsing some metadata into `DateTime` objects.
  - Refactored `urllib3` usage to stop spamming deprecation warning messages.
  - Suppressed a `tqdm` warning that was appearing during notebook runs.
- Tidying up / breaking changes:
  - `list_indexes` now returns additional data; to continue iterating over an array of names, chain a call to the new helper method `.names()`. See here.
  - `list_collections` has changed very similarly to `list_indexes`. Use `.names()`. See here.
  - `describe_index` takes the same argument as before (the index name) but returns data in a different shape, reflecting the move of some configurations under the `spec` key and the elevation of `host` to the top level. See a table of changed properties here.
  - The order of positional arguments to the `query` method has been updated to reflect that `top_k` is a required parameter. If you previously relied on passing your query vector as the first positional argument, you'll see a strange error from the API about duplicate `top_k` values being passed. We recommend adopting keyword arguments to fix this and be resilient to any future changes, e.g. `index.query(vector=vec, top_k=10)`.
  - `query()` no longer accepts multiple queries via the `queries` keyword argument.
- Debugging tools. See what data is coming and going with a new environment variable, `PINECONE_DEBUG_CURL='true'`.
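A minimal sketch of enabling this from Python before the client is constructed (setting the variable in your shell with `export` works equally well; the SDK reads it when issuing requests):

```python
import os

# Set the debug flag before creating the Pinecone client so that
# subsequent requests are logged in a curl-like format.
os.environ['PINECONE_DEBUG_CURL'] = 'true'
```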
New Contributors
- @zackproser migrated the repository onto poetry in #193
- @austin-denoble made numerous documentation and CI contributions, beginning with #208
- @loisaidasam spotted some typos in #254
Full Changelog: v2.2.4...v3.0.0.dev10
Release v2.2.4
What's Changed
- Bump protobuf dependency to 3.20.x by @jhamon in #185
- CI setup for nightly python builds by @jhamon in #179
- Docs improvements
- by @byronnlandry in #187
- by @byronnlandry in #188
- by @efung in #191
- Fixing annoying urllib3 deprecation error
- Give feedback when `environment` kwarg is misspelled by @tdonia in #198
New Contributors
- @efung made their first contribution in #191
- @izeye made their first contribution in #195
- @tdonia made their first contribution in #197
Full Changelog: v2.2.2...v2.2.4
Release v2.2.2
Changelog
Security Fixes
- `numpy` dependency from unpinned to `>=1.22.0` to address low-severity CVE-2021-34141
- `protobuf` dependency from `3.19.3` to `~=3.19.5` to address a potential denial-of-service vector. This should only affect those consuming the grpc-flavored version of the client via `pinecone-client[grpc]`.
Numpy features deprecated
We plan to remove our dependency on numpy in a future release to simplify the install experience. Deprecation warnings have been added to code paths where numpy is currently in use. Let us know if you have concerns about this.
End of Python 3.7 Support
We have also removed support for Python 3.7 which has reached the official end-of-life. The last version of the pinecone-client to support Python 3.7 is v2.2.1. Our numpy dependency forced our hand in this decision to drop support because numpy 1.22.0 no longer supports Python 3.7.
Release v2.2.0
Change log:
- Support for vector `sparse_values`
- Added function `upsert_from_dataframe()`, which allows upserting a large dataset of vectors by providing a pandas dataframe
- Added option to pass vectors to `upsert()` as a list of dictionaries
- Implemented gRPC retry by directly configuring the low-level `grpcio` behavior, instead of wrapping with an interceptor
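As a sketch of the dictionary form described above: each vector dict carries an id and dense values, with sparse values expressed as parallel `indices`/`values` lists. Treat the exact field names here as an assumption and confirm against the reference docs:

```python
# Hypothetical dict-form vectors for upsert; 'sparse_values' holds
# parallel lists of term indices and their weights.
vectors = [
    {
        'id': 'vec1',
        'values': [0.1, 0.2, 0.3],
        'sparse_values': {'indices': [10, 45], 'values': [0.5, 0.5]},
        'metadata': {'genre': 'drama'},
    },
]

# index.upsert(vectors=vectors)  # would send these to a live index
print(vectors[0]['id'])  # 'vec1'
```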
Release 2.1.0
Change log:
- Fix "Connection Reset by peer" error after long idle periods
- Add typing and explicit names for arguments in all client operations
- Add docstrings to all client operations
- Support batch upsert by passing `batch_size` to the `upsert` method
- Improve gRPC query results parsing performance