Merged
52 commits
9a5ee17
Added the docbinder search command that allows to search through all …
ChristopheBeke Jun 15, 2025
3722d39
Changed all commands to dedicated folders and better structure. Also …
ChristopheBeke Jun 15, 2025
f03330e
added documentation about a gdrive and adding new provider
PaoloLeonard Jun 16, 2025
635bff7
added correct type for the example
PaoloLeonard Jun 16, 2025
fa88476
Correct the ordering
PaoloLeonard Jun 16, 2025
a9fc52c
Added traversal of child directories to get all the files
ChristopheBeke Jun 19, 2025
178ea9a
Delete a file that wasn't supposed to be there
ChristopheBeke Jun 19, 2025
7260899
Update tests
ChristopheBeke Jun 19, 2025
60d580a
Merge branch 'main' of https://github.com/SnappyLab/DocBinder-OSS int…
ChristopheBeke Jun 19, 2025
d107a8f
Fixed merge conflict in 'src/docbinder_oss/services/google_drive/goog…
ChristopheBeke Jun 19, 2025
9e5a51b
Add the path of the items as an attribute.
ChristopheBeke Jun 19, 2025
c753807
Update src/docbinder_oss/services/google_drive/google_drive_client.py
ChristopheBeke Jun 19, 2025
9dbf490
Update src/docbinder_oss/services/google_drive/__init__.py
ChristopheBeke Jun 19, 2025
98a6bc8
Update tests/services/test_search_export.py
ChristopheBeke Jun 19, 2025
7747680
Removed unused parameter
ChristopheBeke Jun 19, 2025
055182f
fix import
PaoloLeonard Jun 21, 2025
1f3c93f
Merge branch 'dev' of https://github.com/SnappyLab/DocBinder-OSS into…
PaoloLeonard Jun 21, 2025
ee7ef34
changing the order of the parameters
PaoloLeonard Jun 21, 2025
461bc22
Merge branch 'dev' of https://github.com/SnappyLab/DocBinder-OSS into…
PaoloLeonard Jun 21, 2025
59b00ff
improved help message and internal logic
PaoloLeonard Jun 21, 2025
3da72b5
typo in list
PaoloLeonard Jun 21, 2025
6e68c57
improved the internal logic
PaoloLeonard Jun 21, 2025
b4e2096
change to the provider
PaoloLeonard Jun 22, 2025
eebc5b5
improved the cli
PaoloLeonard Jun 22, 2025
ea4ebfc
refactoring list all
PaoloLeonard Jun 22, 2025
e187467
Merge pull request #32 from SnappyLab/bugfix/2-add-the-capability-to-…
ChristopheBeke Jun 23, 2025
4e7dee0
Merge pull request #20 from SnappyLab/doc/16-doc-how-to-add-a-new-pro…
ChristopheBeke Jun 23, 2025
a5d2e6b
Update with all new comments. Also made sure the search command now f…
ChristopheBeke Jun 23, 2025
63bd58c
initial change to gdrive
PaoloLeonard Jun 24, 2025
b960941
Merge branch '2-add-the-capability-to-search-for-documents-metadata' …
PaoloLeonard Jun 24, 2025
b2d8d87
refactor google drive
PaoloLeonard Jun 24, 2025
36fa20d
increased the page size
PaoloLeonard Jun 24, 2025
a475692
ruff linting
PaoloLeonard Jun 24, 2025
2c06ddb
Merge branch 'improve-gdrive-integration' of https://github.com/Snapp…
PaoloLeonard Jun 24, 2025
2c5718f
added nice writers and printing
PaoloLeonard Jun 24, 2025
ec57051
corrected tests
PaoloLeonard Jun 24, 2025
c93b6e8
Merge pull request #37 from SnappyLab/36-add-better-writer-support
ChristopheBeke Jun 27, 2025
8a958e1
Changed filter_files to private method and updated linting.
ChristopheBeke Jun 27, 2025
01f451e
Added black for formatting and ruff for linting, including pre-commits
ChristopheBeke Jun 27, 2025
97d749e
Added pre-commit
ChristopheBeke Jun 27, 2025
8c1fd29
Update contributing file
ChristopheBeke Jun 27, 2025
19e06a1
Merge pull request #35 from SnappyLab/improve-gdrive-integration
ChristopheBeke Jun 27, 2025
c31d4bb
Fixed the search cli --export-filename and improved the writer functi…
ChristopheBeke Jun 27, 2025
27a1b5d
Updated the writer functions to work and improve readability and alig…
ChristopheBeke Jun 27, 2025
c7747b2
Make sure to get all files, not only the shared ones
ChristopheBeke Jun 27, 2025
463ce84
Fix mkdocs
ChristopheBeke Jun 27, 2025
4827ce9
Update incorrect readme reference in mkdocs
ChristopheBeke Jun 27, 2025
da02887
update workflow of docbinder oss to not trigger on doc updates and ch…
ChristopheBeke Jun 27, 2025
ca62bc6
revert back to writer and improve tests
PaoloLeonard Jun 30, 2025
090ee29
remove logger
PaoloLeonard Jun 30, 2025
a803975
fix linting
PaoloLeonard Jun 30, 2025
d4d2388
Merge pull request #42 from SnappyLab/revert-back-to-writer-and-impro…
ChristopheBeke Jul 1, 2025
6 changes: 6 additions & 0 deletions .github/workflows/docbinder-oss.yml
Original file line number Diff line number Diff line change
@@ -5,10 +5,16 @@ on:
    branches:
      - main
      - dev
    paths-ignore:
      - "docs/**"
      - "mkdocs.yml"
  pull_request:
    branches:
      - main
      - dev
    paths-ignore:
      - "docs/**"
      - "mkdocs.yml"
jobs:
  test:
    runs-on: ubuntu-latest
4 changes: 4 additions & 0 deletions .gitignore
@@ -77,3 +77,7 @@ ENV/
# Credentials
gcp_credentials.json
*_token.json

# Test files
search_results.csv
search_results.json
24 changes: 24 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,24 @@
repos:
  - repo: https://github.com/astral-sh/uv-pre-commit
    rev: 0.7.16
    hooks:
      - id: uv-export
      - id: uv-lock
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-added-large-files
  - repo: https://github.com/astral-sh/ruff-pre-commit
    # Ruff version.
    rev: v0.12.1
    hooks:
      # Run the linter.
      - id: ruff-check
        types_or: [ python, pyi ]
        args: [ --select, I, --fix, --select=E501 ]
      # Run the formatter.
      - id: ruff-format
        types_or: [ python, pyi ]
24 changes: 23 additions & 1 deletion CONTRIBUTING.md
@@ -56,4 +56,26 @@ All dependencies are tracked in `pyproject.toml`. Use `uv` commands to keep it u
---

**Note:**
Always use `uv` commands to manage dependencies and environments to keep `pyproject.toml` in sync.

## Code Style and Linting

This project uses [Black](https://black.readthedocs.io/en/stable/) for code formatting and [Ruff](https://docs.astral.sh/ruff/) for linting. All code should be formatted and linted before committing.

- Run the following before committing code:

```zsh
uv run black .
uv run ruff check .
```

- To automatically format and lint code on every commit, install pre-commit hooks:

```zsh
uv pip install pre-commit
pre-commit install
```

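This runs the hooks defined in `.pre-commit-config.yaml` (Ruff linting and formatting, `uv` lock/export, and basic file checks) on staged files before each commit. Black is not among the configured hooks, so run it manually as shown above.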

Configuration for Black and Ruff is in `pyproject.toml`. This enforces consistent quotes, spacing, and other style rules for all contributors.
24 changes: 23 additions & 1 deletion docs/CONTRIBUTING.md
@@ -56,4 +56,26 @@ All dependencies are tracked in `pyproject.toml`. Use `uv` commands to keep it u
---

**Note:**
Always use `uv` commands to manage dependencies and environments to keep `pyproject.toml` in sync.

## Code Style and Linting

This project uses [Black](https://black.readthedocs.io/en/stable/) for code formatting and [Ruff](https://docs.astral.sh/ruff/) for linting. All code should be formatted and linted before committing.

- Run the following before committing code:

```zsh
uv run black .
uv run ruff check .
```

- To automatically format and lint code on every commit, install pre-commit hooks:

```zsh
uv pip install pre-commit
pre-commit install
```

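This runs the hooks defined in `.pre-commit-config.yaml` (Ruff linting and formatting, `uv` lock/export, and basic file checks) on staged files before each commit. Black is not among the configured hooks, so run it manually as shown above.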

Configuration for Black and Ruff is in `pyproject.toml`. This enforces consistent quotes, spacing, and other style rules for all contributors.
119 changes: 119 additions & 0 deletions docs/tool/providers/custom_provider.md
@@ -0,0 +1,119 @@
# How to Add a New Provider

This guide explains how to integrate a new storage provider (e.g., Dropbox, OneDrive) into DocBinder-OSS. The process involves creating configuration and client classes, registering the provider, and ensuring compatibility with the system’s models and interfaces.

---

## 1. Create a Service Configuration Class

Each provider must define a configuration class that inherits from [`ServiceConfig`](https://github.com/SnappyLab/DocBinder-OSS/blob/main/src/docbinder_oss/services/base_class.py):

```python
# filepath: src/docbinder_oss/services/my_provider/my_provider_service_config.py
from docbinder_oss.services.base_class import ServiceConfig

class MyProviderServiceConfig(ServiceConfig):
    type: str = "my_provider"
    name: str
    # Add any other provider-specific fields here
    api_key: str
```

- `type` must be unique and match the provider’s identifier.
- `name` is a user-defined label for this provider instance.

---

## 2. Implement the Storage Client

Create a client class that inherits from [`BaseStorageClient`](https://github.com/SnappyLab/DocBinder-OSS/blob/main/src/docbinder_oss/services/base_class.py) and implements all abstract methods:

```python
# filepath: src/docbinder_oss/services/my_provider/my_provider_client.py
from typing import Optional, List
from docbinder_oss.services.base_class import BaseStorageClient
from docbinder_oss.core.schemas import File, Permission
from .my_provider_service_config import MyProviderServiceConfig

class MyProviderClient(BaseStorageClient):
    def __init__(self, config: MyProviderServiceConfig):
        self.config = config
        # Initialize SDK/client here

    def test_connection(self) -> bool:
        # Implement connection test
        pass

    def list_files(self, folder_id: Optional[str] = None) -> List[File]:
        # Implement file listing
        pass

    def get_file_metadata(self, item_id: str) -> File:
        # Implement metadata retrieval
        pass

    def get_permissions(self, item_id: str) -> List[Permission]:
        # Implement permissions retrieval
        pass
```

- Use the shared models [`File`](https://github.com/SnappyLab/DocBinder-OSS/blob/main/src/docbinder_oss/core/schemas.py), [`Permission`](https://github.com/SnappyLab/DocBinder-OSS/blob/main/src/docbinder_oss/core/schemas.py), etc., for return types.
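
As a rough illustration of the shape such a client takes, the sketch below implements the four methods against an in-memory dictionary. `File`, `Permission`, and `BaseStorageClient` are simplified stand-ins defined inline so the example runs on its own; the real classes in `base_class.py` and `schemas.py` likely carry more fields and may differ. `InMemoryClient` is a hypothetical name.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List, Optional


# Stand-ins for the shared models (assumption: the real models have more fields).
@dataclass
class File:
    id: str
    name: str
    parent_id: Optional[str] = None


@dataclass
class Permission:
    id: str
    role: str


class BaseStorageClient(ABC):
    """Stand-in for the abstract base; mirrors the four methods listed above."""

    @abstractmethod
    def test_connection(self) -> bool: ...

    @abstractmethod
    def list_files(self, folder_id: Optional[str] = None) -> List[File]: ...

    @abstractmethod
    def get_file_metadata(self, item_id: str) -> File: ...

    @abstractmethod
    def get_permissions(self, item_id: str) -> List[Permission]: ...


class InMemoryClient(BaseStorageClient):
    """Hypothetical provider that serves files from a dict instead of a remote API."""

    def __init__(self, files: Dict[str, File], permissions: Dict[str, List[Permission]]):
        self._files = files
        self._permissions = permissions

    def test_connection(self) -> bool:
        # Nothing to connect to; a real client would make a cheap API call here.
        return True

    def list_files(self, folder_id: Optional[str] = None) -> List[File]:
        # Without a folder_id return everything; otherwise only direct children.
        if folder_id is None:
            return list(self._files.values())
        return [f for f in self._files.values() if f.parent_id == folder_id]

    def get_file_metadata(self, item_id: str) -> File:
        return self._files[item_id]

    def get_permissions(self, item_id: str) -> List[Permission]:
        return self._permissions.get(item_id, [])


client = InMemoryClient(
    files={"1": File("1", "report.pdf", parent_id="root"), "2": File("2", "notes.txt")},
    permissions={"1": [Permission("p1", "owner")]},
)
```

A fake like this can also double as a test fixture for code that only depends on the client interface.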

---

## 3. Register the Provider

Add an `__init__.py` in your provider’s folder with a `register()` function:

```python
# filepath: src/docbinder_oss/services/my_provider/__init__.py
from .my_provider_client import MyProviderClient
from .my_provider_service_config import MyProviderServiceConfig

def register():
    return {
        "display_name": "my_provider",
        "config_class": MyProviderServiceConfig,
        "client_class": MyProviderClient,
    }
```

---

## 4. Ensure Discovery

The system will automatically discover your provider if it’s in the `src/docbinder_oss/services/` directory and contains a `register()` function in `__init__.py`.
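
The discovery code itself is not shown in this guide, so the following is only a sketch of how a `register()`-based registry could be assembled. `collect_providers` is a hypothetical helper, and the fake modules stand in for the real provider packages that would normally be imported from `src/docbinder_oss/services/`.

```python
import types
from typing import Dict, Iterable


def collect_providers(modules: Iterable[types.ModuleType]) -> Dict[str, dict]:
    """Build a registry from every module that exposes a callable `register`."""
    registry: Dict[str, dict] = {}
    for module in modules:
        register = getattr(module, "register", None)
        if not callable(register):
            continue  # Not a provider package; skip it.
        info = register()
        registry[info["display_name"]] = info
    return registry


# Fake modules so the sketch runs without the real package. In a real layout
# the modules would typically come from importlib.import_module() applied to
# the names yielded by pkgutil.iter_modules() for the services package.
good = types.ModuleType("my_provider")
good.register = lambda: {
    "display_name": "my_provider",
    "config_class": dict,    # placeholder for MyProviderServiceConfig
    "client_class": object,  # placeholder for MyProviderClient
}
helper = types.ModuleType("utils")  # Has no register(), so it is ignored.

registry = collect_providers([good, helper])
```

The point of the pattern is that adding a provider requires no edits to a central list: a package becomes visible simply by exposing `register()`.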

---

## 5. Update the Config File

Add your provider’s configuration to `~/.config/docbinder/config.yaml`:

```yaml
providers:
  - type: my_provider
    name: my_instance
    # Add other required fields
    api_key: <your-api-key>
```

---

## 6. Test Your Provider

- Run the application and ensure your provider appears and works as expected.
- The config loader will validate your config using your `ServiceConfig` subclass.
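
To make the validation step more concrete, here is a minimal sketch of dispatching each `providers` entry to a config class by its `type` field. It uses a plain dataclass as a stand-in for a `ServiceConfig` subclass, and `CONFIG_CLASSES` and `validate_providers` are hypothetical names; the actual loader may validate differently.

```python
from dataclasses import MISSING, dataclass, fields
from typing import Any, Dict, List, Type


@dataclass
class MyProviderConfig:
    """Stand-in for a ServiceConfig subclass."""

    name: str
    api_key: str
    type: str = "my_provider"


# Hypothetical registry keyed by the provider's `type` string.
CONFIG_CLASSES: Dict[str, Type] = {"my_provider": MyProviderConfig}


def validate_providers(raw: List[Dict[str, Any]]) -> list:
    """Turn raw config entries into config objects, failing on bad input."""
    validated = []
    for entry in raw:
        provider_type = entry.get("type")
        config_cls = CONFIG_CLASSES.get(provider_type)
        if config_cls is None:
            raise ValueError(f"Unknown provider type: {provider_type!r}")
        # Fields without a default must be present in the entry.
        required = {
            f.name
            for f in fields(config_cls)
            if f.default is MISSING and f.default_factory is MISSING
        }
        missing = required - set(entry)
        if missing:
            raise ValueError(f"Provider {entry.get('name')!r} is missing fields: {sorted(missing)}")
        validated.append(config_cls(**entry))
    return validated


configs = validate_providers(
    [{"type": "my_provider", "name": "my_instance", "api_key": "secret"}]
)
```

An entry with an unknown `type` or a missing required field raises `ValueError`, which is the kind of early failure to look for when testing a new provider.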

---

## Reference

- [src/docbinder_oss/services/base_class.py](https://github.com/SnappyLab/DocBinder-OSS/blob/main/src/docbinder_oss/services/base_class.py)
- [src/docbinder_oss/core/schemas.py](https://github.com/SnappyLab/DocBinder-OSS/blob/main/src/docbinder_oss/core/schemas.py)
- [src/docbinder_oss/services/google_drive/](https://github.com/SnappyLab/DocBinder-OSS/tree/main/src/docbinder_oss/services/google_drive/) (example implementation)
- [src/docbinder_oss/services/__init__.py](https://github.com/SnappyLab/DocBinder-OSS/blob/main/src/docbinder_oss/services/__init__.py)

---

**Tip:** Use the Google Drive provider as a template for your implementation. Make sure to follow the abstract method signatures and use the shared models for compatibility.
68 changes: 68 additions & 0 deletions docs/tool/providers/google_drive.md
@@ -0,0 +1,68 @@
# Google Drive Configuration Setup

This guide will help you configure Google Drive as a provider for DocBinder.

## Prerequisites

- A Google account
- Access to [Google Cloud Console](https://console.cloud.google.com/)
- DocBinder installed

## Step 1: Create a Google Cloud Project

1. Go to the [Google Cloud Console](https://console.cloud.google.com/).
2. Click on **Select a project** and then **New Project**.
3. Enter a project name and click **Create**.

## Step 2: Enable Google Drive API

1. In your project dashboard, navigate to **APIs & Services > Library**.
2. Search for **Google Drive API**.
3. Click **Enable**.

## Step 3: Create OAuth 2.0 Credentials

1. Go to **APIs & Services > Credentials**.
2. Click **+ CREATE CREDENTIALS** and select **OAuth client ID**.
3. Configure the consent screen if prompted.
4. Choose **Desktop app** or **Web application** as the application type.
5. Enter a name and click **Create**.
6. Download the `credentials.json` file.

## Step 4: Configure DocBinder

1. Place your downloaded credentials file somewhere accessible (e.g., `~/gcp_credentials.json`).
2. The application will generate a token file (e.g., `~/gcp_token.json`) after the first authentication.

## Step 5: Edit the Config File

Create the config file (`~/.config/docbinder/config.yaml`) if it does not exist yet, and add a provider entry for Google Drive:
```yaml
providers:
  - type: google_drive
    name: my_gdrive
    gcp_credentials_json: ./gcp_credentials.json
    gcp_token_json: ./gcp_token.json
```

* `type`: Must be `google_drive`.
* `name`: A unique name for this provider.
* `gcp_credentials_json`: Absolute or relative path to your Google Cloud credentials file.
* `gcp_token_json`: Absolute or relative path where the token will be generated and stored.
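
Because both path fields accept absolute or relative values, it can help to check where a given value actually points before running DocBinder. The sketch below is an assumption about how such a path might be resolved (expanding `~` and resolving relative paths against a base directory); the guide does not state which directory DocBinder treats as the base, and `resolve_credentials_path` is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def resolve_credentials_path(raw_path: str, base_dir: Path) -> Path:
    """Expand `~`, resolve relative paths against base_dir, and check the file exists."""
    path = Path(raw_path).expanduser()
    if not path.is_absolute():
        path = base_dir / path
    path = path.resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Credentials file not found: {path}")
    return path


# Demonstrate with a temporary directory standing in for the working directory.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "gcp_credentials.json").write_text("{}")
    resolved = resolve_credentials_path("./gcp_credentials.json", base)
    found_name = resolved.name
    is_absolute = resolved.is_absolute()
```

Using absolute paths in the config avoids any dependence on the directory the command is started from.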

## Step 6: Authenticate and Test

1. Run DocBinder with the Google Drive provider enabled.
2. On first run, follow the authentication prompt to grant access.
3. Verify that DocBinder can access your Google Drive files.

## Troubleshooting

- Ensure your credentials file is in the correct location.
- Check that the Google Drive API is enabled for your project.
- Review the [Google API Console](https://console.developers.google.com/) for error messages.

## References

- [Google Drive API Documentation](https://developers.google.com/drive)
- [DocBinder OSS - GitHub](https://github.com/SnappyLab/DocBinder-OSS)
3 changes: 3 additions & 0 deletions mkdocs.yml
@@ -10,6 +10,9 @@ nav:
  - Commands:
      - Main CLI: commands/main.md
      - Provider: commands/provider.md
  - Providers:
      - Google Drive: tool/providers/google_drive.md
      - Custom Provider: tool/providers/custom_provider.md
  - Contributing: CONTRIBUTING.md
  - Code of Conduct: CODE_OF_CONDUCT.md
  - Security: SECURITY.md
4 changes: 4 additions & 0 deletions provider_setup_example.yml
@@ -0,0 +1,4 @@
providers:
  - type: google_drive
    name: my_google_drive
    gcp_credentials_json: gcp_credentials.json
5 changes: 3 additions & 2 deletions pyproject.toml
@@ -32,8 +32,10 @@ include = ["src/docbinder_oss/**"]

 [dependency-groups]
 dev = [
+    "black>=25.1.0",
     "mkdocs>=1.6.1",
     "mkdocs-material>=9.6.14",
+    "pre-commit>=4.2.0",
     "pytest>=8.4.0",
     "tox>=4.26.0",
     "tox-uv>=1.26.0",
@@ -47,8 +49,7 @@ testpaths = [
 ]

 [tool.ruff]
-# Set the maximum line length to 100.
-line-length = 100
+line-length = 125

 [tool.ruff.lint]
 # Add the `line-too-long` rule to the enforced rule set. By default, Ruff omits rules that
12 changes: 12 additions & 0 deletions src/docbinder_oss/cli/provider/__init__.py
@@ -0,0 +1,12 @@
import typer
from .get import app as get_app
from .list import app as list_app
from .test import app as test_app

# --- Provider Subcommand Group ---
# We create a separate Typer app for the 'provider' command.
# This allows us to nest commands like 'provider list' and 'provider get'.
app = typer.Typer(help="Commands to manage providers. List them or get details for a specific one.")
app.add_typer(get_app)
app.add_typer(list_app)
app.add_typer(test_app)
30 changes: 30 additions & 0 deletions src/docbinder_oss/cli/provider/get.py
@@ -0,0 +1,30 @@
import typer

app = typer.Typer()


@app.command("get")
def get_provider(
    connection_type: str = typer.Option(None, "--type", "-t", help="The type of the provider to get."),
    name: str = typer.Option(None, "--name", "-n", help="The name of the provider to get."),
):
    """Get connection information for a provider by name or by type.
    If both options are provided, it will search for providers matching either criterion."""
    from docbinder_oss.helpers.config import load_config

    config = load_config()

    provider_found = False
    if not config.providers:
        typer.echo("No providers configured.")
        raise typer.Exit(code=1)
    for provider in config.providers:
        if provider.name == name:
            typer.echo(f"Provider '{name}' found with config: {provider}")
            provider_found = True
        if provider.type == connection_type:
            typer.echo(f"Provider '{provider.name}' of type '{connection_type}'" f" found with config: {provider}")
            provider_found = True
    if not provider_found:
        typer.echo(f"No providers found with name '{name}' or type '{connection_type}'.")
        raise typer.Exit(code=1)
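
Review note: the "matching either criterion" rule from the `get_provider` docstring can be isolated as a pure function, which makes it easy to test without a config file. This is only a sketch: `Provider` is a stand-in for the configured provider objects and `match_providers` is a hypothetical helper, not part of this PR. Unlike the command above, which prints a provider twice when it matches both the name and the type, the sketch returns each matching provider once.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Provider:
    """Stand-in for a configured provider entry."""

    name: str
    type: str


def match_providers(
    providers: List[Provider],
    name: Optional[str] = None,
    connection_type: Optional[str] = None,
) -> List[Provider]:
    """Return providers whose name OR type matches; unset criteria never match."""
    return [
        p
        for p in providers
        if (name is not None and p.name == name)
        or (connection_type is not None and p.type == connection_type)
    ]


providers = [Provider("a", "google_drive"), Provider("b", "dropbox")]
by_name = match_providers(providers, name="a")
by_either = match_providers(providers, name="b", connection_type="google_drive")
```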
17 changes: 17 additions & 0 deletions src/docbinder_oss/cli/provider/list.py
@@ -0,0 +1,17 @@
import typer

app = typer.Typer()


@app.command("list")
def list_providers():
    """List all configured providers."""
    from docbinder_oss.helpers.config import load_config

    config = load_config()
    if not config.providers:
        typer.echo("No providers configured.")
        raise typer.Exit(code=1)

    for provider in config.providers:
        typer.echo(f"Provider: {provider.name}, type: {provider.type}")