36 add better writer support #37
Merged: ChristopheBeke merged 10 commits into improve-gdrive-integration from 36-add-better-writer-support on Jun 27, 2025
Commits:
- f03330e: added documentation about a gdrive and adding new provder (PaoloLeonard)
- 635bff7: added correct type for the example (PaoloLeonard)
- fa88476: Correct the ordering (PaoloLeonard)
- 055182f: fix import (PaoloLeonard)
- 1f3c93f: Merge branch 'dev' of https://github.com/SnappyLab/DocBinder-OSS into… (PaoloLeonard)
- ee7ef34: changing the order of the parameters (PaoloLeonard)
- 4e7dee0: Merge pull request #20 from SnappyLab/doc/16-doc-how-to-add-a-new-pro… (ChristopheBeke)
- 2c06ddb: Merge branch 'improve-gdrive-integration' of https://github.com/Snapp… (PaoloLeonard)
- 2c5718f: added nice writers and printing (PaoloLeonard)
- ec57051: corrected testsé (PaoloLeonard)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,119 @@ | ||
| # How to Add a New Provider | ||
|
|
||
| This guide explains how to integrate a new storage provider (e.g., DropBox, OneDrive) into DocBinder-OSS. The process involves creating configuration and client classes, registering the provider, and ensuring compatibility with the system’s models and interfaces. | ||
|
|
||
| --- | ||
|
|
||
| ## 1. Create a Service Configuration Class | ||
|
|
||
| Each provider must define a configuration class that inherits from [`ServiceConfig`](src/docbinder_oss/services/base_class.py): | ||
|
|
||
| ```python | ||
| # filepath: src/docbinder_oss/services/my_provider/my_provider_service_config.py | ||
| from docbinder_oss.services.base_class import ServiceConfig | ||
|
|
||
| class MyProviderServiceConfig(ServiceConfig): | ||
|     type: str = "my_provider" | ||
|     name: str | ||
|     # Add any other provider-specific fields here | ||
|     api_key: str | ||
| ``` | ||
|
|
||
| - `type` must be unique and match the provider’s identifier. | ||
| - `name` is a user-defined label for this provider instance. | ||
|
|
||
| --- | ||
|
|
||
| ## 2. Implement the Storage Client | ||
|
|
||
| Create a client class that inherits from [`BaseStorageClient`](src/docbinder_oss/services/base_class.py) and implements all abstract methods: | ||
|
|
||
| ```python | ||
| # filepath: src/docbinder_oss/services/my_provider/my_provider_client.py | ||
| from typing import Optional, List | ||
| from docbinder_oss.services.base_class import BaseStorageClient | ||
| from docbinder_oss.core.schemas import File, Permission | ||
| from .my_provider_service_config import MyProviderServiceConfig | ||
|
|
||
| class MyProviderClient(BaseStorageClient): | ||
|     def __init__(self, config: MyProviderServiceConfig): | ||
|         self.config = config | ||
|         # Initialize SDK/client here | ||
|
|
||
|     def test_connection(self) -> bool: | ||
|         # Implement connection test | ||
|         pass | ||
|
|
||
|     def list_files(self, folder_id: Optional[str] = None) -> List[File]: | ||
|         # Implement file listing | ||
|         pass | ||
|
|
||
|     def get_file_metadata(self, item_id: str) -> File: | ||
|         # Implement metadata retrieval | ||
|         pass | ||
|
|
||
|     def get_permissions(self, item_id: str) -> List[Permission]: | ||
|         # Implement permissions retrieval | ||
|         pass | ||
| ``` | ||
|
|
||
| - Use the shared models [`File`](src/docbinder_oss/core/schemas.py), [`Permission`](src/docbinder_oss/core/schemas.py), etc., for return types. | ||
|
|
||
| --- | ||
|
|
||
| ## 3. Register the Provider | ||
|
|
||
| Add an `__init__.py` in your provider’s folder with a `register()` function: | ||
|
|
||
| ```python | ||
| # filepath: src/docbinder_oss/services/my_provider/__init__.py | ||
| from .my_provider_client import MyProviderClient | ||
| from .my_provider_service_config import MyProviderServiceConfig | ||
|
|
||
| def register(): | ||
|     return { | ||
|         "display_name": "my_provider", | ||
|         "config_class": MyProviderServiceConfig, | ||
|         "client_class": MyProviderClient, | ||
|     } | ||
| ``` | ||
|
|
||
| --- | ||
|
|
||
| ## 4. Ensure Discovery | ||
|
|
||
| The system will automatically discover your provider if it’s in the `src/docbinder_oss/services/` directory and contains a `register()` function in `__init__.py`. | ||
|
|
||
| --- | ||
|
|
||
| ## 5. Update the Config File | ||
|
|
||
| Add your provider’s configuration to `~/.config/docbinder/config.yaml`: | ||
|
|
||
| ```yaml | ||
| providers: | ||
|   - type: my_provider | ||
|     name: my_instance | ||
|     # Add other required fields | ||
|     api_key: <your-api-key> | ||
| ``` | ||
|
|
||
| --- | ||
|
|
||
| ## 6. Test Your Provider | ||
|
|
||
| - Run the application and ensure your provider appears and works as expected. | ||
| - The config loader will validate your config using your `ServiceConfig` subclass. | ||
|
|
||
| --- | ||
|
|
||
| ## Reference | ||
|
|
||
| - [src/docbinder_oss/services/base_class.py](src/docbinder_oss/services/base_class.py) | ||
| - [src/docbinder_oss/core/schemas.py](src/docbinder_oss/core/schemas.py) | ||
| - [src/docbinder_oss/services/google_drive/](src/docbinder_oss/services/google_drive/) (example implementation) | ||
| - [src/docbinder_oss/services/__init__.py](src/docbinder_oss/services/__init__.py) | ||
|
|
||
| --- | ||
|
|
||
| **Tip:** Use the Google Drive provider as a template for your implementation. Follow the abstract method signatures exactly and use the shared models for compatibility. | ||
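Steps 3 to 5 of the guide above can be pictured with a small, self-contained sketch of how a `register()` dictionary might be matched against entries from `config.yaml`. This is an illustration only, not DocBinder's actual loader: the dataclass stands in for the real `ServiceConfig` subclass, and `build_registry`, `create_clients`, and the choice to key the registry by `display_name` (which equals `type` in the guide's example) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class MyProviderServiceConfig:
    # Stand-in for the ServiceConfig subclass from step 1 (plain dataclass, hypothetical).
    name: str
    api_key: str
    type: str = "my_provider"


class MyProviderClient:
    # Stand-in for the BaseStorageClient subclass from step 2.
    def __init__(self, config: MyProviderServiceConfig):
        self.config = config

    def test_connection(self) -> bool:
        # Placeholder check: a real client would call the provider's API.
        return bool(self.config.api_key)


def register() -> Dict[str, Any]:
    # Same shape as the register() function shown in step 3.
    return {
        "display_name": "my_provider",
        "config_class": MyProviderServiceConfig,
        "client_class": MyProviderClient,
    }


def build_registry(register_functions: List[Callable[[], Dict[str, Any]]]) -> Dict[str, Dict[str, Any]]:
    # Hypothetical helper: index each provider entry by its display name.
    registry: Dict[str, Dict[str, Any]] = {}
    for register_function in register_functions:
        entry = register_function()
        registry[entry["display_name"]] = entry
    return registry


def create_clients(raw_config: Dict[str, Any], registry: Dict[str, Dict[str, Any]]) -> list:
    # Hypothetical helper: build one client per entry under "providers".
    clients = []
    for provider in raw_config["providers"]:
        entry = registry.get(provider["type"])
        if entry is None:
            raise ValueError(f"Unknown provider type: {provider['type']}")
        config = entry["config_class"](**provider)
        clients.append(entry["client_class"](config))
    return clients


# Parsed equivalent of the YAML snippet in step 5.
raw_config = {
    "providers": [
        {"type": "my_provider", "name": "my_instance", "api_key": "secret"},
    ]
}
clients = create_clients(raw_config, build_registry([register]))
```

In this sketch, a `type` value with no matching registered provider raises a `ValueError`, which is one way a mismatch between step 1 and step 5 could surface early.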
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,68 @@ | ||
| # Google Drive Configuration Setup | ||
|
|
||
| This guide will help you configure Google Drive as a provider for DocBinder. | ||
|
|
||
| ## Prerequisites | ||
|
|
||
| - A Google account | ||
| - Access to [Google Cloud Console](https://console.cloud.google.com/) | ||
| - DocBinder installed | ||
|
|
||
| ## Step 1: Create a Google Cloud Project | ||
|
|
||
| 1. Go to the [Google Cloud Console](https://console.cloud.google.com/). | ||
| 2. Click on **Select a project** and then **New Project**. | ||
| 3. Enter a project name and click **Create**. | ||
|
|
||
| ## Step 2: Enable Google Drive API | ||
|
|
||
| 1. In your project dashboard, navigate to **APIs & Services > Library**. | ||
| 2. Search for **Google Drive API**. | ||
| 3. Click **Enable**. | ||
|
|
||
| ## Step 3: Create OAuth 2.0 Credentials | ||
|
|
||
| 1. Go to **APIs & Services > Credentials**. | ||
| 2. Click **+ CREATE CREDENTIALS** and select **OAuth client ID**. | ||
| 3. Configure the consent screen if prompted. | ||
| 4. Choose **Desktop app** or **Web application** as the application type. | ||
| 5. Enter a name and click **Create**. | ||
| 6. Download the `credentials.json` file. | ||
|
|
||
| ## Step 4: Configure DocBinder | ||
|
|
||
| 1. Place your downloaded credentials file somewhere accessible (e.g., `~/gcp_credentials.json`). | ||
| 2. The application will generate a token file (e.g., `~/gcp_token.json`) after the first authentication. | ||
|
|
||
| ## Step 5: Edit the Config File | ||
|
|
||
| Create the config file (`~/.config/docbinder/config.yaml`) if it does not exist yet, and add a provider entry for Google Drive: | ||
| ```yaml | ||
| providers: | ||
|   - type: google_drive | ||
|     name: my_gdrive | ||
|     gcp_credentials_json: ./gcp_credentials.json | ||
|     gcp_token_json: ./gcp_token.json | ||
| ``` | ||
|
|
||
| * `type`: Must be `google_drive`. | ||
| * `name`: A unique name for this provider. | ||
| * `gcp_credentials_json`: Absolute or relative path to your Google Cloud credentials file. | ||
| * `gcp_token_json`: Absolute or relative path where the token will be generated and stored. | ||
|
|
||
| ## Step 6: Authenticate and Test | ||
|
|
||
| 1. Run DocBinder with the Google Drive provider enabled. | ||
| 2. On first run, follow the authentication prompt to grant access. | ||
| 3. Verify that DocBinder can access your Google Drive files. | ||
|
|
||
| ## Troubleshooting | ||
|
|
||
| - Ensure your credentials file is in the correct location. | ||
| - Check that the Google Drive API is enabled for your project. | ||
| - Review the [Google API Console](https://console.developers.google.com/) for error messages. | ||
|
|
||
| ## References | ||
|
|
||
| - [Google Drive API Documentation](https://developers.google.com/drive) | ||
| - [DocBinder Documentation](../README.md) |
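The first troubleshooting item (credentials file in the wrong location) can be checked before launching DocBinder. The sketch below is a hypothetical helper, not part of DocBinder: `check_gdrive_entry` and its `base_dir` parameter are invented for illustration, and it assumes relative paths are resolved against a directory you pass in, since the guide does not say which directory DocBinder itself uses.

```python
from pathlib import Path
from typing import Dict, List


def check_gdrive_entry(entry: Dict[str, str], base_dir: Path) -> List[str]:
    """Return a list of human-readable problems found in one provider entry."""
    problems: List[str] = []

    if entry.get("type") != "google_drive":
        problems.append("type must be google_drive")
    if not entry.get("name"):
        problems.append("name is missing")

    credentials = entry.get("gcp_credentials_json")
    if not credentials:
        problems.append("gcp_credentials_json is missing")
    else:
        credentials_path = Path(credentials).expanduser()
        if not credentials_path.is_absolute():
            # Assumption: relative paths are interpreted relative to base_dir.
            credentials_path = base_dir / credentials_path
        if not credentials_path.is_file():
            problems.append(f"credentials file not found: {credentials_path}")

    token = entry.get("gcp_token_json")
    if not token:
        problems.append("gcp_token_json is missing")
    else:
        token_path = Path(token).expanduser()
        if not token_path.is_absolute():
            token_path = base_dir / token_path
        # The token file is created on first authentication, so only its folder must exist.
        if not token_path.parent.is_dir():
            problems.append(f"token directory not found: {token_path.parent}")

    return problems
```

Calling `check_gdrive_entry(entry, Path.cwd())` with the parsed provider entry from Step 5 returns an empty list when the paths look usable, and a list of messages otherwise.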
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,92 @@ | ||
| import csv | ||
| import json | ||
| from abc import ABC, abstractmethod | ||
| from pathlib import Path | ||
| from typing import Any, Dict, List, Union | ||
| from pydantic import BaseModel | ||
| from rich import print | ||
|
|
||
| import logging | ||
|
|
||
|
|
||
| logger = logging.getLogger(__name__) | ||
|
|
||
|
|
||
| class Writer(ABC): | ||
|     """Abstract base writer class.""" | ||
|
|
||
|     @abstractmethod | ||
|     def write(self, data: Any, file_path: Union[None, str, Path]) -> None: | ||
|         """Write data to file.""" | ||
|         pass | ||
|
|
||
|
|
||
| class MultiFormatWriter: | ||
|     """Factory writer that automatically detects format from file extension.""" | ||
|
|
||
|     _writers = { | ||
|         '.csv': 'CSVWriter', | ||
|         '.json': 'JSONWriter', | ||
|     } | ||
|
|
||
|     @classmethod | ||
|     def write(cls, data: Any, file_path: Union[None, str, Path]) -> None: | ||
|         """Write data to file, format determined by extension.""" | ||
|         if file_path is None: | ||
|             # If no file path is provided, write to console | ||
|             ConsoleWriter().write(data) | ||
|             return | ||
|         path = Path(file_path) | ||
|         extension = path.suffix.lower() | ||
|
|
||
|         if extension not in cls._writers: | ||
|             raise ValueError(f"Unsupported format: {extension}") | ||
|
|
||
|         writer_class = globals()[cls._writers[extension]] | ||
|         writer = writer_class() | ||
|         writer.write(data, file_path) | ||
|
|
||
|
|
||
| class CSVWriter(Writer): | ||
|     def get_fieldnames(self, data: Dict[str, List[BaseModel]]) -> List[str]: | ||
|         # Use the model's declared fields: model_fields_set is an unordered set that | ||
|         # only holds explicitly set fields, while model_dump() returns every field. | ||
|         first_item = next(iter(data.values()))[0] | ||
|         fieldnames = type(first_item).model_fields.keys() | ||
|         return ["provider", *fieldnames] | ||
|
|
||
|     def write(self, data: Dict[str, List[BaseModel]], file_path: Union[str, Path]) -> None: | ||
|         if not data: | ||
|             logger.warning("No data to write to CSV.") | ||
|             return | ||
|
|
||
|         with open(file_path, 'w', newline='', encoding='utf-8') as f: | ||
|             writer = csv.DictWriter(f, fieldnames=self.get_fieldnames(data)) | ||
|             writer.writeheader() | ||
|             for provider, items in data.items(): | ||
|                 for item in items: | ||
|                     item_dict = item.model_dump() if isinstance(item, BaseModel) else item | ||
|                     item_dict['provider'] = provider | ||
|                     writer.writerow(item_dict) | ||
|
|
||
|
|
||
| class JSONWriter(Writer): | ||
|     def write(self, data: Dict[str, List[BaseModel]], file_path: Union[str, Path]) -> None: | ||
|         data = { | ||
|             provider: [item.model_dump() for item in items] | ||
|             for provider, items in data.items() | ||
|         } | ||
|         with open(file_path, 'w', encoding='utf-8') as f: | ||
|             json.dump(data, f, indent=2, ensure_ascii=False, default=str) | ||
|
|
||
|
|
||
| class ConsoleWriter(Writer): | ||
|     def write(self, data: Dict, file_path: Union[None, str, Path] = None) -> None: | ||
|         # file_path is accepted to match Writer.write and is ignored here. | ||
|         from rich.table import Table | ||
|
|
||
|         table = Table(title="Files and Folders") | ||
|         table.add_column("Provider", justify="right", style="cyan", no_wrap=True) | ||
|         table.add_column("Id", style="magenta") | ||
|         table.add_column("Name", style="magenta") | ||
|         table.add_column("Kind", style="magenta") | ||
|         for provider, items in data.items(): | ||
|             for item in items: | ||
|                 table.add_row(provider, item.id, item.name, item.kind) | ||
|         print(table) | ||
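For readers who want to try the extension-based dispatch without pydantic or rich installed, here is a minimal standard-library sketch of the same idea. It is not the code from this PR: it works on plain dictionaries instead of `BaseModel` instances, maps suffixes directly to functions rather than looking class names up through `globals()`, and the names `write_csv`, `write_json`, `WRITERS`, and `write` are chosen for the example.

```python
import csv
import json
from pathlib import Path
from typing import Callable, Dict, List, Union

# Mapping of provider name to a list of item dictionaries.
Rows = Dict[str, List[dict]]


def write_json(data: Rows, path: Path) -> None:
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2, ensure_ascii=False, default=str)


def write_csv(data: Rows, path: Path) -> None:
    # Collect column names in first-seen order so the header is deterministic.
    fieldnames = ["provider"]
    for items in data.values():
        for item in items:
            for key in item:
                if key not in fieldnames:
                    fieldnames.append(key)

    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        for provider, items in data.items():
            for item in items:
                # Build a new dict so the caller's items are not mutated.
                writer.writerow({"provider": provider, **item})


# Suffix-to-function table: supporting a new format means adding one entry.
WRITERS: Dict[str, Callable[[Rows, Path], None]] = {
    ".csv": write_csv,
    ".json": write_json,
}


def write(data: Rows, file_path: Union[str, Path]) -> None:
    path = Path(file_path)
    extension = path.suffix.lower()
    if extension not in WRITERS:
        raise ValueError(f"Unsupported format: {extension}")
    WRITERS[extension](data, path)
```

A call such as `write({"my_gdrive": [{"id": "1", "name": "report.pdf", "kind": "file"}]}, "files.csv")` produces one CSV row per item with the provider name in the first column. Mapping suffixes to callables avoids the string lookup through `globals()`, at the cost of requiring the functions to be defined before the table.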