feat(drive): add +download helper for downloading Drive files#679
feat(drive): add +download helper for downloading Drive files#679nuthalapativarun wants to merge 8 commits intogoogleworkspace:mainfrom
Conversation
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
🦋 Changeset detectedLatest commit: fa1e32f The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request adds a new Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a +download helper for Google Drive, supporting both binary downloads and Workspace file exports with path validation. Key feedback includes the need to sanitize API error strings against terminal escape sequences, respect the global --dry-run flag, and utilize streaming for file writes to prevent OOM errors on large downloads.
|
bot: rescan |
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces the +download helper command for Google Drive, which orchestrates file metadata retrieval and content downloading for both binary and Google Workspace native files. The implementation includes streaming to disk to handle large files and path validation to prevent traversal attacks. Review feedback identifies several critical improvements: the error handling should parse API responses rather than using hardcoded reasons, the byte counter needs to be upgraded to u64 to avoid overflow on 32-bit architectures, and the final JSON output should accurately report the exported MIME type for native files.
Adds a multi-step +download command that determines how to fetch a file based on its MIME type: - Google Workspace native files (Docs/Sheets/Slides) → files.export with a caller-supplied --mime-type (e.g. application/pdf, text/csv) - Binary/other files → files.get?alt=media The output path is validated with validate_safe_file_path() to reject path traversal and control characters. The file ID is validated with validate_resource_name() before being embedded in URLs. Complements the existing +upload helper; justified as a multi-step orchestration helper per the helper guidelines in AGENTS.md.
- Sanitize API error body strings with sanitize_for_terminal() before embedding them in GwsError::Api to prevent terminal escape injection - Add --dry-run support: after metadata fetch and path resolution, print what would be downloaded and return without network/disk I/O - Stream response bytes to file via tokio::fs::File + bytes_stream() instead of resp.bytes().await to avoid OOM on large Drive files
315ec42 to
6b7b809
Compare
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a new drive +download helper command to the CLI, enabling users to download Google Drive files to a local path. The implementation handles both binary files and Google Workspace native files (via export), includes dry-run support, and implements streaming to prevent memory issues with large files. Feedback focuses on improving error handling by parsing API responses rather than hardcoding reasons, enhancing security by sanitizing Drive filenames to prevent path traversal, managing partial file cleanup on download failure, and correcting the reported MIME type in the final JSON output when an export occurs.
- Parse Google API error JSON properly (reason, message, enable_url) so that specialised error hints in error.rs (e.g. accessNotConfigured) still fire - Sanitize Drive filename: replace '/' chars to prevent unintended subdirectories - Use u64 for byte_count to avoid overflow on 32-bit platforms with >4GB files - Stream to a .tmp file and rename on success; delete partial file on any error - Report actual output MIME type in JSON result (export format for native files)
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a new +download helper command for the Google Drive service, enabling users to download files by ID with automatic handling for Google Workspace native file exports. The implementation includes metadata fetching, path traversal validation, and atomic file writing using temporary files. Feedback focuses on improving cross-platform path sanitization for Windows, ensuring that MIME type validation for native files occurs before dry-run reporting, and adding a cleanup step if the final file rename operation fails.
…tion, rename cleanup - Sanitize backslash in addition to slash in Drive filename to prevent unintended subdirectory creation on Windows - Validate --mime-type requirement for native Google Workspace files before the dry-run block so dry-run output is accurate and errors surface early - Clean up temp file if tokio::fs::rename fails to avoid orphaned .tmp files
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a new +download helper command for the Google Drive CLI, enabling multi-step downloads for both native Google Workspace files (via export) and binary files. The implementation includes metadata fetching, path sanitization, and atomic file writing using temporary files. Feedback focuses on several architectural and cross-platform issues: the dry-run mode incorrectly performs network requests and requires authentication, the use of tokio::fs::rename introduces platform-specific behavior on Windows regarding file overwrites, and the parse_google_api_error function duplicates existing error-handling logic found in the executor module.
… rename, dedup error parsing - Short-circuit dry-run before any auth or network I/O (consistent with +upload pattern) so unauthenticated users can verify syntax offline - Remove local parse_google_api_error; add pub executor::api_error_from_response so error parsing logic lives in one place and accessNotConfigured hints fire consistently from all callers - Remove existing destination file before rename for cross-platform consistency: tokio::fs::rename overwrites on Unix but fails if dest exists on Windows; TOCTOU limitation is documented as a known constraint
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a new drive +download helper command to the Google Workspace CLI, which automates the multi-step process of fetching file metadata and downloading or exporting Drive content. The implementation includes a new utility for parsing Google API errors and enhanced validation to prevent path traversal during file creation. Review feedback focuses on improving consistency and robustness across the CLI, specifically by standardizing JSON output keys to snake_case, aligning error fallback reasons with existing patterns, and expanding filename sanitization to handle Windows-specific reserved characters and dangerous Unicode format characters.
…response and +download - Use "httpError" (not "unknown") as fallback reason in api_error_from_response to match the existing handle_error_response fallback in executor.rs - Use snake_case keys in all JSON output (dry_run, file_id, export_mime_type, saved_file, mime_type) to match the project convention used in executor.rs - Expand filename sanitization to cover all Windows-reserved chars (:*?"<>|), control characters (Cc), and dangerous Unicode (Cf/bidi/zero-width) to prevent file creation failures and terminal injection on all platforms
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a new drive +download helper command to the CLI, enabling users to download Google Drive files with support for both binary and native Workspace formats. The implementation includes filename sanitization, dry-run capabilities, and improved API error parsing. A review comment identifies a potential data loss risk in the file finalization logic, suggesting that using tokio::fs::rename directly is more atomic and safer than the current manual removal and rename sequence.
…on all platforms tokio::fs::rename uses MOVEFILE_REPLACE_EXISTING on Windows (since Rust 1.26), so the destination file is always overwritten atomically. The remove_file + rename sequence added previously was wrong: it widens the failure window so that if rename fails the user loses the original file at the destination as well.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a new drive +download helper command, enabling users to download files from Google Drive to a local path. The implementation handles both binary files and Google Workspace native files (Docs, Sheets, Slides) by using the appropriate export or media download endpoints. It includes logic for filename sanitization, path traversal protection, and atomic file writing via temporary files. Review feedback highlights two security concerns: the need to sanitize API error messages for terminal escape sequences to prevent injection attacks, and the use of unpredictable temporary filenames to mitigate potential symlink attacks in shared directories.
…name
- Sanitize `message` and `reason` fields extracted from Google API JSON
error responses to prevent terminal escape sequence injection from
malicious or compromised API responses.
- Add a random 64-bit hex suffix to the download temp filename
(`.{name}.{rand}.tmp`) to prevent symlink attacks in world-writable
output directories.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a new drive +download helper command to the Google Workspace CLI, enabling multi-step file downloads from Google Drive with support for exporting native Workspace files (Docs, Sheets, Slides). The implementation includes new error parsing logic in executor.rs and comprehensive path sanitization to prevent traversal and terminal injection. Feedback focuses on architectural consistency: the new error handling should include authentication hints for 401/403 status codes, and the handle_download function should be refactored to avoid hardcoded URLs and scopes, ensuring it respects the Discovery Document and includes necessary quota headers.
| pub fn api_error_from_response(status: reqwest::StatusCode, body: &str) -> GwsError { | ||
| if let Ok(error_json) = serde_json::from_str::<Value>(body) { | ||
| if let Some(err_obj) = error_json.get("error") { | ||
| let code = err_obj | ||
| .get("code") | ||
| .and_then(|c| c.as_u64()) | ||
| .unwrap_or(status.as_u16() as u64) as u16; | ||
| let message = err_obj | ||
| .get("message") | ||
| .and_then(|m| m.as_str()) | ||
| .map(sanitize_for_terminal) | ||
| .unwrap_or_else(|| "Unknown error".to_string()); | ||
| let reason = err_obj | ||
| .get("errors") | ||
| .and_then(|e| e.as_array()) | ||
| .and_then(|arr| arr.first()) | ||
| .and_then(|e| e.get("reason")) | ||
| .and_then(|r| r.as_str()) | ||
| .or_else(|| err_obj.get("reason").and_then(|r| r.as_str())) | ||
| .map(sanitize_for_terminal) | ||
| .unwrap_or_else(|| "unknown".to_string()); | ||
| let enable_url = if reason == "accessNotConfigured" { | ||
| extract_enable_url(&message) | ||
| } else { | ||
| None | ||
| }; | ||
| return GwsError::Api { | ||
| code, | ||
| message, | ||
| reason, | ||
| enable_url, | ||
| }; | ||
| } | ||
| } | ||
| GwsError::Api { | ||
| code: status.as_u16(), | ||
| message: crate::output::sanitize_for_terminal(body), | ||
| reason: "httpError".to_string(), | ||
| enable_url: None, | ||
| } | ||
| } |
There was a problem hiding this comment.
The new api_error_from_response function duplicates nearly all the logic from the existing (private) handle_error_response function but omits the specialized handling for 401 Unauthorized and 403 Forbidden errors that provide helpful authentication hints to the user (e.g., suggesting gws auth login).
Recommendation: Update the new api_error_from_response to include these authentication hints. While refactoring the existing handle_error_response to use a shared function would ensure consistency, please focus on making the new implementation complete and correct first to avoid introducing changes outside the primary goal of this pull request.
References
- Avoid introducing changes that are outside the primary goal of a pull request to prevent scope creep.
| async fn handle_download(matches: &ArgMatches) -> Result<(), GwsError> { | ||
| use futures_util::StreamExt; | ||
| use tokio::io::AsyncWriteExt; | ||
|
|
||
| let file_id = | ||
| crate::validate::validate_resource_name(matches.get_one::<String>("file").unwrap())?; | ||
| let output_arg = matches.get_one::<String>("output"); | ||
| let export_mime: Option<String> = matches.get_one::<String>("mime-type").cloned(); | ||
| let dry_run = matches.get_flag("dry-run"); | ||
|
|
||
| // Validate export mime-type for dangerous characters if provided | ||
| if let Some(mime) = &export_mime { | ||
| crate::validate::reject_dangerous_chars(mime, "--mime-type")?; | ||
| } | ||
|
|
||
| // 1. Dry-run: short-circuit before any auth or network I/O, consistent with | ||
| // how +upload handles --dry-run (auth is attempted optionally, then skipped). | ||
| if dry_run { | ||
| let out_display = output_arg.map(|s| s.as_str()).unwrap_or("<Drive filename>"); | ||
| println!( | ||
| "{}", | ||
| serde_json::to_string_pretty(&json!({ | ||
| "dry_run": true, | ||
| "file_id": file_id, | ||
| "output": out_display, | ||
| "export_mime_type": export_mime, | ||
| })) | ||
| .unwrap_or_default() | ||
| ); | ||
| return Ok(()); | ||
| } | ||
|
|
||
| let scope = "https://www.googleapis.com/auth/drive.readonly"; | ||
| let token = auth::get_token(&[scope]) | ||
| .await | ||
| .map_err(|e| GwsError::Auth(format!("Drive auth failed: {e}")))?; | ||
|
|
||
| let client = crate::client::build_client()?; | ||
|
|
||
| // 2. Fetch file metadata to get name and MIME type | ||
| let metadata_url = format!( | ||
| "https://www.googleapis.com/drive/v3/files/{}", | ||
| crate::validate::encode_path_segment(file_id), | ||
| ); | ||
| let meta_resp = crate::client::send_with_retry(|| { | ||
| client | ||
| .get(&metadata_url) | ||
| .query(&[("fields", "name,mimeType")]) | ||
| .bearer_auth(&token) | ||
| }) | ||
| .await | ||
| .map_err(|e| GwsError::Other(anyhow::anyhow!("Failed to fetch file metadata: {e}")))?; | ||
|
|
There was a problem hiding this comment.
The handle_download implementation bypasses several core CLI architectural patterns, which leads to the following issues:
- Hardcoded URLs: Lines 220, 302, and 324 use hardcoded Google API strings. This ignores the
RestDescription(Discovery Document) and will break if the CLI is configured with a customroot_url(e.g., for a proxy, VPC Service Control, or alternative environment). - Hardcoded Scopes: Line 212 hardcodes the
drive.readonlyscope. It should instead resolve the required scopes from the Discovery Document for thefiles.getandfiles.exportmethods to ensure compatibility with different organizational policies. - Missing Quota Header: The manual requests do not include the
x-goog-user-projectheader. This header is critical for billing and quota attribution, especially when using Service Accounts. Standard requests inexecutor.rsinclude this automatically viacrate::auth::get_quota_project().
Recommendation: Update handle_download to accept the RestDescription (doc) as an argument and use it to resolve method paths, scopes, and base URLs. Ensure the x-goog-user-project header is added to all RequestBuilder instances.
Description
Adds a
drive +downloadhelper that downloads a Drive file to a local path. This is a multi-step orchestration helper that cannot be expressed as a single Discovery API call:files.get?fields=name,mimeTypeto determine the file's name and typeapplication/vnd.google-apps.*) →files.export?mimeType=<type>, requiring a caller-supplied--mime-typefiles.get?alt=mediaThis complements the existing
+uploadhelper and satisfies the helper justification criteria in AGENTS.md (multi-step orchestration, format translation for native files).Input validation:
--file(Drive file ID):validate_resource_name()— rejects traversal/injection--output(local path):validate_safe_file_path()— rejects path traversal outside CWD--mime-type:reject_dangerous_chars()— rejects control charactersUsage:
Dry Run Output:
Checklist:
AGENTS.mdguidelines (no generatedgoogle-*crates).cargo fmt --allto format the code perfectly.cargo clippy -- -D warningsand resolved all warnings.pnpx changeset) to document my changes.