feat: implement remote gguf inspection on hugging face by pipe1os · Pull Request #39 · pipe1os/modelinfo-cli

pipe1os · 2026-06-27T15:16:57Z

Summary

This PR implements remote GGUF inspection for Hugging Face repositories, enabling the CLI to parse remote GGUF weights and calculate their VRAM footprints without downloading the full files.

Motivation & Context

Prior to this change, remote inspection on Hugging Face only supported SafeTensors format. Attempting to parse GGUF repositories resulted in a crash. This change introduces support for single and multi-quantization GGUF repositories, showing a comparison table of available files when no specific quantization is targeted.
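As a rough illustration of inspecting a header without downloading the file, the sketch below requests only the leading bytes with an HTTP Range header and decodes the fixed GGUF preamble (magic, version, tensor count, metadata key-value count). The names `fetch_range` and `parse_gguf_preamble` are hypothetical and not taken from this PR, and the layout assumes GGUF version 2 or later, where both counts are 64-bit little-endian integers.

```python
import struct
import urllib.request

GGUF_MAGIC = b"GGUF"
PREAMBLE_SIZE = 24  # 4 (magic) + 4 (version) + 8 (tensor count) + 8 (kv count)


def fetch_range(url: str, start: int, length: int, timeout: float = 10.0) -> bytes:
    """Fetch up to `length` bytes starting at `start` via an HTTP Range request."""
    request = urllib.request.Request(
        url, headers={"Range": f"bytes={start}-{start + length - 1}"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Cap the read in case the server ignores Range and sends the whole file.
        return response.read(length)


def parse_gguf_preamble(data: bytes) -> dict:
    """Decode the fixed-size start of a GGUF file (assumes version >= 2)."""
    if len(data) < PREAMBLE_SIZE:
        raise ValueError("need at least 24 bytes to read the GGUF preamble")
    if data[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file: bad magic")
    # Little-endian: uint32 version, uint64 tensor count, uint64 metadata kv count.
    version, tensor_count, kv_count = struct.unpack_from("<IQQ", data, 4)
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }
```

A caller would typically pass `fetch_range(url, 0, PREAMBLE_SIZE)` to `parse_gguf_preamble` and then continue with further ranged reads for the variable-length metadata section.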

Type of Change

New feature (non-breaking change which adds functionality)

How Has This Been Tested?

The changes were tested using pytest:

Unit tests added to tests/test_parsers.py to cover remote GGUF single, group, and targeted parsing with mocked HTTP responses.
Unit tests added to tests/test_cli.py to cover CLI routing and print_model_info group rendering.
Manual testing against a live HF GGUF repository (bartowski/Meta-Llama-3-8B-Instruct-GGUF).
Unit tests
Integration tests
Manual testing

Screenshots (if appropriate)

N/A

Checklist

My code follows the code style of this project.
My commit messages follow the Conventional Commits format, are lowercase, imperative, and specific.
I have updated the documentation accordingly (if applicable).
I have added tests to cover my changes.
All new and existing tests passed.

coderabbitai · 2026-06-27T15:17:04Z

Warning

Review limit reached

@pipe1os, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 13 minutes and 41 seconds. Learn how PR review limits work.

To continue reviewing without waiting, enable usage-based billing in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 8befa85b-8ade-48bb-a859-8228abd5c930

📥 Commits

Reviewing files that changed from the base of the PR and between f9f730c and 357ee16.

📒 Files selected for processing (7)

README.md
src/modelinfo/cli.py
src/modelinfo/parsers/gguf.py
src/modelinfo/parsers/huggingface.py
src/modelinfo/ui.py
tests/test_cli.py
tests/test_parsers.py


codacy-production · 2026-06-27T15:17:44Z

Up to standards ✅

🟢 Issues: 0 new issues

🟢 Metrics: complexity 72 · duplication 8

AI Reviewer: first review requested successfully. AI can make mistakes. Always validate suggestions.

TIP: This summary will be updated as you push new changes.

codacy-production

Pull Request Overview

This PR successfully implements remote GGUF inspection for Hugging Face repositories using HTTP Range requests, which is a significant functional addition. However, the PR is currently not up to standards according to Codacy analysis.

There are two primary areas of concern: high complexity and potential stability issues. Specifically, fetch_huggingface_repo has seen a massive complexity increase (+37) that makes it harder to maintain. Furthermore, the RemoteFileStream implementation lacks buffer management, which could lead to high memory usage when encountering malformed headers. There are also implementation gaps in the UI where user-provided GPU utilization parameters are ignored in the comparison table logic. While error handling for gated models and missing repositories was implemented, it lacks automated test verification.

About this PR

The PR adds handling for 401 Unauthorized (gated models) and 404 Not Found scenarios, but these paths are not covered by the current test suite. Please add unit tests to verify the behavior of the remote parser when these HTTP errors occur.
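One way such error-path tests could look is sketched below. It is self-contained and makes several assumptions: the exception classes, `fetch_header_bytes`, and `check_error_mapping` are hypothetical stand-ins, and the HTTP layer is assumed to be `urllib` from the standard library, which may differ from what the project actually uses.

```python
import urllib.error
import urllib.request
from unittest import mock


class GatedRepoError(Exception):
    """Hypothetical error for 401 responses (gated or private repositories)."""


class RepoNotFoundError(Exception):
    """Hypothetical error for 404 responses (missing repository or file)."""


def fetch_header_bytes(url: str, length: int = 1024, timeout: float = 10.0) -> bytes:
    """Hypothetical stand-in for the remote parser's first ranged read."""
    request = urllib.request.Request(url, headers={"Range": f"bytes=0-{length - 1}"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read(length)
    except urllib.error.HTTPError as exc:
        if exc.code == 401:
            raise GatedRepoError(url) from exc
        if exc.code == 404:
            raise RepoNotFoundError(url) from exc
        raise  # other statuses propagate unchanged


def _http_error(code: int) -> urllib.error.HTTPError:
    """Build an HTTPError carrying the given status code."""
    return urllib.error.HTTPError("https://example.invalid", code, "error", None, None)


def check_error_mapping(code: int, expected: type) -> bool:
    """Return True if a mocked HTTP status is translated into `expected`."""
    with mock.patch("urllib.request.urlopen", side_effect=_http_error(code)):
        try:
            fetch_header_bytes("https://example.invalid/model.gguf")
        except expected:
            return True
    return False
```

In a pytest suite, the body of `check_error_mapping` would more naturally be written with `pytest.raises`, parametrized over the status codes.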

1 comment outside of the diff

src/modelinfo/cli.py

line 133 · 🟡 MEDIUM RISK
This function is too complex (16). It handles path resolution, remote repo detection, and multiple file format logic in one block. Refactoring this into a dispatcher pattern would improve maintainability.

Test suggestions

Remote GGUF single file header parsing via range-based streaming
Remote GGUF multi-file (group) repository detection and metadata aggregation
Targeting a specific quantization file within a remote Hugging Face repository
CLI UI rendering of the GGUF quantization comparison table with VRAM estimates
Handling of 401 Unauthorized (Gated models) and 404 Not Found (Missing repos) for GGUF paths

Prompt proposal for missing tests

Consider implementing these tests if applicable:
1. Handling of 401 Unauthorized (Gated models) and 404 Not Found (Missing repos) for GGUF paths

Low confidence findings

The UI uses a hardcoded 600MB overhead fallback for GGUF groups (line 82). This fixed value may lead to inaccurate 'Fits' calculations for models that deviate significantly from standard overhead patterns.


codacy-production · 2026-06-27T15:19:01Z

+            row_data = [filename, file_size_str, kv_cache_str, total_vram_str]
+            if show_fits:
+                utilization = total_vram_bytes / (max_vram_gb * 1024**3) if max_vram_gb > 0 else 2.0
+                if utilization <= 0.90:


🟡 MEDIUM RISK

The 'Fits' column logic should use the 'gpu_util' parameter instead of a hardcoded 0.90 threshold to respect the user's CLI configuration.
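A minimal sketch of the suggested change, assuming `gpu_util` is a fraction between 0 and 1 and that the function name `fits_in_vram` is hypothetical. It mirrors the quoted snippet but compares against the caller's threshold instead of the literal 0.90.

```python
def fits_in_vram(total_vram_bytes: int, max_vram_gb: float, gpu_util: float = 0.90) -> bool:
    """Return True if the estimated footprint stays within the allowed VRAM share."""
    if max_vram_gb <= 0:
        # No usable VRAM budget: treat as not fitting, like the 2.0 fallback in the diff.
        return False
    utilization = total_vram_bytes / (max_vram_gb * 1024**3)
    return utilization <= gpu_util
```

With this shape, a 9.5 GiB footprint on a 10 GiB card is rejected at the default threshold but accepted when the user passes a higher `gpu_util`, such as 0.97.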

codacy-production · 2026-06-27T15:19:01Z

+        pass
+
+
 def fetch_huggingface_repo(repo_id: str, fetch_tensors: bool = False, timeout: float = 10.0) -> Tuple[Dict[str, Any], Dict[str, Any] | None, str, float]:


🟡 MEDIUM RISK

Suggestion: The function fetch_huggingface_repo has grown to ~150 lines and contains several branching paths for different repository types. Refactor this by splitting the logic into smaller private helper methods (e.g., _handle_gguf_repo, _handle_safetensors_repo) for each model format detection path.
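The suggested split could take roughly the following shape. Only the helper names `_handle_gguf_repo` and `_handle_safetensors_repo` come from the review comment. Everything else is an assumption made for illustration: `detect_format`, `dispatch_repo`, the return values, and passing the file listing in directly instead of fetching it from the Hub.

```python
from typing import Any, Callable, Dict, List


def _handle_gguf_repo(repo_id: str, filenames: List[str]) -> Dict[str, Any]:
    """Placeholder GGUF path: collect the candidate .gguf files."""
    return {"repo": repo_id, "format": "gguf",
            "files": [f for f in filenames if f.endswith(".gguf")]}


def _handle_safetensors_repo(repo_id: str, filenames: List[str]) -> Dict[str, Any]:
    """Placeholder SafeTensors path: collect the .safetensors shards."""
    return {"repo": repo_id, "format": "safetensors",
            "files": [f for f in filenames if f.endswith(".safetensors")]}


def detect_format(filenames: List[str]) -> str:
    """Pick a format from the repository file listing (SafeTensors takes priority)."""
    if any(f.endswith(".safetensors") for f in filenames):
        return "safetensors"
    if any(f.endswith(".gguf") for f in filenames):
        return "gguf"
    raise ValueError("no supported weight files found")


# One entry per supported format keeps the top-level function free of branching.
_HANDLERS: Dict[str, Callable[[str, List[str]], Dict[str, Any]]] = {
    "gguf": _handle_gguf_repo,
    "safetensors": _handle_safetensors_repo,
}


def dispatch_repo(repo_id: str, filenames: List[str]) -> Dict[str, Any]:
    """Route a repository to the handler for its detected format."""
    return _HANDLERS[detect_format(filenames)](repo_id, filenames)
```

Adding another format then means adding one handler and one table entry, and each handler can be unit-tested on its own.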

codacy-production · 2026-06-27T15:19:01Z

+        self.url = url
+        self.chunk_size = chunk_size
+        self.timeout = timeout
+        self.buffer = b""


🟡 MEDIUM RISK

Suggestion: The buffer in RemoteFileStream accumulates all read data and is never cleared. Since this stream is primarily used for header parsing, consider implementing a sliding window or limiting the maximum buffer size to prevent high memory usage on malformed files.
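A sketch of one way to bound memory use, under stated assumptions: `BoundedRangeReader` is a hypothetical class, not the PR's `RemoteFileStream`, and it takes an injected `fetch(start, length)` callable so it can run without network access. Consumed bytes are dropped from the buffer, and a hard cap on total bytes fetched stops a malformed length field from pulling in the whole file.

```python
from typing import Callable


class BoundedRangeReader:
    """Sequential reader over a ranged fetch function with a hard read cap."""

    def __init__(self, fetch: Callable[[int, int], bytes],
                 chunk_size: int = 64 * 1024, max_bytes: int = 16 * 1024 * 1024):
        self._fetch = fetch
        self._chunk_size = chunk_size
        self._max_bytes = max_bytes
        self._buffer = b""   # holds only bytes not yet handed to the caller
        self._fetched = 0    # total bytes pulled from the remote so far

    def read(self, n: int) -> bytes:
        """Return exactly `n` bytes, fetching more chunks as needed."""
        while len(self._buffer) < n:
            if self._fetched >= self._max_bytes:
                raise ValueError("header exceeds the configured read limit")
            length = min(self._chunk_size, self._max_bytes - self._fetched)
            chunk = self._fetch(self._fetched, length)
            if not chunk:
                raise EOFError("unexpected end of file")
            self._fetched += len(chunk)
            self._buffer += chunk
        # Hand out the requested bytes and discard them from the buffer.
        data, self._buffer = self._buffer[:n], self._buffer[n:]
        return data
```

The cap trades completeness for safety: a legitimate header larger than `max_bytes` would also be rejected, so the limit needs to be sized generously.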

codacy-production · 2026-06-27T15:19:01Z

+    assert len(info["tensors"]["__metadata__"]["gguf_variants"]) == 2
+
+
+def test_print_model_info_gguf_group(capsys):


⚪ LOW RISK

This test method exceeds the length limit (63 lines). Move the large tensors and footprint dictionary definitions into reusable pytest fixtures or a local helper function.
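The local-helper option might look like the sketch below. The nesting `info["tensors"]["__metadata__"]["gguf_variants"]` follows the assertion quoted above. The helper name `make_gguf_group_info` and the per-variant fields (`filename`, `size_bytes`) are assumptions made for illustration.

```python
from typing import Any, Dict, List


def make_gguf_group_info(variant_names: List[str]) -> Dict[str, Any]:
    """Build a minimal GGUF group payload for tests (field names are hypothetical)."""
    variants = [
        {"filename": name, "size_bytes": (index + 4) * 1024**3}
        for index, name in enumerate(variant_names)
    ]
    return {"tensors": {"__metadata__": {"gguf_variants": variants}}}
```

Each test can then build its input in one line, for example `make_gguf_group_info(["m.Q4_K_M.gguf", "m.Q8_0.gguf"])`, and stay well under the method length limit. Wrapping the same builder in a pytest fixture would work equally well.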

codacy-production · 2026-06-27T15:19:02Z

+            except Exception:
+                raise


⚪ LOW RISK

Nitpick: This exception handler is redundant and can be removed as it only re-raises the exception.

codacy-production · 2026-06-27T15:19:02Z

-
+        url = f"{_get_hf_endpoint()}/{real_repo_id}/resolve/main/{target_filename}"
+        stream = RemoteFileStream(url, timeout=timeout)
+        from modelinfo.parsers.gguf import parse_gguf_header


⚪ LOW RISK

Nitpick: The import of parse_gguf_header is duplicated multiple times. Move it to the top of the fetch_huggingface_repo function to keep the logic clean.


implement remote gguf inspection on hugging face

1b1a090

split print_model_info test to comply with codacy method size limit

71ef3a3

codacy-production Bot reviewed Jun 27, 2026


fix codacy issues: add read limit, honor gpu_util, modularize hf parser, add error tests

d0c5474

pipe1os changed the title ~~implement remote gguf inspection on hugging face~~ feat: implement remote gguf inspection on hugging face Jun 27, 2026

pipe1os self-assigned this Jun 27, 2026

pipe1os added 3 commits June 27, 2026 11:24

refactor: split concurrent shards fetching to lower cyclomatic complexity

bebe2c1

fix codacy issues: compute GGUF group variant overhead dynamically

6555e0e

docs: document remote gguf inspection options in README.md

357ee16

pipe1os merged commit 0e82634 into main Jun 27, 2026
11 checks passed

pipe1os deleted the advisor/004-remote-gguf-inspection branch June 27, 2026 15:33

pipe1os merged 6 commits into main from advisor/004-remote-gguf-inspection
