Description
In scripts/1-fetch/github_fetch.py, the script fetches public repository counts for various Creative Commons tools. When parsing the GitHub API search response, the code expects to find "total_count" in the returned JSON object:
count = search_data["total_count"]
However, if the GitHub API returns an error response (such as "API Rate Limit Exceeded" or "Invalid Token" / "Bad credentials"), "total_count" will not exist in the response payload. As a result, the script encounters a KeyError on "total_count", which is caught by:
except KeyError as e:
    raise shared.QuantifyingException(f"KeyError: {e}", 1)
This swallows the actual error details returned by the GitHub API (e.g., in the "message" field), making debugging extremely difficult without manually outputting the API response payload.
Reproduction
- Run python scripts/1-fetch/github_fetch.py using a rate-limited network or an invalid/expired GH_TOKEN.
- The GitHub API returns a JSON response lacking the "total_count" key (e.g., containing "message": "API rate limit exceeded..." or "message": "Bad credentials").
- The script throws KeyError: 'total_count', which is converted into QuantifyingException("KeyError: 'total_count'", 1).
- Observe the script exiting with a generic KeyError: 'total_count' exception, concealing the actual API rate limiting or authentication message.
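The steps above can also be reproduced offline, without hitting the API. The following is a minimal sketch, not the actual script: `QuantifyingException` here is a local stand-in for `shared.QuantifyingException` (its constructor shape is assumed from the call quoted in the description), `parse_count` is a hypothetical helper that mirrors the current parsing logic, and the error payload is illustrative.

```python
# Stand-in for shared.QuantifyingException (assumed shape: message, exit code).
class QuantifyingException(Exception):
    def __init__(self, message, exit_code=1):
        super().__init__(message)
        self.message = message
        self.exit_code = exit_code


def parse_count(search_data):
    """Hypothetical helper mirroring the parsing logic described above."""
    try:
        return search_data["total_count"]
    except KeyError as e:
        raise QuantifyingException(f"KeyError: {e}", 1)


# Illustrative error payload: no "total_count" key, only a "message".
error_payload = {"message": "Bad credentials"}

try:
    parse_count(error_payload)
except QuantifyingException as exc:
    reported = exc.message
    print(reported)  # The GitHub "message" text is absent from the report.
```

The reported text mentions only the missing key, so the "Bad credentials" message never reaches the user.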
Expectation
The script should check for GitHub API error messages (like "message" or other error indicators in the JSON response) before attempting to access "total_count", logging the specific error payload from GitHub to provide clarity on why the request failed.
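One possible shape for this check is sketched below. It is a suggestion under stated assumptions rather than a drop-in patch: `extract_total_count` and `LOGGER` are hypothetical names, `QuantifyingException` is a local stand-in for `shared.QuantifyingException`, and the fallback text used when no "message" field is present is invented for illustration.

```python
import logging

LOGGER = logging.getLogger(__name__)  # hypothetical logger name


class QuantifyingException(Exception):
    """Stand-in for shared.QuantifyingException (assumed: message, exit code)."""

    def __init__(self, message, exit_code=1):
        super().__init__(message)
        self.message = message
        self.exit_code = exit_code


def extract_total_count(search_data):
    """Return total_count, surfacing GitHub's own error text when it is absent."""
    if not isinstance(search_data, dict):
        raise QuantifyingException(
            f"Unexpected GitHub API response: {search_data!r}", 1
        )
    if "total_count" not in search_data:
        # Error responses are expected to carry a "message" field.
        api_message = search_data.get(
            "message", "no 'message' field in response"
        )
        # Log the full payload so the failure can be diagnosed later.
        LOGGER.error("GitHub API error payload: %s", search_data)
        raise QuantifyingException(f"GitHub API error: {api_message}", 1)
    return search_data["total_count"]
```

With this in place, `extract_total_count({"message": "Bad credentials"})` raises an exception whose text includes "Bad credentials" instead of a bare KeyError, while a normal search response still returns its count.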