Skip to content

Fix/chinadoi fallback support#23

Merged
wenh06 merged 6 commits intomasterfrom
fix/chinadoi-fallback-support
Feb 22, 2026
Merged

Fix/chinadoi fallback support#23
wenh06 merged 6 commits intomasterfrom
fix/chinadoi-fallback-support

Conversation

@wenh06
Copy link
Copy Markdown
Contributor

@wenh06 wenh06 commented Jan 29, 2026

This PR addresses the issue where DOIs registered via ChinaDOI (e.g., CNKI publications like 10.13748/j.cnki...) failed to resolve. These sources often do not support application/x-bibtex content negotiation and return 403 Forbidden or redirect to an intermediate HTML page.

The streamlit app is also updated along with this change.

Copilot AI review requested due to automatic review settings January 29, 2026 09:06
@wenh06 wenh06 enabled auto-merge January 29, 2026 09:06
@wenh06 wenh06 self-assigned this Jan 29, 2026
@wenh06 wenh06 added bug Something isn't working enhancement New feature or request labels Jan 29, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented Jan 29, 2026

Codecov Report

❌ Patch coverage is 88.00000% with 6 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (master@b45ba29). Learn more about missing BASE report.
⚠️ Report is 7 commits behind head on master.

Files with missing lines Patch % Lines
bib_lookup/bib_lookup.py 88.00% 6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff            @@
##             master      #23   +/-   ##
=========================================
  Coverage          ?   97.04%           
=========================================
  Files             ?        8           
  Lines             ?     1254           
  Branches          ?        0           
=========================================
  Hits              ?     1217           
  Misses            ?       37           
  Partials          ?        0           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a fallback path for DOIs that don’t support application/x-bibtex content negotiation (notably ChinaDOI/CNKI), and updates the Streamlit UI to surface a manual-visit link when automatic BibTeX lookup isn’t possible.

Changes:

  • Switch DOI resolution to use GET + content validation for BibTeX responses, with a browser-header redirect probe for ChinaDOI/CNKI targets.
  • Add Streamlit handling for “manual check required” lookup results (warn + link to resolved article page).
  • Add an integration test case for a ChinaDOI-style DOI and document the fix in the changelog.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

File Description
bib_lookup/bib_lookup.py Implements GET-based DOI lookup and a redirect-fallback message path for ChinaDOI/CNKI.
streamlit_app.py Detects redirect-flag results and shows a warning + link button instead of treating it as BibTeX.
test/test_doi_bib_lookup.py Adds coverage for the new ChinaDOI fallback behavior.
CHANGELOG.rst Documents the ChinaDOI/CNKI fallback fix.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread bib_lookup/bib_lookup.py Outdated
Comment thread bib_lookup/bib_lookup.py
Comment on lines +651 to +674
# Standard Attempt: Use API (Crossref/DataCite)
try:
r = self.session.post(**feed_content)
res = r.content.decode("utf-8")
except requests.Timeout:
res = self.timeout_err
except requests.RequestException:
res = self.network_err
if self.verbose > 3:
print_func(f"via `_handle_doi`, fetched content = {res}")
return res
r = self.session.get(**feed_content)

if r.status_code == 200:
res = r.content.decode("utf-8")

if is_requesting_bibtex:
# If we asked for BibTeX, strictly check if it looks like BibTeX (starts with @)
# This filters out ChinaDOI's HTML pages when they return status 200.
if res.strip().startswith("@"):
if self.verbose > 3:
print_func(f"via `_handle_doi`, fetched content = {res}")
return res
else:
# If we asked for Text/XML/RIS, trust the 200 OK response.
# This fixes the bug where format="text" was returning Network Error.
if self.verbose > 3:
print_func(f"via `_handle_doi`, fetched content = {res}")
return res

except (requests.Timeout, requests.RequestException):
pass

Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_handle_doi currently swallows requests.Timeout / RequestException (just pass) and later returns self.network_err, which loses the timeout vs network distinction and changes previous behavior. Also, by only decoding/returning the body on status_code == 200, responses like 404 that include "DOI Not Found" won’t reach _handle_network_error and may be misreported as Network Error. Consider: (1) returning self.timeout_err on Timeout and self.network_err on other RequestException; and (2) decoding the response body (or handling specific status codes) even when status_code != 200 so error mapping still works.

Copilot uses AI. Check for mistakes.
Comment thread bib_lookup/bib_lookup.py
Comment on lines +685 to +699
try:
r_probe = requests.get(
url, headers=browser_headers, timeout=feed_content.get("timeout", 10), allow_redirects=True
)

final_url = r_probe.url

# Check for ChinaDOI / CNKI
if "chndoi.org" in final_url or "cnki" in final_url:
msg = f"[Manual Check Required] Automatic lookup failed. Please visit: {final_url}"
if self.verbose >= 0:
print("-" * 60)
print(f"[!] {msg}")
print("-" * 60)
return msg
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fallback probe uses requests.get(...) instead of self.session.get(...), which bypasses the session configuration this class sets up (connection pooling/retries/proxies, and any user customization of self.session). It also prints unconditionally for the manual-check case because self.verbose >= 0 is always true at default verbosity, which will produce unexpected stdout in normal library usage. Use self.session.get(...) and gate printing behind the same verbosity convention used elsewhere (e.g., self.verbose >= 1).

Copilot uses AI. Check for mistakes.
Comment thread bib_lookup/bib_lookup.py Outdated
Comment thread bib_lookup/bib_lookup.py Outdated
print_func(f"via `_handle_doi`, fetched content = {res}")
return res

except (requests.Timeout, requests.RequestException):
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'except' clause does nothing but pass and there is no explanatory comment.

Copilot uses AI. Check for mistakes.
Comment thread bib_lookup/bib_lookup.py
print("-" * 60)
return msg

except Exception:
Copy link

Copilot AI Jan 29, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'except' clause does nothing but pass and there is no explanatory comment.

Suggested change
except Exception:
except Exception:
# Intentionally ignore all errors from this best-effort probe.
# If this browser-style request fails, we simply fall back to `self.network_err` below.

Copilot uses AI. Check for mistakes.
wenh06 and others added 4 commits February 22, 2026 09:42
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@wenh06 wenh06 disabled auto-merge February 22, 2026 07:21
@wenh06 wenh06 merged commit 99d3c16 into master Feb 22, 2026
10 checks passed
@wenh06 wenh06 deleted the fix/chinadoi-fallback-support branch February 22, 2026 07:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants