Fix "JSON.generate: UTF-8 string passed as BINARY" warning for text attachments #762
Open
andreaslillebo wants to merge 3 commits into
…ttachments
All four content-loading paths in Attachment return ASCII-8BIT-tagged strings (File.binread, binmode tempfile reads, ActiveStorage downloads, Faraday response bodies). For text/* attachments the bytes are valid UTF-8 but carry the binary tag, which propagates into the request body and trips json's deprecation warning (a hard error in json 3.0). Re-tag the content as UTF-8 once the mime type is known to be text-like.
Disable Metrics/PerceivedComplexity inline on Attachment#content, matching the established pattern in lib/ruby_llm/error.rb, models.rb, and stream_accumulator.rb. Use described_class in the new spec.
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@           Coverage Diff           @@
##             main     #762   +/-   ##
=======================================
  Coverage   87.05%   87.05%
=======================================
  Files         119      119
  Lines        5594     5595    +1
  Branches     1407     1408    +1
=======================================
+ Hits         4870     4871    +1
  Misses        724      724
Fix "JSON.generate: UTF-8 string passed as BINARY" warning for text attachments
Problem
Sending a text attachment with non-ASCII content trips a deprecation warning that becomes a hard error in json 3.0:
Cause
The path the bytes take from hei.txt to JSON.generate:
File.binread reads bytes as ASCII-8BIT. The tag rides through for_llm's string interpolation into the request body, where JSON.generate rejects it. The other three content loaders (load_content_from_io, load_content_from_active_storage, fetch_content) hit the same chain.
Solution
Re-tag the content as UTF-8 once the mime type is known to be text-like:
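The PR's actual diff is not reproduced in this description, so the following is only a minimal sketch of the idea, using the standard library alone. The helper names (wrap_for_request, binread_roundtrip, retag_text_content), the "<file>" wrapper, and the start_with?("text/") check are all assumptions made for illustration; the real for_llm and the real "text-like" predicate may look different. The valid_encoding? guard is also an addition of this sketch, not something the PR is known to do.

```ruby
require "json"
require "tmpdir"

# Hypothetical stand-in for the string interpolation done in for_llm.
def wrap_for_request(content)
  "<file>#{content}</file>"
end

# Writes text to a temporary hei.txt and reads it back with File.binread,
# which always returns a binary-tagged (ASCII-8BIT) string.
def binread_roundtrip(text)
  Dir.mktmpdir do |dir|
    path = File.join(dir, "hei.txt")
    File.binwrite(path, text)
    File.binread(path)
  end
end

# Hypothetical helper: re-tag text-like content as UTF-8.
# force_encoding changes only the tag; the bytes are left untouched.
def retag_text_content(content, mime_type)
  return content unless mime_type.to_s.start_with?("text/")

  retagged = content.dup.force_encoding(Encoding::UTF_8)
  # Assumption of this sketch: keep the binary tag if the bytes are not valid UTF-8.
  retagged.valid_encoding? ? retagged : content
end

raw = binread_roundtrip("Hei p\u00E5 deg")
puts raw.encoding == Encoding::BINARY                    # binary tag from File.binread
puts wrap_for_request(raw).encoding == Encoding::BINARY  # tag survives interpolation

fixed = retag_text_content(raw, "text/plain")
puts wrap_for_request(fixed).encoding == Encoding::UTF_8 # request body is UTF-8 again
puts JSON.generate("content" => wrap_for_request(fixed))
```

Because only the encoding tag changes, non-text attachments (for example image/png in this sketch) pass through with their binary tag intact.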
Test
Adds a non-ASCII fixture (spec/fixtures/multilingual.txt) and a regression test:
The test fails before the fix with the json gem's deprecation warning, and passes after.
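The spec added by the PR is not shown here. As a rough, plain-Ruby illustration of the same kind of check (not the PR's RSpec code), the sketch below captures $stderr while serializing content that has already been re-tagged as UTF-8 and verifies that nothing was written. The capture_stderr and fixture_content_after_fix helpers and the sample string are hypothetical; the real test reads spec/fixtures/multilingual.txt through Attachment.

```ruby
require "json"
require "stringio"

# Runs the block and returns whatever it wrote to $stderr.
def capture_stderr
  original = $stderr
  $stderr = StringIO.new
  yield
  $stderr.string
ensure
  $stderr = original
end

# Hypothetical stand-in for the fixture: UTF-8 bytes that carried the binary
# tag and were then re-tagged as UTF-8, as the fix is described to do.
def fixture_content_after_fix
  "Bokm\u00E5l, \u65E5\u672C\u8A9E".b.force_encoding(Encoding::UTF_8)
end

stderr_output = capture_stderr do
  JSON.generate("content" => fixture_content_after_fix)
end

raise "unexpected warning: #{stderr_output}" unless stderr_output.empty?
puts "no warning emitted"
```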