Enhance interpolateParams to correctly handle placeholders#1732
Enhance interpolateParams to correctly handle placeholders#1732rusher wants to merge 4 commits intogo-sql-driver:masterfrom
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughReplaced placeholder interpolation with a byte-wise state machine that only substitutes Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
connection.go (2)
248-271: Consider using idiomatic Go constant definitions.The byte constants use runtime conversions and non-standard naming. Consider defining them as proper constants:
-var ( - QUOTE_BYTE = byte('\'') - DBL_QUOTE_BYTE = byte('"') - BACKSLASH_BYTE = byte('\\') - QUESTION_MARK_BYTE = byte('?') - SLASH_BYTE = byte('/') - STAR_BYTE = byte('*') - HASH_BYTE = byte('#') - MINUS_BYTE = byte('-') - LINE_FEED_BYTE = byte('\n') - RADICAL_BYTE = byte('`') -) +const ( + quoteByte = '\'' + dblQuoteByte = '"' + backslashByte = '\\' + questionMarkByte = '?' + slashByte = '/' + starByte = '*' + hashByte = '#' + minusByte = '-' + lineFeedByte = '\n' + backtickByte = '`' +)
272-274: Simplify slice initialization.The slice initialization can be simplified for better readability:
-paramPositions := make([]int, 0) +var paramPositions []intconnection_test.go (1)
206-214: Consider adding edge case test coverage.While the current test cases are comprehensive, consider adding tests for additional edge cases:
// Escaped quotes in strings {"SELECT 'don\\'t', ?", []driver.Value{int64(42)}, "SELECT 'don\\'t', 42", false}, // Double-quoted strings {"SELECT \"?\", ?", []driver.Value{int64(42)}, "SELECT \"?\", 42", false}, // Nested quotes {"SELECT \"It's a string\", ?", []driver.Value{int64(42)}, "SELECT \"It's a string\", 42", false}, // Comment without newline at end {"SELECT ? -- comment", []driver.Value{int64(42)}, "SELECT 42 -- comment", false},
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
connection.go(5 hunks)connection_test.go(1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: shogo82148
PR: go-sql-driver/mysql#1576
File: CHANGELOG.md:1-6
Timestamp: 2024-07-26T20:55:56.049Z
Learning: PRs #1562 and #1570 in the go-sql-driver/mysql repository collectively address the same underlying issue related to fixing a race condition when a context is canceled, as documented in the CHANGELOG.md for version 1.8.1.
connection.go (1)
Learnt from: shogo82148
PR: go-sql-driver/mysql#1576
File: CHANGELOG.md:1-6
Timestamp: 2024-07-26T20:55:56.049Z
Learning: PRs #1562 and #1570 in the go-sql-driver/mysql repository collectively address the same underlying issue related to fixing a race condition when a context is canceled, as documented in the CHANGELOG.md for version 1.8.1.
🧬 Code Graph Analysis (1)
connection_test.go (1)
dsn.go (1)
Config(37-84)
⏰ Context from checks skipped due to timeout of 90000ms (20)
- GitHub Check: test (windows-latest, 1.24, mariadb-10.11)
- GitHub Check: test (windows-latest, 1.24, mariadb-10.6)
- GitHub Check: test (windows-latest, 1.24, mariadb-11.2)
- GitHub Check: test (ubuntu-latest, 1.24, mariadb-10.11)
- GitHub Check: test (macos-latest, 1.24, mariadb-10.11)
- GitHub Check: test (macos-latest, 1.24, mariadb-10.6)
- GitHub Check: test (ubuntu-latest, 1.24, mariadb-10.5)
- GitHub Check: test (windows-latest, 1.24, 9.0)
- GitHub Check: test (macos-latest, 1.24, mariadb-11.2)
- GitHub Check: test (ubuntu-latest, 1.24, mariadb-10.6)
- GitHub Check: test (macos-latest, 1.24, 9.0)
- GitHub Check: test (ubuntu-latest, 1.24, mariadb-11.1)
- GitHub Check: test (macos-latest, 1.24, 8.0)
- GitHub Check: test (ubuntu-latest, 1.24, 5.7)
- GitHub Check: test (ubuntu-latest, 1.24, mariadb-11.4)
- GitHub Check: test (ubuntu-latest, 1.24, 9.0)
- GitHub Check: test (ubuntu-latest, 1.24, mariadb-11.2)
- GitHub Check: test (ubuntu-latest, 1.24, 8.4)
- GitHub Check: lint
- GitHub Check: Analyze (go)
🔇 Additional comments (5)
connection.go (4)
248-341: Excellent state machine implementation for SQL parsing.The
findParamPositionsfunction implements a comprehensive state machine that correctly handles various SQL contexts including comments (--,#,/* */), string literals, backticks, and escape sequences. The logic properly tracks state transitions and accurately identifies real parameter placeholders while ignoring those inside comments or strings.The handling of the
noBackslashEscapesflag is particularly well done, respecting MySQL's SQL mode settings.
344-348: Improved parameter counting logic.The change from simple character counting to using
findParamPositionsis a significant improvement. The function now correctly validates that the number of real parameters matches the number of arguments before attempting interpolation.
400-404: Simplified and consistent escaping logic.The refactored escaping logic using the
noBackslashEscapesflag is cleaner and more consistent. The conditional logic is now easier to understand and maintain across different data types (JSON, bytes, strings).Also applies to: 411-415, 420-424
362-435: Correct implementation of position-based parameter replacement.The new interpolation loop correctly processes parameters based on their actual positions in the query rather than simple sequential replacement. The
lastIdxtracking ensures that query segments are properly appended between parameter substitutions.connection_test.go (1)
190-236: Comprehensive test coverage for enhanced parameter interpolation.The
TestInterpolateParamsWithCommentstest provides excellent coverage of the new parameter parsing logic. It correctly tests:
- Single-line comments with
--and#- Multi-line comments with
/* */- String literals with single quotes
- Backtick identifiers
- Complex queries with multiple comment types
- Error cases where parameter count doesn't match argument count
The test case on line 216 correctly expects
driver.ErrSkipbecause there are 4 real placeholders but only 3 arguments provided.
|
Please run BenchmarkInterpolation and write the results. Is this algorithm identical to the MySQL or MariaDB? And are there examples of this algorithm implemented in other database drivers? |
…s with comments, strings, and backticks. * Add `findParamPositions` to identify real parameter positions * Update and expand related tests.
|
force push implementation with performance improvement. current implementation : initial proposal: new proposal: There is only one loop now, and escaping methods for parameters have been improved. about other driver using that :
|
|
I hope this is included fast, since it's a security risk. proper client side statement must be handled well. This is particulary dangerous when for example, a query like |
|
is there anything to do still to be merged ? |
|
time. I plan to release v1.9.4 next month, followed by creating a branch for v1.9 and beginning to incorporate changes for v1.10 into the master branch. |
There was a problem hiding this comment.
Pull request overview
This PR improves client-side parameter interpolation so that ? placeholders are only substituted when they appear in real SQL parameter positions (not inside comments, string literals, or backtick-quoted identifiers), and updates escaping helpers/tests accordingly.
Changes:
- Reworked
mysqlConn.interpolateParamsto scan SQL and ignore?inside comments/strings/backticks. - Updated escaping helpers to consistently emit quoted literals (and
_binary'...'for byte slices) and expanded tests. - Added new interpolation tests covering comments, strings, backticks, and escaped strings.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| connection.go | Adds a state-machine parser for interpolateParams to avoid substituting ? inside comments/strings/backticks. |
| connection_test.go | Adds new tests validating interpolation behavior with comments/strings/backticks and removes the prior “skip” expectation for ? in strings. |
| utils.go | Refactors escaping to include proper quoting and optional _binary prefix; adds a shared backslash escape lookup table. |
| utils_test.go | Updates/expands tests for new escaping behavior, including binary _binary'...' cases. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| } | ||
| buf = append(buf, '\'') | ||
| case MINUS_BYTE: | ||
| if state == stateNormal && lastChar == MINUS_BYTE { |
There was a problem hiding this comment.
-- single-line comments in MySQL only start when the second - is followed by whitespace/control (e.g. -- ). The current check enters stateEOLComment whenever two consecutive - appear, which will misparse expressions like SELECT 1--1, ? as a comment and can cause placeholders to be skipped/ErrSkip. Update the detection to require a following whitespace/control character (or end-of-input) before switching to stateEOLComment.
| if state == stateNormal && lastChar == MINUS_BYTE { | |
| if state == stateNormal && lastChar == MINUS_BYTE && | |
| (i+1 == lenQuery || query[i+1] <= ' ') { |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Inada Naoki <songofacandy@gmail.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
connection.go (1)
259-270: Preferconstwith idiomatic Go naming for these byte literals.These single-byte literals never change, so they should be
const, notvar. Go style also preferscamelCase(or a single untyped char literal used inline) overSCREAMING_SNAKE_CASE. Declaring them asvarforces heap/stack allocations and locks you out of use in constant contexts.♻️ Suggested refactor
- var ( - QUOTE_BYTE = byte('\'') - DBL_QUOTE_BYTE = byte('"') - BACKSLASH_BYTE = byte('\\') - QUESTION_MARK_BYTE = byte('?') - SLASH_BYTE = byte('/') - STAR_BYTE = byte('*') - HASH_BYTE = byte('#') - MINUS_BYTE = byte('-') - LINE_FEED_BYTE = byte('\n') - BACKTICK_BYTE = byte('`') - ) + const ( + quoteByte byte = '\'' + dblQuoteByte byte = '"' + backslashByte byte = '\\' + questionMarkByte byte = '?' + slashByte byte = '/' + starByte byte = '*' + hashByte byte = '#' + minusByte byte = '-' + lineFeedByte byte = '\n' + backtickByte byte = '`' + )(References further down in the function would need to be updated accordingly.)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@connection.go` around lines 259 - 270, The byte literals declared as vars (QUOTE_BYTE, DBL_QUOTE_BYTE, BACKSLASH_BYTE, QUESTION_MARK_BYTE, SLASH_BYTE, STAR_BYTE, HASH_BYTE, MINUS_BYTE, LINE_FEED_BYTE, BACKTICK_BYTE) should be converted to untyped package-level consts and renamed to idiomatic camelCase (e.g., quoteByte, dblQuoteByte, backslashByte, questionMarkByte, slashByte, starByte, hashByte, minusByte, lineFeedByte, backtickByte); change the declarations from var to const and update all references in this file/function to use the new constant names so they can be used in constant contexts and avoid unnecessary allocations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@connection.go`:
- Around line 305-308: The parser treats a second dash as starting an EOL
comment unconditionally; update the branch handling MINUS_BYTE (where it checks
state == stateNormal && lastChar == MINUS_BYTE) to peek the next byte (or detect
EOF) and only transition to stateEOLComment when that following byte is
whitespace/control (space, tab, newline, other control) or EOF; otherwise leave
it as normal SQL (so placeholders like ? are not skipped and argPos remains
correct, avoiding driver.ErrSkip). Locate the MINUS_BYTE case in the state
machine in connection.go and add the lookahead/whitespace-or-EOF check before
setting state = stateEOLComment.
---
Nitpick comments:
In `@connection.go`:
- Around line 259-270: The byte literals declared as vars (QUOTE_BYTE,
DBL_QUOTE_BYTE, BACKSLASH_BYTE, QUESTION_MARK_BYTE, SLASH_BYTE, STAR_BYTE,
HASH_BYTE, MINUS_BYTE, LINE_FEED_BYTE, BACKTICK_BYTE) should be converted to
untyped package-level consts and renamed to idiomatic camelCase (e.g.,
quoteByte, dblQuoteByte, backslashByte, questionMarkByte, slashByte, starByte,
hashByte, minusByte, lineFeedByte, backtickByte); change the declarations from
var to const and update all references in this file/function to use the new
constant names so they can be used in constant contexts and avoid unnecessary
allocations.
🪄 Autofix (Beta)
❌ Autofix failed (check again to retry)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
| case MINUS_BYTE: | ||
| if state == stateNormal && lastChar == MINUS_BYTE { | ||
| state = stateEOLComment | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
MySQL double-dash comment syntax whitespace requirement documentation
💡 Result:
Yes, MySQL requires a whitespace character (such as space, tab, or newline) immediately after the second dash in the double-dash (--) comment syntax for it to be recognized as a comment. Without it, -- is not treated as a comment start, allowing expressions like balance--1 to be parsed correctly as subtraction rather than commenting out the rest of the line. This is explicitly stated in official MySQL documentation across versions, including the current MySQL 8.4 and 9.x manuals: - In the Comments section: "In MySQL, the -- (double-dash) comment style requires the second dash to be followed by at least one whitespace or control character, such as a space or tab." - In the Restrictions section: "the -- start-comment sequence is accepted as such, but must be followed by a whitespace character such as a space or newline." The requirement prevents issues in generated SQL, like UPDATE balance=balance--1 being misinterpreted without the space. Examples: - Valid: SELECT 1 -- comment - Invalid as comment: SELECT 1--comment (treated as identifier or operator)
Citations:
- 1: https://dev.mysql.com/doc/mysql-reslimits-excerpt/8.0/en/ansi-diff-comments.html
- 2: https://dev.mysql.com/doc/refman/9.5/en/comments.html
- 3: https://dev.mysql.com/doc/refman/en/comments.html
- 4: https://dev.mysql.com/doc/en/comments.html
🏁 Script executed:
fd -t f "connection.go" --type fRepository: go-sql-driver/mysql
Length of output: 76
🏁 Script executed:
cat -n connection.go | sed -n '295,320p'Repository: go-sql-driver/mysql
Length of output: 905
🏁 Script executed:
cat -n connection.go | head -100 | tail -40Repository: go-sql-driver/mysql
Length of output: 1562
🏁 Script executed:
cat -n connection.go | sed -n '200,250p'Repository: go-sql-driver/mysql
Length of output: 1531
🏁 Script executed:
cat -n connection.go | sed -n '248,350p'Repository: go-sql-driver/mysql
Length of output: 3328
🏁 Script executed:
cat -n connection.go | sed -n '350,380p'Repository: go-sql-driver/mysql
Length of output: 1079
🏁 Script executed:
cat -n connection.go | sed -n '380,420p'Repository: go-sql-driver/mysql
Length of output: 1190
-- comment detection requires whitespace check after second dash to match MySQL/MariaDB behavior.
MySQL/MariaDB only recognize -- as an end-of-line comment when the second dash is followed by whitespace (space, tab, control character), or EOF. Without this check, expressions like SELECT 5--2 FROM t WHERE x = ? are incorrectly parsed: the --2 FROM... is treated as a comment, the ? placeholder is skipped, argPos ends up less than len(args), and the function returns driver.ErrSkip. The query executes via server-side prepare, but you lose the client-side parameter interpolation fast path.
Add a whitespace/control/EOF check after the second dash before transitioning to stateEOLComment.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@connection.go` around lines 305 - 308, The parser treats a second dash as
starting an EOL comment unconditionally; update the branch handling MINUS_BYTE
(where it checks state == stateNormal && lastChar == MINUS_BYTE) to peek the
next byte (or detect EOF) and only transition to stateEOLComment when that
following byte is whitespace/control (space, tab, newline, other control) or
EOF; otherwise leave it as normal SQL (so placeholders like ? are not skipped
and argPos remains correct, avoiding driver.ErrSkip). Locate the MINUS_BYTE case
in the state machine in connection.go and add the lookahead/whitespace-or-EOF
check before setting state = stateEOLComment.
Co-authored-by: Inada Naoki <songofacandy@gmail.com>
There was a problem hiding this comment.
♻️ Duplicate comments (1)
connection.go (1)
293-308:⚠️ Potential issue | 🟡 MinorTighten paired comment delimiter detection.
Using
lastCharacross state transitions can misclassify valid SQL. After closing a block comment,lastCharremains/, soSELECT 1/**/*?is parsed as entering a second block comment at the*operator and the real placeholder is skipped. The same branch still treats every--as a comment start; MySQL/MariaDB require whitespace/control/EOF after the second dash, which was already flagged in an earlier review.Suggested targeted fix
+ isSpaceOrControl := func(c byte) bool { + return c <= ' ' + } + for i := 0; i < lenQuery; i++ { currentChar := query[i] @@ case SLASH_BYTE: if state == stateSlashStarComment && lastChar == STAR_BYTE { state = stateNormal + lastChar = 0 + continue } @@ case MINUS_BYTE: - if state == stateNormal && lastChar == MINUS_BYTE { + if state == stateNormal && lastChar == MINUS_BYTE && + (i+1 == lenQuery || isSpaceOrControl(query[i+1])) { state = stateEOLComment }MySQL documentation double dash comment syntax requires whitespace or control character after second dashAlso applies to: 412-412
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@connection.go` around lines 293 - 308, The parser is misclassifying delimiters because it relies on a stale lastChar across state transitions (e.g., STAR_BYTE/SLASH_BYTE pair and MINUS_BYTE handling), so change the logic to use an explicit previous-byte (or peek) directly adjacent to the current byte rather than a potentially stale lastChar: for the STAR_BYTE/SLASH_BYTE pair ensure you check the immediately preceding byte only when in stateNormal and when entering/exiting stateSlashStarComment reset or update the previous-byte so it cannot spuriously trigger on the next token (e.g., when closing a block comment clear prev/last byte); for MINUS_BYTE handling (the "--" case) only treat it as stateEOLComment when the second dash is followed by whitespace/control/EOF (implement a one-byte lookahead or a state that remembers a single leading dash and validates the next char) and similarly keep HASH_BYTE behavior unchanged except using the explicit prev/peek logic instead of relying on lastChar.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@connection.go`:
- Around line 293-308: The parser is misclassifying delimiters because it relies
on a stale lastChar across state transitions (e.g., STAR_BYTE/SLASH_BYTE pair
and MINUS_BYTE handling), so change the logic to use an explicit previous-byte
(or peek) directly adjacent to the current byte rather than a potentially stale
lastChar: for the STAR_BYTE/SLASH_BYTE pair ensure you check the immediately
preceding byte only when in stateNormal and when entering/exiting
stateSlashStarComment reset or update the previous-byte so it cannot spuriously
trigger on the next token (e.g., when closing a block comment clear prev/last
byte); for MINUS_BYTE handling (the "--" case) only treat it as stateEOLComment
when the second dash is followed by whitespace/control/EOF (implement a one-byte
lookahead or a state that remembers a single leading dash and validates the next
char) and similarly keep HASH_BYTE behavior unchanged except using the explicit
prev/peek logic instead of relying on lastChar.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 2c737be8-d927-4233-8aa3-b09fc3edd6f9
📒 Files selected for processing (2)
connection.goconnection_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- connection_test.go
| var ( | ||
| QUOTE_BYTE = byte('\'') | ||
| DBL_QUOTE_BYTE = byte('"') | ||
| BACKSLASH_BYTE = byte('\\') | ||
| QUESTION_MARK_BYTE = byte('?') | ||
| SLASH_BYTE = byte('/') | ||
| STAR_BYTE = byte('*') | ||
| HASH_BYTE = byte('#') | ||
| MINUS_BYTE = byte('-') | ||
| LINE_FEED_BYTE = byte('\n') | ||
| BACKTICK_BYTE = byte('`') | ||
| ) |
methane
left a comment
There was a problem hiding this comment.
please fix 2--1 and /* ... */* ... issues.
Enhance client side statement to correctly handle placeholders in queries with comments, strings, and backticks.
Add findParamPositions to identify real parameter positions
Update and expand related tests.