Skip to content

daemon/rule: stop mutating regex Data field on Compile (#1587)#1606

Open
tredondo wants to merge 1 commit into
evilsocket:masterfrom
tredondo:fix/regexp-data-mutation-case-insensitive
Open

daemon/rule: stop mutating regex Data field on Compile (#1587)#1606
tredondo wants to merge 1 commit into
evilsocket:masterfrom
tredondo:fix/regexp-data-mutation-case-insensitive

Conversation

@tredondo
Copy link
Copy Markdown
Contributor

Fixes #1587.

Compile() for Regexp operators with Sensitive==false used to do

o.Data = strings.ToLower(o.Data)

before compiling the pattern. That works as an ASCII-only way to make matching case-insensitive (pattern and subject both lowercased), but it mangles the pattern string itself. A regex like [0-9A-Za-z]+ becomes [0-9a-za-z]+: still valid, but visibly wrong, and the mangled value then bleeds into:

  • the rule JSON written to disk by loader.Save (json.MarshalIndent serializes the post-Compile o.Data)
  • the GUI display of the operator data and the slugified rule name

This is most visible on the auto-generated rules for AppImages, whose process.path lives under /tmp/.mount__/, producing the [0-9a-za-z]+ duplication the user can see in the popup and on disk.

Switch to Go regexp's native case-insensitivity instead: prepend the (?i) flag to a local pattern string and compile that. o.Data is no longer touched by Compile, so the original pattern survives both serialization and the round-trip back to the UI. reCmp can drop its matching strings.ToLower(data) call, since the compiled regex now handles case folding itself.

Existing rules on disk that already have a lowercased pattern keep working: (?i) over a [0-9a-z]+-style class is equivalent to the previous lowercase-both-sides behavior, and case-sensitive rules are unaffected (no (?i) is added).

A regression test (TestNewOperatorRegexpDataNotMutated) pins both the pattern-preservation contract and case-insensitive matching against an appimage-style path.

Compile() for Regexp operators with Sensitive==false used to do

    o.Data = strings.ToLower(o.Data)

before compiling the pattern. That works as an ASCII-only way to make
matching case-insensitive (pattern and subject both lowercased), but it
mangles the pattern string itself. A regex like [0-9A-Za-z]+ becomes
[0-9a-za-z]+: still a valid character class that matches the same set
of characters, but visibly wrong, and the mangled value then bleeds
into:

  - the rule JSON written to disk by loader.Save (json.MarshalIndent
    serializes the post-Compile o.Data)
  - the GUI display of the operator data and the slugified rule name

This is most visible on the auto-generated rules for AppImages, whose
process.path lives under /tmp/.mount_<name>_<random>/, producing the
[0-9a-za-z]+ duplication the user can see in the popup and on disk.

Switch to Go regexp's native case-insensitivity instead: prepend the
(?i) flag to a local pattern string and compile that. o.Data is no
longer touched by Compile, so the original pattern survives both
serialization and the round-trip back to the UI. reCmp can drop its
matching strings.ToLower(data) call, since the compiled regex now
handles case folding itself.

Existing rules on disk that already have a lowercased pattern keep
working: (?i) over a [0-9a-z]+-style class is equivalent to the
previous lowercase-both-sides behavior, and case-sensitive rules are
unaffected (no (?i) is added).

A regression test (TestNewOperatorRegexpDataNotMutated) pins both the
pattern-preservation contract and case-insensitive matching against an
appimage-style path.

Fixes evilsocket#1587
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug Report] Auto-generated regexps for appimage /tmp paths duplicate the a-z character class

1 participant