From 7ad0557e5109f341f190cdb990e12cbb45f84179 Mon Sep 17 00:00:00 2001 From: Claude Date: Fri, 10 Apr 2026 12:00:59 +0000 Subject: [PATCH] Add man pages for strip-markup and sanitize-string https://claude.ai/code/session_01Wjn2KfitiA5iTADcLLfdbR --- man/sanitize-string.1.ronn | 82 ++++++++++++++++++++++++++++++++++++++ man/strip-markup.1.ronn | 76 +++++++++++++++++++++++++++++++++++ 2 files changed, 158 insertions(+) create mode 100644 man/sanitize-string.1.ronn create mode 100644 man/strip-markup.1.ronn diff --git a/man/sanitize-string.1.ronn b/man/sanitize-string.1.ronn new file mode 100644 index 00000000..b6ca9bf4 --- /dev/null +++ b/man/sanitize-string.1.ronn @@ -0,0 +1,82 @@ +sanitize-string(1) -- Strip markup and control characters from a string +======================================================================== + + + +## SYNOPSIS + +`sanitize-string [--help] max_length [string]` + +## DESCRIPTION + +`sanitize-string` combines the functionality of `strip-markup`(1) and +`stdisplay`(1) to fully sanitize an untrusted string by removing both +HTML markup tags and dangerous terminal control characters (such as ANSI +escape sequences). The result can be safely displayed in a terminal or +used in non-HTML text contexts. + +If a string is provided as the second positional argument, it is used +as the input. Otherwise, the string is read from standard input. + +The `max_length` argument specifies the maximum number of characters to +output. Set it to `nolimit` to allow arbitrarily long strings. When a +limit is set, the output is truncated to that many characters. + +### Sanitization order + +Sanitization is performed in three steps: + +1. Strip ANSI escape sequences and control characters (via `stdisplay`). +2. Strip HTML markup tags (via `strip-markup`). +3. Strip ANSI escape sequences and control characters again, in case + the markup stripping step decoded HTML entities into escape + characters. 
+ +This ordering ensures that neither markup nor escape sequences can be +used to smuggle the other past the sanitizer. + +## RETURN VALUES + +* `0` Successfully sanitized and printed the result. +* `1` Usage error (missing or invalid arguments). + +## EXAMPLES + +Sanitize a string with no length limit: + + +sanitize-string nolimit '<b>Hello</b>' + + +Output: `Hello` + +Sanitize and truncate to 10 characters: + + +sanitize-string 10 'This is a long untrusted string.' + + +Output: `This is a ` + +Sanitize from standard input: + + +echo '<script>alert(1)</script>' | sanitize-string nolimit + + +Use `--` to separate options from positional arguments: + + +sanitize-string -- nolimit '--help' + + +## SEE ALSO + +strip-markup(1), stdisplay(1) + +## AUTHOR + +This man page has been written by Patrick Schleizer (adrelanos@whonix.org). diff --git a/man/strip-markup.1.ronn b/man/strip-markup.1.ronn new file mode 100644 index 00000000..85e240fc --- /dev/null +++ b/man/strip-markup.1.ronn @@ -0,0 +1,76 @@ +strip-markup(1) -- Strip HTML markup tags from a string +======================================================== + + + +## SYNOPSIS + +`strip-markup [--help] [string]` + +## DESCRIPTION + +`strip-markup` strips HTML markup tags from an untrusted string, +returning only the text content. It is intended to ensure that a string +will not be interpreted as HTML markup in isolation. + +If a string is provided as an argument, it is used as the input. +Otherwise, the string is read from standard input. + +HTML character references (such as `&amp;`, `&lt;`, `&#60;`) are +decoded to their corresponding characters. + +### Double-strip protection + +`strip-markup` performs two consecutive strip passes over the input. If +the second pass further transforms the text, this indicates that the +first pass revealed new markup that was hidden inside nested tags (for +example, `<b>Bold</b>`). 
In this case, the tool treats the +input as malicious and replaces all `<`, `>`, and `&` characters in the +first-pass result with underscores (`_`), so that the neutered text is +visible to the user as a warning. + +### Scope + +`strip-markup` ensures that its output does not contain HTML tags. It +does **not** escape the output for safe embedding in HTML attributes or +other HTML contexts. If the output will be inserted into HTML, the +caller is responsible for applying appropriate context-specific +escaping. + +## RETURN VALUES + +* `0` Successfully stripped markup and printed the result. +* `1` Usage error. + +## EXAMPLES + +Strip tags from a string argument: + + +strip-markup '<p>Hello <b>world</b>.</p>' + + +Output: `Hello world.` + +Strip tags from standard input: + + +echo '<p>Hello</p>' | strip-markup + + +Use `--` to pass strings that start with `-`: + + +strip-markup -- '--help' + + +## SEE ALSO + +sanitize-string(1), stdisplay(1) + +## AUTHOR + +This man page has been written by Patrick Schleizer (adrelanos@whonix.org).
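
The double-strip protection described in the `strip-markup` page can be illustrated with a minimal Python sketch. It is not taken from the patch and may differ from the tool's actual implementation: the names `TextOnly`, `strip_once`, and `strip_markup_sketch` are hypothetical, and the use of Python's standard `html.parser` module is an assumption made for illustration only.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Hypothetical helper: keeps text content and drops all tags."""

    def __init__(self):
        # convert_charrefs=True decodes character references such as
        # &lt; into "<" before the text reaches handle_data().
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_once(text):
    """Run a single strip pass and return the remaining text."""
    parser = TextOnly()
    parser.feed(text)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


def strip_markup_sketch(text):
    """Two strip passes; neuter the first-pass result if they disagree."""
    first = strip_once(text)
    second = strip_once(first)
    if second != first:
        # The first pass revealed new markup, so treat the input as
        # malicious and replace <, > and & with underscores.
        return first.translate(str.maketrans("<>&", "___"))
    return first


if __name__ == "__main__":
    print(strip_markup_sketch("<p>Hello <b>world</b>.</p>"))
    print(strip_markup_sketch("&lt;b&gt;Bold&lt;/b&gt;"))
```

In this sketch, input whose character references decode into a tag changes again on the second pass, so the first-pass text is returned with `<`, `>`, and `&` replaced by underscores instead of being passed through silently.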