Homebrew Tap for justhtml

This tap provides the justhtml CLI via Homebrew.

justhtml is an HTML5 parser CLI with CSS selectors and full html5lib compliance.

Install

Homebrew 6.0.0 (June 2026) requires third-party taps to be explicitly trusted before their formulae can be loaded. Trust this tap, then install:

brew trust diffen/justhtml
brew install diffen/justhtml/justhtml

Trusting the tap means you accept that its code runs with your user's privileges. This tap contains a single MIT-licensed formula (Formula/justhtml.rb) that installs the MIT-licensed justhtml-php library — you can review both before trusting.

If you prefer to trust only this formula rather than the whole tap:

brew trust --formula diffen/justhtml/justhtml
brew install diffen/justhtml/justhtml

Verify

justhtml --version

CLI Documentation

The section below is synced from diffen/justhtml-php/CLI.md. Commands are rewritten to use justhtml for Homebrew.

CLI

The justhtml CLI parses HTML, optionally selects nodes with a CSS selector, and outputs HTML, text, or Markdown. It accepts either a file path or - for stdin.

Run it:

With this Homebrew install, the command is justhtml.

Sample input used below

Create a small input file:

cat > sample.html <<'HTML'
<!doctype html>
<html>
  <body>
    <article id="post">
      <h1>Title</h1>
      <p class="lead">Hello <em>world</em>!</p>
      <p>Second <span>para</span>.</p>
    </article>
  </body>
</html>
HTML

Create a whitespace-focused file:

cat > whitespace.html <<'HTML'
<!doctype html>
<html><body>
  <p class="sep">Alpha<span>Beta</span>Gamma</p>
  <p class="ws">  Hello <span> world </span> ! </p>
</body></html>
HTML

--selector

Select matching nodes (single selector):

justhtml sample.html --selector "p.lead" --format text

Output:

Hello world!

Select multiple selectors with a comma-separated list:

justhtml sample.html --selector "h1, p.lead" --format text

Output:

Title
Hello world!

--format

Choose output format: html, text, or markdown.

HTML output:

justhtml sample.html --selector "p.lead" --format html

Output:

<p class="lead">
  Hello
  <em>world</em>
  !
</p>

Text output:

justhtml sample.html --selector "p.lead" --format text

Output:

Hello world!

Markdown output:

justhtml sample.html --selector "p.lead" --format markdown

Output:

Hello *world*!

--outer / --inner

HTML output uses outer HTML by default. Use --inner to print only the matched node's children (inner HTML). --outer is a no-op that makes the default explicit. These flags only affect --format html.

justhtml sample.html --selector "p.lead" --format html --inner

Output:

Hello
<em>world</em>
!

--attr / --missing

Extract attribute values from matched nodes. Repeat --attr to output multiple attributes per node (tab-separated by default). Missing attributes are replaced with __MISSING__ by default; override with --missing.

justhtml sample.html --selector "p" --attr class --attr id

Output (tab-separated):

lead	__MISSING__
__MISSING__	__MISSING__

Use --separator to change the field separator:

justhtml sample.html --selector "p" --attr class --attr id --separator ","

--attr cannot be combined with --format, --inner, --outer, or --count.
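Because --attr prints one line per node with tab-separated fields, the output can be consumed by a plain shell loop. The following is a minimal sketch, not part of justhtml: label_attrs is a hypothetical helper name, and it assumes exactly two attributes (class, then id) and the default __MISSING__ marker, so that no field is ever empty.

```shell
# Hypothetical helper (not part of justhtml): reads two tab-separated
# fields per line (class, id) from stdin and prints a labelled line.
# The default __MISSING__ marker is shown as "(none)".
label_attrs() {
  local class id
  # Split on tabs only; fields are assumed to be non-empty.
  while IFS="$(printf '\t')" read -r class id; do
    [ "$class" = "__MISSING__" ] && class="(none)"
    [ "$id" = "__MISSING__" ] && id="(none)"
    printf 'class=%s id=%s\n' "$class" "$id"
  done
}

# Usage (assumes justhtml is installed and sample.html exists):
# justhtml sample.html --selector "p" --attr class --attr id | label_attrs
```

If you override the marker with --missing, the comparison strings in the helper would need to change to match.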

--first

Limit to the first match:

justhtml sample.html --selector "p" --format text

Output:

Hello world!
Second para.

justhtml sample.html --selector "p" --format text --first

Output:

Hello world!

--first is equivalent to --limit 1 and cannot be combined with --limit.

--limit

Limit to the first N matches. This is equivalent to --first when N is 1.

justhtml sample.html --selector "p" --format text --limit 2

Output:

Hello world!
Second para.

--count

Print the number of matching nodes:

justhtml sample.html --selector "p" --count

Output:

2

--count cannot be combined with --first, --limit, --format, or --attr.
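Since --count prints a number, it can drive a shell conditional. The sketch below assumes the output is a single non-negative integer on one line; has_matches is a hypothetical helper name, not something justhtml provides.

```shell
# Hypothetical helper (not part of justhtml): reads a count from stdin.
# Returns 0 when the count is greater than zero, 1 when it is zero,
# and 2 when the input is not a plain non-negative integer.
has_matches() {
  local count
  read -r count || true
  case "$count" in
    ''|*[!0-9]*) return 2 ;;
  esac
  [ "$count" -gt 0 ]
}

# Usage (assumes justhtml is installed and sample.html exists):
# justhtml sample.html --selector "p" --count | has_matches && echo "found paragraphs"
```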

--separator

Join text nodes with a custom separator (text output only). As the first example below shows, the default in text output is a single space. In --attr mode, this controls the field separator (default: tab).

justhtml whitespace.html --selector ".sep" --format text

Output:

Alpha Beta Gamma

justhtml whitespace.html --selector ".sep" --format text --separator ""

Output:

AlphaBetaGamma

--strip / --no-strip

By default, each text node is trimmed and empty nodes are dropped (--strip). Use --no-strip to preserve the original whitespace within text nodes.

Default (strip on):

justhtml whitespace.html --selector ".ws" --format text

Output:

Hello world !

Preserve whitespace:

justhtml whitespace.html --selector ".ws" --format text --no-strip

Output (spaces shown between | markers):

|  Hello   world   ! |
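The | markers above appear to be added for display only, to make leading and trailing spaces visible. Assuming that reading is right, one way to get the same view in your own terminal is to wrap each output line with sed; show_ws is a hypothetical helper name.

```shell
# Hypothetical helper (not part of justhtml): wraps every input line
# in | markers so leading and trailing whitespace becomes visible.
show_ws() {
  sed 's/^/|/; s/$/|/'
}

# Usage (assumes justhtml is installed and whitespace.html exists):
# justhtml whitespace.html --selector ".ws" --format text --no-strip | show_ws
```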

Stdin

Read from stdin by passing - as the path:

cat sample.html | justhtml - --selector "p.lead" --format text

Output:

Hello world!

Piping examples (real pages)

These examples use a live page and pipe HTML into justhtml.

# Extract the first non-empty paragraph as text
curl -s https://en.wikipedia.org/wiki/Earth | \
  justhtml - --selector "#mw-content-text p:not(:empty)" --format text --first

# Extract links from the lead section (first 10 hrefs)
curl -s https://en.wikipedia.org/wiki/Earth | \
  justhtml - --selector "#mw-content-text p a" --attr href --limit 10 --separator "\n"

# Get the lead section as Markdown
curl -s https://en.wikipedia.org/wiki/Earth | \
  justhtml - --selector "#mw-content-text" --format markdown --first

# Count images on the page
curl -s https://en.wikipedia.org/wiki/Earth | \
  justhtml - --selector "img" --count

# Output the infobox as HTML (outer HTML)
curl -s https://en.wikipedia.org/wiki/Earth | \
  justhtml - --selector "table.infobox" --format html --outer --first

# Preserve whitespace and separate paragraphs
curl -s https://en.wikipedia.org/wiki/Earth | \
  justhtml - --selector "#mw-content-text p" --format text --no-strip --separator "\n\n" --limit 3

# Build a quick table of contents from headings
curl -s https://en.wikipedia.org/wiki/Earth | \
  justhtml - --selector "#mw-content-text h2, #mw-content-text h3" --format text --separator "\n"

--version and --help

justhtml --version

Output:

justhtml dev

justhtml --help

Output: prints the full usage/help text.

Upgrading

brew upgrade justhtml

Uninstall

brew uninstall justhtml

If you installed via the tap and want to remove it:

brew untap diffen/justhtml
brew untrust diffen/justhtml

Troubleshooting

“Refusing to load formula diffen/justhtml/justhtml from untrusted tap diffen/justhtml”

Homebrew 6.0.0 requires third-party taps to be explicitly trusted. If you tapped diffen/justhtml before upgrading to Homebrew 6, any brew command that touches this tap (brew info justhtml, brew upgrade, etc.) will fail with this error until you trust it. Run one of:

# Trust the whole tap (covers future formula updates):
brew trust diffen/justhtml

# Or trust only this formula:
brew trust --formula diffen/justhtml/justhtml

This is a one-time step. You can review what you've trusted with brew trust (no arguments) and revoke with brew untrust diffen/justhtml. See the Homebrew Tap Trust documentation for details.

If you install via a Brewfile, declare the trust there:

tap "diffen/justhtml", trusted: true
brew "diffen/justhtml/justhtml"

“justhtml: command not found”

Make sure your Homebrew prefix is on PATH:

brew --prefix

Then ensure $(brew --prefix)/bin is on your PATH.
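The check can be scripted. This is a minimal sketch with a hypothetical helper, on_path, that tests whether a directory appears in a colon-separated list. It only compares strings, so it does not resolve symlinks or trailing slashes.

```shell
# Hypothetical helper: succeeds when directory $1 appears as an exact
# entry in the colon-separated list $2 (for example "$PATH").
on_path() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (assumes brew is installed):
# on_path "$(brew --prefix)/bin" "$PATH" || export PATH="$(brew --prefix)/bin:$PATH"
```

To make the change permanent, the export line would go in your shell's startup file.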

Xdebug warning on justhtml --version

If you see an Xdebug warning from your PHP configuration, you can disable it for a single run:

XDEBUG_MODE=off justhtml --version

Formula

The formula lives at:

Formula/justhtml.rb

License

MIT
