This tap provides the justhtml CLI via Homebrew.
justhtml is an HTML5 parser CLI with CSS selectors and full html5lib compliance.
Homebrew 6.0.0 (June 2026) requires third-party taps to be explicitly trusted before their formulae can be loaded. Trust this tap, then install:
brew trust diffen/justhtml
brew install diffen/justhtml/justhtml
Trusting the tap means you accept that its code runs with your user's privileges. This tap contains a single MIT-licensed formula (Formula/justhtml.rb) that installs the MIT-licensed justhtml-php library; you can review both before trusting.
If you prefer to trust only this formula rather than the whole tap:
brew trust --formula diffen/justhtml/justhtml
brew install diffen/justhtml/justhtml
justhtml --version
The section below is synced from diffen/justhtml-php/CLI.md.
Commands are rewritten to use justhtml for Homebrew.
The justhtml CLI parses HTML, optionally selects nodes with a CSS selector, and outputs HTML, text, or Markdown.
It accepts either a file path or - for stdin.
Run it:
- From this repo:
justhtml
- From a Composer install:
justhtml
Create a small input file:
cat > sample.html <<'HTML'
<!doctype html>
<html>
<body>
<article id="post">
<h1>Title</h1>
<p class="lead">Hello <em>world</em>!</p>
<p>Second <span>para</span>.</p>
</article>
</body>
</html>
HTML
Create a whitespace-focused file:
cat > whitespace.html <<'HTML'
<!doctype html>
<html><body>
<p class="sep">Alpha<span>Beta</span>Gamma</p>
<p class="ws"> Hello <span> world </span> ! </p>
</body></html>
HTML
Select matching nodes (single selector):
justhtml sample.html --selector "p.lead" --format text
Output:
Hello world!
Select multiple selectors with a comma-separated list:
justhtml sample.html --selector "h1, p.lead" --format text
Output:
Title
Hello world!
Choose output format: html, text, or markdown.
HTML output:
justhtml sample.html --selector "p.lead" --format html
Output:
<p class="lead">
Hello
<em>world</em>
!
</p>
Text output:
justhtml sample.html --selector "p.lead" --format text
Output:
Hello world!
Markdown output:
justhtml sample.html --selector "p.lead" --format markdown
Output:
Hello *world*!
HTML output uses outer HTML by default. Use --inner to print only the
matched node's children (inner HTML). --outer is a no-op that makes the
default explicit. These flags only affect --format html.
justhtml sample.html --selector "p.lead" --format html --inner
Output:
Hello
<em>world</em>
!
Extract attribute values from matched nodes. Repeat --attr to output multiple
attributes per node (tab-separated by default). Missing attributes are replaced
with __MISSING__ by default; override with --missing.
justhtml sample.html --selector "p" --attr class --attr id
Output (tab-separated):
lead __MISSING__
__MISSING__ __MISSING__
Use --separator to change the field separator:
justhtml sample.html --selector "p" --attr class --attr id --separator ","
--attr cannot be combined with --format, --inner, --outer, or --count.
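Because --attr output is line-oriented, with one row per matched node, it composes with standard text tools. The sketch below is one way to drop rows whose first attribute is absent. The helper names `present_first_field` and `sample_rows` are hypothetical; the code assumes the default tab separator and the default `__MISSING__` placeholder, and it feeds in the sample output shown above via printf so that it runs without justhtml installed.

```shell
# present_first_field: read tab-separated --attr rows on stdin and print the
# first field of every row where that field is not the default placeholder.
# (Hypothetical helper; assumes the default separator and --missing value.)
present_first_field() {
  awk -F '\t' '$1 != "__MISSING__" { print $1 }'
}

# Stand-in for: justhtml sample.html --selector "p" --attr class --attr id
sample_rows() {
  printf 'lead\t__MISSING__\n__MISSING__\t__MISSING__\n'
}

with_class=$(sample_rows | present_first_field)
echo "$with_class"   # prints: lead
```

With justhtml installed, the stand-in can be replaced by piping the real command into `present_first_field`.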
Limit to the first match:
justhtml sample.html --selector "p" --format text
Output:
Hello world!
Second para.
justhtml sample.html --selector "p" --format text --first
Output:
Hello world!
--first is equivalent to --limit 1 and cannot be combined with --limit.
Use --limit N to keep only the first N matches. --limit 1 is equivalent to --first.
justhtml sample.html --selector "p" --format text --limit 2
Output:
Hello world!
Second para.
Print the number of matching nodes:
justhtml sample.html --selector "p" --count
Output:
2
--count cannot be combined with --first, --limit, --format, or --attr.
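Since --count prints a bare integer, it can drive a shell conditional. The following is a minimal sketch, assuming justhtml is on PATH and exits non-zero when it cannot read its input (that exit-status behavior is an assumption, not something stated above). `has_matches` is a hypothetical helper name.

```shell
# has_matches FILE SELECTOR
# Exit status 0 when at least one node matches, 1 when none match,
# 2 when justhtml itself reports failure (assumed non-zero exit).
has_matches() {
  _n=$(justhtml "$1" --selector "$2" --count) || return 2
  [ "$_n" -gt 0 ]
}

# Usage:
#   if has_matches sample.html "p.lead"; then
#     echo "lead paragraph present"
#   fi
```

Returning a distinct status for tool failure keeps "no matches" and "could not parse" from being conflated in scripts.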
Use --separator to join text nodes with a custom separator (text output only). In --attr mode,
it controls the field separator instead (default: tab).
justhtml whitespace.html --selector ".sep" --format text
Output:
Alpha Beta Gamma
justhtml whitespace.html --selector ".sep" --format text --separator ""
Output:
AlphaBetaGamma
By default, each text node is trimmed and empty nodes are dropped (--strip).
Use --no-strip to preserve the original whitespace within text nodes.
Default (strip on):
justhtml whitespace.html --selector ".ws" --format text
Output:
Hello world !
Preserve whitespace:
justhtml whitespace.html --selector ".ws" --format text --no-strip
Output (spaces shown between | markers):
| Hello world ! |
Read from stdin by passing - as the path:
cat sample.html | justhtml - --selector "p.lead" --format text
Output:
Hello world!
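When the HTML comes from a command rather than a file, a failed producer would otherwise hand justhtml empty input. The sketch below wraps the stdin form in a hypothetical helper, `page_text`, that checks curl's exit status before parsing. It assumes curl is available; -f, -sS, and -L are standard curl options (fail on HTTP errors, stay quiet but show errors, follow redirects).

```shell
# page_text URL SELECTOR
# Fetch URL, then print the text of the first node matching SELECTOR.
# Returns 1 without invoking justhtml when the download fails.
page_text() {
  _html=$(curl -fsSL "$1") || {
    echo "page_text: could not fetch $1" >&2
    return 1
  }
  printf '%s\n' "$_html" | justhtml - --selector "$2" --format text --first
}

# Usage:
#   page_text https://en.wikipedia.org/wiki/Earth "#mw-content-text p:not(:empty)"
```

Buffering the page in a variable costs some memory, but it lets the function stop before parsing when the fetch fails.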
These examples use a live page and pipe HTML into justhtml.
# Extract the first non-empty paragraph as text
curl -s https://en.wikipedia.org/wiki/Earth | \
justhtml - --selector "#mw-content-text p:not(:empty)" --format text --first
# Extract links from the lead section (first 10 hrefs)
curl -s https://en.wikipedia.org/wiki/Earth | \
justhtml - --selector "#mw-content-text p a" --attr href --limit 10 --separator "\n"
# Get the lead section as Markdown
curl -s https://en.wikipedia.org/wiki/Earth | \
justhtml - --selector "#mw-content-text" --format markdown --first
# Count images on the page
curl -s https://en.wikipedia.org/wiki/Earth | \
justhtml - --selector "img" --count
# Output the infobox as HTML (outer HTML)
curl -s https://en.wikipedia.org/wiki/Earth | \
justhtml - --selector "table.infobox" --format html --outer --first
# Preserve whitespace and separate paragraphs
curl -s https://en.wikipedia.org/wiki/Earth | \
justhtml - --selector "#mw-content-text p" --format text --no-strip --separator "\n\n" --limit 3
# Build a quick table of contents from headings
curl -s https://en.wikipedia.org/wiki/Earth | \
justhtml - --selector "#mw-content-text h2, #mw-content-text h3" --format text --separator "\n"
justhtml --version
Output:
justhtml dev
justhtml --help
Output: prints the full usage/help text.
brew upgrade justhtml
brew uninstall justhtml
If you installed via the tap and want to remove it:
brew untap diffen/justhtml
brew untrust diffen/justhtml
Homebrew 6.0.0 requires third-party taps to be explicitly trusted. If you
tapped diffen/justhtml before upgrading to Homebrew 6, any brew
command that touches this tap (brew info justhtml, brew upgrade, etc.)
will fail with a trust error until you trust it. Run one of:
# Trust the whole tap (covers future formula updates):
brew trust diffen/justhtml
# Or trust only this formula:
brew trust --formula diffen/justhtml/justhtml
This is a one-time step. You can review what you've trusted with
brew trust (no arguments) and revoke with brew untrust diffen/justhtml.
See the Homebrew Tap Trust documentation
for details.
If you install via a Brewfile, declare the trust there:
tap "diffen/justhtml", trusted: true
brew "diffen/justhtml/justhtml"
Make sure your Homebrew prefix is on PATH:
brew --prefix
Then ensure $(brew --prefix)/bin is on your PATH.
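One way to perform that check from a script is to test PATH membership directly. The sketch below uses a hypothetical helper named `on_path`; it relies only on POSIX shell pattern matching and makes no assumptions about where Homebrew is installed.

```shell
# on_path DIR
# Succeeds when DIR is one of the colon-separated entries in PATH.
# Wrapping both sides in colons makes the first and last entries match too.
on_path() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (requires Homebrew):
#   on_path "$(brew --prefix)/bin" || export PATH="$(brew --prefix)/bin:$PATH"
```

To make the change permanent, the export line would go in your shell's startup file.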
If you see an Xdebug warning from your PHP configuration, you can disable it for a single run:
XDEBUG_MODE=off justhtml --version
The formula lives at:
Formula/justhtml.rb
MIT