rewrite-xml: support HTML void elements in JSP/HTML parsing #7906
Open
knutwannheden wants to merge 1 commit into
JSP and HTML sources are parsed by the XML grammar, whose `element` rule only accepted fully-closed (`<a>…</a>`) and self-closing (`<a/>`) tags. An HTML void element written without a slash (e.g. `<br>`) was parsed as the start of a normal element; ANTLR error recovery then mangled the tree, the reprint no longer matched the input, and the whole file was downgraded to a ParseError.

Add HTML void-element support, enabled only for HTML-like sources (.jsp/.jspx/.html/.htm) so strict XML parsing is unaffected:

- the grammar gains a void-element alternative gated by a semantic predicate, plus an empty `voidClose` marker rule to detect it
- htmlMode and isVoidElement live in a hand-written XMLParserBase wired via the grammar's superClass option, keeping the .g4 free of target-specific members so the C# generation is not broken
- void tags carry an HtmlVoidElement marker so the printer emits a bare `>` instead of `/>`
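The gating described above can be sketched in plain Java. This is a minimal illustration, not the code from this PR: the class name `VoidElementGate` and the helper `isHtmlLike` are hypothetical, the element list is the set of void elements from the HTML Living Standard, and the case-insensitive name match is an assumption about how the real `isVoidElement` behaves.

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical stand-in for the hand-written parser base class; names are illustrative. */
final class VoidElementGate {
    // Void elements as listed by the HTML Living Standard.
    private static final Set<String> VOID_ELEMENTS = Set.of(
            "area", "base", "br", "col", "embed", "hr", "img",
            "input", "link", "meta", "source", "track", "wbr");

    // When false (strict XML), no element is ever treated as void.
    private final boolean htmlMode;

    VoidElementGate(boolean htmlMode) {
        this.htmlMode = htmlMode;
    }

    /** Assumed helper: decide HTML mode from the source file extension. */
    static boolean isHtmlLike(String path) {
        String p = path.toLowerCase(Locale.ROOT);
        return p.endsWith(".jsp") || p.endsWith(".jspx")
                || p.endsWith(".html") || p.endsWith(".htm");
    }

    /** The check a semantic predicate in the grammar would delegate to. */
    boolean isVoidElement(String name) {
        // Case-insensitive matching is an assumption, not confirmed by the PR text.
        return htmlMode && name != null
                && VOID_ELEMENTS.contains(name.toLowerCase(Locale.ROOT));
    }
}
```

Keeping the check in a base class rather than in the grammar file is what lets the `.g4` stay free of target-specific members; with `htmlMode` off, the predicate always fails and the void alternative is never taken.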
Motivation
JSP and HTML files are parsed by the `rewrite-xml` XML grammar (the JSP extensions and the `.jsp` extension live there). The grammar's `element` rule only knew two shapes, `<a>…</a>` and `<a/>`, with no notion of an HTML void element. HTML5 allows void elements such as `<br>`, `<img>`, `<input>` and `<meta>` to be written without a trailing slash.

When the parser hit `<br>`, it treated it as the start of a normal element expecting `</br>`. ANTLR error recovery then mangled the tree (an entire `<html>…</html>` block collapsed to `<html/>` on reprint), the reprint no longer matched the input, and `Parser.requirePrintEqualsInput` downgraded the whole file to a `ParseError`. The original text was preserved, but the file became opaque to recipes.

This surfaced on real, valid Spring Boot JSP smoke tests (`welcome.jsp`), whose only "offense" was an unclosed `<br>`.

Examples
Previously, a `.jsp` with an unclosed `<br>` failed to parse (it became a `ParseError`); it now round-trips as a proper `Xml.Document`. Void elements with attributes are supported too.

Strict XML is unchanged: an element that merely shares a name with a void element still parses as a container (`<link>https://example.com</link>`), and an unclosed `<br>` in a plain `.xml` file remains a `ParseError`.

Summary
- `XMLParser.g4`: added a third `element` alternative for void elements, gated by an `isVoidElement($name.text)` semantic predicate, plus an empty `voidClose` marker rule so the choice is detectable in the parse tree.
- Added a hand-written `XMLParserBase` and wired it via the grammar's `superClass` option. It holds the `htmlMode` flag and `isVoidElement(...)`, keeping the `.g4` free of target-specific (Java) members so the C# generation in `rewrite-csharp` is not broken. (When the C# sources are next regenerated they will need a matching `XMLParserBase` in the `OpenRewrite.Xml.Grammar` namespace; this is noted in a grammar comment.)
- `XmlParser`: enables `htmlMode` only for `.jsp`/`.jspx`/`.html`/`.htm` sources.
- `XmlParserVisitor`: maps the void shape to a `Tag` with `null` content/closing and attaches an `HtmlVoidElement` marker.
- `HtmlVoidElement` marker + `XmlPrinter`: a marked tag prints a bare `>` instead of `/>`. A marker (rather than a model-field change) keeps the LST shape and serialization unchanged.

Test plan
- `XmlParserTest` cases: `<br>` in a `.jsp`; void elements with attributes (`<meta>`/`<link>`/`<img>`/`<input>`/`<hr>`); the full Spring Boot `welcome.jsp` round-trip.
- Container elements that share a name with a void element (`<link>…</link>`, `<source>…</source>`) still parse in XML mode; an unclosed `<br>` in `.xml` remains a `ParseError` (void leniency is HTML-only).
- Existing JSP tests (`jsp`, `jspScriptlet`, `mixedJspElements`, …) still pass.
- `./gradlew :rewrite-xml:check` is green (tests + license).
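As a rough illustration of the printing rule from the Summary, the sketch below shows how a marker can switch the closing token of a content-less tag between `>` and `/>`. It does not use the OpenRewrite API: `EmptyTag`, its `htmlVoid` flag (standing in for the `HtmlVoidElement` marker), and `print` are all hypothetical, simplified names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class VoidTagPrinting {
    /** Simplified, hypothetical tag model: a tag with no content and no closing tag. */
    static final class EmptyTag {
        final String name;
        final Map<String, String> attributes;
        final boolean htmlVoid; // stands in for the HtmlVoidElement marker

        EmptyTag(String name, Map<String, String> attributes, boolean htmlVoid) {
            this.name = name;
            // Copy into a LinkedHashMap so attributes print in insertion order.
            this.attributes = new LinkedHashMap<>(attributes);
            this.htmlVoid = htmlVoid;
        }
    }

    /** Prints a bare '>' for marked tags and the usual '/>' otherwise. */
    static String print(EmptyTag tag) {
        StringBuilder sb = new StringBuilder("<").append(tag.name);
        for (Map.Entry<String, String> attr : tag.attributes.entrySet()) {
            sb.append(' ').append(attr.getKey())
              .append("=\"").append(attr.getValue()).append('"');
        }
        return sb.append(tag.htmlVoid ? ">" : "/>").toString();
    }
}
```

Because the distinction lives in a flag beside the tag rather than in the tag's shape, an unmarked tag prints exactly as before, which mirrors the stated reason for using a marker instead of a model-field change.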