|
288 | 288 | * double-quoted strings, meaning that attributes on input with single-quoted or |
289 | 289 | * unquoted values will appear in the output with double-quotes. |
290 | 290 | * |
291 | | - * scripts aren't processed |
| 291 | + * ### Scripting Flag |
| 292 | + * |
| 293 | + * The Tag Processor parses HTML with the "scripting flag" disabled. This means |
| 294 | + * that it doesn't run any scripts while parsing the page. In a browser with |
| 295 | + * JavaScript enabled, for example, a script can change the parse of the |
| 296 | + * document as it loads. On the server, however, evaluating JavaScript is not |
| 297 | + * only impractical, but also unwanted. |
| 298 | + * |
| 299 | + * In practice this means that the Tag Processor will descend into NOSCRIPT |
| 300 | + * elements and process their child tags. Were the scripting flag enabled, as |
| 301 | + * in a typical browser, the contents of NOSCRIPT would be skipped entirely. |
| 302 | + * |
| 303 | + * This allows the HTML API to process the content that will be presented in |
| 304 | + * a browser when scripting is disabled, but it offers a different view of a |
| 305 | + * page than most browser sessions will experience: with scripting enabled, |
| 306 | + * a browser does not parse the contents of NOSCRIPT as tags. |
| 307 | + * |
| 308 | + * ### Text Encoding |
| 309 | + * |
| 310 | + * The Tag Processor assumes that the input HTML document is encoded with a |
| 311 | + * text encoding compatible with 7-bit ASCII. These include, but are not |
| 312 | + * limited to, ISO-8859-1 (latin1), CP-1250, and UTF-8. If given a UTF-16 |
| 313 | + * byte stream, or any other encoding that does not represent each 7-bit |
| 314 | + * ASCII character as the same single byte, the Tag Processor will not be |
| 315 | + * able to parse the document properly. |
292 | 316 | * |
293 | 317 | * @since 6.2.0 |
294 | 318 | * @since 6.2.1 Fix: Support for various invalid comments; attribute updates are case-insensitive. |
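
The Text Encoding note in this hunk can be illustrated with a minimal sketch. It does not call the Tag Processor itself; it only assumes plain PHP with the `mbstring` extension available, and the sample markup is made up for illustration. It shows why a scanner that looks for ASCII bytes cannot find tags in UTF-16 input, and that converting to UTF-8 first restores the expected bytes.

```php
<?php
// Illustration only: why a byte-oriented, ASCII-compatible scanner cannot
// find tags in UTF-16 input. Assumes the mbstring extension is loaded.

// Hypothetical sample markup, encoded as UTF-8 (ASCII-compatible).
$html_utf8 = '<img src="a.png">';

// In UTF-16LE every ASCII character is followed by a NUL byte.
$html_utf16 = mb_convert_encoding( $html_utf8, 'UTF-16LE', 'UTF-8' );

// The byte sequence "<img" is present at the start of the UTF-8 input ...
var_dump( strpos( $html_utf8, '<img' ) );  // int(0)

// ... but not in the UTF-16LE bytes, where "<" and "i" are separated by "\0".
var_dump( strpos( $html_utf16, '<img' ) ); // bool(false)

// Converting back to UTF-8 before parsing restores the expected bytes.
$restored = mb_convert_encoding( $html_utf16, 'UTF-8', 'UTF-16LE' );
var_dump( $restored === $html_utf8 );      // bool(true)
```

Callers that may receive UTF-16 or similar input would presumably want to convert it to UTF-8 in this way before handing it to the Tag Processor; whether that belongs in the docblock is a judgment call for the patch author.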