Apache Sling Commons HTML Utilities




This module is part of the Apache Sling project.

Current settings and their default values

The default SAX features are defined at http://www.saxproject.org/apidoc/org/xml/sax/package-summary.html

TagSoup-specific features

| Feature ID | Default | Description |
| --- | --- | --- |
| http://www.ccil.org/~cowan/tagsoup/features/ignore-bogons | false | A value of true indicates that the parser will ignore unknown elements. |
| http://www.ccil.org/~cowan/tagsoup/features/bogons-empty | false | A value of true indicates that the parser will give unknown elements a content model of EMPTY; a value of false, a content model of ANY. |
| http://www.ccil.org/~cowan/tagsoup/features/root-bogons | true | A value of true indicates that the parser will allow unknown elements to be the root of the output document. |
| http://www.ccil.org/~cowan/tagsoup/features/default-attributes | true | A value of true indicates that the parser will return default attribute values for missing attributes that have default values. |
| http://www.ccil.org/~cowan/tagsoup/features/translate-colons | false | A value of true indicates that the parser will translate colons into underscores in names. |
| http://www.ccil.org/~cowan/tagsoup/features/restart-elements | true | A value of true indicates that the parser will attempt to restart the restartable elements. |
| http://www.ccil.org/~cowan/tagsoup/features/ignorable-whitespace | false | A value of true indicates that the parser will transmit whitespace in element-only content via the SAX ignorableWhitespace callback. Normally this is not done, because HTML is an SGML application and SGML suppresses such whitespace. |
| http://www.ccil.org/~cowan/tagsoup/features/cdata-elements | true | A value of true indicates that the parser will process the script and style elements (or any elements with type='cdata' in the TSSL schema) as SGML CDATA elements (that is, no markup is recognized except the matching end-tag). |
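
As a rough illustration of how the feature IDs listed above are used, the sketch below collects them with their documented defaults and applies them to any SAX `XMLReader` through the standard `setFeature` call. The class name `TagSoupFeatureDefaults` and its two methods are hypothetical helpers written for this example; they are not part of this module, and the way the module itself exposes these settings is not shown here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

// Hypothetical helper for illustration only; not part of this module's API.
public class TagSoupFeatureDefaults {

    private static final String PREFIX = "http://www.ccil.org/~cowan/tagsoup/features/";

    /** Returns the TagSoup feature IDs with the default values documented above. */
    public static Map<String, Boolean> defaults() {
        Map<String, Boolean> features = new LinkedHashMap<>();
        features.put(PREFIX + "ignore-bogons", false);
        features.put(PREFIX + "bogons-empty", false);
        features.put(PREFIX + "root-bogons", true);
        features.put(PREFIX + "default-attributes", true);
        features.put(PREFIX + "translate-colons", false);
        features.put(PREFIX + "restart-elements", true);
        features.put(PREFIX + "ignorable-whitespace", false);
        features.put(PREFIX + "cdata-elements", true);
        return features;
    }

    /**
     * Applies each feature to the reader via the standard SAX setFeature call.
     * Returns the IDs the reader did not recognize or support, so the caller
     * can decide whether that is an error.
     */
    public static List<String> apply(XMLReader reader, Map<String, Boolean> features) {
        List<String> rejected = new ArrayList<>();
        for (Map.Entry<String, Boolean> entry : features.entrySet()) {
            try {
                reader.setFeature(entry.getKey(), entry.getValue());
            } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
                rejected.add(entry.getKey());
            }
        }
        return rejected;
    }
}
```

To change a single setting, one could copy the map, override the entry (for example, set `ignore-bogons` to `true`), and pass the result to `apply` together with a TagSoup-backed `XMLReader`. A reader that is not based on TagSoup would be expected to reject these IDs, which is why rejected IDs are returned to the caller rather than hidden.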