Apache Sling Commons HTML Utilities

This module is part of the Apache Sling project.

Current settings and their default values

The default SAX features are defined at http://www.saxproject.org/apidoc/org/xml/sax/package-summary.html

TagSoup specific features

| Feature ID | Default | Description |
|---|---|---|
| http://www.ccil.org/~cowan/tagsoup/features/ignore-bogons | false | A value of true indicates that the parser will ignore unknown elements. |
| http://www.ccil.org/~cowan/tagsoup/features/bogons-empty | false | A value of true indicates that the parser will give unknown elements a content model of EMPTY; a value of false, a content model of ANY. |
| http://www.ccil.org/~cowan/tagsoup/features/root-bogons | true | A value of true indicates that the parser will allow unknown elements to be the root of the output document. |
| http://www.ccil.org/~cowan/tagsoup/features/default-attributes | true | A value of true indicates that the parser will return default attribute values for missing attributes that have default values. |
| http://www.ccil.org/~cowan/tagsoup/features/translate-colons | false | A value of true indicates that the parser will translate colons into underscores in names. |
| http://www.ccil.org/~cowan/tagsoup/features/restart-elements | true | A value of true indicates that the parser will attempt to restart the restartable elements. |
| http://www.ccil.org/~cowan/tagsoup/features/ignorable-whitespace | false | A value of true indicates that the parser will transmit whitespace in element-only content via the SAX ignorableWhitespace callback. Normally this is not done, because HTML is an SGML application and SGML suppresses such whitespace. |
| http://www.ccil.org/~cowan/tagsoup/features/cdata-elements | true | A value of true indicates that the parser will process the script and style elements (or any elements with type='cdata' in the TSSL schema) as SGML CDATA elements (that is, no markup is recognized except the matching end-tag). |
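
The feature IDs listed above are ordinary SAX feature names, so they can be set on any `org.xml.sax.XMLReader` through the standard `setFeature` call. The following is a minimal sketch using only JDK classes. The class name `TagSoupFeatureDefaults` and its two methods are hypothetical helpers written for illustration and are not part of this module. How this module exposes these settings in its own configuration is not shown here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

// Hypothetical helper, not part of this module: mirrors the feature list
// above and applies it to an arbitrary SAX XMLReader.
final class TagSoupFeatureDefaults {

    private static final String PREFIX = "http://www.ccil.org/~cowan/tagsoup/features/";

    private TagSoupFeatureDefaults() {
    }

    /** Returns the feature IDs and default values copied from the list above. */
    static Map<String, Boolean> defaults() {
        Map<String, Boolean> features = new LinkedHashMap<>();
        features.put(PREFIX + "ignore-bogons", false);
        features.put(PREFIX + "bogons-empty", false);
        features.put(PREFIX + "root-bogons", true);
        features.put(PREFIX + "default-attributes", true);
        features.put(PREFIX + "translate-colons", false);
        features.put(PREFIX + "restart-elements", true);
        features.put(PREFIX + "ignorable-whitespace", false);
        features.put(PREFIX + "cdata-elements", true);
        return features;
    }

    /**
     * Sets each feature on the reader via the standard SAX setFeature call.
     * Returns the feature IDs that the reader did not recognize or support.
     */
    static List<String> apply(XMLReader reader, Map<String, Boolean> features) {
        List<String> rejected = new ArrayList<>();
        for (Map.Entry<String, Boolean> entry : features.entrySet()) {
            try {
                reader.setFeature(entry.getKey(), entry.getValue());
            } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
                // A reader that is not TagSoup-based will typically land here.
                rejected.add(entry.getKey());
            }
        }
        return rejected;
    }
}
```

To override a default, copy the map, change the entry (for example, set `ignore-bogons` to `true`), and pass the result to `apply`. A reader that does not know the TagSoup feature IDs reports them in the returned list rather than failing on the first one.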