release: final
apache: true
title: 3.0.0
date: 2020-11-20
summary: >
  SAX support, streaming, reduced memory usage, diagnostics, and bug fixes

source-dist:
  - "apache-daffodil-3.0.0-incubating-src.zip"

binary-dist:
  - "apache-daffodil-3.0.0-incubating-bin.tgz"
  - "apache-daffodil-3.0.0-incubating-bin.zip"
  - "apache-daffodil-3.0.0-incubating-bin.msi"
  - "apache-daffodil-3.0.0.incubating-1.noarch.rpm"

scala-version: 2.12

SAX Support and Streaming

Numerous changes were made to improve streaming and reduce memory usage during parse and unparse operations.

The most visible change is the addition of a new SAX API for both parsing and unparsing. A new DaffodilParseXMLReader parses data and delivers infoset events to a SAX ContentHandler, and a new DaffodilUnparseContentHandler unparses SAX events received from a SAX XMLReader. See the Javadoc and Scaladoc for examples of creating and using the new SAX Daffodil objects.
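Because the parse side delivers events through the standard SAX ContentHandler interface, a handler can be written and exercised without Daffodil on the classpath. The following is a minimal sketch, not Daffodil's own example: it uses the JDK's built-in SAX parser as a stand-in event source, and the class and method names (InfosetEventSketch, ElementNameCollector, collectNames) are hypothetical. In a real application the XMLReader would instead be a DaffodilParseXMLReader created as described in the Javadoc.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class InfosetEventSketch {

    /** Records element names in the order their start events arrive. */
    static class ElementNameCollector extends DefaultHandler {
        final List<String> names = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            names.add(qName);
        }
    }

    /**
     * Streams SAX events for the given XML text into the collector.
     * ASSUMPTION: the JDK SAX parser stands in for the Daffodil reader here.
     */
    static List<String> collectNames(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            ElementNameCollector handler = new ElementNameCollector();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.names;
        } catch (Exception e) {
            // Wrap checked SAX/IO exceptions to keep the sketch's signature simple.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints [record, id, name]
        System.out.println(collectNames("<record><id>1</id><name>a</name></record>"));
    }
}
```

The handler processes each element as its event arrives rather than holding a full document tree, which is the property that lets event-based consumers keep memory usage low.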

For both the new SAX API and the original Daffodil API, infoset events are now created during parsing, rather than waiting for a parse to complete. As infoset events are streamed out while parsing, or read in during unparsing, Daffodil now removes them from the internal infoset representation, reducing overall memory usage in many cases.

Lastly, while unparsing, Daffodil must sometimes buffer data due to circumstances such as alignment, length calculations, and forward-looking expressions. Previous versions of Daffodil would not attempt to resolve these issues until the end of an unparse. Daffodil now attempts to resolve them during the unparse process, allowing the buffered data to be written to the unparse output stream sooner and ultimately reducing memory usage.

Multiple changes were also made so that files larger than 4GB, including those with blobs, behave as expected.

  • {% jira 934 %} Streaming parser: Need to stream input data in, and infoset out to handle arbitrarily large data.
  • {% jira 1272 %} Unparsing - Optimizing pruning of infoset elements that aren't needed by any future expressions
  • {% jira 2383 %} Implement SAX API
  • {% jira 2389 %} Implement OutputStream/Writer writing ContentHandler for SAX Parsing
  • {% jira 2395 %} Cannot have complex or simple types with a specified length larger than 256MB
  • {% jira 2401 %} 4GB NITF file crashes Daffodil. 8GB and 10GB files do not.
  • {% jira 2412 %} CLI unparse command always reads infoset into a byte array
  • {% jira 2425 %} unparse fail with large blob

API Changes

A new experimental API is added to allow traversing the Daffodil internal DFDL Schema Object Model. This lets API users build an alternative schema object model, which is sometimes needed for native infoset integration with external tools.

  • {% jira 2372 %} Add API to walk DSOM

A new withExternalVariables function is added to the Java API that accepts a Java AbstractMap. Previously, the only available function used a Scala Map, which is difficult to use from a Java application.

  • {% jira 2382 %} JAPI withExternalVariables requires Scala Map
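As a minimal sketch of the Java-side preparation, external variable bindings can be collected in an ordinary java.util.HashMap, which extends AbstractMap. The variable names and values below are hypothetical, the class name ExternalVariablesSketch is illustrative, and the String-to-String map type is an assumption; check the Javadoc for the exact withExternalVariables signature.

```java
import java.util.AbstractMap;
import java.util.HashMap;

class ExternalVariablesSketch {

    /** Builds variable bindings as a plain Java map (HashMap extends AbstractMap). */
    static AbstractMap<String, String> buildVariables() {
        HashMap<String, String> vars = new HashMap<>();
        // Hypothetical variable names and values, for illustration only.
        vars.put("byteOrder", "bigEndian");
        vars.put("maxRecords", "100");
        return vars;
    }

    public static void main(String[] args) {
        AbstractMap<String, String> vars = buildVariables();
        // With Daffodil on the classpath, this map would be passed to
        // withExternalVariables on the Java API object (assumed usage;
        // see the Javadoc for the exact signature).
        System.out.println(vars.size() + " external variables prepared");
    }
}
```

No Scala collection types appear anywhere in the calling code, which is the difficulty the new overload is meant to remove.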

Diagnostics

Multiple changes were made to improve ambiguous error messages, or to provide new error checks or warnings for behavior that may be unintuitive to users.

  • {% jira 652 %} Unclear error message for test_whiteSpaceDuringLengthExceededByte
  • {% jira 879 %} Warn when using an expression in a property that does not accept expressions
  • {% jira 1025 %} InputValueCalc should not cause a circular reference or refer to its element
  • {% jira 1032 %} Choice with no branches should cause SDE
  • {% jira 1035 %} Lone “.” in a dfdl:inputValueCalc expression should be an error
  • {% jira 1043 %} Bad diagnostic message: The type Complex cannot be converted to ....
  • {% jira 2277 %} Warning needed for misplaced discriminators
  • {% jira 2375 %} Warn when it is impossible to unparse a choice branch
  • {% jira 2377 %} Abort instead of diagnostic message

Bug Fixes

  • {% jira 1894 %} NoSuchElementException when getting namedTypes in union restrictions
  • {% jira 2122 %} default=“0|1” causes exception for xs:boolean types
  • {% jira 2196 %} XML Schema for DFDL is invalid
  • {% jira 2210 %} Unhandled exception with leading/trailing space in restriction base
  • {% jira 2322 %} dfdl:textBidi property errors when used in attribute form
  • {% jira 2354 %} newVariableInstance leads to mark state exception
  • {% jira 2371 %} Multiple discriminator behavior is not correct
  • {% jira 2374 %} Choice with dispatch on variable getting reset when parsing potentially nil elements
  • {% jira 2394 %} IllegalFormatConversionException related to BacktrackingException

Miscellaneous Changes

  • {% jira 1533 %} Rename: NoRep should really be InputValueCalc
  • {% jira 1814 %} usingCompilerMode, usingRuntimeMode, usingUnrestrictedMode - no longer needed
  • {% jira 2216 %} thread-unsafe Logging trait usage
  • {% jira 2269 %} Update to latest dependencies
  • {% jira 2337 %} Update unsupported features and errata page for 2.6.0 and 2.7.0
  • {% jira 2363 %} pattern facet can't use  notation. Makes validating NUL very hard.
  • {% jira 2365 %} LICENSE file correction
  • {% jira 2367 %} Add tests for nested choice message-dispatch scenarios
  • {% jira 2368 %} API docs do not include UDF
  • {% jira 2370 %} Prepare for 3.0.0 development
  • {% jira 2388 %} Release rpms built with new zstd compression, unsupported by older versions of rpm
  • {% jira 2396 %} Performance degradation caused by infoset stream
  • {% jira 2403 %} Re-enable codecov PR comments
  • {% jira 2404 %} Drop support for Scala 2.11
  • {% jira 2414 %} Prepare for 3.0.0 Release
  • {% jira 2426 %} Create 3.0.0 Release Candidate
  • {% jira 2428 %} Update to sbt-native-packager caused inability to build windows MSI

Deprecation/Compatibility

Daffodil no longer provides support for Scala 2.11.x. Scala API users must upgrade to Scala 2.12.

  • {% jira 2404 %} Drop support for Scala 2.11

The dfdl:fillByte property is now always required in DFDL schemas, even when it is not needed. Schemas that do not specify the fillByte property now result in a Schema Definition Error. Schemas that use DFDLGeneralFormat.dfdl.xsd to provide default property values are unaffected, since that format defines the property.

  • {% jira 2377 %} Abort instead of diagnostic message

The withExternalVariables function in the Java API that accepts a Scala Map has been deprecated. When using the Java API, use the function of the same name that accepts a Java AbstractMap instead.

  • {% jira 2382 %} JAPI withExternalVariables requires Scala Map