released: false apache: true title: 3.4.0 date: 2022-10-31 summary: > EXI binary XML support, pluggable character sets, embedded XML, C code generator updates

artifact-root: “https://dist.apache.org/repos/dist/dev/daffodil/3.4.0-rc1/” checksum-root: “https://dist.apache.org/repos/dist/dev/daffodil/3.4.0-rc1/

key-file: “https://downloads.apache.org/daffodil/KEYS

source-dist: - “apache-daffodil-3.4.0-src.zip”

binary-dist: - “apache-daffodil-3.4.0-bin.tgz” - “apache-daffodil-3.4.0-bin.zip” - “apache-daffodil-3.4.0-bin.msi” - “apache-daffodil-3.4.0-1.noarch.rpm”

scala-version: 2.12

EXI Support

The Daffodil CLI adds two new infoset types---I exi and -I exisa--to support infosets represented as EXI binary XML for non-schema aware and schema aware EXI, respectively. EXI infosets are significantly smaller in size than normal XML infosets, and are often even smaller than the original data format when made schema aware. This support has been enabled with the Exificient library.

API users can create EXI files using the existing SAXInfosetInputter and SAXInfosetOutputter classes and the Exificient SAX API. This has also been tested with the Agile Delta EXI SAX API. Schema aware EXI can be enabled using the DaffodilXMLEntityResolver, newly added to Java and Scala public APIs.

  • {% jira 1959 %} EXIficient Inputter and outputter for XML EXI representation
  • {% jira 2739 %} Expose DFDLCatalogResolver to the public API

Pluggable Character Sets

Custom character sets can now be added to Daffodil by implementing a custom BitsCharsetDefinition class and related functions/classes, listing it in a META-INF/services file, packaging it into a jar, and adding it to the Daffodil classpath.

  • {% jira 2663 %} charset encoders need to be pluggable (doc needed)

Embedded XML in the Infoset

When using the XMLTextInfosetInputter and XMLTextInfosetOutputter in the API, or -I xml in the CLI, simple string elements with the DFDL extension attribute of dfdl:runtimeProperties="stringAsXml=true" are treated as XML. This means that when parsing, instead of outputting the content as an XML escaped string, the parsed content is checked to be valid XML and output as if it were part of the XML infoset. When unparsing, the XML part of the infoset is converted back to a string. Note that because there are multiple ways to read and write XML that are syntactically different but semantically the same, it is possible that parsed or unparsed data may differ from the original data.

  • {% jira 2708 %} XML String feature in XML Text Infoset Inputter/Outputter

C Code Generator (Runtime2) Updates

The C code generator backend now supports reading and writing N-bit booleans and integers, where N is an explicit length from 1 to 64 bits. Additional miscellaneous changes include unit test support, fixes to nested choices, and tweaks to how float numbers are output.

  • {% jira 2696 %} Extend runtime2 to N-bit booleans and integers (1 <= N <= 64)

Additional New Features/Improvements

  • {% jira 2357 %} Update recoverable error to not be an SDW
  • {% jira 2369 %} TDML language needs comment syntax in documentPart byte and bits
  • {% jira 2685 %} Upgrade Scala XML library to 2.1.0 version
  • {% jira 2697 %} Choose TDML implementation to run TDML tests
  • {% jira 2702 %} Eliminate build warning on VSCode scala build of Daffodil
  • {% jira 2719 %} Add tests to illustrate some nillable and fixed-length scenarios
  • {% jira 2723 %} Change DataProcessor.copy() method to public
  • {% jira 2725 %} TDML runner needs to support pre-compiled DFDL schema

Miscellaneous Bug Fixes

  • {% jira 2237 %} XML comment not allowed inside dfdl:defineEscapeScheme
  • {% jira 2488 %} When executing a parse command with the debugger get incorrect result for display info unparser
  • {% jira 2616 %} End-user doc of how import/include schemaLocation resolver works
  • {% jira 2654 %} java.util.NoSuchElementException: head of empty list on commented out includes/choice branches
  • {% jira 2680 %} Add new transitive deps' NOTICE to Daffodil's bin.NOTICE
  • {% jira 2682 %} Update copyrights to 2022
  • {% jira 2693 %} Add Adam Rosien to https://daffodil.apache.org/people/
  • {% jira 2701 %} Facet checks throw exception for non-numeric float/double values
  • {% jira 2704 %} sbt tests are intermittenly hanging
  • {% jira 2710 %} build.md instructions fail for RedHat - no mxml-devel
  • {% jira 2713 %} Release candidate container has poor cacability and reproducability
  • {% jira 2717 %} unparser deadlocks in mil-std-2045 2-message file case
  • {% jira 2726 %} Index out of bounds -1 during unparsing

Deprecation/Compatibility

  • dfdl:assert's with failureType="recoverableError" are now reported as validation errors instead of schema definition warnings {% jira 2357 %}

  • All InfosetInputter and InfosetOutputter functions now return Unit instead of Boolean. Errors are now expected to be thrown as exceptions {% jira 2721 %}

Dependency Changes

The following dependencies have been added or updated:

Core

  • FasterXML Jackson Core 2.13.4 (update)
  • FasterXML Woodstox Core 6.4.0 (update)
  • ICU4J 72.1 (update)
  • Log4j API 2.19.0 (update)
  • Scala XML 2.1.0 (update)

Command Line Interface

  • Exificient 1.0.4 (new)
  • Log4j Core 2.19.0 (update)

Schematron Validator

  • Saxon-HE 11.4 (update)

Test

  • Scalacheck 1.17.0 (update)