released: true
apache: true
title: 3.4.0
date: 2022-11-08
summary: >
    EXI binary XML support, pluggable character sets, embedded XML, C code generator updates
artifact-root: "https://www.apache.org/dyn/closer.lua/daffodil/3.4.0/"
checksum-root: "https://downloads.apache.org/daffodil/3.4.0/"
key-file: "https://downloads.apache.org/daffodil/KEYS"
source-dist:
    - "apache-daffodil-3.4.0-src.zip"
binary-dist:
    - "apache-daffodil-3.4.0-bin.tgz"
    - "apache-daffodil-3.4.0-bin.zip"
    - "apache-daffodil-3.4.0-bin.msi"
    - "apache-daffodil-3.4.0-1.noarch.rpm"
The Daffodil CLI adds two new infoset types, `-I exi` and `-I exisa`, to support infosets represented as EXI binary XML for non-schema-aware and schema-aware EXI, respectively. EXI infosets are significantly smaller than normal XML infosets, and are often even smaller than the original data format when made schema aware. The Daffodil CLI now includes the Exificient library to support these infoset types.
API users can create EXI files by combining the existing `SAXInfosetInputter` and `SAXInfosetOutputter` classes with the Exificient SAX API, or they can use the Agile Delta EXI SAX API, which has also been tested with Daffodil. Daffodil also adds the new `DaffodilXMLEntityResolver` class to its Java and Scala public APIs to support creating schema-aware EXI files.
Custom character sets can now be added to Daffodil by implementing a custom `BitsCharsetDefinition` class and related functions/classes, listing it in a `META-INF/services` file, packaging it into a jar, and adding it to the Daffodil classpath.
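The `META-INF/services` step follows the standard Java service-provider convention. The sketch below illustrates only that generic discovery mechanism, using `java.util.ServiceLoader` from the JDK. The `NamedCharsetPlugin` interface and `ExampleCharsetPlugin` class are hypothetical stand-ins, not Daffodil's `BitsCharsetDefinition` API, and a temporary directory stands in for the plugin jar.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Illustration of the generic java.util.ServiceLoader mechanism only.
// All type names below are hypothetical, not part of Daffodil.
class ServiceLoaderSketch {

    // Hypothetical service interface that plugins would implement.
    public interface NamedCharsetPlugin {
        String name();
    }

    // Hypothetical provider; ServiceLoader needs a public no-argument constructor.
    public static class ExampleCharsetPlugin implements NamedCharsetPlugin {
        public ExampleCharsetPlugin() {}

        @Override
        public String name() {
            return "X-EXAMPLE-CHARSET";
        }
    }

    // Simulates a plugin jar: writes META-INF/services/<interface name> listing the
    // provider class into a temporary directory, adds that directory to a class
    // loader, and returns the names of all providers discovered through it.
    static List<String> discoverPluginNames() {
        try {
            Path root = Files.createTempDirectory("plugin");
            Path services = Files.createDirectories(root.resolve("META-INF/services"));
            Files.write(
                services.resolve(NamedCharsetPlugin.class.getName()),
                ExampleCharsetPlugin.class.getName().getBytes(StandardCharsets.UTF_8));

            URL[] urls = { root.toUri().toURL() };
            ClassLoader parent = ServiceLoaderSketch.class.getClassLoader();
            try (URLClassLoader loader = new URLClassLoader(urls, parent)) {
                List<String> names = new ArrayList<>();
                for (NamedCharsetPlugin p : ServiceLoader.load(NamedCharsetPlugin.class, loader)) {
                    names.add(p.name());
                }
                return names;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(discoverPluginNames());
    }
}
```

In a real plugin the registration file would be packaged inside the jar, and its file name would be the fully qualified name of the service type that Daffodil looks up.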
When using the `XMLTextInfosetInputter` and `XMLTextInfosetOutputter` classes in the API, or `-I xml` in the CLI, simple string elements with the DFDL extension attribute `dfdlx:runtimeProperties="stringAsXml=true"` are treated as XML. This means that when parsing, instead of outputting the content as an XML-escaped string, the parsed content is checked to be valid XML and output as if it were part of the XML infoset. When unparsing, the embedded XML part of the infoset is converted back to a string. Note that because there are multiple ways to read and write XML that are syntactically different but semantically the same, parsed or unparsed data may differ from the original data.
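To illustrate why a round trip may not reproduce the original bytes, here is a minimal sketch that uses only the JDK's DOM parser, not any Daffodil class. The class and method names are hypothetical. It checks whether a string is well-formed XML and whether two differently written strings describe the same element.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Illustration only: JDK XML APIs, hypothetical helper names, no Daffodil code.
class XmlEquivalenceSketch {

    // Parses a string into a DOM tree; returns null if it is not well-formed XML.
    static Document parseOrNull(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors as exceptions.
            builder.setErrorHandler(new DefaultHandler());
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            return null;
        }
    }

    // True if the string can be parsed as a well-formed XML document.
    static boolean isWellFormed(String xml) {
        return parseOrNull(xml) != null;
    }

    // True if both strings are well-formed and their root elements are equal
    // according to DOM Node.isEqualNode (names, attributes, and children).
    static boolean sameContent(String a, String b) {
        Document da = parseOrNull(a);
        Document db = parseOrNull(b);
        return da != null && db != null
            && da.getDocumentElement().isEqualNode(db.getDocumentElement());
    }

    public static void main(String[] args) {
        // Different quoting, attribute order, and empty-element syntax.
        System.out.println(sameContent("<a x=\"1\" y=\"2\"/>", "<a y='2' x='1'></a>"));
        System.out.println(isWellFormed("<a><b></a>"));
    }
}
```

The two strings in the first call differ byte for byte yet carry the same information, which is the kind of difference that can appear between original and unparsed data.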
The C code generator backend now supports reading and writing N-bit booleans and integers, where N is an explicit length from 1 to 64 bits. Additional miscellaneous changes include unit test support, fixes to nested choices, tweaks to how float numbers are output, and a command line option to choose the TDML implementation to run TDML tests.
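As a rough illustration of what reading and writing an N-bit field involves, the sketch below is written in Java and is not the code the C generator emits. It assumes big-endian byte order with the most significant bit first, and all names are hypothetical.

```java
// Illustration only: hypothetical helpers, not output of the Daffodil C code generator.
// Assumes big-endian byte order and most-significant-bit-first bit order.
class BitFieldSketch {

    // Reads an unsigned field of nBits (1 to 64) starting at bitOffset.
    // For nBits == 64 the result is the raw bit pattern held in a Java long.
    static long readBits(byte[] data, long bitOffset, int nBits) {
        checkRange(data, bitOffset, nBits);
        long value = 0;
        for (int i = 0; i < nBits; i++) {
            long pos = bitOffset + i;
            int bit = (data[(int) (pos >>> 3)] >> (7 - (int) (pos & 7))) & 1;
            value = (value << 1) | bit;
        }
        return value;
    }

    // Writes the low nBits of value into data starting at bitOffset.
    static void writeBits(byte[] data, long bitOffset, int nBits, long value) {
        checkRange(data, bitOffset, nBits);
        for (int i = 0; i < nBits; i++) {
            long pos = bitOffset + i;
            int idx = (int) (pos >>> 3);
            int mask = 1 << (7 - (int) (pos & 7));
            if (((value >>> (nBits - 1 - i)) & 1) == 1) {
                data[idx] |= mask;
            } else {
                data[idx] &= ~mask;
            }
        }
    }

    // Interprets an nBits-wide unsigned value as two's complement.
    static long signExtend(long value, int nBits) {
        int shift = 64 - nBits;
        return (value << shift) >> shift;
    }

    // An N-bit boolean is treated here as true when any bit is set (an assumption).
    static boolean readBoolean(byte[] data, long bitOffset, int nBits) {
        return readBits(data, bitOffset, nBits) != 0;
    }

    private static void checkRange(byte[] data, long bitOffset, int nBits) {
        if (nBits < 1 || nBits > 64) {
            throw new IllegalArgumentException("nBits must be between 1 and 64");
        }
        if (bitOffset < 0 || bitOffset + nBits > (long) data.length * 8) {
            throw new IllegalArgumentException("field is outside the data");
        }
    }

    public static void main(String[] args) {
        byte[] data = { (byte) 0xB4, 0x0F };
        System.out.println(readBits(data, 0, 3));
        System.out.println(signExtend(readBits(data, 0, 3), 3));
    }
}
```

Processing one bit at a time keeps the sketch easy to verify; a production implementation would more likely shift and mask whole bytes.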
`dfdl:assert` statements with `failureType="recoverableError"` are now reported as validation errors instead of schema definition warnings {% jira 2357 %}
All `InfosetInputter` and `InfosetOutputter` functions now return `Unit` instead of `Boolean`. Errors are now expected to be thrown as exceptions {% jira 2721 %}
The following dependencies have been added or updated:
Core
Command Line Interface
Schematron Validator
Test
Changes to Transitive Dependencies