commit | a016ae6a1d59580ff050a8f7310a9f57ec874bd4 | |
---|---|---|
author | John Interrante <interran@research.ge.com> | Thu Apr 29 12:47:48 2021 -0400 |
committer | John Interrante <interran@research.ge.com> | Sun May 02 14:51:41 2021 -0400 |
tree | 19b1c00d9e2f0b472c00fab54a60f77805060f9f | |
parent | fcb4dc0a0722b1fea7e98c58f88464f45b07abf9 | |
Fix some runtime2 todos

Parse C executable's command line arguments with getopt instead of argp (now have to put all options before first non-option argument and lost ability to say "daffodil parse in.dat -o out.xml", but CLI was meant for TDML runner rather than users anyway). Update build instructions and CI workflow to build with clang instead of gcc. Fix MSYS2 portability problems exposed by change (e.g., no error() function). Lower minimum required Mini-XML version from 3.2 to 3.0. Add missing LICENSE and NOTICE files to daffodil-runtime2 jar.

Move everything CLI-related in libruntime out to libcli. Make error lookup mechanism pluggable with a hook to let libcli insert its own error lookup routine. Also make changes and move files to help automate clang-format, include-what-you-use, and generated_code.[ch] updates later.

DAFFODIL-2500, DAFFODIL-2505, DAFFODIL-2508

---

main.yml: Build C files with clang instead of gcc, but build mxml with gcc on MSYS2 since runtime2 tests break with mxml compiled by clang. Also need diffutils on MSYS2 to avoid mxml makefile calling Cygwin's cmp with unnecessary error message.

BUILD.md: Lower Mini-XML version to 3.0. Remove argp and gcc instructions. Add clang instructions. Show how to set env vars on one line. On MSYS2, install diffutils to let mxml makefile call cmp.

README.md: Lower Mini-XML version to 3.0.

.clang-format: Add StatementMacros to allow clang-format to format parsers.c and unparsers.c without unexpected indentation. Also make all C files tell clang-format to leave include lines alone to prevent clang-format from interfering with include-what-you-use (SortIncludes: false isn't sufficient since clang-format still re-indents comments):

    // clang-format off
    #include ... // for ...
    // clang-format on

LICENSE: Add to daffodil-runtime2 jar.

NOTICE: Add to daffodil-runtime2 jar.

Makefile: Add comments explaining how to use targets. Rename tests to check target for consistency with GNU make standard.
Put options before non-option arguments in daffodil commands.

All C files: Tell clang-format to leave includes alone so it won't interfere with include-what-you-use.

cli_errors.[ch]: Move all error codes and messages used by libcli here. Use lookup tables and pluggable mechanism to allow libruntime to look up and format libcli messages. Move/rename those CLI-related enum constants here from libruntime:

- ERR_FILE_CLOSE (all ERR_* renamed to CLI_*)
- ERR_FILE_OPEN
- ERR_INFOSET_READ/WRITE (-> CLI_INVALD_INFOSET)
- ERR_STACK_EMPTY
- ERR_STACK_OVERFLOW
- ERR_STACK_UNDERFLOW
- ERR_STRTOBOOL
- ERR_STRTOD_ERRNO
- ERR_STRTOI_ERRNO
- ERR_STRTONUM_EMPTY
- ERR_STRTONUM_NOT
- ERR_STRTONUM_RANGE
- ERR_XML_DECL
- ERR_XML_ELEMENT
- ERR_XML_ERD
- ERR_XML_GONE
- ERR_XML_INPUT
- ERR_XML_LEFT
- ERR_XML_MISMATCH
- ERR_XML_WRITE
- LIMIT_XML_NESTING

daffodil_argp.[ch]: Rename to daffodil_getopt.[ch].

daffodil_getopt.c: Remove all argp structs and handlers. Simplify to just a single parse_daffodil_cli function calling getopt and returning a pointer to Error if any error happens. Note getopt has no portable way to parse "daffodil [options] command [more options] arguments" so callers now have to put all options before first non-option argument ("daffodil [options] command arguments").

daffodil_getopt.h: Include "errors.h" and make parse_daffodil_cli return a pointer to Error so we can use continue_or_exit(error) for all errors/messages.

daffodil_main.c: Remove fflush_continue_or_exit (no really good reason to flush a stream right before closing it). Call continue_or_exit after parse_daffodil_cli to handle any CLI error. Simplify rest of code in main function. Rename error enumerations (ERR -> CLI) and initialize c field with 0 instead of s field with NULL since we now sort Error fields alphabetically (also needed in stack.c, xml_reader.c, xml_writer.c).
errors.c: Remove <error.h> and replace error calls with fprintf/exit calls since error's a GNU function which MSYS2 clang doesn't provide. Move eof_or_error, get_diagnostics, add_diagnostics up so they come first in file. Replace error_message function containing switch statement with error_lookup function indexing lookup table and returning ErrorLookup structs. Change print_maybe_stop function to call both error_lookup and cli_error_lookup (pluggable mechanism used to look up CLI errors) and switch on ErrorField enumerations instead of ErrorCode enumerations to print formatted messages with appropriate Error fields. Add check_error_lookup function (after you rearrange codes or messages or get an assertion failure in error_lookup, call check_error_lookup from the debugger to check all codes map to expected messages).

errors.h: Move CLI-related errors to cli_errors.h. Add ERR_ZZZ enumeration to allow libcli's first error code to be numbered consecutively following libruntime's last error code without a gap between them. Add "int c" member to ErrorCode anonymous union for getopt-related errors. Add ErrorField enumerations and ErrorLookup struct to allow print_maybe_stop to print libcli's messages without hardcoding any knowledge about them. Sort Error's fields alphabetically. Move PState and UState structs back to infoset.h where they really belong. Declare daffodil_program_version (we were using argp_program_version before which is no longer available) and move eof_or_error up before get_diagnostics. Declare cli_error_lookup pluggable mechanism to allow print_maybe_stop to look up libcli's messages without knowing anything about them.

infoset.h: Move PState and UState structs back here.

parsers.c, unparsers.c: Initialize c field with 0 and use explicit .s identifier to initialize s field since we now sort Error fields alphabetically.

unparsers.h: Format from 80 to 100 columns with clang-format (missed this one before).
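The pluggable lookup described for errors.[ch] and cli_errors.[ch] can be sketched roughly as follows. Only the names ERR_ZZZ, CLI_FILE_CLOSE, CLI_FILE_OPEN, ErrorField, ErrorLookup, error_lookup, and cli_error_lookup come from the commit message; every struct member, signature, message string, and the *_SKETCH / *_sketch names are assumptions made for illustration.

```c
#include <stddef.h> // for NULL

// --- libruntime side ---
enum ErrorCode
{
    ERR_FIRST_SKETCH,  // hypothetical runtime error code
    ERR_SECOND_SKETCH, // hypothetical runtime error code
    ERR_ZZZ            // one past the last runtime code
};

// Tells the printer which Error field (if any) a message formats.
enum ErrorField
{
    FIELD_C_SKETCH,
    FIELD_S_SKETCH,
    FIELD_NONE_SKETCH
};

typedef struct ErrorLookup
{
    int             code;
    const char *    message;
    enum ErrorField field;
} ErrorLookup;

// Hook that libcli may set; libruntime never names libcli's codes.
typedef const ErrorLookup *cli_error_lookup_t(int code);
cli_error_lookup_t *cli_error_lookup = NULL;

static const ErrorLookup runtime_table[ERR_ZZZ] = {
    {ERR_FIRST_SKETCH, "first runtime problem", FIELD_NONE_SKETCH},
    {ERR_SECOND_SKETCH, "second runtime problem: %s", FIELD_S_SKETCH},
};

// Indexes the runtime table instead of switching on the code.
const ErrorLookup *error_lookup(int code)
{
    return (code >= 0 && code < ERR_ZZZ) ? &runtime_table[code] : NULL;
}

// Tries the runtime table first, then the hook if one is installed.
const ErrorLookup *lookup_any_sketch(int code)
{
    const ErrorLookup *lookup = error_lookup(code);
    if (!lookup && cli_error_lookup)
    {
        lookup = cli_error_lookup(code);
    }
    return lookup;
}

// --- libcli side ---
enum CliCode
{
    CLI_FILE_CLOSE = ERR_ZZZ, // continues numbering without a gap
    CLI_FILE_OPEN,
    CLI_ZZZ
};

static const ErrorLookup cli_table[CLI_ZZZ - ERR_ZZZ] = {
    {CLI_FILE_CLOSE, "error closing file", FIELD_NONE_SKETCH},
    {CLI_FILE_OPEN, "error opening file '%s'", FIELD_S_SKETCH},
};

const ErrorLookup *cli_lookup_sketch(int code)
{
    return (code >= ERR_ZZZ && code < CLI_ZZZ) ? &cli_table[code - ERR_ZZZ] : NULL;
}
```

In this sketch libcli would install its routine once at startup with `cli_error_lookup = cli_lookup_sketch;`. Before that assignment, lookup_any_sketch returns NULL for any CLI code, which keeps libruntime usable without libcli.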
CodeGenerator.scala: Pass relative/*.c filenames instead of absolute filenames on Windows to avoid problem running MSYS2 clang compiler on Windows. Remove "-largp" since we no longer need it on MSYS2. Reorder pickCompiler's list of compilers to prefer ${CC}, cc, clang, gcc in that order.

Runtime2DataProcessor.scala: Call executable with -o outfile before parse or unparse in CLI command lines.

CodeGeneratorState.scala: Initialize c field with 0 since we now sort Error fields alphabetically. Replace argp_program_version with daffodil_program_version since we no longer use "argp.h".

NestedUnion.[ch] -> generated_code.[ch]: Move and regenerate with code generator without renaming or manual editing to enable automated update of generated_code.[ch] examples with daffodil's C code generator in future.

ex_nums.[ch] -> generated_code.[ch]: Move and regenerate with code generator without renaming or manual editing to enable automated update of generated_code.[ch] examples with daffodil's C code generator in future.

Rat.scala: Ignore generated_code.[ch] examples since daffodil's C code generator doesn't include Apache license when generating them.
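The "initialize c field with 0" notes follow from how C initializes unions: a brace initializer without a designator sets the union's first member, so sorting the fields alphabetically changes which member that is. A minimal sketch, assuming a simplified Error struct (the real one in errors.h has more members, and the function names here are hypothetical):

```c
#include <stddef.h> // for NULL

// Simplified stand-in for the Error struct; anonymous unions need C11.
typedef struct Error
{
    int code; // hypothetical numeric error code
    union
    {
        int         c; // getopt-related option character (now first)
        const char *s; // string detail
    };
} Error;

// Without a designator, the inner braces initialize the first member, c.
Error error_without_detail_sketch(int code)
{
    Error error = {code, {0}};
    return error;
}

// A designator is required to initialize any member other than the first.
Error error_with_string_sketch(int code, const char *detail)
{
    Error error = {code, {.s = detail}};
    return error;
}
```

Writing `{code, {NULL}}` would still compile with some compilers but would convert a pointer to the int member, which is why the generated code switches to 0 for the default case and to `.s` when a string is meant.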
Apache Daffodil is an open-source implementation of the DFDL specification that uses DFDL data descriptions to parse fixed format data into an infoset. This infoset is commonly converted into XML or JSON to enable the use of well-established XML or JSON technologies and libraries to consume, inspect, and manipulate fixed format data in existing solutions. Daffodil is also capable of serializing or “unparsing” data back to the original data format. The DFDL infoset can also be converted directly to/from the data structures carried by data processing frameworks so as to bypass any XML/JSON overheads.
For more information about Daffodil, see https://daffodil.apache.org/.
See BUILD.md for more details.
SBT is the officially supported tool to build Daffodil. Below are some of the more commonly used commands for Daffodil development.
Compile source code:
sbt compile
Run unit tests:
sbt test
Run command line interface tests:
sbt IntegrationTest/test
Build the command line interface (Linux and Windows shell scripts in daffodil-cli/target/universal/stage/bin/; see the Command Line Interface documentation for details on their usage):
sbt daffodil-cli/stage
Run Apache RAT (license audit report in target/rat.txt and error if any unapproved licenses are found):
sbt ratCheck
Run sbt-scoverage (report in target/scala-ver/scoverage-report/):
sbt clean coverage test IntegrationTest/test
sbt coverageAggregate
You can ask questions on the dev@daffodil.apache.org or users@daffodil.apache.org mailing lists. You can report bugs via the Daffodil JIRA.
Apache Daffodil is licensed under the Apache License, v2.0.