Flat: Support parsing using custom grammars

Add the ability to use the flat command-line tool to specify both a
grammar file and an input which should be parsed using that grammar.
After a successful parse, the parse tree is printed to standard output.

This is what makes Flat a parser *interpreter*, rather than a parser
*generator*. There is no need for an intermediate step of compiling
generated files as with many other parsing tools like Bison/Flex, ANTLR,
JavaCC, or Stratego/XT. We can (and probably will) choose to add code
generation as an optional feature later on, but it's not a fundamental
requirement.
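
To make the interpreter/generator distinction concrete, here is a minimal
sketch of a grammar interpreter in Python. It is not Flat's implementation
and does not use Flat's grammar syntax: the expression forms ("lit",
"class", "ref", "seq", "star", "plus"), the sample GRAMMAR, and the
function names are all assumptions chosen for illustration. The point is
only that the grammar is plain data consulted at parse time, so no parser
source is generated or compiled first.

```python
# Toy grammar interpreter: the grammar is ordinary data read at run time,
# so nothing is generated or compiled before parsing. Illustrative only;
# the expression forms below are assumptions, not Flat's grammar syntax.

GRAMMAR = {
    "Sum": ("seq", ("ref", "Number"),
            ("star", ("seq", ("lit", "+"), ("ref", "Number")))),
    "Number": ("plus", ("class", "0123456789")),
}


def match(grammar, expr, text, pos):
    """Try to match expr at text[pos:]; return (tree, new_pos) or None."""
    kind = expr[0]
    if kind == "lit":
        literal = expr[1]
        if text.startswith(literal, pos):
            return ("lit", literal), pos + len(literal)
        return None
    if kind == "class":
        if pos < len(text) and text[pos] in expr[1]:
            return ("char", text[pos]), pos + 1
        return None
    if kind == "ref":
        # Look the rule up in the grammar data and match its body.
        result = match(grammar, grammar[expr[1]], text, pos)
        if result is None:
            return None
        subtree, new_pos = result
        return (expr[1], subtree), new_pos
    if kind == "seq":
        children = []
        for sub in expr[1:]:
            result = match(grammar, sub, text, pos)
            if result is None:
                return None
            child, pos = result
            children.append(child)
        return ("seq", children), pos
    if kind in ("star", "plus"):
        children = []
        while True:
            result = match(grammar, expr[1], text, pos)
            # Stop on failure, or when a match consumes no input.
            if result is None or result[1] == pos:
                break
            child, pos = result
            children.append(child)
        if kind == "plus" and not children:
            return None
        return (kind, children), pos
    raise ValueError(f"unknown expression kind: {kind!r}")


def parse(grammar, start, text):
    """Parse all of text from the start rule; raise ValueError on failure."""
    result = match(grammar, ("ref", start), text, 0)
    if result is None or result[1] != len(text):
        raise ValueError("input does not match the grammar")
    return result[0]


def format_tree(node, depth=0):
    """Render a parse tree as indented lines, one node per line."""
    label, payload = node
    pad = "  " * depth
    if isinstance(payload, str):
        return [f"{pad}{label} {payload!r}"]
    children = payload if isinstance(payload, list) else [payload]
    lines = [f"{pad}{label}"]
    for child in children:
        lines.extend(format_tree(child, depth + 1))
    return lines
```

Calling `print("\n".join(format_tree(parse(GRAMMAR, "Sum", "1+22"))))` prints
an indented parse tree. Swapping in a different GRAMMAR dictionary changes
the accepted language without touching or recompiling the interpreter.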
5 files changed
tree: 4b376c3dca5fa029dc106bf910977403a96d0c05
  1. consumers/
  2. DocFormats/
  3. Editor/
  4. experiments/
  5. sample/
  6. schemas/
  7. scripts/
  8. .gitattributes
  9. .gitignore
  10. CMakeLists.txt
  11. CorinthiaDirectoryTree.html
  12. LICENSE.txt
  13. NOTICE.txt
  14. README.md
README.md

About Apache Corinthia (incubating)

Corinthia is a library for converting between different word-processing file formats. Initially, it supports .docx (part of the OOXML specification), HTML, and LaTeX (export-only). The Corinthia project also provides convenience executables. The library has shipped as part of UX Write since February 2013.

On December 8, 2014, Corinthia entered the Apache Software Foundation incubator. The accepted proposal and incubation status provide incubation background and progress information.

The communication hub of the project is the development mailing list,

dev @ corinthia.incubator.apache.org

To receive list postings and interact on the list, simply send a message to

dev-subscribe @ corinthia.incubator.apache.org

from the email address at which you want to receive list messages. The list robot replies to that address with confirmation instructions and information on managing the subscription.

The project also has a Corinthia incubator web site, a project wiki, and a JIRA issue tracker.

The sites and documentation for this project are at a preliminary stage. Content will be moved to Apache and improved as incubation moves along.

Meanwhile, there is a Facebook page and a Twitter account, @ApacheCorinthia.

License

Corinthia is licensed under the Apache License version 2.0; see LICENSE.txt for details.

What the library can do

  1. Create new HTML files from a .docx source
  2. Create new .docx files from an HTML source
  3. Update existing .docx files based on a modified HTML file produced in (1)
  4. Convert .docx or HTML files to LaTeX
  5. Provide access to document structure, in terms of a DOM-like API for manipulating XML trees, and an object model for working with CSS stylesheets

Components

There are three major components, in their respective directories:

  • DocFormats - the library itself
  • dfutil - a driver program used for running [...]
  • automated tests (located in the tests directory)

Run dfutil without any command-line arguments to see a list of operations. Here is an example of converting a .docx file to HTML, modifying it, and then updating the original .docx. Note that it is important, due to how internal mapping works, that the .docx file being written is the same file as the original; using a new file won't work.

dfutil filename.docx filename.html
vi filename.html # Make some changes
dfutil filename.html filename.docx

If you examine the convertFile function in dfutil/Commands.c, you will see the main entry points to perform these conversions, which you can call from your own program.
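
The dfutil round trip shown above can also be scripted. The following is a minimal sketch in Python, assuming dfutil is on the PATH and exits with a non-zero status on failure (the source does not confirm the latter). The helper names roundtrip_commands and roundtrip are hypothetical. It uses only the two-argument dfutil invocation shown in the example and always writes back to the original .docx, as the note above requires.

```python
import subprocess
from pathlib import Path


def roundtrip_commands(docx_path, dfutil="dfutil"):
    """Return the two dfutil invocations for a .docx -> HTML -> .docx round trip."""
    docx = Path(docx_path)
    html = docx.with_suffix(".html")
    export_cmd = [dfutil, str(docx), str(html)]
    # The update must target the original .docx, not a new file.
    update_cmd = [dfutil, str(html), str(docx)]
    return export_cmd, update_cmd


def roundtrip(docx_path, edit_html, dfutil="dfutil"):
    """Export to HTML, let edit_html(path) modify it, then update the .docx."""
    export_cmd, update_cmd = roundtrip_commands(docx_path, dfutil)
    # check=True raises CalledProcessError on a non-zero exit status
    # (assumed here to be how dfutil signals failure).
    subprocess.run(export_cmd, check=True)
    edit_html(Path(export_cmd[2]))
    subprocess.run(update_cmd, check=True)
```

For example, `roundtrip("filename.docx", my_edit)` would run the same two commands as above, calling the hypothetical `my_edit` function on filename.html in between.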

Platforms and dependencies

Corinthia builds and runs on iOS, OS X, Linux and Windows.

To build DocFormats, you will need to have the following installed:

Build instructions

Corinthia currently builds on Linux, OS X (mac) and Windows. See the build instructions.

Contributing

Contributors are welcome and prized. Details on how to participate in the project will be posted soon.

Meanwhile, the easiest way to contribute is by subscribing to the development list and asking your questions and offering suggestions there.