|tagger||Matt Post <email@example.com>||Mon Jun 15 17:00:44 2015 -0400|
6.0.4 (June 15, 2015)
=====================

- Local MIRA implementation (now the default for the pipeline; Moses' kbmira available with '--tuner kbmira')
- PRO tuning implementation restored
- Alignment in pipeline parallelized across chunks (up to --threads)
- Better integration for class-based LMs
- Bugfixes in pipeline script and elsewhere
- Logic for KenLM/lmplz boost compile improved
|author||Matt Post <firstname.lastname@example.org>||Mon Jun 15 17:00:37 2015 -0400|
|committer||Matt Post <email@example.com>||Mon Jun 15 17:00:37 2015 -0400|
Updated for 6.0.4 release
Joshua is a statistical machine translation toolkit for both phrase-based (new in version 6.0) and syntax-based decoding. It can be run with pre-built language packs available for download, and can also be used to build models for new language pairs. Among the many features of Joshua are:
The latest release of Joshua is always linked directly from the Home Page.
Joshua 6.0 includes the following new features:
Joshua includes a number of “language packs”, which are pre-built models that allow you to use the translation system as a black box, without worrying too much about how machine translation works. You can browse the models available for download on the Joshua website.
Joshua includes a pipeline script that allows you to build new models, provided you have training data. This pipeline can be run (more or less) by invoking a single command, which handles data preparation, alignment, phrase-table or grammar construction, and tuning of the model parameters. See the documentation for a walkthrough and more information about the many available options.
Running the decoder in any form requires setting a few basic environment variables: $JAVA_HOME, $JOSHUA, and potentially others.
    export JAVA_HOME=/path/to/java   # maybe /usr/java/home
    export JOSHUA=/path/to/joshua
You might also find it helpful to set these:
    export LC_ALL=en_US.UTF-8
    export LANG=en_US.UTF-8
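A quick sanity check can catch a missing variable before the build starts. The sketch below is not part of Joshua; the function name `check_joshua_env` is hypothetical, and it only tests that `JAVA_HOME` and `JOSHUA` are non-empty, not that they point at valid installations.

```shell
#!/bin/sh
# Sketch: report which of the required variables are unset or empty.
# check_joshua_env is a hypothetical helper, not something Joshua ships.
check_joshua_env() {
  missing=""
  for name in JAVA_HOME JOSHUA; do
    # Indirect lookup of the variable named by $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "unset:$missing"
    return 1
  fi
  echo "environment ok"
}
```

One way to use it is to chain it in front of the build, for example `check_joshua_env && cd "$JOSHUA" && ant`, so that the build is skipped when a variable is missing.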
Then, compile Joshua by typing:
    cd $JOSHUA
    ant
The basic method for invoking the decoder looks like this:
    cat SOURCE | $JOSHUA/bin/joshua-decoder -m MEM -c CONFIG OPTIONS > OUTPUT
Some example usage scenarios and scripts can be found in the examples/ directory.
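The invocation pattern above can be wrapped in a small shell function so that the SOURCE, MEM, CONFIG, and OUTPUT placeholders become arguments. This is only a sketch under stated assumptions: the function name `decode` and the `DECODER` override are hypothetical, and the default launcher path `$JOSHUA/bin/joshua-decoder` is an assumption that may not match your checkout.

```shell
#!/bin/sh
# Sketch: SOURCE, MEM, CONFIG and OUTPUT become positional arguments;
# anything after the fourth argument is passed through as OPTIONS.
# Assumption: the launcher is $JOSHUA/bin/joshua-decoder. Set DECODER
# to use a different launcher.
decode() {
  if [ "$#" -lt 4 ]; then
    echo "usage: decode SOURCE MEM CONFIG OUTPUT [OPTIONS...]" >&2
    return 2
  fi
  src=$1; mem=$2; config=$3; out=$4
  shift 4
  decoder=${DECODER:-"$JOSHUA/bin/joshua-decoder"}
  if [ ! -r "$src" ]; then
    echo "cannot read source file: $src" >&2
    return 1
  fi
  # Redirecting stdin is equivalent to piping from `cat SOURCE`.
  "$decoder" -m "$mem" -c "$config" "$@" < "$src" > "$out"
}
```

A call might look like `decode input.txt 4g joshua.config output.txt`, where the file names and the memory value are placeholders rather than files shipped with Joshua.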