| An example of how to build a model using the Fisher Spanish CALLHOME corpus |
| |
| A) Download the corpus: |
| 1) mkdir $HOME/git |
| 2) cd $HOME/git |
| 3) curl -o fisher-callhome-corpus.zip https://codeload.github.com/joshua-decoder/fisher-callhome-corpus/legacy.zip/master |
| 4) unzip fisher-callhome-corpus.zip |
| 5) # Set environment variable SPANISH=$HOME/git/fisher-callhome-corpus |
| 5) mv joshua-decoder-*/ fisher-callhome-corpus |
| |
| B) Download and install Joshua: |
| 1) cd /directory/to/install/ |
| 2) git clone https://github.com/joshua-decoder/joshua.git |
| 3) cd joshua |
| 4) # Set environment variable JAVA_HOME=/path/to/java # Try $(readlink -f /usr/bin/javac | sed "s:/bin/javac::") |
| 5) # Set environment variable JOSHUA=/directory/to/install/joshua |
| 6) ant devel |
| |
| C) Train the model: |
| 1) mkdir -p $HOME/expts/joshua && cd $HOME/expts/joshua |
| 2) $JOSHUA/bin/pipeline.pl \ |
| --rundir 1 \ |
| --readme "Baseline Hiero run" \ |
| --source es \ |
| --target en \ |
| --lm-gen srilm \ |
| --witten-bell \ |
| --corpus $SPANISH/corpus/asr/callhome_train \ |
| --corpus $SPANISH/corpus/asr/fisher_train \ |
| --tune $SPANISH/corpus/asr/fisher_dev \ |
| --test $SPANISH/corpus/asr/callhome_devtest \ |
| --lm-order 3 |