A) Download and install SRILM:
1) # Go to http://www.speech.sri.com/projects/srilm/download.html, enter your information, accept the license, and download version 1.7.1.
2) sudo mkdir /usr/local/srilm
3) # Move or copy srilm.tgz to /usr/local/srilm
4) cd /usr/local/srilm
5) sudo tar xvf srilm.tgz
6) # As root, edit the top of the Makefile to set SRILM=/usr/local/srilm
7) # Set environment variable SRILM=/usr/local/srilm
8) sudo make World # This will take a while
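The SRILM environment variable mentioned above could be set like this. This is a minimal sketch that assumes a bash-like shell; adding the export line to a startup file such as ~/.bashrc is an assumption about your setup, not something these instructions require.

```shell
# Assumes a bash-like shell. The path is the install directory used in section A.
export SRILM=/usr/local/srilm

# Print the value so a typo is easy to spot.
echo "SRILM is set to: $SRILM"
```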
B) Download the corpus:
1) mkdir -p $HOME/git
2) cd $HOME/git
3) curl -o fisher-callhome-corpus.zip https://codeload.github.com/joshua-decoder/fisher-callhome-corpus/legacy.zip/master
4) unzip fisher-callhome-corpus.zip
5) mv joshua-decoder-*/ fisher-callhome-corpus
6) # Set environment variable SPANISH=$HOME/git/fisher-callhome-corpus
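Setting SPANISH and confirming that the renamed directory is where the training step expects it might look like the following. This is a sketch for a bash-like shell; the check for corpus/asr is an assumption based on the paths passed to the pipeline in section D.

```shell
# Point SPANISH at the unpacked and renamed corpus directory.
export SPANISH="$HOME/git/fisher-callhome-corpus"

# Optional sanity check: section D reads from $SPANISH/corpus/asr.
if [ -d "$SPANISH/corpus/asr" ]; then
  echo "corpus found under $SPANISH"
else
  echo "corpus not found under $SPANISH (check the unzip and mv steps)"
fi
```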
C) Download and install Joshua:
1) cd /directory/to/install/
2) git clone https://github.com/joshua-decoder/joshua.git
3) cd joshua
4) # Set environment variable JAVA_HOME=/path/to/java # Try $(readlink -f /usr/bin/javac | sed "s:/bin/javac::")
5) # Set environment variable JOSHUA=/directory/to/install/joshua
6) ant devel
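The JAVA_HOME and JOSHUA settings above need to be in place before running ant. A sketch for a bash-like shell follows. It reuses the readlink suggestion from the JAVA_HOME step, which assumes GNU readlink and a javac at /usr/bin/javac; /directory/to/install is the same placeholder used above and must be replaced with your real path.

```shell
# Derive JAVA_HOME from the javac binary, if one exists at the assumed location.
if [ -x /usr/bin/javac ]; then
  JAVA_HOME="$(readlink -f /usr/bin/javac | sed "s:/bin/javac::")"
  export JAVA_HOME
fi

# Placeholder path: substitute the directory where you cloned joshua.
export JOSHUA=/directory/to/install/joshua

echo "JAVA_HOME=${JAVA_HOME:-<not set>}"
echo "JOSHUA=$JOSHUA"
```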
D) Train the model:
1) mkdir -p $HOME/expts/joshua && cd $HOME/expts/joshua
2) $JOSHUA/bin/pipeline.pl \
--rundir 1 \
--readme "Baseline Hiero run" \
--source es \
--target en \
--lm-gen srilm \
--witten-bell \
--corpus $SPANISH/corpus/asr/callhome_train \
--corpus $SPANISH/corpus/asr/fisher_train \
--tune $SPANISH/corpus/asr/fisher_dev \
--test $SPANISH/corpus/asr/callhome_devtest \
--lm-order 3
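
The pipeline command relies on the environment variables set in sections A to C, so it can help to verify them before launching a long run. The helper below is a hypothetical sketch, not part of Joshua; the name check_var is made up for illustration. It reports whether each variable is set and points at an existing directory.

```shell
# check_var NAME: hypothetical helper, not part of Joshua or SRILM.
# Prints "ok", "missing", or "not a directory" for the named variable.
check_var() {
  # Read the value of the variable whose name is in $1 (POSIX-compatible).
  eval "val=\${$1:-}"
  if [ -z "$val" ]; then
    echo "missing: $1"
    return 1
  fi
  if [ ! -d "$val" ]; then
    echo "not a directory: $1=$val"
    return 1
  fi
  echo "ok: $1=$val"
}

# Check every variable these instructions ask you to set.
for v in SRILM SPANISH JAVA_HOME JOSHUA; do
  check_var "$v" || true
done
```

Any line that does not start with "ok:" points to a step in sections A to C that should be revisited before training.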