fixed a URL

commit: f326adb641c332c46c239d347efb49103d1b25e0
author: Luke Orland <orluke@gmail.com> Wed Jul 18 09:20:53 2012 -0400
committer: Luke Orland <orluke@gmail.com> Wed Jul 18 09:20:53 2012 -0400
tree: bdae7120165fc9d7943446d1cce1be83afa01287
parent: 1d3f4b0c9dd387b43aea53a2c046d81c82f67df1
diff --git a/4.0/step-by-step-instructions.html b/4.0/step-by-step-instructions.html
index f8fba32..ee004c2 100644
--- a/4.0/step-by-step-instructions.html
+++ b/4.0/step-by-step-instructions.html

@@ -230,7 +230,7 @@
 <p>Once you've gathered your data, you will need to do several preprocess steps: sentence alignment, tokenization, normalization, and subsampling. </p>
 
 <h3>Sentence alignment</h3>
-<p>In this exercise, we'll start with an existing sentence-aligned parallel corpus.  Download this tarball, which contains a Spanish-Engish parallel corpus, along with a dev and a test set: <a href="data.tar.gz">data.tar.gz</a> </p>
+<p>In this exercise, we'll start with an existing sentence-aligned parallel corpus.  Download this tarball, which contains a Spanish-Engish parallel corpus, along with a dev and a test set: <a href="http://cs.jhu.edu/~ccb/joshua/data.tar.gz">data.tar.gz</a> </p>
 
 <p> The data tarball contains two training directories <code>training/</code>, which includes a subset of the corpus, and <code>full-training</code>, which includes the full corpus.  I strongly recommend staring with the smaller set, and building an end-to-end system with it, since many steps take a very long time on the full data set.  You should debug on the smaller set to avoid wasting time.</p>