Fix data.tar.gz download link to use an absolute URL
diff --git a/4.0/step-by-step-instructions.html b/4.0/step-by-step-instructions.html
index f8fba32..ee004c2 100644
--- a/4.0/step-by-step-instructions.html
+++ b/4.0/step-by-step-instructions.html
@@ -230,7 +230,7 @@
 <p>Once you've gathered your data, you will need to do several preprocess steps: sentence alignment, tokenization, normalization, and subsampling. </p>
 
 <h3>Sentence alignment</h3>
-<p>In this exercise, we'll start with an existing sentence-aligned parallel corpus.  Download this tarball, which contains a Spanish-Engish parallel corpus, along with a dev and a test set: <a href="data.tar.gz">data.tar.gz</a> </p>
+<p>In this exercise, we'll start with an existing sentence-aligned parallel corpus.  Download this tarball, which contains a Spanish-Engish parallel corpus, along with a dev and a test set: <a href="http://cs.jhu.edu/~ccb/joshua/data.tar.gz">data.tar.gz</a> </p>
 
 <p> The data tarball contains two training directories <code>training/</code>, which includes a subset of the corpus, and <code>full-training</code>, which includes the full corpus.  I strongly recommend staring with the smaller set, and building an end-to-end system with it, since many steps take a very long time on the full data set.  You should debug on the smaller set to avoid wasting time.</p>