Update Indian languages page
diff --git a/indian-parallel-corpora/index.html b/indian-parallel-corpora/index.html
index 3a76575..f05d16e 100644
--- a/indian-parallel-corpora/index.html
+++ b/indian-parallel-corpora/index.html
@@ -1,136 +1,246 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
+<!DOCTYPE html>
+<html lang="en">
+ <head>
+ <meta charset="utf-8">
+ <title>Indian Languages Parallel Corpora</title>
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
+ <meta name="description" content="">
+ <meta name="author" content="">
-<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
+ <!-- Le styles -->
+ <link href="/bootstrap/css/bootstrap.css" rel="stylesheet">
+ <style>
+ body {
+ padding-top: 60px; /* 60px to make the container go all the way to the bottom of the topbar */
+ }
+ #download {
+ background-color: green;
+ font-size: 14pt;
+ font-weight: bold;
+ text-align: center;
+ color: white;
+ border-radius: 5px;
+ padding: 4px;
+ }
-<html>
+ #download a:link {
+ color: white;
+ }
-<head>
- <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
- <link rel="stylesheet" type="text/css" media="screen,print" href="../joshua.css" />
- <title>Indian Languages Parallel Corpora</title>
-</head>
+ #download a:hover {
+ color: lightgrey;
+ }
-<body>
+ #download a:visited {
+ color: white;
+ }
- <div id="content">
+ a.pdf {
+ font-variant: small-caps;
+ /* font-weight: bold; */
+ font-size: 10pt;
+ color: white;
+ background: brown;
+ padding: 2px;
+ }
- <h1>Indian Parallel Corpora <span id="download"><a href="https://github.com/joshua-decoder/indian-parallel-corpora/zipball/master">Download</a></span></h1>
- <hr />
+ a.bibtex {
+ font-variant: small-caps;
+ /* font-weight: bold; */
+ font-size: 10pt;
+ color: white;
+ background: orange;
+ padding: 2px;
+ }
- <h2>Description</h2>
- <a name="Description"/>
+ img.sponsor {
+ height: 120px;
+ margin: 5px;
+ }
+ </style>
+ <link href="bootstrap/css/bootstrap-responsive.css" rel="stylesheet">
- This page describes a set of parallel corpora between English and six languages from the Indian
- sub-continent:
+ <!-- HTML5 shim, for IE6-8 support of HTML5 elements -->
+ <!--[if lt IE 9]>
+ <script src="bootstrap/js/html5shiv.js"></script>
+ <![endif]-->
- <ul>
- <li>Bengali</li>
- <li>Hindi</li>
- <li>Malayalam</li>
- <li>Tamil</li>
- <li>Telugu</li>
- <li>Urdu</li>
- </ul>
+ <!-- Fav and touch icons -->
+ <link rel="apple-touch-icon-precomposed" sizes="144x144" href="bootstrap/ico/apple-touch-icon-144-precomposed.png">
+ <link rel="apple-touch-icon-precomposed" sizes="114x114" href="bootstrap/ico/apple-touch-icon-114-precomposed.png">
+ <link rel="apple-touch-icon-precomposed" sizes="72x72" href="bootstrap/ico/apple-touch-icon-72-precomposed.png">
+ <link rel="apple-touch-icon-precomposed" href="bootstrap/ico/apple-touch-icon-57-precomposed.png">
+ <link rel="shortcut icon" href="bootstrap/ico/favicon.png">
+ </head>
- <p>
- They can be used to train (and evaluate) models
- for <a href="http://en.wikipedia.org/wiki/Statistical_machine_translation">automatically
- translating</a> text into and out of these languages.
- They were collected by translating Indian Wikipedia articles into English
- using Amazon's Mechanical Turk. Their collection and release are described in the paper:
- </p>
+ <body>
- <blockquote>
- <i>Constructing Parallel Corpora for Six Indian Languages via Crowdsourcing</i> <br/>
- <a href="http://cs.jhu.edu/~post">Matt Post</a>, <a href="http://cs.jhu.edu/~ccb">Chris
- Callison-Burch</a>, and <a href="http://homepages.inf.ed.ac.uk/miles/">Miles Osborne</a> <br/>
- <a href="http://statmt.org/wmt12">WMT 2012</a> <br/>
- <a class="pdf" href="http://aclweb.org/anthology-new/W/W12/W12-3152.pdf">PDF</a>
- <a class="bibtex" href="http://aclweb.org/anthology-new/W/W12/W12-3152.bib">BIB</a>
- </blockquote>
-
- <h2>Download & License</h2>
-
- The Indian parallel corpora dataset
- is <a href="https://github.com/joshua-decoder/indian-parallel-corpora">hosted on Github</a>.
- You can download a tarball directly
- by <a href="https://github.com/joshua-decoder/indian-parallel-corpora/zipball/master">clicking
- here</a>. The corpus is licensed under the <a href="http://creativecommons.org/">Creative
- Commons</a> <a href="http://creativecommons.org/licenses/by-sa/3.0/">Attribution-Sharealike 3.0
- Unported License</a> (CC BY-SA 3.0).
-
- <h2>Citations</h2>
-
- <p>
- The following publications have made use of this dataset.
- </p>
-
- <ol>
- <li><b>Post, Callison-Burch, and Osborne (2012)</b> This paper introduced the parallel
- corpora, describing how the data was collected, reporting the results of prelimary
- experiments, and suggesting some potential research directions.
- </ol>
-
- <h2>Scores</h2>
-
- <p>
- Below are the best translation scores (case-insensitive BLEU-4) that have been reported on the
- provided test sets. The Google results were recorded in the fall of 2011 (and are described
- in Post et al. (2012)). Google does not have a Malayalam system.
- </p>
-
- <div>
- <table id=results>
- <tr>
- <th style="width:150px">Citation</th>
- <th>BN</th>
- <th>HI</th>
- <th>ML</th>
- <th>TA</th>
- <th>TE</th>
- <th>UR</th>
- </tr>
- <tr>
- <td colspan=7><hr/></td>
- </tr>
- <tr>
- <td class="system">Google</td>
- <td>20.01</td>
- <td>25.21</td>
- <td>–</td>
- <td>13.51</td>
- <td>16.03</td>
- <td>23.09</td>
- </tr>
- <tr>
- <td class="system">Post et al. (2012)</td>
- <td>13.53</td>
- <td>17.29</td>
- <td>13.72</td>
- <td> 9.81</td>
- <td>12.46</td>
- <td>19.53</td>
- </tr>
- <tr>
- <td class="system">Post et al. (2012)</td>
- <td>13.53</td>
- <td>17.29</td>
- <td>13.72</td>
- <td> 9.81</td>
- <td>12.46</td>
- <td>19.53</td>
- </tr>
- </table>
+ <div class="navbar navbar-inverse navbar-fixed-top">
+ <div class="navbar-inner">
+ <div class="container">
+ <button type="button" class="btn btn-navbar" data-toggle="collapse" data-target=".nav-collapse">
+ <span class="icon-bar"></span>
+ <span class="icon-bar"></span>
+ <span class="icon-bar"></span>
+ </button>
+ <a class="brand" href="#">Joshua</a>
+ <div class="nav-collapse collapse">
+ <ul class="nav">
+ <li><a href="/">Home</a></li>
+ <li class="active"><a href="index.html">Indian Languages</a></li>
+ </ul>
+ </div><!--/.nav-collapse -->
+ </div>
+ </div>
</div>
- </div>
- <div style="width: 250px; margin-top: 100px">
- <img width="250px" src="images/map1.png"/>
- <p style="clear: both; text-align: center"><b>Indo-Aryan languages</b></p>
- <img width="250px" src="images/map2.png"/>
- <p style="clear: both; text-align: center"><b>Dravidian languages</b></p>
- </div>
+ <div class="container">
-</body>
+ <div class="row">
+ <div class="span8">
+ <h1>Indian Languages Parallel Corpora</h1>
+ </div>
+ <div>
+ <p>
+ <br/>
+ <span id="download">
+ <a href="https://github.com/joshua-decoder/indian-parallel-corpora/zipball/master">Download</a>
+ </span>
+ </p>
+ </div>
+ </div>
+
+ <hr />
+
+ <div class="row">
+ <div class="span8">
+
+ <h2>Description</h2>
+
+ This page describes a set of parallel corpora between English and six languages from the
+ Indian sub-continent:
+
+ <ul>
+ <li>Bengali</li>
+ <li>Hindi</li>
+ <li>Malayalam</li>
+ <li>Tamil</li>
+ <li>Telugu</li>
+ <li>Urdu</li>
+ </ul>
+
+ <p>
+ They can be used to train (and evaluate) models
+ for <a href="http://en.wikipedia.org/wiki/Statistical_machine_translation">automatically
+ translating</a> text into and out of these languages. They were collected by
+ translating Indian Wikipedia articles into English using Amazon's Mechanical Turk.
+ Their collection and release are described in the paper:
+ </p>
+
+ <blockquote>
+ <i>Constructing Parallel Corpora for Six Indian Languages via Crowdsourcing</i> <br/>
+ <a href="http://cs.jhu.edu/~post">Matt Post</a>, <a href="http://cs.jhu.edu/~ccb">Chris
+ Callison-Burch</a>, and <a href="http://homepages.inf.ed.ac.uk/miles/">Miles
+ Osborne</a> <br/>
+ <a href="http://statmt.org/wmt12">WMT 2012</a> <br/>
+ <a class="pdf" href="http://aclweb.org/anthology-new/W/W12/W12-3152.pdf">PDF</a>
+ <a class="bibtex" href="http://aclweb.org/anthology-new/W/W12/W12-3152.bib">BIB</a>
+ </blockquote>
+
+ <h2>Download &amp; License</h2>
+
+ The Indian parallel corpora dataset
+ is <a href="https://github.com/joshua-decoder/indian-parallel-corpora">hosted on
+ GitHub</a>. You can download a zip archive directly
+ by <a href="https://github.com/joshua-decoder/indian-parallel-corpora/zipball/master">clicking
+ here</a>. The corpus is licensed under the <a href="http://creativecommons.org/">Creative
+ Commons</a> <a href="http://creativecommons.org/licenses/by-sa/3.0/">Attribution-ShareAlike
+ 3.0 Unported License</a> (CC BY-SA 3.0).
+
+ <h2>Citations</h2>
+
+ <p>
+ The following publications have made use of this dataset.
+ </p>
+
+ <ul>
+ <li><b>Post, Callison-Burch, and Osborne (2012)</b>. This paper introduced the parallel
+ corpora, describing how the data was collected, reporting the results of preliminary
+ experiments, and suggesting some potential research directions.
+ </li>
+ </ul>
+
+ <h2>Scores</h2>
+
+ <p>
+ Below are the best translation scores (case-insensitive BLEU-4) that have been
+ reported on the provided test sets. The Google results were recorded in the fall of
+ 2011 (and are described in Post et al. (2012)). At that time, Google did not have a
+ Malayalam system.
+ </p>
+
+ <div>
+ <table>
+ <tr>
+ <th style="width:150px">Citation</th>
+ <th>BN</th>
+ <th>HI</th>
+ <th>ML</th>
+ <th>TA</th>
+ <th>TE</th>
+ <th>UR</th>
+ </tr>
+ <tr>
+ <td class="system">Google</td>
+ <td>20.01</td>
+ <td>25.21</td>
+ <td>–</td>
+ <td>13.51</td>
+ <td>16.03</td>
+ <td>23.09</td>
+ </tr>
+ <tr>
+ <td class="system">Post et al. (2012)</td>
+ <td>13.53</td>
+ <td>17.29</td>
+ <td>13.72</td>
+ <td> 9.81</td>
+ <td>12.46</td>
+ <td>19.53</td>
+ </tr>
+ </table>
+ </div>
+ </div>
+
+ <div class="span4">
+ <div>
+ <img width="250" src="images/map1.png" alt="Map of Indo-Aryan languages"/>
+ <p style="text-align: center"><a href="http://en.wikipedia.org/wiki/Indo-Aryan_languages">Indo-Aryan languages</a></p>
+
+ <img width="250" src="images/map2.png" alt="Map of Dravidian languages"/>
+ <p style="text-align: center"><a href="http://en.wikipedia.org/wiki/Dravidian_languages">Dravidian languages</a></p>
+ </div>
+ </div>
+ </div>
+ </div> <!-- /container -->
+
+ <!-- Le javascript
+ ================================================== -->
+ <!-- Placed at the end of the document so the pages load faster -->
+ <script src="bootstrap/js/jquery.js"></script>
+ <script src="bootstrap/js/bootstrap-transition.js"></script>
+ <script src="bootstrap/js/bootstrap-alert.js"></script>
+ <script src="bootstrap/js/bootstrap-modal.js"></script>
+ <script src="bootstrap/js/bootstrap-dropdown.js"></script>
+ <script src="bootstrap/js/bootstrap-scrollspy.js"></script>
+ <script src="bootstrap/js/bootstrap-tab.js"></script>
+ <script src="bootstrap/js/bootstrap-tooltip.js"></script>
+ <script src="bootstrap/js/bootstrap-popover.js"></script>
+ <script src="bootstrap/js/bootstrap-button.js"></script>
+ <script src="bootstrap/js/bootstrap-collapse.js"></script>
+ <script src="bootstrap/js/bootstrap-carousel.js"></script>
+ <script src="bootstrap/js/bootstrap-typeahead.js"></script>
+
+ </body>
</html>
+