| <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd"> |
| <html> |
| <head> |
| <META http-equiv="Content-Type" content="text/html; charset=UTF-8"> |
| <meta content="Apache Forrest" name="Generator"> |
| <meta name="Forrest-version" content="0.8"> |
| <meta name="Forrest-skin-name" content="lucene"> |
| <title>i18n</title> |
| <link type="text/css" href="skin/basic.css" rel="stylesheet"> |
| <link media="screen" type="text/css" href="skin/screen.css" rel="stylesheet"> |
| <link media="print" type="text/css" href="skin/print.css" rel="stylesheet"> |
| <link type="text/css" href="skin/profile.css" rel="stylesheet"> |
| <script src="skin/getBlank.js" language="javascript" type="text/javascript"></script><script src="skin/getMenu.js" language="javascript" type="text/javascript"></script><script src="skin/fontsize.js" language="javascript" type="text/javascript"></script> |
| <link rel="shortcut icon" href="images/favicon.ico"> |
| </head> |
| <body onload="init()"> |
| <script type="text/javascript">ndeSetTextSize();</script> |
| <div id="top"> |
| <!--+ |
| |breadtrail |
| +--> |
| <div class="breadtrail"> |
| <a href="http://www.apache.org/">Apache</a> > <a href="http://lucene.apache.org/">Lucene</a> > <a href="http://lucene.apache.org/nutch/">Nutch</a><script src="skin/breadcrumbs.js" language="JavaScript" type="text/javascript"></script> |
| </div> |
| <!--+ |
| |header |
| +--> |
| <div class="header"> |
| <!--+ |
| |start group logo |
| +--> |
| <div class="grouplogo"> |
| <a href="http://lucene.apache.org/"><img class="logoImage" alt="Lucene" src="images/lucene_green_150.gif" title="Apache Lucene"></a> |
| </div> |
| <!--+ |
| |end group logo |
| +--> |
| <!--+ |
| |start Project Logo |
| +--> |
| <div class="projectlogo"> |
| <a href="http://lucene.apache.org/nutch/"><img class="logoImage" alt="Nutch" src="images/nutch-logo.gif" title="Open Source Web Search Software"></a> |
| </div> |
| <!--+ |
| |end Project Logo |
| +--> |
| <!--+ |
| |start Search |
| +--> |
| <div class="searchbox"> |
| <form action="http://search.lucidimagination.com/p:nutch" method="get" class="roundtopsmall"> |
| <input onFocus="getBlank (this, 'Search the site with Solr');" size="25" name="q" id="query" type="text" value="Search the site with Solr"> |
| <input name="Search" value="Search" type="submit"> |
| </form> |
| <div style="position: relative; top: -5px; left: -10px">Powered by <a href="http://www.lucidimagination.com" style="color: #033268">Lucid Imagination</a> |
| </div> |
| </div> |
| <!--+ |
| |end search |
| +--> |
| <!--+ |
| |start Tabs |
| +--> |
| <ul id="tabs"> |
| <li class="current"> |
| <a class="selected" href="index.html">Main</a> |
| </li> |
| <li> |
| <a class="unselected" href="http://wiki.apache.org/nutch/">Wiki</a> |
| </li> |
| <li> |
| <a class="unselected" href="http://issues.apache.org/jira/browse/Nutch">Jira</a> |
| </li> |
| </ul> |
| <!--+ |
| |end Tabs |
| +--> |
| </div> |
| </div> |
| <div id="main"> |
| <div id="publishedStrip"> |
| <!--+ |
| |start Subtabs |
| +--> |
| <div id="level2tabs"></div> |
| <!--+ |
| |end Endtabs |
| +--> |
| <script type="text/javascript"><!-- |
| document.write("Last Published: " + document.lastModified); |
| // --></script> |
| </div> |
| <!--+ |
| |breadtrail |
| +--> |
| <div class="breadtrail"> |
| |
| |
| </div> |
| <!--+ |
| |start Menu, mainarea |
| +--> |
| <!--+ |
| |start Menu |
| +--> |
| <div id="menu"> |
| <div onclick="SwitchMenu('menu_1.1', 'skin/')" id="menu_1.1Title" class="menutitle">Project</div> |
| <div id="menu_1.1" class="menuitemgroup"> |
| <div class="menuitem"> |
| <a href="index.html">News</a> |
| </div> |
| <div class="menuitem"> |
| <a href="about.html">About</a> |
| </div> |
| <div class="menuitem"> |
| <a href="credits.html">Credits</a> |
| </div> |
| <div class="menuitem"> |
| <a href="http://www.cafepress.com/nutch/">Buy Stuff</a> |
| </div> |
| </div> |
| <div onclick="SwitchMenu('menu_selected_1.2', 'skin/')" id="menu_selected_1.2Title" class="menutitle" style="background-image: url('skin/images/chapter_open.gif');">Documentation</div> |
| <div id="menu_selected_1.2" class="selectedmenuitemgroup" style="display: block;"> |
| <div class="menuitem"> |
| <a href="http://wiki.apache.org/nutch/FAQ">FAQ</a> |
| </div> |
| <div class="menuitem"> |
| <a href="http://wiki.apache.org/nutch/">Wiki</a> |
| </div> |
| <div class="menuitem"> |
| <a href="tutorial.html">Tutorial (0.7.2)</a> |
| </div> |
| <div class="menuitem"> |
| <a href="tutorial8.html">Tutorial (0.8.x)</a> |
| </div> |
| <div class="menuitem"> |
| <a href="bot.html">Robot </a> |
| </div> |
| <div class="menupage"> |
| <div class="menupagetitle">i18n</div> |
| </div> |
| <div class="menuitem"> |
| <a href="apidocs-1.0/index.html">API Docs (1.0)</a> |
| </div> |
| <div class="menuitem"> |
| <a href="apidocs-0.9/index.html">API Docs (0.9)</a> |
| </div> |
| <div class="menuitem"> |
| <a href="apidocs-0.8.x/index.html">API Docs (0.8.x)</a> |
| </div> |
| <div class="menuitem"> |
| <a href="apidocs/index.html">API Docs (0.7.2)</a> |
| </div> |
| <div class="menuitem"> |
| <a href="http://lucene.zones.apache.org:8080/hudson/job/Nutch-Nightly/ws/trunk/build/docs/api/index.html">API Docs (nightly)</a> |
| </div> |
| </div> |
| <div onclick="SwitchMenu('menu_1.3', 'skin/')" id="menu_1.3Title" class="menutitle">Resources</div> |
| <div id="menu_1.3" class="menuitemgroup"> |
| <div class="menuitem"> |
| <a href="release/">Download</a> |
| </div> |
| <div class="menuitem"> |
| <a href="nightly.html">Nightly builds</a> |
| </div> |
| <div class="menuitem"> |
| <a href="mailing_lists.html">Mailing Lists</a> |
| </div> |
| <div class="menuitem"> |
| <a href="issue_tracking.html">Issue Tracking</a> |
| </div> |
| <div class="menuitem"> |
| <a href="version_control.html">Version Control</a> |
| </div> |
| </div> |
| <div onclick="SwitchMenu('menu_1.4', 'skin/')" id="menu_1.4Title" class="menutitle">Related Projects</div> |
| <div id="menu_1.4" class="menuitemgroup"> |
| <div class="menuitem"> |
| <a href="http://lucene.apache.org/java/">Lucene Java</a> |
| </div> |
| <div class="menuitem"> |
| <a href="http://lucene.apache.org/hadoop/">Hadoop</a> |
| </div> |
| <div class="menuitem"> |
| <a href="http://incubator.apache.org/solr/">Solr</a> |
| </div> |
| </div> |
| <div id="credit"></div> |
| <div id="roundbottom"> |
| <img style="display: none" class="corner" height="15" width="15" alt="" src="skin/images/rc-b-l-15-1body-2menu-3menu.png"></div> |
| <!--+ |
| |alternative credits |
| +--> |
| <div id="credit2"></div> |
| </div> |
| <!--+ |
| |end Menu |
| +--> |
| <!--+ |
| |start content |
| +--> |
| <div id="content"> |
| <div title="Portable Document Format" class="pdflink"> |
| <a class="dida" href="i18n.pdf"><img alt="PDF -icon" src="skin/images/pdfdoc.gif" class="skin"><br> |
| PDF</a> |
| </div> |
| <h1>i18n</h1> |
| <div id="minitoc-area"> |
| <ul class="minitoc"> |
| <li> |
| <a href="#Getting+Started">Getting Started</a> |
| </li> |
| <li> |
| <a href="#Page+Header">Page Header</a> |
| </li> |
| <li> |
| <a href="#Static+Page+Content">Static Page Content</a> |
| </li> |
| <li> |
| <a href="#Dynamic+Page+Content">Dynamic Page Content</a> |
| </li> |
| <li> |
| <a href="#Generating+Static+Pages">Generating Static Pages</a> |
| </li> |
| <li> |
| <a href="#Testing+Dynamic+Pages">Testing Dynamic Pages</a> |
| </li> |
| </ul> |
| </div> |
| |
| |
| <p>The Nutch search pages are easy to internationalize.</p> |
| |
| |
| <p>For each language, there are three kinds of things which must be |
| translated:</p> |
| |
| |
| <ol> |
| |
| |
| <li> |
| <b>page header</b>: This is a list of anchors included at the top of |
| every page.</li> |
| |
| |
| <li> |
| <b>static pages</b>: These include the "about" page, the "search" |
| page and the "help" page.</li> |
| |
| |
| <li> |
| <b>dynamic page text</b>: These are strings used when constructing |
| search result pages.</li> |
| |
| |
| </ol> |
| |
| |
| <p>Each of the above is described in more detail below.</p> |
| |
| |
| <a name="N10028"></a><a name="Getting+Started"></a> |
| <h2 class="h3">Getting Started</h2> |
| <div class="section"> |
| <p>The things to translate are:</p> |
| <ol> |
| |
| <li>the page header (<tt>src/web/include/<i>lang</i>/header.xml</tt>)</li> |
| |
| <li>the "about" page (<tt>src/web/pages/<i>lang</i>/about.xml</tt>)</li> |
| |
| <li>the "search" page (<tt>src/web/pages/<i>lang</i>/search.xml</tt>)</li> |
| |
| <li>the "help" page (<tt>src/web/pages/<i>lang</i>/help.xml</tt>)</li> |
| |
| <li>text for search results (<tt>src/web/locale/org/nutch/jsp/search_<i>lang</i>.properties</tt>)</li> |
| |
| </ol> |
| <p>If you'd like to provide a translation, simply post translations of |
| these five files to <a href="mailto:nutch-dev@lucene.apache.org">nutch-dev@lucene.apache.org</a> |
| as attachments.</p> |
| </div> |
| |
| |
| <a name="N10062"></a><a name="Page+Header"></a> |
| <h2 class="h3">Page Header</h2> |
| <div class="section"> |
| <p>The Nutch page header is included at the top of every page.</p> |
| <p>The header is filed as |
| <tt>src/web/include/<i>language</i>/header.xml</tt> where |
| <i>language</i> is the <a href="http://ftp.ics.uci.edu/pub/ietf/http/related/iso639.txt">ISO 639</a> |
| language code.</p> |
| <p>The format of the header file is:</p> |
| <pre> |
| &lt;header-menu&gt; |
|   &lt;item&gt; ... &lt;/item&gt; |
|   &lt;item&gt; ... &lt;/item&gt; |
| &lt;/header-menu&gt; |
| </pre> |
| <p>Each item typically includes an HTML anchor, one for each of the |
| top-level pages in the translation.</p> |
| <p>For example, the header file for an English translation is filed |
| as <a href="http://svn.apache.org/repos/asf/lucene/nutch/trunk/src/web/include/en/header.xml"><tt>src/web/include/en/header.xml</tt></a>.</p> |
| </div> |
| |
| |
| <a name="N1008C"></a><a name="Static+Page+Content"></a> |
| <h2 class="h3">Static Page Content</h2> |
| <div class="section"> |
| <p>Static pages make up most of the Nutch website and are also used |
| for project documentation. They are HTML files generated from XML |
| files by XSLT, which adds a standard header and footer and, |
| optionally, a menu of sub-pages.</p> |
| <p>Static page content is filed as |
| <tt>src/web/pages/<i>language</i>/<i>page</i>.xml</tt> where |
| <i>language</i> is the ISO 639 language code, as above, and <i>page</i> |
| determines the name of the page generated: |
| <tt>docs/<i>page</i>.html</tt>.</p> |
| <p>The format of a static page xml file is:</p> |
| <pre> |
| &lt;page&gt; |
|   &lt;title&gt; ... &lt;/title&gt; |
|   &lt;menu&gt; |
|     &lt;item&gt; ... &lt;/item&gt; |
|     &lt;item&gt; ... &lt;/item&gt; |
|   &lt;/menu&gt; |
|   &lt;body&gt; ... &lt;/body&gt; |
| &lt;/page&gt; |
| </pre> |
| <p>The <tt>&lt;menu&gt;</tt> element is optional.</p> |
| <p>Note that if you use an encoding other than UTF-8 (the default for |
| XML data), you must declare it in the file's XML declaration. Also, if |
| you use HTML entities in your data, you'll need to declare these too. |
| Look at existing translations for examples of both.</p> |
| <p>For example, the English language "about" page is filed |
| as <a href="http://svn.apache.org/repos/asf/lucene/nutch/trunk/src/web/pages/en/about.xml"><tt>src/web/pages/en/about.xml</tt></a>.</p> |
| </div> |
| |
| |
| <a name="N100C1"></a><a name="Dynamic+Page+Content"></a> |
| <h2 class="h3">Dynamic Page Content</h2> |
| <div class="section"> |
| <p>JavaServer Pages (JSP) are used to generate Nutch search results and |
| a few other dynamic pages (cached content, score explanations, etc.).</p> |
| <p>These use Java's <a href="http://java.sun.com/j2se/1.4.2/docs/api/java/util/Locale.html">Locale</a> |
| mechanism for internationalization. For each page/language pair, |
| there is a Java property file containing the translated text of that |
| page.</p> |
| <p>These property files are filed as |
| <tt>src/web/locale/org/nutch/jsp/<i>page</i>_<i>language</i>.properties</tt> |
| where <i>page</i> is the name of the JSP page in <a href="http://svn.apache.org/repos/asf/lucene/nutch/trunk/src/web/jsp/"><tt>src/web/jsp/</tt></a> |
| and <i>language</i> is the ISO 639 language code, as above.</p> |
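| <p>As a rough illustration of this naming scheme, the sketch below asks |
| Java's <tt>ResourceBundle.Control</tt> which resource it would look up for |
| a given locale. The base name <tt>org.nutch.jsp.search</tt> and the class |
| name <tt>BundleNames</tt> are assumptions inferred from the directory |
| layout, not taken from the Nutch JSP sources.</p> |

```java
import java.util.Locale;
import java.util.ResourceBundle;

public class BundleNames {
    // Hypothetical base name, inferred from the directory layout described above.
    static final String BASE_NAME = "org.nutch.jsp.search";

    /** Returns the classpath resource name that would hold the text for a locale. */
    public static String resourceFor(Locale locale) {
        ResourceBundle.Control control =
            ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_PROPERTIES);
        // e.g. "org.nutch.jsp.search" + "_" + language code
        String bundleName = control.toBundleName(BASE_NAME, locale);
        // dots become slashes and the ".properties" suffix is appended
        return control.toResourceName(bundleName, "properties");
    }

    public static void main(String[] args) {
        System.out.println(resourceFor(Locale.ENGLISH));
        System.out.println(resourceFor(Locale.GERMAN));
    }
}
```

| <p>Run as written, this prints <tt>org/nutch/jsp/search_en.properties</tt> |
| and <tt>org/nutch/jsp/search_de.properties</tt>, which matches the |
| <i>page</i>_<i>language</i> pattern relative to <tt>src/web/locale/</tt>.</p> |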
| <p>For example, text for the English language search results page is filed |
| as <a href="http://svn.apache.org/repos/asf/lucene/nutch/trunk/src/web/locale/org/nutch/jsp/search_en.properties"><tt>src/web/locale/org/nutch/jsp/search_en.properties</tt></a>. |
| This contains something like:</p> |
| <pre> |
| title = search results |
| search = Search |
| hits = Hits &lt;b&gt;{0}-{1}&lt;/b&gt; (out of {2} total matching documents): |
| cached = cached |
| explain = explain |
| anchors = anchors |
| next = Next |
| </pre> |
| <p>Each entry corresponds to a text fragment on the search results |
| page. The "hits" entry uses Java's <a href="http://java.sun.com/j2se/1.4.2/docs/api/java/text/MessageFormat.html">MessageFormat</a>.</p> |
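| <p>To show how the numbered placeholders in the "hits" entry are filled |
| in, here is a minimal sketch using <tt>java.text.MessageFormat</tt>. The |
| class name <tt>HitsMessage</tt> and the argument order (first hit, last |
| hit, total) are assumptions for illustration; the actual JSP may pass |
| its values differently.</p> |

```java
import java.text.MessageFormat;
import java.util.Locale;

public class HitsMessage {
    // Pattern copied from the "hits" entry of the English property file.
    static final String PATTERN =
        "Hits <b>{0}-{1}</b> (out of {2} total matching documents):";

    /** Substitutes the three values for {0}, {1} and {2} in the pattern. */
    public static String format(long start, long end, long total) {
        MessageFormat message = new MessageFormat(PATTERN, Locale.ENGLISH);
        return message.format(new Object[] { start, end, total });
    }

    public static void main(String[] args) {
        System.out.println(format(1, 10, 345));
    }
}
```

| <p>This prints <tt>Hits &lt;b&gt;1-10&lt;/b&gt; (out of 345 total matching documents):</tt>. |
| Numbers are formatted according to the locale, so larger totals may |
| gain grouping separators. A translation can reorder the placeholders |
| freely, since each one is identified by its index.</p> |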
| <p>Note that property files must use the ISO 8859-1 encoding, with |
| Unicode escapes for any character outside it. If you author them in a |
| different encoding, please use Java's <tt>native2ascii</tt> tool to |
| convert them to this encoding.</p> |
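| <p>For readers unfamiliar with Unicode escapes, the sketch below shows |
| the kind of transformation involved. It is a simplified illustration, |
| not a replacement for <tt>native2ascii</tt>: the class name |
| <tt>UnicodeEscaper</tt> is hypothetical, and it escapes every non-ASCII |
| character, which is stricter than ISO 8859-1 requires.</p> |

```java
public class UnicodeEscaper {
    /** Replaces each non-ASCII character with a backslash, 'u' and four hex digits. */
    public static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x7F) {
                // Outside ASCII: emit the UTF-16 code unit as an escape.
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A made-up German entry with one non-ASCII character (a-umlaut).
        System.out.println(escape("next = N\u00e4chste"));
    }
}
```

| <p>The escaped text contains only ASCII characters, so it reads the same |
| whether the file is treated as ISO 8859-1 or as UTF-8.</p> |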
| </div> |
| |
| |
| <a name="N10100"></a><a name="Generating+Static+Pages"></a> |
| <h2 class="h3">Generating Static Pages</h2> |
| <div class="section"> |
| <p>To generate the static pages you must have <a href="http://java.sun.com/j2se/downloads.html">Java</a>, <a href="http://ant.apache.org/">Ant</a> and Nutch installed. To |
| install Nutch, either download and unpack the latest <a href="http://lucene.apache.org/nutch/release/nightly/">release</a>, or check it |
| out from <a href="version_control.html">Subversion</a>.</p> |
| <p>Then give the command:</p> |
| <pre> |
| ant generate-docs |
| </pre> |
| <i>This documentation needs more detail. Could someone |
| please submit a list of the actual steps required here?</i> |
| <p>Once this is working, try adding directories and files to make your |
| own translation of the header and a few of the static pages.</p> |
| </div> |
| |
| |
| <a name="N10125"></a><a name="Testing+Dynamic+Pages"></a> |
| <h2 class="h3">Testing Dynamic Pages</h2> |
| <div class="section"> |
| <p>To test the dynamic pages you must also have <a href="http://jakarta.apache.org/tomcat/">Tomcat</a> installed.</p> |
| <p>An index is also required. You can build your own by working |
| through the <a href="http://lucene.apache.org/nutch/tutorial.html">tutorial</a>. |
| Once you have an index, follow the steps outlined at the end of the |
| tutorial for searching.</p> |
| <i>This documentation needs more detail. Could someone |
| please submit a list of the actual steps required here?</i> |
| </div> |
| |
| |
| </div> |
| <!--+ |
| |end content |
| +--> |
| <div class="clearboth"> </div> |
| </div> |
| <div id="footer"> |
| <!--+ |
| |start bottomstrip |
| +--> |
| <div class="lastmodified"> |
| <script type="text/javascript"><!-- |
| document.write("Last Published: " + document.lastModified); |
| // --></script> |
| </div> |
| <div class="copyright"> |
| Copyright © |
| 2006 <a href="http://www.apache.org/licenses/">The Apache Software Foundation.</a> |
| </div> |
| <!--+ |
| |end bottomstrip |
| +--> |
| </div> |
| </body> |
| </html> |