<html><head>
<title>Thesaurus Development</title>
<meta HTTP-EQUIV="content-type" CONTENT="text/html; charset=UTF-8">
</head>
<body>
<h2>Lingucomponent Sub-Project: Thesaurus Development </h2>
<p>The goal of this project is to improve existing thesauri for OpenOffice.org
and to create new thesauri for languages that don't have one yet.</p>
<p>The project began by locating a synonym list for English (US) whose
license was compatible with OpenOffice.org licensing, then using that list
and some simple software to build a thesaurus for OpenOffice.org 1.x.
OpenOffice.org 2.x now uses a thesaurus built automatically from
<a href="http://wordnet.princeton.edu">WordNet</a> data.
The internal file format has also changed to a text-based one.</p>
<h4>TODO</h4>
<ul>
<li><a href="http://lingucomponent.openoffice.org/issues/buglist.cgi?Submit+query=Submit+query&amp;component=lingucomponent&amp;subcomponent=thesaurus&amp;issue_status=UNCONFIRMED&amp;issue_status=NEW&amp;issue_status=STARTED&amp;issue_status=REOPENED&amp;email1=&amp;emailtype1=exact&amp;emailassigned_to1=1&amp;email2=&amp;emailtype2=exact&amp;emailreporter2=1&amp;issueidtype=include&amp;issue_id=&amp;changedin=&amp;votes=&amp;chfieldfrom=&amp;chfieldto=Now&amp;chfieldvalue=&amp;short_desc=&amp;short_desc_type=substring&amp;long_desc=&amp;long_desc_type=substring&amp;issue_file_loc=&amp;issue_file_loc_type=substring&amp;status_whiteboard=&amp;status_whiteboard_type=substring&amp;keywords=&amp;keywords_type=anytokens&amp;field0-0-0=noop&amp;type0-0-0=noop&amp;value0-0-0=&amp;cmdtype=doit&amp;order=Reuse+same+sort+as+last+time">See
the list of all open thesaurus issues</a></li>
<li>Create new thesauri (see below)</li>
</ul>
<h4>Downloads</h4>
<ul>
<li><a href="MyThes-1.zip">MyThes-1.zip (4.5 MB)</a> - standalone version of the MyThes thesaurus code.
It includes an en_US thesaurus in the new format for OOo 2.0 (but not yet the WordNet-based
thesaurus).</li>
<li><a href="http://www.danielnaber.de/wn2ooo">wn2ooo</a>, the script used to create the OOo
thesaurus from WordNet data.</li>
</ul>
<h4>Creating a new thesaurus</h4>
<p>If you are willing to maintain a website to collect and coordinate a
community-developed synonym list for any language, we need your help. Please send an e-mail to
dev@lingucomponent.openoffice.org describing your skills and your interest in
this project. <a href="http://www.openthesaurus.de">OpenThesaurus</a> is web-based software
for building a new thesaurus; it is already used successfully
to maintain the German, Polish, and other thesauri. All you need is some knowledge of
MySQL and Java-enabled server space to run your own instance of OpenThesaurus.</p>
<hr />
<br />
</body>
</html>