<html>
<head>
<title>Apache Nutch</title>
</head>
<body>
<p>Apache Nutch 2.X is a branch of the Apache Nutch open
source web-search software project. It builds on Apache Gora for data
persistence and Apache Solr for indexing, adding web-specifics such as
a crawler, a link-graph database, and parsing support handled by Apache
Tika for HTML and an array of other document formats.</p>
</body>
</html>