1. ed7b661 Merge pull request #748 from sebastian-nagel/NUTCH-2883-docker by Sebastian Nagel · 2 weeks ago [master]
  2. 7c1a48c NUTCH-2883 Provide means to run server and webapp as persistent services in Docker container by Sebastian Nagel · 5 weeks ago
  3. 0bda1bd NUTCH-2883 Provide means to run server and webapp as persistent services in Docker container by Sebastian Nagel · 1 year ago
  4. 989c2ca NUTCH-2883 Provide means to run server and webapp as persistent services in Docker container by Lewis John McGibbney · 1 year, 3 months ago
  5. 85f7bcb Prepare for new development after release of 1.19 by Sebastian Nagel · 3 weeks ago
  6. 27cf929 Nutch 1.19 release by Sebastian Nagel · 5 weeks ago
  7. ffe0598 NUTCH-2969 Javadoc: Javascript search is not working when built on JDK 11 by Sebastian Nagel · 5 weeks ago
  8. 635ef2f Merge pull request #747 from sebastian-nagel/NUTCH-2963-upgrade-dependencies by Sebastian Nagel · 5 weeks ago
  9. bca5fc0 NUTCH-2795 CrawlDbReader: compress CrawlDb dumps if configured by Sebastian Nagel · 6 weeks ago
  10. bec577d NUTCH-2863 Injector to parse command-line flags case-insensitive by Sebastian Nagel · 6 weeks ago
  11. 0442562 NUTCH-2963 Upgrade dependencies before release of 1.19 by Sebastian Nagel · 6 weeks ago
  12. 59f7865 NUTCH-2843 Duplicate declaration of dependencies in ivy.xml by Sebastian Nagel · 6 weeks ago
  13. 148c8f8 NUTCH-2963 Upgrade dependencies before release of 1.19 by Sebastian Nagel · 6 weeks ago
  14. ef7c102 NUTCH-2963 Upgrade dependencies before release of 1.19 by Sebastian Nagel · 6 weeks ago
  15. 0c28398 NUTCH-2963 Upgrade dependencies before release of 1.19 by Sebastian Nagel · 6 weeks ago
  16. cdc67c9 NUTCH-2963 Upgrade dependencies before release of 1.19 by Sebastian Nagel · 6 weeks ago
  17. 3199dee NUTCH-2963 Upgrade dependencies before release of 1.19 by Sebastian Nagel · 6 weeks ago
  18. 6f4c80b NUTCH-2962 Update and complete package info of protocol plugins by Sebastian Nagel · 6 weeks ago
  19. 7e969ea NUTCH-2930 Protocol-okhttp: implement IP filter (#736) by Sebastian Nagel · 6 weeks ago
  20. 05afebd Merge pull request #743 from sebastian-nagel/NUTCH-2290-update-licenses by Sebastian Nagel · 6 weeks ago
  21. c0f723e NUTCH-2957 indexer-solr / Solr schema.xml by Sebastian Nagel · 7 weeks ago
  22. edebfe4 NUTCH-2955 indexer-solr: replace deprecated/removed field type solr.LatLonType by Sebastian Nagel · 7 weeks ago
  23. a5a6300 Merge pull request #729 from sebastian-nagel/NUTCH-2947-keep-stateful-fetch-queues by Sebastian Nagel · 6 weeks ago
  24. 82f9530 Merge pull request #697 from sebastian-nagel/NUTCH-2896-okhttp-connection-pool by Sebastian Nagel · 6 weeks ago
  25. b7b8345 NUTCH-2958 Upgrade to crawler-commons 1.3 (#740) by Sebastian Nagel · 7 weeks ago
  26. 957d460 NUTCH-2290 Update licenses of bundled libraries by Sebastian Nagel · 7 weeks ago
  27. 9a59ec9 NUTCH-2290 Update licenses of bundled libraries by Sebastian Nagel · 7 weeks ago
  28. 78f6f40 NUTCH-2290 Update licenses of bundled libraries by Sebastian Nagel · 7 weeks ago
  29. 2fbd309 NUTCH-2290 Update licenses of bundled libraries by Sebastian Nagel · 7 weeks ago
  30. 1d1eb63 NUTCH-2290 Update licenses of bundled libraries by Sebastian Nagel · 7 weeks ago
  31. a107131 NUTCH-2290 Update licenses of bundled libraries by Sebastian Nagel · 7 weeks ago
  32. eba8f38 NUTCH-2290 Update licenses of bundled libraries by Sebastian Nagel · 7 weeks ago
  33. ddca1c2 NUTCH-2822 Split the LICENSE.txt file into two files for source resp. binary releases by Sebastian Nagel · 7 weeks ago
  34. 1aec06f Upgrade to Apache Rat 0.14 (download of Rat 0.13 failed) by Sebastian Nagel · 7 weeks ago
  35. dfe430b NUTCH-2861 Remove parse-swf by Sebastian Nagel · 7 weeks ago
  36. 8fc4f17 NUTCH-2956 index-geoip: dependency upgrades and improvements by Sebastian Nagel · 7 weeks ago
  37. 01ab00b NUTCH-2953 Indexer Elastic to ignore SSL issues by Sebastian Nagel · 7 weeks ago
  38. e71841f NUTCH-2952 Upgrade core dependencies by Sebastian Nagel · 3 months ago
  39. 487110b NUTCH-2936 Early registration of URL stream handlers provided by plugins may fail Hadoop jobs by Sebastian Nagel · 3 months ago
  40. 1f5f3e4 NUTCH-2936 Early registration of URL stream handlers provided by plugins may fail Hadoop jobs running in distributed mode if protocol-okhttp is used by Sebastian Nagel · 4 months ago
  41. 03e0ffd NUTCH-2936 Early registration of URL stream handlers provided by plugins may fail Hadoop jobs running in distributed mode if protocol-okhttp is used by Sebastian Nagel · 4 months ago
  42. 5b970ff NUTCH-2951 Crawl datum with metadata WRITABLE_GENERATE_TIME_KEY awaits fetching forever by Sebastian Nagel · 4 months ago
  43. 467e591 NUTCH-2896 Protocol-okhttp: make connection pool configurable by Sebastian Nagel · 1 year ago
  44. af44bcb NUTCH-2896 Protocol-okhttp: make connection pool configurable by Sebastian Nagel · 1 year ago
  45. 47d3fe6 Merge pull request #731 from sebastian-nagel/NUTCH-2950-update-hostdb-performance by Sebastian Nagel · 4 months ago
  46. 02dca3b NUTCH-2936 Early registration of URL stream handlers provided by plugins may fail Hadoop jobs running in distributed mode (#726) by Lewis John McGibbney · 4 months ago
  47. 947e67b NUTCH-2950 Improve performance of UpdateHostDb - fix Javadoc errors / warnings by Sebastian Nagel · 4 months ago
  48. bafa752 Fail javadoc build on all kinds of javadoc errors and warnings by Sebastian Nagel · 4 months ago
  49. 5086958 NUTCH-2950 Improve performance of UpdateHostDb by Sebastian Nagel · 5 months ago
  50. 13f8504 Improve performance of UpdateHostDb by Sebastian Nagel · 5 months ago
  51. 417dee6 NUTCH-2950 Improve performance of UpdateHostDb by Sebastian Nagel · 5 months ago
  52. 5a6ac3b NUTCH-2950 Improve performance of UpdateHostDb by Sebastian Nagel · 5 months ago
  53. 70b2d5e NUTCH-2950 Improve performance of UpdateHostDb by Sebastian Nagel · 5 months ago
  54. 8cfa53f NUTCH-2947 Fetcher: keep state of empty but stateful fetch queues by Sebastian Nagel · 5 months ago
  55. c862d24 NUTCH-2947 Fetcher: keep state of empty but stateful fetch queues by Sebastian Nagel · 8 months ago
  56. bdbe7b3 NUTCH-2946 Fetcher: optionally slow down fetching from hosts with repeated exceptions by Sebastian Nagel · 5 months ago
  57. 42ae2a3 NUTCH-2946 Fetcher: slow down fetching from hosts where requests fail repeatedly by Sebastian Nagel · 9 months ago
  58. 568993b NUTCH-2948 Upgrade dependencies to Any23 2.7 and Tika 2.3.0 by Sebastian Nagel · 5 months ago
  59. f8967c4 NUTCH-2923: Added JobId in Job Failure logs (#721) by Prakhar Chaube · 8 months ago
  60. f691bae NUTCH-2573 Suspend crawling if robots.txt fails to fetch with 5xx status (#724) by Sebastian Nagel · 8 months ago
  61. d565f45 NUTCH-2935 DeduplicationJob: failure on URLs with invalid percent encoding by Sebastian Nagel · 9 months ago
  62. 847e19d NUTCH-2919 Upgrade to Tika 2.2.1 and Any23 2.6 (#717) by Lewis John McGibbney · 8 months ago
  63. f4ce845 Merge pull request #722 from sebastian-nagel/NUTCH-2929-fetcher-threads-slow-start by Sebastian Nagel · 9 months ago
  64. 34e7b03 NUTCH-2929 Fetcher: start threads slowly to avoid that resources are temporarily exhausted by Sebastian Nagel · 9 months ago
  65. 78e827a Merge pull request #703 from sebastian-nagel/NUTCH-2903-indexer-elastic-https by Sebastian Nagel · 9 months ago
  66. e76d69f NUTCH-2429 Fix Plugin System to allow protocol plugins to bundle their URLStreamHandlers (#720) by Lewis John McGibbney · 9 months ago
  67. 1d9ebf1 Upgrade to log4j 2.17.0 (#719) by Sebastian Nagel · 9 months ago
  68. 938e2dd NUTCH-2917 Remove transitive dependency to log4j 1.x (#718) by Sebastian Nagel · 9 months ago
  69. a9b50a7 NUTCH-2449 Replace Tika LanguageIdentifier in language-identifier (#716) by Lewis John McGibbney · 9 months ago
  70. bb3d8bc NUTCH-2914 nutch-default.xml: remove obsolete and unused properties (#709) by Sebastian Nagel · 9 months ago
  71. 3fe291b NUTCH-2807 SitemapProcessor to warn that ignoring robots.txt affects detection of sitemaps (#710) by Sebastian Nagel · 9 months ago
  72. af29192 Merge pull request #711 from sebastian-nagel/NUTCH-2808 by Sebastian Nagel · 9 months ago
  73. 4caa5ce NUTCH-2918 Upgrade to log4j 2.16.0 (#715) by Sebastian Nagel · 9 months ago
  74. dc6e320 NUTCH-2916 Fix log file rotation / rename default log file (#714) by Sebastian Nagel · 10 months ago
  75. 9671b64 Merge pull request #713 from sebastian-nagel/NUTCH-2915 by Sebastian Nagel · 10 months ago
  76. 0c9971d NUTCH-2915 Upgrade to log4j 2.15.0 by Sebastian Nagel · 10 months ago
  77. 68bd4b5 Update documentation of protocol-related properties in nutch-default.xml by Sebastian Nagel · 3 years, 3 months ago
  78. dc6f78b NUTCH-2808 Document side effects of ignoring robots.txt by Sebastian Nagel · 10 months ago
  79. 9a2f94f Merge pull request #539 from lewismc/NUTCH-2803 by Sebastian Nagel · 10 months ago
  80. b2ccbc9 Merge branch 'master' into NUTCH-2803 by Sebastian Nagel · 10 months ago
  81. a62168c Merge pull request #708 from prakharchaube/NUTCH-2911 by Sebastian Nagel · 10 months ago
  82. 6662081 NUTCH-2911: Added InterruptedException to throws to allow propagation by prakharchaube · 10 months ago
  83. dd27044 Merge pull request #704 from sebastian-nagel/NUTCH-2905-index-writers-logging-mask-credentials by Sebastian Nagel · 10 months ago
  84. 02cd13c Merge pull request #707 from sebastian-nagel/NUTCH-2908 by Sebastian Nagel · 10 months ago
  85. 671f904 Merge pull request #700 from sebastian-nagel/NUTCH-2891-tika-2.1 by Sebastian Nagel · 10 months ago
  86. f7705b9 NUTCH-2911: Caught and added log for InterruptedException by prakharchaube · 10 months ago
  87. 621c884 NUTCH-2891 Upgrade to Tika 2.1.0 by Sebastian Nagel · 10 months ago
  88. 511e4a9 fix for NUTCH-2911 contributed by prakharchaube by prakharchaube · 10 months ago
  89. 0dc6959 NUTCH-2908 Log mapreduce job messages and counters in local mode (Log4j2) by Sebastian Nagel · 10 months ago
  90. ff800c5 Merge pull request #705 from sebastian-nagel/NUTCH-2867 by Sebastian Nagel · 10 months ago
  91. ebf3036 NUTCH-2867 Support for custom HostDb aggregators - complete Javadoc by Sebastian Nagel · 10 months ago
  92. 75daf3e Merge pull request #706 from sebastian-nagel/NUTCH-2865 by Sebastian Nagel · 10 months ago
  93. 64fb604 Merge pull request #695 from lewismc/NUTCH-2892 by Sebastian Nagel · 10 months ago
  94. 5f6f627 NUTCH-2867 Support for custom HostDb aggregators by Sebastian Nagel · 10 months ago
  95. 1cff230 NUTCH-2892 Upgrade to Any23 2.5 by Sebastian Nagel · 10 months ago
  96. 25ccf89 Merge pull request #702 from sebastian-nagel/NUTCH-2904-crawler-commons-1.2 by Sebastian Nagel · 10 months ago
  97. 6bb30c7 NUTCH-2865 WARC exporter support for metadata and dropping empty responses by Sebastian Nagel · 10 months ago
  98. 9909a61 NUTCH-2867 Support for custom HostDb aggregators - rename aggregator interface by Sebastian Nagel · 10 months ago
  99. ad44f55 NUTCH-2867 Support for custom HostDb aggregators by Sebastian Nagel · 10 months ago
  100. 1e7eb52 NUTCH-2867 Support for custom HostDb aggregators (patch contributed by markus) by Sebastian Nagel · 10 months ago