  1. 2009e0e NUTCH-3167 Upgrade to Hadoop 3.5.0 (#911) by Lewis John McGibbney · 5 days ago master
  2. 31c44b2 NUTCH-3163 Integrate Apache Yetus' pre-commit patch testing into Nutch GitHub Continuous Integration (#907) by Lewis John McGibbney · 3 weeks ago
  3. e47cfd5 parse: fix 'occured' -> 'occurred' typo in ParseStatus FAILED javadoc (#910) by Sai Asish Y · 4 weeks ago
  4. 3b663de NUTCH-3165 Remove the Nutch web service (#908) by Lewis John McGibbney · 4 weeks ago
  5. df62fa1 NUTCH-3168 Sandbox Commons JEXL usage in crawl and index pipelines (#909) by Lewis John McGibbney · 4 weeks ago
  6. 96552ae NUTCH-2932 Create OpenAPI specification for Nutch 1.x REST API (#896) by Lewis John McGibbney · 4 weeks ago
  7. da55708 [NUTCH-3160] Remove System.exit(..) from reusable code (#903) by Luca · 3 months ago
  8. eed3445 NUTCH-3085 Augment CI by adding code coverage and code quality reporting (#902) by Lewis John McGibbney · 3 months ago
  9. f1d3e8a NUTCH-3085 Augment CI by adding code coverage and code quality reporting (#901) by Lewis John McGibbney · 3 months ago
  10. 89e6ec1 NUTCH-3085 Augment CI by adding code coverage and code quality reporting (#900) by Lewis John McGibbney · 3 months ago
  11. 9f3bb41 Bump SonarSource/sonarqube-scan-action from 5 to 6 in /.github/workflows (#899) by dependabot[bot] · 3 months ago
  12. 21fd780 NUTCH-3085 Augment CI by adding code coverage and code quality reporting (#898) by Lewis John McGibbney · 3 months ago
  13. 39cfc61 NUTCH-3085 Augment CI by adding code coverage and code quality reporting (#897) by Lewis John McGibbney · 3 months ago
  14. 64ac8b4 NUTCH-3154 Implement integration testing framework for Nutch IndexWriter plugins using Testcontainers (#895) by Lewis John McGibbney · 3 months ago
  15. ceb23e1 NUTCH-3145 Upgrade to JUnit 6 (#883) by Lewis John McGibbney · 3 months ago
  16. 4337927 Merge pull request #825 from lewismc/NUTCH-3064 by Sebastian Nagel · 3 months ago
  17. 674914c Prepare for new development after release of 1.22 by Sebastian Nagel · 3 months ago
  18. 195b4c0 NUTCH-3153 Update of license and notice files by Sebastian Nagel · 3 months ago
  19. 1d25cb8 NUTCH-3152 Job counters getGroup to use metrics constants by Sebastian Nagel · 3 months ago
  20. f7c7e1a NUTCH-3150 Expand Caching Hadoop Counter References (#892) by Lewis John McGibbney · 3 months ago
  21. 1242e22 NUTCH-3142 Add Error Context to Metrics (#882) by Lewis John McGibbney · 3 months ago
  22. 3101a9e Merge pull request #887 from lewismc/NUTCH-3110 by Sebastian Nagel · 3 months ago
  23. f8577a0 NUTCH-3143 GitHub workflow does not run all unit tests (#890) by Lewis John McGibbney · 4 months ago
  24. 7c5a529 NUTCH-3143 GitHub workflow does not run all unit tests (#889) by Lewis John McGibbney · 4 months ago
  25. 7d05fec NUTCH-3064 Upgrade index-geoip to GeoIP2 5.0.2 by lewismc · 4 months ago
  26. f36a836 NUTCH-3064 Upgrade index-geoip to GeoIP2 5.0.2 by lewismc · 4 months ago
  27. 3ddae0b Merge remote-tracking branch 'origin' into NUTCH-3064 by lewismc · 4 months ago
  28. 4207bc3 NUTCH-3148 Cache Ivy dependencies in GitHub CI builds (#886) by Lewis John McGibbney · 4 months ago
  29. 713835b NUTCH-3110 Upgrade to Tika 3.2.3 by lewismc · 4 months ago
  30. 7f724a9 Merge pull request #880 from igiguere/NUTCH-1564-AdaptiveFetchSchedule-refetch by Sebastian Nagel · 4 months ago
  31. ddabe96 NUTCH-3144 URLUtil unit tests fail after upgrade to crawler-commons 1.6 by Sebastian Nagel · 4 months ago
  32. 8e7bbc4 NUTCH-3143 GitHub workflow does not run all unit tests (#885) by Lewis John McGibbney · 4 months ago
  33. ec8747a NUTCH-3143 GitHub workflow does not run all unit tests (#884) by Lewis John McGibbney · 4 months ago
  34. 8186b04 NUTCH-3064: Upgrade index-geoip to GeoIP2 5.0.2 by lewismc · 4 months ago
  35. c01cc22 NUTCH-3064: Upgrade index-geoip to GeoIP2 5.0.2 by lewismc · 4 months ago
  36. b849499 NUTCH-3064: Upgrade index-geoip to GeoIP2 5.0.2 by lewismc · 4 months ago
  37. 66f678e NUTCH-3141 Cache Hadoop Counter References in Hot Paths (#878) by Lewis John McGibbney · 4 months ago
  38. 103fff6 NUTCH-1564: address code review comments. by Isabelle Giguere · 4 months ago
  39. 58687ec NUTCH-1564: fix AdaptiveFetchSchedule for unmodified pages by Isabelle Giguere · 4 months ago
  40. d5dccfb NUTCH-1564: fix immediate refetch for pages not modified by Isabelle Giguere · 4 months ago
  41. 00bf8c4 NUTCH-3139 protocol-okhttp: add support for zstd content-encoding by Sebastian Nagel · 5 months ago
  42. 8a0fb2b NUTCH-3137 Upgrade Nutch core dependencies (#875) by Sebastian Nagel · 5 months ago
  43. c7cf569 NUTCH-3136 Upgrade crawler-commons dependency by Sebastian Nagel · 5 months ago
  44. 50b1ee6 NUTCH-3136 Upgrade crawler-commons dependency by Sebastian Nagel · 5 months ago
  45. 8307b6b NUTCH-3135 Cache downloaded ant-eclipse.jar by Sebastian Nagel · 5 months ago
  46. de27acc NUTCH-3133 Upgrade GitHub workflows to JDK 17 by Sebastian Nagel · 5 months ago
  47. ca2591e NUTCH-3134 Add latency metrics with percentile support to Fetcher, Parser, and Indexer (#876) by Lewis John McGibbney · 5 months ago
  48. f71bab4 NUTCH-3132 Standardize existing Nutch metrics naming and implementation (#871) by Lewis John McGibbney · 5 months ago
  49. 7b5ed23 NUTCH-3126 Report JUnit test results in GitHub pull request thread (#868) by Lewis John McGibbney · 5 months ago
  50. f65371d Merge pull request #870 from igiguere/NUTCH-2971 by Sebastian Nagel · 5 months ago
  51. f43ff78 fix for NUTCH-2671 contributed by igiguere. Also fixes NUTCH-3128, NUTCH-3125 by Isabelle Giguere · 6 months ago
  52. 1156801 NUTCH-3040 Upgrade to Hadoop 3.4.2 (#866) by Lewis John McGibbney · 6 months ago
  53. 317d2de NUTCH-3126 Report JUnit test results in GitHub pull request thread (#867) by Lewis John McGibbney · 7 months ago
  54. cefb48a NUTCH-3099 Allow wildcard '*' in http.proxy.exception.list (via Isabelle Giguere) (#865) by Lewis John McGibbney · 7 months ago
  55. 2d92366 NUTCH-3126 Report JUnit test results in GitHub pull request thread (#863) by Lewis John McGibbney · 7 months ago
  56. 667e217 Merge pull request #864 from sebastian-nagel/NUTCH-2887-junit4-mrunit by Sebastian Nagel · 8 months ago
  57. a966c44 NUTCH-2887 Migrate to JUnit 5 Jupiter by Sebastian Nagel · 8 months ago
  58. 919e245 NUTCH-2887 Migrate to JUnit 5 Jupiter by Sebastian Nagel · 8 months ago
  59. cfcf2d7 NUTCH-2887 Migrate to JUnit 5 Jupiter by Sebastian Nagel · 8 months ago
  60. e2b60fc NUTCH-2887 Migrate to JUnit 5 Jupiter (#862) by Lewis John McGibbney · 8 months ago
  61. 4c04a98 NUTCH-2887 Migrate to JUnit 5 Jupiter (#861) by Lewis John McGibbney · 8 months ago
  62. 3991c5b Merge pull request #859 from TamimEhsan/NUTCH-3122 by Sebastian Nagel · 9 months ago
  63. 7e43e12 NUTCH-3124 Github workflow not run because of uncertified action "paths-changes-filter" by Sebastian Nagel · 9 months ago
  64. 365f585 [NUTCH-3122] Add test for backward compatibility of SpellCheckedMetadata by Tamim Ehsan · 9 months ago
  65. 5ae91b6 [NUTCH-3122] Make SpellCheckedMetadata case-insensitive for all Metadata names by Tamim Ehsan · 9 months ago
  66. 8416da8 NUTCH-3118 Logging pattern missing one argument placeholder by Sebastian Nagel · 10 months ago
  67. d1b70ad NUTCH-3119 Log4j package scanning is deprecated by Sebastian Nagel · 10 months ago
  68. 11e9a6a Prepare for new development after release of 1.21 by Sebastian Nagel · 10 months ago
  69. 2786b5a NUTCH-3118 Logging pattern missing one argument placeholder (#857) by Sebastian Nagel · 10 months ago
  70. 671b1e0 Merge pull request #851 from sebastian-nagel/NUTCH-3112-parameterized-logging by Sebastian Nagel · 10 months ago
  71. e62a0b8 Merge pull request #855 from sebastian-nagel/NUTCH-3116-dependency-upgrades by Sebastian Nagel · 10 months ago
  72. bf54609 NUTCH-3112 Utilize parameterized logging by Sebastian Nagel · 10 months ago
  73. 3961e9a NUTCH-3116 Update of license and notice files by Sebastian Nagel · 10 months ago
  74. 059c5be NUTCH-3116 Minor dependency upgrades by Sebastian Nagel · 10 months ago
  75. 9211106 NUTCH-3116 Minor dependency upgrades by Sebastian Nagel · 10 months ago
  76. a6c216c NUTCH-3116 Minor dependency upgrades by Sebastian Nagel · 10 months ago
  77. e850012 Merge pull request #856 from CatChullain/NUTCH-3115 by Sebastian Nagel · 10 months ago
  78. 3128286 NUTCH-2976 SitemapProcessor: verify sitemap values by Sebastian Nagel · 10 months ago
  79. 94a9935 NUTCH-3115 update to set all fields access on each POJO individually, updated JUnit tests, improved logging by Joe Gilvary · 10 months ago
  80. 154504b Added Apache license to source of toy class used by JUnit test. by Joe Gilvary · 10 months ago
  81. 1834c89 Corrected element order on a couple nutch-default.xml nodes for index-arbitrary configs. by Joe Gilvary · 10 months ago
  82. eab1ea9 NUTCH-3116 Update of license and notice files by Sebastian Nagel · 10 months ago
  83. b6443fa NUTCH-3116 Minor dependency upgrades by Sebastian Nagel · 10 months ago
  84. d2adde2 Updated Arbitrary Indexer that passes all indexer constructor args to user's POJO instance. by Joe Gilvary · 10 months ago
  85. 25f5610 NUTCH-3112 Utilize parameterized logging by Sebastian Nagel · 1 year, 2 months ago
  86. cf4f805 NUTCH-3113 Group commands in bin/nutch command-line help thematically by Sebastian Nagel · 1 year, 2 months ago
  87. a077ffc NUTCH-3087 BasicURLNormalizer to keep userinfo for protocols which might require it by Sebastian Nagel · 1 year, 5 months ago
  88. 14fc330 NUTCH-3114 Avoid stale fetching when only URLs by Sebastian Nagel · 10 months ago
  89. 71eca88 Merge pull request #847 from tatecn/NUTCH-3106 by Sebastian Nagel · 10 months ago
  90. 5335e6b Merge pull request #848 from martin-djukanovic/NUTCH-3103 by Sebastian Nagel · 10 months ago
  91. b61d11f Merge pull request #849 from maciejpuzianowski/NUTCH-3108 by Sebastian Nagel · 1 year, 2 months ago
  92. 76ced9b NUTCH-3110 Upgrade to Tika 3.1.0 by Sebastian Nagel · 1 year, 2 months ago
  93. 3fb8068 NUTCH-3110 Upgrade to Tika 3.1.0 by Sebastian Nagel · 1 year, 2 months ago
  94. a050c43 fix for NUTCH-3108 contributed by maciejpuzianowski/mpuzianowski by Maciej Puzianowski · 1 year, 3 months ago
  95. 931ba17 [NUTCH-3103] Fixed custom max intervals for AdaptiveFetchSchedule by martin · 1 year, 3 months ago
  96. aca19bb NUTCH-3106 fix Issue with SSLHandshakeException in v1.20 using protocol-http plugin by Hanbing Luo · 1 year, 3 months ago
  97. b52ec90 NUTCH-3100 HostDB to support minimum records per host by Markus Jelsma · 1 year, 4 months ago
  98. 18e7aeb NUTCH-3101 src/java/org/apache/nutch/crawl/Inlink.java by Markus Jelsma · 1 year, 4 months ago
  99. 3b6d2c6 Merge pull request #832 from sebastian-nagel/NUTCH-3072 by Sebastian Nagel · 1 year, 4 months ago
  100. 74b49e9 NUTCH-3086 Consolidate plugin extension names and IDs (#835) by Sebastian Nagel · 1 year, 4 months ago