  1. da55708 [NUTCH-3160] Remove System.exit(..) from reusable code (#903) by Luca · 6 days ago master
  2. eed3445 NUTCH-3085 Augment CI by adding code coverage and code quality reporting (#902) by Lewis John McGibbney · 9 days ago
  3. f1d3e8a NUTCH-3085 Augment CI by adding code coverage and code quality reporting (#901) by Lewis John McGibbney · 9 days ago
  4. 89e6ec1 NUTCH-3085 Augment CI by adding code coverage and code quality reporting (#900) by Lewis John McGibbney · 10 days ago
  5. 9f3bb41 Bump SonarSource/sonarqube-scan-action from 5 to 6 in /.github/workflows (#899) by dependabot[bot] · 10 days ago
  6. 21fd780 NUTCH-3085 Augment CI by adding code coverage and code quality reporting (#898) by Lewis John McGibbney · 10 days ago
  7. 39cfc61 NUTCH-3085 Augment CI by adding code coverage and code quality reporting (#897) by Lewis John McGibbney · 10 days ago
  8. 64ac8b4 NUTCH-3154 Implement integration testing framework for Nutch IndexWriter plugins using Testcontainers (#895) by Lewis John McGibbney · 11 days ago
  9. ceb23e1 NUTCH-3145 Upgrade to JUnit 6 (#883) by Lewis John McGibbney · 11 days ago
  10. 4337927 Merge pull request #825 from lewismc/NUTCH-3064 by Sebastian Nagel · 2 weeks ago
  11. 674914c Prepare for new development after release of 1.22 by Sebastian Nagel · 3 weeks ago
  12. 195b4c0 NUTCH-3153 Update of license and notice files by Sebastian Nagel · 3 weeks ago
  13. 1d25cb8 NUTCH-3152 Job counters getGroup to use metrics constants by Sebastian Nagel · 3 weeks ago
  14. f7c7e1a NUTCH-3150 Expand Caching Hadoop Counter References (#892) by Lewis John McGibbney · 3 weeks ago
  15. 1242e22 NUTCH-3142 Add Error Context to Metrics (#882) by Lewis John McGibbney · 4 weeks ago
  16. 3101a9e Merge pull request #887 from lewismc/NUTCH-3110 by Sebastian Nagel · 4 weeks ago
  17. f8577a0 NUTCH-3143 GitHub workflow does not run all unit tests (#890) by Lewis John McGibbney · 6 weeks ago
  18. 7c5a529 NUTCH-3143 GitHub workflow does not run all unit tests (#889) by Lewis John McGibbney · 6 weeks ago
  19. 7d05fec NUTCH-3064 Upgrade index-geoip to GeoIP2 5.0.2 by lewismc · 6 weeks ago
  20. f36a836 NUTCH-3064 Upgrade index-geoip to GeoIP2 5.0.2 by lewismc · 6 weeks ago
  21. 3ddae0b Merge remote-tracking branch 'origin' into NUTCH-3064 by lewismc · 7 weeks ago
  22. 4207bc3 NUTCH-3148 Cache Ivy dependencies in GitHub CI builds (#886) by Lewis John McGibbney · 7 weeks ago
  23. 713835b NUTCH-3110 Upgrade to Tika 3.2.3 by lewismc · 7 weeks ago
  24. 7f724a9 Merge pull request #880 from igiguere/NUTCH-1564-AdaptiveFetchSchedule-refetch by Sebastian Nagel · 7 weeks ago
  25. ddabe96 NUTCH-3144 URLUtil unit tests fail after upgrade to crawler-commons 1.6 by Sebastian Nagel · 8 weeks ago
  26. 8e7bbc4 NUTCH-3143 GitHub workflow does not run all unit tests (#885) by Lewis John McGibbney · 7 weeks ago
  27. ec8747a NUTCH-3143 GitHub workflow does not run all unit tests (#884) by Lewis John McGibbney · 7 weeks ago
  28. 8186b04 NUTCH-3064: Upgrade index-geoip to GeoIP2 5.0.2 by lewismc · 7 weeks ago
  29. c01cc22 NUTCH-3064: Upgrade index-geoip to GeoIP2 5.0.2 by lewismc · 7 weeks ago
  30. b849499 NUTCH-3064: Upgrade index-geoip to GeoIP2 5.0.2 by lewismc · 7 weeks ago
  31. 66f678e NUTCH-3141 Cache Hadoop Counter References in Hot Paths (#878) by Lewis John McGibbney · 8 weeks ago
  32. 103fff6 NUTCH-1564: address code review comments. by Isabelle Giguere · 8 weeks ago
  33. 58687ec NUTCH-1564: fix AdaptiveFetchSchedule for unmodified pages by Isabelle Giguere · 9 weeks ago
  34. d5dccfb NUTCH-1564: fix immediate refetch for pages not modified by Isabelle Giguere · 9 weeks ago
  35. 00bf8c4 NUTCH-3139 protocol-okhttp: add support for zstd content-encoding by Sebastian Nagel · 3 months ago
  36. 8a0fb2b NUTCH-3137 Upgrade Nutch core dependencies (#875) by Sebastian Nagel · 3 months ago
  37. c7cf569 NUTCH-3136 Upgrade crawler-commons dependency by Sebastian Nagel · 3 months ago
  38. 50b1ee6 NUTCH-3136 Upgrade crawler-commons dependency by Sebastian Nagel · 3 months ago
  39. 8307b6b NUTCH-3135 Cache downloaded ant-eclipse.jar by Sebastian Nagel · 3 months ago
  40. de27acc NUTCH-3133 Upgrade GitHub workflows to JDK 17 by Sebastian Nagel · 3 months ago
  41. ca2591e NUTCH-3134 Add latency metrics with percentile support to Fetcher, Parser, and Indexer (#876) by Lewis John McGibbney · 3 months ago
  42. f71bab4 NUTCH-3132 Standardize existing Nutch metrics naming and implementation (#871) by Lewis John McGibbney · 3 months ago
  43. 7b5ed23 NUTCH-3126 Report JUnit test results in GitHub pull request thread (#868) by Lewis John McGibbney · 3 months ago
  44. f65371d Merge pull request #870 from igiguere/NUTCH-2971 by Sebastian Nagel · 3 months ago
  45. f43ff78 fix for NUTCH-2671 contributed by igiguere. Also fixes NUTCH-3128, NUTCH-3125 by Isabelle Giguere · 3 months ago
  46. 1156801 NUTCH-3040 Upgrade to Hadoop 3.4.2 (#866) by Lewis John McGibbney · 4 months ago
  47. 317d2de NUTCH-3126 Report JUnit test results in GitHub pull request thread (#867) by Lewis John McGibbney · 5 months ago
  48. cefb48a NUTCH-3099 Allow wildcard '*' in http.proxy.exception.list (via Isabelle Giguere) (#865) by Lewis John McGibbney · 5 months ago
  49. 2d92366 NUTCH-3126 Report JUnit test results in GitHub pull request thread (#863) by Lewis John McGibbney · 5 months ago
  50. 667e217 Merge pull request #864 from sebastian-nagel/NUTCH-2887-junit4-mrunit by Sebastian Nagel · 6 months ago
  51. a966c44 NUTCH-2887 Migrate to JUnit 5 Jupiter by Sebastian Nagel · 6 months ago
  52. 919e245 NUTCH-2887 Migrate to JUnit 5 Jupiter by Sebastian Nagel · 6 months ago
  53. cfcf2d7 NUTCH-2887 Migrate to JUnit 5 Jupiter by Sebastian Nagel · 6 months ago
  54. e2b60fc NUTCH-2887 Migrate to JUnit 5 Jupiter (#862) by Lewis John McGibbney · 6 months ago
  55. 4c04a98 NUTCH-2887 Migrate to JUnit 5 Jupiter (#861) by Lewis John McGibbney · 6 months ago
  56. 3991c5b Merge pull request #859 from TamimEhsan/NUTCH-3122 by Sebastian Nagel · 6 months ago
  57. 7e43e12 NUTCH-3124 Github workflow not run because of uncertified action "paths-changes-filter" by Sebastian Nagel · 6 months ago
  58. 365f585 [NUTCH-3122] Add test for backward compatibility of SpellCheckedMetadata by Tamim Ehsan · 6 months ago
  59. 5ae91b6 [NUTCH-3122] Make SpellCheckedMetadata case-insensitive for all Metadata names by Tamim Ehsan · 7 months ago
  60. 8416da8 NUTCH-3118 Logging pattern missing one argument placeholder by Sebastian Nagel · 8 months ago
  61. d1b70ad NUTCH-3119 Log4j package scanning is deprecated by Sebastian Nagel · 8 months ago
  62. 11e9a6a Prepare for new development after release of 1.21 by Sebastian Nagel · 8 months ago
  63. 2786b5a NUTCH-3118 Logging pattern missing one argument placeholder (#857) by Sebastian Nagel · 8 months ago
  64. 671b1e0 Merge pull request #851 from sebastian-nagel/NUTCH-3112-parameterized-logging by Sebastian Nagel · 8 months ago
  65. e62a0b8 Merge pull request #855 from sebastian-nagel/NUTCH-3116-dependency-upgrades by Sebastian Nagel · 8 months ago
  66. bf54609 NUTCH-3112 Utilize parameterized logging by Sebastian Nagel · 8 months ago
  67. 3961e9a NUTCH-3116 Update of license and notice files by Sebastian Nagel · 8 months ago
  68. 059c5be NUTCH-3116 Minor dependency upgrades by Sebastian Nagel · 8 months ago
  69. 9211106 NUTCH-3116 Minor dependency upgrades by Sebastian Nagel · 8 months ago
  70. a6c216c NUTCH-3116 Minor dependency upgrades by Sebastian Nagel · 8 months ago
  71. e850012 Merge pull request #856 from CatChullain/NUTCH-3115 by Sebastian Nagel · 8 months ago
  72. 3128286 NUTCH-2976 SitemapProcessor: verify sitemap values by Sebastian Nagel · 8 months ago
  73. 94a9935 NUTCH-3115 update to set all fields access on each POJO individually, updated JUnit tests, improved logging by Joe Gilvary · 8 months ago
  74. 154504b Added Apache license to source of toy class used by JUnit test. by Joe Gilvary · 8 months ago
  75. 1834c89 Corrected element order on a couple nutch-default.xml nodes for index-arbitrary configs. by Joe Gilvary · 8 months ago
  76. eab1ea9 NUTCH-3116 Update of license and notice files by Sebastian Nagel · 8 months ago
  77. b6443fa NUTCH-3116 Minor dependency upgrades by Sebastian Nagel · 8 months ago
  78. d2adde2 Updated Arbitrary Indexer that passes all indexer constructor args to user's POJO instance. by Joe Gilvary · 8 months ago
  79. 25f5610 NUTCH-3112 Utilize parameterized logging by Sebastian Nagel · 11 months ago
  80. cf4f805 NUTCH-3113 Group commands in bin/nutch command-line help thematically by Sebastian Nagel · 11 months ago
  81. a077ffc NUTCH-3087 BasicURLNormalizer to keep userinfo for protocols which might require it by Sebastian Nagel · 1 year, 3 months ago
  82. 14fc330 NUTCH-3114 Avoid stale fetching when only URLs by Sebastian Nagel · 8 months ago
  83. 71eca88 Merge pull request #847 from tatecn/NUTCH-3106 by Sebastian Nagel · 8 months ago
  84. 5335e6b Merge pull request #848 from martin-djukanovic/NUTCH-3103 by Sebastian Nagel · 8 months ago
  85. b61d11f Merge pull request #849 from maciejpuzianowski/NUTCH-3108 by Sebastian Nagel · 11 months ago
  86. 76ced9b NUTCH-3110 Upgrade to Tika 3.1.0 by Sebastian Nagel · 11 months ago
  87. 3fb8068 NUTCH-3110 Upgrade to Tika 3.1.0 by Sebastian Nagel · 11 months ago
  88. a050c43 fix for NUTCH-3108 contributed by maciejpuzianowski/mpuzianowski by Maciej Puzianowski · 1 year ago
  89. 931ba17 [NUTCH-3103] Fixed custom max intervals for AdaptiveFetchSchedule by martin · 1 year, 1 month ago
  90. aca19bb NUTCH-3106 fix Issue with SSLHandshakeException in v1.20 using protocol-http plugin by Hanbing Luo · 1 year, 1 month ago
  91. b52ec90 NUTCH-3100 HostDB to support minimum records per host by Markus Jelsma · 1 year, 2 months ago
  92. 18e7aeb NUTCH-3101 src/java/org/apache/nutch/crawl/Inlink.java by Markus Jelsma · 1 year, 2 months ago
  93. 3b6d2c6 Merge pull request #832 from sebastian-nagel/NUTCH-3072 by Sebastian Nagel · 1 year, 2 months ago
  94. 74b49e9 NUTCH-3086 Consolidate plugin extension names and IDs (#835) by Sebastian Nagel · 1 year, 2 months ago
  95. 5068b76 Merge pull request #844 from maciejpuzianowski/NUTCH-3097 by Sebastian Nagel · 1 year, 2 months ago
  96. 86b893a NUTCH-3079 Dumping a segment fails unless it has been fetched and parsed by Sebastian Nagel · 1 year, 4 months ago
  97. b481f91 NUTCH-3083 Add RobotRulesParser to bin/nutch by Sebastian Nagel · 1 year, 4 months ago
  98. 5263b7c NUTCH-3096 HostDB ResolverThread can create too many job counters by Sebastian Nagel · 1 year, 3 months ago
  99. e2a29d0 NUTCH-3092 Replace all imports of commons-lang by commons-lang3 by Sebastian Nagel · 1 year, 3 months ago
  100. bb17570 fix for NUTCH-3097 contributed by maciejpuzianowski/mpuzianowski by Maciej Puzianowski · 1 year, 3 months ago