  1. 85deeb5 Bump actions/upload-artifact from 4 to 7 by dependabot[bot] · 3 days ago · master
  2. 24d9f72 Bump actions/checkout from 5 to 7 by dependabot[bot] · 3 days ago
  3. 4d58963 Bump SonarSource/sonarqube-scan-action from 7.1.0 to 8.2.0 by dependabot[bot] · 3 days ago
  4. 6d962e1 Bump actions/setup-node from 4 to 6 by dependabot[bot] · 3 days ago
  5. 886e3b6 Bump dawidd6/action-download-artifact from 11 to 21 by dependabot[bot] · 3 days ago
  6. f55262b NUTCH-3185 Upgrade NekoHTML to 1.9.22 by Sebastian Nagel · 10 days ago
  7. d7713ed NUTCH-3184 Upgrade OkHttp to 3.4.0 by Sebastian Nagel · 10 days ago
  8. 3544a71 NUTCH-3183 Upgrade Tika to 3.3.1 by Sebastian Nagel · 10 days ago
  9. 7544718 NUTCH-3182 Add GitHub Dependabot configuration to update GitHub workflows by Sebastian Nagel · 10 days ago
  10. e7db045 NUTCH-3188 CI builds startup failure because of disallowed workflow actions by Sebastian Nagel · 6 days ago
  11. 8e03a3e Add in-repo threat model and point AGENTS.md/SECURITY.md at it (#922) by Jarek Potiuk · 8 days ago
  12. 87483c6 NUTCH-1732: allow deleting non-parsable documents (#891) by igiguere · 9 days ago
  13. 5bcb0ab NUTCH-3180 BasicURLNormalizer missing catching ICUInputTooLongException (#923) by Luca · 10 days ago
  14. 571c2bc NUTCH-3178 Add AGENTS.md and SECURITY.md to the Nutch repository by Sebastian Nagel · 3 weeks ago
  15. a0814a7 NUTCH-3177 Fetcher to report idle threads not as hung threads by Sebastian Nagel · 6 weeks ago
  16. 2984770 Merge pull request #914 from sebastian-nagel/NUTCH-3176-idna2008 by Sebastian Nagel · 2 weeks ago
  17. a6159ea NUTCH-3174 protocol-okhttp: request may hang despite http.time.limit is set by Sebastian Nagel · 7 weeks ago
  18. 9f78c72 NUTCH-3173 protocol-okhttp: store OkHttp's internal URL in response metadata (#919) by Luca · 2 weeks ago
  19. c246f6cc Merge pull request #920 from potiuk/asf-security/agents-md-security-md-init-2026-05-27 by Sebastian Nagel · 3 weeks ago
  20. 8fc57d9 Merge pull request #918 from prakharchaube/NUTCH-3164 by Sebastian Nagel · 3 weeks ago
  21. fc26180 NUTCH-3164 Added Unit Test case by Prakhar Chaube · 3 weeks ago
  22. dae9f9d Add AGENTS.md + SECURITY.md linking the project's security model by Jarek Potiuk · 4 weeks ago
  23. fba8bc5 Merge pull request #916 from apache/infrastructure-ruleset-bot/default-branch-protection by Sebastian Nagel · 4 weeks ago
  24. 1482fcb Extend ASF GitHub integration branch protection patterns by Sebastian Nagel · 4 weeks ago · infrastructure-ruleset-bot/default-branch-protection
  25. 0c4f51d NUTCH-3164 Catch specific exceptions in CrawlDbFilter so plugin errors no longer silently drop URLs by Prakhar Chaube · 4 weeks ago
  26. 0865e1a Set up default protection ruleset for default and release branches by The Apache Software Foundation · 5 weeks ago
  27. afb9439 Format source code by Sebastian Nagel · 6 weeks ago
  28. 84699a6 NUTCH-3176 URLUtil and urlnormalizer-basic: add support for IDNA2008 by Sebastian Nagel · 6 weeks ago
  29. 947cd28 NUTCH-3176 URLUtil and urlnormalizer-basic: add support for IDNA2008 by Sebastian Nagel · 6 weeks ago
  30. 36cc230 Refactor calls of URLDecoder and pass Charset instead of String (since Java 10) by Sebastian Nagel · 6 weeks ago
  31. 66f55e1 NUTCH-3176 URLUtil and urlnormalizer-basic: add support for IDNA2008 by Sebastian Nagel · 6 weeks ago
  32. 54efa9f NUTCH-3176 URLUtil and urlnormalizer-basic: add support for IDNA2008 by Sebastian Nagel · 6 weeks ago
  33. 57c290e Merge pull request #913 from lewismc/NUTCH-3175 by Sebastian Nagel · 6 weeks ago
  34. cbb1d09 NUTCH-3175 Implement integration testing framework for Nutch Protocol plugins using Testcontainers by lewismc · 6 weeks ago
  35. 9390674 NUTCH-3175 Implement integration testing framework for Nutch Protocol plugins using Testcontainers by lewismc · 6 weeks ago
  36. 2009e0e NUTCH-3167 Upgrade to Hadoop 3.5.0 (#911) by Lewis John McGibbney · 7 weeks ago
  37. 31c44b2 NUTCH-3163 Integrate Apache Yetus' pre-commit patch testing into Nutch GitHub Continuous Integration (#907) by Lewis John McGibbney · 9 weeks ago
  38. e47cfd5 parse: fix 'occured' -> 'occurred' typo in ParseStatus FAILED javadoc (#910) by Sai Asish Y · 9 weeks ago
  39. 3b663de NUTCH-3165 Remove the Nutch web service (#908) by Lewis John McGibbney · 10 weeks ago
  40. df62fa1 NUTCH-3168 Sandbox Commons JEXL usage in crawl and index pipelines (#909) by Lewis John McGibbney · 10 weeks ago
  41. 96552ae NUTCH-2932 Create OpenAPI specification for Nutch 1.x REST API (#896) by Lewis John McGibbney · 2 months ago
  42. da55708 [NUTCH-3160] Remove System.exit(..) from reusable code (#903) by Luca · 4 months ago
  43. eed3445 NUTCH-3085 Augment CI by adding code coverage and code quality reporting (#902) by Lewis John McGibbney · 4 months ago
  44. f1d3e8a NUTCH-3085 Augment CI by adding code coverage and code quality reporting (#901) by Lewis John McGibbney · 4 months ago
  45. 89e6ec1 NUTCH-3085 Augment CI by adding code coverage and code quality reporting (#900) by Lewis John McGibbney · 4 months ago
  46. 9f3bb41 Bump SonarSource/sonarqube-scan-action from 5 to 6 in /.github/workflows (#899) by dependabot[bot] · 4 months ago
  47. 21fd780 NUTCH-3085 Augment CI by adding code coverage and code quality reporting (#898) by Lewis John McGibbney · 4 months ago
  48. 39cfc61 NUTCH-3085 Augment CI by adding code coverage and code quality reporting (#897) by Lewis John McGibbney · 4 months ago
  49. 64ac8b4 NUTCH-3154 Implement integration testing framework for Nutch IndexWriter plugins using Testcontainers (#895) by Lewis John McGibbney · 4 months ago
  50. ceb23e1 NUTCH-3145 Upgrade to JUnit 6 (#883) by Lewis John McGibbney · 4 months ago
  51. 4337927 Merge pull request #825 from lewismc/NUTCH-3064 by Sebastian Nagel · 4 months ago
  52. 674914c Prepare for new development after release of 1.22 by Sebastian Nagel · 4 months ago
  53. 195b4c0 NUTCH-3153 Update of license and notice files by Sebastian Nagel · 4 months ago
  54. 1d25cb8 NUTCH-3152 Job counters getGroup to use metrics constants by Sebastian Nagel · 4 months ago
  55. f7c7e1a NUTCH-3150 Expand Caching Hadoop Counter References (#892) by Lewis John McGibbney · 4 months ago
  56. 1242e22 NUTCH-3142 Add Error Context to Metrics (#882) by Lewis John McGibbney · 5 months ago
  57. 3101a9e Merge pull request #887 from lewismc/NUTCH-3110 by Sebastian Nagel · 5 months ago
  58. f8577a0 NUTCH-3143 GitHub workflow does not run all unit tests (#890) by Lewis John McGibbney · 5 months ago
  59. 7c5a529 NUTCH-3143 GitHub workflow does not run all unit tests (#889) by Lewis John McGibbney · 5 months ago
  60. 7d05fec NUTCH-3064 Upgrade index-geoip to GeoIP2 5.0.2 by lewismc · 5 months ago
  61. f36a836 NUTCH-3064 Upgrade index-geoip to GeoIP2 5.0.2 by lewismc · 5 months ago
  62. 3ddae0b Merge remote-tracking branch 'origin' into NUTCH-3064 by lewismc · 5 months ago
  63. 4207bc3 NUTCH-3148 Cache Ivy dependencies in GitHub CI builds (#886) by Lewis John McGibbney · 5 months ago
  64. 713835b NUTCH-3110 Upgrade to Tika 3.2.3 by lewismc · 5 months ago
  65. 7f724a9 Merge pull request #880 from igiguere/NUTCH-1564-AdaptiveFetchSchedule-refetch by Sebastian Nagel · 5 months ago
  66. ddabe96 NUTCH-3144 URLUtil unit tests fail after upgrade to crawler-commons 1.6 by Sebastian Nagel · 6 months ago
  67. 8e7bbc4 NUTCH-3143 GitHub workflow does not run all unit tests (#885) by Lewis John McGibbney · 5 months ago
  68. ec8747a NUTCH-3143 GitHub workflow does not run all unit tests (#884) by Lewis John McGibbney · 5 months ago
  69. 8186b04 NUTCH-3064: Upgrade index-geoip to GeoIP2 5.0.2 by lewismc · 5 months ago
  70. c01cc22 NUTCH-3064: Upgrade index-geoip to GeoIP2 5.0.2 by lewismc · 5 months ago
  71. b849499 NUTCH-3064: Upgrade index-geoip to GeoIP2 5.0.2 by lewismc · 5 months ago
  72. 66f678e NUTCH-3141 Cache Hadoop Counter References in Hot Paths (#878) by Lewis John McGibbney · 5 months ago
  73. 103fff6 NUTCH-1564: address code review comments. by Isabelle Giguere · 5 months ago
  74. 58687ec NUTCH-1564: fix AdaptiveFetchSchedule for unmodified pages by Isabelle Giguere · 6 months ago
  75. d5dccfb NUTCH-1564: fix immediate refetch for pages not modified by Isabelle Giguere · 6 months ago
  76. 00bf8c4 NUTCH-3139 protocol-okhttp: add support for zstd content-encoding by Sebastian Nagel · 6 months ago
  77. 8a0fb2b NUTCH-3137 Upgrade Nutch core dependencies (#875) by Sebastian Nagel · 6 months ago
  78. c7cf569 NUTCH-3136 Upgrade crawler-commons dependency by Sebastian Nagel · 6 months ago
  79. 50b1ee6 NUTCH-3136 Upgrade crawler-commons dependency by Sebastian Nagel · 6 months ago
  80. 8307b6b NUTCH-3135 Cache downloaded ant-eclipse.jar by Sebastian Nagel · 6 months ago
  81. de27acc NUTCH-3133 Upgrade GitHub workflows to JDK 17 by Sebastian Nagel · 6 months ago
  82. ca2591e NUTCH-3134 Add latency metrics with percentile support to Fetcher, Parser, and Indexer (#876) by Lewis John McGibbney · 6 months ago
  83. f71bab4 NUTCH-3132 Standardize existing Nutch metrics naming and implementation (#871) by Lewis John McGibbney · 6 months ago
  84. 7b5ed23 NUTCH-3126 Report JUnit test results in GitHub pull request thread (#868) by Lewis John McGibbney · 6 months ago
  85. f65371d Merge pull request #870 from igiguere/NUTCH-2971 by Sebastian Nagel · 7 months ago
  86. f43ff78 fix for NUTCH-2671 contributed by igiguere. Also fixes NUTCH-3128, NUTCH-3125 by Isabelle Giguere · 7 months ago
  87. 1156801 NUTCH-3040 Upgrade to Hadoop 3.4.2 (#866) by Lewis John McGibbney · 7 months ago
  88. 317d2de NUTCH-3126 Report JUnit test results in GitHub pull request thread (#867) by Lewis John McGibbney · 8 months ago
  89. cefb48a NUTCH-3099 Allow wildcard '*' in http.proxy.exception.list (via Isabelle Giguere) (#865) by Lewis John McGibbney · 8 months ago
  90. 2d92366 NUTCH-3126 Report JUnit test results in GitHub pull request thread (#863) by Lewis John McGibbney · 9 months ago
  91. 667e217 Merge pull request #864 from sebastian-nagel/NUTCH-2887-junit4-mrunit by Sebastian Nagel · 9 months ago
  92. a966c44 NUTCH-2887 Migrate to JUnit 5 Jupiter by Sebastian Nagel · 9 months ago
  93. 919e245 NUTCH-2887 Migrate to JUnit 5 Jupiter by Sebastian Nagel · 9 months ago
  94. cfcf2d7 NUTCH-2887 Migrate to JUnit 5 Jupiter by Sebastian Nagel · 9 months ago
  95. e2b60fc NUTCH-2887 Migrate to JUnit 5 Jupiter (#862) by Lewis John McGibbney · 9 months ago
  96. 4c04a98 NUTCH-2887 Migrate to JUnit 5 Jupiter (#861) by Lewis John McGibbney · 10 months ago
  97. 3991c5b Merge pull request #859 from TamimEhsan/NUTCH-3122 by Sebastian Nagel · 10 months ago
  98. 7e43e12 NUTCH-3124 Github workflow not run because of uncertified action "paths-changes-filter" by Sebastian Nagel · 10 months ago
  99. 365f585 [NUTCH-3122] Add test for backward compatibility of SpellCheckedMetadata by Tamim Ehsan · 10 months ago
  100. 5ae91b6 [NUTCH-3122] Make SpellCheckedMetadata case-insensitive for all Metadata names by Tamim Ehsan · 10 months ago