apache / nutch / HEAD
85deeb5
Bump actions/upload-artifact from 4 to 7
by dependabot[bot]
· 3 days ago
master
24d9f72
Bump actions/checkout from 5 to 7
by dependabot[bot]
· 3 days ago
4d58963
Bump SonarSource/sonarqube-scan-action from 7.1.0 to 8.2.0
by dependabot[bot]
· 3 days ago
6d962e1
Bump actions/setup-node from 4 to 6
by dependabot[bot]
· 3 days ago
886e3b6
Bump dawidd6/action-download-artifact from 11 to 21
by dependabot[bot]
· 3 days ago
f55262b
NUTCH-3185 Upgrade NekoHTML to 1.9.22
by Sebastian Nagel
· 10 days ago
d7713ed
NUTCH-3184 Upgrade OkHttp to 3.4.0
by Sebastian Nagel
· 10 days ago
3544a71
NUTCH-3183 Upgrade Tika to 3.3.1
by Sebastian Nagel
· 10 days ago
7544718
NUTCH-3182 Add GitHub Dependabot configuration to update GitHub workflows
by Sebastian Nagel
· 10 days ago
e7db045
NUTCH-3188 CI builds startup failure because of disallowed workflow actions
by Sebastian Nagel
· 6 days ago
8e03a3e
Add in-repo threat model and point AGENTS.md/SECURITY.md at it (#922)
by Jarek Potiuk
· 8 days ago
87483c6
NUTCH-1732: allow deleting non-parsable documents (#891)
by igiguere
· 9 days ago
5bcb0ab
NUTCH-3180 BasicURLNormalizer missing catching ICUInputTooLongException (#923)
by Luca
· 10 days ago
571c2bc
NUTCH-3178 Add AGENTS.md and SECURITY.md to the Nutch repository
by Sebastian Nagel
· 3 weeks ago
a0814a7
NUTCH-3177 Fetcher to report idle threads not as hung threads
by Sebastian Nagel
· 6 weeks ago
2984770
Merge pull request #914 from sebastian-nagel/NUTCH-3176-idna2008
by Sebastian Nagel
· 2 weeks ago
a6159ea
NUTCH-3174 protocol-okhttp: request may hang despite http.time.limit is set
by Sebastian Nagel
· 7 weeks ago
9f78c72
NUTCH-3173 protocol-okhttp: store OkHttp's internal URL in response metadata (#919)
by Luca
· 2 weeks ago
c246f6cc
Merge pull request #920 from potiuk/asf-security/agents-md-security-md-init-2026-05-27
by Sebastian Nagel
· 3 weeks ago
8fc57d9
Merge pull request #918 from prakharchaube/NUTCH-3164
by Sebastian Nagel
· 3 weeks ago
fc26180
NUTCH-3164 Added Unit Test case
by Prakhar Chaube
· 3 weeks ago
dae9f9d
Add AGENTS.md + SECURITY.md linking the project's security model
by Jarek Potiuk
· 4 weeks ago
fba8bc5
Merge pull request #916 from apache/infrastructure-ruleset-bot/default-branch-protection
by Sebastian Nagel
· 4 weeks ago
1482fcb
Extend ASF GitHub integration branch protection patterns
by Sebastian Nagel
· 4 weeks ago
infrastructure-ruleset-bot/default-branch-protection
0c4f51d
NUTCH-3164 Catch specific exceptions in CrawlDbFilter so plugin errors no longer silently drop URLs
by Prakhar Chaube
· 4 weeks ago
0865e1a
Set up default protection ruleset for default and release branches
by The Apache Software Foundation
· 5 weeks ago
afb9439
Format source code
by Sebastian Nagel
· 6 weeks ago
84699a6
NUTCH-3176 URLUtil and urlnormalizer-basic: add support for IDNA2008
by Sebastian Nagel
· 6 weeks ago
947cd28
NUTCH-3176 URLUtil and urlnormalizer-basic: add support for IDNA2008
by Sebastian Nagel
· 6 weeks ago
36cc230
Refactor calls of URLDecoder and pass Charset instead of String (since Java 10)
by Sebastian Nagel
· 6 weeks ago
66f55e1
NUTCH-3176 URLUtil and urlnormalizer-basic: add support for IDNA2008
by Sebastian Nagel
· 6 weeks ago
54efa9f
NUTCH-3176 URLUtil and urlnormalizer-basic: add support for IDNA2008
by Sebastian Nagel
· 6 weeks ago
57c290e
Merge pull request #913 from lewismc/NUTCH-3175
by Sebastian Nagel
· 6 weeks ago
cbb1d09
NUTCH-3175 Implement integration testing framework for Nutch Protocol plugins using Testcontainers
by lewismc
· 6 weeks ago
9390674
NUTCH-3175 Implement integration testing framework for Nutch Protocol plugins using Testcontainers
by lewismc
· 6 weeks ago
2009e0e
NUTCH-3167 Upgrade to Hadoop 3.5.0 (#911)
by Lewis John McGibbney
· 7 weeks ago
31c44b2
NUTCH-3163 Integrate Apache Yetus' pre-commit patch testing into Nutch GitHub Continuous Integration (#907)
by Lewis John McGibbney
· 9 weeks ago
e47cfd5
parse: fix 'occured' -> 'occurred' typo in ParseStatus FAILED javadoc (#910)
by Sai Asish Y
· 9 weeks ago
3b663de
NUTCH-3165 Remove the Nutch web service (#908)
by Lewis John McGibbney
· 10 weeks ago
df62fa1
NUTCH-3168 Sandbox Commons JEXL usage in crawl and index pipelines (#909)
by Lewis John McGibbney
· 10 weeks ago
96552ae
NUTCH-2932 Create OpenAPI specification for Nutch 1.x REST API (#896)
by Lewis John McGibbney
· 2 months ago
da55708
[NUTCH-3160] Remove System.exit(..) from reusable code (#903)
by Luca
· 4 months ago
eed3445
NUTCH-3085 Augment CI by adding code coverage and code quality reporting (#902)
by Lewis John McGibbney
· 4 months ago
f1d3e8a
NUTCH-3085 Augment CI by adding code coverage and code quality reporting (#901)
by Lewis John McGibbney
· 4 months ago
89e6ec1
NUTCH-3085 Augment CI by adding code coverage and code quality reporting (#900)
by Lewis John McGibbney
· 4 months ago
9f3bb41
Bump SonarSource/sonarqube-scan-action from 5 to 6 in /.github/workflows (#899)
by dependabot[bot]
· 4 months ago
21fd780
NUTCH-3085 Augment CI by adding code coverage and code quality reporting (#898)
by Lewis John McGibbney
· 4 months ago
39cfc61
NUTCH-3085 Augment CI by adding code coverage and code quality reporting (#897)
by Lewis John McGibbney
· 4 months ago
64ac8b4
NUTCH-3154 Implement integration testing framework for Nutch IndexWriter plugins using Testcontainers (#895)
by Lewis John McGibbney
· 4 months ago
ceb23e1
NUTCH-3145 Upgrade to JUnit 6 (#883)
by Lewis John McGibbney
· 4 months ago
4337927
Merge pull request #825 from lewismc/NUTCH-3064
by Sebastian Nagel
· 4 months ago
674914c
Prepare for new development after release of 1.22
by Sebastian Nagel
· 4 months ago
195b4c0
NUTCH-3153 Update of license and notice files
by Sebastian Nagel
· 4 months ago
1d25cb8
NUTCH-3152 Job counters getGroup to use metrics constants
by Sebastian Nagel
· 4 months ago
f7c7e1a
NUTCH-3150 Expand Caching Hadoop Counter References (#892)
by Lewis John McGibbney
· 4 months ago
1242e22
NUTCH-3142 Add Error Context to Metrics (#882)
by Lewis John McGibbney
· 5 months ago
3101a9e
Merge pull request #887 from lewismc/NUTCH-3110
by Sebastian Nagel
· 5 months ago
f8577a0
NUTCH-3143 GitHub workflow does not run all unit tests (#890)
by Lewis John McGibbney
· 5 months ago
7c5a529
NUTCH-3143 GitHub workflow does not run all unit tests (#889)
by Lewis John McGibbney
· 5 months ago
7d05fec
NUTCH-3064 Upgrade index-geoip to GeoIP2 5.0.2
by lewismc
· 5 months ago
f36a836
NUTCH-3064 Upgrade index-geoip to GeoIP2 5.0.2
by lewismc
· 5 months ago
3ddae0b
Merge remote-tracking branch 'origin' into NUTCH-3064
by lewismc
· 5 months ago
4207bc3
NUTCH-3148 Cache Ivy dependencies in GitHub CI builds (#886)
by Lewis John McGibbney
· 5 months ago
713835b
NUTCH-3110 Upgrade to Tika 3.2.3
by lewismc
· 5 months ago
7f724a9
Merge pull request #880 from igiguere/NUTCH-1564-AdaptiveFetchSchedule-refetch
by Sebastian Nagel
· 5 months ago
ddabe96
NUTCH-3144 URLUtil unit tests fail after upgrade to crawler-commons 1.6
by Sebastian Nagel
· 6 months ago
8e7bbc4
NUTCH-3143 GitHub workflow does not run all unit tests (#885)
by Lewis John McGibbney
· 5 months ago
ec8747a
NUTCH-3143 GitHub workflow does not run all unit tests (#884)
by Lewis John McGibbney
· 5 months ago
8186b04
NUTCH-3064: Upgrade index-geoip to GeoIP2 5.0.2
by lewismc
· 5 months ago
c01cc22
NUTCH-3064: Upgrade index-geoip to GeoIP2 5.0.2
by lewismc
· 5 months ago
b849499
NUTCH-3064: Upgrade index-geoip to GeoIP2 5.0.2
by lewismc
· 5 months ago
66f678e
NUTCH-3141 Cache Hadoop Counter References in Hot Paths (#878)
by Lewis John McGibbney
· 5 months ago
103fff6
NUTCH-1564: address code review comments.
by Isabelle Giguere
· 5 months ago
58687ec
NUTCH-1564: fix AdaptiveFetchSchedule for unmodified pages
by Isabelle Giguere
· 6 months ago
d5dccfb
NUTCH-1564: fix immediate refetch for pages not modified
by Isabelle Giguere
· 6 months ago
00bf8c4
NUTCH-3139 protocol-okhttp: add support for zstd content-encoding
by Sebastian Nagel
· 6 months ago
8a0fb2b
NUTCH-3137 Upgrade Nutch core dependencies (#875)
by Sebastian Nagel
· 6 months ago
c7cf569
NUTCH-3136 Upgrade crawler-commons dependency
by Sebastian Nagel
· 6 months ago
50b1ee6
NUTCH-3136 Upgrade crawler-commons dependency
by Sebastian Nagel
· 6 months ago
8307b6b
NUTCH-3135 Cache downloaded ant-eclipse.jar
by Sebastian Nagel
· 6 months ago
de27acc
NUTCH-3133 Upgrade GitHub workflows to JDK 17
by Sebastian Nagel
· 6 months ago
ca2591e
NUTCH-3134 Add latency metrics with percentile support to Fetcher, Parser, and Indexer (#876)
by Lewis John McGibbney
· 6 months ago
f71bab4
NUTCH-3132 Standardize existing Nutch metrics naming and implementation (#871)
by Lewis John McGibbney
· 6 months ago
7b5ed23
NUTCH-3126 Report JUnit test results in GitHub pull request thread (#868)
by Lewis John McGibbney
· 6 months ago
f65371d
Merge pull request #870 from igiguere/NUTCH-2971
by Sebastian Nagel
· 7 months ago
f43ff78
fix for NUTCH-2671 contributed by igiguere. Also fixes NUTCH-3128, NUTCH-3125
by Isabelle Giguere
· 7 months ago
1156801
NUTCH-3040 Upgrade to Hadoop 3.4.2 (#866)
by Lewis John McGibbney
· 7 months ago
317d2de
NUTCH-3126 Report JUnit test results in GitHub pull request thread (#867)
by Lewis John McGibbney
· 8 months ago
cefb48a
NUTCH-3099 Allow wildcard '*' in http.proxy.exception.list (via Isabelle Giguere) (#865)
by Lewis John McGibbney
· 8 months ago
2d92366
NUTCH-3126 Report JUnit test results in GitHub pull request thread (#863)
by Lewis John McGibbney
· 9 months ago
667e217
Merge pull request #864 from sebastian-nagel/NUTCH-2887-junit4-mrunit
by Sebastian Nagel
· 9 months ago
a966c44
NUTCH-2887 Migrate to JUnit 5 Jupiter
by Sebastian Nagel
· 9 months ago
919e245
NUTCH-2887 Migrate to JUnit 5 Jupiter
by Sebastian Nagel
· 9 months ago
cfcf2d7
NUTCH-2887 Migrate to JUnit 5 Jupiter
by Sebastian Nagel
· 9 months ago
e2b60fc
NUTCH-2887 Migrate to JUnit 5 Jupiter (#862)
by Lewis John McGibbney
· 9 months ago
4c04a98
NUTCH-2887 Migrate to JUnit 5 Jupiter (#861)
by Lewis John McGibbney
· 10 months ago
3991c5b
Merge pull request #859 from TamimEhsan/NUTCH-3122
by Sebastian Nagel
· 10 months ago
7e43e12
NUTCH-3124 Github workflow not run because of uncertified action "paths-changes-filter"
by Sebastian Nagel
· 10 months ago
365f585
[NUTCH-3122] Add test for backward compatibility of SpellCheckedMetadata
by Tamim Ehsan
· 10 months ago
5ae91b6
[NUTCH-3122] Make SpellCheckedMetadata case-insensitive for all Metadata names
by Tamim Ehsan
· 10 months ago