- 36211dd TIKA-4683 -- hyphens in ooxml (#2799) by Tim Allison · 6 hours ago (main)
- ddaebd0 TIKA-4683 -- fix ole digesting (#2798) by Tim Allison · 6 hours ago
- 4b78311 TIKA-4683 -- charset detector dep mgmt and order in AutoDetectReader (#2800) by Tim Allison · 6 hours ago
- d90734b TIKA-4327: update aws, swagger-annotations, mime4j, opennlp, zstd, joda-time by Tilman Hausherr · 10 hours ago
- e8c36c9 TIKA-4722: Add parse_context_json field to FetchAndParseRequest for per-request ParseContext configuration (#2797) by Nicholas DiPiazza · 34 hours ago
- 0300d94 TIKA-4327: update aws, jackson, grpc, micronaut by Tilman Hausherr · 36 hours ago
- 6c78186 TIKA-4683-rollback-encoding-detection (#2796) by Tim Allison · 2 days ago
- aeea39e improve epub handling of truncated files (#2795) by Tim Allison · 3 days ago
- 66a83d3 charset and junk tweaks (#2794) by Tim Allison · 3 days ago
- 6b53816 Automatically add slices and/or log underprovisioned pipes configurations (#2793) by Tim Allison · 4 days ago
- 4e0340b TIKA-4721: Fix TOCTOU race in SharedServerManager port assignment (#2791) by Nicholas DiPiazza · 4 days ago
- c9da280 switch to OK... we're not actually testing anything with SLOW (#2788) by Tim Allison · 4 days ago
- b6bcce6 TIKA-4327: update open-nlp by Tilman Hausherr · 5 days ago
- 0eac3d3 TIKA-4703: Fix tika-grpc container permissions for plugin directory (#2792) by Nicholas DiPiazza · 5 days ago
- bcfc0ea TIKA-4703: Fix tika-grpc Docker image missing runtime dependencies (#2790) by Nicholas DiPiazza · 5 days ago
- cb20b84 Bump org.jetbrains.kotlin:kotlin-stdlib from 2.3.20 to 2.3.21 (#2789) by dependabot[bot] · 6 days ago
- 3ad2e84 TIKA-4327: remove unneeded dependency by Tilman Hausherr · 7 days ago
- 9eb4f59 TIKA-4327: add comment about mchange-commons-java by Tilman Hausherr · 7 days ago
- da79f42 TIKA-4327: update gson by Tilman Hausherr · 7 days ago
- 0c99d53 TIKA-4720-wiring (#2787) by Tim Allison · 7 days ago
- dc9f770 TIKA-4695: revert gson, mchange due to test failure by Tilman Hausherr · 7 days ago
- 2ebdc22 TIKA-4327: update aws, commons-codec, c3p0, gson, mchange by Tilman Hausherr · 8 days ago
- e63170f TIKA-4720 -- Move charset detection to byte-bigram Naive Bayes pipeline (#2784) by Tim Allison · 9 days ago
- 7d34f9e TIKA-4719 -- Universalish junk detector (#2783) by Tim Allison · 9 days ago
- 638cbb2 TIKA-4327: update commons-io by Tilman Hausherr · 9 days ago
- 5dfa028 improve legacy charset detector to benefit from features of StandardHtmlEncodingDetector (#2786) by Tim Allison · 9 days ago
- e0d4a6d remove bestMatch (#2785) by Tim Allison · 9 days ago
- 4a43c2c TIKA-4703: Fix chmod failure in tika-grpc Dockerfile on CI (#2782) by Nicholas DiPiazza · 9 days ago
- dc99fff TIKA-4703: Fix Docker Hub secret name DOCKERHUB_USERNAME -> DOCKERHUB_USER (#2781) by Nicholas DiPiazza · 9 days ago
- c0d5296 TIKA-4703: Upgrade GitHub Actions to Node.js 24 compatible versions (#2780) by Nicholas DiPiazza · 9 days ago
- 0ae889f TIKA-4703: Pin docker/* actions to SHA digests per ASF policy (INFRA-27837) (#2779) by Nicholas DiPiazza · 10 days ago
- fd16980 TIKA-4327: update lombok by Tilman Hausherr · 11 days ago
- 65e8193 TIKA-4327: update aws, oak by Tilman Hausherr · 11 days ago
- 2260a19 TIKA-4703: Add Docker CI pipelines for tika-server and tika-grpc (#2715) by Nicholas DiPiazza · 11 days ago
- 306c79c TIKA-4327: update activation, jsoup, testcontainers, angus by Tilman Hausherr · 12 days ago
- 7a039f3 Add OCR encode parser module (#2769) by Cristian Zamfir · 12 days ago
- c12494c Bump eu.maveniverse.maven.nisse:extension from 0.8.3 to 0.8.4 (#2777) by dependabot[bot] · 13 days ago
- f5a4b55 Bump org.apache.maven:maven-model from 3.9.14 to 3.9.15 (#2778) by dependabot[bot] · 13 days ago
- c79648b TIKA-4327: update junrar, aws, google cloud, guava, reactor, spring, sqlite, swagger-annotations, oauth2, microsoft-graph by Tilman Hausherr · 2 weeks ago
- 9ecd958 charset-ship-today (#2776) by Tim Allison · 2 weeks ago
- 619077d 4x-reg-sax-fixes (#2773) by Tim Allison · 2 weeks ago
- 5bd8fbb update epub along the lines of oodt (#2774) by Tim Allison · 2 weeks ago
- 48e4ecc strip charset from mimes (#2775) by Tim Allison · 2 weeks ago
- 35ebe92 fix 'occured' -> 'occurred' in JoshuaNetworkTranslator log (#2772) by Sai Asish Y · 2 weeks ago
- 5396175 TIKA-4715 - try to fix osgi integration tests (#2758) by Tim Allison · 2 weeks ago
- 07e08fb clean up dwg parsing (#2770) by Tim Allison · 2 weeks ago
- c38a475 Merge remote-tracking branch 'origin/main' into 4x-reg-general-tweaks by tallison · 2 weeks ago
- b6c85ee Bump bouncycastle from 1.83 to 1.84 (#2771) by David Frizelle · 2 weeks ago
- 86a4f3e Merge remote-tracking branch 'origin/main' into 4x-reg-test-charset-detection-tweaks by tallison · 3 weeks ago
- 52b3ce6 improve hyperlink spacing by tallison · 3 weeks ago
- 4dd9e93 clean up file list pipes iterator by tallison · 3 weeks ago
- ed3a7b1 fix npe by tallison · 3 weeks ago
- d28c9b1 gate UTF-16 model output on 2-byte column-diversity asymmetry by tallison · 3 weeks ago
- d370f54 bump limit to something realistic by tallison · 3 weeks ago
- 4178451 sparse-Latin vCard IBM424 false positive test by tallison · 3 weeks ago
- 9ce41f5 win-1252 hack by tallison · 3 weeks ago
- 155a620 ebcdic gate by tallison · 3 weeks ago
- 4fbd3ec TIKA-4327: update junrar by Tilman Hausherr · 3 weeks ago
- 2a67718 TIKA-4327: update google api services by Tilman Hausherr · 3 weeks ago
- 2e6080b Bump software.amazon.awssdk:bom from 2.42.31 to 2.42.33 (#2763) by dependabot[bot] · 3 weeks ago
- 486db1b TIKA-4327: fix jackrabbit version by Tilman Hausherr · 3 weeks ago
- 4f9210f Bump io.swagger.core.v3:swagger-annotations from 2.2.46 to 2.2.47 (#2760) by dependabot[bot] · 3 weeks ago
- 000febb Bump google-auth-library-oauth2-http.version from 1.43.0 to 1.45.0 (#2761) by dependabot[bot] · 3 weeks ago
- 7eb29a6 Bump eu.maveniverse.maven.nisse:extension from 0.8.2 to 0.8.3 (#2764) by dependabot[bot] · 3 weeks ago
- f60a9ed Bump org.codelibs:jhighlight from 1.1.1 to 2.0.0 (#2765) by dependabot[bot] · 3 weeks ago
- a79f78b Bump com.google.cloud:google-cloud-storage from 2.64.1 to 2.66.0 (#2766) by dependabot[bot] · 3 weeks ago
- 7cc3069 TIKA-4327: remove beta by Tilman Hausherr · 3 weeks ago
- f642922 TIKA-4712 -- mv tests to integration test module, clarify purpose of tika-parser-standard-package (#2754) by Tim Allison · 3 weeks ago
- 59c84e7 TIKA-4718 - check for empty comment string in .doc (#2757) by Tim Allison · 3 weeks ago
- 9e1efa4 Update pipes docs (#2759) by Tim Allison · 3 weeks ago
- 4cf115c TIKA-4716 (#2755) by Tim Allison · 3 weeks ago
- 0a0b7e7 TIKA-4717 -- update/publish initial 4.0.0-SNAPSHOT docs (#2756) by Tim Allison · 3 weeks ago
- c5b9849 [TIKA-4327] set TZ because of metadata-extractor change in update (#2753) by Tilman Hausherr · 3 weeks ago
- 8501894 TIKA-4707 -- rm dom parsers for docx/pptx (#2747) by Tim Allison · 3 weeks ago
- dc2ee57 TIKA-4699 - fix bundle to handle tika-standard-parsers-package (#2752) by Tim Allison · 3 weeks ago
- 9b211c5 TIKA-4700 -- this adds a few more components to the activator and service loader for osgi. (#2751) by Tim Allison · 3 weeks ago
- 50cb0fa Merge branch 'main' of https://gitbox.apache.org/repos/asf/tika by Tilman Hausherr · 3 weeks ago
- fa6895c TIKA-4327: update aws, azure, error-prone-annotations by Tilman Hausherr · 3 weeks ago
- cff5a73 TIKA-4705 -- resourceName of nested tarball should not contain the parent directories of its parent gzip file, plus fixing typo where '.' was missing from gz extension (#2750) by iachimoe · 3 weeks ago
- 6c40c39 TIKA-4700 Support OSGi Service Loader Mediator (#2714) by Konrad Windszus · 4 weeks ago
- 3b46de8 [TIKA-4704] clean up to avoid remaining temp directory (#2749) by Tilman Hausherr · 4 weeks ago
- 6d9a61e [TIKA-4327] Update jwarc, aws (#2748) by Tilman Hausherr · 4 weeks ago
- 27ba181 [TIKA-4704] Implement pipes client shutdown to reduce temp directory leak by Tilman Hausherr · 4 weeks ago
- 4a65796 TIKA-4710-rtf-attachments-in-html-decapsulation (#2744) by Tim Allison · 4 weeks ago
- 7f69cc6 TIKA-4709: avoid shading grpc (#2745) by Tim Allison · 4 weeks ago
- c960d38 TIKA-4704: close client so that temp directory gets deleted (#2743) by Tilman Hausherr · 4 weeks ago
- 89a301f TIKA-4327: fix lombok, remove unneeded by Tilman Hausherr · 4 weeks ago
- c04a426 Bump log4j2.version from 2.25.3 to 2.25.4 (#2737) by dependabot[bot] · 4 weeks ago
- f07b696 Bump org.codehaus.plexus:plexus-classworlds from 2.9.0 to 2.10.0 (#2740) by dependabot[bot] · 4 weeks ago
- a070592 Bump io.swagger.core.v3:swagger-annotations from 2.2.45 to 2.2.46 (#2741) by dependabot[bot] · 4 weeks ago
- 1b1d16f Bump software.amazon.awssdk:bom from 2.42.23 to 2.42.28 (#2739) by dependabot[bot] · 4 weeks ago
- 3f0d120 Bump eu.maveniverse.maven.nisse:extension from 0.8.1 to 0.8.2 (#2742) by dependabot[bot] · 4 weeks ago
- d0d6c3b Bump com.nimbusds:nimbus-jose-jwt from 10.8 to 10.9 (#2738) by dependabot[bot] · 4 weeks ago
- ad78ebf Bump commonmark.version from 0.27.1 to 0.28.0 (#2736) by dependabot[bot] · 4 weeks ago
- ee971cf TIKA-4704: remove unused by Tilman Hausherr · 4 weeks ago
- 2570a08 [TIKA-4704] Implement tearDown method to remove temp directories (#2735) by Tilman Hausherr · 4 weeks ago
- 5847697 [TIKA-4704] pass JUnit tempdir; improve logging (#2734) by Tilman Hausherr · 4 weeks ago
- 19c3b8e TIKA-4692-improve-ooxml-sax-parsers (#2731) by Tim Allison · 4 weeks ago
- bfb5096 TIKA-4708-refactor-xlsx (#2733) by Tim Allison · 4 weeks ago
- fec0a99 decapsulate html from rtf within msgs...lol (#2713) by Tim Allison · 4 weeks ago