A Helm chart to deploy Apache Tika on Kubernetes.

tika-helm

Artifact Hub | Version: 3.3.0 | Type: application | AppVersion: 3.3.0.0-full

We recommend aligning the Helm chart version with the version of Tika (and therefore the version of the Tika Docker image) you want to deploy. This ensures that you are using a chart version that has been tested against the corresponding production version, and that the documentation and examples for the chart work with the version of Tika you are installing.
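As a rough illustration of the alignment recommended above, the sketch below compares a chart version with an image tag. The helper name versions_aligned is hypothetical, and the rule it applies (the image tag starts with the chart version followed by a dot) is an assumption inferred from the versions shown above, not something the chart enforces.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the image tag starts with the chart
# version followed by a dot, e.g. chart 3.3.0 and image tag 3.3.0.0-full.
# The prefix convention is an assumption, not a documented rule.
versions_aligned() {
  chart_version="$1"
  app_version="$2"
  case "$app_version" in
    "$chart_version".*) return 0 ;;
    *) return 1 ;;
  esac
}

if versions_aligned "3.3.0" "3.3.0.0-full"; then
  echo "aligned"
else
  echo "mismatch"
fi
```

To read the real values for a published chart, helm show chart oci://apache.jfrog.io/tika-helm/tika --version <version> should print the chart metadata, including version and appVersion.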

Installing

Install released version from Helm OCI registry

Charts are published to the Tika Helm OCI repository on Apache JFrog Artifactory. Install directly from the OCI registry (Helm 3.8+).

N.B. The commands below install the chart into the tika-test namespace. Change the -n flag to install into a different namespace, or omit it to use your current namespace.

  • If the registry requires authentication (e.g. for private access), log in first: helm registry login apache.jfrog.io --username <your-username> --password <your-password>

  • Snapshot builds from main: Each merge publishes a chart to the same OCI repository with version {chart_version}-{git_short_sha} (for example 3.2.3-a1b2c3d). These are not official releases. Use helm install or helm pull with that version and the OCI URL below.

  • Install from OCI (replace <version> with the chart version you want, e.g. 3.2.3):

    • with Helm 3: helm install tika oci://apache.jfrog.io/tika-helm/tika --version <version> --set image.tag=<app-version> -n tika-test

    • Example:

      helm install tika oci://apache.jfrog.io/tika-helm/tika --version 3.2.3 --set image.tag=latest-full -n tika-test
      

      Example installation notes:

      NAME: tika
      LAST DEPLOYED: Mon Jan 24 13:38:01 2022
      NAMESPACE: tika-test
      STATUS: deployed
      REVISION: 1
      NOTES:
      1. Get the application URL by running these commands:
        export POD_NAME=$(kubectl get pods --namespace tika-test -l "app.kubernetes.io/name=tika,app.kubernetes.io/instance=tika" -o jsonpath="{.items[0].metadata.name}")
        export CONTAINER_PORT=$(kubectl get pod --namespace tika-test $POD_NAME -o jsonpath="{.spec.containers[0].ports[0].containerPort}")
        echo "Visit http://127.0.0.1:9998 to use your application"
        kubectl --namespace tika-test port-forward $POD_NAME 9998:$CONTAINER_PORT
      

You may notice that kubectl port forwarding times out, which ultimately terminates the port-forward process. In this case you can run port forwarding in a loop:

while true; do kubectl --namespace tika-test port-forward $POD_NAME 9998:$CONTAINER_PORT ; done

This should keep kubectl reconnecting whenever the connection is lost.
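A bounded variant of the loop above may be easier to stop and gentler on the API server. The sketch below is an assumption rather than part of the chart: restart_on_failure is a hypothetical helper that re-runs a command when it exits with an error, pauses between attempts, and gives up after a fixed number of tries.

```shell
#!/bin/sh
# Hypothetical helper: re-run a command each time it fails, pausing between
# attempts and giving up after max_attempts. Returns 0 if the command ever
# exits successfully, 1 otherwise.
restart_on_failure() {
  max_attempts="$1"
  delay_seconds="$2"
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $attempt of $max_attempts exited with an error" >&2
    attempt=$((attempt + 1))
    # Only pause when another attempt will follow.
    if [ "$attempt" -le "$max_attempts" ]; then
      sleep "$delay_seconds"
    fi
  done
  return 1
}

# Example (requires kubectl and the variables from the install notes):
# restart_on_failure 50 2 kubectl --namespace tika-test port-forward "$POD_NAME" "9998:$CONTAINER_PORT"
```

The attempt limit and delay shown in the commented example are arbitrary; pick values that suit your cluster.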

Note: The classic Helm repository (helm repo add tika https://apache.jfrog.io/artifactory/tika) is deprecated. Official releases and main-branch snapshot charts are published to the Helm OCI repository above.
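The snapshot version scheme described above can be sketched as follows. The helper names snapshot_version and print_install_cmd are hypothetical, the version and SHA are the placeholder values from the example above, and the script only prints the install command instead of running it. The --create-namespace flag is an optional addition that asks Helm to create the namespace if it does not exist yet.

```shell
#!/bin/sh
# Compose the snapshot chart version as {chart_version}-{git_short_sha}.
snapshot_version() {
  printf '%s-%s\n' "$1" "$2"
}

# Print (not run) an install command for a given chart version and namespace.
print_install_cmd() {
  printf 'helm install tika oci://apache.jfrog.io/tika-helm/tika --version %s --namespace %s --create-namespace\n' "$1" "$2"
}

version="$(snapshot_version "3.2.3" "a1b2c3d")"
print_install_cmd "$version" "tika-test"
```

Copy the printed command into your shell once the version and namespace look right.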

Install development version using main branch

  • Clone the git repo: git clone git@github.com:apache/tika-helm.git

  • Install it:

    • with Helm 3: helm install tika . --set image.tag=latest-full

Custom configuration for tika

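To use custom configuration values for Apache Tika, set the tikaConfig key in values.yaml. Additional files referenced by that configuration, such as a custom MIME types file, can be supplied under the additionalConfigs key. Example: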

tikaConfig: |
  <?xml version="1.0" encoding="UTF-8"?>
  <properties>
    <mtrandata>
      <mime-table-path>/tika-config/custom-mimetypes.xml</mime-table-path>
    </mtrandata>
    <parsers>
      <!-- Default Parser for most things, except for 2 mime types -->
      <parser class="org.apache.tika.parser.DefaultParser">
        <mime-exclude>image/jpeg</mime-exclude>
        <mime-exclude>application/pdf</mime-exclude>
      </parser>
    </parsers>
  </properties>

additionalConfigs:
  custom-mimetypes.xml: |
    <?xml version="1.0" encoding="UTF-8"?>
    <mime-info>
      <mime-type type="application/pdf">
        <magic priority="80">
          <match value="%PDF-" type="string" offset="0:8192"/>
        </magic>
        <glob pattern="*.pdf"/>
      </mime-type>
    </mime-info>
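One way to apply such a configuration without editing the chart's own values.yaml is to keep it in a separate values file and pass that file to Helm with -f. The sketch below assumes a hypothetical file name, my-values.yaml, uses a shortened version of the configuration above, and prints the install command rather than running it.

```shell
#!/bin/sh
# Write a custom values file (my-values.yaml is a hypothetical name).
# The quoted 'EOF' keeps the shell from expanding anything in the YAML.
values_file="my-values.yaml"
cat > "$values_file" <<'EOF'
tikaConfig: |
  <?xml version="1.0" encoding="UTF-8"?>
  <properties>
    <parsers>
      <parser class="org.apache.tika.parser.DefaultParser">
        <mime-exclude>application/pdf</mime-exclude>
      </parser>
    </parsers>
  </properties>
EOF

# Print the command to apply it; run it yourself once the file looks right.
echo "helm install tika oci://apache.jfrog.io/tika-helm/tika --version <version> -n tika-test -f $values_file"
```

Replace <version> with the chart version you want before running the printed command.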

Upgrading

Please check the artifacthub.io/changes annotation in Chart.yaml before upgrading.

Values

Testing

helm plugin install https://github.com/helm-unittest/helm-unittest.git
helm unittest .

See helm-unittest for canonical documentation.

Contributing

Please read CONTRIBUTING before contributing, or if you have any questions about our development and testing process.

More Information

For more information on Apache Tika Server, go to the Apache Tika Server documentation.

For more information on Apache Tika, go to the official Apache Tika project website.

For more information on the Apache Software Foundation, go to the Apache Software Foundation website.

License

The code is licensed permissively under the Apache License v2.0.

Maintainers

| Name | Email | Url |
|------|-------|-----|
| lewismc | lewismc@apache.org | https://github.com/lewismc |
| stijnbrouwers | | https://github.com/stijnbrouwers |
| philipsoutham | | https://github.com/philipsoutham |
| frascu | | https://github.com/frascu |
| euven | | https://github.com/euven |
| ps0uth | | https://github.com/ps0uth |
| ahilmathew | | https://github.com/ahilmathew |
| aidanthewiz | | https://github.com/aidanthewiz |
| bartek | | https://github.com/bartek |
| CiraciNicolo | | https://github.com/CiraciNicolo |
| amalucelli | | https://github.com/amalucelli |
| thatmlopsguy | | https://github.com/thatmlopsguy |

Autogenerated from chart metadata using helm-docs v1.14.2