Make Docker images non-root, by default, and OpenShift compliant (#13376)

Master Issue: https://github.com/apache/pulsar/issues/11269

### Motivation

To increase the overall security of our Pulsar docker images, they should run as a non-root user by default. While updating these permissions, I made sure to comply with the OpenShift spec so that the docker image can run on that platform out of the box.

Once we finalize these changes, we will need to update the Apache Pulsar Helm chart to make sure that deployments take advantage of this feature. We'll use the `fsGroup` to make sure that k8s sets the appropriate file system permissions for the zookeeper, bookkeeper, and function pods.

### Modifications

* Default to run as UID 10000. As noted in the `Dockerfile`, this UID is arbitrary. No logic should rely on this id.
* Update filesystem permissions so that the group user has sufficient write permission. The group user is 0 (root).
* Remove unnecessary write access.
    * The `/pulsar/{conf,data,logs}` directories and their members must be writable by the root group. I don't know of any other directories that need to be written to. Note that the `bin/pulsar-admin` tool also creates a log file in the `/pulsar/logs` directory. Please let me know if there are any additional
    * Note also that the executable file permissions are already set in our git repo. Those permissions are inherited by the docker image when we run the `COPY` directive in the `Dockerfile`.
* There are no changes to the function worker in the k8s runtime. We do not need them because we already merged https://github.com/apache/pulsar/commit/04b5da0f95794259694cc781e8960b7e52fac06b.
* Add note to `conf/bkenv.sh`, as it is a `.sh` script that is not executable (and doesn't need to be).
* Update test docker image and `supervisord` configuration.
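
The permission changes listed above can be illustrated with a small shell sketch. It is not taken from the `Dockerfile` in this PR: the temporary directory stands in for `/pulsar`, and `example.conf` and the variable names are hypothetical. An image build would normally also set the group owner to 0 (for example with `chgrp -R 0`), which requires root and so appears only as a comment here.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-in for /pulsar so the sketch runs without root privileges.
pulsar_home="$(mktemp -d)"
mkdir -p "$pulsar_home/conf" "$pulsar_home/data" "$pulsar_home/logs"
touch "$pulsar_home/conf/example.conf"

# Start from a restrictive state: no group write access anywhere.
chmod -R g-w "$pulsar_home"

# Grant the group write access to the three writable directories and their members.
# (An image build would also run `chgrp -R 0` on them, which needs root.)
chmod -R g+w "$pulsar_home/conf" "$pulsar_home/data" "$pulsar_home/logs"

# Collect any path that is still missing the group write bit.
not_group_writable="$(find "$pulsar_home/conf" "$pulsar_home/data" "$pulsar_home/logs" ! -perm -g+w)"

if [ -z "$not_group_writable" ]; then
  echo "conf, data, and logs are group-writable"
else
  echo "missing group write permission:"
  echo "$not_group_writable"
fi
```

Because the check looks at the group write bit rather than at a specific user, it does not depend on the arbitrary UID the container runs as.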

Note: it's unclear to me how the OpenShift spec handles restarts. I know that the UID is arbitrary. It's possible that the umask needs to be switched from `022` to `002`. Setting the umask in the docker image does not persist for consumers of the image, so this would need to be set in a helm chart.
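
To make the umask question concrete, here is a minimal sketch of how `022` and `002` affect newly created files. The file and variable names are hypothetical, and the sketch only shows general umask behavior; it does not say which value OpenShift or the helm chart needs.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir="$(mktemp -d)"

# Each subshell gets its own umask, so the setting does not leak into the rest of the script.
(umask 022; touch "$workdir/created_with_022")
(umask 002; touch "$workdir/created_with_002")

# Keep only the ten-character permission string from `ls -l`.
perm_022="$(ls -l "$workdir/created_with_022" | cut -c1-10)"
perm_002="$(ls -l "$workdir/created_with_002" | cut -c1-10)"

echo "umask 022 -> $perm_022"
echo "umask 002 -> $perm_002"
```

With `022`, new files are created without group write permission; with `002`, the group can write to them. This matters when the process runs under an arbitrary UID that shares only the root group with the files it needs to modify.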

### Verifying this change

You can access a test image built with these changes here: `michaelmarshall/pulsar:2.10.0-SNAPSHOT`. I have already run some manual tests like `bin/pulsar standalone` in the container. I still need to deploy an actual cluster to verify that all of the unique components work correctly. Because we already merged https://github.com/apache/pulsar/commit/04b5da0f95794259694cc781e8960b7e52fac06b, the upgrade scenarios are already simplified. If this change is in 2.10.0, that means 2.8 and 2.9 will be compatible for certain function worker upgrade scenarios.

I wrote test criteria in https://github.com/apache/pulsar/issues/11269. I'll need to follow up on those criteria using my newly built image. I should be able to look closer at this tomorrow.

We'll also need tests to pass, as I modified some tests with this PR.

### References

The following links were useful in understanding how to make these changes:

* https://engineering.bitnami.com/articles/running-non-root-containers-on-openshift.html
* https://cloud.redhat.com/blog/a-guide-to-openshift-and-uids

### Does this pull request potentially affect one of the following parts:

This PR updates our Docker images in a breaking way. It could result in bookkeepers, zookeepers, or functions with insufficient permissions. We will mitigate these permission issues by updating the helm chart. These changes are easily overridden by extending the docker image. In k8s, you can use the pod's `securityContext` to override the user or group.

(cherry picked from commit f7f861988780578e2ba102ce4e22fa8841c13e3b)
12 files changed
tree: 5cded8123a1cf6263510e8ed8fc7c8ac5f32221e
README.md

Pulsar is a distributed pub-sub messaging platform with a very flexible messaging model and an intuitive client API.

Learn more about Pulsar at https://pulsar.apache.org

Main features

  • Horizontally scalable (Millions of independent topics and millions of messages published per second)
  • Strong ordering and consistency guarantees
  • Low latency durable storage
  • Topic and queue semantics
  • Load balancer
  • Designed for being deployed as a hosted service:
    • Multi-tenant
    • Authentication
    • Authorization
    • Quotas
    • Support mixing very different workloads
    • Optional hardware isolation
  • Keeps track of consumer cursor position
  • REST API for provisioning, admin and stats
  • Geo replication
  • Transparent handling of partitioned topics
  • Transparent batching of messages

Repositories

This repository is the main repository of Apache Pulsar. Pulsar PMC also maintains other repositories for components in the Pulsar ecosystem, including connectors, adapters, and other language clients.

Helm Chart

Ecosystem

Clients

Dashboard & Management Tools

Documentation

CI/CD

Build Pulsar

Requirements:

Compile and install:

$ mvn install -DskipTests

Compile and install an individual module (replace pulsar-broker with the module name):

$ mvn -pl pulsar-broker install -DskipTests

Minimal build (This skips most of external connectors and tiered storage handlers)

$ mvn install -Pcore-modules,-main -DskipTests

Run Unit Tests:

$ mvn test

Run Individual Unit Test (replace pulsar-client with the module name and ConsumerBuilderImplTest with the test name):

$ mvn -pl pulsar-client test -Dtest=ConsumerBuilderImplTest

Run Selected Test packages (replace pulsar-broker with the module name):

$ mvn test -pl pulsar-broker -Dinclude='org/apache/pulsar/**/*.java'

Start standalone Pulsar service:

$ bin/pulsar standalone

Check https://pulsar.apache.org for documentation and examples.

Build custom docker images

Docker images must be built with Java 8 for branch-2.7 or previous branches because of issue 8445. Java 11 is the recommended JDK version in master/branch-2.8.

This builds the docker images apachepulsar/pulsar-all:latest and apachepulsar/pulsar:latest.

mvn clean install -DskipTests
mvn package -Pdocker,-main -am -pl docker/pulsar-all -DskipTests

After the images are built, they can be tagged and pushed to your custom repository. Here's an example of a bash script that tags the docker images with the current version and git revision and pushes them to localhost:32000/apachepulsar.

# Target registry and project for the custom images.
image_repo_and_project=localhost:32000/apachepulsar
# Read the project version from the root pom.xml.
pulsar_version=$(mvn initialize help:evaluate -Dexpression=project.version -pl . -q -DforceStdout)
# Keep the first 9 characters of the current commit hash.
gitrev=$(git rev-parse HEAD | colrm 10)
tag="${pulsar_version}-${gitrev}"
echo "Using tag $tag"
# Tag and push both images.
docker tag apachepulsar/pulsar-all:latest "${image_repo_and_project}/pulsar-all:$tag"
docker push "${image_repo_and_project}/pulsar-all:$tag"
docker tag apachepulsar/pulsar:latest "${image_repo_and_project}/pulsar:$tag"
docker push "${image_repo_and_project}/pulsar:$tag"

Setting up your IDE

Apache Pulsar uses Lombok, so make sure your IDE is set up with the required plugins.

IntelliJ

Configure Project JDK to Java 11 JDK

  1. Open Project Settings.

    Click File -> Project Structure -> Project Settings -> Project.

  2. Select the JDK version.

    From the JDK version drop-down list, select Download JDK... or choose an existing recent Java 11 JDK version.

  3. In the download dialog, select version 11. You can pick a version from many vendors. Unless you have a specific preference, choose Eclipse Temurin (AdoptOpenJDK (Hotspot)).

Configure Java version for Maven in IntelliJ

  1. Open Maven Importing Settings dialog by going to Settings -> Build, Execution, Deployment -> Build Tools -> Maven -> Importing.

  2. Choose Use Project JDK for JDK for Importer setting. This uses the Java 11 JDK for running Maven when importing the project to IntelliJ. Some of the configuration in the Maven build is conditional based on the JDK version. Incorrect configuration gets chosen when the “JDK for Importer” isn't the same as the “Project JDK”.

  3. Validate that the JRE setting in Maven -> Runner dialog is set to Use Project JDK.

Configure annotation processing in IntelliJ

  1. Open Annotation Processors Settings dialog box by going to Settings -> Build, Execution, Deployment -> Compiler -> Annotation Processors.

  2. Select the following buttons:

    1. Enable annotation processing
    2. Obtain processors from project classpath
    3. Store generated sources relative to: Module content root
  3. Set the generated source directories to be equal to the Maven directories:

    1. Set “Production sources directory:” to “target/generated-sources/annotations”.
    2. Set “Test sources directory:” to “target/generated-test-sources/test-annotations”.
  4. Click OK.

  5. Install the Lombok plugin in IntelliJ.

Configure code style

  1. Open Code Style Settings dialog box by going to Settings -> Editor -> Code Style.

  2. Click on the gear symbol -> Import scheme -> IntelliJ IDEA code style XML

  3. Pick the file ${pulsar_dir}/src/idea-code-style.xml

  4. On the dialog box that opens, click OK.

  5. Ensure the scheme you just created is selected in Scheme dropdown then click OK.

Configure Checkstyle

  1. Install the Checkstyle-IDEA plugin.

  2. Open Checkstyle Settings dialog box by going to Settings -> Tools -> Checkstyle.

  3. Set Checkstyle version to 8.37.

  4. Set Scan scope to Only Java sources (including tests).

  5. Click the + button in the Configuration section to open a dialog to choose the checkstyle file.

    1. Enter a Description. For example, Pulsar.
    2. Select Use a local checkstyle file.
    3. Set File to buildtools/src/main/resources/pulsar/checkstyle.xml.
    4. Select Store relative to project location.
    5. Click Next -> Next -> Finish.
  6. Activate the configuration you just added by toggling the corresponding box.

  7. Click OK.

Further configuration in IntelliJ

  • When working on the Pulsar core modules in IntelliJ, reduce the number of active projects in IntelliJ to speed up IDE actions and reduce unrelated IDE warnings.

    • In IntelliJ's Maven UI's tree view under “Profiles”
      • Activate “core-modules” Maven profile
      • De-activate “main” Maven profile
      • Run the “Reload All Maven Projects” action from the Maven UI toolbar. You can also find the action by the name in the IntelliJ “Search Everywhere” window that gets activated by pressing the Shift key twice.
  • Run the “Generate Sources and Update Folders For All Projects” action from the Maven UI toolbar. You can also find the action by the name in the IntelliJ “Search Everywhere” window that gets activated by pressing the Shift key twice. Running the action takes about 10 minutes for all projects. This is faster when the “core-modules” profile is the only active profile.

IntelliJ usage tips

  • In the case of compilation errors with missing Protobuf classes, make sure to run the “Generate Sources and Update Folders For All Projects” action.

  • All of the Pulsar source code doesn't compile properly in IntelliJ and there are compilation errors.

    • Use the “core-modules” profile if working on the Pulsar core modules since the source code for those modules can be compiled in IntelliJ.
    • Sometimes it might help to mark a specific project as ignored in the IntelliJ Maven UI by right-clicking the project name and selecting Ignore Projects from the menu.
    • Currently, it is not always possible to run unit tests directly from the IDE because of the compilation issues. As a workaround, individual test classes can be run by using the mvn test -Dtest=TestClassName command.
  • The above steps have all been performed, but a test still won't run.

    • In this case, try the following steps:
      1. Close IntelliJ.
      2. Run mvn clean install -DskipTests on the command line.
      3. Reopen IntelliJ.
    • If that still doesn't work:
      1. Verify Maven is using a supported version. Currently, the supported version of Maven is specified in the section of the main pom.xml file.
      2. Try “restart and clear caches” in IntelliJ and repeat the above steps to reload projects and generate sources.

Eclipse

Follow the instructions here to configure your Eclipse setup.

Build Pulsar docs

Refer to the docs README.

Contact

Mailing lists
| Name | Scope | | | |
|---|---|---|---|---|
| users@pulsar.apache.org | User-related discussions | Subscribe | Unsubscribe | Archives |
| dev@pulsar.apache.org | Development-related discussions | Subscribe | Unsubscribe | Archives |

Slack

Pulsar slack channel at https://apache-pulsar.slack.com/

You can self-register at https://apache-pulsar.herokuapp.com/

License

Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0

Crypto Notice

This distribution includes cryptographic software. The country in which you currently reside may have restrictions on the import, possession, use, and/or re-export to another country, of encryption software. BEFORE using any encryption software, please check your country's laws, regulations and policies concerning the import, possession, or use, and re-export of encryption software, to see if this is permitted. See http://www.wassenaar.org/ for more information.

The U.S. Government Department of Commerce, Bureau of Industry and Security (BIS), has classified this software as Export Commodity Control Number (ECCN) 5D002.C.1, which includes information security software using or performing cryptographic functions with asymmetric algorithms. The form and manner of this Apache Software Foundation distribution makes it eligible for export under the License Exception ENC Technology Software Unrestricted (TSU) exception (see the BIS Export Administration Regulations, Section 740.13) for both object code and source code.

The following provides more details on the included cryptographic software: Pulsar uses the SSL library from Bouncy Castle written by http://www.bouncycastle.org.