As PIP-98 explained, the Pulsar documentation site today is built like an encyclopedia. New and existing users alike are overwhelmed by it. Without a clear path per role (developer / DevOps / …), they resort to skimming or reading everything in order to fit the pieces of the puzzle together into a complete picture of the knowledge they need.
New users usually start with the Getting Started section, which today focuses mainly on starting Pulsar on your development computer in several ways and then test-driving it by publishing and consuming messages using the CLI. It lacks a brief introduction to the subjects and terminology used throughout that section.
New users approaching a subject for the first time mainly fall into two groups: those who learn by reading and those who learn by doing (by example).
Today, people who learn by reading are forced to read the entire Pulsar documentation site and fit the pieces together, which is an immensely high bar for newcomers. Those who learn by example find no examples in today’s Getting Started section and are forced to google their way around many sites until they get their answers.
PIP-98, among other things, explained we should have several guides:
In the future, people who learn by reading will use the Developer Guide or the Operator Guide as their “book”. People who learn by doing will use the new Getting Started section we aim to present here, which caters to both developers and operators (also referred to as SRE, Infrastructure, or DevOps roles).
This PIP is focused on providing a new structure (table of contents) for the Getting Started Guide.
1. Quickstart
In this section, we will let users, whether in a developer or DevOps (operator) role, “feel” Pulsar using the command line. First, we’ll present two ways to start Pulsar in standalone mode (which includes BK and ZK, all within a single process): by downloading a binary and running it, or by issuing a single docker run command. We’ll also present a way to start Pulsar in cluster mode, which runs a separate process for each component, using Docker Compose. Then we’ll continue by starting a producer, which will produce a message every 5 seconds, and, in another terminal window, a consumer displaying those messages. We’ll use the Pulsar shell scripts for that, either directly if the user downloaded the binary, or through docker exec.
1.1 Step 1: Start Pulsar locally
1.1.1. Standalone mode
Here we’ll explain standalone mode and show two ways to start Pulsar on your development machine. In each section, we’ll show how to view the logs to check that Pulsar started correctly.
1.1.1.1 Using release binary
1.1.1.1.1. Downloading
1.1.1.1.2. Running
1.1.1.2. Using Docker
1.1.2. Cluster mode (Docker Compose)
Here we’ll reuse the existing content on the site that shows how to start a Pulsar cluster locally using Docker Compose.
1.2. Step 2: Publish and Consume messages using the CLI
1.2.1. Publish messages
Here we will explain how to use the CLI bundled with Pulsar to produce a message every 5 seconds. We’ll also take the opportunity to briefly explain what a topic is.
We’ll use tabs to display the CLI commands, since the invocation differs: if you downloaded the binary, you run the CLI directly, and if you used Docker, you run it through a docker exec command.
1.2.2 Consume messages
Here we will explain how to use the CLI bundled with Pulsar to consume those messages and display them to the standard output.
We’ll also take the opportunity to briefly explain what a subscription is.
1.3. Stopping Pulsar
This section will contain short steps for stopping Pulsar, whether it was started from a release binary, Docker, or Docker Compose, using tabs for the different ways.
2. Developer Guide
This will be a full-blown guide for developers. For now, we’re adding its first section: Getting Started.
2.1. Getting Started
This section is for developers who want a basic-level introduction to Pulsar by doing rather than by reading. Some people prefer to learn by doing and “feeling” it in their hands. Developers who prefer to learn by reading will skip it and go straight to an Overview section.
We will have two tutorials, each featuring a ready-made application (micro-service) showcasing the most basic Pulsar features and concepts. Each tutorial will link to a repository containing the full example, for readers who just want to see the complete code or run the example. The tutorial itself will be a step-by-step explanation of the example app, building it up in stages.
The tutorials were chosen because, in my opinion, they cover the most popular use cases for Pulsar or any other messaging system. For other use cases, you will turn to the Tutorials section (explained briefly at the beginning of the PIP), which contains less popular use cases.
Since the Pulsar SDK is available in several languages, we’ll write the same application first in Java and eventually in all languages Pulsar supports. Each directory in the repository will be dedicated to a single language. Each code snippet will have tabs allowing you to choose the language in which to view it.
2.1.1 Basic Job Queue
In this section, we’ll present a ready-made app that showcases Pulsar's ability to be used as a Job Queue. In our example, it will be a micro-service in charge of video encoding. Each message in the topic represents an encoding task to be done (download the file from S3, encode it, then upload it back to S3).
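The job-queue idea above (each task is picked up by exactly one worker) can be sketched without Pulsar at all. The following is a minimal in-memory illustration using only the Java standard library. It is not the tutorial app and does not use the Pulsar client; the class name JobQueueSketch, the method encodeAll, and the task names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical in-memory stand-in for a job queue; a real app would consume from a Pulsar topic. */
final class JobQueueSketch {

    /** Drains the pre-loaded tasks with competing workers and returns how often each task ran. */
    static Map<String, Integer> encodeAll(List<String> videoTasks, int numWorkers) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(videoTasks);
        Map<String, Integer> timesHandled = new ConcurrentHashMap<>();
        ExecutorService workers = Executors.newFixedThreadPool(numWorkers);
        for (int i = 0; i < numWorkers; i++) {
            workers.execute(() -> {
                String task;
                // poll() hands each task to exactly one worker; null means the queue is drained.
                while ((task = queue.poll()) != null) {
                    // A real worker would download, encode, and upload the video here.
                    timesHandled.merge(task, 1, Integer::sum);
                }
            });
        }
        workers.shutdown();
        try {
            if (!workers.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return timesHandled;
    }

    public static void main(String[] args) {
        System.out.println(encodeAll(List.of("video-1", "video-2", "video-3", "video-4"), 2));
    }
}
```

Adding workers increases throughput without any task being processed twice, which is the property the tutorial would demonstrate with Pulsar consumers instead of threads.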
We’ll explain:
2.1.1.1 Prerequisite: running Pulsar in Standalone mode
This will link to section 1 (Quickstart), where we show how to start Pulsar locally.
We prefer that option to Testcontainers, since the Testcontainers library doesn’t exist in all languages yet.
2.1.1.2…2.1.1.x :
2.1.2. Event Sourcing example app
This section will showcase partitioned topics, Failover subscriptions, Key-shared subscriptions, and scaling producers.
The app environment is a beer factory. It has a Warehouse micro-service for managing the warehouse. Each time the stock increases or decreases inside the physical warehouse, the service writes the current stock level as a message to a partitioned topic. The message key is the beer catalog number, and the message value is the stock level as a number.
Another micro-service, Inventory, exposes a REST interface to retrieve the current stock level per beer catalog number. It consumes the stock level messages and persists them to Cassandra (key = beer catalog number).
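To make the Inventory service’s behaviour concrete, here is a minimal sketch of its state handling in plain Java. It is an illustration under stated assumptions: a HashMap stands in for Cassandra, no Pulsar client or REST framework is involved, and the names InventorySketch, onStockLevel, and currentStock are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalInt;

/** Hypothetical sketch of the Inventory service's state; a HashMap stands in for Cassandra. */
final class InventorySketch {
    private final Map<String, Integer> stockByCatalogNumber = new HashMap<>();

    /** Applies one consumed stock-level message: the latest value for a key wins. */
    void onStockLevel(String catalogNumber, int stockLevel) {
        stockByCatalogNumber.put(catalogNumber, stockLevel);
    }

    /** What the REST endpoint would return for a catalog number, if one is known. */
    OptionalInt currentStock(String catalogNumber) {
        Integer level = stockByCatalogNumber.get(catalogNumber);
        return level == null ? OptionalInt.empty() : OptionalInt.of(level);
    }

    public static void main(String[] args) {
        InventorySketch inventory = new InventorySketch();
        inventory.onStockLevel("beer-001", 10);
        inventory.onStockLevel("beer-001", 9); // a later message overwrites the earlier one
        System.out.println(inventory.currentStock("beer-001"));
    }
}
```

Because each message carries the full stock level and the latest write wins, applying two messages for the same catalog number in the wrong order would leave a stale value behind. That is why per-key ordering matters in the rest of this example.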
At first, the rate of changes and the number of beers in the catalog were small. The beer factory owners started with a partitioned topic with a single partition and a Failover subscription, since updates to the inventory level in Cassandra for the same beer catalog number had to be applied in order.
Once the beer factory got bigger, more changes were introduced and more beers were added to the catalog. At first they were bottlenecked by the updates to Cassandra, so they scaled Cassandra. The bottleneck then moved to the consumer, so they wanted to scale out the Inventory micro-service. Hence they switched to a Key-shared subscription, which keeps updates ordered per beer catalog number.
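The reason several consumers can share the work while updates stay ordered per catalog number can be shown with a small in-memory model. This is only a sketch: it routes by a simple hash-modulo of the key, which is an assumption made for illustration and not a description of how the Pulsar broker assigns keys to consumers. The names KeySharedSketch, Update, and dispatch are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory model of key-based dispatch; not the Pulsar broker's algorithm. */
final class KeySharedSketch {

    /** A stock-level update: key = beer catalog number, value = stock level. */
    static final class Update {
        final String catalogNumber;
        final int stockLevel;

        Update(String catalogNumber, int stockLevel) {
            this.catalogNumber = catalogNumber;
            this.stockLevel = stockLevel;
        }

        @Override
        public String toString() {
            return catalogNumber + "=" + stockLevel;
        }
    }

    /** Routes every update to exactly one consumer inbox, chosen from its key. */
    static List<List<Update>> dispatch(List<Update> updates, int numConsumers) {
        List<List<Update>> inboxes = new ArrayList<>();
        for (int i = 0; i < numConsumers; i++) {
            inboxes.add(new ArrayList<>());
        }
        for (Update update : updates) {
            // Same key -> same consumer; floorMod avoids negative indexes.
            int consumer = Math.floorMod(update.catalogNumber.hashCode(), numConsumers);
            inboxes.get(consumer).add(update); // arrival order is kept within each inbox
        }
        return inboxes;
    }

    public static void main(String[] args) {
        List<Update> updates = List.of(
                new Update("beer-001", 10), new Update("beer-002", 7),
                new Update("beer-001", 9), new Update("beer-002", 8));
        List<List<Update>> inboxes = dispatch(updates, 3);
        for (int i = 0; i < inboxes.size(); i++) {
            System.out.println("consumer " + i + ": " + inboxes.get(i));
        }
    }
}
```

All updates for one catalog number land in a single inbox in their original order, so each Inventory instance can write to Cassandra independently without reordering updates for the same beer.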
As they got even bigger, the bottleneck was now the broker. They increased the number of partitions and made sure they used a partitioner that writes the same key to the same partition.
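The property required of the partitioner, that the same key is always written to the same partition, can be sketched in a few lines of plain Java. This is a minimal illustration, assuming a hash-modulo scheme based on String.hashCode(); the class KeyPartitioner is hypothetical and is not part of the Pulsar API, and the partitioner used by the Pulsar client may hash keys differently.

```java
/** Hypothetical stand-in for a key-based partitioner; not part of the Pulsar API. */
final class KeyPartitioner {
    private final int numPartitions;

    KeyPartitioner(int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        this.numPartitions = numPartitions;
    }

    /** Maps a key to a partition index in [0, numPartitions). */
    int partitionFor(String key) {
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        KeyPartitioner partitioner = new KeyPartitioner(4);
        for (String catalogNumber : new String[] {"beer-001", "beer-002", "beer-001"}) {
            System.out.println(catalogNumber + " -> partition " + partitioner.partitionFor(catalogNumber));
        }
    }
}
```

Since the mapping depends only on the key and the partition count, every message for a given catalog number goes to one partition, where its order is preserved. Note that in this sketch, changing the partition count can move a key to a different partition.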
This example will include a brief explanation about:
3. Operator Guide
3.1. Getting Started
This section is aimed at a person in an operator role (sometimes referred to as Infrastructure / SRE / DevOps) who wants to get started with Pulsar. This role has different needs from those addressed by the developer Getting Started section. Operators want to try out Pulsar on their k8s cluster (whether minikube or a test k8s cluster) as opposed to Docker Compose or running a binary. Their learning mostly focuses on how to operate it: monitoring, security, and handling failure scenarios.
We’ll start by deploying Pulsar, BK, and ZK to k8s using Helm charts, and then test-drive the deployment by publishing and consuming messages using the CLI.
We’ll then deploy a demo application in which one service constantly generates data and writes it to Pulsar, while the other consumes it and increments a metric to show that consumption. It will be deployed alongside a Prometheus instance for collecting metrics and Grafana with bundled dashboards for Pulsar and the demo app.
Next, we’ll check that the demo app is working and learn a bit about Pulsar using the ready-made Pulsar and BK dashboards.
Next, we’ll walk through several scenarios to showcase Pulsar features:
The sidebar will look like this:
Discussion: https://lists.apache.org/thread/p8d8ks2ygqnq53oxqczxg2mtpf932wpg
Vote: https://lists.apache.org/thread/95p5mn873d6d3lsk5kgfks4n6x07x5pq