Local Docker Cluster for Phoenix Adapters

Brings up the full dependency stack (Hadoop / ZooKeeper / HBase / Phoenix) required to run phoenix-adapters on your laptop. It uses upstream images where they exist and custom images only where they don't.

| Component | Version | Image |
| --- | --- | --- |
| Apache ZooKeeper | 3.8.4 | library/zookeeper:3.8.4 (Docker Official) |
| Apache Hadoop (HDFS) | 3.3.6 | apache/hadoop:3.3.6 (Apache convenience build) |
| Apache HBase | 2.5.14-hadoop3 | phoenix-adapters/hbase-phoenix:latest (custom) |
| Apache Phoenix | 5.3.1 (phoenix-hbase-2.5) | bundled into phoenix-adapters/hbase-phoenix |
| Phoenix Adapters REST | this repo | phoenix-adapters/rest:latest (custom) |

Versions are kept in lockstep with the top-level pom.xml.

Apple Silicon. apache/hadoop:3.3.6 is amd64-only; the compose file pins platform: linux/amd64 so the NameNode/DataNode run under emulation (Rosetta or QEMU, depending on your Docker Desktop settings). Slower than native, but functional.

Layout

docker/
├── Dockerfile.hbase-phoenix         # HBase 2.5.14 + Phoenix 5.3.1
├── Dockerfile.phoenix-adapters      # Multi-stage build of the REST server
├── docker-compose.yml
├── conf/
│   ├── hbase/{hbase-site.xml,hbase-env.sh}
│   └── phoenix-adapters/hbase-site.xml      # Client-side overrides
└── scripts/
    ├── hbase-entrypoint.sh                  # hbase-master, hbase-regionserver
    ├── phoenix-adapters-entrypoint.sh
    └── smoke.sh                             # End-to-end DDB validation suite

ZooKeeper and Hadoop configuration lives entirely in docker-compose.yml, as environment variables that the upstream images template into XML.

Quick start

Prerequisites: Docker Desktop running; jq and curl on PATH (brew install jq on macOS).

From the project root:

# 1. Bring up the full stack (ZK + HDFS + HBase+Phoenix + REST) and BLOCK
#    until every service reports healthy (REST takes ~30-60s on a cold
#    start because Phoenix has to bootstrap SYSTEM.* tables).
#    First time: ~8-12 min -- most of that is Maven downloading ~1.5 GB
#    of dependencies into the BuildKit cache mount; subsequent runs reuse
#    the cache and rebuild in seconds.
docker compose -f docker/docker-compose.yml up -d --build --wait

# 2. Validate it works end-to-end (CRUD + UpdateItem + BatchWriteItem + streams).
bash docker/scripts/smoke.sh
# -> "Result: 21 checks PASSED across 18 API calls"

# 3. Use it. The DynamoDB-compatible REST endpoint is at http://localhost:8842 .
#    Point any AWS SDK at it (Java/Python/Node.js snippets in
#    phoenix-ddb-rest/README.md), or hit it directly with curl:
curl -s -X POST http://localhost:8842/ \
    -H 'Content-Type: application/x-amz-json-1.0' \
    -H 'X-Amz-Target: DynamoDB_20120810.ListTables' -d '{}'

# 4. Tear down when you're done.
docker compose -f docker/docker-compose.yml down       # keep volumes
docker compose -f docker/docker-compose.yml down -v    # also wipe HDFS + ZK
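
The curl call in step 3 generalises to other DynamoDB operations by changing the operation name in the X-Amz-Target header. A minimal sketch of such a wrapper follows, assuming the endpoint and headers shown above. The names ddb_target, ddb_call, and DDB_ENDPOINT are hypothetical and not part of this repo, and the sketch only covers operations that use the DynamoDB_20120810 prefix.

```shell
# Hypothetical helpers for ad-hoc calls; not shipped with this repo.
DDB_ENDPOINT="${DDB_ENDPOINT:-http://localhost:8842}"

# Print the X-Amz-Target value for a DynamoDB operation name.
ddb_target() {
    printf 'DynamoDB_20120810.%s' "$1"
}

# POST a JSON payload (default: empty object) for the given operation.
ddb_call() {
    local op="$1"
    local payload='{}'
    if [ "$#" -ge 2 ]; then
        payload="$2"
    fi
    curl -s -X POST "${DDB_ENDPOINT}/" \
        -H 'Content-Type: application/x-amz-json-1.0' \
        -H "X-Amz-Target: $(ddb_target "$op")" \
        -d "$payload"
}
```

Once the stack is healthy, ddb_call ListTables | jq . should behave like the curl command in step 3.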

URLs

| URL | Service |
| --- | --- |
| http://localhost:8842 | Phoenix Adapters REST (DynamoDB-compatible) |
| http://localhost:9870 | HDFS NameNode UI |
| http://localhost:9864 | HDFS DataNode UI |
| http://localhost:16010 | HBase Master UI |
| http://localhost:16030 | HBase RegionServer UI |

Two host ports are remapped because their defaults often collide on dev machines (macOS AirPlay on 9000, a locally installed Kafka/ZK on 2181):

| Service | Container | Host |
| --- | --- | --- |
| HDFS NameNode RPC | namenode:9000 | localhost:19000 |
| ZooKeeper client | zookeeper:2181 | localhost:12181 |

Inter-container traffic still uses the standard ports.
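
To make the remapping concrete, here is a small sketch that translates an in-cluster address into the address to use from the host. The function name host_addr is hypothetical. The fallback branch assumes a port is published unchanged, which holds for the UI ports listed above but not necessarily for every container port.

```shell
# Hypothetical helper: translate an in-cluster address to the address
# reachable from the host, using the two remapped ports from the table.
host_addr() {
    case "$1" in
        namenode:9000)  echo "localhost:19000" ;;
        zookeeper:2181) echo "localhost:12181" ;;
        # Assumption: any other port is published on the same number.
        *:*)            echo "localhost:${1##*:}" ;;
        *)              echo "unrecognised address: $1" >&2; return 1 ;;
    esac
}
```

For example, a client running on the host would connect to the value of host_addr zookeeper:2181, while a client inside the compose network keeps using zookeeper:2181.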

Bring up just the cluster (no REST)

docker compose -f docker/docker-compose.yml up -d --build --wait \
    zookeeper namenode datanode hbase-master hbase-regionserver

Validation suite

docker/scripts/smoke.sh exercises every supported DynamoDB API against the running REST server and asserts the expected behaviour. It prints each request, response, and assertion as it runs.

docker compose -f docker/docker-compose.yml up -d --build --wait
bash docker/scripts/smoke.sh

Exits 0 on full pass; exits non-zero on the first failed assertion and prints the offending response.

| Step | API |
| --- | --- |
| 1 | ListTables (baseline) |
| 2 | CreateTable (with StreamSpecification enabled, NEW_AND_OLD_IMAGES) |
| 3 | DescribeTable |
| 4 | PutItem (id=a) |
| 5 | UpdateItem (SET score, bonus, ReturnValues=ALL_NEW) |
| 6 | GetItem |
| 7 | PutItem (id=b) |
| 8 | Scan |
| 9 | Query |
| 10 | DeleteItem |
| 11 | Scan (after delete) |
| 12 | BatchWriteItem (mixed put + delete) |
| 13 | Scan paginated (drains all pages) |
| 14 | ListStreams |
| 15 | DescribeStream (polls until StreamStatus == ENABLED) |
| 16 | GetShardIterator (TRIM_HORIZON) |
| 17 | GetRecords (drains all pages) |
| 18 | DeleteTable |
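
Step 15 polls rather than asserting once, because the stream may take a moment to become ENABLED. The sketch below shows one generic way to write such a polling loop in shell. It is an illustration only and not necessarily how smoke.sh implements it; poll_until is a hypothetical name.

```shell
# Hypothetical retry helper: run a command until it succeeds or the
# attempt budget is used up. The command always runs at least once.
# Usage: poll_until ATTEMPTS DELAY_SECONDS CMD [ARGS...]
poll_until() {
    local attempts="$1"
    local delay="$2"
    shift 2
    local i=1
    while :; do
        if "$@"; then
            return 0
        fi
        # Give up without sleeping once the last attempt has failed.
        if [ "$i" -ge "$attempts" ]; then
            return 1
        fi
        i=$((i + 1))
        sleep "$delay"
    done
}
```

A caller would pass its own check, for example poll_until 30 2 stream_is_enabled, where stream_is_enabled is a function you write that returns 0 once DescribeStream reports the expected status.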

Poking around the cluster

HBase shell:

docker compose -f docker/docker-compose.yml exec hbase-master hbase shell
status
list
create 'demo', 'cf'
put 'demo', 'r1', 'cf:c1', 'hello'
scan 'demo'

Phoenix sqlline:

docker compose -f docker/docker-compose.yml exec hbase-master \
    /opt/phoenix/bin/sqlline.py zookeeper:2181
!tables
CREATE TABLE IF NOT EXISTS t1 (id BIGINT PRIMARY KEY, name VARCHAR);
UPSERT INTO t1 VALUES (1, 'phoenix-adapters');
SELECT * FROM t1;

Developer inner loop: code change → live endpoint

phoenix-ddb-rest/src/**.java
        │  (1) edit on host
        ▼
docker compose ... up -d --build phoenix-adapters-rest
   ├── stage 1: mvn package -DskipTests   (BuildKit caches ~/.m2)
   ├── stage 1 output: phoenix-ddb-assembly/target/*-bin.tar.gz
   └── stage 2: temurin runtime extracts that tarball
        │
        ▼
http://localhost:8842/   (new code, live)

The cluster (ZK + HDFS + HBase) keeps running across REST rebuilds, and HBase data persists across full down/up cycles.

The loop

  1. Edit code in phoenix-ddb-rest/src/... or phoenix-ddb-utils/src/....

  2. (Optional) sanity-check the compile on the host:

    mvn -B -DskipTests -pl phoenix-ddb-rest -am package
    
  3. Rebuild and recreate just the REST container:

    docker compose -f docker/docker-compose.yml up -d --build phoenix-adapters-rest
    

    Rebuilds with no dependency changes typically take 30-60 s on a warm cache.

  4. Watch logs:

    docker compose -f docker/docker-compose.yml logs -f phoenix-adapters-rest
    
  5. Hit the endpoint and verify.

Quick reference

| Task | Command |
| --- | --- |
| Rebuild REST + restart it | docker compose -f docker/docker-compose.yml up -d --build phoenix-adapters-rest |
| Restart REST (no code change) | docker compose -f docker/docker-compose.yml restart phoenix-adapters-rest |
| Tail REST logs | docker compose -f docker/docker-compose.yml logs -f phoenix-adapters-rest |
| Tail HBase logs | docker compose -f docker/docker-compose.yml logs -f hbase-master hbase-regionserver |
| HBase shell | docker compose -f docker/docker-compose.yml exec hbase-master hbase shell |
| Phoenix sqlline | docker compose -f docker/docker-compose.yml exec hbase-master /opt/phoenix/bin/sqlline.py zookeeper:2181 |
| List containers | docker compose -f docker/docker-compose.yml ps |
| Stop (keep data) | docker compose -f docker/docker-compose.yml down |
| Stop + wipe data | docker compose -f docker/docker-compose.yml down -v |
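
Every command in the table repeats the same docker compose -f docker/docker-compose.yml prefix. A shell function can shorten that. This is a convenience sketch and nothing in the repo depends on it; the names dc, COMPOSE_FILE_PATH, and DOCKER_BIN are hypothetical.

```shell
# Hypothetical shorthand for the compose invocation used throughout.
# Run it from the project root so the relative compose-file path resolves.
# DOCKER_BIN is an assumed override hook, handy for dry runs.
COMPOSE_FILE_PATH="${COMPOSE_FILE_PATH:-docker/docker-compose.yml}"

dc() {
    "${DOCKER_BIN:-docker}" compose -f "$COMPOSE_FILE_PATH" "$@"
}
```

With this defined, dc ps lists containers and dc logs -f phoenix-adapters-rest tails the REST logs. Setting DOCKER_BIN=echo prints the arguments that would be passed instead of running Docker.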

Edge cases

| Situation | What to do |
| --- | --- |
| Changed conf/hbase/hbase-site.xml or hbase-env.sh | docker compose ... up -d --build hbase-master hbase-regionserver. Existing tables survive. |
| Bumped hbase.version / phoenix.version in pom.xml | Bump matching ARGs in Dockerfile.hbase-phoenix, then --build hbase-master hbase-regionserver phoenix-adapters-rest. Often pair with down -v. |
| Added a Maven dep to phoenix-ddb-rest/pom.xml | --build phoenix-adapters-rest. New dep downloads once; cache warms after. |
| Clean slate | docker compose ... down -v then up -d --build. |
| Code doesn't seem picked up | You ran restart instead of up --build. restart does not rebuild. |
| Stack left running for days / many smoke iterations | HBase + REST logs grow unbounded inside the containers. down -v periodically to reclaim disk. |

Pre-PR checklist

# 1. Host-side compile + unit tests (no cluster required).
mvn -B clean install -DskipITs

# 2. End-to-end validation: fresh stack + full DDB round-trip including streams.
docker compose -f docker/docker-compose.yml down -v
docker compose -f docker/docker-compose.yml up -d --build --wait
bash docker/scripts/smoke.sh

# 3. Tear it down.
docker compose -f docker/docker-compose.yml down -v

If smoke.sh finishes with Result: 21 checks PASSED across 18 API calls, your change is wire-compatible end to end through Phoenix on dockerized HBase across CRUD, batch, and the change-stream chain.
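
Run by hand, the checklist leaves the stack up if an earlier step fails. The sketch below wraps the same commands so that teardown always runs and the overall exit status still reflects the checks. The function names are hypothetical; the commands inside them are copied from the checklist above.

```shell
# Steps 1-2 of the checklist; the && chain stops at the first failure.
pre_pr_checks() {
    mvn -B clean install -DskipITs &&
    docker compose -f docker/docker-compose.yml down -v &&
    docker compose -f docker/docker-compose.yml up -d --build --wait &&
    bash docker/scripts/smoke.sh
}

# Step 3 of the checklist.
pre_pr_teardown() {
    docker compose -f docker/docker-compose.yml down -v
}

# Run MAIN, then always run CLEANUP, and return MAIN's exit status.
run_with_teardown() {
    local rc=0
    "$1" || rc=$?
    "$2"
    return "$rc"
}
```

Invoke it as run_with_teardown pre_pr_checks pre_pr_teardown. The trade-off is that a failed run also wipes the cluster state, so skip the wrapper when you want to inspect logs after a failure.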

Running the REST server outside Docker

  1. Bring up only the cluster services.

  2. Add cluster hostnames to /etc/hosts (HBase advertises hostnames over ZK):

    127.0.0.1 zookeeper namenode datanode hbase-master hbase-regionserver
    
  3. Start the REST server pointing at the dockerized ZooKeeper:

    mvn -DskipTests clean package
    tar xzf phoenix-ddb-assembly/target/phoenix-adapters-*-bin.tar.gz -C /tmp
    cd /tmp/phoenix-adapters-*
    export JAVA_HOME=$(/usr/libexec/java_home -v 1.8)   # macOS example
    export PHOENIX_ADAPTERS_HOME=$(pwd)
    bin/phoenix-adapters rest foreground_start -p 8842 -z localhost:12181
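
Step 2 is easy to forget, and the resulting failure may not point at name resolution. A small pre-flight check, sketched below, reports which cluster hostnames are missing from a hosts file. The function name missing_cluster_hosts is hypothetical; the hostname list is the one from step 2.

```shell
# Hypothetical pre-flight check: print every cluster hostname that has no
# entry in the given hosts file (default /etc/hosts). The exit status is 0
# only when nothing is missing.
missing_cluster_hosts() {
    local hosts_file="${1:-/etc/hosts}"
    local missing=0
    local name
    for name in zookeeper namenode datanode hbase-master hbase-regionserver; do
        # Match the name as a whole field after the address, skipping
        # comment lines and anything after an inline '#'.
        if ! awk -v h="$name" '
                /^[[:space:]]*#/ { next }
                {
                    for (i = 2; i <= NF; i++) {
                        if ($i ~ /^#/) break
                        if ($i == h) found = 1
                    }
                }
                END { exit (found ? 0 : 1) }' "$hosts_file"; then
            echo "$name"
            missing=1
        fi
    done
    return "$missing"
}
```

Running missing_cluster_hosts before starting the REST server prints nothing and returns 0 when all five names resolve through /etc/hosts.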
    

Phoenix tuning baked into the image

docker/conf/hbase/hbase-site.xml enables what Phoenix 5.x needs for secondary indexes, DDL events, and the multi-priority RPC controller:

| Property | Value |
| --- | --- |
| hbase.coprocessor.master.classes | …PhoenixMasterObserver |
| hbase.coprocessor.regionserver.classes | …PhoenixRegionServerEndpoint |
| hbase.regionserver.wal.codec | …IndexedWALEditCodec |
| hbase.region.server.rpc.scheduler.factory.class | …PhoenixRpcSchedulerFactory |
| hbase.rpc.controllerfactory.class | …ServerRpcControllerFactory |
| phoenix.task.handling.interval.ms | 1000 |
| phoenix.task.handling.initial.delay.ms | 1 |

phoenix-server-hbase-2.5-5.3.1.jar is copied into ${HBASE_HOME}/lib/ so the coprocessors and WAL codec are visible to master and every RegionServer.

Why upstream images for ZK + Hadoop but not HBase?

| Component | Decision | Reason |
| --- | --- | --- |
| ZooKeeper 3.8.4 | Upstream zookeeper:3.8.4 | Docker Official, exact version, multi-arch. |
| Hadoop 3.3.6 | Upstream apache/hadoop:3.3.6 | Apache convenience build at the exact version. amd64-only, runs under emulation on Apple Silicon. |
| HBase 2.5.14-hadoop3 | Custom | No official Apache image; community images don't cover 2.5.14-hadoop3. |
| Phoenix 5.3.1 | Custom (layered on HBase) | No Phoenix image anywhere; server JAR must be on HBase's classpath. |

Troubleshooting

  • NameNode unhealthy on first start. First start formats the NameNode via ENSURE_NAMENODE_DIR. Watch with docker compose ... logs -f namenode.
  • HBase Master RegionTooBusyException / NotServingRegionException. Wait ~30 s after the RegionServer comes up; Phoenix bootstraps SYSTEM.* tables on its first connection and the REST server retries transparently.
  • REST exits with NoClassDefFoundError: org/apache/hadoop/fs/WithErasureCoding. The phoenix-ddb-assembly tarball ships hadoop-common:3.3.6 (from pom.xml) alongside hadoop-hdfs:3.4.x / hadoop-yarn:3.4.x (transitive from phoenix-core-client). The 3.4.x JARs register FileSystem impls that need WithErasureCoding, which only exists in hadoop-common 3.4+. When HBase returns a remote exception during bootstrap, the client tries to enumerate FileSystem impls, hits NoClassDefFoundError, and poisons the JVM. The REST image Dockerfile.phoenix-adapters strips the 3.4.x hadoop-hdfs*, hadoop-yarn-*, hadoop-mapreduce-client-*, and hadoop-distcp-* jars after extracting the tarball — the REST server only talks to HBase via RPC and never opens HDFS directly, so removing them is safe. If this error reappears, check that those rm -f lines in Dockerfile.phoenix-adapters weren't dropped.
  • Datanode denied communication with namenode. Cluster ID mismatch. docker compose down -v and bring the stack back up.
  • platform mismatch warnings on Apple Silicon. Expected for the Hadoop containers (amd64 image, emulated). No action needed.
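
For the NoClassDefFoundError case above, one quick diagnostic is to list any JARs in the REST server's library directory that belong to the families the Dockerfile is supposed to remove. The sketch below matches by file name only and does not inspect versions. The function name is hypothetical, and the directory is an argument because the install path inside the image is not documented here.

```shell
# Hypothetical audit: list JARs whose names match the families that
# Dockerfile.phoenix-adapters strips. Prints nothing when none remain.
list_conflicting_hadoop_jars() {
    find "$1" -type f \( \
        -name 'hadoop-hdfs*.jar' -o \
        -name 'hadoop-yarn-*.jar' -o \
        -name 'hadoop-mapreduce-client-*.jar' -o \
        -name 'hadoop-distcp-*.jar' \) | sort
}
```

Any output from this function against the extracted tarball's library directory suggests the rm -f lines were dropped or no longer match the JAR names.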

Customising versions

HBase / Phoenix versions are ARGs on Dockerfile.hbase-phoenix:

docker compose -f docker/docker-compose.yml build \
    --build-arg HBASE_VERSION=2.5.13 \
    --build-arg PHOENIX_VERSION=5.3.0 \
    hbase-master

Hadoop and ZooKeeper versions are pinned by tag in docker-compose.yml. Keep all four in lockstep with pom.xml.
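
Lockstep is easy to break when only one file is edited. The sketch below compares a pom.xml property with the default of a Dockerfile ARG. It assumes the property sits on a single line as <name>value</name>, that the ARG is declared as ARG NAME=value, and that both files spell the version identically (which may not hold if one side carries a suffix such as -hadoop3). All function names are hypothetical.

```shell
# Print the value of a single-line <name>value</name> property in a pom.xml.
pom_property() {
    sed -n "s:.*<$2>\(.*\)</$2>.*:\1:p" "$1" | head -n 1
}

# Print the default of an "ARG NAME=value" line in a Dockerfile.
dockerfile_arg() {
    sed -n "s/^ARG[[:space:]]\{1,\}$2=\(.*\)$/\1/p" "$1" | head -n 1
}

# Succeed when the pom property and the ARG default are the same string.
# Usage: versions_match POM DOCKERFILE PROPERTY ARG_NAME
versions_match() {
    local a b
    a="$(pom_property "$1" "$3")"
    b="$(dockerfile_arg "$2" "$4")"
    if [ -n "$a" ] && [ "$a" = "$b" ]; then
        return 0
    fi
    echo "mismatch: $3=$a vs $4=$b" >&2
    return 1
}
```

From the project root, versions_match pom.xml docker/Dockerfile.hbase-phoenix phoenix.version PHOENIX_VERSION would flag a Phoenix version drift, assuming the property and ARG names match those mentioned in this document.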