blob: a77f7ee20a6cf060fe9afc43f4a73ba8ed58cb53 [file] [log] [blame]
.. Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
.. http://www.apache.org/licenses/LICENSE-2.0
.. Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
Getting Started
===============
.. note::
This page covers how to start with kyuubi quickly on you
laptop in about 3~5 minutes.
Requirements
------------
For quick start deployment, we need to prepare the following stuffs:
- A **client** that connects and submits queries to the server. Here, we use the
kyuubi beeline for demonstration.
- A **server** that serves clients and manages engines.
- An **engine** that is used to instantiate query execution environments. Here we
use Spark for demonstration.
These essential components are JVM-based applications. So, the JRE needs to be
pre-installed and the ``JAVA_HOME`` is correctly set to each component.
================ ============ ==================== =======================================================
Component Role Version Remarks
================ ============ ==================== =======================================================
**Java** JRE 8/11/17 Officially released against JDK8
**Kyuubi** Gateway \ |release| \ - Kyuubi Server
Engine lib - Kyuubi Engine
Beeline - Kyuubi Beeline
**Spark** Engine 3.1 to 3.5 A Spark distribution
**Flink** Engine 1.17 to 1.19 A Flink distribution
**Trino** Engine N/A A Trino cluster allows to access via trino-client v411
**Doris** Engine N/A A Doris cluster
**Hive** Engine - 2.1-cdh6/2.3/3.1 - A Hive distribution
Metastore - N/A - An optional and external metadata store,
whose version is decided by engines
**Zookeeper** HA >=3.4.x
**Disk** Storage N/A N/A
================ ============ ==================== =======================================================
The other internal or external parts listed in the above sheet can be used individually
or all together. For example, you can use Kyuubi, Spark and Flink to build a streaming
data warehouse. And then, you can use Zookeeper to enable the load balancing for high
availability. The data could be stored in Hive, Apache Iceberg, or other DBMSs.
In what follows, we will only use Kyuubi and Spark.
Installation
------------
.. note::
:class: dropdown, toggle
This following instructions are based on binary releases. If you start with
source releases, please refer to the page for `building kyuubi`_.
Install Kyuubi
~~~~~~~~~~~~~~
The official releases, binary- and source-, are archived on the
`download page`_. Please download the most recent stable release
to start.
To install Kyuubi, you need to unpack the tarball. For example,
.. parsed-literal::
$ tar zxf apache-kyuubi-\ |release|\-bin.tgz
.. code-block::
:class: toggle
├── LICENSE
├── NOTICE
├── RELEASE
├── beeline-jars
├── bin
├── charts
└── kyuubi
├── conf
| ├── kyuubi-defaults.conf.template
├── kyuubi-env.sh.template
└── log4j2.xml.template
├── db-scripts
├── mysql
├── postgresql
└── sqlite
├── docker
├── Dockerfile
└── playground
├── externals
└── engines
├── jars
├── licenses
├── logs
├── pid
├── web-ui
└── work
From top to bottom are:
- LICENSE: the APACHE `LICENSE`_, VERSION 2.0 we claim to obey.
- RELEASE: the build information of this package.
- NOTICE: the notice made by Apache Kyuubi Community about its project and dependencies.
- bin: the entry of the Kyuubi server with `kyuubi` as the startup script.
- conf: all the defaults used by Kyuubi Server itself or creating a session with engines.
- externals
- engines: contains all kinds of SQL engines that we support
- licenses: a bunch of licenses included.
- jars: packages needed by the Kyuubi server.
- logs: where the logs of the Kyuubi server locates.
- pid: stores the PID file of the Kyuubi server instance.
- work: the root of the working directories of all the forked sub-processes, a.k.a. SQL engines.
Install Spark
~~~~~~~~~~~~~
The official releases, binary- and source-, are archived on the
`spark download page`_. Please download the most recent stable
release to start.
.. note::
:class: dropdown, toggle
Currently, Kyuubi is compiled and pre-built against Spark 3 and Scala 2.12
You will probably meet runtime exceptions if you use Spark 2 or Spark with
unsupported Scala versions.
To install Spark, you need to unpack the tarball. For example,
.. code-block::
$ tar zxf spark-3.4.2-bin-hadoop3.tgz
Configuration
~~~~~~~~~~~~~
The `kyuubi-env.sh` file is used to set system environment variables to the kyuubi
server process and engine processes it creates.
The `kyuubi-defaults.conf` file is used to set system properties to the kyuubi server
process and engine processes it creates.
Each file has a template lays in `conf` directory for your information. The following
are examples of the parameters necessary for a quick start with Spark.
- **JAVA_HOME**
.. code-block::
$ echo 'export JAVA_HOME=/path/to/java' >> conf/kyuubi-env.sh
- **SPARK_HOME**
.. code-block::
$ echo 'export SPARK_HOME=/path/to/spark' >> conf/kyuubi-env.sh
Start Kyuubi
------------
.. code-block::
$ bin/kyuubi start
If script above runs successfully, it will store the `PID` of the server instance
into `pid/kyuubi-<username>-org.apache.kyuubi.server.KyuubiServer.pid`.
And you are able to get the JDBC connection URL from the log file -
`logs/kyuubi-<username>-org.apache.kyuubi.server.KyuubiServer-<hostname>.out`.
For example,
Starting and exposing JDBC connection at: jdbc:kyuubi://localhost:10009/
If something goes wrong, you shall be able to find some clues in the log file too.
.. note::
:class: toggle
Alternatively, it can run in the foreground, with the logs and other output
written to stdout/stderr. Both streams should be captured if using a
supervision system like `supervisord`.
.. code-block::
bin/kyuubi run
Operate Clients
---------------
Kyuubi delivers a kyuubi-beeline client, enabling a similar experience to Apache Hive use cases.
Open Connections
~~~~~~~~~~~~~~~~
Replace the `host` and `port` with the actual ones you've got in the step of server startup
for the following JDBC URL. The case below open a session for user named `apache`.
.. code-block::
$ bin/kyuubi-beeline -u 'jdbc:kyuubi://localhost:10009/' -n apache
.. note::
:class: toggle
Use `--help` to display the usage guide for the kyuubi-beeline tool.
.. code-block::
$ bin/kyuubi-beeline --help
Execute Statements
~~~~~~~~~~~~~~~~~~
After successfully connected with the server, you can run sql queries in the kyuubi-beeline
console. For instance,
.. code-block::
:class: sql
> SHOW DATABASES;
You will see a wall of operation logs, and a result table in the kyuubi-beeline console.
.. code-block::
omitted logs
+------------+
| namespace |
+------------+
| default |
+------------+
1 row selected (0.2 seconds)
Start Engines
~~~~~~~~~~~~~
Engines are launched by the server automatically without end users' attention.
If you use the same user in the above case to create another connection, the
engine will be reused. You may notice that the time cost for connection here is
much shorter than the last round.
If you use a different user to create a new connection, another engine will be
started.
.. code-block::
$ bin/kyuubi-beeline -u 'jdbc:kyuubi://localhost:10009/' -n kentyao
This may change depending on the `engine share level`_ you set.
Close Connections
~~~~~~~~~~~~~~~~~
Close the session between kyuubi-beeline and Kyuubi server by executing `!quit`, for example,
.. code-block::
> !quit
Closing: 0: jdbc:kyuubi://localhost:10009/
Stop Engines
~~~~~~~~~~~~
Engines are stop by the server automatically according `engine lifecycle`_
without end users' attention. Terminations of connections do not necessarily
mean terminations of engines. It depends on both the `engine share level`_ and
`engine lifecycle`_.
Stop Kyuubi
-----------
Stop Kyuubi which running at the background by performing the following in the `$KYUUBI_HOME` directory:
.. code-block::
$ bin/kyuubi stop
And then, you will see the Kyuubi server waving goodbye to you.
The Kyuubi server will be stopped immediately while
the engine will still be alive for a while.
If you start Kyuubi again before the engine terminates itself,
it will reconnect to the newly created one.
.. _DOWNLOAD PAGE: https://kyuubi.apache.org/releases.html
.. _BUILDING KYUUBI: ../develop_tools/distribution.html
.. _SPARK DOWNLOAD PAGE: https://spark.apache.org/downloads.html
.. _LICENSE: https://www.apache.org/licenses/LICENSE-2.0
.. _ENGINE SHARE LEVEL: ../deployment/engine_share_level.html
.. _ENGINE LIFECYCLE: ../deployment/engine_lifecycle.html