HBase provides an optional REST API (previously called Stargate). See the HBase REST Setup section below for getting started with the HBase REST API and Knox with the Hortonworks Sandbox environment.
The gateway by default includes a sample topology descriptor file {GATEWAY_HOME}/deployments/sandbox.xml. The value in this sample is configured to work with an installed Sandbox VM.
    <service>
        <role>WEBHBASE</role>
        <url>http://localhost:60080</url>
        <param>
            <name>replayBufferSize</name>
            <value>8</value>
        </param>
    </service>
By default the gateway is configured to use port 60080 for HBase in the Sandbox. Please see the steps to configure the port mapping below.
A default replayBufferSize of 8KB is shown in the sample topology file above. This may need to be increased if your query size is larger.
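For example, if requests are larger than 8 KB, the buffer could be raised by updating the parameter value in the service definition of the topology descriptor. This is only a sketch; 16 is an illustrative value and should be sized to your largest requests.

    <service>
        <role>WEBHBASE</role>
        <url>http://localhost:60080</url>
        <param>
            <name>replayBufferSize</name>
            <!-- Illustrative value only; choose a size that covers your largest requests -->
            <value>16</value>
        </param>
    </service>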
| Access  | URL                                                                        |
| ------- | -------------------------------------------------------------------------- |
| Gateway | https://{gateway-host}:{gateway-port}/{gateway-path}/{cluster-name}/hbase |
| Cluster | http://{hbase-rest-host}:8080/                                             |
The examples below illustrate the set of basic operations with an HBase instance using the REST API. Use the following link to get more details about the HBase REST API: http://hbase.apache.org/book.html#_rest.
Note: Some HBase examples may not work if Access Control is enabled, because the user may not be granted permission to perform the operations in the samples. To check whether Access Control is configured in the HBase instance, verify that hbase-site.xml lists org.apache.hadoop.hbase.security.access.AccessController in the hbase.coprocessor.master.classes and hbase.coprocessor.region.classes properties.
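For reference, when Access Control is enabled these properties typically look similar to the following fragment of hbase-site.xml (the actual values in your cluster may list additional coprocessor classes):

    <!-- Illustrative hbase-site.xml fragment; actual values may include other coprocessors -->
    <property>
        <name>hbase.coprocessor.master.classes</name>
        <value>org.apache.hadoop.hbase.security.access.AccessController</value>
    </property>
    <property>
        <name>hbase.coprocessor.region.classes</name>
        <value>org.apache.hadoop.hbase.security.access.AccessController</value>
    </property>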
To grant the Read, Write, and Create permissions to the guest user, execute the following command:
echo grant 'guest', 'RWC' | hbase shell
If you are using a cluster secured with Kerberos you will need to have used kinit to authenticate to the KDC.
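For example (the principal shown below is only a placeholder; substitute a principal and realm that are valid for your KDC):

    # Placeholder principal; replace with one valid for your KDC
    kinit guest@EXAMPLE.COM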
The command below launches the REST daemon on port 8080 (the default)
sudo {HBASE_BIN}/hbase-daemon.sh start rest
Where {HBASE_BIN} is /usr/hdp/current/hbase-master/bin/ in the case of an HDP install.
To use a different port use the -p option:
sudo {HBASE_BIN}/hbase-daemon.sh start rest -p 60080
If it becomes necessary to restart HBase you can log into the hosts running HBase and use these steps.
    sudo {HBASE_BIN}/hbase-daemon.sh stop rest
    sudo -u hbase {HBASE_BIN}/hbase-daemon.sh stop regionserver
    sudo -u hbase {HBASE_BIN}/hbase-daemon.sh stop master
    sudo -u hbase {HBASE_BIN}/hbase-daemon.sh stop zookeeper

    sudo -u hbase {HBASE_BIN}/hbase-daemon.sh start regionserver
    sudo -u hbase {HBASE_BIN}/hbase-daemon.sh start master
    sudo -u hbase {HBASE_BIN}/hbase-daemon.sh start zookeeper
    sudo {HBASE_BIN}/hbase-daemon.sh start rest -p 60080
Where {HBASE_BIN} is /usr/hdp/current/hbase-master/bin/ in the case of an HDP Sandbox install.
For more details about client DSL usage please look at the chapter about the client DSL in this guide.
After launching the shell, execute the following command to be able to use the snippets below.

    import org.apache.knox.gateway.shell.hbase.HBase;
HBase.session(session).systemVersion().now().string
HBase.session(session).clusterVersion().now().string
HBase.session(session).status().now().string
HBase.session(session).table().list().now().string
HBase.session(session).table().schema().now().string
Request
Response
Example
    HBase.session(session).table(tableName).create()
        .attribute("tb_attr1", "value1")
        .attribute("tb_attr2", "value2")
        .family("family1")
            .attribute("fm_attr1", "value3")
            .attribute("fm_attr2", "value4")
        .endFamilyDef()
        .family("family2")
        .family("family3")
        .endFamilyDef()
        .attribute("tb_attr3", "value5")
        .now()
Request
Response
Example
    HBase.session(session).table(tableName).update()
        .family("family1")
            .attribute("fm_attr1", "new_value3")
        .endFamilyDef()
        .family("family4")
            .attribute("fm_attr3", "value6")
        .endFamilyDef()
        .now()
HBase.session(session).table(tableName).regions().now().string
HBase.session(session).table(tableName).delete().now()
Request
Response
Example
    HBase.session(session).table(tableName).row("row_id_1").store()
        .column("family1", "col1", "col_value1")
        .column("family1", "col2", "col_value2", 1234567890l)
        .column("family2", null, "fam_value1")
        .now()

    HBase.session(session).table(tableName).row("row_id_2").store()
        .column("family1", "row2_col1", "row2_col_value1")
        .now()
rowId is optional. Querying with null or empty rowId will select all rows.
Request
Response
Example
    HBase.session(session).table(tableName).row("row_id_1")
        .query()
        .now().string

    HBase.session(session).table(tableName).row().query().now().string

    HBase.session(session).table(tableName).row().query()
        .column("family1", "row2_col1")
        .column("family2")
        .times(0, Long.MAX_VALUE)
        .numVersions(1)
        .now().string
Request
Response
Example
    HBase.session(session).table(tableName).row("row_id_1")
        .delete()
        .column("family1", "col1")
        .now()

    HBase.session(session).table(tableName).row("row_id_1")
        .delete()
        .column("family2")
        .time(Long.MAX_VALUE)
        .now()
Request
Response
Example
    HBase.session(session).table(tableName).scanner().create()
        .column("family1", "col2")
        .column("family2")
        .startRow("row_id_1")
        .endRow("row_id_2")
        .batch(1)
        .startTime(0)
        .endTime(Long.MAX_VALUE)
        .filter("")
        .maxVersions(100)
        .now()
HBase.session(session).table(tableName).scanner(scannerId).getNext().now().string
HBase.session(session).table(tableName).scanner(scannerId).delete().now()
This example illustrates the sequence of all basic HBase operations:
There are several ways to do this depending upon your preference.
You can use the Groovy interpreter provided with the distribution.
java -jar bin/shell.jar samples/ExampleHBase.groovy
You can manually type in the KnoxShell DSL script into the interactive Groovy interpreter provided with the distribution.
java -jar bin/shell.jar
Each line from the file below will need to be typed or copied into the interactive shell.
    /**
     * Licensed to the Apache Software Foundation (ASF) under one
     * or more contributor license agreements. See the NOTICE file
     * distributed with this work for additional information
     * regarding copyright ownership. The ASF licenses this file
     * to you under the Apache License, Version 2.0 (the
     * "License"); you may not use this file except in compliance
     * with the License. You may obtain a copy of the License at
     *
     *     https://www.apache.org/licenses/LICENSE-2.0
     *
     * Unless required by applicable law or agreed to in writing, software
     * distributed under the License is distributed on an "AS IS" BASIS,
     * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
     * See the License for the specific language governing permissions and
     * limitations under the License.
     */
    package org.apache.knox.gateway.shell.hbase

    import org.apache.knox.gateway.shell.Hadoop

    import static java.util.concurrent.TimeUnit.SECONDS

    gateway = "https://localhost:8443/gateway/sandbox"
    username = "guest"
    password = "guest-password"
    tableName = "test_table"

    session = Hadoop.login(gateway, username, password)

    println "System version : " + HBase.session(session).systemVersion().now().string

    println "Cluster version : " + HBase.session(session).clusterVersion().now().string

    println "Status : " + HBase.session(session).status().now().string

    println "Creating table '" + tableName + "'..."

    HBase.session(session).table(tableName).create() \
        .attribute("tb_attr1", "value1") \
        .attribute("tb_attr2", "value2") \
        .family("family1") \
            .attribute("fm_attr1", "value3") \
            .attribute("fm_attr2", "value4") \
        .endFamilyDef() \
        .family("family2") \
        .family("family3") \
        .endFamilyDef() \
        .attribute("tb_attr3", "value5") \
        .now()

    println "Done"

    println "Table List : " + HBase.session(session).table().list().now().string

    println "Schema for table '" + tableName + "' : " + HBase.session(session) \
        .table(tableName) \
        .schema() \
        .now().string

    println "Updating schema of table '" + tableName + "'..."

    HBase.session(session).table(tableName).update() \
        .family("family1") \
            .attribute("fm_attr1", "new_value3") \
        .endFamilyDef() \
        .family("family4") \
            .attribute("fm_attr3", "value6") \
        .endFamilyDef() \
        .now()

    println "Done"

    println "Schema for table '" + tableName + "' : " + HBase.session(session) \
        .table(tableName) \
        .schema() \
        .now().string

    println "Inserting data into table..."

    HBase.session(session).table(tableName).row("row_id_1").store() \
        .column("family1", "col1", "col_value1") \
        .column("family1", "col2", "col_value2", 1234567890l) \
        .column("family2", null, "fam_value1") \
        .now()

    HBase.session(session).table(tableName).row("row_id_2").store() \
        .column("family1", "row2_col1", "row2_col_value1") \
        .now()

    println "Done"

    println "Querying row by id..."

    println HBase.session(session).table(tableName).row("row_id_1") \
        .query() \
        .now().string

    println "Querying all rows..."

    println HBase.session(session).table(tableName).row().query().now().string

    println "Querying row by id with extended settings..."

    println HBase.session(session).table(tableName).row().query() \
        .column("family1", "row2_col1") \
        .column("family2") \
        .times(0, Long.MAX_VALUE) \
        .numVersions(1) \
        .now().string

    println "Deleting cell..."

    HBase.session(session).table(tableName).row("row_id_1") \
        .delete() \
        .column("family1", "col1") \
        .now()

    println "Rows after delete:"

    println HBase.session(session).table(tableName).row().query().now().string

    println "Extended cell delete"

    HBase.session(session).table(tableName).row("row_id_1") \
        .delete() \
        .column("family2") \
        .time(Long.MAX_VALUE) \
        .now()

    println "Rows after delete:"

    println HBase.session(session).table(tableName).row().query().now().string

    println "Table regions : " + HBase.session(session).table(tableName) \
        .regions() \
        .now().string

    println "Creating scanner..."

    scannerId = HBase.session(session).table(tableName).scanner().create() \
        .column("family1", "col2") \
        .column("family2") \
        .startRow("row_id_1") \
        .endRow("row_id_2") \
        .batch(1) \
        .startTime(0) \
        .endTime(Long.MAX_VALUE) \
        .filter("") \
        .maxVersions(100) \
        .now().scannerId

    println "Scanner id=" + scannerId

    println "Scanner get next..."

    println HBase.session(session).table(tableName).scanner(scannerId) \
        .getNext() \
        .now().string

    println "Dropping scanner with id=" + scannerId

    HBase.session(session).table(tableName).scanner(scannerId).delete().now()

    println "Done"

    println "Dropping table '" + tableName + "'..."

    HBase.session(session).table(tableName).delete().now()

    println "Done"

    session.shutdown(10, SECONDS)
Set Accept Header to “text/plain”, “text/xml”, “application/json” or “application/x-protobuf”
    % curl -ik -u guest:guest-password\
     -H "Accept: application/json"\
     -X GET 'https://localhost:8443/gateway/sandbox/hbase/version'
Set Accept Header to “text/plain”, “text/xml” or “application/x-protobuf”
    % curl -ik -u guest:guest-password\
     -H "Accept: text/xml"\
     -X GET 'https://localhost:8443/gateway/sandbox/hbase/version/cluster'
Set Accept Header to “text/plain”, “text/xml”, “application/json” or “application/x-protobuf”
    curl -ik -u guest:guest-password\
     -H "Accept: text/xml"\
     -X GET 'https://localhost:8443/gateway/sandbox/hbase/status/cluster'
Set Accept Header to “text/plain”, “text/xml”, “application/json” or “application/x-protobuf”
    curl -ik -u guest:guest-password\
     -H "Accept: text/xml"\
     -X GET 'https://localhost:8443/gateway/sandbox/hbase'
    curl -ik -u guest:guest-password\
     -H "Accept: text/xml" -H "Content-Type: text/xml"\
     -d '<?xml version="1.0" encoding="UTF-8"?><TableSchema name="table1"><ColumnSchema name="family1"/><ColumnSchema name="family2"/></TableSchema>'\
     -X PUT 'https://localhost:8443/gateway/sandbox/hbase/table1/schema'

    curl -ik -u guest:guest-password\
     -H "Accept: application/json" -H "Content-Type: application/json"\
     -d '{"name":"table2","ColumnSchema":[{"name":"family3"},{"name":"family4"}]}'\
     -X PUT 'https://localhost:8443/gateway/sandbox/hbase/table2/schema'
    curl -ik -u guest:guest-password\
     -H "Accept: text/xml"\
     -X GET 'https://localhost:8443/gateway/sandbox/hbase/table1/regions'

    curl -ik -u guest:guest-password\
     -H "Content-Type: text/xml"\
     -H "Accept: text/xml"\
     -d '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><CellSet><Row key="cm93MQ=="><Cell column="ZmFtaWx5MTpjb2wx" >dGVzdA==</Cell></Row></CellSet>'\
     -X POST 'https://localhost:8443/gateway/sandbox/hbase/table1/row1'

    curl -ik -u guest:guest-password\
     -H "Content-Type: text/xml"\
     -H "Accept: text/xml"\
     -d '<?xml version="1.0" encoding="UTF-8" standalone="yes"?><CellSet><Row key="cm93MA=="><Cell column=" ZmFtaWx5Mzpjb2x1bW4x" >dGVzdA==</Cell></Row><Row key="cm93MQ=="><Cell column=" ZmFtaWx5NDpjb2x1bW4x" >dGVzdA==</Cell></Row></CellSet>'\
     -X POST 'https://localhost:8443/gateway/sandbox/hbase/table2/false-row-key'
Set Accept Header to “text/plain”, “text/xml”, “application/json” or “application/x-protobuf”
    curl -ik -u guest:guest-password\
     -H "Accept: text/xml"\
     -X GET 'https://localhost:8443/gateway/sandbox/hbase/table1/*'
Set Accept Header to “text/plain”, “text/xml”, “application/json” or “application/x-protobuf”
    curl -ik -u guest:guest-password\
     -H "Accept: text/xml"\
     -X GET 'https://localhost:8443/gateway/sandbox/hbase/table1/row1/family1:col1'

    curl -ik -u guest:guest-password\
     -H "Accept: text/xml"\
     -X DELETE 'https://localhost:8443/gateway/sandbox/hbase/table2/row0'

    curl -ik -u guest:guest-password\
     -H "Accept: text/xml"\
     -X DELETE 'https://localhost:8443/gateway/sandbox/hbase/table2/row0/family3'
The scanner URL will be in the Location response header.
    curl -ik -u guest:guest-password\
     -H "Content-Type: text/xml"\
     -d '<Scanner batch="1"/>'\
     -X PUT 'https://localhost:8443/gateway/sandbox/hbase/table1/scanner'
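The response to this request includes a Location header similar to the following (the scanner ID will differ for each scanner); it is this ID that is used in the next two examples:

    Location: https://localhost:8443/gateway/sandbox/hbase/table1/scanner/13705290446328cff5ed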
    curl -ik -u guest:guest-password\
     -H "Accept: application/json"\
     -X GET 'https://localhost:8443/gateway/sandbox/hbase/table1/scanner/13705290446328cff5ed'

    curl -ik -u guest:guest-password\
     -H "Accept: text/xml"\
     -X DELETE 'https://localhost:8443/gateway/sandbox/hbase/table1/scanner/13705290446328cff5ed'

    curl -ik -u guest:guest-password\
     -X DELETE 'https://localhost:8443/gateway/sandbox/hbase/table1/schema'
Please look at #[Default Service HA support] if you wish to explicitly list the URLs under the service definition.
If you run the HBase REST Server from the HBase Region Server nodes, you can take advantage of more advanced HA support. The HBase REST Server does not register itself with ZooKeeper, so the Knox HA component looks in ZooKeeper for instances of HBase Region Servers and then performs a lightweight ping for the presence of the REST Server on the same hosts. In this mode the user should not supply URLs in the service definition.
Note: Users of Ambari must manually start the HBase REST Server.
To enable HA functionality for HBase in Knox the following configuration has to be added to the topology file.
    <provider>
        <role>ha</role>
        <name>HaProvider</name>
        <enabled>true</enabled>
        <param>
            <name>WEBHBASE</name>
            <value>maxFailoverAttempts=3;failoverSleep=1000;enabled=true;zookeeperEnsemble=machine1:2181,machine2:2181,machine3:2181</value>
        </param>
    </provider>
The role and name of the provider above must be as shown. The name in the ‘param’ section must match the role name of the service being configured for HA, and the value in the ‘param’ section is the configuration for that particular service in HA mode. In this case the name is ‘WEBHBASE’.
The various configuration parameters are described below:
maxFailoverAttempts - The maximum number of times a failover will be attempted. The failover strategy at this time is very simplistic: the next URL in the list of URLs provided for the service is tried, and the one that failed is put at the bottom of the list. If the list is exhausted and the maximum number of attempts has not been reached, the first URL is tried again after the list is refreshed from ZooKeeper.
failoverSleep - The amount of time in milliseconds that the process will wait before attempting to fail over.
enabled - Flag to turn the particular service on or off for HA.
zookeeperEnsemble - A comma-separated list of host names (or IP addresses) of the ZooKeeper hosts that make up the ensemble with which the HBase servers register their information.
For the service configuration itself, the URLs need NOT be added to the list. For example:
    <service>
        <role>WEBHBASE</role>
    </service>
Please note that there is no <url> tag specified here, as the URLs for the HBase REST servers are discovered via ZooKeeper as described above.