blob: 041ed3e171aab9739364fa00cc394320a35a9a9f [file]
---
title: Introduction to HAWQ Tables
---
## <a id="tut_introhawqtbldisc"></a>Discussion
DISCUSS HAWQ TABLES HERE
In this exercise, you will create a HAWQ table, add data to the table, and perform simple queries. You will also load the Retail sample data set into HAWQ tables and perform more complicated queries.
## <a id="tut_introhawqtblprereq"></a>Prerequisites
Ensure that you have a running HAWQ installation and have downloaded the tutorial work files, including the Retail example data set and HAWQ scripts.
## <a id="tut_excreatehawqtblsteps"></a>Steps
Perform the following steps to create a HAWQ database and associated tables to represent the Retail data set.
1. In a terminal window, navigate to the tutorial work files base directory.
``` shell
$ cd $HAWQTUTBASE
```
2. Run the `createdb` command at the shell prompt to create a database named `hawqintro_tbl1`:
``` shell
$ createdb hawqintro_tbl1
```
3. Run the PostgreSQL interpreter `psql` and connect to database `hawqintro_tbl1`:
``` shell
$ psql -d hawqintro_tbl1
```
You will enter the `psql` interpreter:
``` sql
hawqintro_tbl1=#
```
Notice that the `psql` prompt is the database name followed by `=#`.
4. Create the table `id_tbl1` with a single column named `id` of type `integer`:
``` sql
hawqintro_tbl1=# create table id_tbl1 (id int);
```
5. Add some data to `id_tbl1`:
``` sql
hawqintro_tbl1=# insert into id_tbl1 select generate_series(1,100);
```
This `insert` command adds 100 rows to `id_tbl1`, incrementing and writing the `id` for each row.
6. Query the table:
``` sql
hawqintro_tbl1=# select * from id_tbl1 where id > 97;
```
``` shell
** id
-----
98
99
100
(3 rows)**
```
The query returns all rows in the table where the `id` is greater than 97. The query also returns the number of rows returned.
7. Quit and restart `psql`.
``` sql
hawqintro_tbl1=#\q
```
``` shell
$ cd $HAWQTUTBASE/sample_data
$ psql
```
8. Create a schema named `retail_demo` to represent the Retail data set:
``` sql
gpadmin=# CREATE SCHEMA retail_demo;
```
A schema is a namespace for the database. It contains named objects like tables, data types, functions, and operators. These named objects are accessed by qualifying their name with `schemaname.` (or `retail_demo.` in this exercise) as a prefix. You may also set a search path that includes schema name(s).
7. Use the provided scripts to create tables for, load, and verify the Retail demo data set:
``` sql
gpadmin=#\i ../hawq/hawq_tables/create_hawq_tables.sql
gpadmin=#\q
```
The `create_hawq_tables.sql` script deletes each table before creating. If this is your first time performing this exercise, ignore the `psql` "table does not exist, skipping" messages.
View the `create_hawq_tables.sql` script. Notice the use of the `retail_demo.` prefix to the table name:
``` sql
CREATE TABLE retail_demo.categories_dim_hawq
(
category_id integer NOT NULL,
category_name character varying(400) NOT NULL
)
WITH (appendonly=true, compresstype=zlib) DISTRIBUTED RANDOMLY;
```
Load data into the tables and verify the load succeeded. You will run a script that displays the row count of tables in the `retail_demo` schema:
``` shell
$ ../hawq/hawq_tables/load_hawq_tables.sh
$ cd ../hawq/hawq_tables
$ ./verify_load_hawq_tables.sh
```
The output of the script should match the following:
``` shell
Table Name | Count
-----------------------------+------------------------
customers_dim_hawq | 401430
categories_dim_hawq | 56
customer_addresses_dim_hawq | 1130639
email_addresses_dim_hawq | 401430
order_lineitems_hawq | 1024158
orders_hawq | 512071
payment_methods_hawq | 5
products_dim_hawq | 698911
-----------------------------+------------------------
```
8. Determine the top ten postal codes by order revenue by running the following query on the `retail_demo.orders_hawq` table:
``` shell
$ psql
```
``` sql
gpadmin=# select billing_address_postal_code,
sum(total_paid_amount::float8) as total,
sum(total_tax_amount::float8) as tax
from retail_demo.orders_hawq
group by billing_address_postal_code
order by total desc limit 10;
```
Compare your output to the following:
``` shell
billing_address_postal_code | total | tax
----------------------------+-----------+-----------
48001 | 111868.32 | 6712.0992
15329 | 107958.24 | 6477.4944
42714 | 103244.58 | 6194.6748
41030 | 101365.5 | 6081.93
50223 | 100511.64 | 6030.6984
03106 | 83566.41 | 0
57104 | 77383.63 | 3095.3452
23002 | 73673.66 | 3683.683
25703 | 68282.12 | 4096.9272
26178 | 66836.4 | 4010.184
(10 rows)
Time: 2120.214 ms
```