docs/topics/impala_struct.xml - impala - Git at Google

 <?xml version="1.0" encoding="UTF-8"?>
 <!--
 Licensed to the Apache Software Foundation (ASF) under one
 or more contributor license agreements.  See the NOTICE file
 distributed with this work for additional information
 regarding copyright ownership.  The ASF licenses this file
 to you under the Apache License, Version 2.0 (the
 "License"); you may not use this file except in compliance
 with the License.  You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

 Unless required by applicable law or agreed to in writing,
 software distributed under the License is distributed on an
 "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 KIND, either express or implied.  See the License for the
 specific language governing permissions and limitations
 under the License.
 -->
 <!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
 <concept id="struct">

   <title>STRUCT Complex Type (<keyword keyref="impala23"/> or higher only)</title>

   <prolog>
     <metadata>
       <data name="Category" value="Impala"/>
       <data name="Category" value="Impala Data Types"/>
       <data name="Category" value="SQL"/>
       <data name="Category" value="Developers"/>
       <data name="Category" value="Data Analysts"/>
     </metadata>
   </prolog>

   <conbody>

     <p>
       A complex data type, representing multiple fields of a single item. Frequently used as the element type of an <codeph>ARRAY</codeph>
       or the <codeph>VALUE</codeph> part of a <codeph>MAP</codeph>.
     </p>

     <p conref="../shared/impala_common.xml#common/syntax_blurb"/>

 <codeblock><varname>column_name</varname> STRUCT &lt; <varname>name</varname> : <varname>type</varname> [COMMENT '<varname>comment_string</varname>'], ... &gt;

 type ::= <varname>primitive_type</varname> | <varname>complex_type</varname>
 </codeblock>

     <p>
       The names and number of fields within the <codeph>STRUCT</codeph> are fixed. Each field can be a different type. A field within a
       <codeph>STRUCT</codeph> can also be another <codeph>STRUCT</codeph>, or an <codeph>ARRAY</codeph> or a <codeph>MAP</codeph>, allowing
       you to create nested data structures with a maximum nesting depth of 100.
     </p>

     <p>
       A <codeph>STRUCT</codeph> can be the top-level type for a column, or can itself be an item within an <codeph>ARRAY</codeph> or the
       value part of the key-value pair in a <codeph>MAP</codeph>.
     </p>

     <p>
       When a <codeph>STRUCT</codeph> is used as an <codeph>ARRAY</codeph> element or a <codeph>MAP</codeph> value, you use a join clause to
       bring the <codeph>ARRAY</codeph> or <codeph>MAP</codeph> elements into the result set, and then refer to
       <codeph><varname>array_name</varname>.ITEM.<varname>field</varname></codeph> or
       <codeph><varname>map_name</varname>.VALUE.<varname>field</varname></codeph>. In the case of a <codeph>STRUCT</codeph> directly inside
       an <codeph>ARRAY</codeph> or <codeph>MAP</codeph>, you can omit the <codeph>.ITEM</codeph> and <codeph>.VALUE</codeph> pseudocolumns
       and refer directly to <codeph><varname>array_name</varname>.<varname>field</varname></codeph> or
       <codeph><varname>map_name</varname>.<varname>field</varname></codeph>.
     </p>

     <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>

     <p conref="../shared/impala_common.xml#common/complex_types_combo"/>

     <p>
       A <codeph>STRUCT</codeph> is similar conceptually to a table row: it contains a fixed number of named fields, each with a predefined
       type. To combine two related tables, while using complex types to minimize repetition, the typical way to represent that data is as an
       <codeph>ARRAY</codeph> of <codeph>STRUCT</codeph> elements.
     </p>

     <p>
       Because a <codeph>STRUCT</codeph> has a fixed number of named fields, it typically does not make sense to have a
       <codeph>STRUCT</codeph> as the type of a table column. In such a case, you could just make each field of the <codeph>STRUCT</codeph>
       into a separate column of the table. The <codeph>STRUCT</codeph> type is most useful as an item of an <codeph>ARRAY</codeph> or the
       value part of the key-value pair in a <codeph>MAP</codeph>. A nested type column with a <codeph>STRUCT</codeph> at the lowest level
       lets you associate a variable number of row-like objects with each row of the table.
     </p>

     <p>
       The <codeph>STRUCT</codeph> type is straightforward to reference within a query. You do not need to include the
       <codeph>STRUCT</codeph> column in a join clause or give it a table alias, as is required for the <codeph>ARRAY</codeph> and
       <codeph>MAP</codeph> types. You refer to the individual fields using dot notation, such as
       <codeph><varname>struct_column_name</varname>.<varname>field_name</varname></codeph>, without any pseudocolumn such as
       <codeph>ITEM</codeph> or <codeph>VALUE</codeph>.
     </p>

     <p conref="../shared/impala_common.xml#common/complex_types_describe"/>

     <p conref="../shared/impala_common.xml#common/internals_blurb"/>

     <p>
       Within the Parquet data file, the values for each <codeph>STRUCT</codeph> field are stored adjacent to each other, so that they can be
       encoded and compressed using all the Parquet techniques for storing sets of similar or repeated values. The adjacency applies even
       when the <codeph>STRUCT</codeph> values are part of an <codeph>ARRAY</codeph> or <codeph>MAP</codeph>. During a query, Impala avoids
       unnecessary I/O by reading only the portions of the Parquet data file containing the requested <codeph>STRUCT</codeph> fields.
     </p>

     <p conref="../shared/impala_common.xml#common/added_in_230"/>

     <p conref="../shared/impala_common.xml#common/restrictions_blurb"/>

     <ul conref="../shared/impala_common.xml#common/complex_types_restrictions">
       <li/>
     </ul>

     <p conref="../shared/impala_common.xml#common/kudu_blurb"/>
     <p conref="../shared/impala_common.xml#common/kudu_unsupported_data_type"/>

     <p conref="../shared/impala_common.xml#common/example_blurb"/>

     <note conref="../shared/impala_common.xml#common/complex_type_schema_pointer"/>

     <p>
       The following example shows a table with various kinds of <codeph>STRUCT</codeph> columns, both at the top level and nested within
       other complex types. Practice the <codeph>CREATE TABLE</codeph> and query notation for complex type columns using empty tables, until
       you can visualize a complex data structure and construct corresponding SQL statements reliably.
     </p>

 <codeblock><![CDATA[CREATE TABLE struct_demo
 (
   id BIGINT,
   name STRING,

 -- A STRUCT as a top-level column. Demonstrates how the table ID column
 -- and the ID field within the STRUCT can coexist without a name conflict.
   employee_info STRUCT < employer: STRING, id: BIGINT, address: STRING >,

 -- A STRUCT as the element type of an ARRAY.
   places_lived ARRAY < STRUCT <street: STRING, city: STRING, country: STRING >>,

 -- A STRUCT as the value portion of the key-value pairs in a MAP.
   memorable_moments MAP < STRING, STRUCT < year: INT, place: STRING, details: STRING >>,

 -- A STRUCT where one of the fields is another STRUCT.
   current_address STRUCT < street_address: STRUCT <street_number: INT, street_name: STRING, street_type: STRING>, country: STRING, postal_code: STRING >
 )
 STORED AS PARQUET;
 ]]>
 </codeblock>

     <p>
       The following example shows how to examine the structure of a table containing one or more <codeph>STRUCT</codeph> columns by using
       the <codeph>DESCRIBE</codeph> statement. You can visualize each <codeph>STRUCT</codeph> as its own table, with columns named the same
       as each field of the <codeph>STRUCT</codeph>. If the <codeph>STRUCT</codeph> is nested inside another complex type, such as
       <codeph>ARRAY</codeph>, you can extend the qualified name passed to <codeph>DESCRIBE</codeph> until the output shows just the
       <codeph>STRUCT</codeph> fields.
     </p>

 <codeblock><![CDATA[DESCRIBE struct_demo;
 +-------------------+--------------------------+
 | name              | type                     |
 +-------------------+--------------------------+
 | id                | bigint                   |
 | name              | string                   |
 | employee_info     | struct<                  |
 |                   |   employer:string,       |
 |                   |   id:bigint,             |
 |                   |   address:string         |
 |                   | >                        |
 | places_lived      | array<struct<            |
 |                   |   street:string,         |
 |                   |   city:string,           |
 |                   |   country:string         |
 |                   | >>                       |
 | memorable_moments | map<string,struct<       |
 |                   |   year:int,              |
 |                   |   place:string,          |
 |                   |   details:string         |
 |                   | >>                       |
 | current_address   | struct<                  |
 |                   |   street_address:struct< |
 |                   |     street_number:int,   |
 |                   |     street_name:string,  |
 |                   |     street_type:string   |
 |                   |   >,                     |
 |                   |   country:string,        |
 |                   |   postal_code:string     |
 |                   | >                        |
 +-------------------+--------------------------+
 ]]>
 </codeblock>

     <p>
       The top-level column <codeph>EMPLOYEE_INFO</codeph> is a <codeph>STRUCT</codeph>. Describing
       <codeph><varname>table_name</varname>.<varname>struct_name</varname></codeph> displays the fields of the <codeph>STRUCT</codeph> as if
       they were columns of a table:
     </p>

 <codeblock><![CDATA[DESCRIBE struct_demo.employee_info;
 +----------+--------+
 | name     | type   |
 +----------+--------+
 | employer | string |
 | id       | bigint |
 | address  | string |
 +----------+--------+
 ]]>
 </codeblock>

     <p>
       Because <codeph>PLACES_LIVED</codeph> is a <codeph>STRUCT</codeph> inside an <codeph>ARRAY</codeph>, the initial
       <codeph>DESCRIBE</codeph> shows the structure of the <codeph>ARRAY</codeph>:
     </p>

 <codeblock><![CDATA[DESCRIBE struct_demo.places_lived;
 +------+------------------+
 | name | type             |
 +------+------------------+
 | item | struct<          |
 |      |   street:string, |
 |      |   city:string,   |
 |      |   country:string |
 |      | >                |
 | pos  | bigint           |
 +------+------------------+
 ]]>
 </codeblock>

     <p>
       Ask for the details of the <codeph>ITEM</codeph> field of the <codeph>ARRAY</codeph> to see just the layout of the
       <codeph>STRUCT</codeph>:
     </p>

 <codeblock><![CDATA[DESCRIBE struct_demo.places_lived.item;
 +---------+--------+
 | name    | type   |
 +---------+--------+
 | street  | string |
 | city    | string |
 | country | string |
 +---------+--------+
 ]]>
 </codeblock>

     <p>
       Likewise, <codeph>MEMORABLE_MOMENTS</codeph> has a <codeph>STRUCT</codeph> inside a <codeph>MAP</codeph>, which requires an extra
       level of qualified name to see just the <codeph>STRUCT</codeph> part:
     </p>

 <codeblock><![CDATA[DESCRIBE struct_demo.memorable_moments;
 +-------+------------------+
 | name  | type             |
 +-------+------------------+
 | key   | string           |
 | value | struct<          |
 |       |   year:int,      |
 |       |   place:string,  |
 |       |   details:string |
 |       | >                |
 +-------+------------------+
 ]]>
 </codeblock>

     <p>
       For a <codeph>MAP</codeph>, ask to see the <codeph>VALUE</codeph> field to see the corresponding <codeph>STRUCT</codeph> fields in a
       table-like structure:
     </p>

 <codeblock><![CDATA[DESCRIBE struct_demo.memorable_moments.value;
 +---------+--------+
 | name    | type   |
 +---------+--------+
 | year    | int    |
 | place   | string |
 | details | string |
 +---------+--------+
 ]]>
 </codeblock>

     <p>
       For a <codeph>STRUCT</codeph> inside a <codeph>STRUCT</codeph>, we can see the fields of the outer <codeph>STRUCT</codeph>:
     </p>

 <codeblock><![CDATA[DESCRIBE struct_demo.current_address;
 +----------------+-----------------------+
 | name           | type                  |
 +----------------+-----------------------+
 | street_address | struct<               |
 |                |   street_number:int,  |
 |                |   street_name:string, |
 |                |   street_type:string  |
 |                | >                     |
 | country        | string                |
 | postal_code    | string                |
 +----------------+-----------------------+
 ]]>
 </codeblock>

     <p>
       Then we can use a further qualified name to see just the fields of the inner <codeph>STRUCT</codeph>:
     </p>

 <codeblock><![CDATA[DESCRIBE struct_demo.current_address.street_address;
 +---------------+--------+
 | name          | type   |
 +---------------+--------+
 | street_number | int    |
 | street_name   | string |
 | street_type   | string |
 +---------------+--------+
 ]]>
 </codeblock>

     <p>
       The following example shows how to examine the structure of a table containing one or more <codeph>STRUCT</codeph> columns by using
       the <codeph>DESCRIBE</codeph> statement. You can visualize each <codeph>STRUCT</codeph> as its own table, with columns named the same
       as each field of the <codeph>STRUCT</codeph>. If the <codeph>STRUCT</codeph> is nested inside another complex type, such as
       <codeph>ARRAY</codeph>, you can extend the qualified name passed to <codeph>DESCRIBE</codeph> until the output shows just the
       <codeph>STRUCT</codeph> fields.
     </p>

 <!-- To do: See why the most verbose query form gives an error. -->

 <codeblock><![CDATA[DESCRIBE struct_demo;
 +-------------------+--------------------------+---------+
 | name              | type                     | comment |
 +-------------------+--------------------------+---------+
 | id                | bigint                   |         |
 | name              | string                   |         |
 | employee_info     | struct<                  |         |
 |                   |   employer:string,       |         |
 |                   |   id:bigint,             |         |
 |                   |   address:string         |         |
 |                   | >                        |         |
 | places_lived      | array<struct<            |         |
 |                   |   street:string,         |         |
 |                   |   city:string,           |         |
 |                   |   country:string         |         |
 |                   | >>                       |         |
 | memorable_moments | map<string,struct<       |         |
 |                   |   year:int,              |         |
 |                   |   place:string,          |         |
 |                   |   details:string         |         |
 |                   | >>                       |         |
 | current_address   | struct<                  |         |
 |                   |   street_address:struct< |         |
 |                   |     street_number:int,   |         |
 |                   |     street_name:string,  |         |
 |                   |     street_type:string   |         |
 |                   |   >,                     |         |
 |                   |   country:string,        |         |
 |                   |   postal_code:string     |         |
 |                   | >                        |         |
 +-------------------+--------------------------+---------+

 SELECT id, employee_info.id FROM struct_demo;

 SELECT id, employee_info.id AS employee_id FROM struct_demo;

 SELECT id, employee_info.id AS employee_id, employee_info.employer
   FROM struct_demo;

 SELECT id, name, street, city, country
   FROM struct_demo, struct_demo.places_lived;

 SELECT id, name, places_lived.pos, places_lived.street, places_lived.city, places_lived.country
   FROM struct_demo, struct_demo.places_lived;

 SELECT id, name, pl.pos, pl.street, pl.city, pl.country
   FROM struct_demo, struct_demo.places_lived AS pl;

 SELECT id, name, places_lived.pos, places_lived.street, places_lived.city, places_lived.country
   FROM struct_demo, struct_demo.places_lived;

 SELECT id, name, pos, street, city, country
   FROM struct_demo, struct_demo.places_lived;

 SELECT id, name, memorable_moments.key,
   memorable_moments.value.year,
   memorable_moments.value.place,
   memorable_moments.value.details
 FROM struct_demo, struct_demo.memorable_moments
 WHERE memorable_moments.key IN ('Birthday','Anniversary','Graduation');

 SELECT id, name, mm.key, mm.value.year, mm.value.place, mm.value.details
   FROM struct_demo, struct_demo.memorable_moments AS mm
 WHERE mm.key IN ('Birthday','Anniversary','Graduation');

 SELECT id, name, memorable_moments.key, memorable_moments.value.year,
   memorable_moments.value.place, memorable_moments.value.details
 FROM struct_demo, struct_demo.memorable_moments
 WHERE key IN ('Birthday','Anniversary','Graduation');

 SELECT id, name, key, value.year, value.place, value.details
   FROM struct_demo, struct_demo.memorable_moments
 WHERE key IN ('Birthday','Anniversary','Graduation');

 SELECT id, name, key, year, place, details
   FROM struct_demo, struct_demo.memorable_moments
 WHERE key IN ('Birthday','Anniversary','Graduation');

 SELECT id, name,
   current_address.street_address.street_number,
   current_address.street_address.street_name,
   current_address.street_address.street_type,
   current_address.country,
   current_address.postal_code
 FROM struct_demo;
 ]]>
 </codeblock>

     <p>
       For example, this table uses a struct that encodes several data values for each phone number associated with a person. Each person can
       have a variable-length array of associated phone numbers, and queries can refer to the category field to locate specific home, work,
       mobile, and so on kinds of phone numbers.
     </p>

 <codeblock><![CDATA[CREATE TABLE contact_info_many_structs
 (
   id BIGINT, name STRING,
   phone_numbers ARRAY < STRUCT <category:STRING, country_code:STRING, area_code:SMALLINT, full_number:STRING, mobile:BOOLEAN, carrier:STRING > >
 ) STORED AS PARQUET;
 ]]>
 </codeblock>

     <p>
       Because structs are naturally suited to composite values where the fields have different data types, you might use them to decompose
       things such as addresses:
     </p>

 <codeblock><![CDATA[CREATE TABLE contact_info_detailed_address
 (
   id BIGINT, name STRING,
   address STRUCT < house_number:INT, street:STRING, street_type:STRING, apartment:STRING, city:STRING, region:STRING, country:STRING >
 );
 ]]>
 </codeblock>

     <p>
       In a big data context, splitting out data fields such as the number part of the address and the street name could let you do analysis
       on each field independently. For example, which streets have the largest number range of addresses, what are the statistical
       properties of the street names, which areas have a higher proportion of <q>Roads</q>, <q>Courts</q> or <q>Boulevards</q>, and so on.
     </p>

     <p conref="../shared/impala_common.xml#common/related_info"/>

     <p>
       <xref href="impala_complex_types.xml#complex_types"/>, <xref href="impala_array.xml#array"/>,
 <!-- <xref href="impala_struct.xml#struct"/>, -->
       <xref href="impala_map.xml#map"/>
     </p>

   </conbody>

 </concept>
	<?xml version="1.0" encoding="UTF-8"?>
	<!--
	Licensed to the Apache Software Foundation (ASF) under one
	or more contributor license agreements. See the NOTICE file
	distributed with this work for additional information
	regarding copyright ownership. The ASF licenses this file
	to you under the Apache License, Version 2.0 (the
	"License"); you may not use this file except in compliance
	with the License. You may obtain a copy of the License at

	http://www.apache.org/licenses/LICENSE-2.0

	Unless required by applicable law or agreed to in writing,
	software distributed under the License is distributed on an
	"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
	KIND, either express or implied. See the License for the
	specific language governing permissions and limitations
	under the License.
	-->
	<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
	<concept id="struct">

	<title>STRUCT Complex Type (<keyword keyref="impala23"/> or higher only)</title>

	<prolog>
	<metadata>
	<data name="Category" value="Impala"/>
	<data name="Category" value="Impala Data Types"/>
	<data name="Category" value="SQL"/>
	<data name="Category" value="Developers"/>
	<data name="Category" value="Data Analysts"/>
	</metadata>
	</prolog>

	<conbody>

	<p>
	A complex data type, representing multiple fields of a single item. Frequently used as the element type of an <codeph>ARRAY</codeph>
	or the <codeph>VALUE</codeph> part of a <codeph>MAP</codeph>.
	</p>

	<p conref="../shared/impala_common.xml#common/syntax_blurb"/>

	<codeblock><varname>column_name</varname> STRUCT < <varname>name</varname> : <varname>type</varname> [COMMENT '<varname>comment_string</varname>'], ... >

	type ::= <varname>primitive_type</varname> \| <varname>complex_type</varname>
	</codeblock>

	<p>
	The names and number of fields within the <codeph>STRUCT</codeph> are fixed. Each field can be a different type. A field within a
	<codeph>STRUCT</codeph> can also be another <codeph>STRUCT</codeph>, or an <codeph>ARRAY</codeph> or a <codeph>MAP</codeph>, allowing
	you to create nested data structures with a maximum nesting depth of 100.
	</p>

	<p>
	A <codeph>STRUCT</codeph> can be the top-level type for a column, or can itself be an item within an <codeph>ARRAY</codeph> or the
	value part of the key-value pair in a <codeph>MAP</codeph>.
	</p>

	<p>
	When a <codeph>STRUCT</codeph> is used as an <codeph>ARRAY</codeph> element or a <codeph>MAP</codeph> value, you use a join clause to
	bring the <codeph>ARRAY</codeph> or <codeph>MAP</codeph> elements into the result set, and then refer to
	<codeph><varname>array_name</varname>.ITEM.<varname>field</varname></codeph> or
	<codeph><varname>map_name</varname>.VALUE.<varname>field</varname></codeph>. In the case of a <codeph>STRUCT</codeph> directly inside
	an <codeph>ARRAY</codeph> or <codeph>MAP</codeph>, you can omit the <codeph>.ITEM</codeph> and <codeph>.VALUE</codeph> pseudocolumns
	and refer directly to <codeph><varname>array_name</varname>.<varname>field</varname></codeph> or
	<codeph><varname>map_name</varname>.<varname>field</varname></codeph>.
	</p>

	<p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>

	<p conref="../shared/impala_common.xml#common/complex_types_combo"/>

	<p>
	A <codeph>STRUCT</codeph> is similar conceptually to a table row: it contains a fixed number of named fields, each with a predefined
	type. To combine two related tables, while using complex types to minimize repetition, the typical way to represent that data is as an
	<codeph>ARRAY</codeph> of <codeph>STRUCT</codeph> elements.
	</p>

	<p>
	Because a <codeph>STRUCT</codeph> has a fixed number of named fields, it typically does not make sense to have a
	<codeph>STRUCT</codeph> as the type of a table column. In such a case, you could just make each field of the <codeph>STRUCT</codeph>
	into a separate column of the table. The <codeph>STRUCT</codeph> type is most useful as an item of an <codeph>ARRAY</codeph> or the
	value part of the key-value pair in a <codeph>MAP</codeph>. A nested type column with a <codeph>STRUCT</codeph> at the lowest level
	lets you associate a variable number of row-like objects with each row of the table.
	</p>

	<p>
	The <codeph>STRUCT</codeph> type is straightforward to reference within a query. You do not need to include the
	<codeph>STRUCT</codeph> column in a join clause or give it a table alias, as is required for the <codeph>ARRAY</codeph> and
	<codeph>MAP</codeph> types. You refer to the individual fields using dot notation, such as
	<codeph><varname>struct_column_name</varname>.<varname>field_name</varname></codeph>, without any pseudocolumn such as
	<codeph>ITEM</codeph> or <codeph>VALUE</codeph>.
	</p>

	<p conref="../shared/impala_common.xml#common/complex_types_describe"/>

	<p conref="../shared/impala_common.xml#common/internals_blurb"/>

	<p>
	Within the Parquet data file, the values for each <codeph>STRUCT</codeph> field are stored adjacent to each other, so that they can be
	encoded and compressed using all the Parquet techniques for storing sets of similar or repeated values. The adjacency applies even
	when the <codeph>STRUCT</codeph> values are part of an <codeph>ARRAY</codeph> or <codeph>MAP</codeph>. During a query, Impala avoids
	unnecessary I/O by reading only the portions of the Parquet data file containing the requested <codeph>STRUCT</codeph> fields.
	</p>

	<p conref="../shared/impala_common.xml#common/added_in_230"/>

	<p conref="../shared/impala_common.xml#common/restrictions_blurb"/>

	<ul conref="../shared/impala_common.xml#common/complex_types_restrictions">
	<li/>
	</ul>

	<p conref="../shared/impala_common.xml#common/kudu_blurb"/>
	<p conref="../shared/impala_common.xml#common/kudu_unsupported_data_type"/>

	<p conref="../shared/impala_common.xml#common/example_blurb"/>

	<note conref="../shared/impala_common.xml#common/complex_type_schema_pointer"/>

	<p>
	The following example shows a table with various kinds of <codeph>STRUCT</codeph> columns, both at the top level and nested within
	other complex types. Practice the <codeph>CREATE TABLE</codeph> and query notation for complex type columns using empty tables, until
	you can visualize a complex data structure and construct corresponding SQL statements reliably.
	</p>

	<codeblock><![CDATA[CREATE TABLE struct_demo
	(
	id BIGINT,
	name STRING,

	-- A STRUCT as a top-level column. Demonstrates how the table ID column
	-- and the ID field within the STRUCT can coexist without a name conflict.
	employee_info STRUCT < employer: STRING, id: BIGINT, address: STRING >,

	-- A STRUCT as the element type of an ARRAY.
	places_lived ARRAY < STRUCT <street: STRING, city: STRING, country: STRING >>,

	-- A STRUCT as the value portion of the key-value pairs in a MAP.
	memorable_moments MAP < STRING, STRUCT < year: INT, place: STRING, details: STRING >>,

	-- A STRUCT where one of the fields is another STRUCT.
	current_address STRUCT < street_address: STRUCT <street_number: INT, street_name: STRING, street_type: STRING>, country: STRING, postal_code: STRING >
	)
	STORED AS PARQUET;
	]]>
	</codeblock>

	<p>
	The following example shows how to examine the structure of a table containing one or more <codeph>STRUCT</codeph> columns by using
	the <codeph>DESCRIBE</codeph> statement. You can visualize each <codeph>STRUCT</codeph> as its own table, with columns named the same
	as each field of the <codeph>STRUCT</codeph>. If the <codeph>STRUCT</codeph> is nested inside another complex type, such as
	<codeph>ARRAY</codeph>, you can extend the qualified name passed to <codeph>DESCRIBE</codeph> until the output shows just the
	<codeph>STRUCT</codeph> fields.
	</p>

	<codeblock><![CDATA[DESCRIBE struct_demo;
	+-------------------+--------------------------+
	\| name \| type \|
	+-------------------+--------------------------+
	\| id \| bigint \|
	\| name \| string \|
	\| employee_info \| struct< \|
	\| \| employer:string, \|
	\| \| id:bigint, \|
	\| \| address:string \|
	\| \| > \|
	\| places_lived \| array<struct< \|
	\| \| street:string, \|
	\| \| city:string, \|
	\| \| country:string \|
	\| \| >> \|
	\| memorable_moments \| map<string,struct< \|
	\| \| year:int, \|
	\| \| place:string, \|
	\| \| details:string \|
	\| \| >> \|
	\| current_address \| struct< \|
	\| \| street_address:struct< \|
	\| \| street_number:int, \|
	\| \| street_name:string, \|
	\| \| street_type:string \|
	\| \| >, \|
	\| \| country:string, \|
	\| \| postal_code:string \|
	\| \| > \|
	+-------------------+--------------------------+
	]]>
	</codeblock>

	<p>
	The top-level column <codeph>EMPLOYEE_INFO</codeph> is a <codeph>STRUCT</codeph>. Describing
	<codeph><varname>table_name</varname>.<varname>struct_name</varname></codeph> displays the fields of the <codeph>STRUCT</codeph> as if
	they were columns of a table:
	</p>

	<codeblock><![CDATA[DESCRIBE struct_demo.employee_info;
	+----------+--------+
	\| name \| type \|
	+----------+--------+
	\| employer \| string \|
	\| id \| bigint \|
	\| address \| string \|
	+----------+--------+
	]]>
	</codeblock>

	<p>
	Because <codeph>PLACES_LIVED</codeph> is a <codeph>STRUCT</codeph> inside an <codeph>ARRAY</codeph>, the initial
	<codeph>DESCRIBE</codeph> shows the structure of the <codeph>ARRAY</codeph>:
	</p>

	<codeblock><![CDATA[DESCRIBE struct_demo.places_lived;
	+------+------------------+
	\| name \| type \|
	+------+------------------+
	\| item \| struct< \|
	\| \| street:string, \|
	\| \| city:string, \|
	\| \| country:string \|
	\| \| > \|
	\| pos \| bigint \|
	+------+------------------+
	]]>
	</codeblock>

	<p>
	Ask for the details of the <codeph>ITEM</codeph> field of the <codeph>ARRAY</codeph> to see just the layout of the
	<codeph>STRUCT</codeph>:
	</p>

	<codeblock><![CDATA[DESCRIBE struct_demo.places_lived.item;
	+---------+--------+
	\| name \| type \|
	+---------+--------+
	\| street \| string \|
	\| city \| string \|
	\| country \| string \|
	+---------+--------+
	]]>
	</codeblock>

	<p>
	Likewise, <codeph>MEMORABLE_MOMENTS</codeph> has a <codeph>STRUCT</codeph> inside a <codeph>MAP</codeph>, which requires an extra
	level of qualified name to see just the <codeph>STRUCT</codeph> part:
	</p>

	<codeblock><![CDATA[DESCRIBE struct_demo.memorable_moments;
	+-------+------------------+
	\| name \| type \|
	+-------+------------------+
	\| key \| string \|
	\| value \| struct< \|
	\| \| year:int, \|
	\| \| place:string, \|
	\| \| details:string \|
	\| \| > \|
	+-------+------------------+
	]]>
	</codeblock>

	<p>
	For a <codeph>MAP</codeph>, ask to see the <codeph>VALUE</codeph> field to see the corresponding <codeph>STRUCT</codeph> fields in a
	table-like structure:
	</p>

	<codeblock><![CDATA[DESCRIBE struct_demo.memorable_moments.value;
	+---------+--------+
	\| name \| type \|
	+---------+--------+
	\| year \| int \|
	\| place \| string \|
	\| details \| string \|
	+---------+--------+
	]]>
	</codeblock>

	<p>
	For a <codeph>STRUCT</codeph> inside a <codeph>STRUCT</codeph>, we can see the fields of the outer <codeph>STRUCT</codeph>:
	</p>

	<codeblock><![CDATA[DESCRIBE struct_demo.current_address;
	+----------------+-----------------------+
	\| name \| type \|
	+----------------+-----------------------+
	\| street_address \| struct< \|
	\| \| street_number:int, \|
	\| \| street_name:string, \|
	\| \| street_type:string \|
	\| \| > \|
	\| country \| string \|
	\| postal_code \| string \|
	+----------------+-----------------------+
	]]>
	</codeblock>

	<p>
	Then we can use a further qualified name to see just the fields of the inner <codeph>STRUCT</codeph>:
	</p>

	<codeblock><![CDATA[DESCRIBE struct_demo.current_address.street_address;
	+---------------+--------+
	\| name \| type \|
	+---------------+--------+
	\| street_number \| int \|
	\| street_name \| string \|
	\| street_type \| string \|
	+---------------+--------+
	]]>
	</codeblock>

	<p>
	The following example shows how to examine the structure of a table containing one or more <codeph>STRUCT</codeph> columns by using
	the <codeph>DESCRIBE</codeph> statement. You can visualize each <codeph>STRUCT</codeph> as its own table, with columns named the same
	as each field of the <codeph>STRUCT</codeph>. If the <codeph>STRUCT</codeph> is nested inside another complex type, such as
	<codeph>ARRAY</codeph>, you can extend the qualified name passed to <codeph>DESCRIBE</codeph> until the output shows just the
	<codeph>STRUCT</codeph> fields.
	</p>

	<!-- To do: See why the most verbose query form gives an error. -->

	<codeblock><![CDATA[DESCRIBE struct_demo;
	+-------------------+--------------------------+---------+
	\| name \| type \| comment \|
	+-------------------+--------------------------+---------+
	\| id \| bigint \| \|
	\| name \| string \| \|
	\| employee_info \| struct< \| \|
	\| \| employer:string, \| \|
	\| \| id:bigint, \| \|
	\| \| address:string \| \|
	\| \| > \| \|
	\| places_lived \| array<struct< \| \|
	\| \| street:string, \| \|
	\| \| city:string, \| \|
	\| \| country:string \| \|
	\| \| >> \| \|
	\| memorable_moments \| map<string,struct< \| \|
	\| \| year:int, \| \|
	\| \| place:string, \| \|
	\| \| details:string \| \|
	\| \| >> \| \|
	\| current_address \| struct< \| \|
	\| \| street_address:struct< \| \|
	\| \| street_number:int, \| \|
	\| \| street_name:string, \| \|
	\| \| street_type:string \| \|
	\| \| >, \| \|
	\| \| country:string, \| \|
	\| \| postal_code:string \| \|
	\| \| > \| \|
	+-------------------+--------------------------+---------+

	SELECT id, employee_info.id FROM struct_demo;

	SELECT id, employee_info.id AS employee_id FROM struct_demo;

	SELECT id, employee_info.id AS employee_id, employee_info.employer
	FROM struct_demo;

	SELECT id, name, street, city, country
	FROM struct_demo, struct_demo.places_lived;

	SELECT id, name, places_lived.pos, places_lived.street, places_lived.city, places_lived.country
	FROM struct_demo, struct_demo.places_lived;

	SELECT id, name, pl.pos, pl.street, pl.city, pl.country
	FROM struct_demo, struct_demo.places_lived AS pl;

	SELECT id, name, places_lived.pos, places_lived.street, places_lived.city, places_lived.country
	FROM struct_demo, struct_demo.places_lived;

	SELECT id, name, pos, street, city, country
	FROM struct_demo, struct_demo.places_lived;

	SELECT id, name, memorable_moments.key,
	memorable_moments.value.year,
	memorable_moments.value.place,
	memorable_moments.value.details
	FROM struct_demo, struct_demo.memorable_moments
	WHERE memorable_moments.key IN ('Birthday','Anniversary','Graduation');

	SELECT id, name, mm.key, mm.value.year, mm.value.place, mm.value.details
	FROM struct_demo, struct_demo.memorable_moments AS mm
	WHERE mm.key IN ('Birthday','Anniversary','Graduation');

	SELECT id, name, memorable_moments.key, memorable_moments.value.year,
	memorable_moments.value.place, memorable_moments.value.details
	FROM struct_demo, struct_demo.memorable_moments
	WHERE key IN ('Birthday','Anniversary','Graduation');

	SELECT id, name, key, value.year, value.place, value.details
	FROM struct_demo, struct_demo.memorable_moments
	WHERE key IN ('Birthday','Anniversary','Graduation');

	SELECT id, name, key, year, place, details
	FROM struct_demo, struct_demo.memorable_moments
	WHERE key IN ('Birthday','Anniversary','Graduation');

	SELECT id, name,
	current_address.street_address.street_number,
	current_address.street_address.street_name,
	current_address.street_address.street_type,
	current_address.country,
	current_address.postal_code
	FROM struct_demo;
	]]>
	</codeblock>

	<p>
	For example, this table uses a struct that encodes several data values for each phone number associated with a person. Each person can
	have a variable-length array of associated phone numbers, and queries can refer to the category field to locate specific home, work,
	mobile, and so on kinds of phone numbers.
	</p>

	<codeblock><![CDATA[CREATE TABLE contact_info_many_structs
	(
	id BIGINT, name STRING,
	phone_numbers ARRAY < STRUCT <category:STRING, country_code:STRING, area_code:SMALLINT, full_number:STRING, mobile:BOOLEAN, carrier:STRING > >
	) STORED AS PARQUET;
	]]>
	</codeblock>

	<p>
	Because structs are naturally suited to composite values where the fields have different data types, you might use them to decompose
	things such as addresses:
	</p>

	<codeblock><![CDATA[CREATE TABLE contact_info_detailed_address
	(
	id BIGINT, name STRING,
	address STRUCT < house_number:INT, street:STRING, street_type:STRING, apartment:STRING, city:STRING, region:STRING, country:STRING >
	);
	]]>
	</codeblock>

	<p>
	In a big data context, splitting out data fields such as the number part of the address and the street name could let you do analysis
	on each field independently. For example, which streets have the largest number range of addresses, what are the statistical
	properties of the street names, which areas have a higher proportion of <q>Roads</q>, <q>Courts</q> or <q>Boulevards</q>, and so on.
	</p>

	<p conref="../shared/impala_common.xml#common/related_info"/>

	<p>
	<xref href="impala_complex_types.xml#complex_types"/>, <xref href="impala_array.xml#array"/>,
	<!-- <xref href="impala_struct.xml#struct"/>, -->
	<xref href="impala_map.xml#map"/>
	</p>

	</conbody>

	</concept>