blob: 832672e4c8a6454b23c63e7c01ebe4eee8b2fdab [file] [log] [blame]
<!-- HTML header for doxygen 1.8.4-->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/>
<meta http-equiv="X-UA-Compatible" content="IE=9"/>
<meta name="generator" content="Doxygen 1.8.10"/>
<meta name="keywords" content="madlib,postgres,greenplum,machine learning,data mining,deep learning,ensemble methods,data science,market basket analysis,affinity analysis,pca,lda,regression,elastic net,huber white,proportional hazards,k-means,latent dirichlet allocation,bayes,support vector machines,svm"/>
<title>MADlib: Pivot</title>
<link href="tabs.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="jquery.js"></script>
<script type="text/javascript" src="dynsections.js"></script>
<link href="navtree.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="resize.js"></script>
<script type="text/javascript" src="navtreedata.js"></script>
<script type="text/javascript" src="navtree.js"></script>
<script type="text/javascript">
$(document).ready(initResizable);
$(window).load(resizeHeight);
</script>
<link href="search/search.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="search/searchdata.js"></script>
<script type="text/javascript" src="search/search.js"></script>
<script type="text/javascript">
$(document).ready(function() { init_search(); });
</script>
<!-- hack in the navigation tree -->
<script type="text/javascript" src="eigen_navtree_hacks.js"></script>
<link href="doxygen.css" rel="stylesheet" type="text/css" />
<link href="madlib_extra.css" rel="stylesheet" type="text/css"/>
<!-- google analytics -->
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','//www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-45382226-1', 'madlib.net');
ga('send', 'pageview');
</script>
</head>
<body>
<div id="top"><!-- do not remove this div, it is closed by doxygen! -->
<div id="titlearea">
<table cellspacing="0" cellpadding="0">
<tbody>
<tr style="height: 56px;">
<td id="projectlogo"><a href="http://madlib.net"><img alt="Logo" src="madlib.png" height="50" style="padding-left:0.5em;" border="0"/ ></a></td>
<td style="padding-left: 0.5em;">
<div id="projectname">
<span id="projectnumber">1.9.1</span>
</div>
<div id="projectbrief">User Documentation for MADlib</div>
</td>
<td> <div id="MSearchBox" class="MSearchBoxInactive">
<span class="left">
<img id="MSearchSelect" src="search/mag_sel.png"
onmouseover="return searchBox.OnSearchSelectShow()"
onmouseout="return searchBox.OnSearchSelectHide()"
alt=""/>
<input type="text" id="MSearchField" value="Search" accesskey="S"
onfocus="searchBox.OnSearchFieldFocus(true)"
onblur="searchBox.OnSearchFieldFocus(false)"
onkeyup="searchBox.OnSearchFieldChange(event)"/>
</span><span class="right">
<a id="MSearchClose" href="javascript:searchBox.CloseResultsWindow()"><img id="MSearchCloseImg" border="0" src="search/close.png" alt=""/></a>
</span>
</div>
</td>
</tr>
</tbody>
</table>
</div>
<!-- end header part -->
<!-- Generated by Doxygen 1.8.10 -->
<script type="text/javascript">
var searchBox = new SearchBox("searchBox", "search",false,'Search');
</script>
</div><!-- top -->
<div id="side-nav" class="ui-resizable side-nav-resizable">
<div id="nav-tree">
<div id="nav-tree-contents">
<div id="nav-sync" class="sync"></div>
</div>
</div>
<div id="splitbar" style="-moz-user-select:none;"
class="ui-resizable-handle">
</div>
</div>
<script type="text/javascript">
$(document).ready(function(){initNavTree('group__grp__pivot.html','');});
</script>
<div id="doc-content">
<!-- window showing the filter options -->
<div id="MSearchSelectWindow"
onmouseover="return searchBox.OnSearchSelectShow()"
onmouseout="return searchBox.OnSearchSelectHide()"
onkeydown="return searchBox.OnSearchSelectKey(event)">
</div>
<!-- iframe showing the search results (closed by default) -->
<div id="MSearchResultsWindow">
<iframe src="javascript:void(0)" frameborder="0"
name="MSearchResults" id="MSearchResults">
</iframe>
</div>
<div class="header">
<div class="headertitle">
<div class="title">Pivot<div class="ingroups"><a class="el" href="group__grp__datatrans.html">Data Types and Transformations</a></div></div> </div>
</div><!--header-->
<div class="contents">
<div class="toc"><b>Contents</b> </p><ul>
<li>
<a href="#categorical">Pivoting</a> </li>
<li>
<a href="#notes">Notes</a> </li>
<li>
<a href="#examples">Examples</a> </li>
<li>
<a href="#literature">Literature</a> </li>
</ul>
</div><p><a class="anchor" id="categorical"></a>The goal of the MADlib pivot function is to provide a data summarization tool that can do basic OLAP type operations on data stored in one table and output the summarized data to a second table.</p>
<pre class="syntax">
pivot(
source_table,
output_table,
index,
pivot_cols,
pivot_values,
aggregate_func,
fill_value,
keep_null,
output_col_dictionary
)
</pre><p> <b>Arguments</b> </p><dl class="arglist">
<dt>source_table </dt>
<dd>VARCHAR. Name of the source table (or view) containing data to pivot. </dd>
<dt>output_table </dt>
<dd><p class="startdd">VARCHAR. Name of output table that contains the pivoted data. The output table contains all the columns present in the <em>'index'</em> column list, plus additional columns for each distinct value in <em>'pivot_cols'</em>.</p>
<dl class="section note"><dt>Note</dt><dd>The names of the columns in the output table are auto-generated. Please see the examples section below to see how this works in practice. The convention used is to concatenate the following strings and separate each by an underscore '_' :<ul>
<li>name of the value column <em>'pivot_values'</em></li>
<li>aggregate function</li>
<li>name of the pivot column <em>'pivot_cols'</em></li>
<li>values in the pivot column</li>
</ul>
</dd></dl>
</dd>
<dt>index </dt>
<dd>VARCHAR. Comma-separated columns that will form the index of the output pivot table. By index we mean the values to group by; these are the rows in the output pivot table. </dd>
<dt>pivot_cols </dt>
<dd>VARCHAR. Comma-separated columns that will form the columns of the output pivot table. </dd>
<dt>pivot_values </dt>
<dd>VARCHAR. Comma-separated columns that contain the values to be summarized in the output pivot table. </dd>
<dt>aggregate_func (optional) </dt>
<dd><p class="startdd">VARCHAR. default: 'AVG'. A comma-separated list of aggregates to be applied to values. These can be PostgreSQL built-in aggregates [1] or UDAs. It is possible to assign a set of aggregates per value column. Please refer to the examples 12-14 below for syntax details.</p>
<dl class="section note"><dt>Note</dt><dd>Only aggregates with strict transition functions are permitted here. A strict transition function means rows with null values are ignored; the function is not called and the previous state value is retained. If you need some other behavior for null inputs, this should be done prior to calling the pivot function. Aggregates with strict transition functions are described in [2,3].</dd></dl>
</dd>
<dt>fill_value (optional) </dt>
<dd>VARCHAR. default: NULL. If specified, determines how to fill NULL values resulting from pivot operation. This is a global parameter (not applied per aggregate) and is applied post-aggregation to the output table. </dd>
<dt>keep_null (optional) </dt>
<dd>BOOLEAN. default: FALSE. If TRUE, then pivot columns are created corresponding to NULL categories. If FALSE, then no pivot columns will be created for NULL categories. </dd>
<dt>output_col_dictionary (optional) </dt>
<dd><p class="startdd">BOOLEAN. default: FALSE. This parameter is used to handle auto-generated column names that exceed the PostgreSQL limit of 63 bytes (could be a common occurrence). If TRUE, column names will be set as numerical IDs and will create a dictionary table called output_table appended with _dictionary. If FALSE, will auto-generate column names in the usual way unless the limit of 63 bytes will be exceeded. In this case, a dictionary output file will be created and a message given to the user. </p>
<p class="enddd"></p>
</dd>
</dl>
<p><a class="anchor" id="notes"></a></p><dl class="section note"><dt>Note</dt><dd><ul>
<li>NULLs in the index column are treated like any other value.</li>
<li>NULLs in the pivot column are ignored unless keep_null is TRUE.</li>
<li>Only strict transition functions are allowed so NULLs are ignored.</li>
<li>It is not allowed to set the fill_value parameter without setting the aggregate_func parameter due to possible ambiguity. Set aggregate_func to NULL for the default behavior and use fill_value as desired.</li>
<li>It is not allowed to set the output_col_dictionary parameter without setting the keep_null parameter due to possible ambiguity. Set keep_null to NULL for the default behavior and use output_col_dictionary as desired.</li>
<li>Expressions (instead of column names) are not supported. Create a view with the desired expressions and pass it as the input table (see example 3 below).</li>
<li>It is allowed to pass a partial mapping for the aggregate_func parameter. The missing value columns will be aggregated using the default function (average).</li>
</ul>
</dd></dl>
<p><a class="anchor" id="examples"></a></p><dl class="section user"><dt>Examples</dt><dd></dd></dl>
<ol type="1">
<li>Create a simple dataset to demonstrate a basic pivot: <pre class="example">
DROP TABLE IF EXISTS pivset CASCADE; -- View below may depend on table so use CASCADE
CREATE TABLE pivset(
id INTEGER,
piv INTEGER,
val FLOAT8
);
INSERT INTO pivset VALUES
(0, 10, 1),
(0, 10, 2),
(0, 20, 3),
(1, 20, 4),
(1, 30, 5),
(1, 30, 6),
(1, 10, 7),
(NULL, 10, 8),
(1, NULL, 9),
(1, 10, NULL);
</pre></li>
<li>Pivot the table: <pre class="example">
DROP TABLE IF EXISTS pivout;
SELECT madlib.pivot('pivset', 'pivout', 'id', 'piv', 'val');
SELECT * FROM pivout ORDER BY id;
</pre> <pre class="result">
id | val_avg_piv_10 | val_avg_piv_20 | val_avg_piv_30
----+----------------+----------------+----------------
0 | 1.5 | 3 |
1 | 7 | 4 | 5.5
| 8 | |
</pre> Here NULL is showing as an empty cell in the output.</li>
<li>Now let's add some more columns to our data set and create a view: <pre class="example">
DROP VIEW IF EXISTS pivset_ext;
CREATE VIEW pivset_ext AS
SELECT *,
COALESCE(id + (val / 3)::integer, 0) AS id2,
COALESCE(100*(val / 3)::integer, 0) AS piv2,
COALESCE(val + 10, 0) AS val2
FROM pivset;
SELECT id,id2,piv,piv2,val,val2 FROM pivset_ext
ORDER BY id,id2,piv,piv2,val,val2;
</pre> <pre class="result">
id | id2 | piv | piv2 | val | val2
----+-----+-----+------+-----+------
0 | 0 | 10 | 0 | 1 | 11
0 | 1 | 10 | 100 | 2 | 12
0 | 1 | 20 | 100 | 3 | 13
1 | 0 | 10 | 0 | | 0
1 | 2 | 20 | 100 | 4 | 14
1 | 3 | 10 | 200 | 7 | 17
1 | 3 | 30 | 200 | 5 | 15
1 | 3 | 30 | 200 | 6 | 16
1 | 4 | | 300 | 9 | 19
| 0 | 10 | 300 | 8 | 18
(10 rows)
</pre></li>
<li>Let's use a different aggregate function on the view we just created: <pre class="example">
DROP TABLE IF EXISTS pivout;
SELECT madlib.pivot('pivset_ext', 'pivout', 'id', 'piv', 'val', 'sum');
SELECT * FROM pivout ORDER BY id;
</pre> <pre class="result">
id | val_sum_piv_10 | val_sum_piv_20 | val_sum_piv_30
----+----------------+----------------+----------------
0 | 3 | 3 |
1 | 7 | 4 | 11
| 8 | |
</pre></li>
<li>Now create a custom aggregate. Note that the aggregate must have a strict transition function: <pre class="example">
DROP FUNCTION IF EXISTS array_add1 (ANYARRAY, ANYELEMENT) CASCADE;
CREATE FUNCTION array_add1(ANYARRAY, ANYELEMENT) RETURNS ANYARRAY AS $$
SELECT $1 || $2
$$ LANGUAGE sql STRICT;
DROP AGGREGATE IF EXISTS array_accum1 (anyelement);
CREATE AGGREGATE array_accum1 (anyelement) (
sfunc = array_add1,
stype = anyarray,
initcond = '{}'
);
DROP TABLE IF EXISTS pivout;
SELECT madlib.pivot('pivset_ext', 'pivout', 'id', 'piv', 'val', 'array_accum1');
SELECT * FROM pivout ORDER BY id;
</pre> <pre class="result">
id | val_array_accum1_piv_10 | val_array_accum1_piv_20 | val_array_accum1_piv_30
----+-------------------------+-------------------------+-------------------------
0 | {1,2} | {3} | {}
1 | {7} | {4} | {5,6}
| {8} | {} | {}
</pre></li>
<li>Keep null values in the pivot column: <pre class="example">
DROP TABLE IF EXISTS pivout;
SELECT madlib.pivot('pivset_ext', 'pivout', 'id', 'piv', 'val', 'sum', True);
SELECT * FROM pivout ORDER BY id;
</pre> <pre class="result">
id | val_sum_piv_10 | val_sum_piv_20 | val_sum_piv_30 | val_sum_piv_null
----+----------------+----------------+----------------+------------------
0 | 3 | 3 | |
1 | 7 | 4 | 11 | 9
| 8 | | |
</pre></li>
<li>Fill null results with a value of interest: <pre class="example">
DROP TABLE IF EXISTS pivout;
SELECT madlib.pivot('pivset_ext', 'pivout', 'id', 'piv', 'val', 'sum', '111');
SELECT * FROM pivout ORDER BY id;
</pre> <pre class="result">
id | val_sum_piv_10 | val_sum_piv_20 | val_sum_piv_30
----+----------------+----------------+----------------
0 | 3 | 3 | 111
1 | 7 | 4 | 11
| 8 | 111 | 111
</pre></li>
<li>Use multiple index columns: <pre class="example">
DROP TABLE IF EXISTS pivout;
SELECT madlib.pivot('pivset_ext', 'pivout', 'id,id2', 'piv', 'val');
SELECT * FROM pivout ORDER BY id,id2;
</pre> <pre class="result">
id | id2 | val_avg_piv_10 | val_avg_piv_20 | val_avg_piv_30
----+-----+----------------+----------------+----------------
0 | 0 | 1 | |
0 | 1 | 2 | 3 |
1 | 0 | | |
1 | 2 | | 4 |
1 | 3 | 7 | | 5.5
1 | 4 | | |
| 0 | 8 | |
</pre></li>
<li>Turn on the extended view for readability: <pre class="example">
\x on
</pre></li>
<li>Use multiple pivot columns: <pre class="example">
DROP TABLE IF EXISTS pivout;
SELECT madlib.pivot('pivset_ext', 'pivout', 'id', 'piv, piv2', 'val');
SELECT * FROM pivout ORDER BY id;
</pre> <pre class="result">
-[ RECORD 1 ]-----------+----
id | 0
val_avg_piv_10_piv2_0 | 1
val_avg_piv_10_piv2_100 | 2
val_avg_piv_10_piv2_200 |
val_avg_piv_10_piv2_300 |
val_avg_piv_20_piv2_0 |
val_avg_piv_20_piv2_100 | 3
val_avg_piv_20_piv2_200 |
val_avg_piv_20_piv2_300 |
val_avg_piv_30_piv2_0 |
val_avg_piv_30_piv2_100 |
val_avg_piv_30_piv2_200 |
val_avg_piv_30_piv2_300 |
-[ RECORD 2 ]-----------+----
id | 1
val_avg_piv_10_piv2_0 |
val_avg_piv_10_piv2_100 |
val_avg_piv_10_piv2_200 | 7
val_avg_piv_10_piv2_300 |
val_avg_piv_20_piv2_0 |
val_avg_piv_20_piv2_100 | 4
val_avg_piv_20_piv2_200 |
val_avg_piv_20_piv2_300 |
val_avg_piv_30_piv2_0 |
val_avg_piv_30_piv2_100 |
val_avg_piv_30_piv2_200 | 5.5
val_avg_piv_30_piv2_300 |
...
</pre></li>
<li>Use multiple value columns: <pre class="example">
DROP TABLE IF EXISTS pivout;
SELECT madlib.pivot('pivset_ext', 'pivout', 'id', 'piv', 'val, val2');
SELECT * FROM pivout ORDER BY id;
</pre> <pre class="result">
-[ RECORD 1 ]---+-----
id | 0
val_avg_piv_10 | 1.5
val_avg_piv_20 | 3
val_avg_piv_30 |
val2_avg_piv_10 | 11.5
val2_avg_piv_20 | 13
val2_avg_piv_30 |
-[ RECORD 2 ]---+-----
id | 1
val_avg_piv_10 | 7
val_avg_piv_20 | 4
val_avg_piv_30 | 5.5
val2_avg_piv_10 | 8.5
val2_avg_piv_20 | 14
val2_avg_piv_30 | 15.5
...
</pre></li>
<li>Use multiple aggregate functions on the same value column (cross product): <pre class="example">
DROP TABLE IF EXISTS pivout;
SELECT madlib.pivot('pivset_ext', 'pivout', 'id', 'piv', 'val', 'avg, sum');
SELECT * FROM pivout ORDER BY id;
</pre> <pre class="result">
-[ RECORD 1 ]--+----
id | 0
val_avg_piv_10 | 1.5
val_avg_piv_20 | 3
val_avg_piv_30 |
val_sum_piv_10 | 3
val_sum_piv_20 | 3
val_sum_piv_30 |
-[ RECORD 2 ]--+----
id | 1
val_avg_piv_10 | 7
val_avg_piv_20 | 4
val_avg_piv_30 | 5.5
val_sum_piv_10 | 7
val_sum_piv_20 | 4
val_sum_piv_30 | 11
...
</pre></li>
<li>Use different aggregate functions for different value columns: <pre class="example">
DROP TABLE IF EXISTS pivout;
SELECT madlib.pivot('pivset_ext', 'pivout', 'id', 'piv', 'val, val2',
'val=avg, val2=sum');
SELECT * FROM pivout ORDER BY id;
</pre> <pre class="result">
-[ RECORD 1 ]---+----
id | 0
val_avg_piv_10 | 1.5
val_avg_piv_20 | 3
val_avg_piv_30 |
val2_sum_piv_10 | 23
val2_sum_piv_20 | 13
val2_sum_piv_30 |
-[ RECORD 2 ]---+----
id | 1
val_avg_piv_10 | 7
val_avg_piv_20 | 4
val_avg_piv_30 | 5.5
val2_sum_piv_10 | 17
val2_sum_piv_20 | 14
val2_sum_piv_30 | 31
...
</pre></li>
<li>Use multiple aggregate functions for different value columns: <pre class="example">
DROP TABLE IF EXISTS pivout;
SELECT madlib.pivot('pivset_ext', 'pivout', 'id', 'piv', 'val, val2',
'val=avg, val2=[avg,sum]');
SELECT * FROM pivout ORDER BY id;
</pre> <pre class="result">
-[ RECORD 1 ]---+-----
id | 0
val_avg_piv_10 | 1.5
val_avg_piv_20 | 3
val_avg_piv_30 |
val2_avg_piv_10 | 11.5
val2_avg_piv_20 | 13
val2_avg_piv_30 |
val2_sum_piv_10 | 23
val2_sum_piv_20 | 13
val2_sum_piv_30 |
-[ RECORD 2 ]---+-----
id | 1
val_avg_piv_10 | 7
val_avg_piv_20 | 4
val_avg_piv_30 | 5.5
val2_avg_piv_10 | 8.5
val2_avg_piv_20 | 14
val2_avg_piv_30 | 15.5
val2_sum_piv_10 | 17
val2_sum_piv_20 | 14
val2_sum_piv_30 | 31
...
</pre></li>
<li>Combine all of the options: <pre class="example">
DROP TABLE IF EXISTS pivout;
SELECT madlib.pivot('pivset_ext', 'pivout', 'id, id2', 'piv, piv2', 'val, val2',
'val=avg, val2=[avg,sum]', '111', True);
SELECT * FROM pivout ORDER BY id,id2;
</pre> <pre class="result">
-[ RECORD 1 ]--------------+-----
id | 0
id2 | 0
val_avg_piv_null_piv2_0 | 111
val_avg_piv_null_piv2_100 | 111
val_avg_piv_null_piv2_200 | 111
val_avg_piv_null_piv2_300 | 111
val_avg_piv_10_piv2_0 | 1
val_avg_piv_10_piv2_100 | 111
val_avg_piv_10_piv2_200 | 111
val_avg_piv_10_piv2_300 | 111
val_avg_piv_20_piv2_0 | 111
val_avg_piv_20_piv2_100 | 111
val_avg_piv_20_piv2_200 | 111
val_avg_piv_20_piv2_300 | 111
val_avg_piv_30_piv2_0 | 111
val_avg_piv_30_piv2_100 | 111
val_avg_piv_30_piv2_200 | 111
val_avg_piv_30_piv2_300 | 111
val2_avg_piv_null_piv2_0 | 111
val2_avg_piv_null_piv2_100 | 111
val2_avg_piv_null_piv2_200 | 111
val2_avg_piv_null_piv2_300 | 111
val2_avg_piv_10_piv2_0 | 11
val2_avg_piv_10_piv2_100 | 111
val2_avg_piv_10_piv2_200 | 111
val2_avg_piv_10_piv2_300 | 111
val2_avg_piv_20_piv2_0 | 111
val2_avg_piv_20_piv2_100 | 111
val2_avg_piv_20_piv2_200 | 111
val2_avg_piv_20_piv2_300 | 111
val2_avg_piv_30_piv2_0 | 111
val2_avg_piv_30_piv2_100 | 111
val2_avg_piv_30_piv2_200 | 111
val2_avg_piv_30_piv2_300 | 111
val2_sum_piv_null_piv2_0 | 111
val2_sum_piv_null_piv2_100 | 111
val2_sum_piv_null_piv2_200 | 111
val2_sum_piv_null_piv2_300 | 111
val2_sum_piv_10_piv2_0 | 11
val2_sum_piv_10_piv2_100 | 111
val2_sum_piv_10_piv2_200 | 111
val2_sum_piv_10_piv2_300 | 111
val2_sum_piv_20_piv2_0 | 111
val2_sum_piv_20_piv2_100 | 111
val2_sum_piv_20_piv2_200 | 111
val2_sum_piv_20_piv2_300 | 111
val2_sum_piv_30_piv2_0 | 111
val2_sum_piv_30_piv2_100 | 111
val2_sum_piv_30_piv2_200 | 111
val2_sum_piv_30_piv2_300 | 111
-[ RECORD 2 ]--------------+-----
id | 0
id2 | 1
val_avg_piv_null_piv2_0 | 111
val_avg_piv_null_piv2_100 | 111
val_avg_piv_null_piv2_200 | 111
val_avg_piv_null_piv2_300 | 111
val_avg_piv_10_piv2_0 | 111
val_avg_piv_10_piv2_100 | 2
val_avg_piv_10_piv2_200 | 111
val_avg_piv_10_piv2_300 | 111
val_avg_piv_20_piv2_0 | 111
val_avg_piv_20_piv2_100 | 3
val_avg_piv_20_piv2_200 | 111
val_avg_piv_20_piv2_300 | 111
val_avg_piv_30_piv2_0 | 111
val_avg_piv_30_piv2_100 | 111
val_avg_piv_30_piv2_200 | 111
val_avg_piv_30_piv2_300 | 111
val2_avg_piv_null_piv2_0 | 111
val2_avg_piv_null_piv2_100 | 111
val2_avg_piv_null_piv2_200 | 111
val2_avg_piv_null_piv2_300 | 111
val2_avg_piv_10_piv2_0 | 111
val2_avg_piv_10_piv2_100 | 12
val2_avg_piv_10_piv2_200 | 111
val2_avg_piv_10_piv2_300 | 111
val2_avg_piv_20_piv2_0 | 111
val2_avg_piv_20_piv2_100 | 13
val2_avg_piv_20_piv2_200 | 111
val2_avg_piv_20_piv2_300 | 111
val2_avg_piv_30_piv2_0 | 111
val2_avg_piv_30_piv2_100 | 111
val2_avg_piv_30_piv2_200 | 111
val2_avg_piv_30_piv2_300 | 111
val2_sum_piv_null_piv2_0 | 111
val2_sum_piv_null_piv2_100 | 111
val2_sum_piv_null_piv2_200 | 111
val2_sum_piv_null_piv2_300 | 111
val2_sum_piv_10_piv2_0 | 111
val2_sum_piv_10_piv2_100 | 12
val2_sum_piv_10_piv2_200 | 111
val2_sum_piv_10_piv2_300 | 111
val2_sum_piv_20_piv2_0 | 111
val2_sum_piv_20_piv2_100 | 13
val2_sum_piv_20_piv2_200 | 111
val2_sum_piv_20_piv2_300 | 111
val2_sum_piv_30_piv2_0 | 111
val2_sum_piv_30_piv2_100 | 111
val2_sum_piv_30_piv2_200 | 111
val2_sum_piv_30_piv2_300 | 111
...
</pre></li>
<li>Create a dictionary for output column names: <pre class="example">
DROP TABLE IF EXISTS pivout, pivout_dictionary;
SELECT madlib.pivot('pivset_ext', 'pivout', 'id, id2', 'piv, piv2', 'val, val2',
'val=avg, val2=[avg,sum]', '111', True, True);
SELECT * FROM pivout_dictionary;
</pre> <pre class="result">
__pivot_cid__ | pval | agg | piv | piv2 | col_name
---------------+------+-----+-----+------+------------------------------
__p_1__ | val | avg | | 100 | "val_avg_piv_null_piv2_100"
__p_5__ | val | avg | 10 | 100 | "val_avg_piv_10_piv2_100"
__p_9__ | val | avg | 20 | 100 | "val_avg_piv_20_piv2_100"
__p_12__ | val | avg | 30 | 0 | "val_avg_piv_30_piv2_0"
__p_16__ | val2 | avg | | 0 | "val2_avg_piv_null_piv2_0"
__p_23__ | val2 | avg | 10 | 300 | "val2_avg_piv_10_piv2_300"
__p_27__ | val2 | avg | 20 | 300 | "val2_avg_piv_20_piv2_300"
__p_30__ | val2 | avg | 30 | 200 | "val2_avg_piv_30_piv2_200"
__p_34__ | val2 | sum | | 200 | "val2_sum_piv_null_piv2_200"
__p_38__ | val2 | sum | 10 | 200 | "val2_sum_piv_10_piv2_200"
__p_41__ | val2 | sum | 20 | 100 | "val2_sum_piv_20_piv2_100"
__p_45__ | val2 | sum | 30 | 100 | "val2_sum_piv_30_piv2_100"
__p_2__ | val | avg | | 200 | "val_avg_piv_null_piv2_200"
__p_6__ | val | avg | 10 | 200 | "val_avg_piv_10_piv2_200"
__p_11__ | val | avg | 20 | 300 | "val_avg_piv_20_piv2_300"
__p_15__ | val | avg | 30 | 300 | "val_avg_piv_30_piv2_300"
__p_19__ | val2 | avg | | 300 | "val2_avg_piv_null_piv2_300"
__p_20__ | val2 | avg | 10 | 0 | "val2_avg_piv_10_piv2_0"
__p_24__ | val2 | avg | 20 | 0 | "val2_avg_piv_20_piv2_0"
__p_28__ | val2 | avg | 30 | 0 | "val2_avg_piv_30_piv2_0"
__p_33__ | val2 | sum | | 100 | "val2_sum_piv_null_piv2_100"
__p_37__ | val2 | sum | 10 | 100 | "val2_sum_piv_10_piv2_100"
__p_42__ | val2 | sum | 20 | 200 | "val2_sum_piv_20_piv2_200"
__p_46__ | val2 | sum | 30 | 200 | "val2_sum_piv_30_piv2_200"
__p_3__ | val | avg | | 300 | "val_avg_piv_null_piv2_300"
__p_7__ | val | avg | 10 | 300 | "val_avg_piv_10_piv2_300"
__p_10__ | val | avg | 20 | 200 | "val_avg_piv_20_piv2_200"
__p_14__ | val | avg | 30 | 200 | "val_avg_piv_30_piv2_200"
__p_18__ | val2 | avg | | 200 | "val2_avg_piv_null_piv2_200"
__p_21__ | val2 | avg | 10 | 100 | "val2_avg_piv_10_piv2_100"
__p_25__ | val2 | avg | 20 | 100 | "val2_avg_piv_20_piv2_100"
__p_29__ | val2 | avg | 30 | 100 | "val2_avg_piv_30_piv2_100"
__p_32__ | val2 | sum | | 0 | "val2_sum_piv_null_piv2_0"
__p_36__ | val2 | sum | 10 | 0 | "val2_sum_piv_10_piv2_0"
__p_43__ | val2 | sum | 20 | 300 | "val2_sum_piv_20_piv2_300"
__p_47__ | val2 | sum | 30 | 300 | "val2_sum_piv_30_piv2_300"
__p_0__ | val | avg | | 0 | "val_avg_piv_null_piv2_0"
__p_4__ | val | avg | 10 | 0 | "val_avg_piv_10_piv2_0"
__p_8__ | val | avg | 20 | 0 | "val_avg_piv_20_piv2_0"
__p_13__ | val | avg | 30 | 100 | "val_avg_piv_30_piv2_100"
__p_17__ | val2 | avg | | 100 | "val2_avg_piv_null_piv2_100"
__p_22__ | val2 | avg | 10 | 200 | "val2_avg_piv_10_piv2_200"
__p_26__ | val2 | avg | 20 | 200 | "val2_avg_piv_20_piv2_200"
__p_31__ | val2 | avg | 30 | 300 | "val2_avg_piv_30_piv2_300"
__p_35__ | val2 | sum | | 300 | "val2_sum_piv_null_piv2_300"
__p_39__ | val2 | sum | 10 | 300 | "val2_sum_piv_10_piv2_300"
__p_40__ | val2 | sum | 20 | 0 | "val2_sum_piv_20_piv2_0"
__p_44__ | val2 | sum | 30 | 0 | "val2_sum_piv_30_piv2_0"
(48 rows)
</pre> <pre class="example">
SELECT * FROM pivout ORDER BY id,id2;
</pre> <pre class="result">
-[ RECORD 1 ]--
id | 0
id2 | 0
__p_0__ | 111
__p_1__ | 111
__p_2__ | 111
__p_3__ | 111
__p_4__ | 1
__p_5__ | 111
__p_6__ | 111
__p_7__ | 111
__p_8__ | 111
__p_9__ | 111
__p_10__ | 111
__p_11__ | 111
__p_12__ | 111
__p_13__ | 111
__p_14__ | 111
__p_15__ | 111
__p_16__ | 111
__p_17__ | 111
__p_18__ | 111
__p_19__ | 111
__p_20__ | 11
__p_21__ | 111
__p_22__ | 111
__p_23__ | 111
__p_24__ | 111
__p_25__ | 111
__p_26__ | 111
__p_27__ | 111
__p_28__ | 111
__p_29__ | 111
__p_30__ | 111
__p_31__ | 111
__p_32__ | 111
__p_33__ | 111
__p_34__ | 111
__p_35__ | 111
__p_36__ | 11
__p_37__ | 111
__p_38__ | 111
__p_39__ | 111
__p_40__ | 111
__p_41__ | 111
__p_42__ | 111
__p_43__ | 111
__p_44__ | 111
__p_45__ | 111
__p_46__ | 111
__p_47__ | 111
-[ RECORD 2 ]--
id | 0
id2 | 1
__p_0__ | 111
__p_1__ | 111
__p_2__ | 111
__p_3__ | 111
__p_4__ | 111
__p_5__ | 2
__p_6__ | 111
__p_7__ | 111
__p_8__ | 111
__p_9__ | 3
__p_10__ | 111
__p_11__ | 111
__p_12__ | 111
__p_13__ | 111
__p_14__ | 111
__p_15__ | 111
__p_16__ | 111
__p_17__ | 111
__p_18__ | 111
__p_19__ | 111
__p_20__ | 111
__p_21__ | 12
__p_22__ | 111
__p_23__ | 111
__p_24__ | 111
__p_25__ | 13
__p_26__ | 111
__p_27__ | 111
__p_28__ | 111
__p_29__ | 111
__p_30__ | 111
__p_31__ | 111
__p_32__ | 111
__p_33__ | 111
__p_34__ | 111
__p_35__ | 111
__p_36__ | 111
__p_37__ | 12
__p_38__ | 111
__p_39__ | 111
__p_40__ | 111
__p_41__ | 13
__p_42__ | 111
__p_43__ | 111
__p_44__ | 111
__p_45__ | 111
__p_46__ | 111
__p_47__ | 111
...
</pre></li>
</ol>
<p><a class="anchor" id="literature"></a></p><dl class="section user"><dt>Literature</dt><dd></dd></dl>
<p><a class="anchor" id="svm-lit-1"></a>[1] <a href="https://www.postgresql.org/docs/8.2/static/functions-aggregate.html">https://www.postgresql.org/docs/8.2/static/functions-aggregate.html</a></p>
<p>[2] <a href="https://www.postgresql.org/docs/8.2/static/sql-createaggregate.html">https://www.postgresql.org/docs/8.2/static/sql-createaggregate.html</a></p>
<p>[3] <a href="https://www.postgresql.org/docs/8.2/static/xaggr.html">https://www.postgresql.org/docs/8.2/static/xaggr.html</a> </p>
</div><!-- contents -->
</div><!-- doc-content -->
<!-- start footer part -->
<div id="nav-path" class="navpath"><!-- id is needed for treeview function! -->
<ul>
<li class="footer">Generated on Tue Sep 20 2016 11:27:01 for MADlib by
<a href="http://www.doxygen.org/index.html">
<img class="footer" src="doxygen.png" alt="doxygen"/></a> 1.8.10 </li>
</ul>
</div>
</body>
</html>