---
layout: docs
title: Pig adapter
permalink: /docs/pig_adapter.html
---

Overview

The Pig adapter allows you to write queries in SQL and execute them using Apache Pig.

A simple example

Let's start with a simple example. First, we need a [model definition]({{ site.baseurl }}/docs/model.html), as follows.

{% highlight json %}
{
  "version": "1.0",
  "defaultSchema": "PIG",
  "schemas": [
    {
      "name": "PIG",
      "type": "custom",
      "factory": "org.apache.calcite.adapter.pig.PigSchemaFactory",
      "tables": [
        {
          "name": "t",
          "type": "custom",
          "factory": "org.apache.calcite.adapter.pig.PigTableFactory",
          "operand": {
            "file": "data.txt",
            "columns": ["tc0", "tc1"]
          }
        },
        {
          "name": "s",
          "type": "custom",
          "factory": "org.apache.calcite.adapter.pig.PigTableFactory",
          "operand": {
            "file": "data2.txt",
            "columns": ["sc0", "sc1"]
          }
        }
      ]
    }
  ]
}
{% endhighlight %}

Now, if you write the SQL query

{% highlight sql %}
select *
from "t"
join "s" on "tc1" = "sc0"
{% endhighlight %}

the Pig adapter will generate the Pig Latin script

{% highlight sql %}
t = LOAD 'data.txt' USING PigStorage() AS (tc0:chararray, tc1:chararray);
s = LOAD 'data2.txt' USING PigStorage() AS (sc0:chararray, sc1:chararray);
t = JOIN t BY tc1, s BY sc0;
{% endhighlight %}

which is then executed using Pig's runtime, typically MapReduce on Apache Hadoop.
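To show how such a query might be submitted, here is a minimal sketch that runs it through Calcite's JDBC driver. It rests on several assumptions that the text above does not state: the model is saved as `model.json`, its default schema is the one containing `t` and `s`, and the Calcite core, Pig adapter, and Pig jars are on the classpath. The class name `PigAdapterExample` and its helper `jdbcUrl` are hypothetical names chosen for illustration.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical example class; not part of Calcite. */
class PigAdapterExample {
  /** The query from the example above; identifiers are quoted to keep them lower-case. */
  static final String QUERY =
      "select * from \"t\" join \"s\" on \"tc1\" = \"sc0\"";

  /** Builds a Calcite JDBC URL that points at a model file. */
  static String jdbcUrl(String modelPath) {
    return "jdbc:calcite:model=" + modelPath;
  }

  public static void main(String[] args) throws SQLException {
    // Assumption: the model file path is given as the first argument,
    // falling back to "model.json" in the working directory.
    String modelPath = args.length > 0 ? args[0] : "model.json";

    try (Connection connection = DriverManager.getConnection(jdbcUrl(modelPath));
         Statement statement = connection.createStatement();
         ResultSet resultSet = statement.executeQuery(QUERY)) {
      ResultSetMetaData metaData = resultSet.getMetaData();
      int columnCount = metaData.getColumnCount();

      // Print each joined row as tab-separated values.
      while (resultSet.next()) {
        StringBuilder row = new StringBuilder();
        for (int i = 1; i <= columnCount; i++) {
          if (i > 1) {
            row.append('\t');
          }
          row.append(resultSet.getString(i));
        }
        System.out.println(row);
      }
    }
  }
}
```

The sketch uses only standard JDBC interfaces, so the Pig-specific behavior comes entirely from the model file named in the URL.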

Relationship to Piglet

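Calcite has another component called Piglet. It allows you to write queries in a subset of Pig Latin and execute them using any applicable Calcite adapter. Piglet is therefore the converse of the Pig adapter: the Pig adapter takes SQL as input and executes it on Pig, whereas Piglet takes Pig Latin as input and executes it through Calcite adapters.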