{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# SQLContext Sharing <a name=\"top\"></a>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This example shows how Toree enables sharing of the SQLContext across the variety of languages that it supports (Scala, Python, R, SQL). To demostrate, this notebook will load data using one language and read it from another. Refer to the [Spark documentation](http://spark.apache.org/docs/latest/sql-programming-guide.html) for details about the DataFrame and SQL APIs."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<div class=\"alert alert-info\" role=\"alert\" style=\"margin-top: 10px\">\n",
"<p><strong>Note</strong><p>\n",
"\n",
"<p>Due to an issue installing R and running it using DockerMachine, we are not able to show an example with R.</p>\n",
"</div>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Table of Contents**\n",
"\n",
"1. [Create a DataFrame in Scala](#create-in-scala)\n",
"2. [Read DataFrame in Python](#read-in-python)\n",
"3. [Create a DataFrame in Python](#create-in-python)\n",
"4. [Read DataFrame in Scala](#read-in-scala)\n",
"5. [Read DataFrame in SQL](#read-in-sql)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create a DataFrame in Scala <a name=\"create-in-scala\"></a><span style=\"float: right; font-size: 0.5em\"><a href=\"#top\">Top</a></span>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"val people = spark.read.json(\"people.json\")\n",
"people.createOrReplaceTempView(\"people\")\n",
"people.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Read DataFrame in Python <a name=\"read-in-python\"></a> <span style=\"float: right; font-size: 0.5em\"><a href=\"#top\">Top</a></span>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"%%PySpark\n",
"people = spark.table(\"people\")\n",
"people.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create a DataFrame in Python <a name=\"create-in-python\"></a> <span style=\"float: right; font-size: 0.5em\"><a href=\"#top\">Top</a></span>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"%%PySpark\n",
"cars = spark.read.json(\"cars.json\")\n",
"cars.createOrReplaceTempView(\"cars\")\n",
"cars.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Read DataFrame in Scala <a name=\"read-in-scala\"></a><span style=\"float: right; font-size: 0.5em\"><a href=\"#top\">Top</a></span>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"val cars = spark.table(\"cars\")\n",
"cars.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Read DataFrame in SQL <a name=\"read-in-sql\"></a><span style=\"float: right; font-size: 0.5em\"><a href=\"#top\">Top</a></span>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"%%sql\n",
"select * from cars where manufacturer == 'Audi'"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Apache Toree - Scala",
"language": "scala",
"name": "apache_toree_scala"
},
"language_info": {
"file_extension": ".scala",
"name": "scala",
"version": "2.11.8"
}
},
"nbformat": 4,
"nbformat_minor": 0
}