{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Toree Magics<a name=\"top\"></a>\n",
"Magics are special \"functions\" which enable features or execute some special code. Magics can receive input arguments when they are invoked. There are two types of magics: `cell` magics and `line` magics. Magics invocations are not case sensitive.\n",
"\n",
"**Table of Contents**\n",
"\n",
"1. [Line Magics](#line-magics)\n",
" 1. [LsMagic](#lsmagic)\n",
" 1. [Truncation](#truncation)\n",
" 1. [ShowTypes](#showtypes)\n",
" 1. [AddJar](#addjar)\n",
" 1. [AddDeps](#adddeps)\n",
"1. [Cell Magics](#cell-magics)\n",
" 1. [DataFrame](#dataframe)\n",
" 1. [Html](#html)\n",
" 1. [JavaScript](#javascript)\n",
" 1. [PySpark](#pyspark)\n",
" 1. [SparkR](#sparkr)\n",
" 1. [SparkSQL](#sparksql)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Line Magics<a name=\"line-magics\"></a><span style=\"float: right; font-size: 0.5em\"><a href=\"#top\">Top</a></span>\n",
"Line magics are run on a single line and can have other code and line magics within the same cell. Line magics use the following syntax: \n",
"\n",
"```\n",
"%magicname [args]\n",
"```\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### %LsMagic<a name=\"lsmagic\"></a><span style=\"float: right; font-size: 0.5em\"><a href=\"#top\">Top</a></span>\n",
"The `LsMagic` is a magic to list all the available magics."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Available line magics:\n",
"%addjar %lsmagic %showtypes %adddeps %truncation\n",
"\n",
"Available cell magics:\n",
"%%sql %%html %%javascript %%rdd %%scala %%sparkr %%pyspark\n",
"\n",
"Type %<magic_name> for usage info.\n",
" \n"
]
}
],
"source": [
"%LsMagic"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### %Truncation<a name=\"truncation\"></a><span style=\"float: right; font-size: 0.5em\"><a href=\"#top\">Top</a></span>\n",
"Toree will, by default, truncate results from statements. This can be managed through the `%Truncation` magic. To see the current state of the truncation setting you can invoke the magic."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Truncation is currently on \n"
]
}
],
"source": [
"// invoke the truncation magic to see if truncation is on or off\n",
"%Truncation"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"Range(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170..."
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"// return a value to see the truncation\n",
"(1 to 200)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Output will NOT be truncated\n"
]
},
{
"data": {
"text/plain": [
"Range(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200)"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%Truncation off\n",
"(1 to 200)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Output WILL be truncated.\n"
]
},
{
"data": {
"text/plain": [
"Range(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170..."
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%Truncation on\n",
"(1 to 200)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### %ShowTypes<a name=\"showtypes\"></a><span style=\"float: right; font-size: 0.5em\"><a href=\"#top\">Top</a></span>\n",
"The type information for a result is hidden by default. This behavior can be changed by using the `%ShowTypes` magic. You can view the current state of `%ShowTypes` by invoking it with no arguments."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"ShowTypes is currently off \n"
]
}
],
"source": [
"%ShowTypes"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"Hello types!"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"\"Hello types!\""
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Types will be printed.\n"
]
},
{
"data": {
"text/plain": [
"String = Hello types!"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%ShowTypes on\n",
"\"Hello types!\""
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"(Int, String) = (1,Hello types!)"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"(1, \"Hello types!\")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Types will not be printed\n"
]
},
{
"data": {
"text/plain": [
"Hello types!"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%ShowTypes off\n",
"\"Hello types!\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### %AddJar<a name=\"addjar\"></a><span style=\"float: right; font-size: 0.5em\"><a href=\"#top\">Top</a></span>\n",
"`AddJar` is a magic that allows the addition of jars to Torree's environment. You can see the arguments for `AddJar` by invoking it with no arguments."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Usage: %AddJar <jar_url>\n",
"\n",
"Option Description \n",
"------ ----------- \n",
"-f forces re-download of specified jar\n",
"--magic loads jar as a magic extension \n"
]
}
],
"source": [
"%AddJar"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Starting download from https://repo1.maven.org/maven2/org/lwjgl/lwjgl/3.0.0b/lwjgl-3.0.0b.jar\n",
"Finished download of lwjgl-3.0.0b.jar\n"
]
}
],
"source": [
"%AddJar https://repo1.maven.org/maven2/org/lwjgl/lwjgl/3.0.0b/lwjgl-3.0.0b.jar"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"3.0.0b SNAPSHOT"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"org.lwjgl.Version.getVersion()"
]
},
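{
"cell_type": "markdown",
"metadata": {},
"source": [
"The usage text above lists a `-f` option that forces a re-download of the jar. A minimal sketch of how that might look, reusing the jar URL from the earlier cell; placing the flag after the URL is an assumption, and this invocation has not been run here:\n",
"\n",
"```scala\n",
"// Force a fresh download of a jar that was fetched before.\n",
"// Assumption: the -f flag follows the jar URL.\n",
"%AddJar https://repo1.maven.org/maven2/org/lwjgl/lwjgl/3.0.0b/lwjgl-3.0.0b.jar -f\n",
"```"
]
},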
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## %AddDeps<a name=\"adddeps\"></a><span style=\"float: right; font-size: 0.5em\"><a href=\"#top\">Top</a></span>\n",
"`AddDeps` is a magic to add dependencies from a maven repository. You can see the arguments for `AddDeps` by invoking it with no arguments."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Usage: %AddDeps my.company artifact-id version\n",
"\n",
"Option Description \n",
"------ ----------- \n",
"--abort-on-resolution-errors Abort (no downloads) when resolution \n",
" fails \n",
"--repository Adds an additional repository to \n",
" available list \n",
"--trace Prints out trace of download progress\n",
"--transitive Retrieve dependencies recursively \n",
"--verbose Prints out additional information \n"
]
}
],
"source": [
"%AddDeps"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note, that by default the `AddDeps` magic will only retrieve the specified dependency. If you want the transitive dependencies provide the `--transitive` flag."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Marking org.joda:joda-money:0.11 for download\n",
"Preparing to fetch from:\n",
"-> file:/tmp/toree_add_deps5662724810625125387/\n",
"-> https://repo1.maven.org/maven2\n",
"=> 1 (): Downloading https://repo1.maven.org/maven2/org/joda/joda-money/0.11/joda-money-0.11.pom.sha1\n",
"=> 1 (): Downloading https://repo1.maven.org/maven2/org/joda/joda-money/0.11/joda-money-0.11.pom\n",
"===> 1 (joda-money-0.11.pom.sha1): Is 40 total bytes\n",
"===> 1 (joda-money-0.11.pom.sha1): Downloaded 40 bytes (100.00%)\n",
"=> 1 (joda-money-0.11.pom.sha1): Finished downloading\n",
"===> 1 (joda-money-0.11.pom): Is 22792 total bytes\n",
"===> 1 (joda-money-0.11.pom): Downloaded 15727 bytes (69.00%)\n",
"===> 1 (joda-money-0.11.pom): Downloaded 22792 bytes (100.00%)\n",
"=> 1 (joda-money-0.11.pom): Finished downloading\n",
"=> 1 (): Downloading https://repo1.maven.org/maven2/org/joda/joda-money/0.11/joda-money-0.11.jar.sha1\n",
"=> 2 (): Downloading https://repo1.maven.org/maven2/org/joda/joda-money/0.11/joda-money-0.11.jar\n",
"===> 1 (joda-money-0.11.jar.sha1): Is 40 total bytes\n",
"===> 1 (joda-money-0.11.jar.sha1): Downloaded 40 bytes (100.00%)\n",
"=> 1 (joda-money-0.11.jar.sha1): Finished downloading\n",
"===> 2 (joda-money-0.11.jar): Is 63725 total bytes\n",
"===> 2 (joda-money-0.11.jar): Downloaded 15713 bytes (24.66%)\n",
"===> 2 (joda-money-0.11.jar): Downloaded 32097 bytes (50.37%)\n",
"===> 2 (joda-money-0.11.jar): Downloaded 48481 bytes (76.08%)\n",
"===> 2 (joda-money-0.11.jar): Downloaded 63725 bytes (100.00%)\n",
"=> 2 (joda-money-0.11.jar): Finished downloading\n",
"-> New file at /tmp/toree_add_deps5662724810625125387/https/repo1.maven.org/maven2/org/joda/joda-money/0.11/joda-money-0.11.jar\n"
]
}
],
"source": [
"%AddDeps org.joda joda-money 0.11 --transitive --trace --verbose"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"AUD"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"org.joda.money.CurrencyUnit.AUD"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"## Cell Magics<a name=\"cell-magics\"></a><span style=\"float: right; font-size: 0.5em\"><a href=\"#top\">Top</a></span>\n",
"Cell magics are magics which take the whole cell as their argument. They take the following form:\n",
"\n",
"```\n",
"%%magicname\n",
"line1\n",
"line2\n",
"...\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### %%DataFrame<a name=\"dataframe\"></a><span style=\"float: right; font-size: 0.5em\"><a href=\"#top\">Top</a></span>\n",
"The `%%DataFrame` magic is used to convert a Spark SQL DataFrame into various formats. Currently, `json`, `html`, and `csv` are supported. The magic takes an expression, which evauluates to a dataframe, to perform the conversion. So, we first need to create a DataFrame object for reference."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"case class DFRecord(key: String, value: Int)\n",
"val sqlc = spark\n",
"import sqlc.implicits._\n",
"val df = sc.parallelize(1 to 10).map(x => DFRecord(x.toString, x)).toDF()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The default output is `html`"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"%%dataframe [arguments]\n",
"DATAFRAME_CODE\n",
"\n",
"DATAGRAME_CODE can be any numbered lines of code, as long as the\n",
"last line is a reference to a variable which is a DataFrame.\n",
" Option Description \n",
"------ ----------- \n",
"--help Displays the help and usage text for \n",
" this magic. \n",
"--limit The type of the output: html \n",
" (default), csv, json (default: 10) \n",
"--output The type of the output: html \n",
" (default), csv, json (default: html)\n"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%dataframe"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<table><tr><th>key</th><th>value</th></tr><tr><td>1</td><td>1</td></tr><tr><td>2</td><td>2</td></tr><tr><td>3</td><td>3</td></tr><tr><td>4</td><td>4</td></tr><tr><td>5</td><td>5</td></tr><tr><td>6</td><td>6</td></tr><tr><td>7</td><td>7</td></tr><tr><td>8</td><td>8</td></tr><tr><td>9</td><td>9</td></tr><tr><td>10</td><td>10</td></tr></table>"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%dataframe\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can specify the `--output` argument to change the output type."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"key,value\n",
"1,1\n",
"2,2\n",
"3,3\n",
"4,4\n",
"5,5\n",
"6,6\n",
"7,7\n",
"8,8\n",
"9,9\n",
"10,10"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%dataframe --output=csv\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There is also an option to limit the number of records returned."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<table><tr><th>key</th><th>value</th></tr><tr><td>1</td><td>1</td></tr><tr><td>2</td><td>2</td></tr><tr><td>3</td><td>3</td></tr></table>"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%dataframe --limit=3\n",
"df"
]
},
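{
"cell_type": "markdown",
"metadata": {},
"source": [
"The help text lists `json` as a third output type, which the cells above do not exercise. A minimal sketch, assuming the `df` defined earlier and assuming that `--output` and `--limit` can be combined in one invocation (not verified here):\n",
"\n",
"```scala\n",
"%%dataframe --output=json --limit=3\n",
"// The last line must reference a DataFrame; df is the one created above.\n",
"df\n",
"```"
]
},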
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"### %%Html<a name=\"html\"></a><span style=\"float: right; font-size: 0.5em\"><a href=\"#top\">Top</a></span>\n",
"The `%%HTML` magic allows you to return HTML."
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/html": [
"<p>\n",
"Hello, <strong>Magics</strong>!\n",
"</p>"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%html\n",
"<p>\n",
"Hello, <strong>Magics</strong>!\n",
"</p>"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"### %%JavaScript<a name=\"javascript\"></a><span style=\"float: right; font-size: 0.5em\"><a href=\"#top\">Top</a></span>\n",
"The `%%JavaScript` magic allows to return JavaScript. The JavaScript code will run in the notebook."
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%JavaScript\n",
"alert(\"Hello, Magics!\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"### %%PySpark<a name=\"pyspark\"></a><span style=\"float: right; font-size: 0.5em\"><a href=\"#top\">Top</a></span>\n",
"The `%%PySpark` exposes an environment with and a python interpreter and a shared `SparkContext`."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"4950\n"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%PySpark\n",
"from operator import add\n",
"print(sc.parallelize(range(1, 100)).reduce(add))"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"### %%SparkR<a name=\"sparkr\"></a><span style=\"float: right; font-size: 0.5em\"><a href=\"#top\">Top</a></span>\n",
"The `%%SparkR` exposes an environment with and an R interpreter and a shared `SparkContext`."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"* installing *binary* package ‘SparkR’ ...\n",
"* DONE (SparkR)\n",
"Loading required package: methods\n",
"\n",
"Attaching package: ‘SparkR’\n",
"\n",
"The following objects are masked from ‘package:stats’:\n",
"\n",
" filter, na.omit\n",
"\n",
"The following objects are masked from ‘package:base’:\n",
"\n",
" intersect, rbind, sample, subset, summary, table, transform\n",
"\n",
"NULL\n",
"[1] \"Received Id 680be9df-a384-455d-b548-405560772cc1 Code df <- createDataFrame(spark, faithful)\\nhead(df)\"\n",
"[1] \"Code expr df <- createDataFrame(spark, faithful)\"\n",
"[2] \"Code expr head(df)\" \n",
"[1] \"Result type character 7\"\n",
"[1] \"Success 680be9df-a384-455d-b548-405560772cc1 eruptions waiting\"\n",
"[2] \"Success 680be9df-a384-455d-b548-405560772cc1 1 3.600 79\"\n",
"[3] \"Success 680be9df-a384-455d-b548-405560772cc1 2 1.800 54\"\n",
"[4] \"Success 680be9df-a384-455d-b548-405560772cc1 3 3.333 74\"\n",
"[5] \"Success 680be9df-a384-455d-b548-405560772cc1 4 2.283 62\"\n",
"[6] \"Success 680be9df-a384-455d-b548-405560772cc1 5 4.533 85\"\n",
"[7] \"Success 680be9df-a384-455d-b548-405560772cc1 6 2.883 55\"\n",
"[1] \"Marking success with output: eruptions waiting\\n1 3.600 79\\n2 1.800 54\\n3 3.333 74\\n4 2.283 62\\n5 4.533 85\\n6 2.883 55\"\n"
]
},
{
"data": {
"text/plain": [
"eruptions waiting\n",
"1 3.600 79\n",
"2 1.800 54\n",
"3 3.333 74\n",
"4 2.283 62\n",
"5 4.533 85\n",
"6 2.883 55"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%SparkR\n",
"df <- createDataFrame(spark, faithful)\n",
"head(df)"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"### %%SparkSQL<a name=\"sparksql\"></a><span style=\"float: right; font-size: 0.5em\"><a href=\"#top\">Top</a></span>\n",
"The `%%SparkSQL` magic allows for SQL queries to be performed against tables saved in spark."
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"val sqlc = spark\n",
"import sqlc.implicits._\n",
"case class Record(key: String, value: Int)\n",
"val df = sc.parallelize(1 to 10).map(x => Record(x.toString, x)).toDF()\n",
"df.registerTempTable(\"MYTABLE\")"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"+---+-----+\n",
"|key|value|\n",
"+---+-----+\n",
"| 6| 6|\n",
"| 7| 7|\n",
"| 8| 8|\n",
"| 9| 9|\n",
"| 10| 10|\n",
"+---+-----+\n",
"\n"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%SQL\n",
"SELECT * FROM MYTABLE WHERE value >= 6"
]
},
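{
"cell_type": "markdown",
"metadata": {},
"source": [
"For comparison, the same query can be issued from Scala through `spark.sql`, which returns a DataFrame. This is a sketch that assumes the `spark` session provided by the kernel and the `MYTABLE` table registered above; `fromSix` is just an illustrative name:\n",
"\n",
"```scala\n",
"// Run the query through the SparkSession instead of the %%SQL magic.\n",
"// Assumes `spark` and the MYTABLE table from the cells above.\n",
"val fromSix = spark.sql(\"SELECT * FROM MYTABLE WHERE value >= 6\")\n",
"// Print the matching rows as a text table.\n",
"fromSix.show()\n",
"```"
]
},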
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"+---+-----+\n",
"|key|value|\n",
"+---+-----+\n",
"| 4| 4|\n",
"| 5| 5|\n",
"| 6| 6|\n",
"| 7| 7|\n",
"| 8| 8|\n",
"| 9| 9|\n",
"| 10| 10|\n",
"+---+-----+\n",
"\n"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%SQL\n",
"SELECT * FROM MYTABLE WHERE value >= 4"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Apache Toree - Scala",
"language": "scala",
"name": "apache_toree_scala"
},
"language_info": {
"file_extension": ".scala",
"name": "scala",
"version": "2.11.8"
}
},
"nbformat": 4,
"nbformat_minor": 0
}