{
"paragraphs": [
{
"title": "Introduction",
"text": "%md\n\nThis is a tutorial note for the Flink batch scenario (`note that you need Flink 1.9 or later`). You can run the Flink Scala API via `%flink` and Flink batch SQL via `%flink.bsql`. \nThis note uses Flink\u0027s DataSet API to demonstrate Flink\u0027s batch capability. The DataSet API is only supported by the flink planner, so here we have to set `zeppelin.flink.planner` to `flink`; otherwise the blink planner is used by default, and it doesn\u0027t support the DataSet API.\n",
"user": "anonymous",
"dateUpdated": "2019-10-08 15:15:54.910",
"config": {
"tableHide": false,
"editorSetting": {
"language": "markdown",
"editOnDblClick": true,
"completionKey": "TAB",
"completionSupport": false
},
"colWidth": 12.0,
"editorMode": "ace/mode/markdown",
"fontSize": 9.0,
"editorHide": true,
"title": false,
"runOnSelectionChange": true,
"checkEmpty": true,
"results": {},
"enabled": true
},
"settings": {
"params": {},
"forms": {}
},
"results": {
"code": "SUCCESS",
"msg": [
{
"type": "HTML",
"data": "\u003cdiv class\u003d\"markdown-body\"\u003e\n\u003cp\u003eThis is a tutorial note for the Flink batch scenario (\u003ccode\u003enote that you need Flink 1.9 or later\u003c/code\u003e). You can run the Flink Scala API via \u003ccode\u003e%flink\u003c/code\u003e and Flink batch SQL via \u003ccode\u003e%flink.bsql\u003c/code\u003e.\u003cbr/\u003eThis note uses Flink\u0026rsquo;s DataSet API to demonstrate Flink\u0026rsquo;s batch capability. The DataSet API is only supported by the flink planner, so here we have to set \u003ccode\u003ezeppelin.flink.planner\u003c/code\u003e to \u003ccode\u003eflink\u003c/code\u003e; otherwise the blink planner is used by default, and it doesn\u0026rsquo;t support the DataSet API.\u003c/p\u003e\n\u003c/div\u003e"
}
]
},
"apps": [],
"progressUpdateIntervalMs": 500,
"jobName": "paragraph_1569489641095_-188362229",
"id": "paragraph_1547794482637_957545547",
"dateCreated": "2019-09-26 17:20:41.095",
"dateStarted": "2019-10-08 15:15:54.910",
"dateFinished": "2019-10-08 15:15:54.923",
"status": "FINISHED"
},
{
"title": "Configure Flink Interpreter",
"text": "%flink.conf\n\nFLINK_HOME \u003cFLINK_INSTALLATION\u003e\n# The DataSet API is only supported by the flink planner, so we select it here. The default is the blink planner.\nzeppelin.flink.planner flink\n\n\n",
"user": "anonymous",
"dateUpdated": "2019-10-11 10:37:39.948",
"config": {
"editorSetting": {
"language": "text",
"editOnDblClick": false,
"completionKey": "TAB",
"completionSupport": true
},
"colWidth": 12.0,
"editorMode": "ace/mode/text",
"fontSize": 9.0,
"editorHide": false,
"title": false,
"runOnSelectionChange": true,
"checkEmpty": true,
"results": {},
"enabled": true
},
"settings": {
"params": {},
"forms": {}
},
"results": {
"code": "SUCCESS",
"msg": []
},
"apps": [],
"progressUpdateIntervalMs": 500,
"jobName": "paragraph_1569489641097_1856810430",
"id": "paragraph_1546565092490_1952685806",
"dateCreated": "2019-09-26 17:20:41.097",
"dateStarted": "2019-10-11 10:00:42.031",
"dateFinished": "2019-10-11 10:00:42.052",
"status": "FINISHED"
},
{
"text": "%sh\n\ncd /tmp\nwget https://s3.amazonaws.com/apache-zeppelin/tutorial/bank/bank.csv\n",
"user": "anonymous",
"dateUpdated": "2019-10-08 15:16:45.127",
"config": {
"runOnSelectionChange": true,
"title": false,
"checkEmpty": true,
"colWidth": 12.0,
"fontSize": 9.0,
"enabled": true,
"results": {},
"editorSetting": {
"language": "sh",
"editOnDblClick": false,
"completionKey": "TAB",
"completionSupport": false
},
"editorMode": "ace/mode/sh"
},
"settings": {
"params": {},
"forms": {}
},
"results": {
"code": "SUCCESS",
"msg": [
{
"type": "TEXT",
"data": "--2019-10-08 15:16:46-- https://s3.amazonaws.com/apache-zeppelin/tutorial/bank/bank.csv\nResolving s3.amazonaws.com (s3.amazonaws.com)... 54.231.120.2\nConnecting to s3.amazonaws.com (s3.amazonaws.com)|54.231.120.2|:443... connected.\nHTTP request sent, awaiting response... 200 OK\nLength: 461474 (451K) [application/octet-stream]\nSaving to: \u0027bank.csv\u0027\n\n 0K .......... .......... .......... .......... .......... 11% 91.0K 4s\n 50K .......... .......... .......... .......... .......... 22% 568K 2s\n 100K .......... .......... .......... .......... .......... 33% 372K 2s\n 150K .......... .......... .......... .......... .......... 44% 176K 1s\n 200K .......... .......... .......... .......... .......... 55% 626K 1s\n 250K .......... .......... .......... .......... .......... 66% 326K 1s\n 300K .......... .......... .......... .......... .......... 77% 433K 0s\n 350K .......... .......... .......... .......... .......... 88% 77.0K 0s\n 400K .......... .......... .......... .......... .......... 99% 77.4M 0s\n 450K 100% 13.2K\u003d2.1s\n\n2019-10-08 15:16:50 (219 KB/s) - \u0027bank.csv\u0027 saved [461474/461474]\n\n"
}
]
},
"apps": [],
"progressUpdateIntervalMs": 500,
"jobName": "paragraph_1569489915993_-1377803690",
"id": "paragraph_1569489915993_-1377803690",
"dateCreated": "2019-09-26 17:25:15.993",
"dateStarted": "2019-10-08 15:16:45.132",
"dateFinished": "2019-10-08 15:16:50.387",
"status": "FINISHED"
},
{
"title": "Load Bank Data",
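"text": "%flink\n\n// Read the CSV file downloaded to /tmp as a DataSet of lines\nval bankText \u003d benv.readTextFile(\"/tmp/bank.csv\")\n// Split each line on semicolons, drop the header row, strip the quotes and keep the five fields registered below\nval bank \u003d bankText.map(s \u003d\u003e s.split(\";\")).filter(s \u003d\u003e s(0) !\u003d \"\\\"age\\\"\").map(\n s \u003d\u003e (s(0).toInt,\n s(1).replaceAll(\"\\\"\", \"\"),\n s(2).replaceAll(\"\\\"\", \"\"),\n s(3).replaceAll(\"\\\"\", \"\"),\n s(5).replaceAll(\"\\\"\", \"\").toInt\n )\n )\n\n// Register the DataSet as table bank so that the %flink.bsql paragraphs can query it\nbtenv.registerDataSet(\"bank\", bank, \u0027age, \u0027job, \u0027marital, \u0027education, \u0027balance)\n\n\n",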
"user": "anonymous",
"dateUpdated": "2019-10-11 10:01:44.897",
"config": {
"editorSetting": {
"language": "scala",
"editOnDblClick": false,
"completionKey": "TAB",
"completionSupport": true
},
"colWidth": 12.0,
"editorMode": "ace/mode/scala",
"fontSize": 9.0,
"title": false,
"runOnSelectionChange": true,
"checkEmpty": true,
"results": {
"0": {
"graph": {
"mode": "table",
"height": 111.0,
"optionOpen": false
}
}
},
"enabled": true
},
"settings": {
"params": {},
"forms": {}
},
"results": {
"code": "SUCCESS",
"msg": [
{
"type": "TEXT",
"data": "\u001b[1m\u001b[34mbankText\u001b[0m: \u001b[1m\u001b[32morg.apache.flink.api.scala.DataSet[String]\u001b[0m \u003d org.apache.flink.api.scala.DataSet@683ccfd\n\u001b[1m\u001b[34mbank\u001b[0m: \u001b[1m\u001b[32morg.apache.flink.api.scala.DataSet[(Int, String, String, String, Int)]\u001b[0m \u003d org.apache.flink.api.scala.DataSet@35e6c08f\n"
}
]
},
"apps": [],
"progressUpdateIntervalMs": 500,
"jobName": "paragraph_1569489641099_-2030350664",
"id": "paragraph_1546584347815_1533642635",
"dateCreated": "2019-09-26 17:20:41.099",
"dateStarted": "2019-10-11 10:01:44.904",
"dateFinished": "2019-10-11 10:02:12.333",
"status": "FINISHED"
},
{
"text": "%flink.bsql\n\ndescribe bank\n\n\n\n\n\n\n\n\n",
"user": "anonymous",
"dateUpdated": "2019-10-11 10:07:51.800",
"config": {
"editorSetting": {
"language": "sql",
"editOnDblClick": false,
"completionKey": "TAB",
"completionSupport": true
},
"colWidth": 6.0,
"editorMode": "ace/mode/sql",
"fontSize": 9.0,
"runOnSelectionChange": true,
"title": false,
"checkEmpty": true,
"results": {
"0": {
"graph": {
"mode": "table",
"height": 300.0,
"optionOpen": false,
"setting": {
"table": {
"tableGridState": {},
"tableColumnTypeState": {
"names": {
"table": "string"
},
"updated": false
},
"tableOptionSpecHash": "[{\"name\":\"useFilter\",\"valueType\":\"boolean\",\"defaultValue\":false,\"widget\":\"checkbox\",\"description\":\"Enable filter for columns\"},{\"name\":\"showPagination\",\"valueType\":\"boolean\",\"defaultValue\":false,\"widget\":\"checkbox\",\"description\":\"Enable pagination for better navigation\"},{\"name\":\"showAggregationFooter\",\"valueType\":\"boolean\",\"defaultValue\":false,\"widget\":\"checkbox\",\"description\":\"Enable a footer for displaying aggregated values\"}]",
"tableOptionValue": {
"useFilter": false,
"showPagination": false,
"showAggregationFooter": false
},
"updated": false,
"initialized": false
}
},
"commonSetting": {}
}
}
},
"enabled": true
},
"settings": {
"params": {},
"forms": {}
},
"results": {
"code": "SUCCESS",
"msg": [
{
"type": "TEXT",
"data": "Column\tType\nOptional[age]\tOptional[INT]\nOptional[job]\tOptional[STRING]\nOptional[marital]\tOptional[STRING]\nOptional[education]\tOptional[STRING]\nOptional[balance]\tOptional[INT]\n"
}
]
},
"apps": [],
"progressUpdateIntervalMs": 500,
"jobName": "paragraph_1569491358206_1385745589",
"id": "paragraph_1569491358206_1385745589",
"dateCreated": "2019-09-26 17:49:18.206",
"dateStarted": "2019-10-11 10:07:51.808",
"dateFinished": "2019-10-11 10:07:52.076",
"status": "FINISHED"
},
{
"text": "%flink.bsql\n\nselect age, count(1) as v\nfrom bank \nwhere age \u003c 30 \ngroup by age \norder by age",
"user": "anonymous",
"dateUpdated": "2019-10-11 10:07:53.645",
"config": {
"editorSetting": {
"language": "sql",
"editOnDblClick": false,
"completionKey": "TAB",
"completionSupport": true
},
"colWidth": 6.0,
"editorMode": "ace/mode/sql",
"fontSize": 9.0,
"runOnSelectionChange": true,
"title": false,
"checkEmpty": true,
"results": {
"0": {
"graph": {
"mode": "table",
"height": 300.0,
"optionOpen": false,
"setting": {
"table": {
"tableGridState": {},
"tableColumnTypeState": {
"names": {
"age": "string",
"v": "string"
},
"updated": false
},
"tableOptionSpecHash": "[{\"name\":\"useFilter\",\"valueType\":\"boolean\",\"defaultValue\":false,\"widget\":\"checkbox\",\"description\":\"Enable filter for columns\"},{\"name\":\"showPagination\",\"valueType\":\"boolean\",\"defaultValue\":false,\"widget\":\"checkbox\",\"description\":\"Enable pagination for better navigation\"},{\"name\":\"showAggregationFooter\",\"valueType\":\"boolean\",\"defaultValue\":false,\"widget\":\"checkbox\",\"description\":\"Enable a footer for displaying aggregated values\"}]",
"tableOptionValue": {
"useFilter": false,
"showPagination": false,
"showAggregationFooter": false
},
"updated": false,
"initialized": false
}
},
"commonSetting": {}
}
}
},
"enabled": true
},
"settings": {
"params": {},
"forms": {}
},
"results": {
"code": "SUCCESS",
"msg": [
{
"type": "TABLE",
"data": "age\tv\n19\t4\n20\t3\n21\t7\n22\t9\n23\t20\n24\t24\n25\t44\n26\t77\n27\t94\n28\t103\n29\t97\n"
},
{
"type": "TEXT",
"data": ""
}
]
},
"apps": [],
"progressUpdateIntervalMs": 500,
"jobName": "paragraph_1569491511585_-677666348",
"id": "paragraph_1569491511585_-677666348",
"dateCreated": "2019-09-26 17:51:51.585",
"dateStarted": "2019-10-11 10:07:53.654",
"dateFinished": "2019-10-11 10:08:10.439",
"status": "FINISHED"
},
{
"text": "%flink.bsql\n\nselect age, count(1) as v \nfrom bank \nwhere age \u003c ${maxAge\u003d30} \ngroup by age \norder by age",
"user": "anonymous",
"dateUpdated": "2019-10-11 10:08:13.910",
"config": {
"editorSetting": {
"language": "sql",
"editOnDblClick": false,
"completionKey": "TAB",
"completionSupport": true
},
"colWidth": 6.0,
"editorMode": "ace/mode/sql",
"fontSize": 9.0,
"runOnSelectionChange": true,
"title": false,
"checkEmpty": true,
"results": {
"0": {
"graph": {
"mode": "pieChart",
"height": 300.0,
"optionOpen": false,
"setting": {
"table": {
"tableGridState": {},
"tableColumnTypeState": {
"names": {
"age": "string",
"v": "string"
},
"updated": false
},
"tableOptionSpecHash": "[{\"name\":\"useFilter\",\"valueType\":\"boolean\",\"defaultValue\":false,\"widget\":\"checkbox\",\"description\":\"Enable filter for columns\"},{\"name\":\"showPagination\",\"valueType\":\"boolean\",\"defaultValue\":false,\"widget\":\"checkbox\",\"description\":\"Enable pagination for better navigation\"},{\"name\":\"showAggregationFooter\",\"valueType\":\"boolean\",\"defaultValue\":false,\"widget\":\"checkbox\",\"description\":\"Enable a footer for displaying aggregated values\"}]",
"tableOptionValue": {
"useFilter": false,
"showPagination": false,
"showAggregationFooter": false
},
"updated": false,
"initialized": false
},
"multiBarChart": {
"rotate": {
"degree": "-45"
},
"xLabelStatus": "default"
}
},
"commonSetting": {},
"keys": [
{
"name": "age",
"index": 0.0,
"aggr": "sum"
}
],
"groups": [],
"values": [
{
"name": "v",
"index": 1.0,
"aggr": "sum"
}
]
},
"helium": {}
}
},
"enabled": true
},
"settings": {
"params": {
"maxAge": "40"
},
"forms": {
"maxAge": {
"type": "TextBox",
"name": "maxAge",
"defaultValue": "30",
"hidden": false
}
}
},
"results": {
"code": "SUCCESS",
"msg": [
{
"type": "TABLE",
"data": "age\tv\n19\t4\n20\t3\n21\t7\n22\t9\n23\t20\n24\t24\n25\t44\n26\t77\n27\t94\n28\t103\n29\t97\n30\t150\n31\t199\n32\t224\n33\t186\n34\t231\n35\t180\n36\t188\n37\t161\n38\t159\n39\t130\n"
},
{
"type": "TEXT",
"data": ""
}
]
},
"apps": [],
"progressUpdateIntervalMs": 500,
"jobName": "paragraph_1569489641100_115468969",
"id": "paragraph_1546592465802_-1181051373",
"dateCreated": "2019-09-26 17:20:41.100",
"dateStarted": "2019-10-11 10:08:13.923",
"dateFinished": "2019-10-11 10:08:14.876",
"status": "FINISHED"
},
{
"text": "%flink.bsql\n\nselect age, count(1) as v \nfrom bank \nwhere marital\u003d\u0027${marital\u003dsingle,single|divorced|married}\u0027 \ngroup by age \norder by age",
"user": "anonymous",
"dateUpdated": "2019-10-11 10:37:59.493",
"config": {
"editorSetting": {
"language": "sql",
"editOnDblClick": false,
"completionKey": "TAB",
"completionSupport": true
},
"colWidth": 6.0,
"editorMode": "ace/mode/sql",
"fontSize": 9.0,
"runOnSelectionChange": true,
"title": false,
"checkEmpty": true,
"results": {
"0": {
"graph": {
"mode": "multiBarChart",
"height": 300.0,
"optionOpen": false,
"setting": {
"table": {
"tableGridState": {},
"tableColumnTypeState": {
"names": {
"age": "string",
"v": "string"
},
"updated": false
},
"tableOptionSpecHash": "[{\"name\":\"useFilter\",\"valueType\":\"boolean\",\"defaultValue\":false,\"widget\":\"checkbox\",\"description\":\"Enable filter for columns\"},{\"name\":\"showPagination\",\"valueType\":\"boolean\",\"defaultValue\":false,\"widget\":\"checkbox\",\"description\":\"Enable pagination for better navigation\"},{\"name\":\"showAggregationFooter\",\"valueType\":\"boolean\",\"defaultValue\":false,\"widget\":\"checkbox\",\"description\":\"Enable a footer for displaying aggregated values\"}]",
"tableOptionValue": {
"useFilter": false,
"showPagination": false,
"showAggregationFooter": false
},
"updated": false,
"initialized": false
},
"multiBarChart": {
"rotate": {
"degree": "-45"
},
"xLabelStatus": "default"
},
"lineChart": {
"rotate": {
"degree": "-45"
},
"xLabelStatus": "default"
},
"stackedAreaChart": {
"rotate": {
"degree": "-45"
},
"xLabelStatus": "default"
}
},
"commonSetting": {},
"keys": [
{
"name": "age",
"index": 0.0,
"aggr": "sum"
}
],
"groups": [],
"values": [
{
"name": "v",
"index": 1.0,
"aggr": "sum"
}
]
},
"helium": {}
}
},
"enabled": true
},
"settings": {
"params": {
"marital": "married"
},
"forms": {
"marital": {
"type": "Select",
"options": [
{
"value": "single"
},
{
"value": "divorced"
},
{
"value": "married"
}
],
"name": "marital",
"defaultValue": "single",
"hidden": false
}
}
},
"results": {
"code": "SUCCESS",
"msg": [
{
"type": "TABLE",
"data": "age\tv\n23\t3\n24\t11\n25\t11\n26\t18\n27\t26\n28\t23\n29\t37\n30\t56\n31\t104\n32\t105\n33\t103\n34\t142\n35\t109\n36\t117\n37\t100\n38\t99\n39\t88\n40\t105\n41\t97\n42\t91\n43\t79\n44\t68\n45\t76\n46\t82\n47\t78\n48\t91\n49\t87\n50\t74\n51\t63\n52\t66\n53\t75\n54\t56\n55\t68\n56\t50\n57\t78\n58\t67\n59\t56\n60\t36\n61\t15\n62\t5\n63\t7\n64\t6\n65\t4\n66\t7\n67\t5\n68\t1\n69\t5\n70\t5\n71\t5\n72\t4\n73\t6\n74\t2\n75\t3\n76\t1\n77\t5\n78\t2\n79\t3\n80\t6\n81\t1\n83\t2\n86\t1\n87\t1\n"
},
{
"type": "TEXT",
"data": ""
}
]
},
"apps": [],
"progressUpdateIntervalMs": 500,
"jobName": "paragraph_1569489641100_-1622522691",
"id": "paragraph_1546592478596_-1766740165",
"dateCreated": "2019-09-26 17:20:41.100",
"dateStarted": "2019-10-11 10:08:16.423",
"dateFinished": "2019-10-11 10:08:17.295",
"status": "FINISHED"
},
{
"text": "\n",
"user": "anonymous",
"dateUpdated": "2019-09-26 17:54:48.292",
"config": {
"colWidth": 12.0,
"fontSize": 9.0,
"enabled": true,
"results": {},
"editorSetting": {
"language": "scala",
"editOnDblClick": false,
"completionKey": "TAB",
"completionSupport": true
},
"editorMode": "ace/mode/scala"
},
"settings": {
"params": {},
"forms": {}
},
"apps": [],
"progressUpdateIntervalMs": 500,
"jobName": "paragraph_1569489641101_-1182720873",
"id": "paragraph_1553093710610_-1734599499",
"dateCreated": "2019-09-26 17:20:41.101",
"status": "READY"
}
],
"name": "Flink Batch Tutorial",
"id": "2EN1E1ATY",
"defaultInterpreterGroup": "spark",
"version": "0.9.0-SNAPSHOT",
"permissions": {},
"noteParams": {},
"noteForms": {},
"angularObjects": {},
"config": {
"isZeppelinNotebookCronEnable": false
},
"info": {},
"path": "/Flink Batch Tutorial"
}