blob: d4be18ea05041285e8f658594afb65f3d532a3c6 [file] [log] [blame]
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Slack Summaries\n",
"This notebook shows how to ingest Slack messages and generate threads summaries.\n",
"\n",
"NOTE. This notebook uses data generated on a dummy Slack server, so messages, replies, and summaries aren't really insightful.\n",
"\n",
"## Pre-requisites\n",
"The first step is to `dlt init` the Slack pipeline, but this is already done for this repository. All you need to do is configure your Slack application to get a User OAuth Token (see [docs](https://dlthub.com/docs/dlt-ecosystem/verified-sources/slack)). Then, you can it add it to `.dlt/secrets.toml` as follow:\n",
"\n",
"```toml\n",
"[sources.slack]\n",
"access_token = \"xoxb-...\"\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Extract & Load (dlt)\n",
"The \"extract\" and \"load\" steps consist of getting the raw data to destination for further transformations.\n",
"\n",
"### Set up\n",
"Define your dlt `Pipeline` and `Source`"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"slack_pipeline.dataset_name=slack_data_20240404085714\n",
"slack_pipeline.pipeline_name=slack\n"
]
}
],
"source": [
"import dlt\n",
"import slack\n",
"\n",
"slack_pipeline = dlt.pipeline(\n",
" pipeline_name=\"slack\",\n",
" destination='duckdb',\n",
" dataset_name=\"slack_data\",\n",
" full_refresh=True,\n",
")\n",
"\n",
"dlt_source = slack.slack_source(\n",
" selected_channels=[\"general\", \"dlt\"],\n",
" replies=True,\n",
")\n",
"\n",
"print(f\"\"\"{slack_pipeline.dataset_name=:}\n",
"{slack_pipeline.pipeline_name=:}\"\"\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Execution\n",
"Run your dlt pipeline. `load_info` will contain a lot metadata about execution."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Pipeline slack load step completed in 1.05 seconds\n",
"1 load package(s) were loaded to destination duckdb and into dataset slack_data_20240404085714\n",
"The duckdb destination used duckdb:////home/tjean/projects/dagworks/hamilton/examples/dlt/slack.duckdb location to store data\n",
"Load package 1712264235.2934568 is LOADED and contains no failed jobs\n"
]
}
],
"source": [
"load_info = slack_pipeline.run(dlt_source)\n",
"print(load_info)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Transform (Hamilton + Ibis)\n",
"\n",
"Now that raw data is loaded (in DuckDB), we use Hamilton to organize a dataflow of Ibis data transformations. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define\n",
"\n",
"In the next cells, we use Python functions to define a Hamilton dataflow. \n",
"\n",
"By using the Hamilton Jupyter plugin, we can define the dataflow interactively. Adding and removing functions to the cell will change its shape. "
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"%load_ext hamilton.plugins.jupyter_magic\n",
"from IPython.display import display"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<!-- Generated by graphviz version 2.43.0 (0)\n",
" -->\n",
"<!-- Title: %3 Pages: 1 -->\n",
"<svg width=\"2052pt\" height=\"492pt\"\n",
" viewBox=\"0.00 0.00 2052.00 491.50\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n",
"<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 487.5)\">\n",
"<title>%3</title>\n",
"<polygon fill=\"white\" stroke=\"transparent\" points=\"-4,4 -4,-487.5 2048,-487.5 2048,4 -4,4\"/>\n",
"<g id=\"clust1\" class=\"cluster\">\n",
"<title>cluster__legend</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" points=\"43,-217.5 43,-475.5 141,-475.5 141,-217.5 43,-217.5\"/>\n",
"<text text-anchor=\"middle\" x=\"92\" y=\"-460.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">Legend</text>\n",
"</g>\n",
"<!-- threads -->\n",
"<g id=\"node1\" class=\"node\">\n",
"<title>threads</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M1867,-215.5C1867,-215.5 1808,-215.5 1808,-215.5 1802,-215.5 1796,-209.5 1796,-203.5 1796,-203.5 1796,-163.5 1796,-163.5 1796,-157.5 1802,-151.5 1808,-151.5 1808,-151.5 1867,-151.5 1867,-151.5 1873,-151.5 1879,-157.5 1879,-163.5 1879,-163.5 1879,-203.5 1879,-203.5 1879,-209.5 1873,-215.5 1867,-215.5\"/>\n",
"<text text-anchor=\"start\" x=\"1807\" y=\"-194.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">threads</text>\n",
"<text text-anchor=\"start\" x=\"1819\" y=\"-166.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Table</text>\n",
"</g>\n",
"<!-- insert_threads -->\n",
"<g id=\"node3\" class=\"node\">\n",
"<title>insert_threads</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M2032,-215.5C2032,-215.5 1920,-215.5 1920,-215.5 1914,-215.5 1908,-209.5 1908,-203.5 1908,-203.5 1908,-163.5 1908,-163.5 1908,-157.5 1914,-151.5 1920,-151.5 1920,-151.5 2032,-151.5 2032,-151.5 2038,-151.5 2044,-157.5 2044,-163.5 2044,-163.5 2044,-203.5 2044,-203.5 2044,-209.5 2038,-215.5 2032,-215.5\"/>\n",
"<text text-anchor=\"start\" x=\"1919\" y=\"-194.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">insert_threads</text>\n",
"<text text-anchor=\"start\" x=\"1966.5\" y=\"-166.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">int</text>\n",
"</g>\n",
"<!-- threads&#45;&gt;insert_threads -->\n",
"<g id=\"edge4\" class=\"edge\">\n",
"<title>threads&#45;&gt;insert_threads</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M1879.08,-183.5C1885.07,-183.5 1891.39,-183.5 1897.82,-183.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"1897.98,-187 1907.98,-183.5 1897.98,-180 1897.98,-187\"/>\n",
"</g>\n",
"<!-- threads.with_summary -->\n",
"<g id=\"node2\" class=\"node\">\n",
"<title>threads.with_summary</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M1755,-215.5C1755,-215.5 1578,-215.5 1578,-215.5 1572,-215.5 1566,-209.5 1566,-203.5 1566,-203.5 1566,-163.5 1566,-163.5 1566,-157.5 1572,-151.5 1578,-151.5 1578,-151.5 1755,-151.5 1755,-151.5 1761,-151.5 1767,-157.5 1767,-163.5 1767,-163.5 1767,-203.5 1767,-203.5 1767,-209.5 1761,-215.5 1755,-215.5\"/>\n",
"<text text-anchor=\"start\" x=\"1577\" y=\"-194.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">threads.with_summary</text>\n",
"<text text-anchor=\"start\" x=\"1648\" y=\"-166.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Table</text>\n",
"</g>\n",
"<!-- threads.with_summary&#45;&gt;threads -->\n",
"<g id=\"edge1\" class=\"edge\">\n",
"<title>threads.with_summary&#45;&gt;threads</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M1767.27,-183.5C1773.53,-183.5 1779.66,-183.5 1785.54,-183.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"1785.82,-187 1795.82,-183.5 1785.82,-180 1785.82,-187\"/>\n",
"</g>\n",
"<!-- db_con -->\n",
"<g id=\"node4\" class=\"node\">\n",
"<title>db_con</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M332,-217.5C332,-217.5 240,-217.5 240,-217.5 234,-217.5 228,-211.5 228,-205.5 228,-205.5 228,-165.5 228,-165.5 228,-159.5 234,-153.5 240,-153.5 240,-153.5 332,-153.5 332,-153.5 338,-153.5 344,-159.5 344,-165.5 344,-165.5 344,-205.5 344,-205.5 344,-211.5 338,-217.5 332,-217.5\"/>\n",
"<text text-anchor=\"start\" x=\"258.5\" y=\"-196.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">db_con</text>\n",
"<text text-anchor=\"start\" x=\"239\" y=\"-168.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">BaseBackend</text>\n",
"</g>\n",
"<!-- channel_replies -->\n",
"<g id=\"node8\" class=\"node\">\n",
"<title>channel_replies</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M528.5,-216.5C528.5,-216.5 408.5,-216.5 408.5,-216.5 402.5,-216.5 396.5,-210.5 396.5,-204.5 396.5,-204.5 396.5,-164.5 396.5,-164.5 396.5,-158.5 402.5,-152.5 408.5,-152.5 408.5,-152.5 528.5,-152.5 528.5,-152.5 534.5,-152.5 540.5,-158.5 540.5,-164.5 540.5,-164.5 540.5,-204.5 540.5,-204.5 540.5,-210.5 534.5,-216.5 528.5,-216.5\"/>\n",
"<text text-anchor=\"start\" x=\"407.5\" y=\"-195.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">channel_replies</text>\n",
"<text text-anchor=\"start\" x=\"450\" y=\"-167.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Table</text>\n",
"</g>\n",
"<!-- db_con&#45;&gt;channel_replies -->\n",
"<g id=\"edge10\" class=\"edge\">\n",
"<title>db_con&#45;&gt;channel_replies</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M344.11,-185.18C357.49,-185.11 372,-185.03 386.21,-184.95\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"386.51,-188.45 396.49,-184.89 386.47,-181.45 386.51,-188.45\"/>\n",
"</g>\n",
"<!-- channel_message -->\n",
"<g id=\"node10\" class=\"node\">\n",
"<title>channel_message</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M537,-133.5C537,-133.5 400,-133.5 400,-133.5 394,-133.5 388,-127.5 388,-121.5 388,-121.5 388,-81.5 388,-81.5 388,-75.5 394,-69.5 400,-69.5 400,-69.5 537,-69.5 537,-69.5 543,-69.5 549,-75.5 549,-81.5 549,-81.5 549,-121.5 549,-121.5 549,-127.5 543,-133.5 537,-133.5\"/>\n",
"<text text-anchor=\"start\" x=\"399\" y=\"-112.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">channel_message</text>\n",
"<text text-anchor=\"start\" x=\"450\" y=\"-84.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Table</text>\n",
"</g>\n",
"<!-- db_con&#45;&gt;channel_message -->\n",
"<g id=\"edge14\" class=\"edge\">\n",
"<title>db_con&#45;&gt;channel_message</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M344.11,-158.92C358.4,-152.27 373.96,-145.03 389.08,-137.99\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"390.83,-141.04 398.42,-133.65 387.88,-134.69 390.83,-141.04\"/>\n",
"</g>\n",
"<!-- channel_threads -->\n",
"<g id=\"node5\" class=\"node\">\n",
"<title>channel_threads</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M718,-174.5C718,-174.5 590,-174.5 590,-174.5 584,-174.5 578,-168.5 578,-162.5 578,-162.5 578,-122.5 578,-122.5 578,-116.5 584,-110.5 590,-110.5 590,-110.5 718,-110.5 718,-110.5 724,-110.5 730,-116.5 730,-122.5 730,-122.5 730,-162.5 730,-162.5 730,-168.5 724,-174.5 718,-174.5\"/>\n",
"<text text-anchor=\"start\" x=\"589\" y=\"-153.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">channel_threads</text>\n",
"<text text-anchor=\"start\" x=\"635.5\" y=\"-125.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Table</text>\n",
"</g>\n",
"<!-- channels_collection -->\n",
"<g id=\"node9\" class=\"node\">\n",
"<title>channels_collection</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"#ea5556\" d=\"M926,-174.5C926,-174.5 775,-174.5 775,-174.5 769,-174.5 763,-168.5 763,-162.5 763,-162.5 763,-122.5 763,-122.5 763,-116.5 769,-110.5 775,-110.5 775,-110.5 926,-110.5 926,-110.5 932,-110.5 938,-116.5 938,-122.5 938,-122.5 938,-162.5 938,-162.5 938,-168.5 932,-174.5 926,-174.5\"/>\n",
"<path fill=\"none\" stroke=\"#ea5556\" d=\"M930,-178.5C930,-178.5 771,-178.5 771,-178.5 765,-178.5 759,-172.5 759,-166.5 759,-166.5 759,-118.5 759,-118.5 759,-112.5 765,-106.5 771,-106.5 771,-106.5 930,-106.5 930,-106.5 936,-106.5 942,-112.5 942,-118.5 942,-118.5 942,-166.5 942,-166.5 942,-172.5 936,-178.5 930,-178.5\"/>\n",
"<text text-anchor=\"start\" x=\"774\" y=\"-153.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">channels_collection</text>\n",
"<text text-anchor=\"start\" x=\"832\" y=\"-125.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Table</text>\n",
"</g>\n",
"<!-- channel_threads&#45;&gt;channels_collection -->\n",
"<g id=\"edge12\" class=\"edge\">\n",
"<title>channel_threads&#45;&gt;channels_collection</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M740.44,-142.5C743.21,-142.5 746,-142.5 748.8,-142.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"740.2,-142.5 730.2,-138 735.2,-142.5 730.2,-142.5 730.2,-142.5 730.2,-142.5 735.2,-142.5 730.2,-147 740.2,-142.5 740.2,-142.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"748.91,-146 758.91,-142.5 748.91,-139 748.91,-146\"/>\n",
"</g>\n",
"<!-- summary_prompt -->\n",
"<g id=\"node6\" class=\"node\">\n",
"<title>summary_prompt</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M1469,-256.5C1469,-256.5 1335,-256.5 1335,-256.5 1329,-256.5 1323,-250.5 1323,-244.5 1323,-244.5 1323,-204.5 1323,-204.5 1323,-198.5 1329,-192.5 1335,-192.5 1335,-192.5 1469,-192.5 1469,-192.5 1475,-192.5 1481,-198.5 1481,-204.5 1481,-204.5 1481,-244.5 1481,-244.5 1481,-250.5 1475,-256.5 1469,-256.5\"/>\n",
"<text text-anchor=\"start\" x=\"1334\" y=\"-235.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">summary_prompt</text>\n",
"<text text-anchor=\"start\" x=\"1392.5\" y=\"-207.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">str</text>\n",
"</g>\n",
"<!-- summary_prompt&#45;&gt;threads.with_summary -->\n",
"<g id=\"edge3\" class=\"edge\">\n",
"<title>summary_prompt&#45;&gt;threads.with_summary</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M1481.05,-212.31C1504.53,-208.64 1530.75,-204.55 1555.87,-200.62\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"1556.48,-204.07 1565.82,-199.07 1555.4,-197.15 1556.48,-204.07\"/>\n",
"</g>\n",
"<!-- threads.with_format_messages -->\n",
"<g id=\"node7\" class=\"node\">\n",
"<title>threads.with_format_messages</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M1226,-174.5C1226,-174.5 983,-174.5 983,-174.5 977,-174.5 971,-168.5 971,-162.5 971,-162.5 971,-122.5 971,-122.5 971,-116.5 977,-110.5 983,-110.5 983,-110.5 1226,-110.5 1226,-110.5 1232,-110.5 1238,-116.5 1238,-122.5 1238,-122.5 1238,-162.5 1238,-162.5 1238,-168.5 1232,-174.5 1226,-174.5\"/>\n",
"<text text-anchor=\"start\" x=\"982\" y=\"-153.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">threads.with_format_messages</text>\n",
"<text text-anchor=\"start\" x=\"1086\" y=\"-125.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Table</text>\n",
"</g>\n",
"<!-- threads.with_aggregate_thread -->\n",
"<g id=\"node11\" class=\"node\">\n",
"<title>threads.with_aggregate_thread</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M1525,-174.5C1525,-174.5 1279,-174.5 1279,-174.5 1273,-174.5 1267,-168.5 1267,-162.5 1267,-162.5 1267,-122.5 1267,-122.5 1267,-116.5 1273,-110.5 1279,-110.5 1279,-110.5 1525,-110.5 1525,-110.5 1531,-110.5 1537,-116.5 1537,-122.5 1537,-122.5 1537,-162.5 1537,-162.5 1537,-168.5 1531,-174.5 1525,-174.5\"/>\n",
"<text text-anchor=\"start\" x=\"1278\" y=\"-153.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">threads.with_aggregate_thread</text>\n",
"<text text-anchor=\"start\" x=\"1383.5\" y=\"-125.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Table</text>\n",
"</g>\n",
"<!-- threads.with_format_messages&#45;&gt;threads.with_aggregate_thread -->\n",
"<g id=\"edge16\" class=\"edge\">\n",
"<title>threads.with_format_messages&#45;&gt;threads.with_aggregate_thread</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M1238.12,-142.5C1244.22,-142.5 1250.35,-142.5 1256.47,-142.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"1256.76,-146 1266.76,-142.5 1256.76,-139 1256.76,-146\"/>\n",
"</g>\n",
"<!-- channel_replies&#45;&gt;channel_threads -->\n",
"<g id=\"edge7\" class=\"edge\">\n",
"<title>channel_replies&#45;&gt;channel_threads</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M540.71,-168.2C549.66,-166.15 558.89,-164.04 568.02,-161.95\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"568.92,-165.34 577.89,-159.69 567.36,-158.51 568.92,-165.34\"/>\n",
"</g>\n",
"<!-- channels_collection&#45;&gt;threads.with_format_messages -->\n",
"<g id=\"edge8\" class=\"edge\">\n",
"<title>channels_collection&#45;&gt;threads.with_format_messages</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M942.12,-142.5C948.09,-142.5 954.17,-142.5 960.31,-142.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"960.68,-146 970.68,-142.5 960.68,-139 960.68,-146\"/>\n",
"</g>\n",
"<!-- channel_message&#45;&gt;channel_threads -->\n",
"<g id=\"edge6\" class=\"edge\">\n",
"<title>channel_message&#45;&gt;channel_threads</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M549.19,-119.3C555.41,-120.69 561.7,-122.1 567.96,-123.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"567.21,-126.92 577.73,-125.68 568.74,-120.09 567.21,-126.92\"/>\n",
"</g>\n",
"<!-- threads.with_aggregate_thread&#45;&gt;threads.with_summary -->\n",
"<g id=\"edge2\" class=\"edge\">\n",
"<title>threads.with_aggregate_thread&#45;&gt;threads.with_summary</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M1537.33,-163.48C1543.47,-164.44 1549.6,-165.4 1555.67,-166.35\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"1555.4,-169.85 1565.82,-167.93 1556.48,-162.93 1555.4,-169.85\"/>\n",
"</g>\n",
"<!-- channel -->\n",
"<g id=\"node12\" class=\"node\">\n",
"<title>channel</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"#56e39f\" d=\"M330.5,-131.5C330.5,-131.5 241.5,-131.5 241.5,-131.5 235.5,-131.5 229.5,-125.5 229.5,-119.5 229.5,-119.5 229.5,-79.5 229.5,-79.5 229.5,-73.5 235.5,-67.5 241.5,-67.5 241.5,-67.5 330.5,-67.5 330.5,-67.5 336.5,-67.5 342.5,-73.5 342.5,-79.5 342.5,-79.5 342.5,-119.5 342.5,-119.5 342.5,-125.5 336.5,-131.5 330.5,-131.5\"/>\n",
"<path fill=\"none\" stroke=\"#56e39f\" d=\"M334.5,-135.5C334.5,-135.5 237.5,-135.5 237.5,-135.5 231.5,-135.5 225.5,-129.5 225.5,-123.5 225.5,-123.5 225.5,-75.5 225.5,-75.5 225.5,-69.5 231.5,-63.5 237.5,-63.5 237.5,-63.5 334.5,-63.5 334.5,-63.5 340.5,-63.5 346.5,-69.5 346.5,-75.5 346.5,-75.5 346.5,-123.5 346.5,-123.5 346.5,-129.5 340.5,-135.5 334.5,-135.5\"/>\n",
"<text text-anchor=\"start\" x=\"255\" y=\"-110.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">channel</text>\n",
"<text text-anchor=\"start\" x=\"240.5\" y=\"-82.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Parallelizable</text>\n",
"</g>\n",
"<!-- channel&#45;&gt;channel_replies -->\n",
"<g id=\"edge9\" class=\"edge\">\n",
"<title>channel&#45;&gt;channel_replies</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M346.6,-127.57C360.63,-134.18 375.79,-141.31 390.46,-148.22\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"390.48,-148.23 397.61,-156.56 395,-150.36 399.52,-152.49 399.52,-152.49 399.52,-152.49 395,-150.36 401.44,-148.42 390.48,-148.23 390.48,-148.23\"/>\n",
"</g>\n",
"<!-- channel&#45;&gt;channel_message -->\n",
"<g id=\"edge13\" class=\"edge\">\n",
"<title>channel&#45;&gt;channel_message</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M346.6,-100.16C356.64,-100.27 367.25,-100.39 377.85,-100.51\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"378,-100.51 387.95,-105.12 383,-100.56 388,-100.62 388,-100.62 388,-100.62 383,-100.56 388.05,-96.12 378,-100.51 378,-100.51\"/>\n",
"</g>\n",
"<!-- _db_con_inputs -->\n",
"<g id=\"node13\" class=\"node\">\n",
"<title>_db_con_inputs</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"165,-208 19,-208 19,-163 165,-163 165,-208\"/>\n",
"<text text-anchor=\"start\" x=\"34\" y=\"-181.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">pipeline</text>\n",
"<text text-anchor=\"start\" x=\"95\" y=\"-181.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">Pipeline</text>\n",
"</g>\n",
"<!-- _db_con_inputs&#45;&gt;db_con -->\n",
"<g id=\"edge5\" class=\"edge\">\n",
"<title>_db_con_inputs&#45;&gt;db_con</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M165.04,-185.5C182.22,-185.5 200.5,-185.5 217.47,-185.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"217.85,-189 227.85,-185.5 217.85,-182 217.85,-189\"/>\n",
"</g>\n",
"<!-- _channel_replies_inputs -->\n",
"<g id=\"node14\" class=\"node\">\n",
"<title>_channel_replies_inputs</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"359,-281 213,-281 213,-236 359,-236 359,-281\"/>\n",
"<text text-anchor=\"start\" x=\"228\" y=\"-254.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">pipeline</text>\n",
"<text text-anchor=\"start\" x=\"289\" y=\"-254.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">Pipeline</text>\n",
"</g>\n",
"<!-- _channel_replies_inputs&#45;&gt;channel_replies -->\n",
"<g id=\"edge11\" class=\"edge\">\n",
"<title>_channel_replies_inputs&#45;&gt;channel_replies</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M342.14,-235.9C356.3,-230.09 371.84,-223.72 387,-217.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"388.45,-220.69 396.38,-213.66 385.8,-214.21 388.45,-220.69\"/>\n",
"</g>\n",
"<!-- _channel_message_inputs -->\n",
"<g id=\"node15\" class=\"node\">\n",
"<title>_channel_message_inputs</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"359,-45 213,-45 213,0 359,0 359,-45\"/>\n",
"<text text-anchor=\"start\" x=\"228\" y=\"-18.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">pipeline</text>\n",
"<text text-anchor=\"start\" x=\"289\" y=\"-18.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">Pipeline</text>\n",
"</g>\n",
"<!-- _channel_message_inputs&#45;&gt;channel_message -->\n",
"<g id=\"edge15\" class=\"edge\">\n",
"<title>_channel_message_inputs&#45;&gt;channel_message</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M338.72,-45.14C353.18,-51.47 369.26,-58.5 385.02,-65.4\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"383.68,-68.64 394.25,-69.44 386.49,-62.22 383.68,-68.64\"/>\n",
"</g>\n",
"<!-- _channel_inputs -->\n",
"<g id=\"node16\" class=\"node\">\n",
"<title>_channel_inputs</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"184,-122 0,-122 0,-77 184,-77 184,-122\"/>\n",
"<text text-anchor=\"start\" x=\"15\" y=\"-95.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">selected_channels</text>\n",
"<text text-anchor=\"start\" x=\"148\" y=\"-95.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">list</text>\n",
"</g>\n",
"<!-- _channel_inputs&#45;&gt;channel -->\n",
"<g id=\"edge17\" class=\"edge\">\n",
"<title>_channel_inputs&#45;&gt;channel</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M184.22,-99.5C194.56,-99.5 204.98,-99.5 214.98,-99.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"215.25,-103 225.25,-99.5 215.24,-96 215.25,-103\"/>\n",
"</g>\n",
"<!-- input -->\n",
"<g id=\"node17\" class=\"node\">\n",
"<title>input</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"121.5,-444 62.5,-444 62.5,-407 121.5,-407 121.5,-444\"/>\n",
"<text text-anchor=\"middle\" x=\"92\" y=\"-421.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">input</text>\n",
"</g>\n",
"<!-- function -->\n",
"<g id=\"node18\" class=\"node\">\n",
"<title>function</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M120,-389C120,-389 64,-389 64,-389 58,-389 52,-383 52,-377 52,-377 52,-364 52,-364 52,-358 58,-352 64,-352 64,-352 120,-352 120,-352 126,-352 132,-358 132,-364 132,-364 132,-377 132,-377 132,-383 126,-389 120,-389\"/>\n",
"<text text-anchor=\"middle\" x=\"92\" y=\"-366.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">function</text>\n",
"</g>\n",
"<!-- expand -->\n",
"<g id=\"node19\" class=\"node\">\n",
"<title>expand</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"#56e39f\" d=\"M117,-330C117,-330 67,-330 67,-330 61,-330 55,-324 55,-318 55,-318 55,-305 55,-305 55,-299 61,-293 67,-293 67,-293 117,-293 117,-293 123,-293 129,-299 129,-305 129,-305 129,-318 129,-318 129,-324 123,-330 117,-330\"/>\n",
"<path fill=\"none\" stroke=\"#56e39f\" d=\"M121,-334C121,-334 63,-334 63,-334 57,-334 51,-328 51,-322 51,-322 51,-301 51,-301 51,-295 57,-289 63,-289 63,-289 121,-289 121,-289 127,-289 133,-295 133,-301 133,-301 133,-322 133,-322 133,-328 127,-334 121,-334\"/>\n",
"<text text-anchor=\"middle\" x=\"92\" y=\"-307.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">expand</text>\n",
"</g>\n",
"<!-- collect -->\n",
"<g id=\"node20\" class=\"node\">\n",
"<title>collect</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"#ea5556\" d=\"M113.5,-267C113.5,-267 70.5,-267 70.5,-267 64.5,-267 58.5,-261 58.5,-255 58.5,-255 58.5,-242 58.5,-242 58.5,-236 64.5,-230 70.5,-230 70.5,-230 113.5,-230 113.5,-230 119.5,-230 125.5,-236 125.5,-242 125.5,-242 125.5,-255 125.5,-255 125.5,-261 119.5,-267 113.5,-267\"/>\n",
"<path fill=\"none\" stroke=\"#ea5556\" d=\"M117.5,-271C117.5,-271 66.5,-271 66.5,-271 60.5,-271 54.5,-265 54.5,-259 54.5,-259 54.5,-238 54.5,-238 54.5,-232 60.5,-226 66.5,-226 66.5,-226 117.5,-226 117.5,-226 123.5,-226 129.5,-232 129.5,-238 129.5,-238 129.5,-259 129.5,-259 129.5,-265 123.5,-271 117.5,-271\"/>\n",
"<text text-anchor=\"middle\" x=\"92\" y=\"-244.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">collect</text>\n",
"</g>\n",
"</g>\n",
"</svg>\n"
],
"text/plain": [
"<graphviz.graphs.Digraph at 0x7f2f45127c10>"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%cell_to_module -m jupyter_transform -d\n",
"\n",
"import textwrap\n",
"\n",
"import dlt\n",
"import ibis\n",
"import ibis.expr.types as ir\n",
"import openai\n",
"from hamilton.function_modifiers import pipe, source, step\n",
"from hamilton.htypes import Parallelizable, Collect\n",
"\n",
"\n",
"def db_con(pipeline: dlt.Pipeline) -> ibis.BaseBackend:\n",
" backend = ibis.connect(f\"{pipeline.pipeline_name}.duckdb\")\n",
" ibis.set_backend(backend)\n",
" return backend\n",
"\n",
"\n",
"def channel(selected_channels: list[str]) -> Parallelizable[str]:\n",
" for channel in selected_channels:\n",
" yield channel\n",
"\n",
"\n",
"def _epoch_microseconds(timestamp: ir.Column) -> ir.Column:\n",
" seconds_from_epoch = timestamp.epoch_seconds()\n",
" microseconds = timestamp.microsecond() / int(10e5)\n",
" return seconds_from_epoch + microseconds\n",
" \n",
"\n",
"def channel_message(\n",
" channel: str,\n",
" db_con: ibis.BaseBackend, \n",
" pipeline: dlt.Pipeline,\n",
") -> ir.Table:\n",
" \"\"\"Load table containing parent messages of a channel.\n",
" the timestamps `thread_ts` and `ts` are converted to strings.\n",
" `thread_ts` is not None if the message has replies / started a thread. Otherwise,\n",
" `thread_ts` == `ts`. Coalesce is used to fill these None values with `ts`\n",
"\n",
" Slack reference: https://api.slack.com/messaging/retrieving#finding_threads\n",
" \"\"\"\n",
" return (\n",
" db_con.table(\n",
" f\"{channel}_message\",\n",
" schema=pipeline.dataset_name,\n",
" database=pipeline.pipeline_name,\n",
" )\n",
" .mutate(\n",
" thread_ts=_epoch_microseconds(ibis._.thread_ts).cast(str),\n",
" ts=_epoch_microseconds(ibis._.ts).cast(str),\n",
" )\n",
" .mutate(thread_ts=ibis.coalesce(ibis._.thread_ts, ibis._.ts))\n",
" )\n",
"\n",
"\n",
"def channel_replies(\n",
" channel: str,\n",
" db_con: ibis.BaseBackend, \n",
" pipeline: dlt.Pipeline,\n",
") -> ir.Table:\n",
" \"\"\"Create table for replies\"\"\"\n",
" return db_con.table(\n",
" f\"{channel}_replies_message\",\n",
" schema=pipeline.dataset_name,\n",
" database=pipeline.pipeline_name,\n",
" )\n",
" \n",
"\n",
"def channel_threads(channel_message: ir.Table, channel_replies: ir.Table) -> ir.Table:\n",
" \"\"\"Union of parent messages and replies. Sort by thread start, then message timestamp\"\"\"\n",
" columns = [\"channel\", \"thread_ts\", \"ts\", \"user\", \"text\", \"_dlt_load_id\", \"_dlt_id\"]\n",
" return (\n",
" ibis.union(\n",
" channel_message.select(columns),\n",
" channel_replies.select(columns),\n",
" )\n",
" .order_by([ibis._.thread_ts, ibis._.ts])\n",
" )\n",
"\n",
"\n",
"def channels_collection(channel_threads: Collect[ir.Table]) -> ir.Table:\n",
" \"\"\"Collect `channel_threads` for all channels\"\"\"\n",
" return ibis.union(*channel_threads)\n",
"\n",
"\n",
"def _format_messages(threads: ir.Table) -> ir.Table:\n",
" \"\"\"Assign a user id per thread and prefix messages with it\"\"\"\n",
" thread_user_id_expr = (ibis.dense_rank().over(order_by=\"user\") + 1).cast(str)\n",
" return threads.group_by(\"thread_ts\").mutate(\n",
" message=thread_user_id_expr.concat(\": \", ibis._.text)\n",
" )\n",
"\n",
"\n",
"def _aggregate_thread(threads: ir.Table) -> ir.Table:\n",
" \"\"\"Create threads as a single string by concatenating messages\n",
"\n",
" Functions decorates with `@ibis.udf` are loaded by the Ibis backend.\n",
" They aren't meant to be called directly.\n",
" ref: https://ibis-project.org/how-to/extending/builtin\n",
" \"\"\"\n",
"\n",
" @ibis.udf.agg.builtin(name=\"string_agg\")\n",
" def _string_agg(arg, sep: str = \"\\n \") -> str:\n",
" raise NotImplementedError\n",
"\n",
" @ibis.udf.agg.builtin(name=\"array_agg\")\n",
" def _array_agg(arg) -> list[str]:\n",
" raise NotImplementedError\n",
"\n",
" return threads.group_by(\"thread_ts\").agg(\n",
" thread=_string_agg(ibis._.message),\n",
" num_messages=ibis._.count(),\n",
" users=_array_agg(ibis._.user).unique(),\n",
" _dlt_load_id=ibis._._dlt_load_id.max(),\n",
" _dlt_id=_array_agg(ibis._._dlt_id),\n",
" )\n",
"\n",
"\n",
"def summary_prompt() -> str:\n",
" \"\"\"LLM prompt to summarize Slack thread\"\"\"\n",
" return textwrap.dedent(\n",
" \"\"\"Hamilton is an open source library to write dataflows in Python. It is used by developers for data engineering, data science, machine learning, and LLM workflows.\n",
" Next is a discussion thread about Hamilton started by User1. Complete these tasks: identify the issue raised by User1, summarize the discussion, indicate if you think the issue was resolved.\n",
"\n",
" DISCUSSION THREAD\n",
" {text}\n",
" \"\"\"\n",
" )\n",
"\n",
"def _summary(threads: ir.Table, prompt: str) -> ir.Table:\n",
" \"\"\"Generate a summary for each thread.\n",
" Uses a scalar Python UDF executed by the backend.\n",
" \"\"\"\n",
" # Ibis requires `str` type hint even if None is allowed\n",
" @ibis.udf.scalar.python\n",
" def _openai_completion_udf(text: str, prompt_template: str) -> str:\n",
" \"\"\"Fill `prompt` with `text` and use OpenAI chat completion.\n",
" Returns None if:\n",
" - `text` is empty\n",
" - `content` is too long\n",
" - OpenAI call fails\n",
" \"\"\"\n",
" if len(text) == 0:\n",
" return None\n",
"\n",
" content = prompt_template.format(text=text)\n",
"\n",
" if len(content) // 4 > 8191:\n",
" return None\n",
"\n",
" client = openai.OpenAI()\n",
"\n",
" response = client.chat.completions.create(\n",
" model=\"gpt-3.5-turbo\",\n",
" messages=[{\"role\": \"user\", \"content\": content}],\n",
" )\n",
" try:\n",
" output = response.choices[0].message.content\n",
" except Exception:\n",
" output = None\n",
"\n",
" return output\n",
" \n",
" return threads.mutate(summary=_openai_completion_udf(threads.thread, prompt))\n",
"\n",
"\n",
"# @pipe operator facilitates managing the function/node namespace\n",
"@pipe(\n",
" step(_format_messages), step(_aggregate_thread), step(_summary, prompt=source(\"summary_prompt\"))\n",
")\n",
"def threads(channels_collection: ir.Table) -> ir.Table:\n",
" \"\"\"Create `threads` table by formatting, aggregating messages,\n",
" and generating summaries.\n",
" \"\"\"\n",
" return channels_collection\n",
"\n",
"\n",
"def insert_threads(threads: ir.Table) -> int:\n",
" \"\"\"Save `threads` table and return row count.\"\"\"\n",
" db_con = ibis.get_backend()\n",
" threads_table = db_con.create_table(\"threads\", threads)\n",
" db_con.insert(\"threads\", threads)\n",
" return int(threads_table.count().execute())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Execute\n",
"Create a Hamilton Driver with the defined dataflow module and request nodes to compute. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Driver definition\n",
"\n",
"We pass the module `jupyter_transform` that we defined in this notebook, but it could be modules stored in `.py` files. The `.enable_dynamic_execution()` statement is required because we're using Hamilton's `Parallelizable/Collect` feature."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"from hamilton import driver\n",
"import jupyter_transform\n",
"\n",
"dr = (\n",
" driver.Builder()\n",
" .enable_dynamic_execution(allow_experimental_mode=True)\n",
" .with_modules(jupyter_transform)\n",
" .build()\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Driver execution\n",
"We create a dictionary to hold the inputs `pipeline` and `selected_channels` (the same passed to the dlt pipeline) required by the Hamilton dataflow.\n",
"\n",
"Then, we call `dr.execute()` with the list of nodes to compute and the inputs. This returns a dictionary of results."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<!-- Generated by graphviz version 2.43.0 (0)\n",
" -->\n",
"<!-- Title: %3 Pages: 1 -->\n",
"<svg width=\"1545pt\" height=\"547pt\"\n",
" viewBox=\"0.00 0.00 1545.00 546.50\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n",
"<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 542.5)\">\n",
"<title>%3</title>\n",
"<polygon fill=\"white\" stroke=\"transparent\" points=\"-4,4 -4,-542.5 1541,-542.5 1541,4 -4,4\"/>\n",
"<g id=\"clust1\" class=\"cluster\">\n",
"<title>cluster__legend</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" points=\"43,-217.5 43,-530.5 141,-530.5 141,-217.5 43,-217.5\"/>\n",
"<text text-anchor=\"middle\" x=\"92\" y=\"-515.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">Legend</text>\n",
"</g>\n",
"<!-- db_con -->\n",
"<g id=\"node1\" class=\"node\">\n",
"<title>db_con</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M332,-217.5C332,-217.5 240,-217.5 240,-217.5 234,-217.5 228,-211.5 228,-205.5 228,-205.5 228,-165.5 228,-165.5 228,-159.5 234,-153.5 240,-153.5 240,-153.5 332,-153.5 332,-153.5 338,-153.5 344,-159.5 344,-165.5 344,-165.5 344,-205.5 344,-205.5 344,-211.5 338,-217.5 332,-217.5\"/>\n",
"<text text-anchor=\"start\" x=\"258.5\" y=\"-196.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">db_con</text>\n",
"<text text-anchor=\"start\" x=\"239\" y=\"-168.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">BaseBackend</text>\n",
"</g>\n",
"<!-- channel_replies -->\n",
"<g id=\"node4\" class=\"node\">\n",
"<title>channel_replies</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M528.5,-216.5C528.5,-216.5 408.5,-216.5 408.5,-216.5 402.5,-216.5 396.5,-210.5 396.5,-204.5 396.5,-204.5 396.5,-164.5 396.5,-164.5 396.5,-158.5 402.5,-152.5 408.5,-152.5 408.5,-152.5 528.5,-152.5 528.5,-152.5 534.5,-152.5 540.5,-158.5 540.5,-164.5 540.5,-164.5 540.5,-204.5 540.5,-204.5 540.5,-210.5 534.5,-216.5 528.5,-216.5\"/>\n",
"<text text-anchor=\"start\" x=\"407.5\" y=\"-195.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">channel_replies</text>\n",
"<text text-anchor=\"start\" x=\"450\" y=\"-167.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Table</text>\n",
"</g>\n",
"<!-- db_con&#45;&gt;channel_replies -->\n",
"<g id=\"edge6\" class=\"edge\">\n",
"<title>db_con&#45;&gt;channel_replies</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M344.11,-185.18C357.49,-185.11 372,-185.03 386.21,-184.95\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"386.51,-188.45 396.49,-184.89 386.47,-181.45 386.51,-188.45\"/>\n",
"</g>\n",
"<!-- channel_message -->\n",
"<g id=\"node6\" class=\"node\">\n",
"<title>channel_message</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M537,-133.5C537,-133.5 400,-133.5 400,-133.5 394,-133.5 388,-127.5 388,-121.5 388,-121.5 388,-81.5 388,-81.5 388,-75.5 394,-69.5 400,-69.5 400,-69.5 537,-69.5 537,-69.5 543,-69.5 549,-75.5 549,-81.5 549,-81.5 549,-121.5 549,-121.5 549,-127.5 543,-133.5 537,-133.5\"/>\n",
"<text text-anchor=\"start\" x=\"399\" y=\"-112.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">channel_message</text>\n",
"<text text-anchor=\"start\" x=\"450\" y=\"-84.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Table</text>\n",
"</g>\n",
"<!-- db_con&#45;&gt;channel_message -->\n",
"<g id=\"edge10\" class=\"edge\">\n",
"<title>db_con&#45;&gt;channel_message</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M344.11,-158.92C358.4,-152.27 373.96,-145.03 389.08,-137.99\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"390.83,-141.04 398.42,-133.65 387.88,-134.69 390.83,-141.04\"/>\n",
"</g>\n",
"<!-- channel_threads -->\n",
"<g id=\"node2\" class=\"node\">\n",
"<title>channel_threads</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M718,-174.5C718,-174.5 590,-174.5 590,-174.5 584,-174.5 578,-168.5 578,-162.5 578,-162.5 578,-122.5 578,-122.5 578,-116.5 584,-110.5 590,-110.5 590,-110.5 718,-110.5 718,-110.5 724,-110.5 730,-116.5 730,-122.5 730,-122.5 730,-162.5 730,-162.5 730,-168.5 724,-174.5 718,-174.5\"/>\n",
"<text text-anchor=\"start\" x=\"589\" y=\"-153.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">channel_threads</text>\n",
"<text text-anchor=\"start\" x=\"635.5\" y=\"-125.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Table</text>\n",
"</g>\n",
"<!-- channels_collection -->\n",
"<g id=\"node5\" class=\"node\">\n",
"<title>channels_collection</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"#ea5556\" d=\"M926,-174.5C926,-174.5 775,-174.5 775,-174.5 769,-174.5 763,-168.5 763,-162.5 763,-162.5 763,-122.5 763,-122.5 763,-116.5 769,-110.5 775,-110.5 775,-110.5 926,-110.5 926,-110.5 932,-110.5 938,-116.5 938,-122.5 938,-122.5 938,-162.5 938,-162.5 938,-168.5 932,-174.5 926,-174.5\"/>\n",
"<path fill=\"none\" stroke=\"#ea5556\" d=\"M930,-178.5C930,-178.5 771,-178.5 771,-178.5 765,-178.5 759,-172.5 759,-166.5 759,-166.5 759,-118.5 759,-118.5 759,-112.5 765,-106.5 771,-106.5 771,-106.5 930,-106.5 930,-106.5 936,-106.5 942,-112.5 942,-118.5 942,-118.5 942,-166.5 942,-166.5 942,-172.5 936,-178.5 930,-178.5\"/>\n",
"<text text-anchor=\"start\" x=\"774\" y=\"-153.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">channels_collection</text>\n",
"<text text-anchor=\"start\" x=\"832\" y=\"-125.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Table</text>\n",
"</g>\n",
"<!-- channel_threads&#45;&gt;channels_collection -->\n",
"<g id=\"edge8\" class=\"edge\">\n",
"<title>channel_threads&#45;&gt;channels_collection</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M740.44,-142.5C743.21,-142.5 746,-142.5 748.8,-142.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"740.2,-142.5 730.2,-138 735.2,-142.5 730.2,-142.5 730.2,-142.5 730.2,-142.5 735.2,-142.5 730.2,-147 740.2,-142.5 740.2,-142.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"748.91,-146 758.91,-142.5 748.91,-139 748.91,-146\"/>\n",
"</g>\n",
"<!-- channel -->\n",
"<g id=\"node3\" class=\"node\">\n",
"<title>channel</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"#56e39f\" d=\"M330.5,-131.5C330.5,-131.5 241.5,-131.5 241.5,-131.5 235.5,-131.5 229.5,-125.5 229.5,-119.5 229.5,-119.5 229.5,-79.5 229.5,-79.5 229.5,-73.5 235.5,-67.5 241.5,-67.5 241.5,-67.5 330.5,-67.5 330.5,-67.5 336.5,-67.5 342.5,-73.5 342.5,-79.5 342.5,-79.5 342.5,-119.5 342.5,-119.5 342.5,-125.5 336.5,-131.5 330.5,-131.5\"/>\n",
"<path fill=\"none\" stroke=\"#56e39f\" d=\"M334.5,-135.5C334.5,-135.5 237.5,-135.5 237.5,-135.5 231.5,-135.5 225.5,-129.5 225.5,-123.5 225.5,-123.5 225.5,-75.5 225.5,-75.5 225.5,-69.5 231.5,-63.5 237.5,-63.5 237.5,-63.5 334.5,-63.5 334.5,-63.5 340.5,-63.5 346.5,-69.5 346.5,-75.5 346.5,-75.5 346.5,-123.5 346.5,-123.5 346.5,-129.5 340.5,-135.5 334.5,-135.5\"/>\n",
"<text text-anchor=\"start\" x=\"255\" y=\"-110.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">channel</text>\n",
"<text text-anchor=\"start\" x=\"240.5\" y=\"-82.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Parallelizable</text>\n",
"</g>\n",
"<!-- channel&#45;&gt;channel_replies -->\n",
"<g id=\"edge5\" class=\"edge\">\n",
"<title>channel&#45;&gt;channel_replies</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M346.6,-127.57C360.63,-134.18 375.79,-141.31 390.46,-148.22\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"390.48,-148.23 397.61,-156.56 395,-150.36 399.52,-152.49 399.52,-152.49 399.52,-152.49 395,-150.36 401.44,-148.42 390.48,-148.23 390.48,-148.23\"/>\n",
"</g>\n",
"<!-- channel&#45;&gt;channel_message -->\n",
"<g id=\"edge9\" class=\"edge\">\n",
"<title>channel&#45;&gt;channel_message</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M346.6,-100.16C356.64,-100.27 367.25,-100.39 377.85,-100.51\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"378,-100.51 387.95,-105.12 383,-100.56 388,-100.62 388,-100.62 388,-100.62 383,-100.56 388.05,-96.12 378,-100.51 378,-100.51\"/>\n",
"</g>\n",
"<!-- channel_replies&#45;&gt;channel_threads -->\n",
"<g id=\"edge3\" class=\"edge\">\n",
"<title>channel_replies&#45;&gt;channel_threads</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M540.71,-168.2C549.66,-166.15 558.89,-164.04 568.02,-161.95\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"568.92,-165.34 577.89,-159.69 567.36,-158.51 568.92,-165.34\"/>\n",
"</g>\n",
"<!-- threads.with_format_messages -->\n",
"<g id=\"node8\" class=\"node\">\n",
"<title>threads.with_format_messages</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M1226,-174.5C1226,-174.5 983,-174.5 983,-174.5 977,-174.5 971,-168.5 971,-162.5 971,-162.5 971,-122.5 971,-122.5 971,-116.5 977,-110.5 983,-110.5 983,-110.5 1226,-110.5 1226,-110.5 1232,-110.5 1238,-116.5 1238,-122.5 1238,-122.5 1238,-162.5 1238,-162.5 1238,-168.5 1232,-174.5 1226,-174.5\"/>\n",
"<text text-anchor=\"start\" x=\"982\" y=\"-153.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">threads.with_format_messages</text>\n",
"<text text-anchor=\"start\" x=\"1086\" y=\"-125.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Table</text>\n",
"</g>\n",
"<!-- channels_collection&#45;&gt;threads.with_format_messages -->\n",
"<g id=\"edge13\" class=\"edge\">\n",
"<title>channels_collection&#45;&gt;threads.with_format_messages</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M942.12,-142.5C948.09,-142.5 954.17,-142.5 960.31,-142.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"960.68,-146 970.68,-142.5 960.68,-139 960.68,-146\"/>\n",
"</g>\n",
"<!-- channel_message&#45;&gt;channel_threads -->\n",
"<g id=\"edge2\" class=\"edge\">\n",
"<title>channel_message&#45;&gt;channel_threads</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M549.19,-119.3C555.41,-120.69 561.7,-122.1 567.96,-123.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"567.21,-126.92 577.73,-125.68 568.74,-120.09 567.21,-126.92\"/>\n",
"</g>\n",
"<!-- threads.with_aggregate_thread -->\n",
"<g id=\"node7\" class=\"node\">\n",
"<title>threads.with_aggregate_thread</title>\n",
"<path fill=\"#ffc857\" stroke=\"black\" d=\"M1525,-174.5C1525,-174.5 1279,-174.5 1279,-174.5 1273,-174.5 1267,-168.5 1267,-162.5 1267,-162.5 1267,-122.5 1267,-122.5 1267,-116.5 1273,-110.5 1279,-110.5 1279,-110.5 1525,-110.5 1525,-110.5 1531,-110.5 1537,-116.5 1537,-122.5 1537,-122.5 1537,-162.5 1537,-162.5 1537,-168.5 1531,-174.5 1525,-174.5\"/>\n",
"<text text-anchor=\"start\" x=\"1278\" y=\"-153.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">threads.with_aggregate_thread</text>\n",
"<text text-anchor=\"start\" x=\"1383.5\" y=\"-125.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Table</text>\n",
"</g>\n",
"<!-- threads.with_format_messages&#45;&gt;threads.with_aggregate_thread -->\n",
"<g id=\"edge12\" class=\"edge\">\n",
"<title>threads.with_format_messages&#45;&gt;threads.with_aggregate_thread</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M1238.12,-142.5C1244.22,-142.5 1250.35,-142.5 1256.47,-142.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"1256.76,-146 1266.76,-142.5 1256.76,-139 1256.76,-146\"/>\n",
"</g>\n",
"<!-- _db_con_inputs -->\n",
"<g id=\"node9\" class=\"node\">\n",
"<title>_db_con_inputs</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"165,-208 19,-208 19,-163 165,-163 165,-208\"/>\n",
"<text text-anchor=\"start\" x=\"34\" y=\"-181.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">pipeline</text>\n",
"<text text-anchor=\"start\" x=\"95\" y=\"-181.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">Pipeline</text>\n",
"</g>\n",
"<!-- _db_con_inputs&#45;&gt;db_con -->\n",
"<g id=\"edge1\" class=\"edge\">\n",
"<title>_db_con_inputs&#45;&gt;db_con</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M165.04,-185.5C182.22,-185.5 200.5,-185.5 217.47,-185.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"217.85,-189 227.85,-185.5 217.85,-182 217.85,-189\"/>\n",
"</g>\n",
"<!-- _channel_inputs -->\n",
"<g id=\"node10\" class=\"node\">\n",
"<title>_channel_inputs</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"184,-122 0,-122 0,-77 184,-77 184,-122\"/>\n",
"<text text-anchor=\"start\" x=\"15\" y=\"-95.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">selected_channels</text>\n",
"<text text-anchor=\"start\" x=\"148\" y=\"-95.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">list</text>\n",
"</g>\n",
"<!-- _channel_inputs&#45;&gt;channel -->\n",
"<g id=\"edge4\" class=\"edge\">\n",
"<title>_channel_inputs&#45;&gt;channel</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M184.22,-99.5C194.56,-99.5 204.98,-99.5 214.98,-99.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"215.25,-103 225.25,-99.5 215.24,-96 215.25,-103\"/>\n",
"</g>\n",
"<!-- _channel_replies_inputs -->\n",
"<g id=\"node11\" class=\"node\">\n",
"<title>_channel_replies_inputs</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"359,-281 213,-281 213,-236 359,-236 359,-281\"/>\n",
"<text text-anchor=\"start\" x=\"228\" y=\"-254.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">pipeline</text>\n",
"<text text-anchor=\"start\" x=\"289\" y=\"-254.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">Pipeline</text>\n",
"</g>\n",
"<!-- _channel_replies_inputs&#45;&gt;channel_replies -->\n",
"<g id=\"edge7\" class=\"edge\">\n",
"<title>_channel_replies_inputs&#45;&gt;channel_replies</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M342.14,-235.9C356.3,-230.09 371.84,-223.72 387,-217.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"388.45,-220.69 396.38,-213.66 385.8,-214.21 388.45,-220.69\"/>\n",
"</g>\n",
"<!-- _channel_message_inputs -->\n",
"<g id=\"node12\" class=\"node\">\n",
"<title>_channel_message_inputs</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"359,-45 213,-45 213,0 359,0 359,-45\"/>\n",
"<text text-anchor=\"start\" x=\"228\" y=\"-18.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">pipeline</text>\n",
"<text text-anchor=\"start\" x=\"289\" y=\"-18.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">Pipeline</text>\n",
"</g>\n",
"<!-- _channel_message_inputs&#45;&gt;channel_message -->\n",
"<g id=\"edge11\" class=\"edge\">\n",
"<title>_channel_message_inputs&#45;&gt;channel_message</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M338.72,-45.14C353.18,-51.47 369.26,-58.5 385.02,-65.4\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"383.68,-68.64 394.25,-69.44 386.49,-62.22 383.68,-68.64\"/>\n",
"</g>\n",
"<!-- input -->\n",
"<g id=\"node13\" class=\"node\">\n",
"<title>input</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"121.5,-499 62.5,-499 62.5,-462 121.5,-462 121.5,-499\"/>\n",
"<text text-anchor=\"middle\" x=\"92\" y=\"-476.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">input</text>\n",
"</g>\n",
"<!-- function -->\n",
"<g id=\"node14\" class=\"node\">\n",
"<title>function</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M120,-444C120,-444 64,-444 64,-444 58,-444 52,-438 52,-432 52,-432 52,-419 52,-419 52,-413 58,-407 64,-407 64,-407 120,-407 120,-407 126,-407 132,-413 132,-419 132,-419 132,-432 132,-432 132,-438 126,-444 120,-444\"/>\n",
"<text text-anchor=\"middle\" x=\"92\" y=\"-421.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">function</text>\n",
"</g>\n",
"<!-- output -->\n",
"<g id=\"node15\" class=\"node\">\n",
"<title>output</title>\n",
"<path fill=\"#ffc857\" stroke=\"black\" d=\"M114,-389C114,-389 70,-389 70,-389 64,-389 58,-383 58,-377 58,-377 58,-364 58,-364 58,-358 64,-352 70,-352 70,-352 114,-352 114,-352 120,-352 126,-358 126,-364 126,-364 126,-377 126,-377 126,-383 120,-389 114,-389\"/>\n",
"<text text-anchor=\"middle\" x=\"92\" y=\"-366.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">output</text>\n",
"</g>\n",
"<!-- expand -->\n",
"<g id=\"node16\" class=\"node\">\n",
"<title>expand</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"#56e39f\" d=\"M117,-330C117,-330 67,-330 67,-330 61,-330 55,-324 55,-318 55,-318 55,-305 55,-305 55,-299 61,-293 67,-293 67,-293 117,-293 117,-293 123,-293 129,-299 129,-305 129,-305 129,-318 129,-318 129,-324 123,-330 117,-330\"/>\n",
"<path fill=\"none\" stroke=\"#56e39f\" d=\"M121,-334C121,-334 63,-334 63,-334 57,-334 51,-328 51,-322 51,-322 51,-301 51,-301 51,-295 57,-289 63,-289 63,-289 121,-289 121,-289 127,-289 133,-295 133,-301 133,-301 133,-322 133,-322 133,-328 127,-334 121,-334\"/>\n",
"<text text-anchor=\"middle\" x=\"92\" y=\"-307.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">expand</text>\n",
"</g>\n",
"<!-- collect -->\n",
"<g id=\"node17\" class=\"node\">\n",
"<title>collect</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"#ea5556\" d=\"M113.5,-267C113.5,-267 70.5,-267 70.5,-267 64.5,-267 58.5,-261 58.5,-255 58.5,-255 58.5,-242 58.5,-242 58.5,-236 64.5,-230 70.5,-230 70.5,-230 113.5,-230 113.5,-230 119.5,-230 125.5,-236 125.5,-242 125.5,-242 125.5,-255 125.5,-255 125.5,-261 119.5,-267 113.5,-267\"/>\n",
"<path fill=\"none\" stroke=\"#ea5556\" d=\"M117.5,-271C117.5,-271 66.5,-271 66.5,-271 60.5,-271 54.5,-265 54.5,-259 54.5,-259 54.5,-238 54.5,-238 54.5,-232 60.5,-226 66.5,-226 66.5,-226 117.5,-226 117.5,-226 123.5,-226 129.5,-232 129.5,-238 129.5,-238 129.5,-259 129.5,-259 129.5,-265 123.5,-271 117.5,-271\"/>\n",
"<text text-anchor=\"middle\" x=\"92\" y=\"-244.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">collect</text>\n",
"</g>\n",
"</g>\n",
"</svg>\n"
],
"text/plain": [
"<graphviz.graphs.Digraph at 0x7f2f46264280>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>thread_ts</th>\n",
" <th>thread</th>\n",
" <th>num_messages</th>\n",
" <th>users</th>\n",
" <th>_dlt_load_id</th>\n",
" <th>_dlt_id</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1711048769.908319</td>\n",
" <td>1: from the general channel again</td>\n",
" <td>1</td>\n",
" <td>[U06R4CY0Q65]</td>\n",
" <td>1712264235.2934568</td>\n",
" <td>[YzwD7S5kc4OYeA]</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1711053147.089719</td>\n",
" <td>1: along with a reply\\n 1: another message</td>\n",
" <td>2</td>\n",
" <td>[U06R4CY0Q65]</td>\n",
" <td>1712264235.2934568</td>\n",
" <td>[LItHtsbAAX6/Kg, XFxI2RmALQkXfg]</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>1711048757.443209</td>\n",
" <td>1: hello world</td>\n",
" <td>1</td>\n",
" <td>[U06R4CY0Q65]</td>\n",
" <td>1712264235.2934568</td>\n",
" <td>[x/vVKj7+sMyNvg]</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>1711048765.747779</td>\n",
" <td>1: general channel</td>\n",
" <td>1</td>\n",
" <td>[U06R4CY0Q65]</td>\n",
" <td>1712264235.2934568</td>\n",
" <td>[aT+AcnCycBlp8A]</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>1711048761.709929</td>\n",
" <td>1: 2nd message</td>\n",
" <td>1</td>\n",
" <td>[U06R4CY0Q65]</td>\n",
" <td>1712264235.2934568</td>\n",
" <td>[ENPKHYpQW15AFg]</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>1711048764.747779</td>\n",
" <td>1: my 2nd reply\\n 1: will this be picked up by...</td>\n",
" <td>3</td>\n",
" <td>[U06R4CY0Q65]</td>\n",
" <td>1712264235.2934568</td>\n",
" <td>[kEt1kEo9e0mkqQ, LatviGHcnPd8yw, lyx94g8HxvVcIA]</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" thread_ts thread \\\n",
"0 1711048769.908319 1: from the general channel again \n",
"1 1711053147.089719 1: along with a reply\\n 1: another message \n",
"2 1711048757.443209 1: hello world \n",
"3 1711048765.747779 1: general channel \n",
"4 1711048761.709929 1: 2nd message \n",
"5 1711048764.747779 1: my 2nd reply\\n 1: will this be picked up by... \n",
"\n",
" num_messages users _dlt_load_id \\\n",
"0 1 [U06R4CY0Q65] 1712264235.2934568 \n",
"1 2 [U06R4CY0Q65] 1712264235.2934568 \n",
"2 1 [U06R4CY0Q65] 1712264235.2934568 \n",
"3 1 [U06R4CY0Q65] 1712264235.2934568 \n",
"4 1 [U06R4CY0Q65] 1712264235.2934568 \n",
"5 3 [U06R4CY0Q65] 1712264235.2934568 \n",
"\n",
" _dlt_id \n",
"0 [YzwD7S5kc4OYeA] \n",
"1 [LItHtsbAAX6/Kg, XFxI2RmALQkXfg] \n",
"2 [x/vVKj7+sMyNvg] \n",
"3 [aT+AcnCycBlp8A] \n",
"4 [ENPKHYpQW15AFg] \n",
"5 [kEt1kEo9e0mkqQ, LatviGHcnPd8yw, lyx94g8HxvVcIA] "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"inputs = dict(\n",
" pipeline=slack_pipeline,\n",
" selected_channels=[\"general\", \"dlt\"],\n",
") \n",
"final_vars = [\"threads.with_aggregate_thread\"]\n",
"\n",
"results = dr.execute(final_vars, inputs=inputs)\n",
"# `threads.with_aggregate_thread` is an ibis expr, execute it to return a pandas DataFrame\n",
"df = results[\"threads.with_aggregate_thread\"].to_pandas()\n",
"\n",
"display(dr.visualize_execution(final_vars=final_vars, inputs=inputs), df)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The next cells require your OpenAI key to execute the full dataflow and compute the summaries. By requesting `insert_threads`, we could directly save the Ibis results without leaving DuckDB!"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"import os \n",
"import getpass\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"Enter your OpenAI key\")"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<!-- Generated by graphviz version 2.43.0 (0)\n",
" -->\n",
"<!-- Title: %3 Pages: 1 -->\n",
"<svg width=\"1887pt\" height=\"547pt\"\n",
" viewBox=\"0.00 0.00 1887.00 546.50\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n",
"<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 542.5)\">\n",
"<title>%3</title>\n",
"<polygon fill=\"white\" stroke=\"transparent\" points=\"-4,4 -4,-542.5 1883,-542.5 1883,4 -4,4\"/>\n",
"<g id=\"clust1\" class=\"cluster\">\n",
"<title>cluster__legend</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" points=\"43,-217.5 43,-530.5 141,-530.5 141,-217.5 43,-217.5\"/>\n",
"<text text-anchor=\"middle\" x=\"92\" y=\"-515.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">Legend</text>\n",
"</g>\n",
"<!-- threads -->\n",
"<g id=\"node1\" class=\"node\">\n",
"<title>threads</title>\n",
"<path fill=\"#ffc857\" stroke=\"black\" d=\"M1867,-215.5C1867,-215.5 1808,-215.5 1808,-215.5 1802,-215.5 1796,-209.5 1796,-203.5 1796,-203.5 1796,-163.5 1796,-163.5 1796,-157.5 1802,-151.5 1808,-151.5 1808,-151.5 1867,-151.5 1867,-151.5 1873,-151.5 1879,-157.5 1879,-163.5 1879,-163.5 1879,-203.5 1879,-203.5 1879,-209.5 1873,-215.5 1867,-215.5\"/>\n",
"<text text-anchor=\"start\" x=\"1807\" y=\"-194.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">threads</text>\n",
"<text text-anchor=\"start\" x=\"1819\" y=\"-166.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Table</text>\n",
"</g>\n",
"<!-- threads.with_summary -->\n",
"<g id=\"node2\" class=\"node\">\n",
"<title>threads.with_summary</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M1755,-215.5C1755,-215.5 1578,-215.5 1578,-215.5 1572,-215.5 1566,-209.5 1566,-203.5 1566,-203.5 1566,-163.5 1566,-163.5 1566,-157.5 1572,-151.5 1578,-151.5 1578,-151.5 1755,-151.5 1755,-151.5 1761,-151.5 1767,-157.5 1767,-163.5 1767,-163.5 1767,-203.5 1767,-203.5 1767,-209.5 1761,-215.5 1755,-215.5\"/>\n",
"<text text-anchor=\"start\" x=\"1577\" y=\"-194.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">threads.with_summary</text>\n",
"<text text-anchor=\"start\" x=\"1648\" y=\"-166.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Table</text>\n",
"</g>\n",
"<!-- threads.with_summary&#45;&gt;threads -->\n",
"<g id=\"edge1\" class=\"edge\">\n",
"<title>threads.with_summary&#45;&gt;threads</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M1767.27,-183.5C1773.53,-183.5 1779.66,-183.5 1785.54,-183.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"1785.82,-187 1795.82,-183.5 1785.82,-180 1785.82,-187\"/>\n",
"</g>\n",
"<!-- db_con -->\n",
"<g id=\"node3\" class=\"node\">\n",
"<title>db_con</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M332,-217.5C332,-217.5 240,-217.5 240,-217.5 234,-217.5 228,-211.5 228,-205.5 228,-205.5 228,-165.5 228,-165.5 228,-159.5 234,-153.5 240,-153.5 240,-153.5 332,-153.5 332,-153.5 338,-153.5 344,-159.5 344,-165.5 344,-165.5 344,-205.5 344,-205.5 344,-211.5 338,-217.5 332,-217.5\"/>\n",
"<text text-anchor=\"start\" x=\"258.5\" y=\"-196.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">db_con</text>\n",
"<text text-anchor=\"start\" x=\"239\" y=\"-168.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">BaseBackend</text>\n",
"</g>\n",
"<!-- channel_replies -->\n",
"<g id=\"node7\" class=\"node\">\n",
"<title>channel_replies</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M528.5,-216.5C528.5,-216.5 408.5,-216.5 408.5,-216.5 402.5,-216.5 396.5,-210.5 396.5,-204.5 396.5,-204.5 396.5,-164.5 396.5,-164.5 396.5,-158.5 402.5,-152.5 408.5,-152.5 408.5,-152.5 528.5,-152.5 528.5,-152.5 534.5,-152.5 540.5,-158.5 540.5,-164.5 540.5,-164.5 540.5,-204.5 540.5,-204.5 540.5,-210.5 534.5,-216.5 528.5,-216.5\"/>\n",
"<text text-anchor=\"start\" x=\"407.5\" y=\"-195.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">channel_replies</text>\n",
"<text text-anchor=\"start\" x=\"450\" y=\"-167.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Table</text>\n",
"</g>\n",
"<!-- db_con&#45;&gt;channel_replies -->\n",
"<g id=\"edge9\" class=\"edge\">\n",
"<title>db_con&#45;&gt;channel_replies</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M344.11,-185.18C357.49,-185.11 372,-185.03 386.21,-184.95\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"386.51,-188.45 396.49,-184.89 386.47,-181.45 386.51,-188.45\"/>\n",
"</g>\n",
"<!-- channel_message -->\n",
"<g id=\"node9\" class=\"node\">\n",
"<title>channel_message</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M537,-133.5C537,-133.5 400,-133.5 400,-133.5 394,-133.5 388,-127.5 388,-121.5 388,-121.5 388,-81.5 388,-81.5 388,-75.5 394,-69.5 400,-69.5 400,-69.5 537,-69.5 537,-69.5 543,-69.5 549,-75.5 549,-81.5 549,-81.5 549,-121.5 549,-121.5 549,-127.5 543,-133.5 537,-133.5\"/>\n",
"<text text-anchor=\"start\" x=\"399\" y=\"-112.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">channel_message</text>\n",
"<text text-anchor=\"start\" x=\"450\" y=\"-84.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Table</text>\n",
"</g>\n",
"<!-- db_con&#45;&gt;channel_message -->\n",
"<g id=\"edge13\" class=\"edge\">\n",
"<title>db_con&#45;&gt;channel_message</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M344.11,-158.92C358.4,-152.27 373.96,-145.03 389.08,-137.99\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"390.83,-141.04 398.42,-133.65 387.88,-134.69 390.83,-141.04\"/>\n",
"</g>\n",
"<!-- channel_threads -->\n",
"<g id=\"node4\" class=\"node\">\n",
"<title>channel_threads</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M718,-174.5C718,-174.5 590,-174.5 590,-174.5 584,-174.5 578,-168.5 578,-162.5 578,-162.5 578,-122.5 578,-122.5 578,-116.5 584,-110.5 590,-110.5 590,-110.5 718,-110.5 718,-110.5 724,-110.5 730,-116.5 730,-122.5 730,-122.5 730,-162.5 730,-162.5 730,-168.5 724,-174.5 718,-174.5\"/>\n",
"<text text-anchor=\"start\" x=\"589\" y=\"-153.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">channel_threads</text>\n",
"<text text-anchor=\"start\" x=\"635.5\" y=\"-125.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Table</text>\n",
"</g>\n",
"<!-- channels_collection -->\n",
"<g id=\"node8\" class=\"node\">\n",
"<title>channels_collection</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"#ea5556\" d=\"M926,-174.5C926,-174.5 775,-174.5 775,-174.5 769,-174.5 763,-168.5 763,-162.5 763,-162.5 763,-122.5 763,-122.5 763,-116.5 769,-110.5 775,-110.5 775,-110.5 926,-110.5 926,-110.5 932,-110.5 938,-116.5 938,-122.5 938,-122.5 938,-162.5 938,-162.5 938,-168.5 932,-174.5 926,-174.5\"/>\n",
"<path fill=\"none\" stroke=\"#ea5556\" d=\"M930,-178.5C930,-178.5 771,-178.5 771,-178.5 765,-178.5 759,-172.5 759,-166.5 759,-166.5 759,-118.5 759,-118.5 759,-112.5 765,-106.5 771,-106.5 771,-106.5 930,-106.5 930,-106.5 936,-106.5 942,-112.5 942,-118.5 942,-118.5 942,-166.5 942,-166.5 942,-172.5 936,-178.5 930,-178.5\"/>\n",
"<text text-anchor=\"start\" x=\"774\" y=\"-153.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">channels_collection</text>\n",
"<text text-anchor=\"start\" x=\"832\" y=\"-125.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Table</text>\n",
"</g>\n",
"<!-- channel_threads&#45;&gt;channels_collection -->\n",
"<g id=\"edge11\" class=\"edge\">\n",
"<title>channel_threads&#45;&gt;channels_collection</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M740.44,-142.5C743.21,-142.5 746,-142.5 748.8,-142.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"740.2,-142.5 730.2,-138 735.2,-142.5 730.2,-142.5 730.2,-142.5 730.2,-142.5 735.2,-142.5 730.2,-147 740.2,-142.5 740.2,-142.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"748.91,-146 758.91,-142.5 748.91,-139 748.91,-146\"/>\n",
"</g>\n",
"<!-- channel -->\n",
"<g id=\"node5\" class=\"node\">\n",
"<title>channel</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"#56e39f\" d=\"M330.5,-131.5C330.5,-131.5 241.5,-131.5 241.5,-131.5 235.5,-131.5 229.5,-125.5 229.5,-119.5 229.5,-119.5 229.5,-79.5 229.5,-79.5 229.5,-73.5 235.5,-67.5 241.5,-67.5 241.5,-67.5 330.5,-67.5 330.5,-67.5 336.5,-67.5 342.5,-73.5 342.5,-79.5 342.5,-79.5 342.5,-119.5 342.5,-119.5 342.5,-125.5 336.5,-131.5 330.5,-131.5\"/>\n",
"<path fill=\"none\" stroke=\"#56e39f\" d=\"M334.5,-135.5C334.5,-135.5 237.5,-135.5 237.5,-135.5 231.5,-135.5 225.5,-129.5 225.5,-123.5 225.5,-123.5 225.5,-75.5 225.5,-75.5 225.5,-69.5 231.5,-63.5 237.5,-63.5 237.5,-63.5 334.5,-63.5 334.5,-63.5 340.5,-63.5 346.5,-69.5 346.5,-75.5 346.5,-75.5 346.5,-123.5 346.5,-123.5 346.5,-129.5 340.5,-135.5 334.5,-135.5\"/>\n",
"<text text-anchor=\"start\" x=\"255\" y=\"-110.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">channel</text>\n",
"<text text-anchor=\"start\" x=\"240.5\" y=\"-82.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Parallelizable</text>\n",
"</g>\n",
"<!-- channel&#45;&gt;channel_replies -->\n",
"<g id=\"edge8\" class=\"edge\">\n",
"<title>channel&#45;&gt;channel_replies</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M346.6,-127.57C360.63,-134.18 375.79,-141.31 390.46,-148.22\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"390.48,-148.23 397.61,-156.56 395,-150.36 399.52,-152.49 399.52,-152.49 399.52,-152.49 395,-150.36 401.44,-148.42 390.48,-148.23 390.48,-148.23\"/>\n",
"</g>\n",
"<!-- channel&#45;&gt;channel_message -->\n",
"<g id=\"edge12\" class=\"edge\">\n",
"<title>channel&#45;&gt;channel_message</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M346.6,-100.16C356.64,-100.27 367.25,-100.39 377.85,-100.51\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"378,-100.51 387.95,-105.12 383,-100.56 388,-100.62 388,-100.62 388,-100.62 383,-100.56 388.05,-96.12 378,-100.51 378,-100.51\"/>\n",
"</g>\n",
"<!-- summary_prompt -->\n",
"<g id=\"node6\" class=\"node\">\n",
"<title>summary_prompt</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M1469,-256.5C1469,-256.5 1335,-256.5 1335,-256.5 1329,-256.5 1323,-250.5 1323,-244.5 1323,-244.5 1323,-204.5 1323,-204.5 1323,-198.5 1329,-192.5 1335,-192.5 1335,-192.5 1469,-192.5 1469,-192.5 1475,-192.5 1481,-198.5 1481,-204.5 1481,-204.5 1481,-244.5 1481,-244.5 1481,-250.5 1475,-256.5 1469,-256.5\"/>\n",
"<text text-anchor=\"start\" x=\"1334\" y=\"-235.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">summary_prompt</text>\n",
"<text text-anchor=\"start\" x=\"1392.5\" y=\"-207.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">str</text>\n",
"</g>\n",
"<!-- summary_prompt&#45;&gt;threads.with_summary -->\n",
"<g id=\"edge3\" class=\"edge\">\n",
"<title>summary_prompt&#45;&gt;threads.with_summary</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M1481.05,-212.31C1504.53,-208.64 1530.75,-204.55 1555.87,-200.62\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"1556.48,-204.07 1565.82,-199.07 1555.4,-197.15 1556.48,-204.07\"/>\n",
"</g>\n",
"<!-- channel_replies&#45;&gt;channel_threads -->\n",
"<g id=\"edge6\" class=\"edge\">\n",
"<title>channel_replies&#45;&gt;channel_threads</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M540.71,-168.2C549.66,-166.15 558.89,-164.04 568.02,-161.95\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"568.92,-165.34 577.89,-159.69 567.36,-158.51 568.92,-165.34\"/>\n",
"</g>\n",
"<!-- threads.with_format_messages -->\n",
"<g id=\"node11\" class=\"node\">\n",
"<title>threads.with_format_messages</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M1226,-174.5C1226,-174.5 983,-174.5 983,-174.5 977,-174.5 971,-168.5 971,-162.5 971,-162.5 971,-122.5 971,-122.5 971,-116.5 977,-110.5 983,-110.5 983,-110.5 1226,-110.5 1226,-110.5 1232,-110.5 1238,-116.5 1238,-122.5 1238,-122.5 1238,-162.5 1238,-162.5 1238,-168.5 1232,-174.5 1226,-174.5\"/>\n",
"<text text-anchor=\"start\" x=\"982\" y=\"-153.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">threads.with_format_messages</text>\n",
"<text text-anchor=\"start\" x=\"1086\" y=\"-125.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Table</text>\n",
"</g>\n",
"<!-- channels_collection&#45;&gt;threads.with_format_messages -->\n",
"<g id=\"edge16\" class=\"edge\">\n",
"<title>channels_collection&#45;&gt;threads.with_format_messages</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M942.12,-142.5C948.09,-142.5 954.17,-142.5 960.31,-142.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"960.68,-146 970.68,-142.5 960.68,-139 960.68,-146\"/>\n",
"</g>\n",
"<!-- channel_message&#45;&gt;channel_threads -->\n",
"<g id=\"edge5\" class=\"edge\">\n",
"<title>channel_message&#45;&gt;channel_threads</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M549.19,-119.3C555.41,-120.69 561.7,-122.1 567.96,-123.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"567.21,-126.92 577.73,-125.68 568.74,-120.09 567.21,-126.92\"/>\n",
"</g>\n",
"<!-- threads.with_aggregate_thread -->\n",
"<g id=\"node10\" class=\"node\">\n",
"<title>threads.with_aggregate_thread</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M1525,-174.5C1525,-174.5 1279,-174.5 1279,-174.5 1273,-174.5 1267,-168.5 1267,-162.5 1267,-162.5 1267,-122.5 1267,-122.5 1267,-116.5 1273,-110.5 1279,-110.5 1279,-110.5 1525,-110.5 1525,-110.5 1531,-110.5 1537,-116.5 1537,-122.5 1537,-122.5 1537,-162.5 1537,-162.5 1537,-168.5 1531,-174.5 1525,-174.5\"/>\n",
"<text text-anchor=\"start\" x=\"1278\" y=\"-153.3\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">threads.with_aggregate_thread</text>\n",
"<text text-anchor=\"start\" x=\"1383.5\" y=\"-125.3\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Table</text>\n",
"</g>\n",
"<!-- threads.with_aggregate_thread&#45;&gt;threads.with_summary -->\n",
"<g id=\"edge2\" class=\"edge\">\n",
"<title>threads.with_aggregate_thread&#45;&gt;threads.with_summary</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M1537.33,-163.48C1543.47,-164.44 1549.6,-165.4 1555.67,-166.35\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"1555.4,-169.85 1565.82,-167.93 1556.48,-162.93 1555.4,-169.85\"/>\n",
"</g>\n",
"<!-- threads.with_format_messages&#45;&gt;threads.with_aggregate_thread -->\n",
"<g id=\"edge15\" class=\"edge\">\n",
"<title>threads.with_format_messages&#45;&gt;threads.with_aggregate_thread</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M1238.12,-142.5C1244.22,-142.5 1250.35,-142.5 1256.47,-142.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"1256.76,-146 1266.76,-142.5 1256.76,-139 1256.76,-146\"/>\n",
"</g>\n",
"<!-- _db_con_inputs -->\n",
"<g id=\"node12\" class=\"node\">\n",
"<title>_db_con_inputs</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"165,-208 19,-208 19,-163 165,-163 165,-208\"/>\n",
"<text text-anchor=\"start\" x=\"34\" y=\"-181.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">pipeline</text>\n",
"<text text-anchor=\"start\" x=\"95\" y=\"-181.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">Pipeline</text>\n",
"</g>\n",
"<!-- _db_con_inputs&#45;&gt;db_con -->\n",
"<g id=\"edge4\" class=\"edge\">\n",
"<title>_db_con_inputs&#45;&gt;db_con</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M165.04,-185.5C182.22,-185.5 200.5,-185.5 217.47,-185.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"217.85,-189 227.85,-185.5 217.85,-182 217.85,-189\"/>\n",
"</g>\n",
"<!-- _channel_inputs -->\n",
"<g id=\"node13\" class=\"node\">\n",
"<title>_channel_inputs</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"184,-122 0,-122 0,-77 184,-77 184,-122\"/>\n",
"<text text-anchor=\"start\" x=\"15\" y=\"-95.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">selected_channels</text>\n",
"<text text-anchor=\"start\" x=\"148\" y=\"-95.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">list</text>\n",
"</g>\n",
"<!-- _channel_inputs&#45;&gt;channel -->\n",
"<g id=\"edge7\" class=\"edge\">\n",
"<title>_channel_inputs&#45;&gt;channel</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M184.22,-99.5C194.56,-99.5 204.98,-99.5 214.98,-99.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"215.25,-103 225.25,-99.5 215.24,-96 215.25,-103\"/>\n",
"</g>\n",
"<!-- _channel_replies_inputs -->\n",
"<g id=\"node14\" class=\"node\">\n",
"<title>_channel_replies_inputs</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"359,-281 213,-281 213,-236 359,-236 359,-281\"/>\n",
"<text text-anchor=\"start\" x=\"228\" y=\"-254.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">pipeline</text>\n",
"<text text-anchor=\"start\" x=\"289\" y=\"-254.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">Pipeline</text>\n",
"</g>\n",
"<!-- _channel_replies_inputs&#45;&gt;channel_replies -->\n",
"<g id=\"edge10\" class=\"edge\">\n",
"<title>_channel_replies_inputs&#45;&gt;channel_replies</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M342.14,-235.9C356.3,-230.09 371.84,-223.72 387,-217.5\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"388.45,-220.69 396.38,-213.66 385.8,-214.21 388.45,-220.69\"/>\n",
"</g>\n",
"<!-- _channel_message_inputs -->\n",
"<g id=\"node15\" class=\"node\">\n",
"<title>_channel_message_inputs</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"359,-45 213,-45 213,0 359,0 359,-45\"/>\n",
"<text text-anchor=\"start\" x=\"228\" y=\"-18.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">pipeline</text>\n",
"<text text-anchor=\"start\" x=\"289\" y=\"-18.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">Pipeline</text>\n",
"</g>\n",
"<!-- _channel_message_inputs&#45;&gt;channel_message -->\n",
"<g id=\"edge14\" class=\"edge\">\n",
"<title>_channel_message_inputs&#45;&gt;channel_message</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M338.72,-45.14C353.18,-51.47 369.26,-58.5 385.02,-65.4\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"383.68,-68.64 394.25,-69.44 386.49,-62.22 383.68,-68.64\"/>\n",
"</g>\n",
"<!-- input -->\n",
"<g id=\"node16\" class=\"node\">\n",
"<title>input</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"121.5,-499 62.5,-499 62.5,-462 121.5,-462 121.5,-499\"/>\n",
"<text text-anchor=\"middle\" x=\"92\" y=\"-476.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">input</text>\n",
"</g>\n",
"<!-- function -->\n",
"<g id=\"node17\" class=\"node\">\n",
"<title>function</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M120,-444C120,-444 64,-444 64,-444 58,-444 52,-438 52,-432 52,-432 52,-419 52,-419 52,-413 58,-407 64,-407 64,-407 120,-407 120,-407 126,-407 132,-413 132,-419 132,-419 132,-432 132,-432 132,-438 126,-444 120,-444\"/>\n",
"<text text-anchor=\"middle\" x=\"92\" y=\"-421.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">function</text>\n",
"</g>\n",
"<!-- output -->\n",
"<g id=\"node18\" class=\"node\">\n",
"<title>output</title>\n",
"<path fill=\"#ffc857\" stroke=\"black\" d=\"M114,-389C114,-389 70,-389 70,-389 64,-389 58,-383 58,-377 58,-377 58,-364 58,-364 58,-358 64,-352 70,-352 70,-352 114,-352 114,-352 120,-352 126,-358 126,-364 126,-364 126,-377 126,-377 126,-383 120,-389 114,-389\"/>\n",
"<text text-anchor=\"middle\" x=\"92\" y=\"-366.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">output</text>\n",
"</g>\n",
"<!-- expand -->\n",
"<g id=\"node19\" class=\"node\">\n",
"<title>expand</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"#56e39f\" d=\"M117,-330C117,-330 67,-330 67,-330 61,-330 55,-324 55,-318 55,-318 55,-305 55,-305 55,-299 61,-293 67,-293 67,-293 117,-293 117,-293 123,-293 129,-299 129,-305 129,-305 129,-318 129,-318 129,-324 123,-330 117,-330\"/>\n",
"<path fill=\"none\" stroke=\"#56e39f\" d=\"M121,-334C121,-334 63,-334 63,-334 57,-334 51,-328 51,-322 51,-322 51,-301 51,-301 51,-295 57,-289 63,-289 63,-289 121,-289 121,-289 127,-289 133,-295 133,-301 133,-301 133,-322 133,-322 133,-328 127,-334 121,-334\"/>\n",
"<text text-anchor=\"middle\" x=\"92\" y=\"-307.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">expand</text>\n",
"</g>\n",
"<!-- collect -->\n",
"<g id=\"node20\" class=\"node\">\n",
"<title>collect</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"#ea5556\" d=\"M113.5,-267C113.5,-267 70.5,-267 70.5,-267 64.5,-267 58.5,-261 58.5,-255 58.5,-255 58.5,-242 58.5,-242 58.5,-236 64.5,-230 70.5,-230 70.5,-230 113.5,-230 113.5,-230 119.5,-230 125.5,-236 125.5,-242 125.5,-242 125.5,-255 125.5,-255 125.5,-261 119.5,-267 113.5,-267\"/>\n",
"<path fill=\"none\" stroke=\"#ea5556\" d=\"M117.5,-271C117.5,-271 66.5,-271 66.5,-271 60.5,-271 54.5,-265 54.5,-259 54.5,-259 54.5,-238 54.5,-238 54.5,-232 60.5,-226 66.5,-226 66.5,-226 117.5,-226 117.5,-226 123.5,-226 129.5,-232 129.5,-238 129.5,-238 129.5,-259 129.5,-259 129.5,-265 123.5,-271 117.5,-271\"/>\n",
"<text text-anchor=\"middle\" x=\"92\" y=\"-244.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">collect</text>\n",
"</g>\n",
"</g>\n",
"</svg>\n"
],
"text/plain": [
"<graphviz.graphs.Digraph at 0x7f2f21975d50>"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>thread_ts</th>\n",
" <th>thread</th>\n",
" <th>num_messages</th>\n",
" <th>users</th>\n",
" <th>_dlt_load_id</th>\n",
" <th>_dlt_id</th>\n",
" <th>summary</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1711048769.908319</td>\n",
" <td>1: from the general channel again</td>\n",
" <td>1</td>\n",
" <td>[U06R4CY0Q65]</td>\n",
" <td>1712264235.2934568</td>\n",
" <td>[YzwD7S5kc4OYeA]</td>\n",
" <td>User1: Has anyone encountered issues with Hami...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>1711053147.089719</td>\n",
" <td>1: another message\\n 1: along with a reply</td>\n",
" <td>2</td>\n",
" <td>[U06R4CY0Q65]</td>\n",
" <td>1712264235.2934568</td>\n",
" <td>[XFxI2RmALQkXfg, LItHtsbAAX6/Kg]</td>\n",
" <td>User1:\\n Hey everyone, I'm having troub...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>1711048757.443209</td>\n",
" <td>1: hello world</td>\n",
" <td>1</td>\n",
" <td>[U06R4CY0Q65]</td>\n",
" <td>1712264235.2934568</td>\n",
" <td>[x/vVKj7+sMyNvg]</td>\n",
" <td>2: Hi User1, what's up?\\n \\n 3: ...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>1711048761.709929</td>\n",
" <td>1: 2nd message</td>\n",
" <td>1</td>\n",
" <td>[U06R4CY0Q65]</td>\n",
" <td>1712264235.2934568</td>\n",
" <td>[ENPKHYpQW15AFg]</td>\n",
" <td>User1: Hi everyone, I've been using Hamilton f...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>1711048765.747779</td>\n",
" <td>1: general channel</td>\n",
" <td>1</td>\n",
" <td>[U06R4CY0Q65]</td>\n",
" <td>1712264235.2934568</td>\n",
" <td>[aT+AcnCycBlp8A]</td>\n",
" <td>User1: \"I've been trying to use Hamilton for m...</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>1711048764.747779</td>\n",
" <td>1: my 2nd reply\\n 1: my 1st reply\\n 1: will th...</td>\n",
" <td>3</td>\n",
" <td>[U06R4CY0Q65]</td>\n",
" <td>1712264235.2934568</td>\n",
" <td>[kEt1kEo9e0mkqQ, lyx94g8HxvVcIA, LatviGHcnPd8yw]</td>\n",
" <td>User1's issue: User1 is questioning whether ce...</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" thread_ts thread \\\n",
"0 1711048769.908319 1: from the general channel again \n",
"1 1711053147.089719 1: another message\\n 1: along with a reply \n",
"2 1711048757.443209 1: hello world \n",
"3 1711048761.709929 1: 2nd message \n",
"4 1711048765.747779 1: general channel \n",
"5 1711048764.747779 1: my 2nd reply\\n 1: my 1st reply\\n 1: will th... \n",
"\n",
" num_messages users _dlt_load_id \\\n",
"0 1 [U06R4CY0Q65] 1712264235.2934568 \n",
"1 2 [U06R4CY0Q65] 1712264235.2934568 \n",
"2 1 [U06R4CY0Q65] 1712264235.2934568 \n",
"3 1 [U06R4CY0Q65] 1712264235.2934568 \n",
"4 1 [U06R4CY0Q65] 1712264235.2934568 \n",
"5 3 [U06R4CY0Q65] 1712264235.2934568 \n",
"\n",
" _dlt_id \\\n",
"0 [YzwD7S5kc4OYeA] \n",
"1 [XFxI2RmALQkXfg, LItHtsbAAX6/Kg] \n",
"2 [x/vVKj7+sMyNvg] \n",
"3 [ENPKHYpQW15AFg] \n",
"4 [aT+AcnCycBlp8A] \n",
"5 [kEt1kEo9e0mkqQ, lyx94g8HxvVcIA, LatviGHcnPd8yw] \n",
"\n",
" summary \n",
"0 User1: Has anyone encountered issues with Hami... \n",
"1 User1:\\n Hey everyone, I'm having troub... \n",
"2 2: Hi User1, what's up?\\n \\n 3: ... \n",
"3 User1: Hi everyone, I've been using Hamilton f... \n",
"4 User1: \"I've been trying to use Hamilton for m... \n",
"5 User1's issue: User1 is questioning whether ce... "
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"final_vars = [\"threads\"] # replace by `\"insert_threads\"` to directly store results\n",
"results = dr.execute(final_vars, inputs=inputs)\n",
"df2 = results[\"threads\"].to_pandas()\n",
"\n",
"display(dr.visualize_execution(final_vars=final_vars, inputs=inputs), df2)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
}
},
"nbformat": 4,
"nbformat_minor": 2
}