blob: d09ef9abf166e0b07adfc7b8b323103ee2d756c3 [file] [log] [blame]
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Execute this cell to install dependencies\n",
"%pip install sf-hamilton[visualization]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Reverse ETL for your timesheets [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dagworks-inc/hamilton/blob/main/examples/reverse_etl/notebook.ipynb) [![GitHub badge](https://img.shields.io/badge/github-view_source-2b3137?logo=github)](https://github.com/dagworks-inc/hamilton/blob/main/examples/reverse_etl/notebook.ipynb)\n",
"\n",
"\n",
"This notebook contains a tutorial on how to use Hamilton to extract time tracking data from Timewarrior, transform it into a timesheet, and upload it to Google Sheets. You'll learn how to create dataflows in a notebook with Hamilton, and how to push data online using the Google Sheets API. The resulting dataflow can also be used as a Timewarrior extension to upload your timesheet using `timew report upload_timesheet.py`. See the [README](./README.md) for details! "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 1. Taskwarrior and Timewarrior\n",
"[Taskwarrior](https://taskwarrior.org/) is an open-source command line tool to create task lists. It pairs nicely with [Timewarrior](https://timewarrior.net/), a command line tool for time tracking. By adding [this hook](https://timewarrior.net/docs/taskwarrior/) for Taskwarrior, you can specify whenever you `start` / `stop` a task, and it will be tracked in Timewarrior.\n",
"\n",
"At the beginning of the day, I enter the tasks I intend to complete in Taskwarrior. I then share them with my colleagues during our daily standup. When you start working on a task, use `task $ID start`, use `task $ID stop` for a break, and do `task $ID done` whenever the task is completed.\n",
"\n",
"While nobody likes filling timesheets, this approach is lightweight and can be done directly from the terminal in your IDE which minimizes context switching. In addition, I use a VSCode extension that adds a small indicator of the current task. At a glance, I can verify that time tracking is active and the correct task is selected. If you mess up, you can always edit records in Timewarrior with `timew start 15min ago DESCRIPTION` or `timew track 11:15 - 11:45 DESCRIPTION`\n",
"\n",
"![](timewarrior_screenshot.png)\n",
"\n",
"Taskwarrior and Timewarrior provide well-defined APIs for extension and easy integration with other tools. In my case, I want to create an up-to-date timesheet and push it to Google Sheets. We'll create two dataflows:\n",
"1. Parse raw data and prepare the timesheet\n",
"2. Upload incremental records to Google Sheets"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 2. Prepare a timesheet with Hamilton\n",
"Hamilton defines the dataflow using regular Python functions. This helps breakdown data transformations into composable steps and improves readability via type hints and docstrings. We invite you to read the dataflow code to learn about the different design decisions.\n",
"\n",
"We'll use the Hamilton VSCode extension to define the dataflow in a notebook. The first line of the cell `%%cell_to_module timesheet_module` indicates that we'll be defining a Python module in this code cell with the name `timesheet_module`. Hamilton will be able to create the dataflow from this module. Learn more via the [video tutorial](https://www.youtube.com/watch?v=Z3ZT2ur2jg8) or [notebook example](https://github.com/DAGWorks-Inc/hamilton/blob/main/examples/jupyter_notebook_magic/example.ipynb)\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# load the notebook extension\n",
"%load_ext hamilton.plugins.jupyter_magic"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<!-- Generated by graphviz version 2.43.0 (0)\n",
" -->\n",
"<!-- Title: %3 Pages: 1 -->\n",
"<svg width=\"746pt\" height=\"344pt\"\n",
" viewBox=\"0.00 0.00 746.00 344.00\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n",
"<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 340)\">\n",
"<title>%3</title>\n",
"<polygon fill=\"white\" stroke=\"transparent\" points=\"-4,4 -4,-340 742,-340 742,4 -4,4\"/>\n",
"<g id=\"clust1\" class=\"cluster\">\n",
"<title>cluster__legend</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" points=\"31.5,-142 31.5,-328 127.5,-328 127.5,-142 31.5,-142\"/>\n",
"<text text-anchor=\"middle\" x=\"79.5\" y=\"-312.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">Legend</text>\n",
"</g>\n",
"<!-- mode -->\n",
"<g id=\"node1\" class=\"node\">\n",
"<title>mode</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" points=\"103.5,-50 49.5,-50 49.5,0 109.5,0 109.5,-44 103.5,-50\"/>\n",
"<polyline fill=\"none\" stroke=\"black\" points=\"103.5,-50 103.5,-44 \"/>\n",
"<polyline fill=\"none\" stroke=\"black\" points=\"109.5,-44 103.5,-44 \"/>\n",
"<text text-anchor=\"start\" x=\"57.5\" y=\"-35.8\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">mode</text>\n",
"<text text-anchor=\"start\" x=\"59.5\" y=\"-7.8\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">demo</text>\n",
"</g>\n",
"<!-- input_data -->\n",
"<g id=\"node2\" class=\"node\">\n",
"<title>input_data</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M147,-132C147,-132 12,-132 12,-132 6,-132 0,-126 0,-120 0,-120 0,-80 0,-80 0,-74 6,-68 12,-68 12,-68 147,-68 147,-68 153,-68 159,-74 159,-80 159,-80 159,-120 159,-120 159,-126 153,-132 147,-132\"/>\n",
"<text text-anchor=\"start\" x=\"11\" y=\"-110.8\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">input_data: mode</text>\n",
"<text text-anchor=\"start\" x=\"70\" y=\"-82.8\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">str</text>\n",
"</g>\n",
"<!-- intervals -->\n",
"<g id=\"node4\" class=\"node\">\n",
"<title>intervals</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M296,-132C296,-132 228,-132 228,-132 222,-132 216,-126 216,-120 216,-120 216,-80 216,-80 216,-74 222,-68 228,-68 228,-68 296,-68 296,-68 302,-68 308,-74 308,-80 308,-80 308,-120 308,-120 308,-126 302,-132 296,-132\"/>\n",
"<text text-anchor=\"start\" x=\"227\" y=\"-110.8\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">intervals</text>\n",
"<text text-anchor=\"start\" x=\"251.5\" y=\"-82.8\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">list</text>\n",
"</g>\n",
"<!-- input_data&#45;&gt;intervals -->\n",
"<g id=\"edge2\" class=\"edge\">\n",
"<title>input_data&#45;&gt;intervals</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M159.14,-100C174.74,-100 190.81,-100 205.44,-100\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"205.81,-103.5 215.81,-100 205.81,-96.5 205.81,-103.5\"/>\n",
"</g>\n",
"<!-- daily_df -->\n",
"<g id=\"node3\" class=\"node\">\n",
"<title>daily_df</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M595,-95C595,-95 520,-95 520,-95 514,-95 508,-89 508,-83 508,-83 508,-43 508,-43 508,-37 514,-31 520,-31 520,-31 595,-31 595,-31 601,-31 607,-37 607,-43 607,-43 607,-83 607,-83 607,-89 601,-95 595,-95\"/>\n",
"<text text-anchor=\"start\" x=\"527\" y=\"-73.8\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">daily_df</text>\n",
"<text text-anchor=\"start\" x=\"519\" y=\"-45.8\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">DataFrame</text>\n",
"</g>\n",
"<!-- timesheet -->\n",
"<g id=\"node5\" class=\"node\">\n",
"<title>timesheet</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M726,-95C726,-95 648,-95 648,-95 642,-95 636,-89 636,-83 636,-83 636,-43 636,-43 636,-37 642,-31 648,-31 648,-31 726,-31 726,-31 732,-31 738,-37 738,-43 738,-43 738,-83 738,-83 738,-89 732,-95 726,-95\"/>\n",
"<text text-anchor=\"start\" x=\"647\" y=\"-73.8\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">timesheet</text>\n",
"<text text-anchor=\"start\" x=\"648.5\" y=\"-45.8\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">DataFrame</text>\n",
"</g>\n",
"<!-- daily_df&#45;&gt;timesheet -->\n",
"<g id=\"edge3\" class=\"edge\">\n",
"<title>daily_df&#45;&gt;timesheet</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M607.07,-63C613.11,-63 619.34,-63 625.53,-63\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"625.88,-66.5 635.88,-63 625.88,-59.5 625.88,-66.5\"/>\n",
"</g>\n",
"<!-- intervals_df -->\n",
"<g id=\"node6\" class=\"node\">\n",
"<title>intervals_df</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M467,-95C467,-95 377,-95 377,-95 371,-95 365,-89 365,-83 365,-83 365,-43 365,-43 365,-37 371,-31 377,-31 377,-31 467,-31 467,-31 473,-31 479,-37 479,-43 479,-43 479,-83 479,-83 479,-89 473,-95 467,-95\"/>\n",
"<text text-anchor=\"start\" x=\"376\" y=\"-73.8\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">intervals_df</text>\n",
"<text text-anchor=\"start\" x=\"383.5\" y=\"-45.8\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">DataFrame</text>\n",
"</g>\n",
"<!-- intervals&#45;&gt;intervals_df -->\n",
"<g id=\"edge4\" class=\"edge\">\n",
"<title>intervals&#45;&gt;intervals_df</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M308.28,-89.4C322.71,-86.02 339,-82.2 354.61,-78.55\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"355.81,-81.86 364.75,-76.17 354.21,-75.05 355.81,-81.86\"/>\n",
"</g>\n",
"<!-- intervals_df&#45;&gt;daily_df -->\n",
"<g id=\"edge1\" class=\"edge\">\n",
"<title>intervals_df&#45;&gt;daily_df</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M479.23,-63C485.35,-63 491.58,-63 497.73,-63\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"497.95,-66.5 507.95,-63 497.95,-59.5 497.95,-66.5\"/>\n",
"</g>\n",
"<!-- _intervals_df_inputs -->\n",
"<g id=\"node7\" class=\"node\">\n",
"<title>_intervals_df_inputs</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"336,-49.5 188,-49.5 188,-4.5 336,-4.5 336,-49.5\"/>\n",
"<text text-anchor=\"start\" x=\"203\" y=\"-22.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">must_include</text>\n",
"<text text-anchor=\"start\" x=\"302\" y=\"-22.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">str</text>\n",
"</g>\n",
"<!-- _intervals_df_inputs&#45;&gt;intervals_df -->\n",
"<g id=\"edge5\" class=\"edge\">\n",
"<title>_intervals_df_inputs&#45;&gt;intervals_df</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M336.45,-43.74C342.54,-45.12 348.68,-46.52 354.71,-47.9\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"354.22,-51.37 364.75,-50.18 355.77,-44.55 354.22,-51.37\"/>\n",
"</g>\n",
"<!-- config -->\n",
"<g id=\"node8\" class=\"node\">\n",
"<title>config</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" points=\"103,-297 50,-297 50,-261 109,-261 109,-291 103,-297\"/>\n",
"<polyline fill=\"none\" stroke=\"black\" points=\"103,-297 103,-291 \"/>\n",
"<polyline fill=\"none\" stroke=\"black\" points=\"109,-291 103,-291 \"/>\n",
"<text text-anchor=\"middle\" x=\"79.5\" y=\"-275.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">config</text>\n",
"</g>\n",
"<!-- input -->\n",
"<g id=\"node9\" class=\"node\">\n",
"<title>input</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"109,-242.5 50,-242.5 50,-205.5 109,-205.5 109,-242.5\"/>\n",
"<text text-anchor=\"middle\" x=\"79.5\" y=\"-220.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">input</text>\n",
"</g>\n",
"<!-- function -->\n",
"<g id=\"node10\" class=\"node\">\n",
"<title>function</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M107.5,-187.5C107.5,-187.5 51.5,-187.5 51.5,-187.5 45.5,-187.5 39.5,-181.5 39.5,-175.5 39.5,-175.5 39.5,-162.5 39.5,-162.5 39.5,-156.5 45.5,-150.5 51.5,-150.5 51.5,-150.5 107.5,-150.5 107.5,-150.5 113.5,-150.5 119.5,-156.5 119.5,-162.5 119.5,-162.5 119.5,-175.5 119.5,-175.5 119.5,-181.5 113.5,-187.5 107.5,-187.5\"/>\n",
"<text text-anchor=\"middle\" x=\"79.5\" y=\"-165.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">function</text>\n",
"</g>\n",
"</g>\n",
"</svg>\n"
],
"text/plain": [
"<graphviz.graphs.Digraph at 0x7ff26c11be50>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%cell_to_module timesheet_module --display --config mode=demo\n",
"# --display enables visualization\n",
"# --config mode=demo sets {\"mode\": \"demo\"} for the config\n",
"import subprocess\n",
"\n",
"import pandas as pd\n",
"from hamilton.function_modifiers import config\n",
"from timewreport.parser import TimeWarriorParser\n",
"\n",
"\n",
"# `input_data` will use this function when config key `mode` isn't set to `demo`\n",
"@config.when_not(mode=\"demo\")\n",
"def input_data__default() -> str:\n",
" \"\"\"Get raw text output of Timewarrior `export` command.\"\"\"\n",
" result = subprocess.run([\"timew\", \"export\"], stdout=subprocess.PIPE, text=True)\n",
" return \"\\n\" + result.stdout # requires padding a newline for valid parsing\n",
"\n",
"\n",
"# `input_data` will use this function when config key `mode` is set to `demo`\n",
"@config.when(mode=\"demo\")\n",
"def input_data__demo() -> str:\n",
" \"\"\"Dummy Timewarrior output data\"\"\"\n",
" return \"\"\"\n",
" [\n",
" {\"start\":\"20230302T144552Z\",\"end\":\"20230302T151236Z\",\"tags\":[\"personal\",\"Do groceries\"]},\n",
" {\"start\":\"20231121T202636Z\",\"end\":\"20231121T221714Z\",\"tags\":[\"DW\",\"Write blog post\"]},\n",
" {\"start\":\"20240103T221249Z\",\"end\":\"20240103T223917Z\",\"tags\":[\"DW\",\"team sync\"]}\n",
" ]\n",
" \"\"\"\n",
"\n",
"\n",
"def intervals(input_data: str) -> list[dict]:\n",
" \"\"\"Parse Timewarrior's output and return a list of time intervals\"\"\"\n",
" parser = TimeWarriorParser(input_data)\n",
" return [\n",
" dict(\n",
" date=interval.get_end(),\n",
" duration=interval.get_duration(),\n",
" tags=interval.get_tags(),\n",
" )\n",
" for interval in parser.get_intervals()\n",
" ]\n",
"\n",
"\n",
"def _is_tag_included(tags: list[str], target: str) -> bool:\n",
" \"\"\"Check if the `target` string is included in a tag.\n",
" For example, target `foo` is in tag `foobar`. \n",
" \n",
" This allows some flexibility with regards to typos in tags.\n",
" Since Timewarrior expects `+$TAG` while Timewarrior expects\n",
" `$TAG`, uncessary `+` are sometime present in Timewarrior. \n",
" \"\"\"\n",
" for tag in tags:\n",
" if target in tag: \n",
" return True\n",
" return False\n",
"\n",
"\n",
"def _get_description(tags: list[str]) -> str:\n",
" \"\"\"Return the first tag containing more than one word.\n",
" \n",
" The Taskwarrior hook adds tags for task: project, tags, and description\n",
" to the generated Timewarrior intervals. project and tags don't allow for \n",
" whitespaces; only description does.\n",
"\n",
" Hits an edge case for single word descriptions. A single word task is\n",
" a poorly written task and is uncommon.\n",
" \"\"\"\n",
" for tag in tags:\n",
" if len(tag.split(\" \")) > 1:\n",
" return tag\n",
" return \"\"\n",
"\n",
"\n",
"def intervals_df(intervals: list[dict], must_include: str = \"DW\") -> pd.DataFrame:\n",
" \"\"\"Convert the Timewarrior parsed intervals to a dataframe.\n",
"\n",
" If `must_include`, check tags to ensure the string is present in a given tag.\n",
" The default value is `DW`, which stands for DAGWorks, my current employer.\n",
" \"\"\"\n",
" df = pd.DataFrame.from_records(intervals)\n",
"\n",
" if must_include:\n",
" df = df.loc[df.tags.apply(_is_tag_included, target=must_include)]\n",
"\n",
" df[\"week_day\"] = df[\"date\"].dt.weekday\n",
" df[\"week_number\"] = df[\"date\"].dt.isocalendar().week\n",
" df[\"date\"] = pd.to_datetime(df[\"date\"]).dt.date\n",
" df[\"description\"] = df.tags.apply(_get_description)\n",
"\n",
" return df[[\"date\", \"week_number\", \"week_day\", \"duration\", \"description\"]]\n",
"\n",
"\n",
"def _deduplicate_items(items: list[str]) -> str:\n",
" \"\"\"Custom aggregation that reduces a list of strings\n",
" to a set of unique items before joining them. \n",
" \"\"\"\n",
" return '; '.join(set(items))\n",
"\n",
"\n",
"def daily_df(intervals_df: pd.DataFrame) -> pd.DataFrame:\n",
" \"\"\"Aggregate intervals daily to match the granularity of timesheet entries\n",
" \n",
" Note that the order of items in the `agg()` dict determines column orders.\n",
" Also, we need to use `.reset_index()` to recover the `date` column.\n",
" \"\"\"\n",
" return (\n",
" intervals_df\n",
" .groupby(\"date\")\n",
" .agg(\n",
" dict(\n",
" week_number=\"first\",\n",
" week_day=\"first\",\n",
" description=_deduplicate_items,\n",
" duration=\"sum\",\n",
" )\n",
" ).reset_index()\n",
" )\n",
"\n",
"\n",
"def timesheet(daily_df: pd.DataFrame) -> pd.DataFrame:\n",
" \"\"\"Converts seconds to hours. Then, convert to the number of quarter hours\n",
" and round to nearest value.\n",
" \"\"\"\n",
" daily_df[\"duration\"] = daily_df[\"duration\"].dt.total_seconds() / 3600\n",
" daily_df[\"duration\"] = (daily_df[\"duration\"] / 0.25).round() * 0.25\n",
" daily_df[\"weekly_cumulative\"] = daily_df.groupby(\"week_number\")[\"duration\"].cumsum()\n",
" return daily_df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's execute the dataflow defined above. We will use the config `mode=demo`, which will load dummy data in case you're not a Timewarrior user. "
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>date</th>\n",
" <th>week_number</th>\n",
" <th>week_day</th>\n",
" <th>description</th>\n",
" <th>duration</th>\n",
" <th>weekly_cumulative</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>2023-11-21</td>\n",
" <td>47</td>\n",
" <td>1</td>\n",
" <td>Write blog post</td>\n",
" <td>1.75</td>\n",
" <td>1.75</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2024-01-03</td>\n",
" <td>1</td>\n",
" <td>2</td>\n",
" <td>team sync</td>\n",
" <td>0.50</td>\n",
" <td>0.50</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" date week_number week_day description duration \\\n",
"0 2023-11-21 47 1 Write blog post 1.75 \n",
"1 2024-01-03 1 2 team sync 0.50 \n",
"\n",
" weekly_cumulative \n",
"0 1.75 \n",
"1 0.50 "
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from hamilton import driver\n",
"\n",
"dr = (\n",
" driver.Builder()\n",
" .with_modules(timesheet_module)\n",
" .with_config(dict(mode=\"demo\"))\n",
" .build()\n",
")\n",
"results = dr.execute([\"timesheet\"])\n",
"\n",
"results[\"timesheet\"].head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 3. Upload timesheet to Google Sheets\n",
"Now, we need to upload our timesheet dataframe to Google Sheets. See this [step-by-step guide](https://hackernoon.com/how-to-use-the-google-sheets-api-with-python) to get your Google Sheets API credentials. They should have the following schema:\n",
"\n",
"```json\n",
"credentials = {\n",
" \"type\": \"service_account\",\n",
" \"project_id\": \"...\",\n",
" \"private_key_id\": \"...\",\n",
" \"private_key\": \"-----BEGIN PRIVATE KEY----- ...\",\n",
" \"client_email\": \"... .iam.gserviceaccount.com\",\n",
" \"client_id\": \"...\",\n",
" \"auth_uri\": \"https://accounts.google.com/o/oauth2/auth\",\n",
" \"token_uri\": \"https://oauth2.googleapis.com/token\",\n",
" \"auth_provider_x509_cert_url\": \"https://www.googleapis.com/oauth2/v1/certs\",\n",
" \"client_x509_cert_url\": \"https://www.googleapis.com/robot/v1/metadata/x509/ ...\",\n",
" \"universe_domain\": \"googleapis.com\"\n",
"}\n",
"```\n",
"\n",
"We create a second module called `google_sheets_module`. It creates a Google `Resource` to interact with Google Sheets and execute queries to retrieve and upload records."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"# configuration to display our dataflow from top to bottom\n",
"display_conf = dict(orient=\"TB\")"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<!-- Generated by graphviz version 2.43.0 (0)\n",
" -->\n",
"<!-- Title: %3 Pages: 1 -->\n",
"<svg width=\"644pt\" height=\"749pt\"\n",
" viewBox=\"0.00 0.00 644.00 749.00\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n",
"<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 745)\">\n",
"<title>%3</title>\n",
"<polygon fill=\"white\" stroke=\"transparent\" points=\"-4,4 -4,-745 640,-745 640,4 -4,4\"/>\n",
"<g id=\"clust1\" class=\"cluster\">\n",
"<title>cluster__legend</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" points=\"454,-657 454,-733 628,-733 628,-657 454,-657\"/>\n",
"<text text-anchor=\"middle\" x=\"541\" y=\"-717.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">Legend</text>\n",
"</g>\n",
"<!-- incremental_records -->\n",
"<g id=\"node1\" class=\"node\">\n",
"<title>incremental_records</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M170,-253C170,-253 12,-253 12,-253 6,-253 0,-247 0,-241 0,-241 0,-201 0,-201 0,-195 6,-189 12,-189 12,-189 170,-189 170,-189 176,-189 182,-195 182,-201 182,-201 182,-241 182,-241 182,-247 176,-253 170,-253\"/>\n",
"<text text-anchor=\"start\" x=\"11\" y=\"-231.8\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">incremental_records</text>\n",
"<text text-anchor=\"start\" x=\"52.5\" y=\"-203.8\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">DataFrame</text>\n",
"</g>\n",
"<!-- insert_records_query -->\n",
"<g id=\"node5\" class=\"node\">\n",
"<title>insert_records_query</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M340,-158C340,-158 178,-158 178,-158 172,-158 166,-152 166,-146 166,-146 166,-106 166,-106 166,-100 172,-94 178,-94 178,-94 340,-94 340,-94 346,-94 352,-100 352,-106 352,-106 352,-146 352,-146 352,-152 346,-158 340,-158\"/>\n",
"<text text-anchor=\"start\" x=\"177\" y=\"-136.8\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">insert_records_query</text>\n",
"<text text-anchor=\"start\" x=\"246\" y=\"-108.8\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">dict</text>\n",
"</g>\n",
"<!-- incremental_records&#45;&gt;insert_records_query -->\n",
"<g id=\"edge8\" class=\"edge\">\n",
"<title>incremental_records&#45;&gt;insert_records_query</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M147.41,-188.77C162.31,-180.52 178.51,-171.55 193.84,-163.07\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"195.69,-166.05 202.74,-158.14 192.3,-159.93 195.69,-166.05\"/>\n",
"</g>\n",
"<!-- existing_records -->\n",
"<g id=\"node2\" class=\"node\">\n",
"<title>existing_records</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M154.5,-348C154.5,-348 27.5,-348 27.5,-348 21.5,-348 15.5,-342 15.5,-336 15.5,-336 15.5,-296 15.5,-296 15.5,-290 21.5,-284 27.5,-284 27.5,-284 154.5,-284 154.5,-284 160.5,-284 166.5,-290 166.5,-296 166.5,-296 166.5,-336 166.5,-336 166.5,-342 160.5,-348 154.5,-348\"/>\n",
"<text text-anchor=\"start\" x=\"26.5\" y=\"-326.8\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">existing_records</text>\n",
"<text text-anchor=\"start\" x=\"52.5\" y=\"-298.8\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">DataFrame</text>\n",
"</g>\n",
"<!-- existing_records&#45;&gt;incremental_records -->\n",
"<g id=\"edge1\" class=\"edge\">\n",
"<title>existing_records&#45;&gt;incremental_records</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M91,-283.77C91,-277.33 91,-270.45 91,-263.71\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"94.5,-263.3 91,-253.3 87.5,-263.3 94.5,-263.3\"/>\n",
"</g>\n",
"<!-- get_records_query -->\n",
"<g id=\"node3\" class=\"node\">\n",
"<title>get_records_query</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M292.5,-442C292.5,-442 149.5,-442 149.5,-442 143.5,-442 137.5,-436 137.5,-430 137.5,-430 137.5,-390 137.5,-390 137.5,-384 143.5,-378 149.5,-378 149.5,-378 292.5,-378 292.5,-378 298.5,-378 304.5,-384 304.5,-390 304.5,-390 304.5,-430 304.5,-430 304.5,-436 298.5,-442 292.5,-442\"/>\n",
"<text text-anchor=\"start\" x=\"148.5\" y=\"-420.8\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">get_records_query</text>\n",
"<text text-anchor=\"start\" x=\"208\" y=\"-392.8\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">dict</text>\n",
"</g>\n",
"<!-- get_records_query&#45;&gt;existing_records -->\n",
"<g id=\"edge3\" class=\"edge\">\n",
"<title>get_records_query&#45;&gt;existing_records</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M176.99,-377.85C166.19,-370.21 154.52,-361.95 143.37,-354.06\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"145.15,-351.03 134.97,-348.11 141.11,-356.75 145.15,-351.03\"/>\n",
"</g>\n",
"<!-- google_service -->\n",
"<g id=\"node4\" class=\"node\">\n",
"<title>google_service</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M432.5,-631C432.5,-631 317.5,-631 317.5,-631 311.5,-631 305.5,-625 305.5,-619 305.5,-619 305.5,-579 305.5,-579 305.5,-573 311.5,-567 317.5,-567 317.5,-567 432.5,-567 432.5,-567 438.5,-567 444.5,-573 444.5,-579 444.5,-579 444.5,-619 444.5,-619 444.5,-625 438.5,-631 432.5,-631\"/>\n",
"<text text-anchor=\"start\" x=\"316.5\" y=\"-609.8\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">google_service</text>\n",
"<text text-anchor=\"start\" x=\"342.5\" y=\"-581.8\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Resource</text>\n",
"</g>\n",
"<!-- google_service&#45;&gt;get_records_query -->\n",
"<g id=\"edge5\" class=\"edge\">\n",
"<title>google_service&#45;&gt;get_records_query</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M406.37,-566.62C419.8,-549.54 431.66,-527.37 428,-505\"/>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M428,-503C419.66,-452.06 365.46,-429.17 314.62,-418.96\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"314.98,-415.47 304.51,-417.08 313.7,-422.35 314.98,-415.47\"/>\n",
"</g>\n",
"<!-- google_service&#45;&gt;insert_records_query -->\n",
"<g id=\"edge9\" class=\"edge\">\n",
"<title>google_service&#45;&gt;insert_records_query</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M428,-503C414.64,-421.42 462.23,-392.25 428,-317\"/>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M428,-315C386.59,-232.28 288.41,-324.91 229,-254 208.74,-229.82 219.4,-194.36 233.33,-167.06\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"236.48,-168.6 238.15,-158.14 230.32,-165.28 236.48,-168.6\"/>\n",
"</g>\n",
"<!-- format_spreadsheet_query -->\n",
"<g id=\"node6\" class=\"node\">\n",
"<title>format_spreadsheet_query</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M532.5,-64C532.5,-64 323.5,-64 323.5,-64 317.5,-64 311.5,-58 311.5,-52 311.5,-52 311.5,-12 311.5,-12 311.5,-6 317.5,0 323.5,0 323.5,0 532.5,0 532.5,0 538.5,0 544.5,-6 544.5,-12 544.5,-12 544.5,-52 544.5,-52 544.5,-58 538.5,-64 532.5,-64\"/>\n",
"<text text-anchor=\"start\" x=\"322.5\" y=\"-42.8\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">format_spreadsheet_query</text>\n",
"<text text-anchor=\"start\" x=\"415\" y=\"-14.8\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">dict</text>\n",
"</g>\n",
"<!-- google_service&#45;&gt;format_spreadsheet_query -->\n",
"<g id=\"edge12\" class=\"edge\">\n",
"<title>google_service&#45;&gt;format_spreadsheet_query</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M428,-315C408.19,-273.73 417.32,-142.29 423.68,-74.43\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"427.2,-74.39 424.67,-64.1 420.23,-73.72 427.2,-74.39\"/>\n",
"</g>\n",
"<!-- create_header_query -->\n",
"<g id=\"node7\" class=\"node\">\n",
"<title>create_header_query</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M208,-536C208,-536 44,-536 44,-536 38,-536 32,-530 32,-524 32,-524 32,-484 32,-484 32,-478 38,-472 44,-472 44,-472 208,-472 208,-472 214,-472 220,-478 220,-484 220,-484 220,-524 220,-524 220,-530 214,-536 208,-536\"/>\n",
"<text text-anchor=\"start\" x=\"43\" y=\"-514.8\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">create_header_query</text>\n",
"<text text-anchor=\"start\" x=\"113\" y=\"-486.8\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">dict</text>\n",
"</g>\n",
"<!-- google_service&#45;&gt;create_header_query -->\n",
"<g id=\"edge14\" class=\"edge\">\n",
"<title>google_service&#45;&gt;create_header_query</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M305.31,-571.97C278.55,-561.98 247.57,-550.41 218.97,-539.73\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"219.97,-536.36 209.38,-536.14 217.52,-542.92 219.97,-536.36\"/>\n",
"</g>\n",
"<!-- insert_records_query&#45;&gt;format_spreadsheet_query -->\n",
"<g id=\"edge11\" class=\"edge\">\n",
"<title>insert_records_query&#45;&gt;format_spreadsheet_query</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M316.22,-93.85C330.94,-85.83 346.91,-77.14 362.05,-68.9\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"363.73,-71.97 370.84,-64.11 360.39,-65.82 363.73,-71.97\"/>\n",
"</g>\n",
"<!-- create_header_query&#45;&gt;get_records_query -->\n",
"<g id=\"edge4\" class=\"edge\">\n",
"<title>create_header_query&#45;&gt;get_records_query</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M158.16,-471.85C165.59,-464.66 173.58,-456.92 181.29,-449.46\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"184.12,-451.59 188.87,-442.11 179.25,-446.56 184.12,-451.59\"/>\n",
"</g>\n",
"<!-- _incremental_records_inputs -->\n",
"<g id=\"node8\" class=\"node\">\n",
"<title>_incremental_records_inputs</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"383,-349 185,-349 185,-283 383,-283 383,-349\"/>\n",
"<text text-anchor=\"start\" x=\"200\" y=\"-322.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">primary_key</text>\n",
"<text text-anchor=\"start\" x=\"320.5\" y=\"-322.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">str</text>\n",
"<text text-anchor=\"start\" x=\"208\" y=\"-301.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">timesheet</text>\n",
"<text text-anchor=\"start\" x=\"292\" y=\"-301.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">DataFrame</text>\n",
"</g>\n",
"<!-- _incremental_records_inputs&#45;&gt;incremental_records -->\n",
"<g id=\"edge2\" class=\"edge\">\n",
"<title>_incremental_records_inputs&#45;&gt;incremental_records</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M217.58,-282.99C200.65,-274.84 182.35,-266.02 165.07,-257.69\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"166.11,-254.31 155.58,-253.12 163.07,-260.62 166.11,-254.31\"/>\n",
"</g>\n",
"<!-- _get_records_query_inputs -->\n",
"<g id=\"node9\" class=\"node\">\n",
"<title>_get_records_query_inputs</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"399.5,-537 238.5,-537 238.5,-471 399.5,-471 399.5,-537\"/>\n",
"<text text-anchor=\"start\" x=\"278\" y=\"-510.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">sheet_id</text>\n",
"<text text-anchor=\"start\" x=\"366\" y=\"-510.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">str</text>\n",
"<text text-anchor=\"start\" x=\"254\" y=\"-489.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">spreadsheet_id</text>\n",
"<text text-anchor=\"start\" x=\"366\" y=\"-489.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">str</text>\n",
"</g>\n",
"<!-- _get_records_query_inputs&#45;&gt;get_records_query -->\n",
"<g id=\"edge6\" class=\"edge\">\n",
"<title>_get_records_query_inputs&#45;&gt;get_records_query</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M284.72,-470.82C277.27,-463.83 269.32,-456.36 261.65,-449.16\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"263.79,-446.37 254.1,-442.07 259,-451.47 263.79,-446.37\"/>\n",
"</g>\n",
"<!-- _google_service_inputs -->\n",
"<g id=\"node10\" class=\"node\">\n",
"<title>_google_service_inputs</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"444,-706 306,-706 306,-661 444,-661 444,-706\"/>\n",
"<text text-anchor=\"start\" x=\"321\" y=\"-679.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">credentials</text>\n",
"<text text-anchor=\"start\" x=\"403\" y=\"-679.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">dict</text>\n",
"</g>\n",
"<!-- _google_service_inputs&#45;&gt;google_service -->\n",
"<g id=\"edge7\" class=\"edge\">\n",
"<title>_google_service_inputs&#45;&gt;google_service</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M375,-660.82C375,-654.87 375,-648.21 375,-641.54\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"378.5,-641.15 375,-631.15 371.5,-641.15 378.5,-641.15\"/>\n",
"</g>\n",
"<!-- _insert_records_query_inputs -->\n",
"<g id=\"node11\" class=\"node\">\n",
"<title>_insert_records_query_inputs</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"399.5,-254 238.5,-254 238.5,-188 399.5,-188 399.5,-254\"/>\n",
"<text text-anchor=\"start\" x=\"278\" y=\"-227.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">sheet_id</text>\n",
"<text text-anchor=\"start\" x=\"366\" y=\"-227.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">str</text>\n",
"<text text-anchor=\"start\" x=\"254\" y=\"-206.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">spreadsheet_id</text>\n",
"<text text-anchor=\"start\" x=\"366\" y=\"-206.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">str</text>\n",
"</g>\n",
"<!-- _insert_records_query_inputs&#45;&gt;insert_records_query -->\n",
"<g id=\"edge10\" class=\"edge\">\n",
"<title>_insert_records_query_inputs&#45;&gt;insert_records_query</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M298.35,-187.99C293.97,-181.2 289.28,-173.94 284.74,-166.89\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"287.44,-164.63 279.08,-158.12 281.56,-168.42 287.44,-164.63\"/>\n",
"</g>\n",
"<!-- _format_spreadsheet_query_inputs -->\n",
"<g id=\"node12\" class=\"node\">\n",
"<title>_format_spreadsheet_query_inputs</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"617.5,-159 456.5,-159 456.5,-93 617.5,-93 617.5,-159\"/>\n",
"<text text-anchor=\"start\" x=\"496\" y=\"-132.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">sheet_id</text>\n",
"<text text-anchor=\"start\" x=\"584\" y=\"-132.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">str</text>\n",
"<text text-anchor=\"start\" x=\"472\" y=\"-111.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">spreadsheet_id</text>\n",
"<text text-anchor=\"start\" x=\"584\" y=\"-111.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">str</text>\n",
"</g>\n",
"<!-- _format_spreadsheet_query_inputs&#45;&gt;format_spreadsheet_query -->\n",
"<g id=\"edge13\" class=\"edge\">\n",
"<title>_format_spreadsheet_query_inputs&#45;&gt;format_spreadsheet_query</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M498.88,-92.82C490.42,-85.68 481.38,-78.05 472.68,-70.71\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"474.72,-67.85 464.82,-64.07 470.2,-73.2 474.72,-67.85\"/>\n",
"</g>\n",
"<!-- _create_header_query_inputs -->\n",
"<g id=\"node13\" class=\"node\">\n",
"<title>_create_header_query_inputs</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"235,-632 17,-632 17,-566 235,-566 235,-632\"/>\n",
"<text text-anchor=\"start\" x=\"50\" y=\"-605.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">timesheet</text>\n",
"<text text-anchor=\"start\" x=\"144\" y=\"-605.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">DataFrame</text>\n",
"<text text-anchor=\"start\" x=\"32\" y=\"-584.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">spreadsheet_id</text>\n",
"<text text-anchor=\"start\" x=\"172.5\" y=\"-584.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">str</text>\n",
"</g>\n",
"<!-- _create_header_query_inputs&#45;&gt;create_header_query -->\n",
"<g id=\"edge15\" class=\"edge\">\n",
"<title>_create_header_query_inputs&#45;&gt;create_header_query</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M126,-565.99C126,-559.65 126,-552.91 126,-546.3\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"129.5,-546.12 126,-536.12 122.5,-546.12 129.5,-546.12\"/>\n",
"</g>\n",
"<!-- input -->\n",
"<g id=\"node14\" class=\"node\">\n",
"<title>input</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"619.5,-702 560.5,-702 560.5,-665 619.5,-665 619.5,-702\"/>\n",
"<text text-anchor=\"middle\" x=\"590\" y=\"-679.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">input</text>\n",
"</g>\n",
"<!-- function -->\n",
"<g id=\"node15\" class=\"node\">\n",
"<title>function</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M530,-702C530,-702 474,-702 474,-702 468,-702 462,-696 462,-690 462,-690 462,-677 462,-677 462,-671 468,-665 474,-665 474,-665 530,-665 530,-665 536,-665 542,-671 542,-677 542,-677 542,-690 542,-690 542,-696 536,-702 530,-702\"/>\n",
"<text text-anchor=\"middle\" x=\"502\" y=\"-679.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">function</text>\n",
"</g>\n",
"</g>\n",
"</svg>\n"
],
"text/plain": [
"<graphviz.graphs.Digraph at 0x7ff26599dcd0>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"%%cell_to_module google_sheets_module --display display_conf\n",
"# --display enables visualization, it receives the variable `display_conf`\n",
"import pandas as pd\n",
"from google.oauth2.service_account import Credentials\n",
"from googleapiclient.discovery import build, Resource\n",
"\n",
"\n",
"def google_service(credentials: dict) -> Resource:\n",
" \"\"\"Create a Google Resource used to interact with Google Sheets.\"\"\"\n",
" return build('sheets', 'v4', credentials=Credentials.from_service_account_info(credentials))\n",
"\n",
"\n",
"def create_header_query(\n",
" timesheet: pd.DataFrame,\n",
" google_service: Resource,\n",
" spreadsheet_id: str,\n",
") -> dict:\n",
" \"\"\"Query to create the header of our timesheet\"\"\"\n",
" columns = timesheet.columns.to_list()\n",
" header_query = (\n",
" google_service.spreadsheets()\n",
" .values()\n",
" .update(\n",
" spreadsheetId=spreadsheet_id,\n",
" range=\"Sheet1!A1\",\n",
" valueInputOption='RAW',\n",
" body={\"values\": [columns]},\n",
" )\n",
" )\n",
" return header_query.execute()\n",
"\n",
"\n",
"def get_records_query(\n",
" create_header_query: dict,\n",
" google_service: Resource,\n",
" spreadsheet_id: str,\n",
" sheet_id: str,\n",
") -> dict:\n",
" \"\"\"Query to get the existing values in our timesheet.\n",
"\n",
" `create_header_query` is a dependency even though it doesn't \n",
" pass any data because it needs to be executed first to enable\n",
" parsing sheets values into a dataframe.\n",
"\n",
" Parsing the query results in a separate function, facilitates\n",
" debugging by letting us inspect the query response. \n",
" \"\"\"\n",
" query = (\n",
" google_service.spreadsheets()\n",
" .values()\n",
" .get(\n",
" spreadsheetId=spreadsheet_id,\n",
" range=sheet_id,\n",
" majorDimension=\"ROWS\"\n",
" )\n",
" )\n",
" return query.execute()\n",
"\n",
"\n",
"def existing_records(get_records_query: dict) -> pd.DataFrame:\n",
" \"\"\"Parse the existing records into a dataframe.\n",
" The first row contains the header/columns, followed by daily\n",
" timesheet records.\n",
" \"\"\"\n",
" columns = get_records_query[\"values\"][0]\n",
" rows = get_records_query[\"values\"][1:]\n",
" return pd.DataFrame(rows, columns=columns)\n",
" \n",
"\n",
"def incremental_records(\n",
" timesheet: pd.DataFrame,\n",
" existing_records: pd.DataFrame,\n",
" primary_key: str = \"date\",\n",
") -> pd.DataFrame:\n",
" \"\"\"Check the `primary_key` in the current timesheet and the existing records\n",
" and narrow it to new keys. Only return new records, sorted by `date`.\n",
"\n",
" Since `existing_records()` returns JSON-serializable types, we have to make sure\n",
" both dataframes have the same type for the primary key. For example, `timesheet`\n",
" would have column `date: Date`, but `existing_records` will have `date: str`\n",
" \"\"\"\n",
" timesheet[primary_key] = timesheet[primary_key].astype(str)\n",
"\n",
" existing_keys = set(existing_records[primary_key])\n",
" new_keys = set(timesheet[primary_key]).difference(existing_keys)\n",
" return timesheet.loc[timesheet[primary_key].isin(new_keys)].sort_values(\"date\")\n",
"\n",
"\n",
"def insert_records_query(\n",
" incremental_records: pd.DataFrame,\n",
" google_service: Resource,\n",
" spreadsheet_id: str,\n",
" sheet_id: str,\n",
") -> dict:\n",
" \"\"\"Query to insert new timesheet records in Google Sheets.\n",
"\n",
" Google expects row-major entries where each row is an iterable (i.e., list, tuple)\n",
" Therefore, we need to unpack the dictionary. Doing `tuple(record.values())` ensures\n",
" the order of the elements respects the order of the columns specified by `timesheet.columns.to_list()`\n",
" in `create_header_query()`.\n",
" \"\"\"\n",
" # convert Date objects to string to make JSON-serializable\n",
" incremental_records[\"date\"] = incremental_records[\"date\"].astype(str)\n",
" rows = [\n",
" tuple(record.values()) for record\n",
" in incremental_records.to_dict(orient=\"records\")\n",
" ]\n",
" query = (\n",
" google_service.spreadsheets()\n",
" .values()\n",
" .append(\n",
" spreadsheetId=spreadsheet_id,\n",
" range=sheet_id,\n",
" valueInputOption='RAW',\n",
" insertDataOption='OVERWRITE',\n",
" body={\"values\": rows},\n",
" )\n",
" )\n",
" query_results = query.execute()\n",
" return query_results\n",
"\n",
"\n",
"def format_spreadsheet_query(\n",
" insert_records_query: dict,\n",
" google_service: Resource,\n",
" spreadsheet_id: str,\n",
" sheet_id: str,\n",
") -> dict:\n",
" \"\"\"Get all records after updating the timesheet and style rows\n",
" by adding a border whenever the `week_number` changes\n",
" \"\"\"\n",
" # use a node function directly\n",
" records = get_records_query({}, google_service, spreadsheet_id, sheet_id)\n",
"\n",
" requests = []\n",
" current_week = None\n",
" for idx, record in enumerate(records[\"values\"]):\n",
" columns = records[\"values\"][0]\n",
" week_number_idx = columns.index(\"week_number\")\n",
" \n",
" if current_week != record[week_number_idx]:\n",
" current_week = record[week_number_idx]\n",
"\n",
" requests.append({\n",
" \"updateBorders\": {\n",
" \"range\": {\n",
" \"sheetId\": 0,\n",
" \"startRowIndex\": idx,\n",
" \"endRowIndex\": idx+1,\n",
" \"startColumnIndex\": 0,\n",
" \"endColumnIndex\": len(columns),\n",
" },\n",
" \"top\": {\"style\": \"SOLID\", \"width\": 2 }\n",
" }\n",
" })\n",
"\n",
" return google_service.spreadsheets().batchUpdate(spreadsheetId=spreadsheet_id, body={\"requests\": requests}).execute()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can now build a new driver that includes both our `timesheet_module` and our `google_sheets_module`. See how the two dataflows connect seamlessly."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"image/svg+xml": [
"<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n",
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
"<!-- Generated by graphviz version 2.43.0 (0)\n",
" -->\n",
"<!-- Title: %3 Pages: 1 -->\n",
"<svg width=\"856pt\" height=\"1034pt\"\n",
" viewBox=\"0.00 0.00 855.50 1033.50\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n",
"<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 1029.5)\">\n",
"<title>%3</title>\n",
"<polygon fill=\"white\" stroke=\"transparent\" points=\"-4,4 -4,-1029.5 851.5,-1029.5 851.5,4 -4,4\"/>\n",
"<g id=\"clust1\" class=\"cluster\">\n",
"<title>cluster__legend</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" points=\"247,-941.5 247,-1017.5 498,-1017.5 498,-941.5 247,-941.5\"/>\n",
"<text text-anchor=\"middle\" x=\"372.5\" y=\"-1002.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">Legend</text>\n",
"</g>\n",
"<!-- mode -->\n",
"<g id=\"node1\" class=\"node\">\n",
"<title>mode</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" points=\"54,-993 0,-993 0,-943 60,-943 60,-987 54,-993\"/>\n",
"<polyline fill=\"none\" stroke=\"black\" points=\"54,-993 54,-987 \"/>\n",
"<polyline fill=\"none\" stroke=\"black\" points=\"60,-987 54,-987 \"/>\n",
"<text text-anchor=\"start\" x=\"8\" y=\"-978.8\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">mode</text>\n",
"<text text-anchor=\"start\" x=\"10\" y=\"-950.8\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">demo</text>\n",
"</g>\n",
"<!-- incremental_records -->\n",
"<g id=\"node2\" class=\"node\">\n",
"<title>incremental_records</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M400,-253C400,-253 242,-253 242,-253 236,-253 230,-247 230,-241 230,-241 230,-201 230,-201 230,-195 236,-189 242,-189 242,-189 400,-189 400,-189 406,-189 412,-195 412,-201 412,-201 412,-241 412,-241 412,-247 406,-253 400,-253\"/>\n",
"<text text-anchor=\"start\" x=\"241\" y=\"-231.8\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">incremental_records</text>\n",
"<text text-anchor=\"start\" x=\"282.5\" y=\"-203.8\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">DataFrame</text>\n",
"</g>\n",
"<!-- insert_records_query -->\n",
"<g id=\"node11\" class=\"node\">\n",
"<title>insert_records_query</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M570,-158C570,-158 408,-158 408,-158 402,-158 396,-152 396,-146 396,-146 396,-106 396,-106 396,-100 402,-94 408,-94 408,-94 570,-94 570,-94 576,-94 582,-100 582,-106 582,-106 582,-146 582,-146 582,-152 576,-158 570,-158\"/>\n",
"<text text-anchor=\"start\" x=\"407\" y=\"-136.8\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">insert_records_query</text>\n",
"<text text-anchor=\"start\" x=\"476\" y=\"-108.8\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">dict</text>\n",
"</g>\n",
"<!-- incremental_records&#45;&gt;insert_records_query -->\n",
"<g id=\"edge14\" class=\"edge\">\n",
"<title>incremental_records&#45;&gt;insert_records_query</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M377.41,-188.77C392.31,-180.52 408.51,-171.55 423.84,-163.07\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"425.69,-166.05 432.74,-158.14 422.3,-159.93 425.69,-166.05\"/>\n",
"</g>\n",
"<!-- input_data -->\n",
"<g id=\"node3\" class=\"node\">\n",
"<title>input_data</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M224.5,-1000C224.5,-1000 89.5,-1000 89.5,-1000 83.5,-1000 77.5,-994 77.5,-988 77.5,-988 77.5,-948 77.5,-948 77.5,-942 83.5,-936 89.5,-936 89.5,-936 224.5,-936 224.5,-936 230.5,-936 236.5,-942 236.5,-948 236.5,-948 236.5,-988 236.5,-988 236.5,-994 230.5,-1000 224.5,-1000\"/>\n",
"<text text-anchor=\"start\" x=\"88.5\" y=\"-978.8\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">input_data: mode</text>\n",
"<text text-anchor=\"start\" x=\"147.5\" y=\"-950.8\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">str</text>\n",
"</g>\n",
"<!-- intervals -->\n",
"<g id=\"node6\" class=\"node\">\n",
"<title>intervals</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M191,-907C191,-907 123,-907 123,-907 117,-907 111,-901 111,-895 111,-895 111,-855 111,-855 111,-849 117,-843 123,-843 123,-843 191,-843 191,-843 197,-843 203,-849 203,-855 203,-855 203,-895 203,-895 203,-901 197,-907 191,-907\"/>\n",
"<text text-anchor=\"start\" x=\"122\" y=\"-885.8\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">intervals</text>\n",
"<text text-anchor=\"start\" x=\"146.5\" y=\"-857.8\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">list</text>\n",
"</g>\n",
"<!-- input_data&#45;&gt;intervals -->\n",
"<g id=\"edge6\" class=\"edge\">\n",
"<title>input_data&#45;&gt;intervals</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M157,-935.94C157,-930 157,-923.7 157,-917.49\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"160.5,-917.23 157,-907.23 153.5,-917.23 160.5,-917.23\"/>\n",
"</g>\n",
"<!-- daily_df -->\n",
"<g id=\"node4\" class=\"node\">\n",
"<title>daily_df</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M263.5,-721C263.5,-721 188.5,-721 188.5,-721 182.5,-721 176.5,-715 176.5,-709 176.5,-709 176.5,-669 176.5,-669 176.5,-663 182.5,-657 188.5,-657 188.5,-657 263.5,-657 263.5,-657 269.5,-657 275.5,-663 275.5,-669 275.5,-669 275.5,-709 275.5,-709 275.5,-715 269.5,-721 263.5,-721\"/>\n",
"<text text-anchor=\"start\" x=\"195.5\" y=\"-699.8\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">daily_df</text>\n",
"<text text-anchor=\"start\" x=\"187.5\" y=\"-671.8\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">DataFrame</text>\n",
"</g>\n",
"<!-- timesheet -->\n",
"<g id=\"node7\" class=\"node\">\n",
"<title>timesheet</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M265,-628C265,-628 187,-628 187,-628 181,-628 175,-622 175,-616 175,-616 175,-576 175,-576 175,-570 181,-564 187,-564 187,-564 265,-564 265,-564 271,-564 277,-570 277,-576 277,-576 277,-616 277,-616 277,-622 271,-628 265,-628\"/>\n",
"<text text-anchor=\"start\" x=\"186\" y=\"-606.8\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">timesheet</text>\n",
"<text text-anchor=\"start\" x=\"187.5\" y=\"-578.8\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">DataFrame</text>\n",
"</g>\n",
"<!-- daily_df&#45;&gt;timesheet -->\n",
"<g id=\"edge7\" class=\"edge\">\n",
"<title>daily_df&#45;&gt;timesheet</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M226,-656.94C226,-651 226,-644.7 226,-638.49\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"229.5,-638.23 226,-628.23 222.5,-638.23 229.5,-638.23\"/>\n",
"</g>\n",
"<!-- existing_records -->\n",
"<g id=\"node5\" class=\"node\">\n",
"<title>existing_records</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M393.5,-347C393.5,-347 266.5,-347 266.5,-347 260.5,-347 254.5,-341 254.5,-335 254.5,-335 254.5,-295 254.5,-295 254.5,-289 260.5,-283 266.5,-283 266.5,-283 393.5,-283 393.5,-283 399.5,-283 405.5,-289 405.5,-295 405.5,-295 405.5,-335 405.5,-335 405.5,-341 399.5,-347 393.5,-347\"/>\n",
"<text text-anchor=\"start\" x=\"265.5\" y=\"-325.8\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">existing_records</text>\n",
"<text text-anchor=\"start\" x=\"291.5\" y=\"-297.8\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">DataFrame</text>\n",
"</g>\n",
"<!-- existing_records&#45;&gt;incremental_records -->\n",
"<g id=\"edge2\" class=\"edge\">\n",
"<title>existing_records&#45;&gt;incremental_records</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M326.95,-282.85C326.34,-276.56 325.68,-269.85 325.04,-263.27\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"328.5,-262.73 324.04,-253.11 321.53,-263.41 328.5,-262.73\"/>\n",
"</g>\n",
"<!-- intervals_df -->\n",
"<g id=\"node9\" class=\"node\">\n",
"<title>intervals_df</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M271,-814C271,-814 181,-814 181,-814 175,-814 169,-808 169,-802 169,-802 169,-762 169,-762 169,-756 175,-750 181,-750 181,-750 271,-750 271,-750 277,-750 283,-756 283,-762 283,-762 283,-802 283,-802 283,-808 277,-814 271,-814\"/>\n",
"<text text-anchor=\"start\" x=\"180\" y=\"-792.8\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">intervals_df</text>\n",
"<text text-anchor=\"start\" x=\"187.5\" y=\"-764.8\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">DataFrame</text>\n",
"</g>\n",
"<!-- intervals&#45;&gt;intervals_df -->\n",
"<g id=\"edge11\" class=\"edge\">\n",
"<title>intervals&#45;&gt;intervals_df</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M180.55,-842.94C185.6,-836.28 191,-829.15 196.25,-822.23\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"199.07,-824.31 202.32,-814.23 193.49,-820.09 199.07,-824.31\"/>\n",
"</g>\n",
"<!-- timesheet&#45;&gt;incremental_records -->\n",
"<g id=\"edge1\" class=\"edge\">\n",
"<title>timesheet&#45;&gt;incremental_records</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M220.17,-563.62C210.87,-504.81 198.25,-375.63 246,-283 250.29,-274.67 256.28,-267.08 263.01,-260.29\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"265.64,-262.63 270.57,-253.25 260.87,-257.5 265.64,-262.63\"/>\n",
"</g>\n",
"<!-- create_header_query -->\n",
"<g id=\"node12\" class=\"node\">\n",
"<title>create_header_query</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M438,-534C438,-534 274,-534 274,-534 268,-534 262,-528 262,-522 262,-522 262,-482 262,-482 262,-476 268,-470 274,-470 274,-470 438,-470 438,-470 444,-470 450,-476 450,-482 450,-482 450,-522 450,-522 450,-528 444,-534 438,-534\"/>\n",
"<text text-anchor=\"start\" x=\"273\" y=\"-512.8\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">create_header_query</text>\n",
"<text text-anchor=\"start\" x=\"343\" y=\"-484.8\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">dict</text>\n",
"</g>\n",
"<!-- timesheet&#45;&gt;create_header_query -->\n",
"<g id=\"edge17\" class=\"edge\">\n",
"<title>timesheet&#45;&gt;create_header_query</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M270.01,-563.85C280.81,-556.21 292.48,-547.95 303.63,-540.06\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"305.89,-542.75 312.03,-534.11 301.85,-537.03 305.89,-542.75\"/>\n",
"</g>\n",
"<!-- get_records_query -->\n",
"<g id=\"node8\" class=\"node\">\n",
"<title>get_records_query</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M522.5,-440C522.5,-440 379.5,-440 379.5,-440 373.5,-440 367.5,-434 367.5,-428 367.5,-428 367.5,-388 367.5,-388 367.5,-382 373.5,-376 379.5,-376 379.5,-376 522.5,-376 522.5,-376 528.5,-376 534.5,-382 534.5,-388 534.5,-388 534.5,-428 534.5,-428 534.5,-434 528.5,-440 522.5,-440\"/>\n",
"<text text-anchor=\"start\" x=\"378.5\" y=\"-418.8\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">get_records_query</text>\n",
"<text text-anchor=\"start\" x=\"438\" y=\"-390.8\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">dict</text>\n",
"</g>\n",
"<!-- get_records_query&#45;&gt;existing_records -->\n",
"<g id=\"edge5\" class=\"edge\">\n",
"<title>get_records_query&#45;&gt;existing_records</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M409.7,-375.94C400.07,-368.7 389.72,-360.91 379.77,-353.43\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"381.62,-350.44 371.53,-347.23 377.42,-356.04 381.62,-350.44\"/>\n",
"</g>\n",
"<!-- intervals_df&#45;&gt;daily_df -->\n",
"<g id=\"edge4\" class=\"edge\">\n",
"<title>intervals_df&#45;&gt;daily_df</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M226,-749.94C226,-744 226,-737.7 226,-731.49\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"229.5,-731.23 226,-721.23 222.5,-731.23 229.5,-731.23\"/>\n",
"</g>\n",
"<!-- google_service -->\n",
"<g id=\"node10\" class=\"node\">\n",
"<title>google_service</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M654.5,-628C654.5,-628 539.5,-628 539.5,-628 533.5,-628 527.5,-622 527.5,-616 527.5,-616 527.5,-576 527.5,-576 527.5,-570 533.5,-564 539.5,-564 539.5,-564 654.5,-564 654.5,-564 660.5,-564 666.5,-570 666.5,-576 666.5,-576 666.5,-616 666.5,-616 666.5,-622 660.5,-628 654.5,-628\"/>\n",
"<text text-anchor=\"start\" x=\"538.5\" y=\"-606.8\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">google_service</text>\n",
"<text text-anchor=\"start\" x=\"564.5\" y=\"-578.8\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">Resource</text>\n",
"</g>\n",
"<!-- google_service&#45;&gt;get_records_query -->\n",
"<g id=\"edge9\" class=\"edge\">\n",
"<title>google_service&#45;&gt;get_records_query</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M633.23,-563.59C647.96,-547.09 660.93,-525.62 658,-503\"/>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M658,-501C651.31,-449.43 596.31,-426.59 544.69,-416.56\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"545.22,-413.1 534.76,-414.77 543.98,-419.99 545.22,-413.1\"/>\n",
"</g>\n",
"<!-- google_service&#45;&gt;insert_records_query -->\n",
"<g id=\"edge15\" class=\"edge\">\n",
"<title>google_service&#45;&gt;insert_records_query</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M658,-501C647.43,-419.46 692.24,-390.75 658,-316\"/>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M658,-314C640.08,-278.47 612,-294.81 574,-283 524.08,-267.48 492.09,-295.27 460,-254 440.82,-229.32 450.8,-194.28 464.04,-167.25\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"467.32,-168.54 468.82,-158.05 461.11,-165.31 467.32,-168.54\"/>\n",
"</g>\n",
"<!-- google_service&#45;&gt;create_header_query -->\n",
"<g id=\"edge18\" class=\"edge\">\n",
"<title>google_service&#45;&gt;create_header_query</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M527.29,-568.39C502.28,-558.84 473.7,-547.93 447.14,-537.79\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"448.05,-534.39 437.46,-534.1 445.55,-540.93 448.05,-534.39\"/>\n",
"</g>\n",
"<!-- format_spreadsheet_query -->\n",
"<g id=\"node13\" class=\"node\">\n",
"<title>format_spreadsheet_query</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M762.5,-64C762.5,-64 553.5,-64 553.5,-64 547.5,-64 541.5,-58 541.5,-52 541.5,-52 541.5,-12 541.5,-12 541.5,-6 547.5,0 553.5,0 553.5,0 762.5,0 762.5,0 768.5,0 774.5,-6 774.5,-12 774.5,-12 774.5,-52 774.5,-52 774.5,-58 768.5,-64 762.5,-64\"/>\n",
"<text text-anchor=\"start\" x=\"552.5\" y=\"-42.8\" font-family=\"Helvetica,sans-Serif\" font-weight=\"bold\" font-size=\"14.00\">format_spreadsheet_query</text>\n",
"<text text-anchor=\"start\" x=\"645\" y=\"-14.8\" font-family=\"Helvetica,sans-Serif\" font-style=\"italic\" font-size=\"14.00\">dict</text>\n",
"</g>\n",
"<!-- google_service&#45;&gt;format_spreadsheet_query -->\n",
"<g id=\"edge21\" class=\"edge\">\n",
"<title>google_service&#45;&gt;format_spreadsheet_query</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M658,-314C638.13,-272.94 647.29,-141.93 653.67,-74.29\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"657.19,-74.29 654.66,-64 650.22,-73.62 657.19,-74.29\"/>\n",
"</g>\n",
"<!-- insert_records_query&#45;&gt;format_spreadsheet_query -->\n",
"<g id=\"edge20\" class=\"edge\">\n",
"<title>insert_records_query&#45;&gt;format_spreadsheet_query</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M546.22,-93.85C560.94,-85.83 576.91,-77.14 592.05,-68.9\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"593.73,-71.97 600.84,-64.11 590.39,-65.82 593.73,-71.97\"/>\n",
"</g>\n",
"<!-- create_header_query&#45;&gt;get_records_query -->\n",
"<g id=\"edge8\" class=\"edge\">\n",
"<title>create_header_query&#45;&gt;get_records_query</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M388.16,-469.85C395.59,-462.66 403.58,-454.92 411.29,-447.46\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"414.12,-449.59 418.87,-440.11 409.25,-444.56 414.12,-449.59\"/>\n",
"</g>\n",
"<!-- _incremental_records_inputs -->\n",
"<g id=\"node14\" class=\"node\">\n",
"<title>_incremental_records_inputs</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"564.5,-337.5 423.5,-337.5 423.5,-292.5 564.5,-292.5 564.5,-337.5\"/>\n",
"<text text-anchor=\"start\" x=\"439\" y=\"-310.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">primary_key</text>\n",
"<text text-anchor=\"start\" x=\"531\" y=\"-310.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">str</text>\n",
"</g>\n",
"<!-- _incremental_records_inputs&#45;&gt;incremental_records -->\n",
"<g id=\"edge3\" class=\"edge\">\n",
"<title>_incremental_records_inputs&#45;&gt;incremental_records</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M453.44,-292.43C434.04,-282.11 410.3,-269.49 388.3,-257.79\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"389.8,-254.62 379.33,-253.02 386.51,-260.8 389.8,-254.62\"/>\n",
"</g>\n",
"<!-- _get_records_query_inputs -->\n",
"<g id=\"node15\" class=\"node\">\n",
"<title>_get_records_query_inputs</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"629.5,-535 468.5,-535 468.5,-469 629.5,-469 629.5,-535\"/>\n",
"<text text-anchor=\"start\" x=\"508\" y=\"-508.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">sheet_id</text>\n",
"<text text-anchor=\"start\" x=\"596\" y=\"-508.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">str</text>\n",
"<text text-anchor=\"start\" x=\"484\" y=\"-487.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">spreadsheet_id</text>\n",
"<text text-anchor=\"start\" x=\"596\" y=\"-487.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">str</text>\n",
"</g>\n",
"<!-- _get_records_query_inputs&#45;&gt;get_records_query -->\n",
"<g id=\"edge10\" class=\"edge\">\n",
"<title>_get_records_query_inputs&#45;&gt;get_records_query</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M514.72,-468.82C507.27,-461.83 499.32,-454.36 491.65,-447.16\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"493.79,-444.37 484.1,-440.07 489,-449.47 493.79,-444.37\"/>\n",
"</g>\n",
"<!-- _intervals_df_inputs -->\n",
"<g id=\"node16\" class=\"node\">\n",
"<title>_intervals_df_inputs</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"369,-897.5 221,-897.5 221,-852.5 369,-852.5 369,-897.5\"/>\n",
"<text text-anchor=\"start\" x=\"236\" y=\"-870.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">must_include</text>\n",
"<text text-anchor=\"start\" x=\"335\" y=\"-870.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">str</text>\n",
"</g>\n",
"<!-- _intervals_df_inputs&#45;&gt;intervals_df -->\n",
"<g id=\"edge12\" class=\"edge\">\n",
"<title>_intervals_df_inputs&#45;&gt;intervals_df</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M278.65,-852.43C271.79,-843.39 263.6,-832.59 255.75,-822.23\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"258.38,-819.91 249.55,-814.06 252.8,-824.14 258.38,-819.91\"/>\n",
"</g>\n",
"<!-- _google_service_inputs -->\n",
"<g id=\"node17\" class=\"node\">\n",
"<title>_google_service_inputs</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"666,-711.5 528,-711.5 528,-666.5 666,-666.5 666,-711.5\"/>\n",
"<text text-anchor=\"start\" x=\"543\" y=\"-684.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">credentials</text>\n",
"<text text-anchor=\"start\" x=\"625\" y=\"-684.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">dict</text>\n",
"</g>\n",
"<!-- _google_service_inputs&#45;&gt;google_service -->\n",
"<g id=\"edge13\" class=\"edge\">\n",
"<title>_google_service_inputs&#45;&gt;google_service</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M597,-666.43C597,-657.93 597,-647.87 597,-638.08\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"600.5,-638.06 597,-628.06 593.5,-638.06 600.5,-638.06\"/>\n",
"</g>\n",
"<!-- _insert_records_query_inputs -->\n",
"<g id=\"node18\" class=\"node\">\n",
"<title>_insert_records_query_inputs</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"629.5,-254 468.5,-254 468.5,-188 629.5,-188 629.5,-254\"/>\n",
"<text text-anchor=\"start\" x=\"508\" y=\"-227.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">sheet_id</text>\n",
"<text text-anchor=\"start\" x=\"596\" y=\"-227.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">str</text>\n",
"<text text-anchor=\"start\" x=\"484\" y=\"-206.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">spreadsheet_id</text>\n",
"<text text-anchor=\"start\" x=\"596\" y=\"-206.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">str</text>\n",
"</g>\n",
"<!-- _insert_records_query_inputs&#45;&gt;insert_records_query -->\n",
"<g id=\"edge16\" class=\"edge\">\n",
"<title>_insert_records_query_inputs&#45;&gt;insert_records_query</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M528.35,-187.99C523.97,-181.2 519.28,-173.94 514.74,-166.89\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"517.44,-164.63 509.08,-158.12 511.56,-168.42 517.44,-164.63\"/>\n",
"</g>\n",
"<!-- _create_header_query_inputs -->\n",
"<g id=\"node19\" class=\"node\">\n",
"<title>_create_header_query_inputs</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"456.5,-618.5 295.5,-618.5 295.5,-573.5 456.5,-573.5 456.5,-618.5\"/>\n",
"<text text-anchor=\"start\" x=\"311\" y=\"-591.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">spreadsheet_id</text>\n",
"<text text-anchor=\"start\" x=\"423\" y=\"-591.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">str</text>\n",
"</g>\n",
"<!-- _create_header_query_inputs&#45;&gt;create_header_query -->\n",
"<g id=\"edge19\" class=\"edge\">\n",
"<title>_create_header_query_inputs&#45;&gt;create_header_query</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M371.26,-573.2C369.35,-564.43 367.09,-554.02 364.9,-543.94\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"368.29,-543.08 362.75,-534.05 361.45,-544.56 368.29,-543.08\"/>\n",
"</g>\n",
"<!-- _format_spreadsheet_query_inputs -->\n",
"<g id=\"node20\" class=\"node\">\n",
"<title>_format_spreadsheet_query_inputs</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"847.5,-159 686.5,-159 686.5,-93 847.5,-93 847.5,-159\"/>\n",
"<text text-anchor=\"start\" x=\"726\" y=\"-132.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">sheet_id</text>\n",
"<text text-anchor=\"start\" x=\"814\" y=\"-132.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">str</text>\n",
"<text text-anchor=\"start\" x=\"702\" y=\"-111.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">spreadsheet_id</text>\n",
"<text text-anchor=\"start\" x=\"814\" y=\"-111.8\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">str</text>\n",
"</g>\n",
"<!-- _format_spreadsheet_query_inputs&#45;&gt;format_spreadsheet_query -->\n",
"<g id=\"edge22\" class=\"edge\">\n",
"<title>_format_spreadsheet_query_inputs&#45;&gt;format_spreadsheet_query</title>\n",
"<path fill=\"none\" stroke=\"black\" d=\"M728.88,-92.82C720.42,-85.68 711.38,-78.05 702.68,-70.71\"/>\n",
"<polygon fill=\"black\" stroke=\"black\" points=\"704.72,-67.85 694.82,-64.07 700.2,-73.2 704.72,-67.85\"/>\n",
"</g>\n",
"<!-- config -->\n",
"<g id=\"node21\" class=\"node\">\n",
"<title>config</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" points=\"483.5,-986 430.5,-986 430.5,-950 489.5,-950 489.5,-980 483.5,-986\"/>\n",
"<polyline fill=\"none\" stroke=\"black\" points=\"483.5,-986 483.5,-980 \"/>\n",
"<polyline fill=\"none\" stroke=\"black\" points=\"489.5,-980 483.5,-980 \"/>\n",
"<text text-anchor=\"middle\" x=\"460\" y=\"-964.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">config</text>\n",
"</g>\n",
"<!-- input -->\n",
"<g id=\"node22\" class=\"node\">\n",
"<title>input</title>\n",
"<polygon fill=\"#ffffff\" stroke=\"black\" stroke-dasharray=\"5,2\" points=\"412.5,-986.5 353.5,-986.5 353.5,-949.5 412.5,-949.5 412.5,-986.5\"/>\n",
"<text text-anchor=\"middle\" x=\"383\" y=\"-964.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">input</text>\n",
"</g>\n",
"<!-- function -->\n",
"<g id=\"node23\" class=\"node\">\n",
"<title>function</title>\n",
"<path fill=\"#b4d8e4\" stroke=\"black\" d=\"M323,-986.5C323,-986.5 267,-986.5 267,-986.5 261,-986.5 255,-980.5 255,-974.5 255,-974.5 255,-961.5 255,-961.5 255,-955.5 261,-949.5 267,-949.5 267,-949.5 323,-949.5 323,-949.5 329,-949.5 335,-955.5 335,-961.5 335,-961.5 335,-974.5 335,-974.5 335,-980.5 329,-986.5 323,-986.5\"/>\n",
"<text text-anchor=\"middle\" x=\"295\" y=\"-964.3\" font-family=\"Helvetica,sans-Serif\" font-size=\"14.00\">function</text>\n",
"</g>\n",
"</g>\n",
"</svg>\n"
],
"text/plain": [
"<graphviz.graphs.Digraph at 0x7ff2658dacd0>"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from hamilton import driver\n",
"\n",
"dr = (\n",
" driver.Builder()\n",
" .with_modules(timesheet_module, google_sheets_module)\n",
" .with_config(dict(mode=\"demo\"))\n",
" .build()\n",
")\n",
"dr.display_all_functions(orient=\"TB\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To execute it, we will need to specify as input the `spreadsheet_id`, `sheet_id`, and `credentials`. After executing the next cell, you should see your Google Sheet updated with your daily timesheet records!"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"records_uploaded=0\n"
]
}
],
"source": [
"inputs = dict(\n",
" spreadsheet_id=..., # add your spreadsheet_id\n",
" sheet_id=\"Sheet1\", # default name in a new spreadsheet\n",
" credentials=..., # add your credentials\n",
")\n",
"results = dr.execute([\"insert_records_query\", \"format_spreadsheet_query\"], inputs=inputs)\n",
"\n",
"records_uploaded = results[\"insert_records_query\"].get(\"updates\", {}).get(\"updatedRows\", 0)\n",
"print(f\"{records_uploaded=:}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# VoilĂ !\n",
"You learned how to use Hamilton to load consolidated data (L), transform it using Hamilton dataflows (T), and upload it to an end-user application like Google Sheets (E). This is what is colloquially called \"Reverse ETL\" because typically scenarios (ETL) where we go from end-user application to well-formed data.\n",
"\n",
"If you're a Timewarrior user, I invited you to read the `README` in full to learn how to use this dataflow as a [Timewarrior extension](https://timewarrior.net/docs/api/), which would allow you to run this script directly from Timewarrior! In my case, `timew upload` uploads my timesheet to Google Sheets!"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}