<!DOCTYPE html><html class="default" lang="en"><head><meta charSet="utf-8"/><meta http-equiv="x-ua-compatible" content="IE=edge"/><title>apache-beam</title><meta name="description" content="Documentation for apache-beam"/><meta name="viewport" content="width=device-width, initial-scale=1"/><link rel="stylesheet" href="assets/style.css"/><link rel="stylesheet" href="assets/highlight.css"/><script async src="assets/search.js" id="search-script"></script></head><body><script>document.documentElement.dataset.theme = localStorage.getItem("tsd-theme") || "os"</script><header class="tsd-page-toolbar">
<div class="tsd-toolbar-contents container">
<div class="table-cell" id="tsd-search" data-base=".">
<div class="field"><label for="tsd-search-field" class="tsd-widget tsd-toolbar-icon search no-caption"><svg width="16" height="16" viewBox="0 0 16 16" fill="none"><path d="M15.7824 13.833L12.6666 10.7177C12.5259 10.5771 12.3353 10.499 12.1353 10.499H11.6259C12.4884 9.39596 13.001 8.00859 13.001 6.49937C13.001 2.90909 10.0914 0 6.50048 0C2.90959 0 0 2.90909 0 6.49937C0 10.0896 2.90959 12.9987 6.50048 12.9987C8.00996 12.9987 9.39756 12.4863 10.5008 11.6239V12.1332C10.5008 12.3332 10.5789 12.5238 10.7195 12.6644L13.8354 15.7797C14.1292 16.0734 14.6042 16.0734 14.8948 15.7797L15.7793 14.8954C16.0731 14.6017 16.0731 14.1267 15.7824 13.833ZM6.50048 10.499C4.29094 10.499 2.50018 8.71165 2.50018 6.49937C2.50018 4.29021 4.28781 2.49976 6.50048 2.49976C8.71001 2.49976 10.5008 4.28708 10.5008 6.49937C10.5008 8.70852 8.71314 10.499 6.50048 10.499Z" fill="var(--color-text)"></path></svg></label><input type="text" id="tsd-search-field" aria-label="Search"/></div>
<div class="field">
<div id="tsd-toolbar-links"></div></div>
<ul class="results">
<li class="state loading">Preparing search index...</li>
<li class="state failure">The search index is not available</li></ul><a href="index.html" class="title">apache-beam</a></div>
<div class="table-cell" id="tsd-widgets"><a href="#" class="tsd-widget tsd-toolbar-icon menu no-caption" data-toggle="menu" aria-label="Menu"><svg width="16" height="16" viewBox="0 0 16 16" fill="none"><rect x="1" y="3" width="14" height="2" fill="var(--color-text)"></rect><rect x="1" y="7" width="14" height="2" fill="var(--color-text)"></rect><rect x="1" y="11" width="14" height="2" fill="var(--color-text)"></rect></svg></a></div></div></header>
<div class="container container-main">
<div class="col-8 col-content">
<div class="tsd-page-title">
<h2>apache-beam</h2></div>
<div class="tsd-panel tsd-typography"><!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<a href="#typescript-beam-sdk" id="typescript-beam-sdk" style="color: inherit; text-decoration: none;">
<h1>TypeScript Beam SDK</h1>
</a>
<p>A library for writing <a href="https://beam.apache.org/">Apache Beam</a>
pipelines in TypeScript.</p>
<p>In addition to being a fully functioning SDK, it serves as a cleaner, more modern
template for building SDKs in other languages
(see README-dev.md for more details).</p>
<a href="#getting-started" id="getting-started" style="color: inherit; text-decoration: none;">
<h2>Getting started</h2>
</a>
<p>The TypeScript SDK can be installed with:</p>
<pre><code><span class="hl-0">npm</span><span class="hl-1"> </span><span class="hl-0">install</span><span class="hl-1"> </span><span class="hl-0">apache-beam</span>
</code></pre>
<p>Due to its extensive use of cross-language transforms, it is recommended that
Python 3 and Java be available on the system as well.</p>
<p>A fully working setup is provided as a cloneable
<a href="https://github.com/apache/beam-starter-typescript">starter project on GitHub</a>.</p>
<a href="#running-a-pipeline" id="running-a-pipeline" style="color: inherit; text-decoration: none;">
<h3>Running a pipeline</h3>
</a>
<p>Beam pipelines can be run on a variety of
<a href="https://beam.apache.org/documentation/#runners">runners</a>.
The typical way to create a runner is with
<code>beam.runners.runner.create_runner({runner: &quot;runnerType&quot;, ...})</code>,
as seen in the <a href="https://github.com/apache/beam/blob/master/sdks/typescript/src/apache_beam/examples/wordcount.ts">wordcount example</a>.</p>
<p>After building, run the pipeline locally with:</p>
<pre><code><span class="hl-0">node</span><span class="hl-1"> </span><span class="hl-0">path</span><span class="hl-1">/</span><span class="hl-0">to</span><span class="hl-1">/</span><span class="hl-0">main</span><span class="hl-1">.</span><span class="hl-0">js</span><span class="hl-1"> --</span><span class="hl-0">runner</span><span class="hl-1">=</span><span class="hl-0">direct</span>
</code></pre>
<p>To run against Flink, where the local infrastructure is automatically
downloaded and set up:</p>
<pre><code><span class="hl-0">node</span><span class="hl-1"> </span><span class="hl-0">path</span><span class="hl-1">/</span><span class="hl-0">to</span><span class="hl-1">/</span><span class="hl-0">main</span><span class="hl-1">.</span><span class="hl-0">js</span><span class="hl-1"> --</span><span class="hl-0">runner</span><span class="hl-1">=</span><span class="hl-0">flink</span>
</code></pre>
<p>To run on Dataflow:</p>
<pre><code><span class="hl-0">node</span><span class="hl-1"> </span><span class="hl-0">path</span><span class="hl-1">/</span><span class="hl-0">to</span><span class="hl-1">/</span><span class="hl-0">main</span><span class="hl-1">.</span><span class="hl-0">js</span><span class="hl-1"> \</span><br/><span class="hl-1"> --</span><span class="hl-0">runner</span><span class="hl-1">=</span><span class="hl-0">dataflow</span><span class="hl-1"> \</span><br/><span class="hl-1"> --</span><span class="hl-0">project</span><span class="hl-1">=</span><span class="hl-0">$</span><span class="hl-1">{</span><span class="hl-2">PROJECT_ID</span><span class="hl-1">} \</span><br/><span class="hl-1"> --</span><span class="hl-0">tempLocation</span><span class="hl-1">=</span><span class="hl-3">gs</span><span class="hl-1">:</span><span class="hl-4">//${GCS_BUCKET}/wordcount-js/temp --region=${REGION}</span>
</code></pre>
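<p>The commands above pass pipeline options as <code>--key=value</code> flags. As a minimal sketch of how a main script might collect such flags into a plain options object before handing it to a runner factory, consider the following. <code>parseFlags</code> is a hypothetical helper written for illustration, not part of the SDK, and the project and region values are placeholders.</p>

```typescript
// Hypothetical helper (not part of the SDK): collect --key=value flags
// into a plain object of option names and string values.
type Flags = { [name: string]: string };

function parseFlags(argv: string[]): Flags {
  const flags: Flags = {};
  for (const arg of argv) {
    // Only the --key=value form used in the commands above is handled;
    // anything else (positional arguments, bare flags) is ignored.
    const match = /^--([^=]+)=(.*)$/.exec(arg);
    if (match) {
      flags[match[1]] = match[2];
    }
  }
  return flags;
}

// Placeholder values standing in for PROJECT_ID and REGION.
const options = parseFlags([
  "--runner=dataflow",
  "--project=my-project",
  "--region=us-central1",
]);

console.log(options.runner);
```

<p>In a real script the arguments would typically come from <code>process.argv.slice(2)</code>, and an established argument-parsing library may be a better fit than a hand-written helper.</p>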
<a href="#api" id="api" style="color: inherit; text-decoration: none;">
<h2>API</h2>
</a>
<p>We generally try to apply the concepts from the Beam API in a way that is
idiomatic to TypeScript. Few of the initial developers have extensive (if any)
JavaScript/TypeScript development experience, however, so
feedback is greatly appreciated.</p>
<p>In addition, this SDK departs from the traditional SDKs in some notable ways:</p>
<ul>
<li><p>We take a &quot;relational foundations&quot; approach, where
<a href="https://docs.google.com/document/d/1tnG2DPHZYbsomvihIpXruUmQ12pHGK0QIvXS1FOTgRc/edit#heading=h.puuotbien1gf">schema&#39;d data</a>
is the primary way to interact with data, and we generally eschew transforms
that require key-value pairs in favor of a more flexible approach of naming
fields or expressions. JavaScript&#39;s native Object is used as the row type.</p>
</li>
<li><p>As part of being schema-first, we also de-emphasize Coders as a first-class
concept in the SDK, relegating them to an advanced feature used for interop.
Though we can infer schemas from individual elements, it is still to be
determined whether and how we can leverage the type system and/or function
introspection to routinely infer schemas at construction time. A fallback coder
using BSON encoding is used when we don&#39;t have sufficient type information.</p>
</li>
<li><p>We have added additional methods to the PCollection object, notably <code>map</code>
and <code>flatMap</code>, <a href="https://www.mail-archive.com/dev@beam.apache.org/msg06035.html">rather than only allowing apply</a>.
In addition, <code>apply</code> can accept a function argument <code>(PCollection) =&gt; ...</code> as
well as a PTransform subclass; the callable is then treated as if it were a
PTransform&#39;s expand method.</p>
</li>
<li><p>In the other direction, we have eliminated the
<a href="https://s.apache.org/no-beam-pipeline">problematic Pipeline object</a>
from the API, instead providing a <code>Root</code> PValue on which pipelines are built
and invoking run() on a Runner. We offer a less error-prone <code>Runner.run</code>,
which finishes only when the pipeline has completely finished, as well as
<code>Runner.runAsync</code>, which returns a handle to the running pipeline.</p>
</li>
<li><p>Rather than introduce PCollectionTuple, PCollectionList, etc. we let PValue
literally be an
<a href="https://github.com/robertwb/beam-javascript/blob/de4390dd767f046903ac23fead5db333290462db/sdks/node-ts/src/apache_beam/pvalue.ts#L116">array or object with PValue values</a>
which transforms can consume or produce.
These are applied by wrapping them with the <code>P</code> operator, e.g.
<code>P([pc1, pc2, pc3]).apply(new Flatten())</code>.</p>
</li>
<li><p>Like Python, <code>flatMap</code> and <code>ParDo.process</code> return multiple elements by
yielding them from a generator, rather than invoking a passed-in callback.
How to output to multiple distinct PCollections is still to be determined.
There is currently an operation to split a PCollection into multiple
PCollections based on the properties of the elements, and
we may consider using a callback for side outputs.</p>
</li>
<li><p>The <code>map</code>, <code>flatMap</code>, and <code>ParDo.process</code> methods take an additional
(optional) context argument, which is similar to the keyword arguments
used in Python. This is a JavaScript object whose members may be constants
(which are passed as is) or special DoFnParam objects that provide getters for
element-specific information (such as the current timestamp, window,
or side input) at runtime.</p>
</li>
<li><p>Rather than introduce multiple-output complexity into the map/do operations
themselves, producing multiple outputs is done by following with a new
<code>Split</code> primitive that takes a
<code>PCollection&lt;{a?: AType, b: BType, ... }&gt;</code> and produces an object
<code>{a: PCollection&lt;AType&gt;, b: PCollection&lt;BType&gt;, ...}</code>.</p>
</li>
<li><p>JavaScript supports (and encourages) an asynchronous programming model, with
many libraries requiring use of the async/await paradigm.
As there is no way (by design) to go from the asynchronous style back to
the synchronous style, this needs to be taken into account
when designing the API.
We currently offer asynchronous variants of <code>PValue.apply(...)</code> (in addition
to the synchronous ones, which are easier to chain) and have made
<code>Runner.run</code> asynchronous. Whether to do the same for all user callbacks is
still to be determined.</p>
</li>
</ul>
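<p>Two of the points above, yielding multiple outputs from a generator and splitting one collection into several by property name, can be illustrated with a small in-memory model. This is only a sketch over plain arrays: <code>splitWords</code>, <code>tagByLength</code>, and <code>splitByTag</code> are hypothetical names rather than SDK functions, and it assumes each element carries its value under the property that names its output.</p>

```typescript
// In-memory model only; none of these helpers are SDK functions.
type Tagged = { short?: string; long?: string };

// Like a flatMap callback: yield zero or more outputs per input element.
function* splitWords(line: string) {
  for (const word of line.split(/\s+/)) {
    if (word.length !== 0) {
      yield word;
    }
  }
}

// Tag each word by setting exactly one property on the output object.
function tagByLength(word: string): Tagged {
  return word.length > 4 ? { long: word } : { short: word };
}

// Model of a Split-like step: route each element to the array named by
// the property it carries, giving one output collection per property.
function splitByTag(elements: Tagged[]) {
  const result = { short: [] as string[], long: [] as string[] };
  for (const element of elements) {
    if (element.short !== undefined) result.short.push(element.short);
    if (element.long !== undefined) result.long.push(element.long);
  }
  return result;
}

const lines = ["to be or not", "that is the question"];

// Drain the generator for every line, as a flatMap over the input would.
const words: string[] = [];
for (const line of lines) {
  for (const word of splitWords(line)) {
    words.push(word);
  }
}

const outputs = splitByTag(words.map(tagByLength));
console.log(outputs.short, outputs.long);
```

<p>Keeping the per-element function single-output and doing the routing in a separate step mirrors the design choice described above: the map-like operations stay simple, and multiple outputs appear only where they are explicitly requested.</p>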
<p>An example pipeline can be found at <a href="https://github.com/apache/beam/blob/master/sdks/typescript/src/apache_beam/examples/wordcount.ts">https://github.com/apache/beam/blob/master/sdks/typescript/src/apache_beam/examples/wordcount.ts</a>
and more documentation can be found in the <a href="https://beam.apache.org/documentation/programming-guide/">Beam programming guide</a>.</p>
</div></div>
<div class="col-4 col-menu menu-sticky-wrap menu-highlight">
<div class="tsd-navigation settings">
<details class="tsd-index-accordion"><summary class="tsd-accordion-summary">
<h3><svg width="20" height="20" viewBox="0 0 24 24" fill="none"><path d="M4.93896 8.531L12 15.591L19.061 8.531L16.939 6.409L12 11.349L7.06098 6.409L4.93896 8.531Z" fill="var(--color-text)"></path></svg> Settings</h3></summary>
<div class="tsd-accordion-details">
<div class="tsd-filter-visibility">
<h4 class="uppercase">Member Visibility</h4><form>
<ul id="tsd-filter-options">
<li class="tsd-filter-item"><label class="tsd-filter-input"><input type="checkbox" id="tsd-filter-protected" name="protected"/><svg width="32" height="32" viewBox="0 0 32 32" aria-hidden="true"><rect class="tsd-checkbox-background" width="30" height="30" x="1" y="1" rx="6" fill="none"></rect><path class="tsd-checkbox-checkmark" d="M8.35422 16.8214L13.2143 21.75L24.6458 10.25" stroke="none" stroke-width="3.5" stroke-linejoin="round" fill="none"></path></svg><span>Protected</span></label></li>
<li class="tsd-filter-item"><label class="tsd-filter-input"><input type="checkbox" id="tsd-filter-private" name="private"/><svg width="32" height="32" viewBox="0 0 32 32" aria-hidden="true"><rect class="tsd-checkbox-background" width="30" height="30" x="1" y="1" rx="6" fill="none"></rect><path class="tsd-checkbox-checkmark" d="M8.35422 16.8214L13.2143 21.75L24.6458 10.25" stroke="none" stroke-width="3.5" stroke-linejoin="round" fill="none"></path></svg><span>Private</span></label></li>
<li class="tsd-filter-item"><label class="tsd-filter-input"><input type="checkbox" id="tsd-filter-inherited" name="inherited" checked/><svg width="32" height="32" viewBox="0 0 32 32" aria-hidden="true"><rect class="tsd-checkbox-background" width="30" height="30" x="1" y="1" rx="6" fill="none"></rect><path class="tsd-checkbox-checkmark" d="M8.35422 16.8214L13.2143 21.75L24.6458 10.25" stroke="none" stroke-width="3.5" stroke-linejoin="round" fill="none"></path></svg><span>Inherited</span></label></li>
<li class="tsd-filter-item"><label class="tsd-filter-input"><input type="checkbox" id="tsd-filter-external" name="external"/><svg width="32" height="32" viewBox="0 0 32 32" aria-hidden="true"><rect class="tsd-checkbox-background" width="30" height="30" x="1" y="1" rx="6" fill="none"></rect><path class="tsd-checkbox-checkmark" d="M8.35422 16.8214L13.2143 21.75L24.6458 10.25" stroke="none" stroke-width="3.5" stroke-linejoin="round" fill="none"></path></svg><span>External</span></label></li></ul></form></div>
<div class="tsd-theme-toggle">
<h4 class="uppercase">Theme</h4><select id="theme"><option value="os">OS</option><option value="light">Light</option><option value="dark">Dark</option></select></div></div></details></div>
<nav class="tsd-navigation primary">
<details class="tsd-index-accordion" open><summary class="tsd-accordion-summary">
<h3><svg width="20" height="20" viewBox="0 0 24 24" fill="none"><path d="M4.93896 8.531L12 15.591L19.061 8.531L16.939 6.409L12 11.349L7.06098 6.409L4.93896 8.531Z" fill="var(--color-text)"></path></svg> Modules</h3></summary>
<div class="tsd-accordion-details">
<ul>
<li class="current selected"><a href="modules.html">apache-<wbr/>beam</a>
<ul>
<li class="tsd-kind-module"><a href="modules/coders_coders.html">coders/coders</a></li>
<li class="tsd-kind-module"><a href="modules/coders_js_coders.html">coders/js_<wbr/>coders</a></li>
<li class="tsd-kind-module"><a href="modules/coders_required_coders.html">coders/required_<wbr/>coders</a></li>
<li class="tsd-kind-module"><a href="modules/coders_row_coder.html">coders/row_<wbr/>coder</a></li>
<li class="tsd-kind-module"><a href="modules/coders_standard_coders.html">coders/standard_<wbr/>coders</a></li>
<li class="tsd-kind-module"><a href="modules/index.html">index</a></li>
<li class="tsd-kind-module"><a href="modules/io_avroio.html">io/avroio</a></li>
<li class="tsd-kind-module"><a href="modules/io_avroio-1.html">io/avroio</a></li>
<li class="tsd-kind-module"><a href="modules/io_bigqueryio.html">io/bigqueryio</a></li>
<li class="tsd-kind-module"><a href="modules/io_bigqueryio-1.html">io/bigqueryio</a></li>
<li class="tsd-kind-module"><a href="modules/io_kafka.html">io/kafka</a></li>
<li class="tsd-kind-module"><a href="modules/io_kafka-1.html">io/kafka</a></li>
<li class="tsd-kind-module"><a href="modules/io_parquetio.html">io/parquetio</a></li>
<li class="tsd-kind-module"><a href="modules/io_parquetio-1.html">io/parquetio</a></li>
<li class="tsd-kind-module"><a href="modules/io_pubsub.html">io/pubsub</a></li>
<li class="tsd-kind-module"><a href="modules/io_pubsub-1.html">io/pubsub</a></li>
<li class="tsd-kind-module"><a href="modules/io_pubsublite.html">io/pubsublite</a></li>
<li class="tsd-kind-module"><a href="modules/io_pubsublite-1.html">io/pubsublite</a></li>
<li class="tsd-kind-module"><a href="modules/io_schemaio.html">io/schemaio</a></li>
<li class="tsd-kind-module"><a href="modules/io_schemaio-1.html">io/schemaio</a></li>
<li class="tsd-kind-module"><a href="modules/io_textio.html">io/textio</a></li>
<li class="tsd-kind-module"><a href="modules/io_textio-1.html">io/textio</a></li>
<li class="tsd-kind-module"><a href="modules/options_pipeline_options.html">options/pipeline_<wbr/>options</a></li>
<li class="tsd-kind-module"><a href="modules/pvalue.html">pvalue</a></li>
<li class="tsd-kind-module"><a href="modules/runners.html">runners</a></li>
<li class="tsd-kind-module"><a href="modules/runners-1.html">runners</a></li>
<li class="tsd-kind-module"><a href="modules/runners_dataflow.html">runners/dataflow</a></li>
<li class="tsd-kind-module"><a href="modules/runners_dataflow-1.html">runners/dataflow</a></li>
<li class="tsd-kind-module"><a href="modules/runners_direct_runner.html">runners/direct_<wbr/>runner</a></li>
<li class="tsd-kind-module"><a href="modules/runners_direct_runner-1.html">runners/direct_<wbr/>runner</a></li>
<li class="tsd-kind-module"><a href="modules/runners_flink.html">runners/flink</a></li>
<li class="tsd-kind-module"><a href="modules/runners_flink-1.html">runners/flink</a></li>
<li class="tsd-kind-module"><a href="modules/runners_portable_runner_runner.html">runners/portable_<wbr/>runner/runner</a></li>
<li class="tsd-kind-module"><a href="modules/runners_portable_runner_runner-1.html">runners/portable_<wbr/>runner/runner</a></li>
<li class="tsd-kind-module"><a href="modules/runners_runner.html">runners/runner</a></li>
<li class="tsd-kind-module"><a href="modules/runners_runner-1.html">runners/runner</a></li>
<li class="tsd-kind-module"><a href="modules/runners_universal.html">runners/universal</a></li>
<li class="tsd-kind-module"><a href="modules/runners_universal-1.html">runners/universal</a></li>
<li class="tsd-kind-module"><a href="modules/serialization.html">serialization</a></li>
<li class="tsd-kind-module"><a href="modules/testing_assert.html">testing/assert</a></li>
<li class="tsd-kind-module"><a href="modules/testing_assert-1.html">testing/assert</a></li>
<li class="tsd-kind-module"><a href="modules/testing_multi_pipeline_runner.html">testing/multi_<wbr/>pipeline_<wbr/>runner</a></li>
<li class="tsd-kind-module"><a href="modules/testing_multi_pipeline_runner-1.html">testing/multi_<wbr/>pipeline_<wbr/>runner</a></li>
<li class="tsd-kind-module"><a href="modules/testing_proto_printing_runner.html">testing/proto_<wbr/>printing_<wbr/>runner</a></li>
<li class="tsd-kind-module"><a href="modules/testing_proto_printing_runner-1.html">testing/proto_<wbr/>printing_<wbr/>runner</a></li>
<li class="tsd-kind-module"><a href="modules/transforms.html">transforms</a></li>
<li class="tsd-kind-module"><a href="modules/transforms-1.html">transforms</a></li>
<li class="tsd-kind-module"><a href="modules/transforms_combiners.html">transforms/combiners</a></li>
<li class="tsd-kind-module"><a href="modules/transforms_combiners-1.html">transforms/combiners</a></li>
<li class="tsd-kind-module"><a href="modules/transforms_create.html">transforms/create</a></li>
<li class="tsd-kind-module"><a href="modules/transforms_create-1.html">transforms/create</a></li>
<li class="tsd-kind-module"><a href="modules/transforms_external.html">transforms/external</a></li>
<li class="tsd-kind-module"><a href="modules/transforms_external-1.html">transforms/external</a></li>
<li class="tsd-kind-module"><a href="modules/transforms_flatten.html">transforms/flatten</a>
<ul>
<li class="tsd-kind-namespace tsd-parent-kind-module"><a href="modules/transforms_flatten.flatten.html">flatten</a></li></ul></li>
<li class="tsd-kind-module"><a href="modules/transforms_flatten-1.html">transforms/flatten</a></li>
<li class="tsd-kind-module"><a href="modules/transforms_group_and_combine.html">transforms/group_<wbr/>and_<wbr/>combine</a></li>
<li class="tsd-kind-module"><a href="modules/transforms_group_and_combine-1.html">transforms/group_<wbr/>and_<wbr/>combine</a></li>
<li class="tsd-kind-module"><a href="modules/transforms_internal.html">transforms/internal</a>
<ul>
<li class="tsd-kind-namespace tsd-parent-kind-module"><a href="modules/transforms_internal.combinePerKey.html">combine<wbr/>Per<wbr/>Key</a></li>
<li class="tsd-kind-namespace tsd-parent-kind-module"><a href="modules/transforms_internal.groupByKey.html">group<wbr/>By<wbr/>Key</a></li>
<li class="tsd-kind-namespace tsd-parent-kind-module"><a href="modules/transforms_internal.impulse.html">impulse</a></li></ul></li>
<li class="tsd-kind-module"><a href="modules/transforms_internal-1.html">transforms/internal</a></li>
<li class="tsd-kind-module"><a href="modules/transforms_pardo.html">transforms/pardo</a>
<ul>
<li class="tsd-kind-namespace tsd-parent-kind-module"><a href="modules/transforms_pardo.parDo.html">par<wbr/>Do</a></li></ul></li>
<li class="tsd-kind-module"><a href="modules/transforms_pardo-1.html">transforms/pardo</a></li>
<li class="tsd-kind-module"><a href="modules/transforms_python.html">transforms/python</a></li>
<li class="tsd-kind-module"><a href="modules/transforms_python-1.html">transforms/python</a></li>
<li class="tsd-kind-module"><a href="modules/transforms_sql.html">transforms/sql</a></li>
<li class="tsd-kind-module"><a href="modules/transforms_sql-1.html">transforms/sql</a></li>
<li class="tsd-kind-module"><a href="modules/transforms_transform.html">transforms/transform</a></li>
<li class="tsd-kind-module"><a href="modules/transforms_transform-1.html">transforms/transform</a></li>
<li class="tsd-kind-module"><a href="modules/transforms_utils.html">transforms/utils</a></li>
<li class="tsd-kind-module"><a href="modules/transforms_utils-1.html">transforms/utils</a></li>
<li class="tsd-kind-module"><a href="modules/transforms_window.html">transforms/window</a></li>
<li class="tsd-kind-module"><a href="modules/transforms_window-1.html">transforms/window</a></li>
<li class="tsd-kind-module"><a href="modules/transforms_windowings.html">transforms/windowings</a></li>
<li class="tsd-kind-module"><a href="modules/transforms_windowings-1.html">transforms/windowings</a></li>
<li class="tsd-kind-module"><a href="modules/utils_service.html">utils/service</a></li>
<li class="tsd-kind-module"><a href="modules/values.html">values</a></li></ul></li></ul></div></details></nav></div></div>
<div class="container tsd-generator">
<p>Generated using <a href="https://typedoc.org/" target="_blank">TypeDoc</a></p></div>
<div class="overlay"></div><script src="assets/main.js"></script></body></html>