<!--
~ Licensed to the Apache Software Foundation (ASF) under one
~ or more contributor license agreements. See the NOTICE file
~ distributed with this work for additional information
~ regarding copyright ownership. The ASF licenses this file
~ to you under the Apache License, Version 2.0 (the
~ "License"); you may not use this file except in compliance
~ with the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing, software
~ distributed under the License is distributed on an "AS IS" BASIS,
~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~ See the License for the specific language governing permissions and
~ limitations under the License.
-->
<html>
<h2>Event Time Triggers</h2>
<p>
When collecting and grouping data into windows, Beam uses triggers to determine when to emit the
aggregated results of each window (each emitted result is referred to as a pane). If you use Beam’s
default windowing configuration and default trigger, Beam outputs the aggregated result when it
estimates all data has arrived, and discards all subsequent data for that window.
</p>
<p>
You can set triggers for your PCollections to change this default behavior. Beam provides a
number of pre-built triggers that you can set:
</p>
<div>
<ul>
<li>Event time triggers</li>
<li>Processing time triggers</li>
<li>Data-driven triggers</li>
<li>Composite triggers</li>
</ul>
</div>
<p>
Event time triggers operate on the event time, as indicated by the timestamp on each data
element. Beam’s default trigger is based on event time.
</p>
<p>
The AfterWatermark trigger operates on event time. It emits the contents of a window after the
watermark passes the end of the window, based on the timestamps attached to the data elements.
The watermark is a global progress metric: it is Beam’s notion of input completeness within your
pipeline at any given point. AfterWatermark.pastEndOfWindow() fires only when the watermark
passes the end of the window.
</p>
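<p>
To make the firing rule concrete, the following is a minimal plain-Java sketch of the idea, not
Beam code and not the kata solution. The class WatermarkTriggerSketch and its methods add and
advanceWatermark are hypothetical names introduced only for illustration. It assumes the watermark
is supplied by the caller, that a window [start, end) counts as complete once the watermark reaches
its end, and that allowed lateness is zero, so late elements are dropped.
</p>

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: simulates "fire once the watermark passes the
// end of the window" for 5-second fixed windows, without using the Beam SDK.
class WatermarkTriggerSketch {
    static final long WINDOW_SIZE_MS = 5_000;

    // Event count per open window, keyed by window start (event time, in ms).
    private final Map<Long, Long> openWindows = new TreeMap<>();
    private long watermarkMs = Long.MIN_VALUE;

    // Start of the fixed window that contains the given event timestamp.
    static long windowStart(long eventTimeMs) {
        return eventTimeMs - Math.floorMod(eventTimeMs, WINDOW_SIZE_MS);
    }

    // Adds one event. Returns false if its window has already closed
    // (assumption: allowed lateness is zero, so late data is dropped).
    boolean add(long eventTimeMs) {
        long start = windowStart(eventTimeMs);
        if (start + WINDOW_SIZE_MS <= watermarkMs) {
            return false;
        }
        openWindows.merge(start, 1L, Long::sum);
        return true;
    }

    // Moves the watermark forward and emits one pane for every window whose
    // end is now at or behind the watermark. Emitted windows are discarded.
    List<String> advanceWatermark(long newWatermarkMs) {
        watermarkMs = Math.max(watermarkMs, newWatermarkMs);
        List<String> panes = new ArrayList<>();
        Iterator<Map.Entry<Long, Long>> it = openWindows.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, Long> window = it.next();
            long end = window.getKey() + WINDOW_SIZE_MS;
            if (end <= watermarkMs) {
                panes.add("[" + window.getKey() + ", " + end + ") count=" + window.getValue());
                it.remove();
            }
        }
        return panes;
    }

    public static void main(String[] args) {
        WatermarkTriggerSketch sketch = new WatermarkTriggerSketch();
        // One event per second for seven seconds of event time.
        for (long t = 0; t < 7_000; t += 1_000) {
            sketch.add(t);
        }
        System.out.println(sketch.advanceWatermark(5_000));  // first window fires
        System.out.println(sketch.add(4_000));               // late element is dropped
        System.out.println(sketch.advanceWatermark(10_000)); // second window fires
    }
}
```

<p>
In this sketch nothing is emitted for a window until the watermark reaches its end, and an element
that arrives for an already-fired window is ignored. In a real pipeline the runner estimates the
watermark and Beam tracks window state, so the kata only needs to configure the window and trigger.
</p>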
<p>
<b>Kata:</b> Given that events are being generated every second, implement a trigger that
emits the count of events within each fixed window of 5 seconds.
</p>
<br>
<div class="hint">
Use <a href="https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/windowing/FixedWindows.html">
FixedWindows</a> with a 5-second duration and the
<a href="https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/windowing/AfterWatermark.html#pastEndOfWindow--">
AfterWatermark.pastEndOfWindow()</a> trigger.
</div>
<div class="hint">
Set the <a href="https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/windowing/Window.html#withAllowedLateness-org.joda.time.Duration-">
allowed lateness</a> to 0 with
<a href="https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/windowing/Window.html#discardingFiredPanes--">
discarding accumulation mode</a>.
</div>
<div class="hint">
Use <a href="https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/Combine.html#globally-org.apache.beam.sdk.transforms.CombineFnBase.GlobalCombineFn-">
Combine.globally</a> and
<a href="https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/transforms/Count.html#combineFn--">
Count.combineFn</a> to calculate the count of events.
</div>
<div class="hint">
Refer to the Beam Programming Guide
<a href="https://beam.apache.org/documentation/programming-guide/#event-time-triggers">
"Event time triggers"</a> section for more information.
</div>
</html>