You can use Beam's windowing semantics in two ways:
PCollections
before passing them to a BeamSql
transformPCollections
Triggering can only be used by setting it on your input PCollections
; there are no SQL extensions for specifying triggering.
This section covers the use of SQL extensions to directly apply windowing.
Beam SQL supports windowing functions specified in GROUP BY
clause. TIMESTAMP
field is required in this case. It is used as event timestamp for rows.
Supported windowing functions:
TUMBLE
, or fixed windows. Example of how define a fixed window with duration of 1 hour:SELECT f_int, COUNT(*) FROM PCOLLECTION GROUP BY f_int, TUMBLE(f_timestamp, INTERVAL '1' HOUR)
HOP
, or sliding windows. Example of how to define a sliding windows for every 30 minutes with 1 hour duration:SELECT f_int, COUNT(*) FROM PCOLLECTION GROUP BY f_int, HOP(f_timestamp, INTERVAL '30' MINUTE, INTERVAL '1' HOUR)
SESSION
, session windows. Example of how to define a session window with 5 minutes gap duration:SELECT f_int, COUNT(*) FROM PCOLLECTION GROUP BY f_int, SESSION(f_timestamp, INTERVAL '5' MINUTE)
Note: when no windowing function is specified in the query, then windowing strategy of the input PCollections
is unchanged by the SQL query. If windowing function is specified in the query, then the windowing function of the PCollection
is updated accordingly, but trigger stays unchanged.