Beam’s new Managed API streamlines how you use existing I/Os, offering both simplicity and powerful enhancements. I/Os are now configured through a lightweight, consistent interface: a simple configuration map with a unified API that spans multiple connectors.
With Managed I/O, runners gain deeper insight into each I/O’s structure and intent. This allows the runner to optimize performance, adjust behavior dynamically, or even replace the I/O with a more efficient or updated implementation behind the scenes.
For example, the DataflowRunner can seamlessly upgrade a Managed transform to its latest SDK version, automatically applying bug fixes and new features (no manual updates or user intervention required!)
The Managed API is directly accessible through the Java and Python SDKs.
Additionally, some SDKs use the Managed API internally. For example, the Iceberg connector used in Beam YAML and Beam SQL is invoked via the Managed API under the hood.
Note: required configuration fields are bolded.
KAFKA WriteKAFKA ReadICEBERG ReadICEBERG Writefootruncate(foo, N)bucket(foo, N)hour(foo)day(foo)month(foo)year(foo)void(foo)For more information on partition transforms, please visit https://iceberg.apache.org/spec/#partition-transforms. table_properties map[str, str] Iceberg table properties to be set on the table when it is created. For more information on table properties, please visit https://iceberg.apache.org/docs/latest/configuration/#table-properties. triggering_frequency_seconds int32 For a streaming pipeline, sets the frequency at which snapshots are produced.
ICEBERG_CDC ReadBIGQUERY ReadBIGQUERY WritePOSTGRES ReadPOSTGRES WriteMYSQL ReadMYSQL WriteSQLSERVER ReadSQLSERVER Write