Apache Iggy Connectors

A highly performant, modular runtime for statically typed yet dynamically loaded connectors. Ingest data from external sources and push it to Iggy streams, or fetch data from Iggy streams and push it to external destinations. Create your own Rust plugins by implementing either the Source or Sink trait, and build custom data processing pipelines.

A Docker image is available and can be fetched with docker pull apache/iggy-connect.

Features

  • High Performance: Utilizes Rust's performance characteristics to ensure fast data ingestion and egress.
  • Low Memory Footprint: Designed for memory efficiency, keeping the connectors' memory usage small.
  • Modular Design: Designed with modularity in mind, allowing for easy extension and customization.
  • Dynamic Loading: Supports dynamic loading of plugins, enabling seamless integration with various data sources and sinks.
  • Statically Typed: Ensures type safety and compile-time checks, reducing runtime errors.
  • Easy Customization: Provides a simple interface for implementing custom connectors, making it easy to create new plugins.
  • Data Transformation: Supports transforming message fields with the help of existing transform functions.
  • Powerful Configuration: Define your sinks, sources, and transformations in the configuration file or fetch them from a remote HTTP API.
  • Flexible Configuration Providers: Supports local file-based and HTTP-based configuration providers for centralized configuration management.

Quick Start

  1. Build the project in release mode (or in debug mode, updating the connector paths in the config accordingly), and make sure that the plugins referenced by the path field of the connector configs in the core/connectors/runtime/example_config/connectors/ directory are available. Connector configuration must be provided in TOML format, in files that follow the {connector_name}_{type}[_v{N}].toml naming convention.

  2. Run docker compose up -d from /examples/rust/src/sink-data-producer to start the Quickwit server used by the example sink connector. At this point, you can access the Quickwit UI at http://localhost:7280. Check this dashboard again later, once the events index has been created.

  3. Set the environment variable IGGY_CONNECTORS_CONFIG_PATH=core/connectors/runtime/example_config/config.toml (adjust the path as needed) so that it points to the runtime configuration file.

  4. Start the Iggy server and run the following commands via the Iggy CLI to create the example streams and topics used by the sample connectors:

    iggy --username iggy --password iggy stream create example_stream
    iggy --username iggy --password iggy topic create example_stream example_topic 1 none 1d
    iggy --username iggy --password iggy stream create qw
    iggy --username iggy --password iggy topic create qw records 1 none 1d
    
  5. Run cargo run --example sink-data-producer -r to start the example data producer application, which sends messages to the previously created qw stream and records topic (used by the Quickwit sink connector).

  6. Start the connector runtime with cargo run --bin iggy-connectors -r. You should be able to browse the Quickwit UI and see records being continuously added to the events index. At the same time, the test source connector should be adding new messages to the example_stream stream and example_topic topic created in step 4; you can use the Iggy Web UI to browse the data. These messages will have the basic field transformations applied.
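
To make the {connector_name}_{type}[_v{N}].toml naming convention from step 1 concrete, the sketch below splits a file name into its parts. It is a hypothetical helper, not part of the runtime: the ConfigFileName type and the parse_config_file_name function are made up for illustration, and it assumes that {type} is either sink or source and that {N} is a non-negative integer.

```rust
/// Parts of a connector config file name.
/// Hypothetical type for illustration; not part of the Iggy runtime.
#[derive(Debug, PartialEq)]
struct ConfigFileName {
    connector_name: String,
    connector_type: String, // assumed to be "sink" or "source"
    version: Option<u32>,   // N from the optional `_v{N}` suffix
}

/// Splits `{connector_name}_{type}[_v{N}].toml` into its parts.
/// Returns `None` when the name does not follow the convention.
fn parse_config_file_name(file_name: &str) -> Option<ConfigFileName> {
    let stem = file_name.strip_suffix(".toml")?;

    // Peel off the optional `_v{N}` suffix first (N must be all digits).
    let (stem, version) = match stem.rsplit_once("_v") {
        Some((rest, n)) if !n.is_empty() && n.bytes().all(|b| b.is_ascii_digit()) => {
            (rest, Some(n.parse::<u32>().ok()?))
        }
        _ => (stem, None),
    };

    // The last `_`-separated segment is the type; everything before it is the name.
    let (name, kind) = stem.rsplit_once('_')?;
    if name.is_empty() || !matches!(kind, "sink" | "source") {
        return None;
    }

    Some(ConfigFileName {
        connector_name: name.to_string(),
        connector_type: kind.to_string(),
        version,
    })
}
```

Under these assumptions, quickwit_sink.toml parses to the name quickwit with type sink and no version, while a file such as notes.toml is rejected.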

Runtime

All the connectors are implemented as Rust libraries and can be used as a part of the connector runtime. The runtime is responsible for managing the lifecycle of the connectors and providing the necessary infrastructure for the connectors to run. For more information, please refer to the runtime documentation.

Plugin path resolution

The path field in connector configs points to the shared library (.so, .dylib, .dll). The runtime resolves it as follows:

  1. Extension — if the path has no recognized extension, the OS-native one is appended automatically (.so on Linux, .dylib on macOS, .dll on Windows).

  2. Absolute paths — used as-is.

  3. Relative paths — searched in order, returning the first match:

    • the literal relative path (from working directory)
    • directory of the runtime binary (filename only)
    • current working directory (filename only)
    • /usr/lib, /usr/lib64, /lib, /lib64, /usr/local/lib, /usr/local/lib64

Examples:

# Relative — resolved against search dirs; extension appended on Linux
path = "target/release/libiggy_connector_stdout_sink"

# Absolute — used directly
path = "/opt/iggy/plugins/libiggy_connector_stdout_sink.so"

If the library is not found, the runtime logs all searched paths to help diagnose the issue.
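
The resolution order above can be sketched in a few lines of Rust. This is an illustrative reconstruction from the rules in this section, not the runtime's actual code: the function names are hypothetical, the binary and working directories are passed in explicitly to keep the example testable, and it assumes that the system library directories are searched by file name only.

```rust
use std::env;
use std::path::{Path, PathBuf};

/// Shared-library extensions this sketch treats as "recognized".
const KNOWN_EXTENSIONS: [&str; 3] = ["so", "dylib", "dll"];

/// System directories searched last, in the order listed above.
const SYSTEM_LIB_DIRS: [&str; 6] = [
    "/usr/lib", "/usr/lib64", "/lib", "/lib64", "/usr/local/lib", "/usr/local/lib64",
];

/// Appends the OS-native extension when the path has no recognized one.
fn with_native_extension(path: &Path) -> PathBuf {
    let recognized = path
        .extension()
        .and_then(|ext| ext.to_str())
        .map_or(false, |ext| KNOWN_EXTENSIONS.contains(&ext));
    if recognized {
        return path.to_path_buf();
    }
    // `DLL_EXTENSION` is "so", "dylib" or "dll" depending on the target OS.
    let mut raw = path.as_os_str().to_os_string();
    raw.push(".");
    raw.push(env::consts::DLL_EXTENSION);
    PathBuf::from(raw)
}

/// Builds the ordered list of candidates for a configured `path` value.
fn candidate_paths(configured: &str, binary_dir: &Path, working_dir: &Path) -> Vec<PathBuf> {
    let path = with_native_extension(Path::new(configured));
    if path.is_absolute() {
        return vec![path]; // absolute paths are used as-is
    }
    // 1. the literal relative path, from the working directory
    let mut candidates = vec![working_dir.join(&path)];
    if let Some(file_name) = path.file_name() {
        // 2. directory of the runtime binary, 3. working directory (file name only)
        candidates.push(binary_dir.join(file_name));
        candidates.push(working_dir.join(file_name));
        // 4. system library directories (file name only is an assumption)
        candidates.extend(SYSTEM_LIB_DIRS.iter().map(|dir| Path::new(dir).join(file_name)));
    }
    candidates
}

/// Returns the first candidate that exists on disk, if any.
fn resolve_plugin_path(configured: &str, binary_dir: &Path, working_dir: &Path) -> Option<PathBuf> {
    candidate_paths(configured, binary_dir, working_dir)
        .into_iter()
        .find(|candidate| candidate.is_file())
}
```

Keeping the candidate list separate from the existence check means the same list can be logged when nothing is found, which matches the diagnostic behavior described above.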

Sink

Sinks are responsible for consuming the messages from the configured stream(s) and topic(s) and sending them further to the specified destination. For example, the Quickwit sink connector is responsible for sending the messages to the Quickwit indexer.

Please refer to the Sink documentation for the details about the configuration and the sample implementation.

When implementing Sink, make sure to use the sink_connector! macro to expose the FFI interface, which allows the connector runtime to register the sink. The macro also exports the connector's version (from Cargo.toml), which is reported by the runtime's /stats endpoint. Each sink should have its own custom configuration, which is passed along with the unique plugin ID to the expected new() method.
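
The overall shape of a sink can be illustrated with a self-contained sketch. Everything in it is an assumption made for illustration: the Sink trait below is a simplified, synchronous stand-in defined locally (its method names and signatures are not taken from the SDK), PrefixSink and PrefixSinkConfig are hypothetical, and the plugin ID is assumed to be a u32. Please refer to the Sink documentation for the real trait definition and macro usage.

```rust
/// Hypothetical per-sink configuration; real sinks define their own fields.
#[derive(Debug, Clone)]
struct PrefixSinkConfig {
    prefix: String,
}

/// Simplified stand-in for the SDK's `Sink` trait (assumed shape,
/// synchronous for brevity). Not the real SDK definition.
trait Sink {
    fn open(&mut self) -> Result<(), String>;
    fn consume(&mut self, messages: &[Vec<u8>]) -> Result<(), String>;
    fn close(&mut self) -> Result<(), String>;
}

/// Hypothetical sink that "delivers" messages by storing prefixed text.
struct PrefixSink {
    id: u32,
    config: PrefixSinkConfig,
    delivered: Vec<String>,
}

impl PrefixSink {
    /// Receives the plugin ID together with the sink's own configuration,
    /// mirroring the `new()` convention described above.
    fn new(id: u32, config: PrefixSinkConfig) -> Self {
        Self { id, config, delivered: Vec::new() }
    }
}

impl Sink for PrefixSink {
    fn open(&mut self) -> Result<(), String> {
        self.delivered.clear();
        Ok(())
    }

    fn consume(&mut self, messages: &[Vec<u8>]) -> Result<(), String> {
        for payload in messages {
            let text = std::str::from_utf8(payload)
                .map_err(|err| format!("sink {}: invalid UTF-8 payload: {}", self.id, err))?;
            self.delivered.push(format!("{}{}", self.config.prefix, text));
        }
        Ok(())
    }

    fn close(&mut self) -> Result<(), String> {
        Ok(())
    }
}

// In a real plugin crate, the `sink_connector!` macro would be invoked for the
// sink type to expose the FFI interface; it is omitted here because this
// sketch does not depend on the SDK.
```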

Available Sinks

  • Elasticsearch Sink - sends messages to Elasticsearch indices
  • Iceberg Sink - writes data to Apache Iceberg tables via REST catalog
  • PostgreSQL Sink - stores messages in PostgreSQL database tables
  • Quickwit Sink - indexes messages in Quickwit search engine
  • Stdout Sink - prints messages to standard output (useful for debugging/development)

Source

Sources are responsible for producing messages to the configured stream(s) and topic(s). For example, the Test source connector generates random messages that are then sent to the configured stream and topic.

Please refer to the Source documentation for the details about the configuration and the sample implementation.

Available Sources

  • Elasticsearch Source - polls documents from Elasticsearch indices
  • PostgreSQL Source - reads rows from PostgreSQL tables with multiple consumption strategies (delete after read, mark as processed, timestamp tracking)
  • Random Source - generates random test messages (useful for testing/development)

Building the connectors

A new connector can be built by implementing either the Sink or the Source trait. Please check the sink or source documentation, as well as the existing examples under the /sinks and /sources directories.

Transformations

Field transformations (depending on the supported payload formats) can be applied to the messages either before they are sent to the specified topic (e.g. when produced by the source connectors) or before they are consumed by the sink connectors. To add a new transformation, implement the Transform trait and extend the existing load function. Each transform may have its own custom configuration.
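
The "implement the trait, then extend load" pattern can be sketched as follows. This is a self-contained illustration under stated assumptions, not the SDK's API: the Transform trait and load function below are simplified local stand-ins, the payload is reduced to a map of string fields, and the add_field and delete_field names are hypothetical.

```rust
use std::collections::BTreeMap;

/// A message payload reduced to string fields (stand-in for a structured payload).
type Fields = BTreeMap<String, String>;

/// Simplified stand-in for the SDK's `Transform` trait (assumed shape).
trait Transform {
    fn transform(&self, fields: Fields) -> Fields;
}

/// Hypothetical transform with its own configuration: adds a fixed field.
struct AddField {
    key: String,
    value: String,
}

impl Transform for AddField {
    fn transform(&self, mut fields: Fields) -> Fields {
        fields.insert(self.key.clone(), self.value.clone());
        fields
    }
}

/// Hypothetical transform: removes a field if it is present.
struct DeleteField {
    key: String,
}

impl Transform for DeleteField {
    fn transform(&self, mut fields: Fields) -> Fields {
        fields.remove(&self.key);
        fields
    }
}

/// Stand-in for the `load` function: maps a configured name to a transform.
/// Supporting a new transform means adding one more match arm here.
fn load(name: &str, key: &str, value: &str) -> Option<Box<dyn Transform>> {
    match name {
        "add_field" => Some(Box::new(AddField {
            key: key.to_string(),
            value: value.to_string(),
        })),
        "delete_field" => Some(Box::new(DeleteField { key: key.to_string() })),
        _ => None,
    }
}

/// Applies the transforms in order, feeding each output into the next one.
fn apply_all(transforms: &[Box<dyn Transform>], fields: Fields) -> Fields {
    transforms.iter().fold(fields, |acc, t| t.transform(acc))
}
```

Returning boxed trait objects from load lets the set of transforms be chosen from configuration at startup, while each transform keeps its own typed configuration.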

To find out more about the transforms, stream decoders or encoders, please refer to the SDK documentation.