
SparkReceiverIO

SparkReceiverIO provides I/O transforms that read messages from an Apache Spark Receiver (org.apache.spark.streaming.receiver.Receiver) as an unbounded source.

Prerequisites

SparkReceiverIO supports Spark Receivers (Spark version 2.4) that meet the following requirements:

  1. The corresponding Spark Receiver should implement the HasOffset interface.
  2. Records should have a numeric field that represents the record offset. Examples: the RecordId field for Salesforce Receivers and the vid field for Hubspot Receivers. For more details, see the GetOffsetUtils class in the CDAP plugins examples.

Adding support for a new Spark Receiver

To add SparkReceiverIO support for a new Spark Receiver, perform the following steps:

  1. Add the Spark Receiver to the Maven Central repository (see the Sonatype publishing guidelines). Example: the Hubspot CDAP plugin Maven repository.
  2. Add the Spark Receiver Maven dependency to the build.gradle file. Example: implementation "io.cdap:hubspot-plugins:1.0.0".
  3. Implement a function that defines how to get a Long offset from a record of the Spark Receiver. Example: see the GetOffsetUtils class in the CDAP plugins examples.
  4. Construct a ReceiverBuilder object, specifying the class of the records you want to read (e.g. String) and your Spark Receiver class (from the dependency added in step 2). Example:
       ReceiverBuilder<String, HubspotReceiver> receiverBuilder =
           new ReceiverBuilder<>(HubspotReceiver.class).withConstructorArgs();
    
  5. Use your Spark Receiver with SparkReceiverIO:
    1. Pass the getOffsetFn (from step 3) and the ReceiverBuilder (from step 4). Example:
         SparkReceiverIO.Read<V> reader =
             SparkReceiverIO.<V>read()
                 .withGetOffsetFn(getOffsetFn)
                 .withSparkReceiverBuilder(receiverBuilder);
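
Step 3 above asks for a function that extracts a Long offset from each record. The following is a minimal sketch of what such a function might look like, assuming records arrive as JSON strings carrying a numeric vid field (as mentioned for Hubspot in the prerequisites). The class name OffsetSketch, the regex-based parsing, and the fallback to 0 are all illustrative assumptions; this is not the actual GetOffsetUtils implementation, and a real pipeline would more likely use a proper JSON parser. It uses java.util.function.Function only to stay free of external dependencies.

```java
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only (not part of Beam or the CDAP plugins).
class OffsetSketch {
    // Assumption: the record is a JSON string; this matches the first "vid": <digits> pair.
    private static final Pattern VID = Pattern.compile("\"vid\"\\s*:\\s*(\\d+)");

    // Returns the numeric vid value as the offset, or 0 if the record is null
    // or has no numeric vid field (the 0 fallback is an assumption of this sketch).
    public static Long getOffset(String record) {
        if (record == null) {
            return 0L;
        }
        Matcher m = VID.matcher(record);
        return m.find() ? Long.parseLong(m.group(1)) : 0L;
    }

    public static void main(String[] args) {
        // Plain java.util.function.Function stands in for the serializable
        // function type that a Beam pipeline would require.
        Function<String, Long> getOffsetFn = OffsetSketch::getOffset;
        System.out.println(getOffsetFn.apply("{\"vid\": 42, \"name\": \"a\"}")); // prints 42
    }
}
```

In an actual pipeline, the same logic would need to be expressed as a serializable function before being passed to withGetOffsetFn, since Beam ships user functions to workers.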
    

To learn more, see the complete CDAP Streaming plugins examples, where Spark Receivers are used.

Dependencies

To use SparkReceiverIO, add a dependency on beam-sdks-java-io-sparkreceiver.

<dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-sdks-java-io-sparkreceiver</artifactId>
    <version>...</version>
</dependency>

Documentation

The documentation and usage examples are maintained in the Javadoc for the SparkReceiverIO class.