[BAHIR-66] Switch to Java binding for ZeroMQ

Initially, I just wanted to implement integration test for BAHIR-66.
Google pointed me to JeroMQ, which provides official ZeroMQ binding
for Java and does not require native libraries. I have decided to give
it a try, but quickly realized that akka-zeromq module (transient
dependency from current Bahir master) is not compatible with JeroMQ.
Actually Akka team also wanted to move to JeroMQ (akka/akka#13856),
but in the end decided to remove akka-zeromq project completely
(akka/akka#15864, https://www.lightbend.com/blog/akka-roadmap-update-2014).

Having in mind that akka-zeromq does not support latest version of ZeroMQ
protocol and further development may come delayed, I have decided to refactor
streaming-zeromq implementation and leverage JeroMQ. With the change we receive
various benefits, such as support for PUB-SUB and PUSH-PULL messaging patterns
and the ability to bind the socket on whatever end of communication channel
(see test cases), subscription to multiple channels, etc. JeroMQ seems pretty
reliable and reconnection is handled out-of-the-box. Actually, we could even
start the ZeroMQ subscriber trying to connect to remote socket before other
end created and bound the socket. While I tried to preserve backward compatibility
of method signatures, there was no easy way to support Akka API and business
logic that users could put there (e.g. akka.actor.ActorSystem).

Closes #71
10 files changed
tree: 495e5483426afa770842a780386a1fd43b62aee7
  1. bin/
  2. dev/
  3. distribution/
  4. sql-cloudant/
  5. sql-streaming-akka/
  6. sql-streaming-mqtt/
  7. streaming-akka/
  8. streaming-mqtt/
  9. streaming-pubsub/
  10. streaming-twitter/
  11. streaming-zeromq/
  12. .gitattributes
  13. .gitignore
  14. .travis.yml
  15. LICENSE
  16. NOTICE
  17. pom.xml
  18. README.md
  19. scalastyle-config.xml
README.md

Apache Bahir

Apache Bahir provides extensions to distributed analytics platforms such as Apache Spark & Apache Flink.

http://bahir.apache.org/

Apache Bahir origins

The Initial Bahir source code (see issue BAHIR-1) containing the source for the Apache Spark streaming connectors for akka, mqtt, twitter, zeromq extracted from Apache Spark revision 8301fad (before the deletion of the streaming connectors akka, mqtt, twitter, zeromq).

Source code structure

Source code folder structure:

- streaming-akka
  - examples/src/main/...
  - src/main/...
- streaming-mqtt
  - examples
  - src
  - python
- ...

Building Bahir

Bahir is built using Apache Maven. To build Bahir and its example programs, run:

mvn -DskipTests clean install

Running tests

Testing first requires building Bahir. Once Bahir is built, tests can be run using:

mvn test

Example programs

Each extension currently available in Apache Bahir has an example application located under the “examples” folder.

Documentation

Currently, each submodule has its own README.md, with information on example usages and API.

Furthermore, to generate scaladocs for each module:

$ mvn package

Scaladocs is generated in, MODULE_NAME/target/site/scaladocs/index.html. __ Where MODULE_NAME is one of, sql-streaming-mqtt, streaming-akka, streaming-mqtt, streaming-zeromq, streaming-twitter. __

A note about Apache Spark integration

Currently, each module in Bahir is available through spark packages. Please follow linking sub section in module specific README.md for more details.