Eagle Declarative Streaming DSL

DSL Format

{
	config {
	  config.key = configValue
	}

	schema {
	  metricStreamSchema {
	    metric: string
	    value: double
	    timestamp: long
	  }
	}

	dataflow {
	  kafkaSource.source1 {
	    schema = "metricStreamSchema"
	  }
	  kafkaSource.source2 {
	    schema = {
	      metric: string
	      value: double
	      timestamp: long
	    }
	  }
	}
}
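
The dataflow above only declares sources. A fuller pipeline would also connect them to downstream executors, for example for alerting and persistence (both listed under Features below). The executor module names (alert, persistence) and the inputs wiring in this sketch are illustrative assumptions in the spirit of the DSL, not confirmed syntax:

```hocon
{
	dataflow {
	  kafkaSource.source1 {
	    schema = "metricStreamSchema"
	  }
	  # hypothetical module: evaluate alert policies against the stream
	  alert.metricAlertExecutor {
	    inputs = ["source1"]
	  }
	  # hypothetical module: persist metrics for monitoring
	  persistence.metricStore {
	    inputs = ["source1"]
	  }
	}
}
```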

Usage

val pipeline = Pipeline.parseResource("pipeline.conf")
val stream = Pipeline.compile(pipeline)
stream.submit[storm]
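
Two open items in the feature list concern configuration handling: passing a Config instance through instead of calling ConfigFactory.load() manually, and letting pipeline configuration override application configuration. With Typesafe Config the usual idiom is a fallback chain; the sketch below illustrates that idiom only (parseResources, withFallback, and load are standard Typesafe Config API, but wiring the resulting Config into Pipeline is an assumption, not the current implementation):

```scala
import com.typesafe.config.ConfigFactory

// Pipeline-level keys take precedence; application defaults fill the gaps.
val pipelineConfig = ConfigFactory.parseResources("pipeline.conf")
val config = pipelineConfig.withFallback(ConfigFactory.load())
```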

Features

  • [x] Compile DSL Configure to Pipeline model
  • [x] Compile Pipeline model to Stream Execution Graph
  • [x] Submit Stream Execution Graph to an actual running environment such as Storm
  • [x] Support Alert and Persistence for metric monitoring
  • [ ] Extensible stream module management: automatically scan and register modules
  • [x] Pipeline runner CLI tool and shell script
  • [ ] Decouple pipeline compiler and scheduler into individual modules
  • [ ] Stream Pipeline Scheduler
  • [ ] Graph editor to define streaming graph in UI
  • [?] JSON/Config & Scala Case Class Mapping (https://github.com/scala/pickling)
  • [?] Raw message-structure-oriented programming is a little ugly; we should define a generic message/event consisting of [payload:stream/timestamp/serializer/deserializer,data:message]
  • [ ] Provide stream schema inline and send it to metadata when submitting
  • [ ] UI should support specifying executorId when defining a new stream
  • [ ] Lack of an entity named StreamEntity for the end-to-end workflow of defining topology & policy
  • [!] Fix configuration conflict: pass the Config through instead of calling ConfigFactory.load() manually
  • [ ] Override application configuration with pipeline configuration
  • [ ] Refactor schema registration structure and automatically submit stream schema when submitting pipeline
  • [ ] Submit alertStream, alertExecutorId mapping to AlertExecutorService when submitting pipeline
  • [x] Support the inputs field to define connectors