title: “Quickstart: Scala API”

  • This will be replaced by the TOC {:toc}

Start working on your Flink Scala program in a few simple steps.

Requirements

The only requirements are working Maven 3.0.4 (or higher) and Java 6.x (or higher) installations.

Create Project

Use one of the following commands to create a project:

Inspect Project

There will be a new directory in your working directory. If you've used the curl approach, the directory is called quickstart. Otherwise, it has the name of your artifactId.

The sample project is a Maven project, which contains two classes. Job is a basic skeleton program and WordCountJob a working example. Please note that the main method of both classes allow you to start Flink in a development/testing mode.

We recommend to import this project into your IDE. For Eclipse, you need the following plugins, which you can install from the provided Eclipse Update Sites:

The IntelliJ IDE also supports Maven and offers a plugin for Scala development.

Build Project

If you want to build your project, go to your project directory and issue themvn clean package command. You will find a jar that runs on every Flink cluster in target/your-artifact-id-1.0-SNAPSHOT.jar. There is also a fat-jar, target/your-artifact-id-1.0-SNAPSHOT-flink-fat-jar.jar. This also contains all dependencies that get added to the maven project.

Next Steps

Write your application!

The quickstart project contains a WordCount implementation, the “Hello World” of Big Data processing systems. The goal of WordCount is to determine the frequencies of words in a text, e.g., how often do the terms “the” or “house” occurs in all Wikipedia texts.

Sample Input:

big data is big

Sample Output:

big 2
data 1
is 1

The following code shows the WordCount implementation from the Quickstart which processes some text lines with two operators (FlatMap and Reduce), and writes the prints the resulting words and counts to std-out.

object WordCountJob {
  def main(args: Array[String]) {

    // set up the execution environment
    val env = ExecutionEnvironment.getExecutionEnvironment

    // get input data
    val text = env.fromElements("To be, or not to be,--that is the question:--",
      "Whether 'tis nobler in the mind to suffer", "The slings and arrows of outrageous fortune",
      "Or to take arms against a sea of troubles,")

    val counts = text.flatMap { _.toLowerCase.split("\\W+") }
      .map { (_, 1) }
      .groupBy(0)
      .sum(1)

    // emit result
    counts.print()

    // execute program
    env.execute("WordCount Example")
  }
}

{% gh_link /flink-examples/flink-scala-examples/src/main/scala/org/apache/flink/examples/scala/wordcount/WordCount.scala “Check GitHub” %} for the full example code.

For a complete overview over our API, have a look at the Programming Guide and further example programs. If you have any trouble, ask on our Mailing List. We are happy to provide help.