[
{
"title": "Committers",
"tags": "",
"keywords": "",
"url": "../docs/committers",
"summary": "",
"body": "## Commit activityTo see commit activity for Apache Quarks, click [here](https://github.com/apache/incubator-quarks/pulse).## How to become a committerYou can become a committer by contributing to Quarks. Qualifications for a new committer include:* **Sustained Contributions**: Potential committers should have a history of contributions to Quarks. They will create pull requests over a period of time.* **Quality of Contributions**: Potential committers should submit code that adds value to Quarks, including tests and documentation as needed. They should comment in a positive way on issues and pull requests, providing guidance and input to improve Quarks.* **Community Involvement**: Committers should participate in discussions in a positive way, triage and fix bugs, and interact with users who submit questions. They will remain courteous, and helpful, and encourage new members to join the Quarks community."
},
{
"title": "Common Quarks operations",
"tags": "",
"keywords": "",
"url": "../docs/common-quarks-operations",
"summary": "",
"body": "In the [Getting started guide](quarks-getting-started), we covered a Quarks application where we read from a device's simulated temperature sensor. Yet Quarks supports more operations than simple filtering. Data analysis and streaming require a suite of functionality, the most important components of which will be outlined below.## TStream.map()`TStream.map()` is arguably the most used method in the Quarks API. Its two main purposes are to perform stateful or stateless operations on a stream's tuples, and to produce a `TStream` with tuples of a different type from that of the calling stream.### Changing a TStream's tuple typeIn addition to filtering tuples, `TStream`s support operations that *transform* tuples from one Java type to another by invoking the `TStream.map()` method.This is useful in cases such as calculating the floating point average of a list of `Integer`s, or tokenizing a Java String into a list of `String`s. To demonstrate this, let's say we have a `TStream` which contains a few lines, each of which contains multiple words:```javaTStream lines = topology.strings( \"this is a line\", \"this is another line\", \"there are three lines now\", \"and now four\");```We then want to print the third word in each line. The best way to do this is to convert each line to a list of `String`s by tokenizing them. We can do this in one line of code with the `TStream.map()` method:```javaTStream > wordsInLine = lines.map(tuple -> Arrays.asList(tuple.split(\" \")));```Since each tuple is now a list of strings, the `wordsInLine` stream is of type `List`. As you can see, the `map()` method has the ability to change the type of the `TStream`. Finally, we can use the `wordsInLine` stream to print the third word in each line.```javawordsInLine.sink(list -> System.out.println(list.get(2)));```As mentioned in the [Getting started guide](quarks-getting-started), a `TStream` can be parameterized to any serializable Java type, including ones created by the user.### Performing stateful operationsIn all previous examples, the operations performed on a `TStream` have been stateless; keeping track of information over multiple invocations of the same operation has not been necessary. What if we want to keep track of the number of Strings sent over a stream? To do this, we need our `TStream.map()` method to contain a counter as state.This can be achieved by creating an anonymous `Function` class, and giving it the required fields.```javaTStream streamOfStrings = ...;TStream counts = streamOfStrings.map(new Function() { int count = 0; @Override public Integer apply(String arg0) { count = count + 1; return count; }});```The `count` field will now contain the number of `String`s which were sent over `streamOfStrings`. Although this is a simple example, the anonymous `Function` passed to `TStream.map()` can contain any kind of state! This could be a `HashMap`, a running list of tuples, or any serializable Java type. The state will be maintained throughout the entire runtime of your application."
},
{
"title": "Apache Quarks community",
"tags": "",
"keywords": "",
"url": "../docs/community",
"summary": "",
"body": "Every volunteer project obtains its strength from the people involved in it. We invite you to participate as much or as little as you choose.You can:* Use our project and provide a feedback.* Provide us with the use-cases.* Report bugs and submit patches.* Contribute code, javadocs, documentation.Visit the [Contributing](http://www.apache.org/foundation/getinvolved.html) page for general Apache contribution information. If you plan to make any significant contribution, you will need to have an Individual Contributor License Agreement [\\(ICLA\\)](https://www.apache.org/licenses/icla.txt) on file with Apache.## Mailing listGet help using {{ site.data.project.short_name }} or contribute to the project on our mailing lists:{% if site.data.project.user_list %}* [site.data.project.user_list](mailto:{{ site.data.project.user_list }}) is for usage questions, help, and announcements. [subscribe](mailto:{{ site.data.project.user_list_subscribe }}?subject=send this email to subscribe), [unsubscribe](mailto:{{ site.data.project.dev_list_unsubscribe }}?subject=send this email to unsubscribe), [archives]({{ site.data.project.user_list_archive_mailarchive }}){% endif %}* [{{ site.data.project.dev_list }}](mailto:{{ site.data.project.dev_list }}) is for people who want to contribute code to {{ site.data.project.short_name }}. [subscribe](mailto:{{ site.data.project.dev_list_subscribe }}?subject=send this email to subscribe), [unsubscribe](mailto:{{ site.data.project.dev_list_unsubscribe }}?subject=send this email to unsubscribe), [Apache archives]({{ site.data.project.dev_list_archive }}), [mail-archive.com archives]({{ site.data.project.dev_list_archive_mailarchive }})* [{{ site.data.project.commits_list }}](mailto:{{ site.data.project.commits_list }}) is for commit messages and patches to {{ site.data.project.short_name }}. [subscribe](mailto:{{ site.data.project.commits_list_subscribe }}?subject=send this email to subscribe), [unsubscribe](mailto:{{ site.data.project.commits_list_unsubscribe }}?subject=send this email to unsubscribe), [Apache archives]({{ site.data.project.commits_list_archive }}), [mail-archive.com archives]({{ site.data.project.commits_list_archive_mailarchive }})## Issue trackerWe use Jira here: [https://issues.apache.org/jira/browse/{{ site.data.project.jira }}](https://issues.apache.org/jira/browse/{{ site.data.project.jira }})### Bug reportsFound bug? Create an issue in [Jira](https://issues.apache.org/jira/browse/{{ site.data.project.jira }}).Before submitting an issue, please:* Verify that the bug does in fact exist* Search the issue tracker to verify there is no existing issue reporting the bug you've found* Consider tracking down the bug yourself in the {{ site.data.project.short_name }} source and submitting a pull request along with your bug report. This is a great time saver for the {{ site.data.project.short_name }} developers and helps ensure the bug will be fixed quickly.### Feature requestsEnhancement requests for new features are also welcome. The more concrete the request is and the better rationale you provide, the greater the chance it will incorporated into future releases. 
To make a request, create an issue in [Jira](https://issues.apache.org/jira/browse/{{ site.data.project.jira }}).## Source codeThe project sources are accessible via the [source code repository]({{ site.data.project.source_repository }}) which is also mirrored in [GitHub]({{ site.data.project.source_repository_mirror }}).When you are considering a code contribution, make sure there is an [Jira issue](https://issues.apache.org/jira/browse/{{ site.data.project.jira }}) that describes your work or the bug you are fixing. For significant contributions, please discuss your proposed changes in the issue so that others can comment on your plans. Someone else may be working on the same functionality, so it's good to communicate early and often. A committer is more likely to accept your change if there is clear information in the issue.To contribute, [fork](https://help.github.com/articles/fork-a-repo/) the [mirror]({{ site.data.project.source_repository_mirror }}) and issue a [pull request](https://help.github.com/articles/using-pull-requests/). Put the Jira issue number, e.g. {{ site.data.project.jira }}-100 in the pull request title. The tag [WIP] can also be used in the title of pull requests to indicate that you are not ready to merge but want feedback. Remove [WIP] when you are ready for merge. Make sure you document your code and contribute tests along with the code.Read [DEVELOPMENT.md](https://github.com/apache/incubator-quarks/blob/master/DEVELOPMENT.md) at the top of the code tree for details on setting up your development environment.## Web site and documentation source codeThe project website and documentation sources are accessible via the [website source code repository]({{ site.data.project.website_repository }}) which is also mirrored in [GitHub]({{ site.data.project.website_repository_mirror }}). Contributing changes to the web site and documentation is similar to contributing code. Follow the instructions in the *Source Code* section above, but fork and issue a pull request against the [web site mirror]({{ site.data.project.website_repository_mirror }}). Follow the instructions in the top-level [README.md]({{ site.data.project.website_repository_mirror }}/blob/master/README.md) for details on contributing to the web site and documentation.You will need to use [Markdown](https://daringfireball.net/projects/markdown/) and [Jekyll](http://jekyllrb.com) to develop pages. See:* [Markdown Cheat Sheet](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet)* [Jekyll on linux and Mac](https://jekyllrb.com/)* [Jekyll on Windows](https://jekyllrb.com/docs/windows/) is not officially supported but people have gotten it to work"
},
{
"title": "Application console",
"tags": "",
"keywords": "",
"url": "../docs/console",
"summary": "",
"body": "## Visualizing and monitoring your applicationThe Quarks application console is a web application that enables you to visualize your application topology and monitor the tuples flowing through your application. The kind of oplets used in the topology, as well as the stream tags included in the topology, are also visible in the console.## Adding the console web app to your applicationTo use the console, you must use the Quarks classes that provide the service to access the console web application or directly call the `HttpServer` class itself, start the server and then obtain the console URL.The easiest way to include the console in your application is to use the the `DevelopmentProvider` class. `DevelopmentProvider` is a subclass of `DirectProvider` and adds services such as access to the console web application and counter oplets used to determine tuple counts. You can get the URL for the console from the `DevelopmentProvider` using the `getService` method as shown in a hypothetical application shown below:```javaimport java.util.concurrent.TimeUnit;import quarks.console.server.HttpServer;import quarks.providers.development.DevelopmentProvider;import quarks.topology.TStream;import quarks.topology.Topology;public class TempSensorApplication { public static void main(String[] args) throws Exception { TempSensor sensor = new TempSensor(); DevelopmentProvider dp = new DevelopmentProvider(); Topology topology = dp.newTopology(); TStream tempReadings = topology.poll(sensor, 1, TimeUnit.MILLISECONDS); TStream filteredReadings = tempReadings.filter(reading -> reading 80); filteredReadings.print(); System.out.println(dp.getServices().getService(HttpServer.class).getConsoleUrl()); dp.submit(topology); }}```Note that the console URL is being printed to `System.out`. The `filteredReadings` are as well, since `filteredReadings.print()` is being called in the application. You may need to scroll your terminal window up to see the output for the console URL.Optionally, you can modify the above code in the application to have a timeout before submitting the topology, which would allow you to see the console URL before any other output is shown. The modification would look like this:```java// Print the console URL and wait for 10 seconds before submitting the topologySystem.out.println(dp.getServices().getService(HttpServer.class).getConsoleUrl());try { TimeUnit.SECONDS.sleep(10);} catch (InterruptedException e) { // Do nothing}dp.submit(topology);```The other way to embed the console in your application is shown in the `HttpServerSample.java` example (on [GitHub](https://github.com/apache/incubator-quarks/blob/master/samples/console/src/main/java/quarks/samples/console/HttpServerSample.java)). It gets the `HttpServer` instance, starts it, and prints out the console URL. Note that it does not submit a job, so when the console is displayed in the browser, there are no running jobs and therefore no topology graph. The example is meant to show how to get the `HttpServer` instance, start the console web app and get the URL of the console.## Accessing the consoleThe console URL has the following format:`http://host_name:port_number/console`Once it is obtained from `System.out`, enter it in a browser window.If you cannot access the console at this URL, ensure there is a `console.war` file in the `webapps` directory. 
If the `console.war` file cannot be found, an exception will be thrown (in `stdout`) indicating `console.war` was not found. ## ConsoleWaterDetector sample To see the features of the console in action and as a way to demonstrate how to monitor a topology in the console, let's look at the `ConsoleWaterDetector` sample (on [GitHub](https://github.com/apache/incubator-quarks/blob/master/samples/console/src/main/java/quarks/samples/console/ConsoleWaterDetector.java)). Prior to running any console applications, the `console.war` file must be built as mentioned above. If you are building Quarks from a Git repository, go to the top-level Quarks directory and run `ant`. Here is an example in my environment: ``` Susans-MacBook-Pro-247:quarks susancline$ pwd /Users/susancline/git/quarks Susans-MacBook-Pro-247:quarks susancline$ ant Buildfile: /Users/susancline/git/quarks/build.xml setcommitversion: init: suball: init: project.component: compile: ... [javadoc] Constructing Javadoc information... [javadoc] Standard Doclet version 1.8.0_71 [javadoc] Building tree for all the packages and classes... [javadoc] Generating /Users/susancline/git/quarks/target/docs/javadoc/quarks/analytics/sensors/package-summary.html... [javadoc] Copying file /Users/susancline/git/quarks/analytics/sensors/src/main/java/quarks/analytics/sensors/doc-files/deadband.png to directory /Users/susancline/git/quarks/target/docs/javadoc/quarks/analytics/sensors/doc-files... [javadoc] Generating /Users/susancline/git/quarks/target/docs/javadoc/quarks/topology/package-summary.html... [javadoc] Copying file /Users/susancline/git/quarks/api/topology/src/main/java/quarks/topology/doc-files/sources.html to directory /Users/susancline/git/quarks/target/docs/javadoc/quarks/topology/doc-files... [javadoc] Building index for all the packages and classes... [javadoc] Building index for all classes... all: BUILD SUCCESSFUL Total time: 3 seconds ``` The `find` command below confirms that `console.war` was built and is in the correct place, under the `webapps` directory: ``` Susans-MacBook-Pro-247:quarks susancline$ find . -name console.war -print ./target/java8/console/webapps/console.war ``` Now we know we have built `console.war`, so we're good to go. To run this sample from the command line: ``` Susans-MacBook-Pro-247:quarks susancline$ pwd /Users/susancline/git/quarks Susans-MacBook-Pro-247:quarks susancline$ java -cp target/java8/samples/lib/quarks.samples.console.jar:. quarks.samples.console.ConsoleWaterDetector ``` If everything is successful, you'll start seeing output. You may have to scroll back up to get the URL of the console: ``` Susans-MacBook-Pro-247:quarks susancline$ java -cp target/java8/samples/lib/quarks.samples.console.jar:. 
quarks.samples.console.ConsoleWaterDetector Mar 07, 2016 12:04:52 PM org.eclipse.jetty.util.log.Log initialized INFO: Logging initialized @176ms Mar 07, 2016 12:04:53 PM org.eclipse.jetty.server.Server doStart INFO: jetty-9.3.6.v20151106 Mar 07, 2016 12:04:53 PM org.eclipse.jetty.server.handler.ContextHandler doStart INFO: Started o.e.j.s.ServletContextHandler@614c5515{/jobs,null,AVAILABLE} Mar 07, 2016 12:04:53 PM org.eclipse.jetty.server.handler.ContextHandler doStart INFO: Started o.e.j.s.ServletContextHandler@77b52d12{/metrics,null,AVAILABLE} Mar 07, 2016 12:04:53 PM org.eclipse.jetty.webapp.StandardDescriptorProcessor visitServlet INFO: NO JSP Support for /console, did not find org.eclipse.jetty.jsp.JettyJspServlet Mar 07, 2016 12:04:53 PM org.eclipse.jetty.server.handler.ContextHandler doStart INFO: Started o.e.j.w.WebAppContext@2d554825{/console,file:///private/var/folders/0c/pb4rznhj7sbc886t30w4vpxh0000gn/T/jetty-0.0.0.0-0-console.war-_console-any-3101338829524954950.dir/webapp/,AVAILABLE}{/console.war} Mar 07, 2016 12:04:53 PM org.eclipse.jetty.server.AbstractConnector doStart INFO: Started ServerConnector@66480dd7{HTTP/1.1,[http/1.1]}{0.0.0.0:57964} Mar 07, 2016 12:04:53 PM org.eclipse.jetty.server.Server doStart INFO: Started @426ms http://localhost:57964/console Well1 alert, ecoli value is 1 Well1 alert, temp value is 48 Well3 alert, ecoli value is 1 ``` Now point your browser to the URL displayed above in the output from running the Java command to launch the `ConsoleWaterDetector` application. In this case, the URL is `http://localhost:57964/console`. Below is a screen shot of what you should see if everything is working properly: ## ConsoleWaterDetector application scenario The application is now running in your browser. Let's discuss the scenario for the application. A county agency is responsible for ensuring the safety of residents' well water. Each well they monitor has four different sensor types: * Temperature * Acidity * Ecoli * Lead The sample application topology monitors 3 wells: * For the hypothetical scenario, Well1 and Well3 produce 'unhealthy' values from their sensors on occasion. Well2 always produces 'healthy' values. * Each well that is to be measured is added to the topology. The topology polls each sensor (temp, ecoli, etc.) for each well as a unit. A `TStream` is returned from polling the topology and represents a sensor reading. Each sensor reading for the well has a tag added to it with the reading type, i.e., \"temp\", and the well id. Once all of the sensor readings are obtained and the tags added, each sensor reading is 'unioned' into a single `TStream`. Look at the `waterDetector` method for details on this. * Now, each well has a single stream with each of the sensor readings as a property with a name and value in the `TStream`. Next, the `alertFilter` method is called on the `TStream` representing each well. This method checks the values for each well's sensors to determine if they are 'out of range' for healthy values. The `filter` oplet is used to do this. If any of the sensor's readings are out of the acceptable range, the tuple is passed along. Those that are within an acceptable range are discarded. * Next, the application's `splitAlert` method is called on each well's stream that contains the union of all the sensor readings that are out of range. The `splitAlert` method uses the `split` oplet to split the incoming stream into 5 different streams. Only those tuples that are out of range for each stream, which represents a sensor type, will be returned. 
The object returned from `splitAlert` is a list of `TStream` objects. The `splitAlert` method is shown below: ```java public static List<TStream<JsonObject>> splitAlert(TStream<JsonObject> alertStream, int wellId) { List<TStream<JsonObject>> allStreams = alertStream.split(5, tuple -> { if (tuple.get(\"temp\") != null) { JsonObject tempObj = new JsonObject(); int temp = tuple.get(\"temp\").getAsInt(); if (temp <= TEMP_ALERT_MIN || temp >= TEMP_ALERT_MAX) { tempObj.addProperty(\"temp\", temp); return 0; } else { return -1; } } else if (tuple.get(\"acidity\") != null) { JsonObject acidObj = new JsonObject(); int acid = tuple.get(\"acidity\").getAsInt(); if (acid <= ACIDITY_ALERT_MIN || acid >= ACIDITY_ALERT_MAX) { acidObj.addProperty(\"acidity\", acid); return 1; } else { return -1; } } else if (tuple.get(\"ecoli\") != null) { JsonObject ecoliObj = new JsonObject(); int ecoli = tuple.get(\"ecoli\").getAsInt(); if (ecoli >= ECOLI_ALERT) { ecoliObj.addProperty(\"ecoli\", ecoli); return 2; } else { return -1; } } else if (tuple.get(\"lead\") != null) { JsonObject leadObj = new JsonObject(); int lead = tuple.get(\"lead\").getAsInt(); if (lead >= LEAD_ALERT_MAX) { leadObj.addProperty(\"lead\", lead); return 3; } else { return -1; } } else { return -1; } }); return allStreams; } ``` * Next, we want to get the temperature stream from the first well and put a rate meter on it to determine the rate at which out-of-range values are flowing in the stream: ```java List<TStream<JsonObject>> individualAlerts1 = splitAlert(filteredReadings1, 1); /* Put a rate meter on well1's temperature sensor output */ Metrics.rateMeter(individualAlerts1.get(0)); ``` * Next, all the sensors for well 1 have tags added to their streams indicating that the stream is out of range for that sensor, along with the well id. Then a sink is added, passing each tuple to a `Consumer` that formats a string to `System.out` containing the well id, alert type (sensor type), and value of the sensor. 
```java /* Put a rate meter on well1's temperature sensor output */ Metrics.rateMeter(individualAlerts1.get(0)); individualAlerts1.get(0).tag(TEMP_ALERT_TAG, \"well1\").sink(tuple -> System.out.println(\"\\n\" + formatAlertOutput(tuple, \"1\", \"temp\"))); individualAlerts1.get(1).tag(ACIDITY_ALERT_TAG, \"well1\").sink(tuple -> System.out.println(formatAlertOutput(tuple, \"1\", \"acidity\"))); individualAlerts1.get(2).tag(ECOLI_ALERT_TAG, \"well1\").sink(tuple -> System.out.println(formatAlertOutput(tuple, \"1\", \"ecoli\"))); individualAlerts1.get(3).tag(LEAD_ALERT_TAG, \"well1\").sink(tuple -> System.out.println(formatAlertOutput(tuple, \"1\", \"lead\"))); ``` Output in the terminal window from the `formatAlertOutput` method will look like this: ``` Well1 alert, temp value is 86 Well3 alert, ecoli value is 2 Well1 alert, ecoli value is 1 Well3 alert, acidity value is 1 Well1 alert, lead value is 12 Well1 alert, ecoli value is 2 Well3 alert, lead value is 10 Well3 alert, acidity value is 10 ``` Notice how only the wells whose sensors produce out-of-range values (Well1 and Well3) show output. ## Detecting zero tuple counts At the end of the `ConsoleWaterDetector` application is this snippet of code, added after the topology has been submitted: ```java dp.submit(wellTopology); while (true) { MetricRegistry metricRegistry = dp.getServices().getService(MetricRegistry.class); SortedMap<String, Counter> counters = metricRegistry.getCounters(); Set<Entry<String, Counter>> values = counters.entrySet(); for (Entry<String, Counter> e : values) { if (e.getValue().getCount() == 0) { System.out.println(\"Counter Op:\" + e.getKey() + \" tuple count: \" + e.getValue().getCount()); } } Thread.sleep(2000); } ``` This gets all the counters in the `MetricRegistry` class and prints out the name of the counter oplet each is monitoring, along with its tuple count, whenever that count is zero. 
Here is some sample output: ``` Counter Op:TupleCounter.quarks.oplet.JOB_0.OP_44 has a tuple count of zero! Counter Op:TupleCounter.quarks.oplet.JOB_0.OP_45 has a tuple count of zero! Counter Op:TupleCounter.quarks.oplet.JOB_0.OP_46 has a tuple count of zero! Counter Op:TupleCounter.quarks.oplet.JOB_0.OP_47 has a tuple count of zero! Counter Op:TupleCounter.quarks.oplet.JOB_0.OP_89 has a tuple count of zero! Counter Op:TupleCounter.quarks.oplet.JOB_0.OP_95 has a tuple count of zero! Counter Op:TupleCounter.quarks.oplet.JOB_0.OP_96 has a tuple count of zero! Counter Op:TupleCounter.quarks.oplet.JOB_0.OP_97 has a tuple count of zero! Counter Op:TupleCounter.quarks.oplet.JOB_0.OP_98 has a tuple count of zero! ``` To summarize what the application is doing: * Unions all sensor type readings for a single well * Filters all sensor type readings for a single well, passing on only those tuples that have at least one sensor type with an out-of-range value * Splits the stream containing name/value pairs for sensor types and readings into individual sensor types, returning only those streams that contain out-of-range values * Outputs to the command line the well id, sensor type, and value that is out of range * Adds tags at various points in the topology for easier identification of either the well or some out-of-range condition * Contains counters to measure tuple counts, since `DevelopmentProvider` was used * Places individual rate meters on `well1` and `well3`'s temperature sensors to determine the rate of 'unhealthy' values * Prints out the names of the counter oplets whose tuple counts are zero ## Topology graph controls Now that you have an understanding of what the application is doing, let's look at some of the controls in the console, so we can learn how to monitor the application. Below is a screen shot of the top controls: the controls that affect the Topology Graph. * **Job**: A drop-down to select which job is being displayed in the Topology Graph. An application can contain multiple jobs. * **State**: Hovering over the 'State' icon shows information about the selected job: the current and next states of the job, the job id, and the job name. * **View by**: This select is used to change how the topology graph is displayed. The three options for this select are: - Static flow - Tuple count - Oplet kind Currently it is set to 'Static flow'. This means the oplets (represented as circles in the topology graph) do not change size, nor do the lines or links (representing the edges of the topology graph) change width or position. The graph is not being refreshed when it is in 'Static flow' mode. * **Refresh interval**: Allows the user to select an interval between 3 and 20 seconds to refresh the tuple count values in the graph. Every X seconds the metrics for the topology graph are refreshed. More about metrics a little bit later. * **Pause graph**: Stops the refresh interval timer. Once the 'Pause graph' button is clicked, the user must push 'Resume graph' for the graph to be updated, and then refreshed at the interval set in the 'Refresh interval' timer. It can be helpful to pause the graph if multiple oplets are occupying the same area on the graph, and their names become unreadable. 
Once the graph is paused, the user can drag an oplet off of another oplet to better view the name and see the edge(s) that connect them. * **Show tags**: If the checkbox appears in the top controls, it means: - The 'View by' layer is capable of displaying stream tags - The topology currently shown in the topology graph has stream tags associated with it * **Show all tags**: Selecting this checkbox shows all the tags present in the topology. If you want to see only certain tags, uncheck this box and select the button labeled 'Select individual tags ...'. A dialog will appear, and you can select one or all of the tags listed in the dialog which are present in the topology. The next aspects of the console we'll look at are the popups available when selecting 'View all oplet properties', hovering over an oplet, and hovering over an edge (link). The screen shot below shows the output from clicking on the 'View all oplet properties' link directly below the job selector: Looking at the sixth line in the table, where the Name is 'OP_5', we can see that the Oplet kind is a `Map`, a `quarks.oplet.functional.Map`; the Tuple count is 0 (this is because the view is in Static flow mode - the graph does not show the number of tuples flowing in it); the source oplet is 'OP_55'; the target oplet is 'OP_60'; and there are no stream tags coming from the source or target streams. Relationships for all oplets can be viewed in this manner. Now, looking at the graph, if we want to see the relationships for a single oplet, we can hover over it. The image below shows the hover when we are over 'OP_5'. You can also hover over the edges of the topology graph to get information. Hover over the edge (link) between 'OP_0' and 'OP_55'. The image shows the name and kind of the oplet as the source, and the name and kind of the oplet as the target. Again, the tuple count is 0 since this is the 'Static flow' view. The last item of information in the tooltip is the tags on the stream. One or many tags can be added to a stream. In this case we see the tags 'temperature' and 'well1'. The section of the code that adds the tags 'temperature' and 'well1' is in the `waterDetector` method of the `ConsoleWaterDetector` class. ```java public static TStream<JsonObject> waterDetector(Topology topology, int wellId) { Random rNum = new Random(); TStream<Integer> temp = topology.poll(() -> rNum.nextInt(TEMP_RANDOM_HIGH - TEMP_RANDOM_LOW) + TEMP_RANDOM_LOW, 1, TimeUnit.SECONDS); TStream<Integer> acidity = topology.poll(() -> rNum.nextInt(ACIDITY_RANDOM_HIGH - ACIDITY_RANDOM_LOW) + ACIDITY_RANDOM_LOW, 1, TimeUnit.SECONDS); TStream<Integer> ecoli = topology.poll(() -> rNum.nextInt(ECOLI_RANDOM_HIGH - ECOLI_RANDOM_LOW) + ECOLI_RANDOM_LOW, 1, TimeUnit.SECONDS); TStream<Integer> lead = topology.poll(() -> rNum.nextInt(LEAD_RANDOM_HIGH - LEAD_RANDOM_LOW) + LEAD_RANDOM_LOW, 1, TimeUnit.SECONDS); TStream<Integer> id = topology.poll(() -> wellId, 1, TimeUnit.SECONDS); /* add tags to each sensor */ temp.tag(\"temperature\", \"well\" + wellId); ``` ### Legend The legend(s) that appear in the console depend on the view currently displayed. In the static flow mode, if no stream tags are present, there is no legend. In this example we have stream tags in the topology, so the static flow mode gives us the option to select 'Show tags'. If selected, the result is the addition of the stream tags legend: This legend shows all the tags that have been added to the topology, regardless of whether or not 'Show all tags' is checked or specific tags have been selected from the dialog that appears when the 'Select individual tags ...' 
button is clicked. ### Topology graph Now that we've covered most of the ways to modify the view of the topology graph and discussed the application, let's look at the topology graph as a way to understand our application. When analyzing what is happening in your application, here are some ways you might use the console to help you understand it: * Topology of the application - how the edges and vertices of the graph are related * Tuple flow - tuple counts since the application was started * The effect of filters or maps on the downstream streams * Stream tags - if tags are added dynamically based on a condition, where the streams with tags are displayed in the topology Let's start with the static flow view of the topology. We can look at the graph, and we can also hover over any of the oplets or streams to better understand the connections. Also, we can click 'View all oplet properties' and see the relationships in a tabular format. The other thing to notice in the static flow view is the tags. Look for any colored edges (the links between the oplets). All of the left-most oplets have streams with tags. Most of them have the color that corresponds to 'Multiple tags'. If you hover over the edges, you can see the tags. It's obvious that we have tagged each sensor with the sensor type and the well id. Now, if you look to the far right, you can see more tags on streams coming out of a `split` oplet. They also have multiple tags, and hovering over them you can determine that they represent out-of-range values for each sensor type for the well. Notice how the `split` oplet, OP_43, has no tags in the streams coming out of it. If you follow that split oplet back, you can determine from the first tags that it is part of the well2 stream. If you refer back to the `ConsoleWaterDetector` source, you can see that no tags were placed on the streams coming out of `well2`'s split because they contained no out-of-range values. Let's switch the view to Oplet kind now. It will make it clearer which oplets are producing the streams with the tags on them. Below is an image of how the graph looks after switching to the Oplet kind view. In the Oplet kind view the links are all the same width, but the circles representing the oplets are sized according to tuple flow. Notice how the circles representing OP_10, OP_32 and OP_21 are large in relation to OP_80, OP_88 and OP_89. As a matter of fact, we can't even see the circle representing OP_89. Looking at OP_35 and then the Oplet kind legend, you can see by the color that it is a Filter oplet. It is barely visible because the filter that we used against `well2`, which is the stream that OP_35 is part of, returned no tuples. This is a bit difficult to see. Let's look at the Tuple count view. The Tuple count view will make it clearer that no tuples are flowing out of OP_35, which represents the filter for `well2` and only returns out-of-range values. You may recall that in this example `well2` returned no out-of-range values. Below is the screen shot of the graph in 'Tuple count' view mode. The topology graph oplets can sometimes sit on top of each other. If this is the case, pause the refresh and use your mouse to pull down on the oplets that are in the same position. This will allow you to see their names. Alternatively, you can use the 'View all properties' table to see the relationships between oplets. ### Metrics If you scroll the browser window down, you can see a Metrics section. 
This section appears when the application contains either of the following: * A `DevelopmentProvider` is used; this automatically inserts counters on the streams of the topology * A `quarks.metrics.Metric.Counter` or `quarks.metrics.Metric.RateMeter` is added to an individual stream ### Counters In the `ConsoleWaterDetector` application we used a `DevelopmentProvider`. Therefore, counters were added to most streams (edges), with the following exceptions (from the [Javadoc]({{ site.docsurl }}/lastest/quarks/metrics/Metrics.html#counter-quarks.topology.TStream-) for `quarks.metrics.Metrics`): *Oplets are only inserted upstream from a FanOut oplet.* *If a chain of Peek oplets exists between oplets A and B, a Metric oplet is inserted after the last Peek, right upstream from oplet B.* *If a chain of Peek oplets is followed by a FanOut, a metric oplet is inserted between the last Peek and the FanOut oplet. The implementation is not idempotent; previously inserted metric oplets are treated as regular graph vertices. Calling the method twice will insert a new set of metric oplets into the graph.* Also, the application inserts counters on `well2`'s streams after the streams from the individual sensors were unioned and then split: ```java List<TStream<JsonObject>> individualAlerts2 = splitAlert(filteredReadings2, 2); TStream<JsonObject> alert0Well2 = individualAlerts2.get(0); alert0Well2 = Metrics.counter(alert0Well2); alert0Well2.tag(\"well2\", \"temp\"); TStream<JsonObject> alert1Well2 = individualAlerts2.get(1); alert1Well2 = Metrics.counter(alert1Well2); alert1Well2.tag(\"well2\", \"acidity\"); TStream<JsonObject> alert2Well2 = individualAlerts2.get(2); alert2Well2 = Metrics.counter(alert2Well2); alert2Well2.tag(\"well2\", \"ecoli\"); TStream<JsonObject> alert3Well2 = individualAlerts2.get(3); alert3Well2 = Metrics.counter(alert3Well2); alert3Well2.tag(\"well2\", \"lead\"); ``` When looking at the select next to the label 'Metrics', make sure 'Count, oplets OP_37, OP_49 ...' is selected. This select compares all of the counters in the topology, visualized as a bar graph. An image is shown below: Hover over individual bars to get the number of tuples that have flowed through that oplet since the application was started. You can also see the oplet name. You can see that some of the oplets have zero tuples flowing through them. The bars that are the tallest, and therefore have the highest tuple count, are OP_76, OP_67 and OP_65. If you look back up to the topology graph, in the Tuple count view, you can see that the edges (streams) surrounding these oplets have the color that corresponds to the highest tuple count (in the pictures above that color is bright orange in the Tuple count legend). ### Rate meters The other type of metric we can look at is the rate meter. In the `ConsoleWaterDetector` application we added two rate meters, with the objective of comparing the rate of out-of-range readings between `well1` and `well3`: ```java List<TStream<JsonObject>> individualAlerts1 = splitAlert(filteredReadings1, 1); /* Put a rate meter on well1's temperature sensor output */ Metrics.rateMeter(individualAlerts1.get(0)); ... List<TStream<JsonObject>> individualAlerts3 = splitAlert(filteredReadings3, 3); /* Put a rate meter on well3's temperature sensor output */ Metrics.rateMeter(individualAlerts3.get(0)); ``` Rate meters contain the following metrics for each stream they are added to: * Tuple count * The rate of change in the tuple count. The following rates are available for a single stream: - 1 minute rate change - 5 minute rate change - 15 minute rate change - Mean rate change 
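These rate values can also be read programmatically, by analogy with the counter loop shown earlier. Here is a sketch (an assumption on my part: that the `MetricRegistry` service is the codahale registry backing Quarks metrics, and that rate meters appear in its `getMeters()` map): ```java
MetricRegistry metricRegistry = dp.getServices().getService(MetricRegistry.class);
SortedMap<String, Meter> meters = metricRegistry.getMeters();
for (Entry<String, Meter> e : meters.entrySet()) {
    // getMeanRate() and getOneMinuteRate() parallel the console's rate selections
    System.out.println(e.getKey() + \" mean rate: \" + e.getValue().getMeanRate());
}
``` 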
Now change the Metrics select to the 'MeanRate'. In our example these correspond to oplets OP_37 and OP_49: Hovering over the slightly larger bar, the one to the right, the name is OP_49. Looking at the topology graph and changing the view to 'Static flow', follow the edges back from OP_49 until you can see an edge with a tag on it. You can see that OP_49's source is OP_51, whose source is OP_99. The edge between OP_99 and its source OP_48 has multiple tags. Hovering over this stream, the tags are 'TEMP out of range' and 'well3'. If a single rate meter is placed on a stream, in addition to plotting a bar chart, a line chart over the last 20 measures can be viewed. For example, if I comment out the addition of the rate meter for `well1` and then rerun the application, the Metrics section will look like the image below. I selected the 'OneMinuteRate' and 'Line chart' for Chart type: ## Summary The intent of the information on this page is to help you understand the following: * How to add the console application to a Quarks application * How to run the `ConsoleWaterDetector` sample * The design/architecture of the `ConsoleWaterDetector` application * What the controls for the Topology graph are and what they do, including the different views of the graph * The legend for the graph * How to interpret the graph and use the tooltips over the edges and vertices, as well as the 'View all properties' link * How to add counters and rate meters to a topology * How to use the metrics section to understand tuple counters and rate meters * How to correlate values from the metrics section with the topology graph The Quarks console will continue to evolve and improve. Please open an issue if you see a problem with the existing console, but more importantly, add an issue if you have an idea of how to make the console better. The more folks who write Quarks applications and view them in the console, the more information we can gather from the community about what is needed in the console. Please consider making a contribution if there is a feature in the console that would really help you and others!"
},
{
"title": "FAQ",
"tags": "",
"keywords": "",
"url": "../docs/faq",
"summary": "",
"body": "## What is Apache Quarks?Quarks provides APIs and a lightweight runtime to analyze streaming data at the edge.## What do you mean by the edge?The edge includes devices, gateways, equipment, vehicles, systems, appliances and sensors of all kinds as part of the Internet of Things.## How is Apache Quarks used?Quarks can be used at the edge of the Internet of Things, for example, to analyze data on devices, engines, connected cars, etc. Quarks could be on the device itself, or a gateway device collecting data from local devices. You can write an edge application on Quarks and connect it to a Cloud service, such as the IBM Watson IoT Platform. It can also be used for enterprise data collection and analysis; for example log collectors, application data, and data center analytics.## How are applications developed?Applications are developed using a functional flow API to define operations on data streams that are executed as a graph of \"oplets\" in a lightweight embeddable runtime. The SDK provides capabilities like windowing, aggregation and connectors with an extensible model for the community to expand its capabilities.## What APIs does Apache Quarks support?Currently, Quarks supports APIs for Java and Android. Support for additional languages, such as Python, is likely as more developers get involved. Please consider joining the Quarks open source development community to accelerate the contributions of additional APIs.## What type of analytics can be done with Apache Quarks?Quarks provides windowing, aggregation and simple filtering. It uses Apache Common Math to provide simple analytics aimed at device sensors. Quarks is also extensible, so you can call existing libraries from within your Quarks application. In the future, Quarks will include more analytics, either exposing more functionality from Apache Common Math, other libraries or hand-coded analytics.## What connectors does Apache Quarks support?Quarks supports connectors for MQTT, HTTP, JDBC, File, Apache Kafka and IBM Watson IoT Platform. Quarks is extensible; you can add the connector of your choice.## What centralized streaming analytic systems does Apache Quarks support?Quarks supports open source technology (such as Apache Spark, Apache Storm, Flink and samza), IBM Streams (on-premises or IBM Streaming Analytics on Bluemix), or any custom application of your choice.## Why do I need Apache Quarks on the edge, rather than my streaming analytic system?Quarks is designed for the edge, rather than a more centralized system. It has a small footprint, suitable for running on devices. Quarks provides simple analytics, allowing a device to analyze data locally and to only send to the centralized system if there is a need, reducing communication costs.## Why do I need Apache Quarks, rather than coding the complete application myself?Quarks is a tool for edge analytics that allows you to be more productive. Quarks provides a consistent data model (streams and windows) and provides useful functionality, such as aggregations, joins, etc. Using Quarks lets you to take advantage of this functionality, allowing you to focus on your application needs.## Where can I download Apache Quarks to try it out?Quarks is migrating from github quarks-edge to Apache. You can download the source from Apache and build it yourself [here](https://github.com/apache/incubator-quarks). You can also find already built pre-Apache releases of Quarks for download [here](https://github.com/quarks-edge/quarks/releases/latest). 
These releases are not associated with Apache.## How do I get started?Getting started is simple. Once you have downloaded Quarks, everything you need to know to get up and running, you will find [here](quarks-getting-started). We suggest you also run the [Quarks sample programs](samples) to familiarize yourselves with the code base.## How can I get involved?We would love to have your help! Visit [Get Involved](community) to learn more about how to get involved.## How can I contribute code?Just submit a [pull request](https://github.com/apache/incubator-quarks) and wait for a committer to review. For more information, visit our [committer page](committers) and read [DEVELOPMENT.md](https://github.com/apache/incubator-quarks/blob/master/DEVELOPMENT.md) at the top of the code tree.## Can I become a committer?Read about Quarks committers and how to become a committer [here](committers).## Where can I get the code?The source code is available [here](https://github.com/apache/incubator-quarks).## Can I take a copy of the code and fork it for my own use?Yes. Quarks is available under the Apache 2.0 license which allows you to fork the code. We hope you will contribute your changes back to the Quarks community.## How do I suggest new features?Click [Issues](https://issues.apache.org/jira/browse/QUARKS) to submit requests for new features. You may browse or query the Issues database to see what other members of the Quarks community have already requested.## How do I submit bug reports?Click [Issues](https://issues.apache.org/jira/browse/QUARKS) to submit a bug report.## How do I ask questions about Apache Quarks?Use [site.data.project.user_list](mailto:{{ site.data.project.user_list }}) to submit questions to the Quarks community.## Why is Apache Quarks open source?With the growth of the Internet of Things there is a need to execute analytics at the edge. Quarks was developed to address requirements for analytics at the edge for IoT use cases that were not addressed by central analytic solutions. These capabilities will be useful to many organizations and that the diverse nature of edge devices and use cases is best addressed by an open community. Our goal is to develop a vibrant community of developers and users to expand the capabilities and real-world use of Quarks by companies and individuals to enable edge analytics and further innovation for the IoT space."
},
{
"title": "Introduction",
"tags": "getting_started",
"keywords": "",
"url": "../docs/home",
"summary": "",
"body": "## Apache Quarks overviewDevices and sensors are everywhere, and more are coming online every day. You need a way to analyze all of the data coming from your devices, but it can be expensive to transmit all of the data from a sensor to your central analytics engine.Quarks is an open source programming model and runtime for edge devices that enables you to analyze data and events at the device. When you analyze on the edge, you can:* Reduce the amount of data that you transmit to your analytics server* Reduce the amount of data that you storeA Quarks application uses analytics to determine when data needs to be sent to a back-end system for further analysis, action, or storage. For example, you can use Quarks to determine whether a system is running outside of normal parameters, such as an engine that is running too hot.If the system is running normally, you don’t need to send this data to your back-end system; it’s an added cost and an additional load on your system to process and store. However, if Quarks detects an issue, you can transmit that data to your back-end system to determine why the issue is occurring and how to resolve the issue.Quarks enables you to shift from sending a continuous flow of trivial data to the server to sending only essential and meaningful data as it occurs. This is especially important when the cost of communication is high, such as when using a cellular network to transmit data, or when bandwidth is limited.The following use cases describe the primary situations in which you would use Quarks:* **Internet of Things (IoT)**: Analyze data on distributed edge devices and mobile devices to: - Reduce the cost of transmitting data - Provide local feedback at the devices* **Embedded in an application server instance**: Analyze application server error logs in real time without impacting network traffic* **Server rooms and machine rooms**: Analyze machine health in real time without impacting network traffic or when bandwidth is limited### Deployment environmentsThe following environments have been tested for deployment on edge devices:* Java 8, including Raspberry Pi B and Pi2 B* Java 7* Android### Edge devices and back-end systemsYou can send data from an Apache Quarks application to your back-end system when you need to perform analysis that cannot be performed on the edge device, such as:* Running a complex analytic algorithm that requires more resources, such as CPU or memory, than are available on the edge device* Maintaining large amounts of state information about a device, such as several hours worth of state information for a patient’s medical device* Correlating data from the device with data from other sources, such as: - Weather data - Social media data - Data of record, such as a patient’s medical history or trucking manifests - Data from other devicesQuarks communicates with your back-end systems through the following message hubs:* MQTT – The messaging standard for IoT* IBM Watson IoT Platform – A cloud-based service that provides a device model on top of MQTT* Apache Kafka – An enterprise-level message bus* Custom message hubsYour back-end systems can also use analytics to interact with and control edge devices. For example:* A traffic alert system can send an alert to vehicles that are heading towards an area where an accident occurred* A vehicle monitoring system can reduce the maximum engine revs to reduce the chance of failure before the next scheduled service if it detects patterns that indicate a potential problem"
},
{
"title": "Quarks",
"tags": "",
"keywords": "",
"url": "../",
"summary": "",
"body": ""
},
{
"title": "Overview",
"tags": "",
"keywords": "",
"url": "../docs/overview",
"summary": "",
"body": "# Apache Quarks OverviewDevices and sensors are everywhere, and more are coming online every day. You need a way to analyze all of the data coming from your devices, but it can be expensive to transmit all of the data from a sensor to your central analytics engine.Apache Quarks is an open source programming model and runtime for edge devices that enables you to analyze data and events at the device. When you analyze on the edge, you can:* Reduce the amount of data that you transmit to your analytics server* Reduce the amount of data that you storeA Quarks application uses analytics to determine when data needs to be sent to a back-end system for further analysis, action, or storage. For example, you can use Quarks to determine whether a system is running outside of normal parameters, such as an engine that is running too hot.If the system is running normally, you don’t need to send this data to your back-end system; it’s an added cost and an additional load on your system to process and store. However, if Quarks detects an issue, you can transmit that data to your back-end system to determine why the issue is occurring and how to resolve the issue. Quarks enables you to shift from sending a continuous flow of trivial data to the server to sending only essential and meaningful data as it occurs. This is especially important when the cost of communication is high, such as when using a cellular network to transmit data, or when bandwidth is limited.The following use cases describe the primary situations in which you would use Quarks:* *Internet of Things (IoT):* Analyze data on distributed edge devices and mobile devices to: * Reduce the cost of transmitting data * Provide local feedback at the devices* *Embedded in an application server instance:* Analyze application server error logs in real time without impacting network traffic* *Server rooms and machine rooms:* Analyze machine health in real time without impacting network traffic or when bandwidth is limited## Deployment environmentsThe following environments have been tested for deployment on edge devices:* Java 8, including Raspberry Pi B and Pi2 B* Java 7* Android## Edge devices and back-end systemsYou can send data from a Quarks application to your back-end system when you need to perform analysis that cannot be performed on the edge device, such as:* Running a complex analytic algorithm that requires more resources, such as CPU or memory, than are available on the edge device.* Maintaining large amounts of state information about a device, such as several hours worth of state information for a patient’smedical device.* Correlating data from the device with data from other sources, such as: * Weather data * Social media data * Data of record, such as a patient’s medical history or trucking manifests * Data from other devicesQuarks communicates with your back-end systems through the following message hubs:* MQTT – The messaging standard for IoT* IBM Watson IoT Platform – A cloud-based service that provides a device model on top of MQTT* Apache Kafka – An enterprise-level message bus* Custom message hubsYour back-end systems can also use analytics to interact with and control edge devices. For example:* A traffic alert system can send an alert to vehicles that are heading towards an area where an accident occurred* A vehicle monitoring system can reduce the maximum engine revs to reduce the chance of failure before the next scheduled service if it detects patterns that indicate a potential problem"
},
{
"title": "Getting started with Apache Quarks",
"tags": "",
"keywords": "",
"url": "../docs/quarks-getting-started",
"summary": "",
"body": "## What is Apache Quarks?Quarks is an open source programming model and runtime for edge devices that enables you to analyze streaming data on your edge devices. When you analyze on the edge, you can:* Reduce the amount of data that you transmit to your analytics server* Reduce the amount of data that you storeFor more information, see the [Quarks overview](home).### Apache Quarks and streaming analyticsThe fundamental building block of a Quarks application is a **stream**: a continuous sequence of tuples (messages, events, sensor readings, and so on).The Quarks API provides the ability to process or analyze each tuple as it appears on a stream, resulting in a derived stream.Source streams are streams that originate data for analysis, such as readings from a device's temperature sensor.Streams are terminated using sink functions that can perform local device control or send information to centralized analytic systems through a message hub.Quarks' primary API is functional where streams are sourced, transformed, analyzed or sinked though functions, typically represented as lambda expressions, such as `reading -> reading 80` to filter temperature readings in Fahrenheit.### Downloading Apache QuarksTo use Quarks, access the source code and build it. You can read more about building Quarks [here](https://github.com/apache/incubator-quarks/blob/master/DEVELOPMENT.md).After you build the Quarks package, you can set up your environment.### Setting up your environmentEnsure that you are running a supported environment. For more information, see the [Quarks overview](home). This guide assumes you're running Java 8.The Quarks Java 8 JAR files are located in the `quarks/java8/lib` directory.1. Create a new Java project in Eclipse, and specify Java 8 as the execution environment JRE: 2. Modify the Java build path to include all of the JAR files in the `quarks\\java8\\lib` directory: Your environment is set up! You can start writing your first Quarks application.## Creating a simple applicationIf you're new to Quarks or to writing streaming applications, the best way to get started is to write a simple program.Quarks is a framework that pushes data analytics and machine learning to *edge devices*. (Edge devices include things like routers, gateways, machines, equipment, sensors, appliances, or vehicles that are connected to a network.) Quarks enables you to process data locally—such as, in a car engine, on an Android phone, or Raspberry Pi—before you send data over a network.For example, if your device takes temperature readings from a sensor 1,000 times per second, it is more efficient to process the data locally and send only interesting or unexpected results over the network. To simulate this, let's define a (simulated) TempSensor class:```javaimport java.util.Random;import quarks.function.Supplier;/** * Every time get() is called, TempSensor generates a temperature reading. */public class TempSensor implements Supplier { double currentTemp = 65.0; Random rand; TempSensor(){ rand = new Random(); } @Override public Double get() { // Change the current temperature some random amount double newTemp = rand.nextGaussian() + currentTemp; currentTemp = newTemp; return currentTemp; }}```Every time you call `TempSensor.get()`, it returns a new temperature reading. The continuous temperature readings are a stream of data that a Quarks application can process.Our sample Quarks application processes this stream by filtering the data and printing the results. 
Let's define a TempSensorApplication class for the application: ```java import java.util.concurrent.TimeUnit; import quarks.providers.direct.DirectProvider; import quarks.topology.TStream; import quarks.topology.Topology; public class TempSensorApplication { public static void main(String[] args) throws Exception { TempSensor sensor = new TempSensor(); DirectProvider dp = new DirectProvider(); Topology topology = dp.newTopology(); TStream<Double> tempReadings = topology.poll(sensor, 1, TimeUnit.MILLISECONDS); TStream<Double> filteredReadings = tempReadings.filter(reading -> reading < 50 || reading > 80); filteredReadings.print(); dp.submit(topology); } } ``` To understand how the application processes the stream, let's review each line. ### Specifying a provider Your first step when you write a Quarks application is to create a [`DirectProvider`]({{ site.docsurl }}/lastest/index.html?quarks/providers/direct/DirectProvider.html): ```java DirectProvider dp = new DirectProvider(); ``` A `Provider` is an object that contains information on how and where your Quarks application will run. A `DirectProvider` is a type of Provider that runs your application directly within the current virtual machine when its `submit()` method is called. ### Creating a topology Additionally, a Provider is used to create a [`Topology`]({{ site.docsurl }}/lastest/index.html?quarks/topology/Topology.html) instance: ```java Topology topology = dp.newTopology(); ``` In Quarks, `Topology` is a container that describes the structure of your application: * Where the streams in the application come from * How the data in the stream is modified In the TempSensor application above, we have exactly one data source: the `TempSensor` object. We define the source stream by calling `topology.poll()`, which takes both a `Supplier` function and a time parameter to indicate how frequently readings should be taken. In our case, we read from the sensor every millisecond: ```java TStream<Double> tempReadings = topology.poll(sensor, 1, TimeUnit.MILLISECONDS); ``` ### Defining the `TStream` object Calling `topology.poll()` to define a source stream creates a `TStream` instance, which represents the series of readings taken from the temperature sensor. A streaming application can run indefinitely, so the `TStream` might see an arbitrarily large number of readings pass through it. Because a `TStream` represents the flow of your data, it supports a number of operations which allow you to modify your data. ### Filtering a `TStream` In our example, we want to filter the stream of temperature readings, and remove any \"uninteresting\" or expected readings—specifically readings which are above 50 degrees and below 80 degrees. To do this, we call the `TStream`'s `filter` method and pass in a function that returns *true* if the data is interesting and *false* if the data is uninteresting: ```java TStream<Double> filteredReadings = tempReadings.filter(reading -> reading < 50 || reading > 80); ``` As you can see, the function that is passed to `filter` operates on each tuple individually. Unlike data streaming frameworks like [Apache Spark](https://spark.apache.org/), which operate on a collection of data in batch mode, Quarks achieves low latency processing by manipulating each piece of data as soon as it becomes available. Filtering a `TStream` produces another `TStream` that contains only the filtered tuples; for example, the `filteredReadings` stream. ### Printing to output When our application detects interesting data (data outside of the expected parameters), we want to print results. 
You can do this by calling the `TStream.print()` method, which prints using `.toString()` on each tuple that passes through the stream:```javafilteredReadings.print();```Unlike `TStream.filter()`, `TStream.print()` does not produce another `TStream`. This is because `TStream.print()` is a **sink**, which represents the terminus of a stream.In addition to `TStream.print()`, there are other sink operations that send tuples to an MQTT server, JDBC connection, file, or Kafka cluster. Additionally, you can define your own sink by invoking `TStream.sink()` and passing in your own function.### Submitting your applicationNow that your application has been completely declared, the final step is to run your application.`DirectProvider` contains a `submit()` method, which runs your application directly within the current virtual machine:```javadp.submit(topology);```After you run your program, you should see output containing only \"interesting\" data coming from your sensor:```
49.904032311772596
47.97837504039084
46.59272336309031
46.681544551652934
47.400819234155236
...
```As you can see, all temperatures are outside the 50-80 degree range. In terms of a real-world application, this would prevent a device from sending superfluous data over a network, thereby reducing communication costs.## Further examplesThis example demonstrates a small piece of Quarks' functionality. Quarks supports more complicated topologies, such as topologies that require merging and splitting data streams, or perform operations which aggregate the last *N* seconds of data (for example, calculating a moving average).For more complex examples, see:* [Quarks sample programs](samples)* [Common Quarks operations](common-quarks-operations)"
},
{
"title": "Apache Quarks documentation",
"tags": "",
"keywords": "",
"url": "../docs/quarks_index",
"summary": "",
"body": "## New documentationApache Quarks is evolving, and so is the documentation. If the existing documentation hasn't answered your questions, you can request new or updated documentation by opening a [Jira](https://issues.apache.org/jira/browse/QUARKS) issue.## Providing feedbackTo provide feedback on our documentation:1. Navigate to the documentation page for which you are providing feedback1. Click on the **Feedback** button in the top right cornerThis will open an issue for the page that you are currently visiting.## Contributing documentationIf you have ideas on how we can better document or explain some of the concepts, we would love to have your contribution! This site uses GitHub's flavor of Markdown and Jekyll markdown for our documentation.Refer to this documentation on GitHub's flavor of Markdown: [Writing on GitHub](https://help.github.com/categories/writing-on-github).Refer to this documentation to get started: [Using Jekyll with Pages](https://help.github.com/articles/using-jekyll-with-pages/).To contribute, clone this project locally, make your changes, and create a [pull request](https://github.com/apache/incubator-quarks-website/pulls).To learn more, visit [Get Involved](getinvolved)."
},
{
"title": "Quickstart IBM Watson IoT Platform sample",
"tags": "",
"keywords": "",
"url": "../docs/quickstart",
"summary": "",
"body": "## Quarks to Quickstart quickly!IoT devices running quarks applications typically connect to back-end analytic systems through a message hub. Message hubs are used to isolate the back-end system from having to handle connections from thousands to millions of devices.An example of such a message hub designed for the Internet of Things is [IBM Watson IoT Platform](https://internetofthings.ibmcloud.com/). This cloud service runs on IBM's Bluemix cloud platformand Quarks provides a [connector]({{ site.docsurl }}/lastest/index.html?quarks/connectors/iotf/IotfDevice.html).You can test out the service without any registration by using its Quickstart service and the Quarks sample application: [code](https://github.com/apache/incubator-quarks/blob/master/samples/connectors/src/main/java/quarks/samples/connectors/iotf/IotfQuickstart.java), [JavaDocs]({{ site.docsurl }}/lastest/index.html?quarks/samples/connectors/iotf/IotfQuickstart.html).You can execute the class directly from Eclipse, or using the script: [`quarks/java8/scripts/connectors/iotf/runiotfquickstart.sh`](https://github.com/quarks-edge/quarks/blob/master/scripts/connectors/iotf/runiotfquickstart.sh)When run it produces output like this, with a URL as the third line.Pointing any browser on any machine to that URL takes you to a view of the data coming from the sample application. This view is executing in Bluemix, thus the device events from this sample are being sent over the public internet to the Quickstart Bluemix service.Here's an example view:## Quarks codeThe full source is at: [IotfQuickstart.java](https://github.com/apache/incubator-quarks/blob/master/samples/connectors/src/main/java/quarks/samples/connectors/iotf/IotfQuickstart.java).The first step to is to create a `IotDevice` instance that represents the connection to IBM Watson IoT Platform Quickstart service.```java// Declare a connection to IoTF Quickstart serviceString deviceId = \"qs\" + Long.toHexString(new Random().nextLong());IotDevice device = IotfDevice.quickstart(topology, deviceId);```Now any stream can send device events to the Quickstart service by simply calling its `events()` method. Here we map a stream of random numbers into JSON as the payload for a device event is typically JSON.```javaTStream json = raw.map(v -> { JsonObject j = new JsonObject(); j.addProperty(\"temp\", v[0]); j.addProperty(\"humidity\", v[1]); j.addProperty(\"objectTemp\", v[2]); return j;});```Now we have a stream of simulated sensor reading events as JSON tuples (`json`) we send them as events with event identifer (type) `sensors` using `device`.```javadevice.events(json, \"sensors\", QoS.FIRE_AND_FORGET);```It's that simple to send tuples on a Quarks stream to IBM Watson IoT Platform as device events."
},
{
"title": "Using an adaptable deadtime filter",
"tags": "",
"keywords": "",
"url": "../recipes/recipe_adaptable_deadtime_filter",
"summary": "",
"body": "Oftentimes, an application wants to control the frequency that continuously generated analytic results are made available to other parts of the application or published to other applications or an event hub.For example, an application polls an engine temperature sensor every second and performs various analytics on each reading — an analytic result is generated every second. By default, the application only wants to publish a (healthy) analytic result every 30 minutes. However, under certain conditions, the desire is to publish every per-second analytic result.Such a condition may be locally detected, such as detecting a sudden rise in the engine temperature or it may be as a result of receiving some external command to change the publishing frequency.Note this is a different case than simply changing the polling frequency for the sensor as doing that would disable local continuous monitoring and analysis of the engine temperature.This case needs a *deadtime filter* and Quarks provides one for your use! In contrast to a *deadband filter*, which skips tuples based on a deadband value range, a deadtime filter skips tuples based on a *deadtime period* following a tuple that is allowed to pass through. For example, if the deadtime period is 30 minutes, after allowing a tuple to pass, the filter skips any tuples received for the next 30 minutes. The next tuple received after that is allowed to pass through, and a new deadtime period is begun.See `quarks.analytics.sensors.Filters.deadtime()` (on [GitHub](https://github.com/apache/incubator-quarks/blob/master/analytics/sensors/src/main/java/quarks/analytics/sensors/Filters.java)) and `quarks.analytics.sensors.Deadtime` (on [GitHub](https://github.com/apache/incubator-quarks/blob/master/analytics/sensors/src/main/java/quarks/analytics/sensors/Deadtime.java)).This recipe demonstrates how to use an adaptable deadtime filter.A Quarks `IotProvider` ad `IoTDevice` with its command streams would be a natural way to control the application. In this recipe we will just simulate a \"set deadtime period\" command stream.## Create a polled sensor readings stream```javaTopology top = ...;SimulatedTemperatureSensor tempSensor = new SimulatedTemperatureSensor();TStream engineTemp = top.poll(tempSensor, 1, TimeUnit.SECONDS) .tag(\"engineTemp\");```It's also a good practice to add tags to streams to improve the usability of the development mode Quarks console.## Create a deadtime filtered stream—initially no deadtimeIn this recipe we'll just filter the direct ``engineTemp`` sensor reading stream. In practice this filtering would be performed after some analytics stages and used as the input to ``IotDevice.event()`` or some other connector publish operation.```javaDeadtime deadtime = new Deadtime();TStream deadtimeFilteredEngineTemp = engineTemp.filter(deadtime) .tag(\"deadtimeFilteredEngineTemp\");```## Define a \"set deadtime period\" method```javastatic void setDeadtimePeriod(Deadtime deadtime, long period, TimeUnit unit) { System.out.println(\"Setting deadtime period=\"+period+\" \"+unit); deadtime.setPeriod(period, unit);}```## Process the \"set deadtime period\" command streamOur commands are on the ``TStream cmds`` stream. 
Each ``JsonObject`` tuple is a command with the properties \"period\" and \"unit\".```javacmds.sink(json -> setDeadtimePeriod(deadtime, json.getAsJsonPrimitive(\"period\").getAsLong(), TimeUnit.valueOf(json.getAsJsonPrimitive(\"unit\").getAsString())));```## The final applicationWhen the application is run it will initially print out temperature sensor readings every second for 15 seconds—the deadtime period is 0. Then every 15 seconds the application will toggle the deadtime period between 5 seconds and 0 seconds, resulting in a reduction in tuples being printed during the 5-second deadtime period.```javaimport java.util.Date;import java.util.concurrent.TimeUnit;import java.util.concurrent.atomic.AtomicInteger;import com.google.gson.JsonObject;import quarks.analytics.sensors.Deadtime;import quarks.console.server.HttpServer;import quarks.providers.development.DevelopmentProvider;import quarks.providers.direct.DirectProvider;import quarks.samples.utils.sensor.SimulatedTemperatureSensor;import quarks.topology.TStream;import quarks.topology.Topology;/** * A recipe for using an Adaptable Deadtime Filter. */public class AdaptableDeadtimeFilterRecipe { /** * Poll a temperature sensor to periodically obtain temperature readings. * Create a \"deadtime\" filtered stream: after passing a tuple, * any tuples received during the \"deadtime\" are filtered out. * Then the next tuple is passed through and a new deadtime period begun. * * Respond to a simulated command stream to change the deadtime window * duration. */ public static void main(String[] args) throws Exception { DirectProvider dp = new DevelopmentProvider(); System.out.println(\"development console url: \" + dp.getServices().getService(HttpServer.class).getConsoleUrl()); Topology top = dp.newTopology(\"TemperatureSensor\"); // Generate a polled temperature sensor stream and tag it SimulatedTemperatureSensor tempSensor = new SimulatedTemperatureSensor(); TStream<Double> engineTemp = top.poll(tempSensor, 1, TimeUnit.SECONDS) .tag(\"engineTemp\"); // Filter out tuples during the specified \"deadtime window\" // Initially no filtering. Deadtime<Double> deadtime = new Deadtime<>(); TStream<Double> deadtimeFilteredEngineTemp = engineTemp.filter(deadtime) .tag(\"deadtimeFilteredEngineTemp\"); // Report the time each temperature reading arrives and the value deadtimeFilteredEngineTemp.peek(tuple -> System.out.println(new Date() + \" temp=\" + tuple)); // Generate a simulated \"set deadtime period\" command stream TStream<JsonObject> cmds = simulatedSetDeadtimePeriodCmds(top); // Process the commands to change the deadtime window period cmds.sink(json -> setDeadtimePeriod(deadtime, json.getAsJsonPrimitive(\"period\").getAsLong(), TimeUnit.valueOf(json.getAsJsonPrimitive(\"unit\").getAsString()))); dp.submit(top); } static void setDeadtimePeriod(Deadtime<Double> deadtime, long period, TimeUnit unit) { System.out.println(\"Setting deadtime period=\"+period+\" \"+unit); deadtime.setPeriod(period, unit); } static TStream<JsonObject> simulatedSetDeadtimePeriodCmds(Topology top) { AtomicInteger lastPeriod = new AtomicInteger(-1); TStream<JsonObject> cmds = top.poll(() -> { // don't change on first invocation if (lastPeriod.get() == -1) { lastPeriod.incrementAndGet(); return null; } // toggle between 0 and 5 sec deadtime period int newPeriod = lastPeriod.get() == 5 ? 0 : 5; lastPeriod.set(newPeriod); JsonObject jo = new JsonObject(); jo.addProperty(\"period\", newPeriod); jo.addProperty(\"unit\", TimeUnit.SECONDS.toString()); return jo; }, 15, TimeUnit.SECONDS) .tag(\"cmds\"); return cmds; }}```"
},
{
"title": "Changing a filter's range",
"tags": "",
"keywords": "",
"url": "../recipes/recipe_adaptable_filter_range",
"summary": "",
"body": "The [Detecting a sensor value out of range](recipe_value_out_of_range.html) recipe introduced the basics of filtering as well as the use of a [Range]({{ site.docsurl }}/lastest//lastest/quarks/analytics/sensors/Range.html).Oftentimes, a user wants a filter's behavior to be adaptable rather than static. A filter's range can be made changeable via commands from some external source or just changed as a result of some other local analytics.A Quarks `IotProvider` and `IoTDevice` with its command streams would be a natural way to control the application. In this recipe we will just simulate a \"set optimal temp range\" command stream.The string form of a `Range` is natural, consise, and easy to use. As such it's a convenient form to use as external range format. The range string can easily be converted back into a `Range`.We're going to assume familiarity with that earlier recipe and those concepts and focus on just the \"adaptable range specification\" aspect of this recipe.## Define the rangeA `java.util.concurrent.atomic.AtomicReference` is used to provide the necessary thread synchronization.```javastatic Range DEFAULT_TEMP_RANGE = Ranges.valueOfDouble(\"[77.0..91.0]\");static AtomicReference> optimalTempRangeRef = new AtomicReference(DEFAULT_TEMP_RANGE);```## Define a method to change the range```javastatic void setOptimalTempRange(Range range) { System.out.println(\"Using optimal temperature range: \" + range); optimalTempRangeRef.set(range);}```The filter just uses `optimalTempRangeRef.get()` to use the current range setting.## Simulate a command streamA `TStream> setRangeCmds` stream is created and a new range specification tuple is generated every 10 seconds. A `sink()` on the stream calls `setOptimalTempRange()` to change the range and hence the filter's bahavior.```java// Simulate a command stream to change the optimal range.// Such a stream might be from an IotDevice command.String[] ranges = new String[] { \"[70.0..120.0]\", \"[80.0..130.0]\", \"[90.0..140.0]\",};AtomicInteger count = new AtomicInteger(0);TStream> setRangeCmds = top.poll(() -> Ranges.valueOfDouble(ranges[count.incrementAndGet() % ranges.length]), 10, TimeUnit.SECONDS);setRangeCmds.sink(tuple -> setOptimalTempRange(tuple));```## The final application```javaimport java.util.concurrent.TimeUnit;import java.util.concurrent.atomic.AtomicInteger;import java.util.concurrent.atomic.AtomicReference;import quarks.analytics.sensors.Range;import quarks.analytics.sensors.Ranges;import quarks.providers.direct.DirectProvider;import quarks.samples.utils.sensor.SimulatedTemperatureSensor;import quarks.topology.TStream;import quarks.topology.Topology;/** * Detect a sensor value out of expected range. * Simulate an adaptable range changed by external commands. */public class AdaptableFilterRange { /** * Optimal temperatures (in Fahrenheit) */ static Range DEFAULT_TEMP_RANGE = Ranges.valueOfDouble(\"[77.0..91.0]\"); static AtomicReference> optimalTempRangeRef = new AtomicReference(DEFAULT_TEMP_RANGE); static void setOptimalTempRange(Range range) { System.out.println(\"Using optimal temperature range: \" + range); optimalTempRangeRef.set(range); } /** * Polls a simulated temperature sensor to periodically obtain * temperature readings (in Fahrenheit). Use a simple filter * to determine when the temperature is out of the optimal range. 
 */ public static void main(String[] args) throws Exception { DirectProvider dp = new DirectProvider(); Topology top = dp.newTopology(\"TemperatureSensor\"); // Generate a stream of temperature sensor readings SimulatedTemperatureSensor tempSensor = new SimulatedTemperatureSensor(); TStream<Double> temp = top.poll(tempSensor, 1, TimeUnit.SECONDS); // Simple filter: Perform analytics on sensor readings to detect when // the temperature is out of the optimal range and generate warnings TStream<Double> simpleFiltered = temp.filter(tuple -> !optimalTempRangeRef.get().contains(tuple)); simpleFiltered.sink(tuple -> System.out.println(\"Temperature is out of range! \" + \"It is \" + tuple + \"\\u00b0F!\")); // See what the temperatures look like temp.print(); // Simulate a command stream to change the optimal range. // Such a stream might be from an IotDevice command. String[] ranges = new String[] { \"[70.0..120.0]\", \"[80.0..130.0]\", \"[90.0..140.0]\", }; AtomicInteger count = new AtomicInteger(0); TStream<Range<Double>> setRangeCmds = top.poll( () -> Ranges.valueOfDouble(ranges[count.incrementAndGet() % ranges.length]), 10, TimeUnit.SECONDS); setRangeCmds.sink(tuple -> setOptimalTempRange(tuple)); dp.submit(top); }}```"
},
{
"title": "Changing a polled source stream's period",
"tags": "",
"keywords": "",
"url": "../recipes/recipe_adaptable_polling_source",
"summary": "",
"body": "The [Writing a source function](recipe_source_function.html) recipe introduced the basics of creating a source stream by polling a data source periodically.Oftentimes, a user wants the poll frequency to be adaptable rather than static. For example, an event such as a sudden rise in a temperature sensor may motivate more frequent polling of the sensor and analysis of the data until the condition subsides. A change in the poll frequency may be driven by locally performed analytics or via a command from an external source.A Quarks `IotProvider` and `IoTDevice` with its command streams would be a natural way to control the application. In this recipe we will just simulate a \"set poll period\" command stream.The `Topology.poll()` [documentation]({{ site.docsurl }}/lastest/quarks/topology/Topology.html#poll-quarks.function.Supplier-long-java.util.concurrent.TimeUnit-) describes how the poll period may be changed at runtime.The mechanism is based on a more general Quarks runtime `quarks.execution.services.ControlService` service. The runtime registers \"control beans\" for entities that are controllable. These controls can be retrieved at runtime via the service.At runtime, `Topology.poll()` registers a `quarks.execution.mbeans.PeriodMXBean` control. __Retrieving the control at runtime requires setting an alias on the poll generated stream using `TStream.alias()`.__## Create the polled stream and set its alias```javaTopology top = ...;SimulatedTemperatureSensor tempSensor = new SimulatedTemperatureSensor();TStream engineTemp = top.poll(tempSensor, 1, TimeUnit.SECONDS) .alias(\"engineTemp\") .tag(\"engineTemp\");```It's also a good practice to add tags to streams to improve the usability of the development mode Quarks console.## Define a \"set poll period\" method```javastatic void setPollPeriod(TStream pollStream, long period, TimeUnit unit) { // get the topology's runtime ControlService service ControlService cs = pollStream.topology().getRuntimeServiceSupplier() .get().getService(ControlService.class); // using the the stream's alias, get its PeriodMXBean control PeriodMXBean control = cs.getControl(TStream.TYPE, pollStream.getAlias(), PeriodMXBean.class); // change the poll period using the control System.out.println(\"Setting period=\"+period+\" \"+unit+\" stream=\"+pollStream); control.setPeriod(period, unit);}```## Process the \"set poll period\" command streamOur commands are on the `TStream cmds` stream. Each `JsonObject` tuple is a command with the properties \"period\" and \"unit\".```javacmds.sink(json -> setPollPeriod(engineTemp, json.getAsJsonPrimitive(\"period\").getAsLong(), TimeUnit.valueOf(json.getAsJsonPrimitive(\"unit\").getAsString())));```## The final application```javaimport java.util.Date;import java.util.concurrent.TimeUnit;import java.util.concurrent.atomic.AtomicInteger;import com.google.gson.JsonObject;import quarks.execution.mbeans.PeriodMXBean;import quarks.execution.services.ControlService;import quarks.providers.development.DevelopmentProvider;import quarks.providers.direct.DirectProvider;import quarks.samples.utils.sensor.SimulatedTemperatureSensor;import quarks.topology.TStream;import quarks.topology.Topology;/** * A recipe for a polled source stream with an adaptable poll period. */public class AdaptablePolledSource { /** * Poll a temperature sensor to periodically obtain temperature readings. * Respond to a simulated command stream to change the poll period. 
 */ public static void main(String[] args) throws Exception { DirectProvider dp = new DirectProvider(); Topology top = dp.newTopology(\"TemperatureSensor\"); // Generate a polled temperature sensor stream and set its alias SimulatedTemperatureSensor tempSensor = new SimulatedTemperatureSensor(); TStream<Double> engineTemp = top.poll(tempSensor, 1, TimeUnit.SECONDS) .alias(\"engineTemp\") .tag(\"engineTemp\"); // Report the time each temperature reading arrives and the value engineTemp.peek(tuple -> System.out.println(new Date() + \" temp=\" + tuple)); // Generate a simulated \"set poll period\" command stream TStream<JsonObject> cmds = simulatedSetPollPeriodCmds(top); // Process the commands to change the poll period cmds.sink(json -> setPollPeriod(engineTemp, json.getAsJsonPrimitive(\"period\").getAsLong(), TimeUnit.valueOf(json.getAsJsonPrimitive(\"unit\").getAsString()))); dp.submit(top); } static void setPollPeriod(TStream<Double> pollStream, long period, TimeUnit unit) { // get the topology's runtime ControlService service ControlService cs = pollStream.topology().getRuntimeServiceSupplier() .get().getService(ControlService.class); // using the stream's alias, get its PeriodMXBean control PeriodMXBean control = cs.getControl(TStream.TYPE, pollStream.getAlias(), PeriodMXBean.class); // change the poll period using the control System.out.println(\"Setting period=\"+period+\" \"+unit+\" stream=\"+pollStream); control.setPeriod(period, unit); } static TStream<JsonObject> simulatedSetPollPeriodCmds(Topology top) { AtomicInteger lastPeriod = new AtomicInteger(1); TStream<JsonObject> cmds = top.poll(() -> { // toggle between 1 and 2 sec period int newPeriod = lastPeriod.get() == 1 ? 2 : 1; lastPeriod.set(newPeriod); JsonObject jo = new JsonObject(); jo.addProperty(\"period\", newPeriod); jo.addProperty(\"unit\", TimeUnit.SECONDS.toString()); return jo; }, 5, TimeUnit.SECONDS) .tag(\"cmds\"); return cmds; }}```"
},
{
"title": "Splitting a stream to apply different processing and combining the results into a single stream",
"tags": "",
"keywords": "",
"url": "../recipes/recipe_combining_streams_processing_results",
"summary": "",
"body": "In some cases, a developer might want to perform analytics taking into account the nature of the data. Say, for example, the data consists of log records each containing a level attribute. It would be logical to handle *fatal* log messages differently than *info* or *debug* messages. The same reasoning could also apply in the healthcare industry.Suppose doctors at a hospital would like to monitor patients' states using a bedside heart monitor. They would like to apply different analytics to the monitor readings based on the severity category of the blood pressure readings. For instance, if a patient is in hypertensive crisis (due to extremely high blood pressure), the doctors may want to analyze the patient's heart rate to determine risk of a stroke.In this instance, we can use `split` to separate blood pressure readings by category (five in total) and perform additional analytics on each of the resulting streams. After processing the data, we show how to define a new stream of alerts for each category and `union` the streams to create a stream containing all alerts.## Setting up the applicationWe assume that the environment has been set up following the steps outlined in the [Getting started guide](../docs/quarks-getting-started).First, we need to define a class for a heart monitor. We generate random blood pressure readings, each consisting of the systolic pressure (the top number) and the diastolic pressure (the bottom number). For example, with a blood pressure of 115/75 (read as \"115 over 75\"), the systolic pressure is 115 and the diastolic pressure is 75. These two pressures are stored in a `map`, and each call to `get()` returns new values.```javaimport java.util.HashMap;import java.util.Map;import java.util.Random;import quarks.function.Supplier;public class HeartMonitorSensor implements Supplier> { private static final long serialVersionUID = 1L; // Initial blood pressure public Integer currentSystolic = 115; public Integer currentDiastolic = 75; Random rand; public HeartMonitorSensor() { rand = new Random(); } /** * Every call to this method returns a map containing a random systolic * pressure and a random diastolic pressure. */ @Override public Map get() { // Change the current pressure by some random amount between -2 and 2 Integer newSystolic = rand.nextInt(2 + 1 + 2) - 2 + currentSystolic; currentSystolic = newSystolic; Integer newDiastolic = rand.nextInt(2 + 1 + 2) - 2 + currentDiastolic; currentDiastolic = newDiastolic; Map pressures = new HashMap(); pressures.put(\"Systolic\", currentSystolic); pressures.put(\"Diastolic\", currentDiastolic); return pressures; }}```Now, let's start our application by creating a `DirectProvider` and `Topology`. We choose a `DevelopmentProvider` so that we can view the topology graph using the console URL. 
We have also created a `HeartMonitorSensor`.```javaimport java.util.HashSet;import java.util.List;import java.util.Map;import java.util.Set;import java.util.concurrent.TimeUnit;import quarks.console.server.HttpServer;import quarks.function.ToIntFunction;import quarks.providers.development.DevelopmentProvider;import quarks.providers.direct.DirectProvider;import quarks.samples.utils.sensor.HeartMonitorSensor;import quarks.topology.TStream;import quarks.topology.Topology;public class CombiningStreamsProcessingResults { public static void main(String[] args) { HeartMonitorSensor monitor = new HeartMonitorSensor(); DirectProvider dp = new DevelopmentProvider(); System.out.println(dp.getServices().getService(HttpServer.class).getConsoleUrl()); Topology top = dp.newTopology(\"heartMonitor\"); // The rest of the code pieces belong here }}```## Generating heart monitor sensor readingsThe next step is to simulate a stream of readings. In our `main()`, we use the `poll()` method to generate a flow of tuples (readings), where each tuple arrives every millisecond. Unlikely readings are filtered out.```java// Generate a stream of heart monitor readingsTStream<Map<String, Integer>> readings = top .poll(monitor, 1, TimeUnit.MILLISECONDS) .filter(tuple -> tuple.get(\"Systolic\") > 50 && tuple.get(\"Diastolic\") > 30) .filter(tuple -> tuple.get(\"Systolic\") < 200 && tuple.get(\"Diastolic\") < 150);```## Splitting a streamWe can separate the readings by category using `split`, which has the following signature:```javaList<TStream<T>> split(int n, ToIntFunction<T> splitter)````split` returns a `List` of `TStream` objects, where each item in the list is one of the resulting output streams. In this case, one stream in the list will contain a flow of tuples where the blood pressure reading belongs to one of the five blood pressure categories. Another stream will contain a flow of tuples where the blood pressure reading belongs to a different blood pressure category, and so on.There are two input parameters. You must specify `n`, the number of output streams, as well as a `splitter` method. `splitter` processes each incoming tuple individually and determines on which of the output streams the tuple will be placed. In this method, you can break down your placement rules into different branches, where each branch returns an integer indicating the index of the output stream in the list.Going back to our example, let's see how we can use `split` to achieve our goal. We pass in `6` as the first argument, as we want five output streams (i.e., a stream for each of the five different blood pressure categories) in addition to one stream for invalid values. Our `splitter` method should then define how tuples will be placed on each of the five streams. We define a rule for each category, such that if the systolic and diastolic pressures of a reading fall in a certain range, then that reading belongs to a specific category. For example, if we are processing a tuple with a blood pressure reading of 150/95 (*High Blood Pressure (Hypertension) Stage 1* category), then we return `2`, meaning that the tuple will be placed in the stream at index `2` in the `categories` list. We follow a similar process for the other four categories, ordering the streams from lowest to highest severity.```javaList<TStream<Map<String, Integer>>> categories = readings.split(6, tuple -> { int s = tuple.get(\"Systolic\"); int d = tuple.get(\"Diastolic\"); if (s < 120 && d < 80) { // Normal return 0; } else if ((s >= 120 && s <= 139) || (d >= 80 && d <= 89)) { // Prehypertension return 1; } else if ((s >= 140 && s <= 159) || (d >= 90 && d <= 99)) { // High Blood Pressure (Hypertension) Stage 1 return 2; } else if ((s >= 160 && s <= 179) || (d >= 100 && d <= 109)) { // High Blood Pressure (Hypertension) Stage 2 return 3; } else if (s >= 180 && d >= 110) { // Hypertensive Crisis return 4; } else { // Invalid return -1; }});```Note that instead of `split`, we could have performed five different `filter` operations. 
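For comparison, a filter-based version might look like the following sketch (only the first two categories shown; it assumes the same `readings` stream and category bounds used by the splitter above):```java
// Each filter() re-examines every tuple on readings,
// whereas split() routes each tuple to exactly one output stream
TStream<Map<String, Integer>> normalReadings = readings
    .filter(t -> t.get(\"Systolic\") < 120 && t.get(\"Diastolic\") < 80);
TStream<Map<String, Integer>> prehypertensionReadings = readings
    .filter(t -> (t.get(\"Systolic\") >= 120 && t.get(\"Systolic\") <= 139)
              || (t.get(\"Diastolic\") >= 80 && t.get(\"Diastolic\") <= 89));
```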
However, `split` is favored for cleaner code and more efficient processing as each tuple is only analyzed once.## Applying different processing against the streams to generate alertsAt this point, we have 6 output streams, one for each blood pressure category and one for invalid values (which we will ignore). We can easily retrieve a stream by using the standard `List` operation `get()`. For instance, we can retrieve all heart monitor readings with a blood pressure reading in the *Normal* category by retrieving the `TStream` at index `0` as we defined previously. Similarly, we can retrieve the other streams associated with the other four categories. The streams are tagged so that we can easily locate them in the topology graph.```java// Get each individual streamTStream<Map<String, Integer>> normal = categories.get(0).tag(\"normal\");TStream<Map<String, Integer>> prehypertension = categories.get(1).tag(\"prehypertension\");TStream<Map<String, Integer>> hypertension_stage1 = categories.get(2).tag(\"hypertension_stage1\");TStream<Map<String, Integer>> hypertension_stage2 = categories.get(3).tag(\"hypertension_stage2\");TStream<Map<String, Integer>> hypertensive = categories.get(4).tag(\"hypertensive\");```The hospital can then use these streams to perform analytics on each stream and generate alerts based on the blood pressure category. For this simple example, a different number of transformations/filters is applied to each stream (known as a processing pipeline) to illustrate that very different processing can be achieved that is specific to the category at hand.```java// Category: NormalTStream<String> normalAlerts = normal .filter(tuple -> tuple.get(\"Systolic\") > 80 && tuple.get(\"Diastolic\") > 50) .tag(\"normal\") .map(tuple -> { return \"All is normal. BP is \" + tuple.get(\"Systolic\") + \"/\" + tuple.get(\"Diastolic\") + \".\\n\"; }) .tag(\"normal\");// Category: PrehypertensionTStream<String> prehypertensionAlerts = prehypertension .map(tuple -> { return \"At high risk for developing hypertension. BP is \" + tuple.get(\"Systolic\") + \"/\" + tuple.get(\"Diastolic\") + \".\\n\"; }) .tag(\"prehypertension\");// Category: High Blood Pressure (Hypertension) Stage 1TStream<String> hypertension_stage1Alerts = hypertension_stage1 .map(tuple -> { return \"Monitor closely, patient has high blood pressure. \" + \"BP is \" + tuple.get(\"Systolic\") + \"/\" + tuple.get(\"Diastolic\") + \".\\n\"; }) .tag(\"hypertension_stage1\") .modify(tuple -> \"High Blood Pressure (Hypertension) Stage 1\\n\" + tuple) .tag(\"hypertension_stage1\");// Category: High Blood Pressure (Hypertension) Stage 2TStream<String> hypertension_stage2Alerts = hypertension_stage2 .filter(tuple -> tuple.get(\"Systolic\") >= 170 && tuple.get(\"Diastolic\") >= 105) .tag(\"hypertension_stage2\") .peek(tuple -> System.out.println(\"BP: \" + tuple.get(\"Systolic\") + \"/\" + tuple.get(\"Diastolic\"))) .map(tuple -> { return \"Warning! Monitor closely, patient is at risk of a hypertensive crisis!\\n\"; }) .tag(\"hypertension_stage2\") .modify(tuple -> \"High Blood Pressure (Hypertension) Stage 2\\n\" + tuple) .tag(\"hypertension_stage2\");// Category: Hypertensive CrisisTStream<String> hypertensiveAlerts = hypertensive .filter(tuple -> tuple.get(\"Systolic\") >= 180) .tag(\"hypertensive\") .peek(tuple -> System.out.println(\"BP: \" + tuple.get(\"Systolic\") + \"/\" + tuple.get(\"Diastolic\"))) .map(tuple -> { return \"Emergency! 
See to patient immediately!\\n\"; }) .tag(\"hypertensive\") .modify(tuple -> tuple.toUpperCase()) .tag(\"hypertensive\") .modify(tuple -> \"Hypertensive Crisis!!!\\n\" + tuple) .tag(\"hypertensive\");```## Combining the alert streamsAt this point, we have five streams of alerts. Suppose the doctors are interested in seeing a combination of the *Normal* alerts and *Prehypertension* alerts. Or, suppose that they would like to see all of the alerts from all categories together. Here, `union` comes in handy. For more details about `union`, refer to the [Javadoc]({{ site.docsurl }}/lastest/quarks/topology/TStream.html#union-quarks.topology.TStream-).There are two ways to define a union. You can either union a `TStream` with another `TStream`, or with a set of streams (`Set<TStream<T>>`). In both cases, a single `TStream` is returned containing the tuples that flow on the input stream(s).Let's look at the first case, unioning a stream with a single stream. We can create a stream containing *Normal* alerts and *Prehypertension* alerts by unioning `normalAlerts` with `prehypertensionAlerts`.```java// Additional processing for these streams could go here. In this case, union two streams// to obtain a single stream containing alerts from the normal and prehypertension alert streams.TStream<String> normalAndPrehypertensionAlerts = normalAlerts.union(prehypertensionAlerts);```We can also create a stream containing alerts from all categories by looking at the other case, unioning a stream with a set of streams. We'll first create a set of `TStream` objects containing the alerts from the other three categories.```java// Set of streams containing alerts from the other categoriesSet<TStream<String>> otherAlerts = new HashSet<>();otherAlerts.add(hypertension_stage1Alerts);otherAlerts.add(hypertension_stage2Alerts);otherAlerts.add(hypertensiveAlerts);```We can then create an `allAlerts` stream by calling `union` on `normalAndPrehypertensionAlerts` and `otherAlerts`. `allAlerts` will contain all of the tuples from:1. `normalAlerts`2. `prehypertensionAlerts`3. `hypertension_stage1Alerts`4. `hypertension_stage2Alerts`5. `hypertensiveAlerts````java// Union a stream with a set of streams to obtain a single stream containing alerts from// all alert streamsTStream<String> allAlerts = normalAndPrehypertensionAlerts.union(otherAlerts);```Finally, we can terminate the stream and print out all alerts.```java// Terminate the stream by printing out alerts from all categoriesallAlerts.sink(tuple -> System.out.println(tuple));```We end our application by submitting the `Topology`. Note that this application is available as a [sample](https://github.com/apache/incubator-quarks/blob/master/samples/topology/src/main/java/quarks/samples/topology/CombiningStreamsProcessingResults.java).## Observing the outputWhen the final application is run, the output looks something like the following:```
BP: 176/111
High Blood Pressure (Hypertension) Stage 2
Warning! Monitor closely, patient is at risk of a hypertensive crisis!
BP: 178/111
High Blood Pressure (Hypertension) Stage 2
Warning! Monitor closely, patient is at risk of a hypertensive crisis!
BP: 180/110
Hypertensive Crisis!!!
EMERGENCY! SEE TO PATIENT IMMEDIATELY!
```## A look at the topology graphLet's see what the topology graph looks like. We can view it using the console URL that was printed to standard output at the start of the application. Notice how the graph makes it easier to visualize the resulting flow of the application."
},
{
"title": "How can I run several analytics on a tuple concurrently?",
"tags": "",
"keywords": "",
"url": "../recipes/recipe_concurrent_analytics",
"summary": "",
"body": "If you have several independent lengthy analytics to perform on each tuple, you may determine that it would be advantageous to perform the analytics concurrently and then combine their results.The overall proessing time for a single tuple is then roughly that of the slowest analytic pipeline instead of the aggregate of each analytic pipeline.This usage model is in contrast to what's often referred to as _parallel_ tuple processing where several tuples are processed in parallel in replicated pipeline channels.e.g., for independent analytic pipelines A1, A2, and A3, you want to change the serial processing flow graph from:```sensorReadings -> A1 -> A2 -> A3 -> results```to a flow where the analytics run concurrently in a flow like:``` |-> A1 ->|sensorReadings -> |-> A2 ->| -> results |-> A3 ->|```The key to the above flow is to use a _barrier_ to synchronize the results from each of the pipelines so they can be combined into a single result tuple. Each of the concurrent channels also needs a thread to run its analytic pipeline.`PlumbingStreams.concurrent()` builds a concurrent flow graph for you. Alternatively, you can use `PlumbingStreams.barrier()` and `PlumbingStreams.isolate()` and build a concurrent flow graph yourself.More specifically `concurrent()` generates a flow like:``` |-> isolate(1) -> pipeline1 -> |stream -> |-> isolate(1) -> pipeline2 -> |-> barrier(10) -> combiner |-> isolate(1) -> pipeline3 -> |```It's easy to use `concurrent()`!## Define the collection of analytic pipelines to runFor the moment assume we have defined methods to create each pipeline: `a1pipeline()`, `a2pipeline()` and `a3pipeline()`. In this simple recipe each pipeline receives a `TStream` as input and generates a `TStream` as output.```javaList, TStream>> pipelines = new ArrayList();pipelines.add(a1pipeline());pipelines.add(a2pipeline());pipelines.add(a3pipeline());```## Define the result combinerEach pipeline creates one result tuple for each input tuple. The `barrier` collects one tuple from each pipeline and then creates a list of those tuples. The combiner is invoked with that list to generate the final aggregate result tuple.In this recipe the combiner is a simple lambda function that returns the input list:```javaFunction, List> combiner = list -> list;```## Build the concurrent flow```javaTStream> results = PlumbingStreams.concurrent(readings, pipelines, combiner);```## Define your analytic pipelinesFor each analytic pipeline, define a `Function, TStream>` that will create the pipeline. That is, define a function that takes a `TStream` as its input and yields a `TStream` as its result. Of course, `U` can be the same type as `T`.In this recipe we'll just define some very simple pipelines and use sleep to simulate some long processing times.Here's the A3 pipeline builder:```javastatic Function,TStream> a3pipeline() { // simple 3 stage pipeline simulating some amount of work by sleeping return stream -> stream.map(tuple -> { sleep(800, TimeUnit.MILLISECONDS); return \"This is the a3pipeline result for tuple \"+tuple; }).tag(\"a3.stage1\") .map(Functions.identity()).tag(\"a3.stage2\") .map(Functions.identity()).tag(\"a3.stage3\");}```## The final applicationWhen the application is run it prints out an aggregate result (a list of one tuple from each pipeline) every second. 
If the three pipelines were run serially, it would take on the order of 2.4 seconds to generate each aggregate result.```javapackage quarks.samples.topology;import java.util.ArrayList;import java.util.Date;import java.util.List;import java.util.concurrent.TimeUnit;import quarks.console.server.HttpServer;import quarks.function.Function;import quarks.function.Functions;import quarks.providers.development.DevelopmentProvider;import quarks.providers.direct.DirectProvider;import quarks.samples.utils.sensor.SimpleSimulatedSensor;import quarks.topology.TStream;import quarks.topology.Topology;import quarks.topology.plumbing.PlumbingStreams;/** * A recipe for concurrent analytics. */public class ConcurrentRecipe { /** * Concurrently run a collection of long running independent * analytic pipelines on each tuple. */ public static void main(String[] args) throws Exception { DirectProvider dp = new DevelopmentProvider(); System.out.println(\"development console url: \" + dp.getServices().getService(HttpServer.class).getConsoleUrl()); Topology top = dp.newTopology(\"ConcurrentRecipe\"); // Define the list of independent unique analytic pipelines to include List<Function<TStream<Double>,TStream<String>>> pipelines = new ArrayList<>(); pipelines.add(a1pipeline()); pipelines.add(a2pipeline()); pipelines.add(a3pipeline()); // Define the result combiner function. The combiner receives // a tuple containing a list of tuples, one from each pipeline, // and returns a result tuple of any type from them. // In this recipe we'll just return the list. Function<List<String>,List<String>> combiner = list -> list; // Generate a polled simulated sensor stream SimpleSimulatedSensor sensor = new SimpleSimulatedSensor(); TStream<Double> readings = top.poll(sensor, 1, TimeUnit.SECONDS) .tag(\"readings\"); // Build the concurrent analytic pipeline flow TStream<List<String>> results = PlumbingStreams.concurrent(readings, pipelines, combiner) .tag(\"results\"); // Print out the results. 
results.sink(list -> System.out.println(new Date().toString() + \" results tuple: \" + list)); System.out.println(\"Notice how an aggregate result is generated every second.\" + \"\\nEach aggregate result would take 2.4sec if performed serially.\"); dp.submit(top); } /** Function to create analytic pipeline a1 and add it to a stream */ private static Function<TStream<Double>,TStream<String>> a1pipeline() { // a simple 1 stage pipeline simulating some amount of work by sleeping return stream -> stream.map(tuple -> { sleep(800, TimeUnit.MILLISECONDS); return \"This is the a1pipeline result for tuple \"+tuple; }).tag(\"a1.stage1\"); } /** Function to create analytic pipeline a2 and add it to a stream */ private static Function<TStream<Double>,TStream<String>> a2pipeline() { // a simple 2 stage pipeline simulating some amount of work by sleeping return stream -> stream.map(tuple -> { sleep(800, TimeUnit.MILLISECONDS); return \"This is the a2pipeline result for tuple \"+tuple; }).tag(\"a2.stage1\") .map(Functions.identity()).tag(\"a2.stage2\"); } /** Function to create analytic pipeline a3 and add it to a stream */ private static Function<TStream<Double>,TStream<String>> a3pipeline() { // a simple 3 stage pipeline simulating some amount of work by sleeping return stream -> stream.map(tuple -> { sleep(800, TimeUnit.MILLISECONDS); return \"This is the a3pipeline result for tuple \"+tuple; }).tag(\"a3.stage1\") .map(Functions.identity()).tag(\"a3.stage2\") .map(Functions.identity()).tag(\"a3.stage3\"); } private static void sleep(long period, TimeUnit unit) throws RuntimeException { try { Thread.sleep(unit.toMillis(period)); } catch (InterruptedException e) { throw new RuntimeException(\"Interrupted\", e); } }}```"
},
{
"title": "Applying different processing against a single stream",
"tags": "",
"keywords": "",
"url": "../recipes/recipe_different_processing_against_stream",
"summary": "",
"body": "In the previous [recipe](recipe_value_out_of_range), we learned how to filter a stream to obtain the interesting sensor readings and ignore the mundane data. Typically, a user scenario is more involved, where data is processed using different stream operations. Consider the following scenario, for example.Suppose a package delivery company would like to monitor the gas mileage of their delivery trucks using embedded sensors. They would like to apply different analytics to the sensor data that can be used to make more informed business decisions. For instance, if a truck is reporting consistently poor gas mileage readings, the company might want to consider replacing that truck to save on gas costs. Perhaps the company also wants to convert the sensor readings to JSON format in order to easily display the data on a web page. It may also be interested in determining the expected gallons of gas used based on the current gas mileage.In this instance, we can take the stream of gas mileage sensor readings and apply multiple types of processing against it so that we end up with streams that serve different purposes.## Setting up the applicationWe assume that the environment has been set up following the steps outlined in the [Getting started guide](../docs/quarks-getting-started). Let's begin by creating a `DirectProvider` and `Topology`. We choose a `DevelopmentProvider` so that we can view the topology graph using the console URL (refer to the [Application console](../docs/console) page for a more detailed explanation of this provider). The gas mileage bounds, initial gas mileage value, and the number of miles in a typical delivery route have also been defined.```javaimport java.text.DecimalFormat;import java.util.concurrent.TimeUnit;import com.google.gson.JsonObject;import quarks.analytics.sensors.Ranges;import quarks.console.server.HttpServer;import quarks.providers.development.DevelopmentProvider;import quarks.providers.direct.DirectProvider;import quarks.samples.utils.sensor.SimpleSimulatedSensor;import quarks.topology.TStream;import quarks.topology.Topology;public class ApplyDifferentProcessingAgainstStream { /** * Gas mileage (in miles per gallon, or mpg) value bounds */ static double MPG_LOW = 7.0; static double MPG_HIGH = 14.0; /** * Initial gas mileage sensor value */ static double INITIAL_MPG = 10.5; /** * Hypothetical value for the number of miles in a typical delivery route */ static double ROUTE_MILES = 80; public static void main(String[] args) throws Exception { DirectProvider dp = new DevelopmentProvider(); System.out.println(dp.getServices().getService(HttpServer.class).getConsoleUrl()); Topology top = dp.newTopology(\"GasMileageSensor\"); // The rest of the code pieces belong here }}```## Generating gas mileage sensor readingsThe next step is to simulate a stream of gas mileage readings using [`SimpleSimulatedSensor`](https://github.com/apache/incubator-quarks/blob/master/samples/utils/src/main/java/quarks/samples/utils/sensor/SimpleSimulatedSensor.java). We set the initial gas mileage and delta factor in the first two arguments. The last argument ensures that the sensor reading falls in an acceptable range (between 7.0 mpg and 14.0 mpg). 
In our `main()`, we use the `poll()` method to generate a flow of tuples (readings), where each tuple arrives every second.```java// Generate a stream of gas mileage sensor readingsSimpleSimulatedSensor mpgSensor = new SimpleSimulatedSensor(INITIAL_MPG, 0.4, Ranges.closed(MPG_LOW, MPG_HIGH));TStream<Double> mpgReadings = top.poll(mpgSensor, 1, TimeUnit.SECONDS);```## Applying different processing to the streamThe company can now perform analytics on the `mpgReadings` stream and feed it to different functions.First, we can filter out gas mileage values that are considered poor and tag the resulting stream for easier viewing in the console.```java// Filter out the poor gas mileage readingsTStream<Double> poorMpg = mpgReadings .filter(mpg -> mpg <= 9.0).tag(\"filtered\");```Second, we can map the gas mileage readings to JSON format so that the data can easily be displayed on a web page.```java// Map the gas mileage readings to JSONTStream<JsonObject> json = mpgReadings .map(mpg -> { JsonObject jObj = new JsonObject(); jObj.addProperty(\"gasMileage\", mpg); return jObj; }).tag(\"mapped\");```In addition, we can calculate the estimated gallons of gas used based on the current gas mileage using `modify`.```java// Modify gas mileage stream to obtain a stream containing the estimated gallons of gas usedDecimalFormat df = new DecimalFormat(\"#.#\");TStream<Double> gallonsUsed = mpgReadings .modify(mpg -> Double.valueOf(df.format(ROUTE_MILES / mpg))).tag(\"modified\");```The three examples demonstrated here are a small subset of the many other possibilities of stream processing.With each of these resulting streams, the company can perform further analytics, but at this point, we terminate the streams by printing out the tuples on each stream.```java// Terminate the streamspoorMpg.sink(mpg -> System.out.println(\"Poor gas mileage! \" + mpg + \" mpg\"));json.sink(mpg -> System.out.println(\"JSON: \" + mpg));gallonsUsed.sink(gas -> System.out.println(\"Gallons of gas: \" + gas + \"\\n\"));```We end our application by submitting the `Topology`.## Observing the outputWhen the final application is run, the output looks something like the following:```
JSON: {\"gasMileage\":9.5}
Gallons of gas: 8.4
JSON: {\"gasMileage\":9.2}
Gallons of gas: 8.7
Poor gas mileage! 9.0 mpg
JSON: {\"gasMileage\":9.0}
Gallons of gas: 8.9
Poor gas mileage! 8.8 mpg
JSON: {\"gasMileage\":8.8}
Gallons of gas: 9.1
```## A look at the topology graphLet's see what the topology graph looks like. We can view it using the console URL that was printed to standard output at the start of the application. We see that the original stream is fanned out to three separate streams, and the `filter`, `map`, and `modify` operations are applied.## The final application```javaimport java.text.DecimalFormat;import java.util.concurrent.TimeUnit;import com.google.gson.JsonObject;import quarks.analytics.sensors.Ranges;import quarks.console.server.HttpServer;import quarks.providers.development.DevelopmentProvider;import quarks.providers.direct.DirectProvider;import quarks.samples.utils.sensor.SimpleSimulatedSensor;import quarks.topology.TStream;import quarks.topology.Topology; /** * Fan out stream and perform different analytics on the resulting streams. */public class ApplyDifferentProcessingAgainstStream { /** * Gas mileage (in miles per gallon, or mpg) value bounds */ static double MPG_LOW = 7.0; static double MPG_HIGH = 14.0; /** * Initial gas mileage sensor value */ static double INITIAL_MPG = 10.5; /** * Hypothetical value for the number of miles in a typical delivery route */ static double ROUTE_MILES = 80; /** * Polls a simulated delivery truck sensor to periodically obtain * gas mileage readings (in miles/gallon). 
Feed the stream of sensor * readings to different functions (filter, map, and modify). */ public static void main(String[] args) throws Exception { DirectProvider dp = new DevelopmentProvider(); System.out.println(dp.getServices().getService(HttpServer.class).getConsoleUrl()); Topology top = dp.newTopology(\"GasMileageSensor\"); // Generate a stream of gas mileage sensor readings SimpleSimulatedSensor mpgSensor = new SimpleSimulatedSensor(INITIAL_MPG, 0.4, Ranges.closed(MPG_LOW, MPG_HIGH)); TStream<Double> mpgReadings = top.poll(mpgSensor, 1, TimeUnit.SECONDS); // Filter out the poor gas mileage readings TStream<Double> poorMpg = mpgReadings .filter(mpg -> mpg <= 9.0).tag(\"filtered\"); // Map the gas mileage readings to JSON TStream<JsonObject> json = mpgReadings .map(mpg -> { JsonObject jObj = new JsonObject(); jObj.addProperty(\"gasMileage\", mpg); return jObj; }).tag(\"mapped\"); // Modify gas mileage stream to obtain a stream containing the estimated gallons of gas used DecimalFormat df = new DecimalFormat(\"#.#\"); TStream<Double> gallonsUsed = mpgReadings .modify(mpg -> Double.valueOf(df.format(ROUTE_MILES / mpg))).tag(\"modified\"); // Terminate the streams poorMpg.sink(mpg -> System.out.println(\"Poor gas mileage! \" + mpg + \" mpg\")); json.sink(mpg -> System.out.println(\"JSON: \" + mpg)); gallonsUsed.sink(gas -> System.out.println(\"Gallons of gas: \" + gas + \"\\n\")); dp.submit(top); }}```"
},
{
"title": "Dynamically Enabling Analytic Flows",
"tags": "",
"keywords": "",
"url": "../recipes/recipe_dynamic_analytic_control",
"summary": "",
"body": "This recipe addresses the question: How can I dynamically enable or disable entire portions of my application's analytics?Imagine a topology that has a variety of analytics that it can perform. Each analytic flow comes with certain costs in terms of demands on the CPU or memory and implications for power consumption. Hence an application may wish to dynamically control whether or not an analytic flow is currently enabled.## ValveA `quarks.topology.plumbing.Valve` is a simple construct that can be inserted in stream flows to dynamically enable or disable downstream processing. A valve is either open or closed. When used as a `Predicate` to `TStream.filter()`, `filter` passes tuples only when the valve is open. Hence downstream processing is enabled when the valve is open and effectively disabled when the valve is closed.For example, consider a a topology consisting of 3 analytic processing flows that want to be dynamically enabled or disabled:```javaValve flow1Valve = new Valve(); // default is openValve flow2Valve = new Valve(false); // closedValve flow3Valve = new Valve(false);TStream readings = topology.poll(mySensor, 1, TimeUnit.SECONDS);addAnalyticFlow1(readings.filter(flow1Valve));addAnalyticFlow2(readings.filter(flow2Valve));addAnalyticFlow3(readings.filter(flow3Valve));```Elsewhere in the application, perhaps as a result of processing some device command from an external service such as when using an `IotProvider` or `IotDevice`, valves may be opened and closed dynamically to achieve the desired effects. For example:```javaTStream cmds = simulatedValveCommands(topology);cmds.sink(json -> { String valveId = json.getPrimitive(\"valve\").getAsString(); boolean isOpen = json.getPrimitive(\"isOpen\").getAsBoolean(); switch(valveId) { case \"flow1\": flow1Valve.setOpen(isOpen); break; case \"flow2\": flow2Valve.setOpen(isOpen); break; case \"flow3\": flow3Valve.setOpen(isOpen); break; }});```## Loosely coupled Quarks applicationsAnother approach for achieving dynamic control over what analytics flows are running is to utilize loosely coupled applications.In this approach, the overall application is partitioned into multiple applications (topologies). In the above example there could be four applications: one that publishes the sensor `readings` stream, and one for each of the analytic flows.The separate applications can connect to each other's streams using the `quarks.connectors.pubsub.PublishSubscribe` connector.Rather than having all of the analytic applications running all of the time, applications can be registered with a `quarks.topology.services.ApplicationService`. Registered applications can then be started and stopped dynamically.The `quarks.providers.iot.IotProvider` is designed to facilitate this style of use."
},
{
"title": "Using an external configuration file for filter ranges",
"tags": "",
"keywords": "",
"url": "../recipes/recipe_external_filter_range",
"summary": "",
"body": "The [Detecting a sensor value out of range](recipe_value_out_of_range.html) recipe introduced the basics of filtering as well as the use of a [Range]({{ site.docsurl }}/lastest/quarks/analytics/sensors/Range.html).Oftentimes, a user wants to initialize a range specification from an external configuration file so the application code is more easily configured and reusable.The string form of a `Range` is natural, consise, and easy to use. As such it's a convenient form to use in configuration files or for users to enter. The range string can easily be converted back into a `Range`.We're going to assume familiarity with that earlier recipe and those concepts and focus on just the \"external range specification\" aspect of this recipe.## Create a configuration fileThe file's syntax is that for a `java.util.Properties` object. See the `Range` [documentation](https://github.com/apache/incubator-quarks/blob/master/analytics/sensors/src/main/java/quarks/analytics/sensors/Range.java) for its string syntax.Put this into a file:```# the Range string for the temperature sensor optimal rangeoptimalTempRange=[77.0..91.0]```Supply the pathname to this file as an argument to the application when you run it.## Loading the configuration fileA `java.util.Properties` object is often used for configuration parameters and it is easy to load the properties from a file.```java// Load the configuration file with the path string in configFilePathProperties props = new Properties();props.load(Files.newBufferedReader(new File(configFilePath).toPath()));```## Initializing the `Range````java// initialize the range from a Range string in the properties.// Use a default value if a range isn't present.static String DEFAULT_TEMP_RANGE_STR = \"[60.0..100.0]\";static Range optimalTempRange = Ranges.valueOfDouble( props.getProperty(\"optimalTempRange\", defaultRange));```## The final application```javaimport java.io.File;import java.nio.file.Files;import java.util.Properties;import java.util.concurrent.TimeUnit;import quarks.analytics.sensors.Range;import quarks.analytics.sensors.Ranges;import quarks.providers.direct.DirectProvider;import quarks.samples.utils.sensor.SimulatedTemperatureSensor;import quarks.topology.TStream;import quarks.topology.Topology;/** * Detect a sensor value out of expected range. * Get the range specification from a configuration file. */public class ExternalFilterRange { /** * Optimal temperatures (in Fahrenheit) */ static String DEFAULT_TEMP_RANGE_STR = \"[60.0..100.0]\"; static Range optimalTempRange; /** Initialize the application's configuration */ static void initializeConfiguration(String configFilePath) throws Exception { // Load the configuration file Properties props = new Properties(); props.load(Files.newBufferedReader(new File(configFilePath).toPath())); // initialize the range from a Range string in the properties. // Use a default value if a range isn't present in the properties. optimalTempRange = Ranges.valueOfDouble( props.getProperty(\"optimalTempRange\", DEFAULT_TEMP_RANGE_STR)); System.out.println(\"Using optimal temperature range: \" + optimalTempRange); } /** * Polls a simulated temperature sensor to periodically obtain * temperature readings (in Fahrenheit). Use a simple filter * to determine when the temperature is out of the optimal range. 
*/ public static void main(String[] args) throws Exception { if (args.length != 1) throw new Exception(\"missing pathname to configuration file\"); String configFilePath = args[0]; DirectProvider dp = new DirectProvider(); Topology top = dp.newTopology(\"TemperatureSensor\"); // Initialize the configuration initializeConfiguration(configFilePath); // Generate a stream of temperature sensor readings SimulatedTemperatureSensor tempSensor = new SimulatedTemperatureSensor(); TStream<Double> temp = top.poll(tempSensor, 1, TimeUnit.SECONDS); // Simple filter: Perform analytics on sensor readings to detect when // the temperature is out of the optimal range and generate warnings TStream<Double> simpleFiltered = temp.filter(tuple -> !optimalTempRange.contains(tuple)); simpleFiltered.sink(tuple -> System.out.println(\"Temperature is out of range! \" + \"It is \" + tuple + \"\\u00b0F!\")); // See what the temperatures look like temp.print(); dp.submit(top); }}```"
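To see the string round trip in isolation, here is a short sketch (not part of the recipe; the `RangeStringSketch` class name is invented) that parses the same range string used in the configuration file and queries it:

```java
import quarks.analytics.sensors.Range;
import quarks.analytics.sensors.Ranges;

public class RangeStringSketch {
    public static void main(String[] args) {
        // Parse a Range from its string form, exactly as read
        // from a Properties configuration file.
        Range<Double> range = Ranges.valueOfDouble("[77.0..91.0]");

        // The printed string form matches the form used in the
        // configuration file, so it can be parsed back again.
        System.out.println(range);
        System.out.println(range.contains(80.0));   // true: inside the range
        System.out.println(range.contains(95.0));   // false: outside the range
    }
}
```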
},
{
"title": "Hello Quarks!",
"tags": "",
"keywords": "",
"url": "../recipes/recipe_hello_quarks",
"summary": "",
"body": "Quarks' pure Java implementation is a powerful feature which allows it to be run on the majority of JVM-compatible systems. It also has the added benefit of enabling the developer to develop applications entirely within the Eclipse and IntelliJ ecosystems. For the purposes of this recipe, it will be assumed that the developer is using Eclipse. To begin the Hello Quarks recipe, create a new project and import the necessary libraries as outlined in the [Getting started guide](../docs/quarks-getting-started). Next, write the following template application:``` javapublic static void main(String[] args) { DirectProvider dp = new DirectProvider(); Topology top = dp.newTopology();}```The *`DirectProvider`* is an object which allows the user to submit and run the final application. It also creates the *`Topology`* object, which gives the developer the ability to define a stream of strings.## Using `Topology.strings()`The primary abstraction in Quarks is the `TStream`. A *`TStream`* represents the flow of data in a Quarks application; for example, the periodic floating point readings from a temperature sensor. The data items which are sent through a `TStream` are Java objects — in the \"Hello Quarks!\" example, we are sending two strings. There are a number of ways to create a `TStream`, and `Topology.strings()` is the simplest. The user specifies a number of strings which will be used as the stream's data items.``` javapublic static void main(String[] args) { DirectProvider dp = new DirectProvider(); Topology top = dp.newTopology(); TStream helloStream = top.strings(\"Hello\", \"Quarks!\");}```The `helloStream` stream is created, and the \"Hello\" and \"Quarks!\" strings will be sent as its two data items.## Printing to output`TStream.print()` can be used to print the data items of a stream to standard output by invoking the `toString()` method of each data item. In this case the data items are already strings, but in principle `TStream.print()` can be called on any stream, regardless of the datatype carried by the stream.``` javapublic static void main(String[] args) { DirectProvider dp = new DirectProvider(); Topology top = dp.newTopology(); TStream helloStream = top.strings(\"Hello\", \"Quarks!\"); helloStream.print();}```## Submitting the applicationThe only remaining step is to submit the application, which is performed by the `DirectProvider`. Submitting a Quarks application initializes the threads which execute the `Topology`, and begins processing its data sources.``` javapublic static void main(String[] args) { DirectProvider dp = new DirectProvider(); Topology top = dp.newTopology(); TStream helloStream = top.strings(\"Hello\", \"Quarks!\"); helloStream.print(); dp.submit(top);}```After running the application, the output is \"Hello Quarks!\":```HelloQuarks!```"
},
{
"title": "How can I run analytics on several tuples in parallel?",
"tags": "",
"keywords": "",
"url": "../recipes/recipe_parallel_analytics",
"summary": "",
"body": "If the duration of your per-tuple analytic processing makes your application unable to keep up with the tuple ingest rate or result generation rate, you can often run analytics on several tuples in parallel to improve performance.The overall proessing time for a single tuple is still the same but the processing for each tuple is overlapped. In the extreme your application may be able to process N tuples in the same time that it would have processed one.This usage model is in contrast to what's been called _concurrent analytics_, where multiple different independent analytics for a single tuple are performed concurrently, as when using `PlumbingStreams.concurrent()`.e.g., imagine your analytic pipeline has three stages to it: A1, A2, A3, and that A2 dominates the processing time. You want to change the serial processing flow graph from:```sensorReadings -> A1 -> A2 -> A3 -> results```to a flow where the A2 analytics run on several tuples in parallel in a flow like:``` |-> A2-channel0 ->|sensorReadings -> A1 -> |-> A2-channel1 ->| -> A3 -> results |-> A2-channel2 ->| |-> A2-channel3 ->| |-> A2-channel4 ->| ...```The key to the above flow is to use a _splitter_ to distribute the tuples among the parallel channels. Each of the parallel channels also needs a thread to run its analytic pipeline.`PlumbingStreams.parallel()` builds a parallel flow graph for you. Alternatively, you can use `TStream.split()`, `PlumbingStreams.isolate()`, and `TStream.union()` and build a parallel flow graph yourself.More specifically `parallel()` generates a flow like:``` |-> isolate(10) -> pipeline-ch0 -> |stream -> split(width,splitter) -> |-> isolate(10) -> pipeline-ch1 -> |-> union -> isolate(width) |-> isolate(10) -> pipeline-ch2 -> | ...```It's easy to use `parallel()`!## Define the splitterThe splitter function partitions the tuples among the parallel channels. `PlumbingStreams.roundRobinSplitter()` is a commonly used splitter that simply cycles among each channel in succession. The round robin strategy works great when the processing time of tuples is uniform. Other splitter functions may use information in the tuple to decide how to partition them.This recipe just uses the round robin splitter for a `TStream`.```javaint width = 5; // number of parallel channelsToIntFunction splitter = PlumbingStreams.roundRobinSplitter(width);```Another possibility is to use a \"load balanced splitter\" configuration. That is covered below.## Define the pipeline to run in parallelDefine a `BiFunction, Integer, TStream>` that builds the pipeline. That is, define a function that receives a `TStream` and an integer `channel` and creates a pipeline for that channel that returns a `TStream`.Many pipelines don't care what channel they're being constructed for. 
While the pipeline function typically yields the same pipeline processing for each channel there is no requirement for it to do so.In this simple recipe the pipeline receives a `TStream<Double>` as input and generates a `TStream<String>` as output.```javastatic BiFunction<TStream<Double>, Integer, TStream<String>> pipeline() { // a simple 4 stage pipeline simulating some amount of work by sleeping return (stream, channel) -> { String tagPrefix = \"pipeline-ch\"+channel; return stream.map(tuple -> { sleep(1000, TimeUnit.MILLISECONDS); return \"This is the \"+tagPrefix+\" result for tuple \"+tuple; }).tag(tagPrefix+\".stage1\") .map(Functions.identity()).tag(tagPrefix+\".stage2\") .map(Functions.identity()).tag(tagPrefix+\".stage3\") .map(Functions.identity()).tag(tagPrefix+\".stage4\"); };}```## Build the parallel flowGiven a width, splitter and pipeline function it just takes a single call:```javaTStream<String> results = PlumbingStreams.parallel(readings, width, splitter, pipeline());```## Load balanced parallel flowA load balanced parallel flow allocates an incoming tuple to the first available parallel channel. When tuple processing times are variable, using a load balanced parallel flow can result in greater overall throughput.To create a load balanced parallel flow simply use the `parallelBalanced()` method instead of `parallel()`. Everything is the same except you don't supply a splitter: ```javaTStream<String> results = PlumbingStreams.parallelBalanced(readings, width, pipeline());```## The final applicationWhen the application is run it prints out 5 (width) tuples every second. Without the parallel channels, it would only print one tuple each second.```javapackage quarks.samples.topology;import java.util.Date;import java.util.concurrent.TimeUnit;import quarks.console.server.HttpServer;import quarks.function.BiFunction;import quarks.function.Functions;import quarks.function.ToIntFunction;import quarks.providers.development.DevelopmentProvider;import quarks.providers.direct.DirectProvider;import quarks.samples.utils.sensor.SimpleSimulatedSensor;import quarks.topology.TStream;import quarks.topology.Topology;import quarks.topology.plumbing.PlumbingStreams;/** * A recipe for parallel analytics. */public class ParallelRecipe { /** * Process several tuples in parallel in a replicated pipeline. */ public static void main(String[] args) throws Exception { DirectProvider dp = new DevelopmentProvider(); System.out.println(\"development console url: \" + dp.getServices().getService(HttpServer.class).getConsoleUrl()); Topology top = dp.newTopology(\"ParallelRecipe\"); // The number of parallel processing channels to generate int width = 5; // Define the splitter ToIntFunction<Double> splitter = PlumbingStreams.roundRobinSplitter(width); // Generate a polled simulated sensor stream SimpleSimulatedSensor sensor = new SimpleSimulatedSensor(); TStream<Double> readings = top.poll(sensor, 10, TimeUnit.MILLISECONDS) .tag(\"readings\"); // Build the parallel analytic pipelines flow TStream<String> results = PlumbingStreams.parallel(readings, width, splitter, pipeline()) .tag(\"results\"); // Print out the results. 
results.sink(tuple -> System.out.println(new Date().toString() + \" \" + tuple)); System.out.println(\"Notice that \"+width+\" results are generated every second - one from each parallel channel.\" + \"\\nOnly one result would be generated each second if performed serially.\"); dp.submit(top); } /** Function to create analytic pipeline and add it to a stream */ private static BiFunction<TStream<Double>,Integer,TStream<String>> pipeline() { // a simple 3 stage pipeline simulating some amount of work by sleeping return (stream, channel) -> { String tagPrefix = \"pipeline-ch\"+channel; return stream.map(tuple -> { sleep(1000, TimeUnit.MILLISECONDS); return \"This is the \"+tagPrefix+\" result for tuple \"+tuple; }).tag(tagPrefix+\".stage1\") .map(Functions.identity()).tag(tagPrefix+\".stage2\") .map(Functions.identity()).tag(tagPrefix+\".stage3\"); }; } private static void sleep(long period, TimeUnit unit) throws RuntimeException { try { Thread.sleep(unit.toMillis(period)); } catch (InterruptedException e) { throw new RuntimeException(\"Interrupted\", e); } }}```"
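As noted earlier, the round robin splitter is not the only option: a splitter can inspect the tuple to pick a channel. Below is a minimal sketch of a key-based splitter for `String` tuples whose first comma-separated field is a device ID. The `keyedSplitter` helper and the tuple format are illustrative assumptions; only `quarks.function.ToIntFunction` is from the Quarks API.

```java
import quarks.function.ToIntFunction;

public class SplitterSketch {
    /**
     * A content-based splitter: tuples with the same key always go
     * to the same channel, so per-key ordering is preserved across
     * the parallel channels. The key extraction is hypothetical;
     * adapt it to your tuple type.
     */
    static ToIntFunction<String> keyedSplitter(int width) {
        return tuple -> {
            String key = tuple.split(",")[0];            // e.g. a device ID field
            return Math.floorMod(key.hashCode(), width); // map the key to a channel
        };
    }

    public static void main(String[] args) {
        ToIntFunction<String> splitter = keyedSplitter(3);
        System.out.println(splitter.applyAsInt("sensorA,42.0")); // stable channel for sensorA
        System.out.println(splitter.applyAsInt("sensorB,17.5"));
    }
}
```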
},
{
"title": "Writing a source function",
"tags": "",
"keywords": "",
"url": "../recipes/recipe_source_function",
"summary": "",
"body": "In the previous [Hello Quarks!](recipe_hello_quarks) example, we create a data source which generates two Java `String`s and prints them to output. Yet Quarks sources support the ability generate any data type as a source, not just Java types such as `String`s and `Double`s. Moreover, because the user supplies the code which generates the data, the user has complete flexibility for *how* the data is generated. This recipe demonstrates how a user could write such a custom data source.## Custom source: reading the lines of a web page{{site.data.alerts.note}} Quarks' API provides convenience methods for performing HTTP requests. For the sake of example we are writing a HTTP data source manually, but in principle there are easier methods. {{site.data.alerts.end}}One example of a custom data source could be retrieving the contents of a web page and printing each line to output. For example, the user could be querying the Yahoo Finance website for the most recent stock price data of Bank of America, Cabot Oil & Gas, and Freeport-McMoRan Inc.:``` javapublic static void main(String[] args) throws Exception { DirectProvider dp = new DirectProvider(); Topology top = dp.newTopology(); final URL url = new URL(\"http://finance.yahoo.com/d/quotes.csv?s=BAC+COG+FCX&f=snabl\");}```Given the correctly formatted URL to request the data, we can use the *`Topology.source()`* method to generate each line of the page as a data item on the stream. `Topology.source()` takes a Java `Supplier` that returns an `Iterable`. The supplier is invoked once, and the items returned from the Iterable are used as the stream's data items. For example, the following `queryWebsite` method returns a supplier which queries a URL and returns an `Iterable` of its contents:``` javaprivate static Supplier > queryWebsite(URL url) throws Exception{ return () -> { List lines = new LinkedList(); try { InputStream is = url.openStream(); BufferedReader br = new BufferedReader( new InputStreamReader(is)); for(String s = br.readLine(); s != null; s = br.readLine()) lines.add(s); } catch (Exception e) { e.printStackTrace(); } return lines; };}```When invoking `Topology.source()`, we can use `queryWebsite` to return the required supplier, passing in the URL.```javapublic static void main(String[] args) throws Exception { DirectProvider dp = new DirectProvider(); Topology top = dp.newTopology(); final URL url = new URL(\"http://finance.yahoo.com/d/quotes.csv?s=BAC+COG+FCX&f=snabl\"); TStream linesOfWebsite = top.source(queryWebsite(url));}```Source methods such as `Topology.source()` and `Topology.strings()` return a `TStream`. If we print the `linesOfWebsite` stream to standard output and run the application, we can see that it correctly generates the data and feeds it into the Quarks runtime:**Output**:```java\"BAC\",\"Bank of America Corporation Com\",13.150,13.140,\"12:00pm - 13.145\"\"COG\",\"Cabot Oil & Gas Corporation Com\",21.6800,21.6700,\"12:00pm - 21.6775\"\"FCX\",\"Freeport-McMoRan, Inc. Common S\",8.8200,8.8100,\"12:00pm - 8.8035\"```## Polling source: reading data periodicallyA much more common scenario for a developer is the periodic generation of data from a source operator — a data source may need to be polled every 5 seconds, 3 hours, or any time frame. To this end, `Topology` exposes the `poll()` method which can be used to call a function at the frequency of the user's choosing. 
For example, a user might want to query Yahoo Finance every two seconds to retrieve the most up-to-date ticker price for a stock:```javapublic static void main(String[] args) throws Exception { DirectProvider dp = new DirectProvider(); Topology top = dp.newTopology(); final URL url = new URL(\"http://finance.yahoo.com/d/quotes.csv?s=BAC+COG+FCX&f=snabl\"); TStream<Iterable<String>> source = top.poll(queryWebsite(url), 2, TimeUnit.SECONDS); source.print(); dp.submit(top);}```It's important to note that calls to `DirectProvider.submit()` are non-blocking; the main thread will exit, and the threads executing the topology will continue to run. (Also, to see changing stock prices, the above example needs to be run during open trading hours. Otherwise, it will simply return the same results every time the website is polled.)"
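If your application does need to coordinate with the running topology after `submit()` returns, the returned `Future` gives access to the running job. The following is a sketch under the assumption that `DirectProvider.submit()` returns a `Future<Job>` and that `quarks.execution.Job` supports a `CLOSE` state change; the 10-second run time is arbitrary.

```java
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

import quarks.execution.Job;
import quarks.providers.direct.DirectProvider;
import quarks.topology.Topology;

public class SubmitSketch {
    public static void main(String[] args) throws Exception {
        DirectProvider dp = new DirectProvider();
        Topology top = dp.newTopology();
        top.poll(() -> System.currentTimeMillis(), 2, TimeUnit.SECONDS).print();

        // submit() returns immediately with a Future for the running Job.
        Future<Job> future = dp.submit(top);
        Job job = future.get();

        // Let the topology run for a while, then shut it down.
        Thread.sleep(TimeUnit.SECONDS.toMillis(10));
        job.stateChange(Job.Action.CLOSE);
    }
}
```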
},
{
"title": "Detecting a sensor value out of expected range",
"tags": "",
"keywords": "",
"url": "../recipes/recipe_value_out_of_range",
"summary": "",
"body": "Oftentimes, a user expects a sensor value to fall within a particular range. If a reading is outside the accepted limits, the user may want to determine what caused the anomaly and/or take action to reduce the impact. For instance, consider the following scenario.Suppose a corn grower in the Midwestern United States would like to monitor the average temperature in his corn field using a sensor to improve his crop yield. The optimal temperatures for corn growth during daylight hours range between 77°F and 91°F. When the grower is alerted of a temperature value that is not in the optimal range, he may want to assess what can be done to mitigate the effect.In this instance, we can use a filter to detect out-of-range temperature values.## Setting up the applicationWe assume that the environment has been set up following the steps outlined in the [Getting started guide](../docs/quarks-getting-started). Let's begin by creating a `DirectProvider` and `Topology`. We also define the optimal temperature range.```javaimport static quarks.function.Functions.identity;import java.util.concurrent.TimeUnit;import quarks.analytics.sensors.Filters;import quarks.analytics.sensors.Range;import quarks.analytics.sensors.Ranges;import quarks.providers.direct.DirectProvider;import quarks.samples.utils.sensor.SimulatedTemperatureSensor;import quarks.topology.TStream;import quarks.topology.Topology;public class DetectValueOutOfRange { /** * Optimal temperature range (in Fahrenheit) */ static double OPTIMAL_TEMP_LOW = 77.0; static double OPTIMAL_TEMP_HIGH = 91.0; static Range optimalTempRange = Ranges.closed(OPTIMAL_TEMP_LOW, OPTIMAL_TEMP_HIGH); public static void main(String[] args) throws Exception { DirectProvider dp = new DirectProvider(); Topology top = dp.newTopology(\"TemperatureSensor\"); // The rest of the code pieces belong here }}```## Generating temperature sensor readingsThe next step is to simulate a stream of temperature readings using [`SimulatedTemperatureSensor`](https://github.com/apache/incubator-quarks/blob/master/samples/utils/src/main/java/quarks/samples/utils/sensor/SimulatedTemperatureSensor.java). By default, the sensor sets the initial temperature to 80°F and ensures that new readings are between 28°F and 112°F. In our `main()`, we use the `poll()` method to generate a flow of tuples, where a new tuple (temperature reading) arrives every second.```java// Generate a stream of temperature sensor readingsSimulatedTemperatureSensor tempSensor = new SimulatedTemperatureSensor();TStream temp = top.poll(tempSensor, 1, TimeUnit.SECONDS);```## Simple filteringIf the corn grower is interested in determining when the temperature is strictly out of the optimal range of 77°F and 91°F, a simple filter can be used. The `filter` method can be applied to `TStream` objects, where a filter predicate determines which tuples to keep for further processing. For its method declaration, refer to the [Javadoc]({{ site.docsurl }}/lastest/quarks/topology/TStream.html#filter-quarks.function.Predicate-).In this case, we want to keep temperatures below the lower range value *or* above the upper range value. This is expressed in the filter predicate, which follows Java's syntax for [lambda expressions](https://docs.oracle.com/javase/tutorial/java/javaOO/lambdaexpressions.html#syntax). Then, we terminate the stream (using `sink`) by printing out the warning to standard out. 
Note that `\\u00b0` is the Unicode encoding for the degree (°) symbol.```javaTStream<Double> simpleFiltered = temp.filter(tuple -> tuple < OPTIMAL_TEMP_LOW || tuple > OPTIMAL_TEMP_HIGH);simpleFiltered.sink(tuple -> System.out.println(\"Temperature is out of range! \" + \"It is \" + tuple + \"\\u00b0F!\"));```## Deadband filterAlternatively, a deadband filter can be used to glean more information about temperature changes, such as extracting the in-range temperature immediately after a reported out-of-range temperature. For example, large temperature fluctuations could be investigated more thoroughly.The `deadband` filter is a part of the `quarks.analytics` package focused on handling sensor data. Let's look more closely at the method declaration below.```javadeadband(TStream<T> stream, Function<T,V> value, Predicate<V> inBand)```The first parameter is the stream to be filtered, which is `temp` in our scenario. The second parameter is the value to examine. Here, we use the `identity()` method to return the tuple itself. The last parameter is the predicate that defines the optimal range, that is, between 77°F and 91°F. It is important to note that this differs from the `TStream` version of `filter` in which one must explicitly specify the values that are out of range. The code snippet below demonstrates how the method call is pieced together. The `deadbandFiltered` stream contains temperature readings that follow the rules as described in the [Javadoc]({{ site.docsurl }}/lastest/quarks/analytics/sensors/Filters.html#deadband-quarks.topology.TStream-quarks.function.Function-quarks.function.Predicate-):* the value is outside of the optimal range (deadband)* the first value inside the optimal range after a period of being outside it* the first tupleAs with the simple filter, the stream is terminated by printing out the warnings.```javaTStream<Double> deadbandFiltered = Filters.deadband(temp, identity(), tuple -> tuple >= OPTIMAL_TEMP_LOW && tuple <= OPTIMAL_TEMP_HIGH);deadbandFiltered.sink(tuple -> System.out.println(\"Temperature may not be \" + \"optimal! It is \" + tuple + \"\\u00b0F!\"));```We end our application by submitting the `Topology`.## Observing the outputTo see what the temperatures look like, we can print the stream to standard out.```javatemp.print();```When the final application is run, the output looks something like the following:```Temperature may not be optimal! It is 79.1°F!79.179.479.078.878.078.377.4Temperature is out of range! It is 76.5°F!Temperature may not be optimal! It is 76.5°F!76.5Temperature may not be optimal! It is 77.5°F!77.577.1...```Note that the deadband filter outputs a warning message for the very first temperature reading of 79.1°F. When the temperature falls to 76.5°F (which is outside the optimal range), both the simple filter and deadband filter print out a warning message. However, when the temperature returns to normal at 77.5°F, only the deadband filter prints out a message as it is the first value inside the optimal range after a period of being outside it.## Range valuesFiltering against a range of values is such a common analytic activity that the `quarks.analytics.sensors.Range` class is provided to assist with that.Using a `Range` can simplify and clarify your application code and lessen the mistakes that may occur when writing expressions to deal with ranges. 
Though not covered in this recipe, `Range`s offer additional conveniences for creating applications with external range specifications and adaptable filters.In the above examples, a single `Range` can be used in place of the two different expressions for the same logical range:```javastatic double OPTIMAL_TEMP_LOW = 77.0;static double OPTIMAL_TEMP_HIGH = 91.0;static Range<Double> optimalTempRange = Ranges.closed(OPTIMAL_TEMP_LOW, OPTIMAL_TEMP_HIGH);```Using `optimalTempRange` in the Simple filter example code:```javaTStream<Double> simpleFiltered = temp.filter(tuple -> !optimalTempRange.contains(tuple));```Using `optimalTempRange` in the Deadband filter example code:```javaTStream<Double> deadbandFiltered = Filters.deadband(temp, identity(), optimalTempRange);```## The final application```javaimport static quarks.function.Functions.identity;import java.util.concurrent.TimeUnit;import quarks.analytics.sensors.Filters;import quarks.analytics.sensors.Range;import quarks.analytics.sensors.Ranges;import quarks.providers.direct.DirectProvider;import quarks.samples.utils.sensor.SimulatedTemperatureSensor;import quarks.topology.TStream;import quarks.topology.Topology;/** * Detect a sensor value out of expected range. */public class DetectValueOutOfRange { /** * Optimal temperature range (in Fahrenheit) */ static double OPTIMAL_TEMP_LOW = 77.0; static double OPTIMAL_TEMP_HIGH = 91.0; static Range<Double> optimalTempRange = Ranges.closed(OPTIMAL_TEMP_LOW, OPTIMAL_TEMP_HIGH); /** * Polls a simulated temperature sensor to periodically obtain * temperature readings (in Fahrenheit). Use a simple filter * and a deadband filter to determine when the temperature * is out of the optimal range. */ public static void main(String[] args) throws Exception { DirectProvider dp = new DirectProvider(); Topology top = dp.newTopology(\"TemperatureSensor\"); // Generate a stream of temperature sensor readings SimulatedTemperatureSensor tempSensor = new SimulatedTemperatureSensor(); TStream<Double> temp = top.poll(tempSensor, 1, TimeUnit.SECONDS); // Simple filter: Perform analytics on sensor readings to // detect when the temperature is completely out of the // optimal range and generate warnings TStream<Double> simpleFiltered = temp.filter(tuple -> !optimalTempRange.contains(tuple)); simpleFiltered.sink(tuple -> System.out.println(\"Temperature is out of range! \" + \"It is \" + tuple + \"\\u00b0F!\")); // Deadband filter: Perform analytics on sensor readings to // output the first temperature, and to generate warnings // when the temperature is out of the optimal range and // when it returns to normal TStream<Double> deadbandFiltered = Filters.deadband(temp, identity(), optimalTempRange); deadbandFiltered.sink(tuple -> System.out.println(\"Temperature may not be \" + \"optimal! It is \" + tuple + \"\\u00b0F!\")); // See what the temperatures look like temp.print(); dp.submit(top); }}```"
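When the boundary conditions differ from the closed range used above, other `Ranges` factory methods can be handy. This sketch (the `RangeSketch` class name is invented) assumes the Guava-style factories `closed()`, `open()`, and `atLeast()` provided by `quarks.analytics.sensors.Ranges`:

```java
import quarks.analytics.sensors.Range;
import quarks.analytics.sensors.Ranges;

public class RangeSketch {
    public static void main(String[] args) {
        // Inclusive on both ends: 77.0 <= v <= 91.0
        Range<Double> optimal = Ranges.closed(77.0, 91.0);

        // Other factories cover different boundary conditions.
        Range<Double> above = Ranges.atLeast(92.0);     // v >= 92.0
        Range<Double> strict = Ranges.open(77.0, 91.0); // 77.0 < v < 91.0

        System.out.println(optimal.contains(77.0));  // true: closed endpoint
        System.out.println(strict.contains(77.0));   // false: open endpoint
        System.out.println(above.contains(100.0));   // true
    }
}
```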
},
{
"title": "Sample programs",
"tags": "",
"keywords": "",
"url": "../docs/samples",
"summary": "",
"body": "The [Getting started guide](quarks-getting-started) includes a step-by-step walkthrough of a simple Quarks application.Quarks also includes a number of sample Java applications that demonstrate different ways that you can use and implement Quarks.If you are using a released version of Quarks, the samples are already compiled and ready to use. If you downloaded the source or the Git project, the samples are built when you build Quarks.## ResourcesThe samples are currently available only for Java 8 environments. To use the samples, you'll need the resources in the following subdirectories of the Quarks package.:* The `java8/samples` directory contains the Java code for the samples* The `java8/scripts` directory contains the shell scripts that you need to run the samplesIf you use any of the samples in your own applications, ensure that you include the related Quarks JAR files in your `classpath`.## Recommended samplesIn addition to the sample application in the [Getting started guide](quarks-getting-started), the following samples can help you start developing with Quarks:* **HelloQuarks** - This simple program demonstrates the basic mechanics of declaring and executing a topology* **PeriodicSource** - This simple program demonstrates how to periodically poll a source for data to create a source stream* **SimpleFilterTransform** - This simple program demonstrates a simple analytics pipeline: `source -> filter -> transform -> sink`* **SensorAnalytics** - This more complex program demonstrates multiple facets of a Quarks application, including: * Configuration control * Reading data from a device with multiple sensors * Running common analytic algorithms * Publishing results to MQTT server * Receiving commands * Logging results locally * Conditional stream tracing* **IBM Watson IoT Platform** - Samples that demonstrate how to use IBM Watson IoT Platform as the IoT scale message hub between Quarks and back-end analytic systems: * [Sample using the no-registration Quickstart service](quickstart)Additional samples are documented in the [Quarks Overview]({{ site.docsurl }}/lastest/overview-summary.html#overview.description) section of the Javadoc."
}
]