<html lang="en"><head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<!-- The above 3 meta tags *must* come first in the head; any other head content must come *after* these tags -->
<meta name="description" content="Apex is an enterprise grade native YARN big data-in-motion platform that unifies stream processing as well as batch processing.">
<meta name="author" content="Apache Software Foundation">
<link rel="icon" href="favicon.ico">
<title>Apache Apex</title>
<!-- Main Stylesheet -->
<link href="css/main.css" rel="stylesheet">
<nav class="navbar navbar-default navbar-static-top" id="main-nav">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#bs-example-navbar-collapse-1" aria-expanded="false">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<a class="navbar-brand" href="/">
<img src="images/apex-logo.svg" class="logo" alt="Apache Apex Logo">
Apache Apex<span class="trademark">&trade;</span>
<div class="collapse navbar-collapse" id="bs-example-navbar-collapse-1">
<ul class="nav navbar-right navbar-nav">
<li class="nav-item">
<a class="nav-link " href="/">Home</a>
<li class="nav-item">
<a class="nav-link " href="/docs.html">Documentation</a>
<li class="nav-item">
<a class="nav-link " href="/powered-by-apex.html">Powered By Apex</a>
<li class="nav-item">
<a class="nav-link " href="/roadmap.html">Roadmap</a>
<li class="nav-item nav-mouseover">
<ul class="dropdown-menu">
<li class=""><a href="/community.html#mailing-lists">Mailing Lists</a></li>
<li class=""><a href="/community.html#issue-tracking">Issue Tracking</a></li>
<li class=""><a href="">Stack Overflow</a></li>
<li class=""><a href="/community.html#events">Events</a></li>
<li class=""><a href="/community.html#contributing">Contributing</a></li>
<li class=""><a href="">Apache Foundation</a></li>
<a href="/community.html" class="nav-link">Community</a>
<li class="nav-item nav-mouseover">
<ul class="dropdown-menu">
<li class=""><a href="">Apex Core</a></li>
<li class=""><a href="">Apex Malhar</a></li>
<li class=""><a href="">Apex Site</a></li>
<a href="" class="nav-link">Github</a>
<li class="nav-item">
<a class="nav-link btn btn-success btn-download" href="/downloads.html">Download</a>
<div class="container">
<div class="docs-nav"><strong>
<div class="fixed-links">
<li class="docs active"><a href="#documentation">Documentation</a></li>
<li class="write-apex"><a href="#writing-apache-apex-applications">Writing Apache Apex Applications</a></li>
<li><a href="#presentations">Presentations</a></li>
<li class="blogs"><a href="#blogs">Blogs</a></li>
<li><a href="#troubleshooting">Troubleshooting</a></li>
<div class="container">
<h3 id="documentation">Documentation</h3>
<li><strong><a href="/docs/apex/">Apache Apex Core Documentation</a></strong>, including overviews of the product, security, application development, operators and the command-line tool.</li>
<li><strong><a href="/docs/malhar/">Apache Apex Malhar Documentation</a></strong> for the operator library including a diagrammatic taxonomy and some in-depth tutorials for selected operators (such as Kafka Input).</li>
<li><strong><a href="/downloads.html">Java API documentation</a></strong> for recent releases is available under <a href="/downloads.html">Downloads</a>.</li>
<h3 id="writing-apache-apex-applications">Writing Apache Apex Applications</h3>
<li><a href="" rel="nofollow">Beginner&#39;s Guide to Apache Apex</a> This document provides a comprehensive overview of Apex and is recommended for developers just starting out with Apex.</li>
<li><a href="">Building Your First Apache Apex Application</a> This video gives a hands-on demonstration of how to check out and build the source code repositories, run the Maven archetype command to generate a new Apache Apex project, populate the project with Java source files for a new application and, finally, build and run the application, all on a virtual machine running Linux with Apache Hadoop installed.</li>
<li><a href="" rel="nofollow">Learning Apache Apex Book</a> and related blog <a href="" rel="nofollow">Apache Apex in a Nutshell</a> An instructional, example-driven guide to building Apex applications, aimed at developers and hands-on enterprise architects. It helps identify use cases, the building blocks for solutions and the process of implementing and testing production-ready Apex applications.</li>
<li><a href="">Writing an Apache Apex application</a> A PDF document that frames a hands-on exercise of building a basic application; also includes a diagram illustrating the life-cycle of operators.</li>
<li><a href="">Examples</a> This is part of the source repository for Apache Apex Malhar and contains a number of readily runnable applications that developers will find especially useful. They cover the important IO connectors as well as typical processing patterns, such as a Twitter stream analyzer, an application that computes statistics (such as moving averages) from a live stream of stock transactions from <em>Yahoo! Finance</em>, and one that analyzes a synthetic stream of eruption event data for the <em>Old Faithful</em> geyser.</li>
<li><a href="" rel="nofollow">Top N Words Application Tutorial</a> This document provides a detailed step-by-step description of how to build and run a word-counting application with Apache Apex, starting with setting up your development environment, progressing to building, running and monitoring the application and visualizing the output, and concluding with some advanced features such as assessing operator memory requirements, partitioning and debugging.</li>
<li><a href="" rel="nofollow">Sales Dimensions Application Tutorial</a> Similar to the Top N Words application but covers
dimensional computations on a simulated sales data stream.</li>
<li><a href="" rel="nofollow">More Example Applications</a> Sample code for more IO connectors and specialized tutorials covering a variety of topics such as large key-value state management (HDHT), custom partitioning using stream codecs, etc.</li>
<h3 id="presentations">Presentations</h3>
<li><a href="">Slideshare/ApacheApex</a> Presentations from past events covering Apache Apex introduction, feature deep dive, integration, customer use cases and more.</li>
<li><a href="">Next Gen Decision Making in &lt; 2ms</a> A video discussing CapitalOne&#39;s experience with Apache Apex and its evaluation of competing technologies, along with the <a href="">slides</a>.</li>
<li><a href="">Stream Processing with Apache Apex (video)</a> and <a href="">(slides)</a> A broad overview slide deck covering topics such as windowing, static and dynamic partitioning, fault tolerance, locality, monitoring, etc.</li>
<li><a href="">Stateful Streaming Data Pipelines with Apache Apex (slides)</a> An overview of state management data structures and storage mechanism in Apache Apex.</li>
<li><a href="">Fault Tolerance and Processing Semantics (video)</a> and <a href="">(slides)</a> A webinar covering core Apache Apex features including checkpointing and fault tolerance with fast, incremental recovery via a buffer server which uses a publish-subscribe model for inter-operator data transport. A variety of failure scenarios and processing guarantees are discussed.</li>
<li><a href="">Smart Partitioning with Apache Apex (video)</a> and <a href="">(slides)</a> A webinar covering partitioning, including unique Apex features such as elasticity with dynamic resource allocation, parallel partitions for speculative execution, processing SLAs, etc.</li>
<li><a href="">Real Time Stream Processing Versus Batch</a> A slide deck that compares and contrasts the needs, use cases and challenges of stream processing with those of batch processing.</li>
<h3 id="blogs">Blogs</h3>
<li><a href="'%20DataTorrent%20Blog.htm" rel="nofollow">Introducing Apache Apex</a> Introduces Apache Apex and discusses how it addresses the current challenges of Big Data in the areas of code reuse, operability, ease of use and the benefits of a YARN-native solution.</li>
<li><a href="" rel="nofollow">Tracing DAGs from Specification to Execution</a> Discusses DAGs (Directed Acyclic Graphs) as an application model, how they can be specified in Java or via JSON, how the platform transforms them to physical plans for scaling and how they can be monitored via the REST API.</li>
<li><a href="" rel="nofollow">An Introduction to Checkpointing in Apache Apex</a> Discusses checkpointing by serializing operator state and saving it to HDFS, and how to configure the frequency of checkpointing (or skip it altogether) via attributes or annotations.</li>
<li><a href="" rel="nofollow"> End-to-end <em>Exactly-Once</em> with Apache Apex</a> Details how Apache Apex can work in conjunction with transactional systems to provide <em>exactly-once</em> semantics. A simple example of reading data from a Kafka topic and writing processed results to a SQL database is discussed along with the relevant operators (already provided in the Apex Malhar library) and the importance of idempotency.</li>
<li><a href="" rel="nofollow">Dimensions Computation - Part 1: Introduction</a> A two-part blog that discusses dimensions computation in Apache Apex in considerable detail. The first part introduces the domain, shows an <em>AdEvent</em> object to model tuples in the data stream and analyzes the various dimensions of interest.</li>
<li><a href="" rel="nofollow">Dimensions Computation - Part 2: Implementation</a> The second part continues with a discussion of the three phases involved (<em>pre-aggregation</em>, <em>unification</em> and <em>storage</em>), the JSON schema used to encapsulate the various keys and aggregates, and code fragments, and concludes with a visualization of the results.</li>
<li><a href="" rel="nofollow">Apache Apex Performance Benchmarks</a> Discusses the performance suite used to certify releases.</li>
<li><a href="!%20Performance%20Benchmarks.%20Is%20there%20a%20winner_%20-%20DataTorrent.htm" rel="nofollow">Throughput, Latency, And Yahoo! Performance Benchmarks. Is There A Winner?</a></li>
<li><a href="" rel="nofollow">Fault-Tolerant File Processing with Apache Apex</a></li>
<li><a href="" rel="nofollow">SQL On Apache Apex</a></li>
<li><a href="">Apache Software Foundation</a> Discusses the history of the foundation, guiding principles, current statistics and provides numerous additional links for details of how the foundation operates and is managed.</li>
<li><a href="">Data Ingestion Platform - Xavient Information Systems</a> Discusses usage of Apache Apex in their data ingestion platform.</li>
<h3 id="troubleshooting">Troubleshooting</h3>
<li><a href="" rel="nofollow">Troubleshooting Guide</a></li>
<div class="container">
<footer id="main-footer">
Copyright &copy; <span id="copyright-year">2015</span> <a href="">The Apache Software Foundation</a>,
Licensed under the Apache License, Version 2.0<br>
Apache and the Apache feather logo are trademarks of The Apache Software Foundation. | <a href="/privacy.html">Privacy Policy</a><br>
<a class="footer-link-img" href=""><img src="/images/asf_logo.svg" alt="The Apache Software Foundation"></a>
</div> <!-- /container -->
<!-- Placed at the end of the document so the pages load faster -->
<script src=""></script>
<script src="/js/bootstrap.min.js"></script>
<script src="js/docs.js"></script>